
Couger research paper

2021.06.18 : Leveraging Multi scale Backbone with Multilevel supervision for Thermal Image Super Resolution

This paper proposes an attention-based multi-level model with a multi-scale backbone for thermal image super-resolution. The thermal image dataset is provided by PBVS 2020 in their thermal image super-resolution challenge. This dataset contains images at three resolution scales (low, medium, high). However, only the medium- and high-resolution images are used to train the proposed architecture to generate super-resolution images at x2 and x4 scales. The proposed architecture uses Res2Net blocks as the backbone of the network. Along with this, a coordinate convolution layer and dual attention are also used in the architecture. Further, multi-level supervision is implemented to supervise the similarity of the output image to the real image at each block during training. To test the robustness of the proposed model, we evaluated it on the Thermal-6 dataset. The results show that our model achieves state-of-the-art results on the PBVS dataset. Further, the results on the Thermal-6 dataset show that the model has a decent generalization capacity.
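As a rough illustration of multi-level supervision, the sketch below sums a per-block loss so that every intermediate output is compared with the real image. The L1 loss, the weights, and the names `l1_loss` and `multi_level_loss` are assumptions made for this example; the abstract does not state which loss or weighting the paper uses.

```python
from typing import List, Sequence

Image = List[List[float]]  # a single-channel image as nested lists


def l1_loss(pred: Image, target: Image) -> float:
    """Mean absolute error between two images of identical size."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    return total / count


def multi_level_loss(
    block_outputs: Sequence[Image],
    target: Image,
    weights: Sequence[float],
) -> float:
    """Weighted sum of the losses of every block output against the target.

    Assumption: each block emits an image at the target resolution, and the
    per-block weights are hyperparameters chosen by the user.
    """
    if len(block_outputs) != len(weights):
        raise ValueError("one weight is required per block output")
    return sum(w * l1_loss(out, target) for out, w in zip(block_outputs, weights))
```

With this formulation the gradient reaches early blocks directly through their own loss term, not only through the final output.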

2020.07.10 : Expectation and Reaction as Intention for Conversation System

Intention plays an important role in daily human conversation. Conventionally, human intention exerts influence on conversation content and atmosphere. Although dialogue systems that involve emotion awareness are popular, implementing human intention in artificial intelligence has not drawn much attention from researchers. The reason is that intention is usually not a spontaneous response to an external stimulus, but a self-generated desire and expectation. Moreover, internal intentions are not subject to external signals that can be observed by third parties. In this research, we experimentally used “reaction” and “expectation” factors to represent intention at the text level and created an intentional conversation model based on the transformer model. Preliminary results show that applying intention helps a dialogue system reach a higher level of engagement in the conversation.

2020.06.16 : A Multi-Level Supervision Model: A novel approach for Thermal Image Super Resolution

This paper proposes a novel architecture for thermal image super-resolution. The proposed architecture uses residual blocks as the base units of the network. Along with this, a coordinate convolution layer and the Convolutional Block Attention Module (CBAM) are also used in the architecture. Further, multi-level supervision is implemented to supervise the similarity of the output image to the real image at each block during training. To test the robustness of the proposed model, we evaluated it on the Thermal-6 dataset [13]. The results show that our model achieves state-of-the-art results on the PBVS’2020 dataset. Further, the results on the Thermal-6 dataset show that the model has a decent generalization capacity.
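The idea behind a coordinate convolution layer is to append channels that hold each pixel's row and column position before the convolution is applied. The sketch below builds those extra channels, scaled to [-1, 1], in plain Python; the helper names `coordinate_channels` and `add_coordinates` are hypothetical, and the exact variant used in the paper is not specified in the abstract.

```python
from typing import List

Channel = List[List[float]]  # one height x width feature channel


def coordinate_channels(height: int, width: int) -> List[Channel]:
    """Return [row_channel, col_channel], each scaled to the range [-1, 1]."""

    def scale(index: int, size: int) -> float:
        # A single row or column has no extent, so place it at the centre.
        return 0.0 if size == 1 else 2.0 * index / (size - 1) - 1.0

    rows = [[scale(i, height) for _ in range(width)] for i in range(height)]
    cols = [[scale(j, width) for j in range(width)] for _ in range(height)]
    return [rows, cols]


def add_coordinates(feature_map: List[Channel]) -> List[Channel]:
    """Append the two coordinate channels to a channels-first feature map."""
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return feature_map + coordinate_channels(height, width)
```

An ordinary convolution applied to the result can then condition on position, which a plain convolution cannot do because it is translation equivariant.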

2019.10.16 : Eyenet: Attention based Convolutional Encoder-Decoder Network for Eye Region Segmentation

With the immersive development in the field of augmented and virtual reality, accurate and speedy eye-tracking is required. Facebook Research organized the OpenEDS Semantic Segmentation challenge for per-pixel segmentation of the key eye regions: the sclera, the iris, the pupil, and everything else (background). Our model, named EyeNet, includes modified residual units as the backbone, two types of attention blocks, and multi-scale supervision for segmenting these four eye regions. Our proposed model achieved a total score of 0.974 (EDS evaluation metric) on test data, which demonstrates superior results compared to the baseline methods.
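Per-pixel segmentation over four classes is commonly scored with mean intersection-over-union (IoU). The sketch below computes plain mean IoU on label maps; it is not the EDS evaluation metric, whose definition is not given here, and the integer class labels (for example 0 = background, 1 = sclera, 2 = iris, 3 = pupil) are an assumption made for illustration.

```python
from typing import List

LabelMap = List[List[int]]  # per-pixel class indices


def mean_iou(pred: LabelMap, truth: LabelMap, num_classes: int = 4) -> float:
    """Mean IoU over the classes that occur in the prediction or ground truth."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == t:
                # A correctly labelled pixel counts once in both sets.
                intersection[p] += 1
                union[p] += 1
            else:
                # A wrong pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```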

2019.05.21 : SkeletonNet: Shape Pixel to Skeleton Pixel

Deep Learning for Geometric Shape Understanding has organized a challenge for extracting different kinds of skeletons from images of different objects. This competition is organized in association with CVPR 2019. In our proposed architecture, unlike the plain decoder in the traditional U-Net, we designed the decoder in the style of the HED architecture: we introduced 4 side layers and fused them into one dilated convolutional layer to connect the broken links of the skeleton. Our proposed architecture achieved an F1 score of 0.77 on test data.
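For reference, an F1 score on skeleton pixels can be computed from a predicted binary mask and a ground-truth mask as sketched below. This is a generic pixel-wise F1, written under the assumption that skeleton pixels are marked 1 and all others 0; the challenge's exact evaluation protocol is not described here and may differ.

```python
from typing import List

Mask = List[List[int]]  # 1 marks a skeleton pixel, 0 marks everything else


def f1_score(pred: Mask, truth: Mask) -> float:
    """Pixel-wise F1: harmonic mean of precision and recall."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # skeleton pixel found
            elif p and not t:
                fp += 1  # spurious skeleton pixel
            elif t and not p:
                fn += 1  # missed skeleton pixel
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because skeletons occupy very few pixels, F1 is more informative than accuracy here: a prediction of all background would score high accuracy but an F1 of zero.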