Learning Visual Attention for Robotic Vision

Abstract

Visual saliency maps represent the attention-grabbing regions of an image, modeled on the human visual system. Existing deep learning architectures for predicting saliency maps from images face an underlying challenge of generalization. In this study, we explore image transformation techniques and deeper backbone architectures, such as ConvNeXt, a CNN design modernized with ideas borrowed from Transformers, as feature extractors, as a way of making the saliency map predictions of the DeepGaze IIE model more generalizable. This is done with the aim of developing a deep learning system for visual attention in robotic vision.
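As a rough illustration of the backbone idea (not the project's actual code), the sketch below wires a pretrained torchvision ConvNeXt feature extractor to a saliency readout. The 1x1-convolution readout head, the choice of the final feature stage, and the log-density normalization are all assumptions made for this example; see the project report for the real architecture.

```python
# Illustrative sketch only: a pretrained ConvNeXt backbone feeding a
# hypothetical saliency readout. Not the project's actual model.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import convnext_base, ConvNeXt_Base_Weights


class ConvNeXtSaliency(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = convnext_base(weights=ConvNeXt_Base_Weights.IMAGENET1K_V1)
        # Keep only the convolutional feature stages; drop the classifier.
        self.features = backbone.features
        # Assumed 1x1-conv readout mapping the final 1024-channel feature
        # map to a single saliency channel.
        self.readout = nn.Conv2d(1024, 1, kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = self.features(x)             # [B, 1024, H/32, W/32]
        sal = self.readout(feats)            # [B, 1, H/32, W/32]
        sal = F.interpolate(sal, size=(h, w), mode="bilinear",
                            align_corners=False)
        # Normalize to a log probability map over pixels, as is common
        # for saliency models evaluated with log-likelihood metrics.
        return F.log_softmax(sal.flatten(1), dim=1).view(-1, 1, h, w)


if __name__ == "__main__":
    model = ConvNeXtSaliency().eval()
    image = torch.rand(1, 3, 224, 224)       # dummy input
    with torch.no_grad():
        log_density = model(image)
    print(log_density.shape)                  # torch.Size([1, 1, 224, 224])
```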

Learn more from the project report.

Contributors: Denis Musinguzi, Kevin Sebineza, Jean de Dieu Nyandwi, Muhammed Danso.

Acknowledgments

This project is part of Introduction to Deep Learning (11-785). We thank the course staff (instructors and TAs) for supporting us throughout the semester. We also thank the authors of the baseline models we used for open-sourcing their code.
