Addressing a fundamental limitation in deep vision models: lack of spatial attention

July 1, 2024

The manuscript is available at https://arxiv.org/pdf/2407.01782v1.

The Proposed Model:

cnn1.py --> a model that uses F.conv2d

cnn2.py --> a model that uses sequential conv2d by looping over the image

cnn3.py --> a model that uses sequential conv2d by looping over the image and skipping the locations where there has been no change
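The difference between looping over every location (as in cnn2.py) and skipping unchanged locations (as in cnn3.py) can be sketched roughly as follows. This is a minimal, framework-free illustration and not the code in this repository: the function `conv2d_loop`, the 5x5 frames, and the 3x3 kernel are all hypothetical, and the sketch assumes a single channel, stride 1, and no padding.

```python
def conv2d_loop(image, kernel, prev_out=None, changed=None):
    """Valid cross-correlation computed one output location at a time.

    If prev_out and changed are both given, any output location whose
    receptive field contains no changed input pixel is copied from
    prev_out instead of being recomputed.
    Returns (output, updated), where updated marks recomputed locations.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    updated = [[False] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            if prev_out is not None and changed is not None:
                touched = any(changed[i + di][j + dj]
                              for di in range(kh) for dj in range(kw))
                if not touched:
                    out[i][j] = prev_out[i][j]  # reuse the cached value
                    continue
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(kh) for dj in range(kw))
            updated[i][j] = True
    return out, updated


# Hypothetical 5x5 frame and 3x3 kernel, for illustration only.
frame1 = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
full1, _ = conv2d_loop(frame1, kernel)

# Change a single corner pixel to simulate the next frame.
frame2 = [row[:] for row in frame1]
frame2[0][0] = 100.0
changed = [[frame1[r][c] != frame2[r][c] for c in range(5)] for r in range(5)]

# Sparse update (skips untouched locations) versus a full recomputation.
sparse2, updated = conv2d_loop(frame2, kernel, prev_out=full1, changed=changed)
full2, _ = conv2d_loop(frame2, kernel)
num_updated = sum(map(sum, updated))
```

In this toy case the sparse pass recomputes only the output locations whose receptive field covers the changed pixel, yet it produces the same result as the full pass.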

First, run cnn1.py to train a model (you will need to uncomment some lines). You can then run all three CNNs and compare their run times. Please study the code to understand how the change handling is implemented. For example, in cnn3.py the first conv layer is initialized with its previous output, and only the locations where the image has changed are updated. The layer therefore knows which locations it has updated, and from these it generates a change map to send to the first pooling layer, and so on through the rest of the network.
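The hand-off of a change map from a conv layer to the following pooling layer might look roughly like the sketch below. It is an assumption-laden illustration rather than the implementation in cnn3.py: `maxpool_update` is a hypothetical helper, the pooling is non-overlapping max pooling, and the feature maps are made-up 4x4 grids.

```python
def maxpool_update(features, prev_pooled, change_map, size=2):
    """Non-overlapping max pooling that recomputes only the windows
    touched by change_map.

    Returns (pooled, pooled_change), where pooled_change marks the
    windows that were recomputed and can be passed to the next layer.
    """
    out_h = len(features) // size
    out_w = len(features[0]) // size
    pooled = [row[:] for row in prev_pooled]  # start from the cached output
    pooled_change = [[False] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            cells = [(i * size + di, j * size + dj)
                     for di in range(size) for dj in range(size)]
            if not any(change_map[r][c] for r, c in cells):
                continue  # window untouched: keep the cached maximum
            pooled[i][j] = max(features[r][c] for r, c in cells)
            pooled_change[i][j] = True
    return pooled, pooled_change


# Hypothetical feature map from the previous frame and its pooled output.
features_old = [[1, 2, 0, 0], [3, 4, 0, 1], [0, 0, 5, 6], [1, 0, 7, 8]]
pooled_old = [[4, 1], [1, 8]]

# The conv layer reports that it updated only location (0, 1).
features_new = [row[:] for row in features_old]
features_new[0][1] = 9
change_map = [[False] * 4 for _ in range(4)]
change_map[0][1] = True

pooled_new, pooled_change = maxpool_update(features_new, pooled_old, change_map)
```

Here a window is flagged as changed whenever it is recomputed. A tighter variant could flag it only when the pooled value actually differs, which would let later layers skip more work.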

The DemoSegmenter.ipynb notebook illustrates the main concept behind the second proposed solution, which is based on semantic segmentation.

@misc{borji2024attention,
    title={Addressing a fundamental limitation in deep vision models: lack of spatial attention},
    author={Ali Borji},
    year={2024},
    eprint={2407.01782},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

