Skip to content

Minimal code to illustrate the lack of spatial attention in CNNs

License

Notifications You must be signed in to change notification settings

aliborji/spatial_attention

Repository files navigation

Addressing a fundamental limitation in deep vision models: lack of spatial attention

July 1, 2024

Manuscript is avaiable at https://arxiv.org/pdf/2407.01782v1.

The Proposed Model:

model

cnn1.py --> a model that uses F.conv2d

cnn2.py --> a model that uses sequential conv2d by looping over the image

cnn3.py --> a model that uses sequential conv2d by looping over the image and skipping the location where the has not been a change

First run cnn1.py to train a model (uncomment some lines). Then you can run three CNNs and compare the run time. Please study the code to understand how the change is implemented. For example, in cnn3, first conv layer is initialized with its previous output and only locations with change in the image are updated. This layer then knows which locations it has updated and from that generates a change map to send to the fist pooling layer and so on and on ...


The DemoSegmenter.ipynb notebook illustrated the main concept behind the second proposed solution based on semantic segmentation.


@misc{borji2024attention,
    title={Addressing a fundamental limitation in deep vision
models: lack of spatial attention},
    author={Ali Borji},
    year={2024},
    eprint={2407.01782},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

About

Minimal code to illustrate the lack of spatial attention in CNNs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published