Personal Project in Computer Vision: Attention Network Implementation
In this project, I implemented an attention mechanism on top of a CRNN network, in which a CNN serves as the image encoder and an RNN (LSTM) as the sequence decoder for the text. The attention mechanism distributes weights over the encoder's feature vectors at the LSTM layer. With attention added, training for 25 epochs (roughly 2 hours on a Tesla P4 GPU with 2 data-loader workers) reduced the loss from 3.18632 at epoch 1 to 2.20923 at epoch 25, a margin of 0.977.
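The attention step described above — weighting the CNN's feature vectors at each decoding step of the LSTM — can be sketched as follows. This is a minimal, framework-free illustration of additive (Bahdanau-style) attention, not the project's actual code; all dimensions and weight matrices here are hypothetical and randomly initialised.

```python
import numpy as np

# Hypothetical sizes for illustration (not taken from the project).
T, enc_dim, dec_dim, attn_dim = 26, 512, 256, 128  # T = CNN feature steps across the image width

rng = np.random.default_rng(0)
features = rng.normal(size=(T, enc_dim))   # CNN encoder output: one feature vector per step
hidden = rng.normal(size=(dec_dim,))       # current LSTM decoder hidden state

# Learned projections in a real model; random placeholders in this sketch.
W_enc = rng.normal(size=(enc_dim, attn_dim)) * 0.01
W_dec = rng.normal(size=(dec_dim, attn_dim)) * 0.01
v = rng.normal(size=(attn_dim,)) * 0.01

# Additive attention score for each encoder step.
scores = np.tanh(features @ W_enc + hidden @ W_dec) @ v  # shape (T,)

# Softmax over the steps: these weights "spread" attention across feature vectors.
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: attention-weighted sum of encoder features,
# which is fed to the LSTM at the next decoding step.
context = weights @ features  # shape (enc_dim,)
```

At every decoding step the LSTM receives a different context vector, so the model can focus on the image region relevant to the character it is currently emitting.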
| Sample 1 | Sample 2 | Sample 3 |
|---|---|---|