Smart App for the Visually Impaired, by the No Name team at the IT Hackathon 🤡 🤡
We use the Flickr30k dataset to train the model.
We build the model with a VGG16 encoder plus an attention mechanism; the design is based on an idea from CVPR 2016.
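For illustration, here is a minimal sketch of the VGG16-encoder-plus-soft-attention idea. The PyTorch framework choice, layer sizes, and class names are our assumptions, not the team's exact implementation:

```python
# Minimal sketch of a VGG16 encoder feeding soft attention (illustrative only;
# the framework choice and all dimensions are assumptions).
import torch
import torch.nn as nn
import torchvision.models as models

class Encoder(nn.Module):
    """VGG16 convolutional features as a grid of image region vectors."""
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(pretrained=True)
        self.features = vgg.features  # (B, 512, 7, 7) for a 224x224 input

    def forward(self, images):
        fmap = self.features(images)             # (B, 512, 7, 7)
        return fmap.flatten(2).transpose(1, 2)   # (B, 49, 512): 49 regions

class SoftAttention(nn.Module):
    """Additive attention over the 49 image regions at each decode step."""
    def __init__(self, feat_dim=512, hidden_dim=512, attn_dim=256):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, attn_dim)
        self.hidden_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, hidden):
        # feats: (B, 49, 512); hidden: decoder state (B, hidden_dim)
        e = self.score(torch.tanh(self.feat_proj(feats) +
                                  self.hidden_proj(hidden).unsqueeze(1)))
        alpha = torch.softmax(e, dim=1)          # weights over the 49 regions
        context = (alpha * feats).sum(dim=1)     # (B, 512) weighted summary
        return context, alpha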
Use train_model.py to train the model.
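For example, from the repository root (we assume train_model.py takes no required arguments; check the script itself for any options):

```
python3 train_model.py
```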
We train the model on an NVIDIA Tesla T4 GPU; 30 epochs take about 9 hours. Our evaluation is below:
| Metric | Score |
|---|---|
| BLEU | 0.71 |
| Cross-entropy loss | 0.38 |
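For reference, a BLEU score like the one above can be computed with NLTK. This is a generic example with made-up captions, not the team's evaluation script:

```python
# Generic BLEU computation with NLTK (illustrative tokens, not the team's
# evaluation data or script).
from nltk.translate.bleu_score import corpus_bleu

# One set of references per generated caption; tokens are pre-split here.
references = [[["a", "dog", "runs", "on", "the", "beach"]]]
candidates = [["a", "dog", "runs", "on", "the", "sand"]]

score = corpus_bleu(references, candidates)  # default: uniform 1-4 gram weights
print(f"BLEU: {score:.2f}")                  # ~0.76 for this toy pair
```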
All packages needed to run the server are listed in requirement.txt; run the following command to install them:
pip install -r requirement.txt
On AWS, we set up an Auto Scaling group with the following architecture to help scale the product. We also set up a Redis-based message broker to handle a large number of concurrent users; a sketch of that idea follows below.
We hope this architecture helps others deploy their models. More information is available in the AWS documentation.
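A minimal sketch of that producer/worker pattern with redis-py; the queue name, job format, and connection details are assumptions, not the team's exact setup:

```python
# Sketch of a Redis-backed job queue between the web tier and the model tier
# (queue name and payload shape are illustrative assumptions).
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def enqueue_request(image_id: str) -> None:
    """Web tier: push an incoming captioning job onto the queue."""
    r.lpush("caption_jobs", json.dumps({"image_id": image_id}))

def worker_loop() -> None:
    """Model tier: block until a job arrives, then process it."""
    while True:
        _, payload = r.brpop("caption_jobs")  # blocking pop
        job = json.loads(payload)
        # run the captioning model on job["image_id"] here
        print("processing", job["image_id"])
```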
Running the servers:
- To run the OCR server:
cd ./ocr-api
sudo python3 server.py
- To run the captioning server:
sudo python3 server.py
Each server runs on port 80; we deploy the two servers on separate instances and set up nginx with uWSGI as the proxy in front of them.
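As one possible shape, server.py could expose a WSGI app that uWSGI serves behind nginx. The Flask framework and the /caption route here are assumptions, not the actual code:

```python
# Hypothetical shape of server.py as a WSGI app for uWSGI behind nginx
# (the Flask choice and the /caption endpoint are assumptions).
from flask import Flask, jsonify, request

app = Flask(__name__)  # uWSGI can serve this callable, e.g. `uwsgi --module server:app`

@app.route("/caption", methods=["POST"])
def caption():
    image_bytes = request.files["image"].read()
    # run the captioning model on image_bytes here
    return jsonify({"caption": "a placeholder caption"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=80)  # matches the port-80 setup above
```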
Our demo app is written in React Native; all of the implementation can be seen at the following link: Github
All of the models and the architecture were implemented by the No Name team. Thanks to everyone for a great IT Hackathon 2022 !!! 🤤 🤤 🤤