- KNN model for image captioning: extract image features with the SURF or GIST algorithm and feed them into a KNN model.
- For prediction, find the closest images based on their features.
- Use the BLEU score to choose the single best caption from the captions of the closest images.
- Use a VGG16 or VGG19 CNN to extract features from images.
- Use an LSTM model to generate the captions.
- Run image_knn.py
- Run image_rnn.py
- Run image_rnn_predict.py
- Use the 16-layer version of the CNN (VGG16) to extract features.
- Pre-trained model: put the pre-trained model file in the model folder.
- To present the results, we use Flask and show them as web pages.
- Run python app.py, then open 127.0.0.1:4555 in a browser to see the results.
- OpenCV: for feature extraction (SIFT, SURF, ORB); run install-opencv.sh to install it.
- GIST: a wrapper for Lear's GIST implementation written in C. Follow the instructions: here
- TensorFlow
- Keras
- Flask
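
The KNN approach described at the top (find the closest images, then use BLEU to pick one caption from their captions) can be sketched roughly as follows. This is a minimal illustration, not the code in image_knn.py: the feature vectors are plain Python lists standing in for SURF or GIST descriptors, `simple_bleu` is a simplified BLEU (clipped unigram and bigram precision with add-one smoothing and a brevity penalty) rather than a full implementation, and the names `nearest_neighbors` and `choose_caption` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, references, max_n=2):
    """Simplified BLEU (assumption: not the project's exact scorer)."""
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        # Add-one smoothing keeps the log defined when nothing matches.
        log_precisions.append(math.log((clipped + 1) / (total + 1)))
    # Brevity penalty against the reference closest in length.
    ref_len = min((len(r) for r in refs), key=lambda n_ref: (abs(n_ref - len(cand)), n_ref))
    bp = 1.0 if len(cand) >= ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


def nearest_neighbors(query, features, k):
    """Indices of the k stored feature vectors closest to query (Euclidean distance)."""
    order = sorted(range(len(features)), key=lambda i: math.dist(query, features[i]))
    return order[:k]


def choose_caption(query, features, captions, k=3):
    """Pick the neighbor caption that scores highest against the other neighbor captions."""
    pool = [c for i in nearest_neighbors(query, features, k) for c in captions[i]]
    if len(pool) == 1:
        return pool[0]

    def consensus(idx):
        # Score one caption using all other pooled captions as references.
        others = pool[:idx] + pool[idx + 1:]
        return simple_bleu(pool[idx], others)

    return pool[max(range(len(pool)), key=consensus)]
```

Here `features` is a list of feature vectors, one per training image, and `captions` is a parallel list holding each image's captions. Scoring each candidate against the remaining neighbor captions favors the caption that agrees most with the others, which is one plausible reading of "choose one best caption"; the actual script may select differently.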