# Argparser, Flask, RESTplus

## Case Study: Tagging Influencers on Instagram

### Text-based keyword extraction and categorization
Task: Collect the 12 most recent captions of an influencer
1. Preprocessing
    - Remove all hashtags, emojis, URLs, HTML, non-English words, and numbers
2. Keyword Extraction Algorithm in NLP
    - LDA, TF-IDF, TextRank
3. keyword-list = keyword extraction algorithms + top `n` words + top `m` hashtags
4. Remove non-informative words from the keyword-list
    - people's names, cities, adverbs, adjectives, colors, time, etc.
5. Keyword-to-category mapping with word vectors (GloVe)
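
Steps 1 to 3 above can be sketched in plain Python. This is a minimal illustration, not the pipeline used in the case study: the function names, the tiny stop-word list, and the choice of TF-IDF (rather than LDA or TextRank) are assumptions made for the example. Keeping only ASCII letters is a crude stand-in for removing emojis, numbers, and non-English words, and step 4 (filtering names, cities, colors, and so on) is not covered.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "my",
              "with", "for", "on", "at", "this"}


def preprocess(caption):
    """Return (tokens, hashtags) for one caption."""
    # Collect hashtags before stripping them; they are reused in step 3.
    hashtags = [h.lower() for h in re.findall(r"#(\w+)", caption)]
    text = re.sub(r"<[^>]+>", " ", caption)              # HTML tags
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"#\w+", " ", text)                    # hashtags
    # Keeping only ASCII letters drops emojis, digits, and non-Latin scripts.
    text = re.sub(r"[^A-Za-z\s]", " ", text)
    tokens = [t for t in text.lower().split()
              if t not in STOP_WORDS and len(t) > 2]
    return tokens, hashtags


def tfidf_keywords(docs, n):
    """docs is a list of token lists; return the n highest-scoring words."""
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = Counter()
    for doc in docs:
        if not doc:
            continue
        for word, count in Counter(doc).items():
            # Smoothed inverse document frequency.
            idf = math.log((1 + len(docs)) / (1 + doc_freq[word])) + 1
            scores[word] += (count / len(doc)) * idf
    return [word for word, _ in scores.most_common(n)]


def build_keyword_list(captions, n, m):
    """Step 3: top n TF-IDF words plus top m hashtags, without duplicates."""
    docs, hashtag_counts = [], Counter()
    for caption in captions:
        tokens, hashtags = preprocess(caption)
        docs.append(tokens)
        hashtag_counts.update(hashtags)
    top_hashtags = [tag for tag, _ in hashtag_counts.most_common(m)]
    combined = tfidf_keywords(docs, n) + top_hashtags
    return list(dict.fromkeys(combined))  # de-duplicate, keep order
```

Calling `build_keyword_list(captions, n=10, m=5)` on the 12 most recent captions would produce the keyword list that step 4 then filters.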

Problem: What if the influencer didn't write a caption for their image? 

### Image-based keyword extraction and categorization
- Use Google's Vision API to get the top `n` labels per image 
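
A rough sketch of label extraction with the `google-cloud-vision` client library follows. It assumes that library is installed and that credentials are configured in the environment; the helper names `top_labels` and `fetch_labels` and the default `n=5` are made up for this example. The ranking helper is kept separate so it can be used and tested without calling the API.

```python
def top_labels(annotations, n):
    """annotations: iterable of (description, score) pairs.

    Return the descriptions of the n highest-scoring labels.
    """
    ranked = sorted(annotations, key=lambda pair: pair[1], reverse=True)
    return [description for description, _ in ranked[:n]]


def fetch_labels(image_path, n=5):
    """Ask the Vision API for labels of one local image file (hypothetical helper)."""
    # Imported lazily so top_labels works without the client library installed.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()  # picks up credentials from the environment
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    pairs = [(label.description, label.score)
             for label in response.label_annotations]
    return top_labels(pairs, n)
```

The returned labels can be fed into the same keyword-to-category mapping as the text-based keywords, which covers influencers who post images without captions.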

### Integration of the tagging API with the MuseFind Platform
- The large GloVe vector file, `glove.840B.300d.txt`, is stored in **AWS S3**
- The tagging algorithm reads the GloVe file from S3 using Boto and `smart_open`
    - [GloVe](https://nlp.stanford.edu/projects/glove/)
        - unsupervised learning algorithm for obtaining vector representations for words (Stanford)
        - model for distributed word representation
        - coined from Global Vectors
    - [Boto](https://github.com/boto/boto3)
        - AWS SDK for Python, allows Python developers to write software that makes use of services like Amazon S3 and Amazon EC2
    - [`smart_open`](https://github.com/RaRe-Technologies/smart_open)
        - Python 3 library for **efficient streaming of very large files** from/to storages such as S3, GCS, Azure Blob Storage, HDFS, WebHDFS, HTTP, HTTPS, SFTP, or local filesystem
        - supports transparent, on-the-fly (de-)compression for a variety of different formats
- The algorithm runs live on EC2, exposed through a Flask RESTful API and deployed with Docker
- The response of the algorithm is returned to the Platform API in JSON format
- Response is saved in a database managed by the Platform API 
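
The sketch below shows one way the GloVe file could be streamed line by line and used to map a keyword to its closest category. It is an illustration under assumptions: the S3 URI passed to `stream_glove_from_s3` is whatever bucket and key the file lives in, the helper names are invented here, and cosine similarity is one common choice for comparing word vectors rather than a method confirmed by the case study. Each line is split from the right because tokens in the file may themselves contain spaces.

```python
import math


def parse_glove_line(line, dim):
    """Split one GloVe text line into (word, vector)."""
    parts = line.rstrip("\n").split(" ")
    # The last `dim` fields are the vector; everything before is the token.
    word = " ".join(parts[:-dim])
    vector = [float(x) for x in parts[-dim:]]
    return word, vector


def load_vectors(lines, dim, vocabulary):
    """Keep only the vectors for words in `vocabulary` to save memory."""
    wanted = set(vocabulary)
    vectors = {}
    for line in lines:
        word, vector = parse_glove_line(line, dim)
        if word in wanted:
            vectors[word] = vector
            if len(vectors) == len(wanted):
                break  # stop streaming once every wanted word is found
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def closest_category(keyword, categories, vectors):
    """Return the category whose vector is most similar to the keyword's."""
    candidates = [c for c in categories if c in vectors]
    if keyword not in vectors or not candidates:
        return None
    return max(candidates, key=lambda c: cosine(vectors[keyword], vectors[c]))


def stream_glove_from_s3(uri):
    """Yield lines of the GloVe file without downloading it in full first."""
    # Imported lazily; requires smart_open with S3 support and AWS credentials.
    from smart_open import open as smart_open_file

    with smart_open_file(uri, "r", encoding="utf-8") as f:
        yield from f
```

With the real file, `load_vectors(stream_glove_from_s3(uri), 300, keywords + categories)` would pull only the needed 300-dimensional vectors into memory.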

<img width=400 src="https://github.com/Make-School-Courses/DS-2.3-Data-Science-in-Production/raw/0c911026bff4ffb8926f997b5fac8e330365d290/Lessons/Images/MuseFind.png">

## Argparser
https://www.youtube.com/watch?v=cdblJqEUDNo
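
For comparison with the Flask version, here is a small command-line sketch using Python's standard `argparse` module. The function names `build_parser` and `main` are chosen for this example; the two integer arguments mirror the `n` and `m` parameters of the Flask version.

```python
import argparse


def build_parser():
    """Create a parser that expects two positional integers."""
    parser = argparse.ArgumentParser(description="Add two integers")
    parser.add_argument("n", type=int, help="first number")
    parser.add_argument("m", type=int, help="second number")
    return parser


def main(argv=None):
    """Parse argv (defaults to sys.argv[1:]) and return the sum."""
    args = build_parser().parse_args(argv)
    total = args.n + args.m
    print(total)
    return total
```

Calling `main(["2", "3"])` parses the list directly, which is convenient in a notebook; calling `main()` with no argument reads the values from the command line instead.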

### Argparser with Flask

```python
from flask import Flask, jsonify
from flask_restplus import Api, Resource

app = Flask(__name__)

api = Api(app, version='1.0', title='Add API', description='argparse with flask demonstration')
ns = api.namespace('add', description='Add two numbers')

# Request parser: both query parameters are required integers
single_parser = api.parser()
single_parser.add_argument('n', type=int, required=True, help='first number')
single_parser.add_argument('m', type=int, required=True, help='second number')


def get_sum(n, m):
    """Return the sum of two integers."""
    return n + m


@ns.route("/")
class Addition(Resource):
    @api.doc(parser=single_parser, description='Enter Two Integers')
    def get(self):
        # Parse and validate ?n=...&m=... from the query string
        args = single_parser.parse_args()
        r = get_sum(args.n, args.m)
        return jsonify({'add': r})


if __name__ == "__main__":
    app.run(host='0.0.0.0', port=3000)
```