## [Running Flask and FastAPI on Google Colab](https://medium.datadriveninvestor.com/flask-on-colab-825d2099d9d8)

In [1]:
!pip install fastapi nest-asyncio pyngrok uvicorn

Collecting fastapi
  Downloading https://files.pythonhosted.org/packages/4e/b9/a91a699f5c201413b3f61405dbccc29ebe5ad25945230e9cec98fdb2434c/fastapi-0.65.1-py3-none-any.whl (50kB)
Collecting pyngrok
  Downloading https://files.pythonhosted.org/packages/6b/4e/a2fe095bbe17cf26424c4abcd22a0490e22d01cc628f25af5e220ddbf6f0/pyngrok-5.0.5.tar.gz (745kB)
Collecting uvicorn
  Downloading https://files.pythonhosted.org/packages/c8/de/953f0289508b1b92debdf0a6822d9b88ffb0c6ad471d709cf639a2c8a176/uvicorn-0.13.4-py3-none-any.whl (46kB)

In [2]:
from fastapi import FastAPI
import nest_asyncio
from pyngrok import ngrok
import uvicorn

app = FastAPI()

@app.get('/index')
async def home():
    return "Hello World"

# Open an ngrok tunnel to local port 8000 and print the public URLs.
ngrok_tunnel = ngrok.connect(8000)
print('Public URL:', ngrok_tunnel.public_url)
print('Public docs URL:', ngrok_tunnel.public_url + "/docs")

# Colab already runs an asyncio event loop; nest_asyncio lets uvicorn start inside it.
nest_asyncio.apply()
uvicorn.run(app, port=8000)




INFO:     Started server process [59]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


Public URL: http://13a7db702ebc.ngrok.io
Public docs URL: http://13a7db702ebc.ngrok.io/docs


INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [59]


## [Deploying Transformer Models](https://chatbotslife.com/deploying-transformer-models-1350876016f)


In [3]:
%%bash
pip install -qq transformers
#pip install torch torchvision
pip install "fugashi[unidic-lite]" 
pip install ipadic

Collecting fugashi[unidic-lite]
  Downloading https://files.pythonhosted.org/packages/55/9c/009da34dd111e84f54eef833c84afb5c744a0306af8546014a958e1967a0/fugashi-1.1.0-cp37-cp37m-manylinux1_x86_64.whl (486kB)
Collecting unidic-lite; extra == "unidic-lite"
  Downloading https://files.pythonhosted.org/packages/55/2b/8cf7514cb57d028abcef625afa847d60ff1ffbf0049c36b78faa7c35046f/unidic-lite-1.0.8.tar.gz (47.4MB)
Building wheels for collected packages: unidic-lite
  Building wheel for unidic-lite (setup.py): started
  Building wheel for unidic-lite (setup.py): finished with status 'done'
  Created wheel for unidic-lite: filename=unidic_lite-1.0.8-cp37-none-any.whl size=47658825 sha256=157022620f15e35c3d06b6734a8c565734002f50fea16b49ca9e3c2d727ed352
  Stored in directory: /root/.cache/pip/wheels/20/48/8d/b66d8361a27f58f41ec86640e4fd2640de0403a6367511eab7
Successfully built unidic-lite
Installing collected packages: unidic-lite, fugashi
Successfully installed fugashi-1.1.0 unidic-lite-1.0.8

In [33]:
import torch
from transformers import (
    pipeline,
    AutoModelForMaskedLM,
    AutoTokenizer
)

class NLP:
    def __init__(self):
        self.gen_tokenizer = AutoTokenizer.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
        self.gen_model = AutoModelForMaskedLM.from_pretrained('cl-tohoku/bert-base-japanese-whole-word-masking')
         
    def generate(self, prompt="The epistemological limit"):
        #inputs = self.gen_tokenizer.encode(prompt, return_tensors="pt", max_length=512, truncation=True)
        #with torch.no_grad():
        #    summary_ids = self.gen_model.generate(inputs) #, max_length=512, min_length=5, length_penalty=5., num_beams=2)
        #    summary = self.gen_tokenizer.decode(summary_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
        #    return summary

        # NOTE: `prompt` is not used yet; the input sentence is hard-coded below.
        # Encode the input text
        input_ids = self.gen_tokenizer.encode(f'吾輩は{self.gen_tokenizer.mask_token}である。名前はまだ無い。', return_tensors='pt', max_length=512, truncation=True)
        #input_ids = self.gen_tokenizer.encode(prompt, return_tensors='pt', max_length=512, truncation=True)
        print('input_ids:', self.gen_tokenizer.convert_ids_to_tokens(input_ids[0].tolist()))

        # Find the index of the mask token
        masked_index = torch.where(input_ids == self.gen_tokenizer.mask_token_id)[1].tolist()[0]
        print('masked_index:', masked_index)

        # Predict the masked token (top 5 candidates)
        result = self.gen_model(input_ids)
        pred_ids = result[0][:, masked_index].topk(5).indices.tolist()[0]

        output = []
        for pred_id in pred_ids:
            output_ids = input_ids.tolist()[0]
            output_ids[masked_index] = pred_id
            #print(self.gen_tokenizer.decode(output_ids))
            #print(self.gen_tokenizer.decode(output_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True))
            output.append(self.gen_tokenizer.decode(output_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True))
        return output


In [35]:
# from nlp import NLP
nlp = NLP()
#print(nlp.sentiments("うほほーい、大好き♡"))
for s in nlp.generate():
    print(s)

Some weights of the model checkpoint at cl-tohoku/bert-base-japanese-whole-word-masking were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


input_ids: ['[CLS]', '吾', '##輩', 'は', '[MASK]', 'で', 'ある', '。', '名前', 'は', 'まだ', '無い', '。', '[SEP]']
masked_index: 4
吾輩 は 猫 で ある 。 名前 は まだ 無い 。
吾輩 は 犬 で ある 。 名前 は まだ 無い 。
吾輩 は 人間 で ある 。 名前 は まだ 無い 。
吾輩 は 狼 で ある 。 名前 は まだ 無い 。
吾輩 は 私 で ある 。 名前 は まだ 無い 。
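
The `generate` method above takes the five highest-scoring vocabulary entries at the masked position and substitutes each one into the sentence. The following is a minimal pure-Python sketch of that selection step, without the model. The helper names `top_k_indices` and `fill_mask`, the toy vocabulary, and the scores are all hypothetical and are not real model output.

```python
from typing import List, Sequence


def top_k_indices(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def fill_mask(tokens: List[str], vocab: List[str], scores: Sequence[float],
              mask_token: str = "[MASK]", k: int = 5) -> List[str]:
    """Replace the first mask token with each of the k best-scoring vocab entries."""
    masked_index = tokens.index(mask_token)
    candidates = []
    for vocab_id in top_k_indices(scores, k):
        filled = list(tokens)  # copy so every candidate starts from the original
        filled[masked_index] = vocab[vocab_id]
        candidates.append(" ".join(filled))
    return candidates


# Toy data: a made-up vocabulary and made-up scores (assumptions for illustration).
tokens = ["吾輩", "は", "[MASK]", "で", "ある", "。"]
vocab = ["猫", "犬", "人間", "本"]
scores = [3.2, 2.5, 1.9, -0.4]

for sentence in fill_mask(tokens, vocab, scores, k=2):
    print(sentence)
```

In the notebook, `scores` corresponds to the row of `result[0]` at `masked_index`, and `vocab` to the tokenizer's vocabulary.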


In [36]:
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
#from app.nlp import NLP
import nest_asyncio
from pyngrok import ngrok
import uvicorn

class Message(BaseModel):
    input: str
    output: str = None

app = FastAPI()
nlp = NLP()

origins = [
    "http://localhost",
    "http://localhost:3000",
    "http://127.0.0.1:3000"
]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["POST"],
    allow_headers=["*"],
)

@app.post("/generative/")
async def generate(message: Message):
    message.output = nlp.generate(prompt=message.input)
    return {"output": message.output}

ngrok_tunnel = ngrok.connect(8000)
print('Public URL:', ngrok_tunnel.public_url)
print('Public docs URL:', ngrok_tunnel.public_url + "/docs")
nest_asyncio.apply()
uvicorn.run(app, port=8000)

Some weights of the model checkpoint at cl-tohoku/bert-base-japanese-whole-word-masking were not used when initializing BertForMaskedLM: ['cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Public URL: http://5d97a0bc7adc.ngrok.io
Public docs URL: http://5d97a0bc7adc.ngrok.io/docs


INFO:     Started server process [59]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)


INFO:     124.32.186.1:0 - "GET /docs HTTP/1.1" 200 OK
INFO:     124.32.186.1:0 - "GET /openapi.json HTTP/1.1" 200 OK
input_ids: ['[CLS]', '吾', '##輩', 'は', '[MASK]', 'で', 'ある', '。', '名前', 'は', 'まだ', '無い', '。', '[SEP]']
masked_index: 4
INFO:     124.32.186.1:0 - "POST /generative/ HTTP/1.1" 200 OK
INFO:     124.32.186.1:0 - "POST /generative/ HTTP/1.1" 422 Unprocessable Entity
input_ids: ['[CLS]', '吾', '##輩', 'は', '[MASK]', 'で', 'ある', '。', '名前', 'は', 'まだ', '無い', '。', '[SEP]']
masked_index: 4
INFO:     124.32.186.1:0 - "POST /generative/ HTTP/1.1" 200 OK
input_ids: ['[CLS]', '吾', '##輩', 'は', '[MASK]', 'で', 'ある', '。', '名前', 'は', 'まだ', '無い', '。', '[SEP]']
masked_index: 4
INFO:     124.32.186.1:0 - "POST /generative/ HTTP/1.1" 200 OK


INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [59]


```
% curl -X 'POST' \
  'http://5d97a0bc7adc.ngrok.io/generative/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "input": "",
  "output": ""
}'

{"output":["吾輩 は 猫 で ある 。 名前 は まだ 無い 。","吾輩 は 犬 で ある 。 名前 は まだ 無い 。","吾輩 は 人間 で ある 。 名前 は まだ 無い 。","吾輩 は 狼 で ある 。 名前 は まだ 無い 。","吾輩 は 私 で ある 。 名前 は まだ 無い 。"]}
```
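
The same request can be made from Python. This is a minimal sketch using only the standard library; the function names `build_generate_request` and `call_generate` and the host `example.ngrok.io` are placeholders, since the ngrok subdomain differs between the two runs shown above. It assumes the body only needs `input`, because `output` has a default in the `Message` model.

```python
import json
import urllib.request


def build_generate_request(base_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /generative/ route."""
    payload = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generative/",
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def call_generate(base_url: str, text: str, timeout: float = 30.0) -> list:
    """Send the request and return the "output" list from the JSON response."""
    request = build_generate_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["output"]


# Placeholder host (assumption): substitute the public URL printed by the server cell.
request = build_generate_request("http://example.ngrok.io", "")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Calling `call_generate(public_url, "")` while the server cell is running should return the same list as the curl response. A body that does not match the `Message` model, for example one without `input`, is rejected by FastAPI with `422 Unprocessable Entity`, which may be what the single 422 line in the server log reflects.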