<a href="https://colab.research.google.com/github/matthewchung74/inference_nbs/blob/main/huggingface/question_answering_inference.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


In [None]:
from pathlib import Path

def create_requirements_file():
    if not Path("requirements.txt").exists():
        requirements = ["transformers"]
        with open('requirements.txt', 'w') as filehandle:
            for listitem in requirements:
                filehandle.write('%s\n' % listitem)
    
create_requirements_file()

In [None]:
!pip install -r requirements.txt

Collecting transformers
  Downloading https://files.pythonhosted.org/packages/d8/b2/57495b5309f09fa501866e225c84532d1fd89536ea62406b2181933fb418/transformers-4.5.1-py3-none-any.whl (2.1MB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading https://files.pythonhosted.org/packages/ae/04/5b870f26a858552025a62f1649c20d29d2672c02ff3c3fb4c688ca46467a/tokenizers-0.10.2-cp37-cp37m-manylinux2010_x86_64.whl (3.3MB)
Collecting sacremoses
  Downloading https://files.pythonhosted.org/packages/08/cd/342e584ee544d044fb573ae697404ce22ede086c9e87ce5960772084cad0/sacremoses-0.0.44.tar.gz (862kB)
Building wheels for collected packages: sacremoses
  Building wheel for sacremoses (setup.py) ... done
  Created wheel for sacremoses: filename=sacremoses-0.0.44-cp37-none-any.whl size=886084 sha256=3cbd6992f29

In [None]:
!wget https://ml-inference.s3-us-west-2.amazonaws.com/hugging_face_question_answering_training.zip

--2021-04-14 17:46:39--  https://ml-inference.s3-us-west-2.amazonaws.com/hugging_face_question_answering_training.zip
Resolving ml-inference.s3-us-west-2.amazonaws.com (ml-inference.s3-us-west-2.amazonaws.com)... 52.218.242.249
Connecting to ml-inference.s3-us-west-2.amazonaws.com (ml-inference.s3-us-west-2.amazonaws.com)|52.218.242.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 244710271 (233M) [application/zip]
Saving to: ‘hugging_face_question_answering_training.zip’


2021-04-14 17:46:42 (66.0 MB/s) - ‘hugging_face_question_answering_training.zip’ saved [244710271/244710271]



In [None]:
!unzip hugging_face_question_answering_training.zip

Archive:  hugging_face_question_answering_training.zip
   creating: model/
  inflating: model/config.json       
   creating: model/tokenizer/
  inflating: model/tokenizer/vocab.txt  
  inflating: model/pytorch_model.bin  
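
Before loading the model, it can help to confirm that the archive unpacked as expected. The following is a minimal sketch, assuming the three files listed in the unzip output above are the ones the loading step needs; `missing_model_files` and `EXPECTED_FILES` are hypothetical names introduced here for illustration, not part of any library.

```python
from pathlib import Path

# Files listed in the unzip output above (assumed to be the complete set needed).
EXPECTED_FILES = ["config.json", "pytorch_model.bin", "tokenizer/vocab.txt"]

def missing_model_files(model_dir):
    """Return the expected files that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

missing = missing_model_files("model")
if missing:
    print("Missing files:", missing)
```

An empty list means every expected file is present, so the loading cell should at least find its inputs.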


In [None]:
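from transformers import AutoModelForQuestionAnswering, AutoTokenizer, AutoConfig
import torch
from pathlib import Path

# Directory created by the unzip step above
model_path = Path("model")

# The tokenizer files live in model/tokenizer; the config is read from model/config.json
tokenizer_new = AutoTokenizer.from_pretrained(str(model_path/"tokenizer"), config=AutoConfig.from_pretrained(model_path))
model_new = AutoModelForQuestionAnswering.from_pretrained(model_path)
# Run on CPU; the trailing semicolon suppresses the notebook's repr output
model_new.cpu();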

In [None]:
!pip install -q git+https://github.com/matthewchung74/inference_params.git

  Building wheel for inference-params (setup.py) ... done


In [None]:
from inference_params.inference_params import inference_test, FieldType, inference_predict

# labels and field types for the app inputs
input = {"question": FieldType.Text, "context": FieldType.Text}
# label and field type for the app output
output = {"result": FieldType.Text}

@inference_predict(input=input, output=output)
def predict(inputs):
    question = inputs["question"]
    context = inputs["context"]
    # Encode the question/context pair as PyTorch tensors
    encoded = tokenizer_new.encode_plus(question, context, add_special_tokens=True, return_tensors="pt")
    input_ids = encoded["input_ids"].tolist()[0]

    # Inference only, so gradient tracking is not needed
    with torch.no_grad():
        result = model_new(**encoded)

    # Most likely start token and (exclusive) end token of the answer span
    answer_start = torch.argmax(result.start_logits)
    answer_end = torch.argmax(result.end_logits) + 1

    # Map the selected token ids back to a string
    answer = tokenizer_new.convert_tokens_to_string(
        tokenizer_new.convert_ids_to_tokens(input_ids[answer_start:answer_end])
    )
    return {"result": answer}
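
`predict` picks the start and end tokens with two independent `argmax` calls, so the end can land before the start and the sliced answer comes back empty. One possible alternative, sketched here in plain Python, is to score start/end pairs jointly under the constraint that the end does not precede the start. `best_span` and `max_answer_len` are hypothetical names for this illustration and are not part of `transformers` or `inference_params`.

```python
def best_span(start_scores, end_scores, max_answer_len=30):
    """Return (start, end), end inclusive, maximising
    start_scores[start] + end_scores[end] with start <= end < start + max_answer_len."""
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_scores):
        stop = min(start + max_answer_len, len(end_scores))
        for end in range(start, stop):
            score = start_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best

# Independent argmax would give start=3 and end=1 here, i.e. an empty slice.
start_scores = [0.1, 0.2, 0.3, 5.0]
end_scores = [0.1, 4.0, 0.2, 3.0]
print(best_span(start_scores, end_scores))  # (3, 3)
```

To try this with the model, the scores could be passed as `output.start_logits[0].tolist()` and `output.end_logits[0].tolist()`; because the returned end index is inclusive, the slice would be `input_ids[start:end + 1]`.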

In [None]:
context = r"""
🤗 Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose
architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural
Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between
TensorFlow 2.0 and PyTorch.
"""

inputs = {"question":"How many pretrained models are available in Transformers?",
          "context":context}

print(inputs["question"])
print(predict(inputs))

inputs = {"question":"What does Transformers provide?",
          "context":context}

print(inputs["question"])
print(predict(inputs))

inputs = {"question":"Transformers provides interoperability between which frameworks?",
          "context":context}

print(inputs["question"])
print(predict(inputs))

How many pretrained models are available in Transformers?
({'result': 'over 32 +'}, 0.23412585258483887)
wrote results to result.json duration: 0.239064 seconds
What does Transformers provide?
({'result': 'general - purpose architectures'}, 0.22880816459655762)
Transformers provides interoperability between which frameworks?
({'result': 'tensorflow 2. 0 and pytorch'}, 0.22966623306274414)
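
The answers above come back lowercased and with extra spaces (`'over 32 +'`, `'tensorflow 2. 0 and pytorch'`) because they are rebuilt from tokens rather than copied from the context. A possible workaround is to slice the original context by character offsets. The sketch below assumes per-token `(char_start, char_end)` offsets are available, for example from a fast tokenizer called with `return_offsets_mapping=True`; whether the tokenizer loaded in this notebook is a fast one is not confirmed here. `answer_from_offsets` is a hypothetical helper, and the offsets are written by hand for illustration.

```python
def answer_from_offsets(context, offsets, start_token, end_token):
    """Slice the original context using per-token (char_start, char_end) offsets.
    end_token is inclusive (hypothetical helper)."""
    char_start = offsets[start_token][0]
    char_end = offsets[end_token][1]
    return context[char_start:char_end]

context = "TensorFlow 2.0 and PyTorch"
# Hand-written offsets for illustration; a real tokenizer would supply these.
offsets = [(0, 10), (11, 12), (12, 13), (13, 14), (15, 18), (19, 26)]

print(answer_from_offsets(context, offsets, 0, 5))  # TensorFlow 2.0 and PyTorch
print(answer_from_offsets(context, offsets, 1, 3))  # 2.0
```

In the real pipeline the token indices would also have to be restricted to the context part of the encoded pair, since the question tokens carry their own offsets.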


In [None]:
inference_test(predict_func=predict, params=inputs)
!cat result.json

wrote results to result.json duration: 0.234966 seconds
{"result": "tensorflow 2. 0 and pytorch"}