```markdown
Copyright 2018-2019 IBM Corp. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```

# MAX Question Answering Demo

This notebook demonstrates how to send a request to the MAX Question Answering model API and how to use the response it returns. By default, the notebook uses the [hosted demo instance](http://max-question-answering.max.us-south.containers.appdomain.cloud), but you can also use a locally running instance by following the instructions in the main README. The notebook lives in the `samples/` directory of the Model Asset Exchange [GitHub repo](https://github.com/IBM/MAX-Question-Answering/tree/master/) and assumes that the same folder [structure](https://github.com/IBM/MAX-Question-Answering/tree/master/samples) is maintained.

In [2]:
import requests 
import json
import pprint

In [3]:
pp = pprint.PrettyPrinter()

url = 'http://max-question-answering.max.us-south.containers.appdomain.cloud/model/predict'

# To run the model locally uncomment the line below after setting up the local Docker container
# url = 'http://localhost:5000/model/predict'

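def predict(input_file):
    """POST the JSON contents of input_file to the model and return the parsed response."""
    # Read the request body from disk
    with open(input_file, 'rb') as file:
        json_data = json.load(file)
    # Send the request once the file is closed and decode the JSON reply
    r = requests.post(url=url, json=json_data).json()
    return r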

# Run inference on a sample file

## Preview the File

We first preview the file to see the paragraph and the questions.
The input JSON field `paragraphs` contains a list of `paragraph` dictionaries. Each `paragraph` dictionary has a `context` field and a list of `questions`. The `context` is the body of the paragraph, and the `questions` are asked about that text.
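The structure described above can also be assembled in code. The following is a minimal sketch for a single paragraph; `build_payload` is a hypothetical helper name, not part of the MAX API, and the non-empty checks are an assumption about sensible input rather than a documented requirement.

```python
def build_payload(context, questions):
    """Wrap one context and its questions in the input format described above.

    Hypothetical helper for illustration; not part of the MAX API.
    """
    if not isinstance(context, str) or not context.strip():
        raise ValueError("context must be a non-empty string")
    # A bare string would otherwise be split into single characters by list()
    if isinstance(questions, str):
        raise ValueError("questions must be a list of strings, not a string")
    questions = list(questions)
    if not questions or not all(isinstance(q, str) for q in questions):
        raise ValueError("questions must be a non-empty list of strings")
    return {"paragraphs": [{"context": context, "questions": questions}]}


example = build_payload(
    "The game was played on February 7, 2016, at Levi's Stadium.",
    ["What day was the game played on?"],
)
print(example["paragraphs"][0]["questions"])
```

The returned dictionary has the same shape as the contents of `small-dev.json`, so it could be passed as the `json` argument of a POST request.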

In [4]:
json_path = 'small-dev.json'
with open(json_path) as f:
    input_to_model = json.load(f)
pp.pprint(input_to_model)

{'paragraphs': [{'context': 'Super Bowl 50 was an American football game to '
                            'determine the champion of the National Football '
                            'League (NFL) for the 2015 season. The American '
                            'Football Conference (AFC) champion Denver Broncos '
                            'defeated the National Football Conference (NFC) '
                            'champion Carolina Panthers 24–10 to earn their '
                            'third Super Bowl title. The game was played on '
                            "February 7, 2016, at Levi's Stadium in the San "
                            'Francisco Bay Area at Santa Clara, California. As '
                            'this was the 50th Super Bowl, the league '
                            'emphasized the "golden anniversary" with various '
                            'gold-themed initiatives, as well as temporarily '
                            'suspending the tradition of na

In [5]:
# Preview paragraph
pp.pprint(input_to_model['paragraphs'][0]['context'])

('Super Bowl 50 was an American football game to determine the champion of the '
 'National Football League (NFL) for the 2015 season. The American Football '
 'Conference (AFC) champion Denver Broncos defeated the National Football '
 'Conference (NFC) champion Carolina Panthers 24–10 to earn their third Super '
 "Bowl title. The game was played on February 7, 2016, at Levi's Stadium in "
 'the San Francisco Bay Area at Santa Clara, California. As this was the 50th '
 'Super Bowl, the league emphasized the "golden anniversary" with various '
 'gold-themed initiatives, as well as temporarily suspending the tradition of '
 'naming each Super Bowl game with Roman numerals (under which the game would '
 'have been known as "Super Bowl L"), so that the logo could prominently '
 'feature the Arabic numerals 50.')


In [6]:
# Preview questions
for q in input_to_model['paragraphs'][0]['questions']:
    print(q)

Which NFL team represented the AFC at Super Bowl 50?
Which NFL team represented the NFC at Super Bowl 50?
Where did Super Bowl 50 take place?
Which NFL team won Super Bowl 50?
What color was used to emphasize the 50th anniversary of the Super Bowl?
What was the theme of Super Bowl 50?
What day was the game played on?
What is the AFC short for?


## Run Inference

We now pass the file to the MAX model via the `predict` function defined earlier and print the answers.

In [7]:
answers = predict('small-dev.json')
pp.pprint(answers)

{'predictions': [['Denver Broncos',
                  'Carolina Panthers',
                  "Levi's Stadium in the San Francisco Bay Area at Santa "
                  'Clara, California',
                  'Denver Broncos',
                  'gold',
                  'golden anniversary"',
                  'February 7, 2016',
                  'American Football Conference']],
 'status': 'ok'}
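
The printed response suggests that `predictions` holds one list of answers per paragraph, in the same order as the questions. Assuming that ordering holds, a sketch like the following pairs each question with its answer. `pair_answers` is a hypothetical helper, and the sample dictionaries are abbreviated copies of the data shown above, not a live API response.

```python
def pair_answers(payload, response):
    """Return (question, answer) tuples, assuming answers follow question order."""
    pairs = []
    for paragraph, answers in zip(payload["paragraphs"], response["predictions"]):
        questions = paragraph["questions"]
        if len(questions) != len(answers):
            raise ValueError("question and answer counts differ")
        pairs.extend(zip(questions, answers))
    return pairs


# Abbreviated copies of the input and output shown above (context omitted)
sample_payload = {
    "paragraphs": [
        {
            "context": "...",
            "questions": [
                "Which NFL team won Super Bowl 50?",
                "What day was the game played on?",
            ],
        }
    ]
}
sample_response = {
    "predictions": [["Denver Broncos", "February 7, 2016"]],
    "status": "ok",
}

for question, answer in pair_answers(sample_payload, sample_response):
    print(f"{question} -> {answer}")
```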


# Run inference on in-memory data

To pass in your own data, send it directly in a POST request.

In [8]:
info = "The choice among algorithm categories can partially be made based on the user persona's ability to intervene at different parts of a machine learning pipeline.  If the user is allowed to modify the training data, then pre-processing can be used.  If the user is allowed to change the learning algorithm, then in-processing can be used.  If the user can only treat the learned model as a black box without any ability to modify the training data or learning algorithm, then only post-processing can be used.  AIF360 recommends the earliest mediation category in the pipeline that the user has permission to apply because it gives the most flexibility and opportunity to correct bias as much as possible. If possible, all algorithms from all permissible categories should be tested because the ultimate performance depends on dataset characteristics: there is no one best algorithm independent of dataset."
data = {
   "paragraphs":[
      {
         "context":info,
         "questions":[
            "When can preprocessing be used?"
            ]
      }
   ]
}


In [9]:
answers = requests.post(url=url, json=data)
pp.pprint(answers.json())

{'predictions': [['If the user is allowed to modify the training data']],
 'status': 'ok'}
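
Both responses shown so far carry a `status` field with the value `'ok'`. A defensive caller might check that field before reading `predictions`. The sketch below assumes that any other value signals a failure; this notebook does not show what an error response looks like, and `extract_predictions` is a hypothetical helper.

```python
def extract_predictions(result):
    """Return the predictions list, or raise if the status is not 'ok'.

    Assumption: a status other than 'ok' indicates a failed request.
    """
    status = result.get("status")
    if status != "ok":
        raise RuntimeError(f"model returned status {status!r}")
    return result["predictions"]


# Response copied from the output above
result = {
    "predictions": [["If the user is allowed to modify the training data"]],
    "status": "ok",
}
print(extract_predictions(result)[0][0])
```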


You can also write the in-memory data above to disk in the same JSON format and then run inference as in the first use case: save the object to disk, then call the `predict` function. _Note_: The file must use the same JSON format as `../samples/small-dev.json`.

In [10]:
data_path = "example-data.json"
with open(data_path, "w") as f:
    json.dump(data, f)

In [11]:
answers = predict('example-data.json')
pp.pprint(answers)

{'predictions': [['If the user is allowed to modify the training data']],
 'status': 'ok'}
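
Because `predict` expects the file to follow the input format, it can help to check a saved file before sending it. The sketch below is one way to do that with only the standard library. `load_payload` is a hypothetical helper, and its checks reflect the structure described earlier in this notebook, not any documented server-side validation.

```python
import json
import os
import tempfile


def load_payload(path):
    """Load a JSON file and confirm it has the paragraphs/context/questions shape."""
    with open(path, encoding="utf-8") as f:
        payload = json.load(f)
    paragraphs = payload.get("paragraphs") if isinstance(payload, dict) else None
    if not isinstance(paragraphs, list):
        raise ValueError("top-level 'paragraphs' list is missing")
    for i, paragraph in enumerate(paragraphs):
        if not isinstance(paragraph, dict) or not isinstance(paragraph.get("context"), str):
            raise ValueError(f"paragraph {i} has no 'context' string")
        if not isinstance(paragraph.get("questions"), list):
            raise ValueError(f"paragraph {i} has no 'questions' list")
    return payload


# Round-trip a small payload through a temporary file
data = {"paragraphs": [{"context": "Some text.", "questions": ["A question?"]}]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example-data.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    loaded = load_payload(path)
print(loaded == data)
```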
