# MLMAC: Machine Learning Model Attribution Challenge

This notebook will get you up and running with the fine-tuned model API as well as the base models.

Challenge details are available on [Kaggle](https://www.kaggle.com/competitions/ml-model-attribution-2/).

See the official terms of service at [mlmac.io/terms](https://mlmac.io/terms).

## Initial Setup

Install the dependencies:

In [None]:
!pip install transformers > /dev/null
!pip install sentencepiece > /dev/null

Then import the dependencies:

In [None]:
import pickle
import requests
import time

from pprint import pprint
from transformers import pipeline
from transformers.pipelines.conversational import Conversation

## Model API Helper

Let's set up a helper class for interacting with the remote models. This lets us interact with them in a more natural way and handles some common errors. It also caches your queries (more on that later).

Example usage:

```
input = "The machine learning model attribution challenge is"
ft_model = Model("mlmac", MLMAC_API_TOKEN, 0)
output = ft_model(input)
```

```
{'status': 'failed', 'result': {'error': 'currently loading', 'estimated_time': 125.36245727539062}, 'queries': {'0': 15, '1': 8, '4': 3, '10': 4}}
attempt 1/10; waiting for 20 seconds
{'status': 'failed', 'result': {'error': 'currently loading', 'estimated_time': 125.36245727539062}, 'queries': {'0': 15, '1': 8, '4': 3, '10': 4}}
attempt 2/10; waiting for 20 seconds
{'status': 'failed', 'result': {'error': 'currently loading', 'estimated_time': 125.36245727539062}, 'queries': {'0': 15, '1': 8, '4': 3, '10': 4}}
attempt 3/10; waiting for 20 seconds
{'generated_text': 'The machine learning model attribution challenge is a good '
                   'start.'}
```
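
The output above includes an `estimated_time` field while the model is loading. As a minimal sketch (not part of the helper below), the wait could be derived from that field instead of a fixed delay. The function name `wait_seconds`, the 20-second default, and the 60-second cap are assumptions made for illustration, and the response shape is taken only from the example output above.

```python
def wait_seconds(error_body, default=20.0, cap=60.0):
    """Pick a retry delay from a 'currently loading' response body.

    Falls back to `default` when no usable estimate is present and
    never waits longer than `cap` seconds.
    """
    result = error_body.get("result") or {}
    estimate = result.get("estimated_time")
    if not isinstance(estimate, (int, float)) or estimate <= 0:
        return default
    return min(float(estimate), cap)


# Body copied from the example output above
body = {
    "status": "failed",
    "result": {"error": "currently loading", "estimated_time": 125.36245727539062},
}
print(wait_seconds(body))  # 60.0 (the estimate is capped)
print(wait_seconds({}))    # 20.0 (no estimate, so the default is used)
```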

In [None]:
class Model:
  def __init__(self, api, api_token, model_id, use_cache=True):
    self.api = api
    self.api_token = api_token
    self.model_id = model_id
    self.use_cache = use_cache
    self.cache = {}

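    if api == "hf":
      self.api_url = f"https://api-inference.huggingface.co/models/model-attribution-challenge/{model_id}"
    elif api == "mlmac":
      self.api_url = f"https://api.mlmac.io:8080/query?model={model_id}"
    else:
      # Fail early instead of hitting an AttributeError on the first call
      raise ValueError(f"Unknown api {api!r}; expected 'hf' or 'mlmac'")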

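  def __call__(self, input, max_retries=10, params=None, options=None):
    # Return the cached response if this exact input was queried before
    if self.use_cache and input in self.cache:
      return self.cache[input]

    if self.api == "hf":
      # None defaults avoid sharing one mutable dict between calls
      payload = {"inputs": input, "parameters": params or {}, "options": options or {}}
    elif self.api == "mlmac":
      payload = {"input": input}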

    headers = {"Authorization": f"Bearer {self.api_token}"}

    for retry in range(max_retries):
      response = requests.post(self.api_url, json=payload, headers=headers)
    
      if response.status_code == 200:
        if self.api == "hf":
          result = response.json()
        elif self.api == "mlmac":
          result = response.json().get("result")

        self.cache[input] = result

        return result
      elif response.status_code == 503:
        print(response.json())
        print(f"attempt {retry+1}/{max_retries}; waiting for 20 seconds")
        time.sleep(20.0)
      else: # error
        raise Exception(response.text)
    
    raise Exception(f"Failed after {max_retries} attempts")

## Fine-Tuned Models Setup

You will interact with the fine-tuned models via the mlmac.io API.

### MLMAC API Setup

Retrieve your API token [here](https://api.mlmac.io:8080/github/auth), enter it in the code block below, and run the code block.

In [None]:
MLMAC_API_TOKEN = ""

To verify your API token is working, let's hit the `status` endpoint. 
The `status` endpoint is useful to check your total number of queries. Requests to this endpoint do not count as queries. You can also see your status at [mlmac.io/status](https://mlmac.io/status).

Run the code block below. It will create a `status` helper function and execute it. You should see something like this:
```
{'api_key': 'your_api_key',
 'created': '2022-07-14 20:47:41.339519',
 'name': 'your_github_username',
 'queries': {'0': 15, '1': 8, '10': 4, '4': 3},
 'total_queries': 30}
```

In [None]:
def status(api_token):
  response = requests.get("https://api.mlmac.io:8080/status", headers={"Authorization": f"Bearer {api_token}"})
  return response.json()

status(MLMAC_API_TOKEN)
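
To keep an eye on how your queries are spread across models, the `queries` field can be summarized. The following is a minimal sketch, assuming the response has the shape of the sample shown above and that the keys of `queries` are model indices stored as strings; `summarize_queries` is a hypothetical helper, not part of the API.

```python
def summarize_queries(status_response):
    """Return ([(model_id, count), ...] sorted by numeric id, total count)."""
    queries = status_response.get("queries", {})
    # Keys arrive as strings, so convert them to sort 4 before 10
    per_model = sorted((int(model_id), count) for model_id, count in queries.items())
    total = sum(count for _, count in per_model)
    return per_model, total


# Sample taken from the example status output above
sample = {"queries": {"0": 15, "1": 8, "10": 4, "4": 3}, "total_queries": 30}
per_model, total = summarize_queries(sample)
print(per_model)  # [(0, 15), (1, 8), (4, 3), (10, 4)]
print(total)      # 30
```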

### Fine-Tuned Models

You can go ahead and create a `Model` instance for each fine-tuned model for easy access later.

In [None]:
ft_models = [Model("mlmac", MLMAC_API_TOKEN, idx) for idx in range(24)]

Now try it out (note that this will use one query). You should see something like:
```
{'generated_text': 'The machine learning model attribution challenge is a good '
                   'start.'}
```

In [None]:
input = "The machine learning model attribution challenge is"
output = ft_models[0](input)

pprint(output)

### Query Caching

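The `Model` class stores (query, response) pairs in the `cache` member variable. This helps you avoid spending queries on inputs you have already sent. In the cell below, `total_queries` should be the same before and after the repeated call, because the response comes from the cache.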

In [None]:
print("total_queries", status(MLMAC_API_TOKEN)["total_queries"])
pprint(ft_models[0](input))
print("total_queries", status(MLMAC_API_TOKEN)["total_queries"])
print("cache:")
pprint(ft_models[0].cache)
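
The caching behaviour can also be seen without spending any queries. The following is a minimal offline sketch: `CachedCallable` is a hypothetical stand-in that mirrors only the lookup-then-store logic of `Model`, and the lambda is a fake backend rather than a real model.

```python
class CachedCallable:
    """Hypothetical stand-in for Model: memoizes results by input string."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}
        self.calls = 0  # number of times the backend was actually invoked

    def __call__(self, text):
        if text in self.cache:
            return self.cache[text]
        self.calls += 1
        result = self.backend(text)
        self.cache[text] = result
        return result


# Fake backend that echoes the prompt; no network access involved
fake_model = CachedCallable(lambda text: {"generated_text": text + " ..."})
first = fake_model("hello")
second = fake_model("hello")
print(fake_model.calls)  # 1 (the second call was served from the cache)
```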

### Saving and Restoring Query Responses

You may want to save your query/response pairs to disk:

In [None]:
with open("./mlmac_ft_models.pkl", "wb") as f:
  pickle.dump(ft_models, f)

The file can be downloaded (and later re-uploaded) via the Files menu in the Colab sidebar.
You can restore your saved models like this:

In [None]:
with open("./mlmac_ft_models.pkl", "rb") as f:
  ft_models = pickle.load(f)

Confirm that you can see your cached responses:

In [None]:
ft_models[0].cache

If you like, you can mount your Google Drive and save your pkl file there for easy storage and retrieval, but [be careful who you colab with](https://medium.com/mlearning-ai/careful-who-you-colab-with-fa8001f933e7). 😈

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
with open("/content/drive/MyDrive/mlmac_ft_models.pkl", "wb") as f:
  pickle.dump(ft_models, f)

In [None]:
with open("/content/drive/MyDrive/mlmac_ft_models.pkl", "rb") as f:
  ft_models = pickle.load(f)
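
Note that pickling the `Model` objects also writes each object's `api_token` attribute into the file. As an alternative, here is a minimal sketch that stores only the caches as JSON. The helper names `dump_caches` and `load_caches` and the file name are hypothetical, and the sketch assumes every cached response is JSON-serializable (which should hold for values obtained from `response.json()`).

```python
import json
import os
import tempfile


def dump_caches(caches, path):
    """Write {model_id: {query: response}} to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        # JSON object keys must be strings, so model ids are converted
        json.dump({str(k): v for k, v in caches.items()}, f, ensure_ascii=False, indent=2)


def load_caches(path):
    """Read the caches back; model ids come back as strings."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Toy data standing in for real cached responses
caches = {0: {"The challenge is": {"generated_text": "The challenge is a good start."}}}
path = os.path.join(tempfile.gettempdir(), "mlmac_caches_example.json")
dump_caches(caches, path)
restored = load_caches(path)
print(restored["0"]["The challenge is"]["generated_text"])
```

With the real models, the caches could be collected with `{m.model_id: m.cache for m in ft_models}` and restored by assigning each entry back to the matching model's `cache` attribute, keeping in mind that the integer model ids come back as strings.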

## Base Models Setup

We can interact with the base models in two ways: via the Hugging Face API (similar to MLMAC), or directly via their Python interface.

### Hugging Face API Setup

Similar to the setup for the MLMAC API, we set our API Token and use our helper class to instantiate an interface to each base model.

You can create an API token [here](https://huggingface.co/settings/tokens). You'll need a Hugging Face account.

In [None]:
HF_API_TOKEN = ""

base_model_names = ["bert-base-cased", "bert-base-chinese", "bert-base-uncased",
                    "bloom-2b5", "bloom-350m","bloom-560m", 
                    "codegen-350M-multi", "DialoGPT-large", "distilgpt2",
                    "fairseq-dense-125M", "german-gpt2", "gpt-neo-125M", "gpt2",
                    "gpt2-chinese-cluecorpussmall", "gpt2-xl", "openai-gpt",
                    "opt-350m", "roberta-base", "xlnet-base-cased",]

base_models = {model_name: Model("hf", HF_API_TOKEN, model_name, use_cache=False) for model_name in base_model_names}

Let's test it out:

In [None]:
input = "The machine learning model attribution challenge is"
output = base_models["gpt2"](input)

pprint(output)

Some base models perform the "Fill-Mask" task and require a mask token as part of their input. For the BERT models the token is `[MASK]`.

In [None]:
input = "The machine learning model attribution challenge is [MASK]."
output = base_models["bert-base-cased"](input)

pprint(output)
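
The mask token is not the same for every model: BERT tokenizers use `[MASK]`, while RoBERTa uses `<mask>`. The following is a minimal sketch of a helper that fills in the right token. The `MASK_TOKENS` table is hand-written for illustration and covers only these four models, and `with_mask` and the `{mask}` placeholder are hypothetical names. When running a model locally, the tokenizer's `mask_token` attribute is the more reliable source.

```python
# Hand-written mapping (an assumption for illustration, not fetched from the API)
MASK_TOKENS = {
    "bert-base-cased": "[MASK]",
    "bert-base-chinese": "[MASK]",
    "bert-base-uncased": "[MASK]",
    "roberta-base": "<mask>",
}


def with_mask(template, model_name, placeholder="{mask}"):
    """Replace the placeholder in `template` with the model's mask token."""
    if model_name not in MASK_TOKENS:
        raise KeyError(f"No mask token recorded for {model_name!r}")
    return template.replace(placeholder, MASK_TOKENS[model_name])


template = "The machine learning model attribution challenge is {mask}."
print(with_mask(template, "bert-base-cased"))
print(with_mask(template, "roberta-base"))
```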

You can also pass parameters and options to Hugging Face as documented [here](https://huggingface.co/docs/api-inference/detailed_parameters).

In [None]:
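# Reset the prompt; the previous cell left a fill-mask input in `input`
input = "The machine learning model attribution challenge is"
params = {"do_sample": True, "num_return_sequences": 3}
options = {"use_cache": False}
output = base_models["gpt2"](input, params=params, options=options)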

pprint(output)
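
Note that the `Model` helper keys its cache on the input string alone, so if you enable `use_cache=True` for a base model, a second call with different `params` would return the first cached response. One way around this is to fold the parameters into the key. The following is a minimal sketch of that idea, assuming a hypothetical `cache_key` function that is not part of the helper:

```python
import json


def cache_key(text, params=None, options=None):
    """Build a deterministic string key from a query and its settings."""
    return json.dumps(
        {"input": text, "params": params or {}, "options": options or {}},
        sort_keys=True,  # makes the key independent of dict ordering
    )


prompt = "The machine learning model attribution challenge is"
plain = cache_key(prompt)
sampled = cache_key(prompt, params={"do_sample": True, "num_return_sequences": 3})
print(plain == sampled)  # False (different settings give different keys)
```

Using the returned string as the dictionary key in place of the raw input would keep responses for different settings apart.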

### Run Base Models in Colab

You may find it useful to have full access to and control over the base models. You can load the models directly using the transformers library: 

In [None]:
input = "The machine learning model attribution challenge is"

base_model = pipeline("text-generation", model="model-attribution-challenge/gpt2")
output = base_model(input)

pprint(output)