[![Github](https://img.shields.io/github/stars/lab-ml/python_autocomplete?style=social)](https://github.com/lab-ml/python_autocomplete)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/evaluate.ipynb)

# Evaluate a model trained on predicting Python code

This notebook evaluates a model trained on Python code.

Here's a link to the [training notebook](https://github.com/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lab-ml/python_autocomplete/blob/master/notebooks/train.ipynb)

### Install dependencies

In [1]:
!pip install labml labml_python_autocomplete

Collecting labml_python_autocomplete
  Downloading labml_python_autocomplete-0.0.5-py3-none-any.whl (14 kB)
Collecting labml
  Downloading labml-0.4.101-py3-none-any.whl (101 kB)
Collecting labml-helpers>=0.4.70
  Downloading labml_helpers-0.4.74-py3-none-any.whl (15 kB)
Collecting labml-nn>=0.4.70torch
  Downloading labml_nn-0.4.86-py3-none-any.whl (145 kB)
Collecting einops
  Downloading einops-0.3.0-py2.py3-none-any.whl (25 kB)
Installing collected packages: labml, labml-helpers, einops, labml-nn, labml-python-autocomplete
  Attempting uninstall: labml
    Found existing installation: labml 0.4.54
    Uninstalling labml-0.4.54:
      Successfully uninstalled labml-0.4.54
Successfully installed einops-0.3.0 labml-0.4.101 labml-helpers-0.4.74 labml-nn-0.4.86 labml-python-autocomplete-0.0.5


### Imports

In [47]:
import string

import torch
from torch import nn

import numpy as np

from labml import experiment, logger, lab
from labml_helpers.module import Module
from labml.analytics import ModelProbe
from labml.logger import Text, Style, inspect
from labml.utils.pytorch import get_modules
from labml.utils.cache import cache
from labml_helpers.datasets.text import TextDataset

from python_autocomplete.train import Configs
from python_autocomplete.evaluate import Predictor
from python_autocomplete.evaluate.beam_search import NextWordPredictionComplete

We load the model from a training run. For this demo I'm loading from a run I trained at home.

[![View Run](https://img.shields.io/badge/labml-experiment-brightgreen)](https://web.lab-ml.com/run?uuid=39b03a1e454011ebbaff2b26e3148b3d)

If you have a locally trained model, load it directly with:

```python
run_uuid = 'RUN_UUID'
checkpoint = None  # Get latest checkpoint
```

`load_bundle` will download an archive with a saved checkpoint (pretrained model).

In [2]:
# run_uuid = 'a6cff3706ec411ebadd9bf753b33bae6'
# checkpoint = None

run_uuid, checkpoint = experiment.load_bundle(
    lab.get_path() / 'saved_checkpoint.tar.gz',
    url='https://github.com/lab-ml/python_autocomplete/releases/download/0.0.5/bundle.tar.gz')

We initialize the `Configs` object defined in [`train.py`](https://github.com/lab-ml/python_autocomplete/blob/master/python_autocomplete/train.py).

In [3]:
conf = Configs()

Create a new experiment in evaluation mode. In evaluation mode, a new training run is not created.

In [4]:
experiment.evaluate()

Load custom configurations/hyper-parameters used in the training run.

In [5]:
custom_conf = experiment.load_configs(run_uuid)
custom_conf

{'epochs': 32,
 'is_token_by_token': True,
 'mem_len': 256,
 'model': 'transformer_xl_model',
 'n_layers': 6,
 'optimizer.learning_rate': 0.000125,
 'optimizer.optimizer': 'AdamW',
 'state_updater': 'transformer_memory',
 'text.batch_size': 12,
 'text.is_shuffle': False,
 'text.seq_len': 256,
 'text.tokenizer': 'bpe'}

Set the custom configurations. Uncomment the next line to run without CUDA.

In [6]:
# custom_conf['device.use_cuda'] = False

In [7]:
experiment.configs(conf, custom_conf)
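To illustrate how a flat dictionary with dotted keys such as `'text.seq_len'` can override nested defaults, here is a minimal sketch. It is not labml's implementation: `apply_overrides` and the default values are hypothetical and exist only for this example.

```python
from copy import deepcopy


def apply_overrides(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with dotted-key `overrides` applied."""
    merged = deepcopy(defaults)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split('.')
        node = merged
        for name in parents:
            # Create intermediate dictionaries on demand
            node = node.setdefault(name, {})
        node[leaf] = value
    return merged


# Hypothetical defaults; the real defaults live in `Configs`
defaults = {'n_layers': 2, 'text': {'seq_len': 512, 'batch_size': 16}}
# A subset of the values loaded from the training run
overrides = {'n_layers': 6, 'text.seq_len': 256, 'text.batch_size': 12}

merged = apply_overrides(defaults, overrides)
print(merged)
```

The sketch copies the defaults before merging, so the original dictionary stays untouched and can be reused.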

Set models for saving and loading. This will load `conf.model` from the specified run.

In [8]:
experiment.add_pytorch_models({'model': conf.model})

Specify which run to load from

In [9]:
experiment.load(run_uuid, checkpoint)

Start the experiment

In [10]:
experiment.start()

<labml.internal.experiment.watcher.ExperimentWatcher at 0x7f8eaf318310>

Initialize the `Predictor` defined in [`evaluate.py`](https://github.com/lab-ml/python_autocomplete/blob/master/python_autocomplete/evaluate.py).

We load `stoi` and `itos` from the cache so that we don't have to read the dataset to generate them. `stoi` maps each character to an integer index, and `itos` maps each integer index back to its character. These indexes select the model's embedding for each character.

In [11]:
p = Predictor(conf.model, conf.text.tokenizer,
              state_updater=conf.state_updater,
              is_token_by_token=conf.is_token_by_token)
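The `stoi` / `itos` mappings described above can be sketched on a toy string. This is only an illustration under assumptions: the sample text and the sorted ordering are made up here, and the notebook loads its mappings from the cache rather than building them this way.

```python
# Toy corpus (an assumption for illustration only)
text = "def add(a, b):\n    return a + b\n"

# `itos`: integer index -> character, in a stable (sorted) order
itos = sorted(set(text))
# `stoi`: character -> integer index
stoi = {ch: i for i, ch in enumerate(itos)}

# Encode the text as indexes, then decode it back
encoded = [stoi[ch] for ch in text]
decoded = ''.join(itos[i] for i in encoded)

print(len(itos), encoded[:8])
```

Because the two mappings are inverses of each other, decoding the encoded sequence reproduces the original text.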

Set the model to evaluation mode

In [12]:
_ = conf.model.eval()

Set up probing to extract attentions

In [13]:
probe = ModelProbe(conf.model)

A Python prompt to test completion.

In [20]:
PROMPT = """from torch import nn

from labml_helpers.module import Module
from labml_nn.lstm import LSTM


class LSTM(Module):
    def __init__(self, *,
                 n_tokens: int,
                 embedding_size: int,
                 hidden_size int,
                 n_layers int):
        """

Get the next token. The predictor extends the prompt until it finds an end-of-token character (a non-alphanumeric character). The cells below call `get_next_word`, which returns several candidate completions with their probabilities.

In [21]:
stripped, prompt = p.rstrip(PROMPT)
rest = PROMPT[len(stripped):]
prediction_complete = NextWordPredictionComplete(rest, 5)
prompt = torch.tensor(prompt, dtype=torch.long).unsqueeze(-1)
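The stopping rule described above (extend the prediction until a non-alphanumeric character appears) can be illustrated with a toy greedy loop. This is a sketch under assumptions, not the notebook's `Predictor`: `toy_next_char_probs` is a hypothetical hand-written lookup table standing in for the model, and `greedy_token` is a made-up helper.

```python
def toy_next_char_probs(context: str) -> dict:
    """Hypothetical stand-in for a model: next-character probabilities."""
    table = {
        's': {'u': 0.9, 'e': 0.1},
        'u': {'p': 0.8, 'm': 0.2},
        'p': {'e': 0.7, '(': 0.3},
        'e': {'r': 0.6, ' ': 0.4},
        'r': {'(': 0.9, 's': 0.1},
    }
    # Look only at the last character; unknown contexts predict a space
    return table.get(context[-1:], {' ': 1.0})


def greedy_token(context: str, max_len: int = 20) -> str:
    """Append the most likely character until a non-alphanumeric one appears."""
    token = ''
    for _ in range(max_len):
        probs = toy_next_char_probs(context + token)
        next_char = max(probs, key=probs.get)
        if not next_char.isalnum():
            break  # End-of-token character reached
        token += next_char
    return token


print(greedy_token('s'))
```

The `max_len` bound keeps the loop finite even if the toy table never produces an end-of-token character.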

In [22]:
%%time
predictions = p.get_next_word(prompt, None, rest, [1.], prediction_complete, 5)
predictions.sort(key=lambda x: -x[0])
[(pred.prob, pred.text[len(rest):]) for pred in predictions]

CPU times: user 193 ms, sys: 47.4 ms, total: 241 ms
Wall time: 218 ms


[(0.07585359086023313, 'super'),
 (0.0010850478390376612, '"""\n        '),
 (0.0007989550358615816, '        ')]
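The ranking step in the cell above sorts candidates by probability and slices off the part of the text that was already typed (`rest`). The sketch below reproduces that step on made-up data. `Prediction` is a hypothetical stand-in for the predictor's result type, and the probabilities and words are invented for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for the predictor's result type
Prediction = namedtuple('Prediction', ['prob', 'text'])

rest = 'sup'  # Partial word already typed (toy value)
predictions = [
    Prediction(0.02, 'supports'),
    Prediction(0.61, 'super'),
    Prediction(0.07, 'suppress'),
]

# Highest probability first; `x[0]` is `prob` for this namedtuple
predictions.sort(key=lambda x: -x[0])
# Drop the already-typed prefix so only the suggested suffix remains
suffixes = [(pred.prob, pred.text[len(rest):]) for pred in predictions]
print(suffixes)
```

Slicing with `len(rest)` assumes every candidate text starts with `rest`, which holds for the toy data here.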