# Advanced usage - Model export and import

Following the previous model training tutorial, we now have a working model located in ``/work/my_model-1.0``. Here we'll learn how to export that model as a zip file and then import it into the local model library.


## A look at NLP-Cube's model storage format

NLP-Cube's model library lives in the hidden ``.nlpcube`` folder in the user's home directory. This "root" folder contains the ``embeddings`` folder, where all embeddings are stored, plus any number of other folders, each named ``model_name-model_version``, where the actual models are stored. This is why we created a **``my_model-1.0``** folder: the part before the hyphen is the **model's name** that we'll use at runtime, and **1.0** is the version of this model. This way we can keep several versions of a model for a language, and calling NLP-Cube's ``.load()`` API with the default version parameter will always load the latest version of a given model.

There are only two rules for model naming:
1. The name can contain only a single hyphen, which separates the model name from the version.
2. The version must be a number, with or without a decimal point (e.g. ``1`` or ``1.0``).
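
To make the naming rules concrete, here is a minimal sketch of how a folder name could be split into name and version, and how the latest version of a model could be picked. The helpers ``parse_model_folder`` and ``latest_version`` are hypothetical names used only for illustration; they are not part of NLP-Cube's API, and the library's own logic may differ.

```python
from typing import Iterable, Optional, Tuple


def parse_model_folder(folder_name: str) -> Tuple[str, float]:
    """Split 'model_name-model_version' into (name, version).

    Hypothetical helper that only mirrors the two naming rules above.
    """
    # Rule 1: exactly one hyphen, separating the name from the version.
    if folder_name.count("-") != 1:
        raise ValueError(f"expected exactly one hyphen in {folder_name!r}")
    name, version_text = folder_name.split("-")
    if not name:
        raise ValueError(f"empty model name in {folder_name!r}")
    # Rule 2: the version must be a number; float() raises ValueError otherwise.
    return name, float(version_text)


def latest_version(folder_names: Iterable[str], model_name: str) -> Optional[float]:
    """Return the highest version found for model_name, or None if there is none."""
    versions = []
    for folder in folder_names:
        try:
            name, version = parse_model_folder(folder)
        except ValueError:
            continue  # not a model folder, e.g. 'embeddings'
        if name == model_name:
            versions.append(version)
    return max(versions) if versions else None


folders = ["embeddings", "my_model-1.0", "my_model-1.1", "en-1.0"]
print(parse_model_folder("my_model-1.0"))   # ('my_model', 1.0)
print(latest_version(folders, "my_model"))  # 1.1
```

Because versions are compared as numbers in this sketch, ``1.10`` and ``1.1`` would count as the same version.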

We'll first look at how to export a locally trained model, then at how to import it, and finally at how to manage our model library programmatically.

## 1. Export a locally-trained model 

Packaging a model takes two steps: first we create a metadata file that tells NLP-Cube instances about the model being loaded (path to embeddings, etc.), and then we zip everything into a single file.

### 1.1. Create metadata for the model

In ``NLP-Cube/examples`` we have a ``metadata.json`` template file. NLP-Cube uses this file to store data about a packaged model that we use and can further redistribute. Copy this file to ``my_model-1.0`` and edit it. Let's say we edited the file as follows:
```
{
    "embeddings_file_name": "wiki.en.vec",
    "embeddings_remote_link": "https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.vec",
    "language": "UD_English",
    "language_code": "my_model",
    "model_build_date": "2018-10-19",
    "model_build_source": "UD_English-ParTuT",
    "model_version": 1.0,
    "notes": "Source: ud-treebanks-v2.2",
    "token_delimiter": " "
}
```
Let's look at each field in detail (required fields are in bold):
1. **``embeddings_file_name``** is the local file name of the embeddings. It can be shared among versions and models, so that we don't keep multiple copies of the same embeddings. If in doubt, just keep the original file name.
2. **``embeddings_remote_link``** is the web link from which NLP-Cube automatically downloads the embeddings file. As we only want to distribute the model and not the (very large!) embeddings file, use this to point to an online file - in this case we're using the FastText wiki embeddings for English. Otherwise leave it empty, but be sure to manually copy the embeddings file to ``/home/__your-username__/.nlpcube/embeddings/`` and give it the same name as ``embeddings_file_name``.
3. **``language_code``** is the short code for the model.
4. **``model_version``** is the version (read as a float number) of the model. Note that the model will be named **``language_code-model_version.zip``**, and will be loaded as ``.load("language_code")``.
5. **``token_delimiter``** is used to differentiate between languages that use a space as a word delimiter and those that don't (e.g. Japanese, Chinese). Set this to "" (empty string) for these particular languages; otherwise keep a single space as in the default example.
6. ``language``, ``model_build_date``, ``model_build_source``, and ``notes`` are for reference purposes. Set them to whatever you like, and access them in Python with the ``metadata.info()`` function.
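
Before packaging, it can help to sanity-check the edited file. The following is a minimal sketch, assuming only the field descriptions above: the ``check_metadata`` helper and the ``REQUIRED_KEYS`` tuple are hypothetical and not part of NLP-Cube, and the no-hyphen check on ``language_code`` is inferred from the naming rules rather than taken from the library.

```python
import json

# The fields shown in bold above (assumed to be the required ones).
REQUIRED_KEYS = (
    "embeddings_file_name",
    "embeddings_remote_link",
    "language_code",
    "model_version",
    "token_delimiter",
)


def check_metadata(metadata: dict) -> str:
    """Validate the metadata and return the expected zip file name."""
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    # Inferred from the naming rules: the only hyphen separates name and version.
    if "-" in metadata["language_code"]:
        raise ValueError("language_code must not contain a hyphen")
    version = float(metadata["model_version"])  # the version is read as a float
    return f"{metadata['language_code']}-{version}.zip"


metadata = json.loads("""
{
    "embeddings_file_name": "wiki.en.vec",
    "embeddings_remote_link": "https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.vec",
    "language": "UD_English",
    "language_code": "my_model",
    "model_build_date": "2018-10-19",
    "model_build_source": "UD_English-ParTuT",
    "model_version": 1.0,
    "notes": "Source: ud-treebanks-v2.2",
    "token_delimiter": " "
}
""")
print(check_metadata(metadata))  # my_model-1.0.zip
```

In practice the dictionary would come from ``json.load`` on the ``metadata.json`` file in the model folder instead of an inline string.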


### 1.2. Export the model as a zip file

Run the ``export_model.py`` script located in the ``NLP-Cube/scripts`` folder.
Because in the previous tutorial we trained only a tokenizer and a tagger, we pass only the ``--tokenizer`` and ``--tagger`` flags (so the packager won't look for other components), along with the path to our model:


```
python3 /work/NLP-Cube/scripts/export_model.py /work/my_model-1.0 --tokenizer --tagger

[dynet] random seed: 2748650582
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
Usage: python3 export_model.py path-to-my-model --tokenizer(optional) --compound-word-expander(optional) --lemmatizer(optional) --tagger(optional) --parser(optional)
Example: 'python3 export_model.py path-to-my-model --tokenizer --tagger' will create a zip file named 'language_code-model_version.zip' (taken from the metadata.json) containing a tokenizer and a tagger.

	Model folder: /work/my_model-1.0
	Use tokenizer: True
	Use compound word expander: False
	Use lemmatizer: False
	Use tagger: True
	Use parser: False

	Writing model to temp folder: /tmp/tmpgr_njhs5/my_model-1.0
	Tokenizer model found.
	Tagger model found.
	Compressing model ...
	Cleaning up ...
Model packaged successfully as: /work/my_model-1.0.zip
```


We now have a ``my_model-1.0.zip`` file located in ``/work``. This file can be redistributed to anybody to import and use. _Note that the embeddings must be either available online or manually copied locally (details above)._
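
As a quick check before handing the zip file to someone else, we can sketch the embeddings rule in a few lines. The ``embeddings_status`` helper is hypothetical, and the order of the checks (local copy first, then remote link) is an assumption made for illustration, not NLP-Cube's documented behavior.

```python
import tempfile
from pathlib import Path


def embeddings_status(metadata: dict, embeddings_dir: Path) -> str:
    """Classify where the embeddings for a model would come from.

    Returns 'local' if the file is already in embeddings_dir, 'remote' if a
    download link is set, and 'missing' if neither applies.
    """
    if (embeddings_dir / metadata["embeddings_file_name"]).is_file():
        return "local"
    if metadata.get("embeddings_remote_link"):
        return "remote"
    return "missing"


# Demo in a throwaway folder; a real check would point at the embeddings
# folder inside the .nlpcube directory in the user's home.
with tempfile.TemporaryDirectory() as tmp:
    embeddings_dir = Path(tmp)
    offline = {"embeddings_file_name": "wiki.en.vec", "embeddings_remote_link": ""}
    print(embeddings_status(offline, embeddings_dir))  # missing
    (embeddings_dir / "wiki.en.vec").touch()           # simulate a manual copy
    print(embeddings_status(offline, embeddings_dir))  # local
```

A ``missing`` result means the recipient would have to copy the embeddings file by hand before the model can be used.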


## 2. Import a .zip model

Suppose we have just received a .zip model and want to import it into NLP-Cube's model library. From ``NLP-Cube/scripts``, run the ``import_model.py`` script with the path to the .zip file:

```
cd /work/NLP-Cube/scripts && python3 import_model.py /work/my_model-1.0.zip

Importing model /work/my_model-1.0.zip

Model extracted successfully.
Checking for associated vector embeddings file [wiki.en.vec] ...
Embeddings downloaded successfully.
Model /work/my_model-1.0.zip was successfully imported.
```


The model is now imported, and from now on we can load it by simply calling ``cube.load("my_model")``.

## 3. Manage NLP-Cube's model library programmatically

Either modify the contents of ``~/.nlpcube`` manually, or import the ``ModelStore`` class and instantiate it:

```
from cube.io_utils.model_store import ModelStore
model_store_object = ModelStore()
```

To **view local models**, run:
```
local_models = model_store_object.list_local_models()
print(local_models)
```

To **view available online models**, run:
```
online_models = model_store_object.list_online_models()
print(online_models)
```

To **delete a local model**, run:
```
model_store_object.delete_model(lang_code="my_model", version="1.0")
```
or adjust the parameters accordingly. Note that the associated embeddings file will also be deleted, unless it is used by at least one other existing model.
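
The embeddings rule can be illustrated with a small stand-alone sketch. It does not call ``ModelStore``; the ``embeddings_can_be_deleted`` function and the ``library`` mapping (model folder name to ``embeddings_file_name``) are hypothetical and only model the rule stated above.

```python
from typing import Dict


def embeddings_can_be_deleted(model_to_delete: str, embeddings_by_model: Dict[str, str]) -> bool:
    """Return True if no other model uses the same embeddings file."""
    target = embeddings_by_model[model_to_delete]
    return not any(
        embeddings == target
        for model, embeddings in embeddings_by_model.items()
        if model != model_to_delete
    )


# Hypothetical library contents: model folder -> embeddings_file_name.
library = {
    "my_model-1.0": "wiki.en.vec",
    "my_model-1.1": "wiki.en.vec",
    "ro-1.0": "wiki.ro.vec",
}
print(embeddings_can_be_deleted("my_model-1.0", library))  # False, still used by my_model-1.1
print(embeddings_can_be_deleted("ro-1.0", library))        # True, no other model uses it
```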

