# Use a locally trained model

Congratulations, you just trained your own model. We'll assume that the model folder is named ``my_model-1.0`` and is located in ``/work``. (See the previous tutorial on how to [train your models](#) for context.)

This notebook looks at how to use that locally trained model from your own code.
We will learn how to (1) create a small project that can import NLP-Cube, (2) load the local model together with its embeddings file and then (3) run it on some text.

This notebook was run on Ubuntu 18.04 with Python 3 installed. We assume we are working in folder ``/work`` and that NLP-Cube is installed locally in ``/work/NLP-Cube``. If you do not have NLP-Cube installed locally (that is, by directly cloning the GitHub repo rather than using pip install), please first follow the [local install guide](#).


### Using NLP-Cube from your own project

Let's create a "project" named ``my_project`` with a single ``main.py`` file, and place it in ``/work``.

The folder should look like:

In [1]:
! ls /work/my_project

main.py


The file has the following code:

```
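import os
import sys

# /work/my_project/main.py -> /work (two levels up from this file)
root_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
nlp_cube_path = os.path.join(root_path, "NLP-Cube")
# Make the locally cloned NLP-Cube importable
sys.path.insert(0, nlp_cube_path)

# Root folder that contains the my_model-1.0 subfolder
my_model_root = "/work"
# Embeddings file used by the locally trained model
local_embeddings_file = os.path.join("/work", "wiki.en.vec")

from cube.api import Cube
cube = Cube(verbose=True)
cube.load(language_code="my_model", version="1.0", local_models_repository=my_model_root, local_embeddings_file=local_embeddings_file)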

text = "I'm a success today because I had a friend who believed in me and I didn't have the heart to let him down. This is a quote by Abraham Lincoln."
sentences = cube(text)
for sentence in sentences:
    print()
    for token in sentence:
        line = ""
        line += str(token.index) + "\t"
        line += token.word + "\t"
        line += token.lemma + "\t"
        line += token.upos + "\t"
        line += token.xpos + "\t"
        line += token.attrs + "\t"
        line += str(token.head) + "\t"
        line += token.label + "\t"
        line += token.deps + "\t"
        line += token.space_after
        print(line)
```

Running it will print:

In [22]:
! cd /work/my_project && python3 main.py

[dynet] random seed: 3654745557
[dynet] allocating memory: 512MB
[dynet] memory allocation done.
	Loading embeddings... 
	Loading tokenization model ...
	Lemmatizer is not available on this model. 
	Loading tagger model ...
	Parsing is not available on this model... 
Model loading complete.


1	I	_	PRON	PRP	Case=Nom|Number=Sing|Person=1|PronType=Prs	0	_	_	SpaceAfter=No
2	'm	_	AUX	VBP	Mood=Ind|Tense=Pres|VerbForm=Fin	0	_	_	_
3	a	_	DET	DT	Definite=Ind|PronType=Art	0	_	_	_
4	success	_	NOUN	NN	Number=Sing	0	_	_	_
5	today	_	NOUN	NN	Number=Sing	0	_	_	_
6	because	_	SCONJ	IN	_	0	_	_	_
7	I	_	PRON	PRP	Case=Nom|Number=Sing|Person=1|PronType=Prs	0	_	_	_
8	had	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin	0	_	_	_
9	a	_	DET	DT	Definite=Ind|PronType=Art	0	_	_	_
10	friend	_	NOUN	NN	Number=Sing	0	_	_	_
11	who	_	PRON	WP	PronType=Rel	0	_	_	_
12	believed	_	VERB	VBD	Mood=Ind|Tense=Past|VerbForm=Fin	0	_	_	_
13	in	_	ADP	IN	_	0	_	_	_
14	me	_	PRON	PRP	Case=Acc|Number=Sing|Person=1|PronType=Prs	0	_	_	_
15	and	_	C

That's it. You just tokenized and tagged your own set of sentences.

---

#### A few details

Below we'll analyze each line of the script:

We start with:
```
import os
import sys

root_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
nlp_cube_path = os.path.join(root_path, "NLP-Cube")
sys.path.insert(0, nlp_cube_path)
```
where we manually insert the path to NLP-Cube. Remember that ``NLP-Cube``, ``my_project`` and ``my_model-1.0`` are all located in ``/work``. Simply compose the path from your script to NLP-Cube, or use any other method that gives your script access to NLP-Cube's libraries (like extending NLP-Cube or including it in your own project as a package).
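If you want the path setup to fail early when the folder layout is not what you expect, the same idea can be wrapped in a small function. This is only a sketch: ``add_sibling_to_path`` is a hypothetical helper written for this tutorial, not something NLP-Cube provides, and it assumes NLP-Cube sits next to your project folder as described above.

```python
import os
import sys


def add_sibling_to_path(script_file, sibling_name="NLP-Cube"):
    """Put <parent of the script's folder>/<sibling_name> at the front of sys.path.

    Hypothetical helper for illustration; it is not part of NLP-Cube.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    root_path = os.path.dirname(script_dir)
    sibling_path = os.path.join(root_path, sibling_name)
    # Fail with a clear message instead of a later ImportError
    if not os.path.isdir(sibling_path):
        raise FileNotFoundError("Expected a folder at " + sibling_path)
    if sibling_path not in sys.path:
        sys.path.insert(0, sibling_path)
    return sibling_path
```

In ``main.py`` this would be called as ``add_sibling_to_path(__file__)`` before the ``from cube.api import Cube`` line.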

```
my_model_root = "/work" 
local_embeddings_file = os.path.join("/work","wiki.en.vec")
```
NLP-Cube's model library is structured as a "root" folder with one subfolder per model. This is why we specify ``/work`` as the model root folder and let NLP-Cube select ``my_model-1.0`` as the model we want. Also, for locally trained models we need to specify a path to our vector embeddings file, because we don't have a metadata.json file yet (we'll see in the next [tutorial](#) how to create one). Our embeddings file is simply ``/work/wiki.en.vec``.
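To check what a root folder contains before loading anything, you can list its subfolders yourself. The sketch below assumes model folders are named ``<name>-<version>`` with a numeric version, which is inferred from ``my_model-1.0`` and may not cover every layout NLP-Cube accepts. ``list_local_models`` is a hypothetical helper, not an NLP-Cube function.

```python
import os


def list_local_models(models_root):
    """Return (name, version) pairs for subfolders named '<name>-<version>'.

    Assumption: the version is numeric, e.g. 'my_model-1.0'.
    """
    models = []
    for entry in sorted(os.listdir(models_root)):
        # Skip plain files such as the embeddings file
        if not os.path.isdir(os.path.join(models_root, entry)):
            continue
        name, sep, version = entry.rpartition("-")
        # Skip folders like 'NLP-Cube' whose suffix is not a number
        if sep and name and version.replace(".", "", 1).isdigit():
            models.append((name, version))
    return models
```

For the layout used in this tutorial, ``list_local_models("/work")`` should report ``my_model`` with version ``1.0`` and ignore ``NLP-Cube``, ``my_project`` and ``wiki.en.vec``.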

```
from cube.api import Cube
cube=Cube(verbose=True)
cube.load(language_code="my_model", version="1.0", local_models_repository=my_model_root, local_embeddings_file=local_embeddings_file)
```

Here we import the Cube object, initialize it, and load our model. Notice that we specify which model we want by name in ``language_code="my_model"`` and its version in ``version="1.0"``. If we don't specify a version, NLP-Cube will automatically pick the latest (highest-numbered) version.

Because we trained a local model, we need to specify the local model repository, in this case simply ``/work``, and the full path to the local embeddings file.
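The "latest version" rule mentioned above can be illustrated in a few lines. This is a sketch of the rule as described in this tutorial, assuming versions compare as plain numbers. It is not NLP-Cube's own code, which may compare versions differently.

```python
def latest_version(versions):
    """Return the version string with the largest numeric value.

    Assumption: each version string parses as a number, e.g. '1.0'.
    """
    return max(versions, key=float)


# Hypothetical set of versions found for one model name
available = ["0.9", "1.0", "1.1"]
chosen = latest_version(available)
```

Passing ``version="1.0"`` explicitly, as ``main.py`` does, avoids relying on this rule at all.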

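Next, calling ``sentences = cube(text)`` will automatically process the text. The result can be iterated sentence by sentence and token by token, which is how the script prints one tab-separated line per token.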
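If you would rather collect the output as a string than print it (for example, to write it to a file), the printing loop from ``main.py`` can be expressed as two small functions. The sketch below uses a stand-in ``Token`` so it runs without a loaded model. It assumes only the attribute names already used in ``main.py``, and the function names are made up for this example.

```python
from collections import namedtuple

# Stand-in for the token objects returned by cube(text); only the attribute
# names used in main.py are assumed here.
Token = namedtuple(
    "Token", "index word lemma upos xpos attrs head label deps space_after"
)


def token_to_line(token):
    """Format one token as a tab-separated line, in the same order as main.py."""
    fields = [token.index, token.word, token.lemma, token.upos, token.xpos,
              token.attrs, token.head, token.label, token.deps, token.space_after]
    return "\t".join(str(field) for field in fields)


def sentences_to_text(sentences):
    """Join the token lines of each sentence, with a blank line between sentences."""
    blocks = ["\n".join(token_to_line(t) for t in sentence) for sentence in sentences]
    return "\n\n".join(blocks)


# The first token from the output shown earlier in this notebook
sample = Token(1, "I", "_", "PRON", "PRP",
               "Case=Nom|Number=Sing|Person=1|PronType=Prs", 0, "_", "_", "SpaceAfter=No")
```

With a loaded model, ``sentences_to_text(cube(text))`` would be the equivalent call.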

---

In the following tutorial we'll cover how to [export your model for others to use](#) in their own NLP-Cube installation. 
