# spaCy Tutorial, Continued

This walkthrough is based on [this spaCy tutorial](https://github.com/explosion/spaCy/blob/master/examples/training/train_textcat.py).

The tutorial trains a convolutional neural network text classifier on the
IMDB dataset, using the `TextCategorizer` component. The dataset is loaded
automatically via Thinc's built-in dataset loader. The model is added to
`spacy.pipeline`, and its predictions are available via `doc.cats`.

## Set Up Environment

This notebook has been tested with the package versions below.  
Depending on your Python environment, you may need to change `pip` to `pip3`.

In [None]:
# Python >3.5
!pip install verta
!pip install spacy==2.1.6
!python -m spacy download en

## Set Up Verta

In [3]:
HOST = 'sandbox.app.verta.ai'

PROJECT_NAME = 'Film Review Classification'
EXPERIMENT_NAME = 'spaCy CNN'

In [4]:
# import os
# os.environ['VERTA_EMAIL'] = 
# os.environ['VERTA_DEV_KEY'] = 
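
The client output further down reports that the email and developer key were read from the environment. A minimal sketch for checking that both variables are set before constructing the client follows; `missing_credentials` is a hypothetical helper written for illustration and is not part of the verta package.

```python
import os

# Environment variable names taken from the commented-out cell above.
REQUIRED_VARS = ("VERTA_EMAIL", "VERTA_DEV_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of the verta package.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Set these before creating the Client:", ", ".join(missing))
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching the real environment.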

In [5]:
from verta import Client
from verta.utils import ModelAPI

client = Client(HOST, use_git=False)

proj = client.set_project(PROJECT_NAME)
expt = client.set_experiment(EXPERIMENT_NAME)
run = client.set_experiment_run()

set email from environment
set developer key from environment
connection successfully established
set existing Project: Film Review Classification
set existing Experiment: spaCy CNN
created new ExperimentRun: Run 1525815658107422336361


---

## Imports

In [6]:
import random

import six

import numpy as np
import thinc.extra.datasets
import spacy
from spacy.util import minibatch, compounding

# Reconstitute a run

In [11]:
# replace 'some-id' with the ID of a previously logged Experiment Run
run = expt.expt_runs.find("id == 'some-id'")[0]

In [12]:
# test the logged model
print("Loading from verta..")
nlp2 = run.get_model('final_model')

Loading from verta..



In [13]:
test_text = "I would definitely watch this again!"
doc2 = nlp2(test_text)
print(test_text)
print(doc2.cats)

I would definitely watch this again!
{'POSITIVE': 0.995653510093689, 'NEGATIVE': 0.004346499219536781}
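
`doc.cats` is a dictionary that maps each label to a score, as the output above shows. A minimal sketch for reducing it to a single predicted label follows; `top_category` is a hypothetical helper name, and the scores are copied from the output above rather than computed here.

```python
def top_category(cats):
    """Return the (label, score) pair with the highest score in a cats dict."""
    label = max(cats, key=cats.get)
    return label, cats[label]


# Scores copied from the notebook output above.
cats = {"POSITIVE": 0.995653510093689, "NEGATIVE": 0.004346499219536781}
label, score = top_category(cats)
print(label, round(score, 3))
```

With the notebook's objects, the call would be `top_category(doc2.cats)`.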


In [14]:
run.log_metric("val_metric_direct", 0.5)

---

# Test on a deployed model

Open your Experiment Run in the Verta Web App and deploy it.  
Once the deployment is ready, you can make predictions against the deployed model.

In [15]:
from verta._demo_utils import DeployedModel

deployed_model = DeployedModel(HOST, run.id)

In [25]:
deployed_model.predict(["I would definitely watch this again!"])

['POSITIVE']
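
A deployment may take some time to become ready, so the first calls can fail. The sketch below wraps any prediction callable in a simple retry loop. `predict_with_retry` and its parameters are hypothetical, and because this notebook does not show which exception a not-yet-ready deployment raises, the sketch assumes that catching any exception is acceptable.

```python
import time


def predict_with_retry(predict_fn, inputs, attempts=5, delay=2.0):
    """Call predict_fn(inputs), retrying after any exception.

    Hypothetical helper: the last failure is re-raised once attempts run out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return predict_fn(inputs)
        except Exception as exc:  # assumption: any error may mean "not ready yet"
            if attempt == attempts:
                raise
            print("attempt {} failed ({}); retrying".format(attempt, exc))
            time.sleep(delay)


# With the notebook's objects, usage would look like:
# predict_with_retry(deployed_model.predict, ["I would definitely watch this again!"])
```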

In [17]:
train_data, _ = thinc.extra.datasets.imdb()

In [21]:
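import time

n_rows = 10  # number of reviews to send to the deployed model
n_correct = 0
for row in train_data[:n_rows]:
    text, label = row  # each row is a (text, label) tuple; label is 0 or 1
    print(row)
    prediction = deployed_model.predict([text])  # a list such as ['NEGATIVE']
    print("prediction:", prediction)
    time.sleep(0.5)  # pause between requests
    expected = "POSITIVE" if label == 1 else "NEGATIVE"
    if prediction[0] == expected:
        n_correct += 1

# fraction of the n_rows reviews that the deployed model classified correctly
run.log_metric("val_metric_deployed", float(n_correct) / n_rows)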

("The pakage implies that Warren Beatty and Goldie Hawn are pulling off a huge bank robbery, but that's not what I got out of it! I didn't get anything! In the first half there's a new character (without introduction) in every other scene. The first half-hour is completely incomprehensible, the rest is just one long, annoying, underlit chase scene. There's always an irritating sound in the background whether it's a loud watch ticking, a blaring siren, a train whistling, or even the horrible score by Quincy Jones. There are a lot of parts that are laughably bad, too. Like, the bad guys chasing Beatty on thin ice with a CAR! Or, the police arriving at the scene roughly fifteen times. I really hated this movie!", 0)
prediction: ['NEGATIVE']
('The most positive thing I can say for this dull witted local "comedy" production is that it\'s inoffensive. In fact it\'s so astonishingly bland that one wonders how many dozens of re-writes by committee it went through to have such a complete remova

prediction: ['POSITIVE']
("This was a very entertaining movie and I really enjoyed it, I don't normally rent movies like these (ie. indie flicks) however, I was attracted to the film because it had an incredible cast which included Jamie Kennedy, whom I have loved since the Scream trilogy. The movie director took a risk (and it is a risky risk) in telling the lives of many (and I mean MANY) different people and having the intertwine at various intervals. Taking that risk was a good idea because it's end result is an exceedingly good film. \n\n\n\nThe film has a few MAIN characters; Dwight (Jamie Kennedy) - a disgruntled fortune cookie writer whose relationship with his girlfriend is on the rocks because of an argument. Wallace Gregory (John Carroll Lynch) - an airplane loader/technician who has a love for all living things (except, perhaps meter maids) and who despite his good heart has an increasing amount of bad luck. Cyr (Brian Cox) - the owner of a Chinese restaurant/donut shop who
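
The output above shows that each prediction comes back as a one-element list such as `['NEGATIVE']`, while the dataset labels are the integers 0 and 1. A minimal sketch of the scoring step in isolation follows. `LABEL_NAMES`, `is_correct`, and `accuracy` are hypothetical names, and the sample values at the bottom are made up for illustration, not results from the deployed model.

```python
# Mapping from integer IMDB labels to category names, as used in the loop above.
LABEL_NAMES = {0: "NEGATIVE", 1: "POSITIVE"}


def is_correct(label, prediction):
    """Compare an integer label with a one-element prediction list."""
    return prediction[0] == LABEL_NAMES[label]


def accuracy(labels, predictions):
    """Fraction of predictions whose first element matches the label's name."""
    pairs = list(zip(labels, predictions))
    hits = sum(is_correct(label, pred) for label, pred in pairs)
    return float(hits) / len(pairs)


# Illustrative values only.
sample_labels = [0, 0, 1]
sample_predictions = [["NEGATIVE"], ["POSITIVE"], ["POSITIVE"]]
print(accuracy(sample_labels, sample_predictions))
```

Comparing the list itself to a string, as in `prediction == "NEGATIVE"`, is always false, which is why the sketch indexes into the list first.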

---