## **Turi Text Classifier**

Welcome to the text classification model quickstart on Skafos! The purpose of this notebook is to get you going end-to-end. Below we will do the following:

1. Load labeled text training data.
2. Build a sentiment classification model.
3. Convert the model to CoreML format and save it to the Skafos framework.

The example is based on [Turi Create's Text Classifier](https://apple.github.io/turicreate/docs/userguide/text_classifier/).

---

Execute each cell one by one by selecting it and doing one of the following:
-  Clicking the "play" button at the top of this frame.
-  Typing 'Control + Enter' or 'Shift + Enter'.



In [None]:
# If this is your first time in the JupyterLab workspace - install external dependencies
from utilities.dependencies import install
install(timeout=500)

# No need to do this in the future for this notebook

In [None]:
# Import necessary libraries
from skafossdk import *
import turicreate as tc

In [None]:
# Initialize Skafos
ska = Skafos()

### 1. **Load the data**
The training data for this example is Yelp review data, where each review's text is paired with a star rating (the `stars` column) that serves as its sentiment score. The data is randomly split into train and test sets: roughly 80% of the rows are used for training and the remaining 20% for model evaluation.

In [None]:
# Load data from Turi Create's website
data = tc.SFrame('https://static.turi.com/datasets/regression/yelp-data.csv')

# Make a train-test split
train_data, test_data = data.random_split(0.8)

In [None]:
# Take a look at the data
train_data.head(5)
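
To make the split concrete, here is a minimal plain-Python sketch of a random 80/20 split. It does not use Turi Create: the `random_split_rows` helper and the toy rows are hypothetical and exist only for illustration. It assumes each row is assigned independently, so the split sizes are only approximately 80/20, and it takes a seed so the split can be repeated. `SFrame.random_split` also accepts a `seed` argument if you want a reproducible split in the cell above.

```python
import random


def random_split_rows(rows, fraction, seed=None):
    """Assign each row to the first split with probability `fraction`."""
    rng = random.Random(seed)  # a fixed seed makes the split repeatable
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second


# Toy stand-in for the Yelp data (hypothetical rows)
rows = [{"text": f"review {i}", "stars": 1 + i % 5} for i in range(1000)]
train_rows, test_rows = random_split_rows(rows, 0.8, seed=42)
print(len(train_rows), len(test_rows))
```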

### 2. **Build the model**
We use the `tc.text_classifier.create` function and specify the data, target variable, features, and a few other arguments needed to properly train the model. To understand more about this specific function, check out the [Turi Create Documentation](https://apple.github.io/turicreate/docs/api/generated/turicreate.text_classifier.create.html#turicreate.text_classifier.create).

In [None]:
# Train a sentiment classification model (takes approximately 15 minutes on CPU)
model = tc.text_classifier.create(
    dataset=train_data,
    target='stars',
    features=['text'],
    drop_stop_words=True,
    max_iterations=10
)
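
The Turi Create user guide linked at the top describes the text classifier as a logistic regression trained on a bag-of-words representation of the text. The sketch below shows, in plain Python, roughly what such a representation looks like when stop words are dropped. The tiny `STOP_WORDS` set and the `bag_of_words` helper are hypothetical; Turi Create uses its own tokenizer and stop-word list, so its actual features will differ.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only
STOP_WORDS = {"i", "it", "me", "and", "was", "with", "the", "a"}


def bag_of_words(text, drop_stop_words=True):
    """Count how often each lowercase word occurs in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if drop_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return Counter(tokens)


features = bag_of_words("I really love it. It filled me with joy and was awesome.")
print(features)
```

With `drop_stop_words=True`, only the content-bearing words ("really", "love", "filled", "joy", "awesome") remain as features, which is the intuition behind passing that flag during training.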

In [None]:
# Now that the model is trained, we can evaluate against a test set
test_predictions = model.predict(test_data)
accuracy = tc.evaluation.accuracy(test_data['stars'], test_predictions)
print(f'Sentiment model has a testing accuracy of {accuracy * 100:.2f}%', flush=True)
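
Accuracy here is the fraction of test reviews whose predicted star rating exactly matches the true rating. A minimal plain-Python sketch with made-up labels (the `accuracy_score` helper and the values are hypothetical, not output from the model above):

```python
def accuracy_score(targets, predictions):
    """Return the fraction of predictions that exactly match the targets."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(targets, predictions))
    return correct / len(targets)


true_stars = [5, 4, 1, 3, 5]       # made-up ground truth
predicted_stars = [5, 3, 1, 3, 4]  # made-up predictions
print(f"{accuracy_score(true_stars, predicted_stars):.1%}")  # 3 of 5 correct -> 60.0%
```

Note that exact-match accuracy on a five-class target counts a 4-star prediction for a 5-star review as just as wrong as a 1-star prediction.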

In [None]:
# Classify a new example of text - try different text values here
example_text = {"text": ["I really love it. It filled me with joy and was awesome."]}
example_prediction = model.classify(tc.SFrame(example_text))
print(example_prediction, flush=True)
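
The result of `model.classify` contains the predicted class and its probability for each row. Conceptually, that is the highest-probability entry among the possible star ratings. The sketch below illustrates the idea with made-up probabilities; the `top_class` helper and the numbers are hypothetical.

```python
def top_class(probabilities):
    """Return (label, probability) for the most likely class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Made-up class probabilities for a clearly positive review
example_probabilities = {1: 0.02, 2: 0.03, 3: 0.10, 4: 0.25, 5: 0.60}
print(top_class(example_probabilities))
```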

### 3. **Deliver the model**
Once the model is trained, convert it to CoreML format and save it to the Skafos framework. After saving, you can push the model to your iOS devices with the `.deliver()` method below. We've left that call commented out for now.

In [None]:
# Specify the CoreML model name
model_name = 'TextClassifier'
coreml_model_name = model_name + '.mlmodel'

# Export the trained model to CoreML format
res = model.export_coreml(coreml_model_name) 

# Save model asset to Skafos
ska.asset_manager.save(
    name=model_name,              # Name used to load or deliver asset, also used within the Swift SDK.
    files=coreml_model_name,      # File or list of files to bundle together as a versioned asset.
    tags=['latest'],              # User-defined tags to help distinguish your asset.
    access='public'               # Asset access: 'public' or 'private'.
)
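
Before saving the asset, it can be worth confirming that the export actually wrote a non-empty file. The sketch below is one way to do that with the standard library. The `require_file` helper is hypothetical (it is not part of the Skafos SDK or Turi Create), and the demo writes placeholder bytes to a temporary file rather than a real model.

```python
import os
import tempfile


def require_file(path):
    """Return the file size in bytes, or raise if the file is missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected exported model at {path!r}")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError(f"Exported model at {path!r} is empty")
    return size


# Demo on a temporary placeholder file (not a real CoreML model)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "TextClassifier.mlmodel")
    with open(demo_path, "wb") as f:
        f.write(b"placeholder bytes")
    demo_size = require_file(demo_path)
    print(demo_size)
```

In this notebook, you could call `require_file(coreml_model_name)` right after the export step and before `ska.asset_manager.save(...)`.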

In [None]:
# Deliver asset to devices (push)
#ska.asset_manager.deliver(
#  name=model_name,                # Name used to load or deliver asset, also used within the Swift SDK.
#  tag='latest',                   # User-defined tags to help distinguish your asset.
#  dev=True                        # Push asset through Apple's APNS dev or prod server
#)