# Text Classification: Sentiment
Trains a model to rate user text on a 1-5 sentiment scale, from negative (1) to positive (5).

Below we do the following:

1. Set up the training environment.
2. Load labeled text training data.
3. Build a sentiment classification model.
4. Convert the model to CoreML and download it.

The example is based on [Turi Create's Text Classifier](https://github.com/apple/turicreate/tree/master/userguide/text_classifier).

## Environment Setup
All we need to do is install the turicreate library to get started. This example **doesn't** use a GPU for training.

In [0]:
# Install a pinned version of turicreate
!pip install turicreate==5.4.1
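
Because the install is pinned to a specific version, it can be useful to confirm which version actually ended up in the environment once the install cell has run. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for this illustration and is not part of Turi Create.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Hypothetical check: compare against the version pinned in the install cell
expected = "5.4.1"
found = installed_version("turicreate")
if found is None:
    print("turicreate is not installed in this environment")
elif found != expected:
    print(f"turicreate {found} is installed, but this example pins {expected}")
else:
    print(f"turicreate {found} is installed")
```
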

## Data Preparation and Model Training
The training data for this example is Yelp review data, paired with sentiment scores. The data is randomly split into train and test sets, where 80% of the data is used for training, and 20% is used for model evaluation.

Sentiment classification here means assigning a piece of text a rating from 1 to 5 that reflects how positive or negative it is. A 5 means the text is positive, and a 1 means the text is negative.
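
To make the 80/20 split concrete, here is a rough plain-Python sketch of a random split in which each row lands in the training set with probability 0.8. It is an illustration only, not Turi Create's implementation: the toy `reviews` list and the `random_split` helper below are invented for this example. As in the notebook, the numeric star rating is copied into a string label so that it is treated as a class rather than a number.

```python
import random


def random_split(rows, fraction, seed=None):
    """Send each row to the first split with probability `fraction`, else to the second."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second


# Toy stand-in for the Yelp reviews (made-up rows, not the real dataset)
reviews = [{"text": f"review {i}", "stars": (i % 5) + 1} for i in range(1000)]

# Copy the numeric rating into a string label, one class per star value
for review in reviews:
    review["label"] = str(review["stars"])

train_rows, test_rows = random_split(reviews, 0.8, seed=42)
print(len(train_rows), len(test_rows))
```

Because rows are assigned independently, the split sizes are only approximately 80% and 20%. Fixing the seed makes the split repeatable between runs; `SFrame.random_split` also accepts a `seed` argument for the same purpose.
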

In [0]:
# Import turicreate
import turicreate as tc

In [0]:
# Load data from Turi Create's website
data = tc.SFrame('https://static.turi.com/datasets/regression/yelp-data.csv')

# Copy the star rating into a string-typed 'label' column so it is treated as a class, not a number
data['label'] = data['stars'].astype(str)

# Make a train-test split
train_data, test_data = data.random_split(0.8)

In [0]:
# Train a sentiment classification model - this may take a few minutes to train
model = tc.text_classifier.create(
    dataset=train_data,
    target='label',
    features=['text'],
    drop_stop_words=True
)

# Text Classifier Training Docs:
# https://apple.github.io/turicreate/docs/api/generated/turicreate.text_classifier.create.html#turicreate.text_classifier.create

## Model Evaluation
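
Accuracy is the fraction of test reviews whose predicted label exactly matches the true label. With five classes, it can also help to see which ratings get confused with each other. The sketch below shows both ideas in plain Python on a handful of made-up labels; the helper functions are written for illustration and are not Turi Create's own code.

```python
from collections import Counter


def accuracy(targets, predictions):
    """Fraction of positions where the prediction equals the target."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(targets, predictions))
    return correct / len(targets)


def confusion_counts(targets, predictions):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(targets, predictions))


# Made-up labels for illustration only
targets = ["5", "1", "3", "5", "2"]
predictions = ["5", "1", "4", "5", "1"]

acc = accuracy(targets, predictions)
print(f"accuracy: {acc:.2%}")

# Keep only the pairs where the prediction was wrong
errors = {
    pair: count
    for pair, count in confusion_counts(targets, predictions).items()
    if pair[0] != pair[1]
}
print(errors)
```

Note that exact-match accuracy treats predicting a 4 for a true 3 the same as predicting a 1 for a true 5, so the error pairs give a fuller picture than the single number.
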


In [0]:
# Now that the model is trained, we can evaluate against a test set
test_predictions = model.predict(test_data)
accuracy = tc.evaluation.accuracy(test_data['label'], test_predictions)
print(f'Sentiment model has a testing accuracy of {accuracy * 100:.2f}%', flush=True)

In [0]:
# Classify a new example of text - try different text values here
example_text = {"text": ["I really love it. It filled me with joy and was super awesome."]}
example_prediction = model.classify(tc.SFrame(example_text))
print(example_prediction, flush=True)

## Model Export and Download
We convert the model to CoreML format so that it can run on an iOS device. Then we download it locally so it can be delivered to your apps with **[Skafos](https://skafos.ai)**.

In [0]:
# Specify the CoreML model name
model_name = 'TextClassifier'
coreml_model_name = model_name + '.mlmodel'

# Export the trained model to CoreML format
model.export_coreml(coreml_model_name)

In [0]:
# Download the model you just trained (google.colab is only available inside a Colab runtime). This may take a few moments
from google.colab import files
files.download(coreml_model_name)