# Build a cuisine recommender

## Build your model

Building applied ML systems is an important part of putting these technologies to work in your business systems. With ONNX, you can use models inside your web applications, and therefore in an offline context if needed.

## Exercise - train classification model

First, train a classification model using the cleaned cuisines dataset from the previous lessons.

1. Start by importing useful libraries:


In [7]:
!pip install skl2onnx
import pandas as pd 






   You need [skl2onnx](https://onnx.ai/sklearn-onnx/) to help convert your Scikit-learn model to ONNX format.

1. Then, work with your data in the same way you did in previous lessons, by reading a CSV file using `read_csv()`:

In [12]:
data = pd.read_csv('cleaned_cuisines.csv')
data.head()

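| | Unnamed: 0 | cuisine | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | indian | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 3 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 4 | indian | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |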


1. Remove the first two columns (the leftover index column and the `cuisine` label), which are not features, and save the remaining ingredient data as 'X':

In [13]:
X = data.iloc[:,2:]
X.head()

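| | almond | angelica | anise | anise_seed | apple | apple_brandy | apricot | armagnac | artemisia | artichoke | ... | whiskey | white_bread | white_wine | whole_grain_wheat_flour | wine | wood | yam | yeast | yogurt | zucchini |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |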

1. Save the labels as 'y':


In [14]:
y = data[['cuisine']]
y.head()

| | cuisine |
|---|---|
| 0 | indian |
| 1 | indian |
| 2 | indian |
| 3 | indian |
| 4 | indian |
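
The positional slice `iloc[:, 2:]` used above depends on the column order in the CSV. A minimal sketch of selecting the features by name instead is shown here, on a tiny hypothetical stand-in for the cuisines data (the `demo` frame, its values, and the `*_demo` names are made up for illustration and are not part of the lesson):

```python
import pandas as pd

# Hypothetical miniature of cleaned_cuisines.csv: a leftover index column,
# the label column, and two ingredient columns.
demo = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "cuisine": ["indian", "thai", "korean"],
    "almond": [0, 1, 0],
    "yogurt": [1, 0, 0],
})

# Drop the non-feature columns by name rather than by position.
X_demo = demo.drop(columns=["Unnamed: 0", "cuisine"])
y_demo = demo[["cuisine"]]

# The number of feature columns is what the ONNX input shape must match later.
print(list(X_demo.columns))
print(X_demo.shape[1])
```

On the real dataset, the same `shape[1]` check is a quick way to confirm the feature count before converting the model.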


### Commence the training routine

We will use Scikit-learn's `SVC` (support vector classifier) class, which gives good accuracy on this dataset.

1. Import the appropriate libraries from Scikit-learn:

In [15]:
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score,precision_score,confusion_matrix,classification_report

1. Separate training and test sets:

In [16]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3)
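
Because the split above is random, the scores in the classification report will differ slightly from run to run. If reproducible numbers matter, one option is to fix `random_state` and stratify on the label. The sketch below uses synthetic data; the `*_demo` names, the seed `0`, and the choice to stratify are assumptions for illustration, not part of the lesson:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 rows, 4 binary "ingredient" columns, 5 balanced cuisines.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.integers(0, 2, size=(100, 4)), columns=["a", "b", "c", "d"])
y_demo = pd.DataFrame({"cuisine": ["chinese", "indian", "japanese", "korean", "thai"] * 20})

# random_state makes the split repeatable; stratify keeps class proportions
# the same in the training and test sets.
X_train_demo, X_test_demo, y_train_demo, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo["cuisine"]
)

print(len(X_train_demo), len(X_test_demo))
print(y_test_demo["cuisine"].value_counts())
```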

1. Build an SVC Classification model as you did in the previous lesson:

In [17]:
model = SVC(kernel='linear', C=10, probability=True,random_state=0)
model.fit(X_train,y_train.values.ravel())
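
`cross_val_score` is imported above but not used in these steps. As a rough sketch of how it could give a more stable accuracy estimate than a single split, the example below runs 5-fold cross-validation on synthetic data (the data, the `*_demo` names, and `cv=5` are assumptions for illustration):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in: 100 rows of 6 binary features; the label depends on
# the first feature so there is something to learn.
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 2, size=(100, 6))
y_demo = np.where(X_demo[:, 0] == 1, "thai", "indian")

# Fit and score the classifier on 5 different train/validation folds.
scores = cross_val_score(SVC(kernel="linear", C=10, random_state=0), X_demo, y_demo, cv=5)

print(scores)
print(scores.mean(), scores.std())
```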

1. Now, test your model, calling `predict()`:

In [18]:
y_pred = model.predict(X_test)

1. Print out a classification report to check the model's quality:

In [19]:
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

     chinese       0.66      0.71      0.68       231
      indian       0.88      0.88      0.88       241
    japanese       0.74      0.75      0.75       227
      korean       0.85      0.75      0.80       249
        thai       0.76      0.78      0.77       251

    accuracy                           0.78      1199
   macro avg       0.78      0.77      0.78      1199
weighted avg       0.78      0.78      0.78      1199
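
The report summarizes per-class quality, but it does not show which cuisines get confused with each other. `confusion_matrix` and `accuracy_score`, both imported earlier, can fill that gap. Here is a minimal sketch on hand-written toy labels (the label lists are invented for illustration; with the real model you would pass `y_test` and `y_pred`):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy ground truth and predictions, invented for illustration.
y_true_demo = ["chinese", "chinese", "indian", "indian", "thai", "thai"]
y_pred_demo = ["chinese", "thai", "indian", "indian", "thai", "chinese"]
labels = ["chinese", "indian", "thai"]

# Rows are true classes, columns are predicted classes, in the order of `labels`.
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=labels)
acc = accuracy_score(y_true_demo, y_pred_demo)

print(cm)
print(acc)
```

Off-diagonal entries count the mistakes: in this toy data, one chinese recipe is predicted as thai and one thai recipe as chinese.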



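### Convert your model to ONNX

Make sure the conversion declares the correct input tensor shape. This dataset lists 380 ingredients, so each input row has 380 features, and you need to specify that number in `FloatTensorType`. The `None` in the shape leaves the number of rows per batch open.

1. Convert using an input width of 380: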

In [20]:
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

initial_type = [('float_input', FloatTensorType([None, 380]))]
options = {id(model): {'nocl': True, 'zipmap': False}}

1. Create the ONNX model `onx` and store it as the file **model.onnx**:

In [None]:
onx = convert_sklearn(model, initial_types=initial_type, options=options)
with open("./model.onnx", "wb") as f:
    f.write(onx.SerializeToString())