## Starting a Multiclass Classification Project

**Author**: Thodoris Petropoulos

**Label**: Modeling Options

### Scope
This notebook shows how to start a DataRobot project with a Multiclass Classification target using the Python API.

### Background
Multiclass classification is the task of classifying the elements of a given set into more than two groups.

Examples:

- A customer is most likely to be interested in one of products A, B, C, D, ...
- A patient has one of diseases A, B, C, D, ...
- A customer has the highest propensity to respond to one of campaigns A, B, C, D, ...

Most commonly, the target column will have values:

- AAA/BBB/CCC/... (text labels)
- 0/1/2/3/4/...
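
Before starting a project, it can help to confirm that the target really has more than two distinct values. The sketch below is a minimal, standard-library illustration; `describe_target` and the two sample lists are hypothetical names introduced here, not part of the DataRobot API.

```python
from collections import Counter


def describe_target(values):
    """Count each class and report whether the target has more than two classes."""
    counts = Counter(values)
    return {
        "n_classes": len(counts),
        "is_multiclass": len(counts) > 2,
        "counts": dict(counts),
    }


# Hypothetical target columns in the two common encodings.
text_target = ["AAA", "BBB", "CCC", "AAA"]
numeric_target = [0, 1, 2, 3, 4, 0]

print(describe_target(text_target))
print(describe_target(numeric_target))
```

A target with exactly two distinct values would be a binary classification problem rather than a multiclass one.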

### Requirements

- Python version 3.7.3
- DataRobot API version 2.19.0

Small adjustments might be needed depending on the Python version and DataRobot API version you are using.

Full documentation of the Python package can be found here: https://datarobot-public-api-client.readthedocs-hosted.com

#### Import Libraries

In [2]:
import datarobot as dr
import pandas as pd
import numpy as np

#### Import Dataset
We will load the iris dataset, a very simple multiclass classification dataset that is available through scikit-learn.

In [2]:
from sklearn.datasets import load_iris
data = load_iris()

df = pd.DataFrame(np.c_[data['data'], data['target']],
                  columns= np.append(data['feature_names'], ['target']))
df.head()

| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0.0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0.0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0.0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0.0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0.0 |
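
The target column holds numeric codes (0.0, 1.0, 2.0). If readable class names are preferred, one option is to map the codes to the iris species names before uploading. This is only a sketch: `label_target` and `IRIS_CLASS_NAMES` are hypothetical helpers introduced here, and the small `sample` frame stands in for the full iris data.

```python
import pandas as pd

# Species names for the iris codes 0, 1, 2 (same order as load_iris().target_names).
IRIS_CLASS_NAMES = {0: "setosa", 1: "versicolor", 2: "virginica"}


def label_target(frame, column="target", names=IRIS_CLASS_NAMES):
    """Return a copy of `frame` whose `column` holds class names instead of codes."""
    labeled = frame.copy()
    labeled[column] = labeled[column].astype(int).map(names)
    return labeled


# Small stand-in for the iris DataFrame (assumption: same column layout).
sample = pd.DataFrame(
    {"petal length (cm)": [1.4, 4.7, 6.0], "target": [0.0, 1.0, 2.0]}
)
labeled_sample = label_target(sample)
print(labeled_sample["target"].tolist())
```

The function works on a copy, so the original numeric column stays available if it is needed later.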


#### Connect to DataRobot
Connect to DataRobot using your credentials and your endpoint. Replace the placeholder values below with your own.

In [None]:
dr.Client(token='YOUR_API_KEY',
          endpoint='YOUR_DATAROBOT_HOSTNAME')
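
To keep the API key out of the notebook, the credentials could be read from environment variables instead. The sketch below is one possible approach; `load_credentials` is a hypothetical helper, and the variable names `DATAROBOT_API_TOKEN` and `DATAROBOT_ENDPOINT` are assumptions chosen for this example, so use whatever names your environment defines.

```python
import os


def load_credentials(env=os.environ):
    """Read the token and endpoint from a mapping and fail early if either is missing."""
    token = env.get("DATAROBOT_API_TOKEN")    # assumed variable name
    endpoint = env.get("DATAROBOT_ENDPOINT")  # assumed variable name
    missing = [
        name
        for name, value in (
            ("DATAROBOT_API_TOKEN", token),
            ("DATAROBOT_ENDPOINT", endpoint),
        )
        if not value
    ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {"token": token, "endpoint": endpoint}


# Usage (requires the datarobot package and real credentials):
# import datarobot as dr
# dr.Client(**load_credentials())
```

Passing the mapping as an argument makes the helper easy to check with a plain dictionary before it touches the real environment.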

#### Initiate Project
We will initiate the project by calling the method <code>dr.Project.start</code>, which accepts the following arguments:
* project_name: Name of the project
* sourcedata: Data source (path to a file or a pandas DataFrame)
* target: String with the target variable name
* worker_count: Number of workers to use
* metric: Optimization metric to use

If your target is categorical and has a cardinality of up to 10, DataRobot automatically selects a Multiclass target_type, so that argument is not needed when calling Project.start. However, if the target is numerical (as it is here) and you would like to force DataRobot to treat it as a Multiclass project, you can specify the target_type as seen below:

In [None]:
# Start the project; target_type forces the numeric target to be treated as Multiclass
project = dr.Project.start(project_name='MyMulticlassClassificationProject',
                           sourcedata=df,
                           target='target',
                           target_type=dr.enums.TARGET_TYPE.MULTICLASS)

project.wait_for_autopilot()  # Block until Autopilot has finished
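
Once Autopilot has finished, a natural next step is to look at the models it built. The sketch below ranks models by their validation score; `rank_models` is a hypothetical helper, and it assumes that `project.get_models()` returns objects with a `model_type` string and a `metrics` dictionary keyed by metric name and partition, and that lower scores are better (as for LogLoss). The stand-in objects only let the sketch run without a DataRobot connection.

```python
from types import SimpleNamespace


def rank_models(project, metric="LogLoss"):
    """Return (model_type, validation score) pairs, best (lowest) score first."""
    rows = []
    for model in project.get_models():
        score = model.metrics.get(metric, {}).get("validation")
        if score is not None:  # skip models without a validation score
            rows.append((model.model_type, score))
    return sorted(rows, key=lambda row: row[1])


# Stand-in project with made-up models and scores, for illustration only.
fake_models = [
    SimpleNamespace(model_type="Model A", metrics={"LogLoss": {"validation": 0.21}}),
    SimpleNamespace(model_type="Model B", metrics={"LogLoss": {"validation": 0.09}}),
    SimpleNamespace(model_type="Model C", metrics={"LogLoss": {"validation": None}}),
]
fake_project = SimpleNamespace(get_models=lambda: fake_models)

ranking = rank_models(fake_project)
for model_type, score in ranking:
    print(model_type, score)
```

With a real project, `rank_models(project)` would be called in place of the stand-in. For a metric where higher is better, the sort order would need to be reversed.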