![title](datarobot.jpg)

# DataRobot Python API Starter Activity 
# Part 1 — Modeling

## Pre-requisites


Before starting these exercises, be sure that you:
- Install the [DataRobot Python client](https://docs.datarobot.com/en/docs/api/api-quickstart/index.html#install-the-client) in your Python environment
- Install [`pandas`](https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html) in your Python environment
- Have access to a DataRobot account
    - If you don't, you can create a [trial account](https://www.datarobot.com/trial/)
    


## Learning objectives

By the end of this activity, you will be able to use Python to:

- Connect your DataRobot client
- Create a project
- Set the target feature for a project
- Start Autopilot to build the default set of models
- Deploy the recommended model
- Request predictions in both batch and real-time modes


## Activity Goal

The goal of this activity is to predict the quality rating of a wine from its characteristics, such as its acidity, alcohol content, and sugar content.

We have provided you with a dataset containing characteristics (features) and ratings for hundreds of wines. Using DataRobot, you will train a model based on this dataset. Then you will pass a different set of wines and their characteristics to DataRobot, which will score the input data using the model and return a predicted value for each wine's quality.

The wine data comes from the University of California, Irvine Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wine+quality. 

Citation: *P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.*

---

**Tip**: Before you start running any code, we recommend enabling the Jupyter interactive shell setting that displays all of the output a cell produces rather than just the last output. This is particularly helpful as you complete the exercises.

In [7]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

## 1. Before you begin

To use Python with DataRobot, you first need to establish a connection between your application or notebook and the DataRobot server.

**NOTE**: If you are using [DataRobot Notebooks](https://docs.datarobot.com/en/docs/dr-notebooks/index.html) you can skip this section.

### 1a. Follow these configuration steps to start using DataRobot via Python

Establishing the connection requires the following steps:

1. Gather your DataRobot application **endpoint URL**.
2. Gather the **API access token** for your DataRobot account.
3. Save both values where your notebook can access them.

<b>Finding your endpoint URL:</b>
* Your [endpoint](https://docs.datarobot.com/en/docs/api/api-quickstart/index.html#retrieve-the-api-endpoint) URL depends on the type of your DataRobot account:

    * If you have a US **Managed AI Cloud** account: use `https://app.datarobot.com/api/v2`

    * If you have a European **Managed AI Cloud** account: use `https://app.eu.datarobot.com/api/v2`
    
    * If you are on a **self-managed or on-premises system**, find the host name you use to access your DataRobot account and use `https://<your-host-name>/api/v2`


<b>Finding your API access token </b>
    
* To retrieve your access token, log into your DataRobot application account, and then select **Profile Settings** (top right icon) -> **Developer Tools**. Create a new API token here if you haven't created one already.

* See [here](https://docs.datarobot.com/en/docs/api/api-quickstart/index.html#create-a-datarobot-api-key) for a visual guide in the documentation

* See [this video](https://datarobot.wistia.com/medias/29y85hz4qw) for a guide 

<b>Save your credentials for your notebook to access </b>

* You can specify your credentials in [a number of ways](https://docs.datarobot.com/en/docs/api/api-quickstart/index.html#configure-api-authentication). 
* For this activity, specify them in the provided file called `drconfig.yaml`, which should be in the same directory as this Jupyter notebook.
* The `.yaml` file is a text file containing two lines. Edit the file to add your credentials and save the file.

`token: "<YOUR_API_TOKEN>"`  
`endpoint: "<YOUR_ENDPOINT>"`


You will use the `.yaml` file you just prepared in section 3 below.
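A forgotten placeholder or an endpoint without the `/api/v2` suffix is an easy mistake to make when editing the file. The following is a minimal sketch of a sanity check, assuming the file contains only simple `key: "value"` lines like the two shown above. `read_drconfig` and `check_drconfig` are hypothetical helpers written for this example; they are not part of the `datarobot` package.

```python
from pathlib import Path

# Placeholder values from the provided template file.
PLACEHOLDERS = {"<YOUR_API_TOKEN>", "<YOUR_ENDPOINT>"}


def read_drconfig(path):
    """Parse simple `key: "value"` lines (hypothetical helper, not a full YAML parser)."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip().strip("\"'")
    return config


def check_drconfig(config):
    """Return a list of problems; an empty list means the file looks plausible."""
    problems = []
    for key in ("token", "endpoint"):
        value = config.get(key, "")
        if not value or value in PLACEHOLDERS:
            problems.append(f"{key} is missing or still a placeholder")
    endpoint = config.get("endpoint", "")
    if endpoint and endpoint not in PLACEHOLDERS:
        if not endpoint.rstrip("/").endswith("/api/v2"):
            problems.append("endpoint does not end with /api/v2")
    return problems


# Only runs the check if the file is present in the working directory.
if Path("drconfig.yaml").exists():
    print(check_drconfig(read_drconfig("drconfig.yaml")) or "drconfig.yaml looks plausible")
```

The check only inspects the text of the file. It does not contact DataRobot, so a well-formed but expired token would still pass.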

---

## 2. Import libraries

The DataRobot Python API package is called `datarobot`. Take a look at the [API documentation](https://datarobot-public-api-client.readthedocs-hosted.com/en/latest-release/index.html), and keep this page open so that you can refer to the documentation throughout the lesson.

For convenience, import it using the identifier `dr`.

In [9]:
import datarobot as dr

This activity uses the Python `pandas` data analysis package. You do not have to use `pandas` to use the DataRobot Python API, but it is a convenient way to work with datasets, so we will use it here.

In [10]:
import pandas as pd

## 3. Connect the application or notebook to DataRobot

Next, in order for your client to connect to DataRobot, you need to [create and configure](https://datarobot-public-api-client.readthedocs-hosted.com/en/latest-release/autodoc/api_reference.html#module-datarobot.client) a global `Client` object, which will be used by any Python Client API calls that need to connect to the DataRobot server.  

Now create a new `Client` object based on the credentials in the configuration YAML file you edited above.

**NOTE**: If you are using [DataRobot Notebooks](https://docs.datarobot.com/en/docs/dr-notebooks/index.html) you can skip this.

In [11]:
# Remove or comment out this line if you are using DataRobot Notebooks
dr.Client(config_path = 'drconfig.yaml')

<datarobot.rest.RESTClientObject at 0x1e3efb87b90>
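If you would rather not keep credentials in a file, `dr.Client` also accepts `token` and `endpoint` keyword arguments. The sketch below collects them from the `DATAROBOT_API_TOKEN` and `DATAROBOT_ENDPOINT` environment variables. `client_kwargs_from_env` is a hypothetical helper for this example, and the final call is left commented out because it needs real credentials.

```python
import os


def client_kwargs_from_env(env=os.environ):
    """Collect dr.Client keyword arguments from environment variables (hypothetical helper)."""
    token = env.get("DATAROBOT_API_TOKEN")
    endpoint = env.get("DATAROBOT_ENDPOINT")
    if not token or not endpoint:
        raise KeyError("Set DATAROBOT_API_TOKEN and DATAROBOT_ENDPOINT first")
    return {"token": token, "endpoint": endpoint}


# Usage, assuming `import datarobot as dr` has already been run:
# dr.Client(**client_kwargs_from_env())
```

For this activity the `drconfig.yaml` approach above is sufficient; the environment-variable route is mainly useful in scripts and scheduled jobs.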

***

## 4. Import & explore the training data using [pandas](https://pandas.pydata.org/)

Create a pandas DataFrame based on the wine quality training dataset we've provided and view the first few rows.

**NOTE**: If you are using DataRobot Notebooks, you can create the dataframe with:  
```
trainingData = pd.read_csv('https://datarobot.box.com/shared/static/3r98mr6joss5h1shmytcxeqseihc6ic7.csv')
```

In [12]:
trainingData = pd.read_csv('./datasets/winequality-white-training.csv')
trainingData.head()

|   | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.001 | 3.0 | 0.45 | 8.8 | 6 |
| 1 | 6.3 | 0.3 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.994 | 3.3 | 0.49 | 9.5 | 6 |
| 2 | 8.1 | 0.28 | 0.4 | 6.9 | 0.05 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6 |
| 3 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 |
| 4 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 |
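Beyond `head()`, a few quick checks can tell you whether the data has missing values, duplicate rows, or a skewed target. The sketch below uses only the five rows displayed above as a small stand-in for the full file; on the real dataset you would run the same calls on `trainingData`.

```python
import io

import pandas as pd

# The five rows shown above, used here as a stand-in for the full training CSV.
sample_csv = """fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality
7.0,0.27,0.36,20.7,0.045,45.0,170.0,1.001,3.0,0.45,8.8,6
6.3,0.3,0.34,1.6,0.049,14.0,132.0,0.994,3.3,0.49,9.5,6
8.1,0.28,0.4,6.9,0.05,30.0,97.0,0.9951,3.26,0.44,10.1,6
7.2,0.23,0.32,8.5,0.058,47.0,186.0,0.9956,3.19,0.4,9.9,6
7.2,0.23,0.32,8.5,0.058,47.0,186.0,0.9956,3.19,0.4,9.9,6
"""
sample = pd.read_csv(io.StringIO(sample_csv))

missing_per_column = sample.isna().sum()                         # missing values per feature
quality_counts = sample["quality"].value_counts().sort_index()   # distribution of the target
duplicate_rows = int(sample.duplicated().sum())                  # rows identical to an earlier row

print(missing_per_column)
print(quality_counts)
print("duplicate rows:", duplicate_rows)
```

In this small sample the last two rows are identical, which is why `duplicated()` reports one duplicate row.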


---

## 5. Training Models with DataRobot

### 5a. Create a DataRobot project to begin modeling


A [`Project`](https://datarobot-public-api-client.readthedocs-hosted.com/en/v2.25.0/entities/project.html) object contains a dataset as well as the models trained from that dataset. You need a project before you can build a model.

Every project requires a name. Here we set a project name that includes the creation date, which makes it easier to distinguish projects from one another.


In [13]:
from datetime import date
projectName = 'Python wine quality ' + date.today().strftime(format = "%Y-%m-%d")
projectName

'Python wine quality 2025-06-09'

There are a number of ways to create a project. Here we use the `Project.create` method, passing the DataFrame created above and the name. When you create a project, it uploads the dataset to DataRobot and performs the initial [exploratory data analysis](https://app.datarobot.com/docs/modeling/reference/model-detail/eda-explained.html#eda1); the operation may take a minute to complete. 

**Tip**: position the cursor after `dr.Project` and use `SHIFT+TAB+TAB` to view the various attributes of a `Project` object.

In [14]:
project = dr.Project.create(
    sourcedata=trainingData,
    project_name=projectName
)

print(project.id, project.project_name)

6845d71fa05ad2afbacf0d91 Python wine quality 2025-06-09


Your work in the API is reflected in the GUI as well. For any project created via the API, you can print the hyperlink to take you directly to the project in the GUI. If you leave this page open, you can return to it to view and verify the results of your work using the API.

In [15]:
project.get_uri()

'https://app.datarobot.com/projects/6845d71fa05ad2afbacf0d91/models'

`Project` objects give you many ways to explore and interact with projects and datasets. Going into depth on these capabilities is beyond the scope of this activity, but as an example, try viewing the project's list of features.

In [16]:
project.get_features()

[Feature(alcohol),
 Feature(chlorides),
 Feature(citric acid),
 Feature(density),
 Feature(fixed acidity),
 Feature(free sulfur dioxide),
 Feature(pH),
 Feature(quality),
 Feature(residual sugar),
 Feature(sulphates),
 Feature(total sulfur dioxide),
 Feature(volatile acidity)]
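Each `Feature` object carries more than its name. As a rough sketch, the helper below collects the `name`, `feature_type`, and `na_count` attributes into plain dictionaries that are easy to print or load into a DataFrame. `summarize_features` is a hypothetical helper for this example, not a `datarobot` function.

```python
def summarize_features(features):
    """Collect a few attributes from each Feature into a list of dicts (hypothetical helper)."""
    rows = []
    for feature in features:
        rows.append({
            "name": feature.name,
            "type": feature.feature_type,   # for example "Numeric" or "Categorical"
            "missing": feature.na_count,    # number of missing values DataRobot found
        })
    return rows


# Usage, assuming `project` and `pd` from the cells above:
# pd.DataFrame(summarize_features(project.get_features()))
```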

### 5b. Build models for the project

Now that you have the data uploaded, you can start DataRobot Autopilot to perform the [second round of exploratory data analysis](https://app.datarobot.com/docs/modeling/reference/model-detail/eda-explained.html#eda2) and [train an initial set of models](https://app.datarobot.com/docs/modeling/reference/model-detail/model-ref.html).

There are a number of ways to start the modeling process. Here we use the `.analyze_and_model()` method.
Its `target` argument tells DataRobot which feature to use as the target feature -- that is, the feature the models will predict.

Autopilot runs asynchronously. It takes about a minute to kick off Autopilot, after which the API call will return, but Autopilot keeps running.

By default, this starts Autopilot using ["quick" mode](https://app.datarobot.com/docs/modeling/reference/model-detail/model-ref.html#quick-autopilot), which builds a limited set of common models based
on the informative features of the data.

In [17]:
project.analyze_and_model(target = 'quality')

Project(Python wine quality 2025-06-09)
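For reference only (do not run this in addition to the cell above, since modeling has already started): `analyze_and_model` also accepts optional arguments such as `mode` and `worker_count`. The sketch below wraps the call in a hypothetical `start_modeling` helper. The mode names are assumed to match the string values of the `dr.AUTOPILOT_MODE` constants, and `worker_count=-1` is assumed to request the maximum number of workers available to your account; check the API documentation for your client version before relying on either.

```python
# Assumption: these strings match the values of the dr.AUTOPILOT_MODE constants.
KNOWN_MODES = {"quick", "auto", "comprehensive", "manual"}


def start_modeling(project, target, mode="quick", worker_count=-1):
    """Start modeling with an explicit mode and worker count (hypothetical wrapper)."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown Autopilot mode: {mode!r}")
    # Assumption: worker_count=-1 requests the maximum number of workers available.
    return project.analyze_and_model(
        target=target, mode=mode, worker_count=worker_count
    )


# Usage on a freshly created project:
# start_modeling(project, target="quality", mode="quick")
```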

_Optional_: Use the hyperlink from above to open the project in the DataRobot web UI and click the **Models** tab. This displays the models that have been built so far and shows the Autopilot status on the right.


You can track the Autopilot process in your notebook using `wait_for_autopilot`, which blocks the application until the models are complete.

The process can take a little time.

In [18]:
project.wait_for_autopilot()

In progress: 2, queued: 6 (waited: 0s)
In progress: 2, queued: 6 (waited: 1s)
In progress: 2, queued: 6 (waited: 3s)
In progress: 2, queued: 6 (waited: 4s)
In progress: 2, queued: 6 (waited: 6s)
In progress: 2, queued: 6 (waited: 9s)
In progress: 2, queued: 6 (waited: 13s)
In progress: 2, queued: 6 (waited: 21s)
In progress: 2, queued: 4 (waited: 35s)
In progress: 2, queued: 4 (waited: 56s)
In progress: 2, queued: 2 (waited: 77s)
In progress: 2, queued: 2 (waited: 98s)
In progress: 2, queued: 0 (waited: 119s)
In progress: 2, queued: 0 (waited: 141s)
In progress: 2, queued: 14 (waited: 162s)
In progress: 2, queued: 14 (waited: 183s)
In progress: 2, queued: 12 (waited: 204s)
In progress: 1, queued: 12 (waited: 225s)
In progress: 1, queued: 10 (waited: 247s)
In progress: 2, queued: 9 (waited: 268s)
In progress: 2, queued: 7 (waited: 289s)
In progress: 1, queued: 6 (waited: 310s)
In progress: 1, queued: 5 (waited: 331s)
In progress: 2, queued: 3 (waited: 353s)
In progress: 1, queued: 3 (wa
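If you prefer not to block the notebook indefinitely, you can poll the project yourself. The sketch below is one way to do that, assuming `project.get_status()` returns a dictionary with an `autopilot_done` key. `wait_until_done` is a hypothetical helper, and its `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.

```python
import time


def wait_until_done(project, check_interval=20.0, timeout=1800.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll project.get_status() until Autopilot finishes or the timeout passes.

    Returns True if Autopilot finished, False if the timeout was reached first.
    """
    deadline = clock() + timeout
    while True:
        status = project.get_status()
        if status.get("autopilot_done"):
            return True
        if clock() >= deadline:
            return False
        sleep(check_interval)


# Usage, assuming `project` from the cells above:
# finished = wait_until_done(project, check_interval=30, timeout=15 * 60)
```

Returning `False` on timeout, rather than raising, lets the notebook continue and check again later.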

*Optional*: While you wait, use the hyperlink from above to view the project in the web UI and select **Models**, where you can see the models as they complete.

---

While you are waiting, _go back to the main course content_, and we'll meet back here in about 15 minutes.

*When Autopilot is finished, you are ready to open the Part 2 notebook to complete the activity.*