# Getting Started with Azure Machine Learning (AML)

---

## Introduction
[Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/overview-what-is-azure-ml) provides tools for working with data, versioning datasets and models, and logging training results. All of this functionality is available from the Python SDK. In this notebook we cover:
* connecting to an AML Workspace
* accessing a dataset
* creating an experiment and tracking metrics
* selecting the best model

To follow along you need the [**Azure ML SDK**](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py). The AML SDK is preinstalled in the following ***conda*** environments:
* environments with the `azureml_py36_*` prefix - Python 3.6
* `py37_default` - Python 3.7

***The `azureml_py36_automl` environment is recommended.***

For each team we have already created an Azure ML Workspace and registered the dataset `train_ds` in it. This dataset contains the source data.

---

## Connecting to the AML Workspace

[`Workspace`](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the main object for storing and accessing experiments, datasets, and other artifacts of your work.
Let's import the SDK and connect to the team Workspace.

*The cell below does not need editing. After you authenticate, it prints the name of your team's Workspace and stores the Workspace object in `ws`.*

In [1]:
import azureml.core
from azureml.core.authentication import InteractiveLoginAuthentication
from azureml.core import Experiment, Workspace

# Check core SDK version number
print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK")
print("")

# Log In to Azure ML Workspace
interactive_auth = InteractiveLoginAuthentication(tenant_id="76f90eb1-fb9a-4446-9875-4d323d6455ad")

ws = Workspace.from_config(auth=interactive_auth)
print('Workspace name:', ws.name)

You are currently using version 1.5.0 of the Azure ML SDK

Workspace name: team19


---

## Accessing the dataset

To access the `train_ds` dataset, use the [`get_by_name`](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.dataset.dataset?view=azure-ml-py#get-by-name-workspace--name--version--latest--) method. It returns a [`TabularDataset`](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#methods) object, which can then be materialized into a pandas DataFrame.

**NB!** Remember that the compute resources are shared across your team. Having every participant load the entire dataset may be suboptimal.

To preprocess and analyze only part of the dataset, you can use the following methods:
* [take(count)](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#take-count-)
* [take_sample(probability, seed=None)](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#take-sample-probability--seed-none-)
* [skip(count)](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.data.tabulardataset?view=azure-ml-py#skip-count-)
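
The three methods listed above can be wrapped in small helpers so that each teammate pulls only the rows they need. The sketch below is illustrative: the helper names `preview_rows`, `sample_rows`, and `page_rows` and their default arguments are assumptions rather than part of the SDK. They only call `take`, `take_sample`, `skip`, and `to_pandas_dataframe` on whatever `TabularDataset` is passed in.

```python
def preview_rows(dataset, count=1000):
    """Materialize only the first `count` rows of a TabularDataset."""
    return dataset.take(count).to_pandas_dataframe()


def sample_rows(dataset, probability=0.1, seed=42):
    """Materialize a random sample; each row is kept with the given probability."""
    return dataset.take_sample(probability=probability, seed=seed).to_pandas_dataframe()


def page_rows(dataset, offset, count):
    """Materialize `count` rows that follow the first `offset` rows."""
    return dataset.skip(offset).take(count).to_pandas_dataframe()


# Hypothetical usage with a dataset obtained via Dataset.get_by_name(ws, 'train_ds'):
# head_pdf = preview_rows(aml_dataset, count=1000)
# sample_pdf = sample_rows(aml_dataset, probability=0.05, seed=0)
```

Fixing `seed` makes the sample reproducible, which helps when several teammates want to compare results on the same subset.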

In [2]:
from azureml.core import Dataset

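# get the dataset from Azure ML Workspace
aml_dataset = Dataset.get_by_name(ws, 'train_ds', version='latest')
# probability=1 keeps every row, so this loads the full dataset;
# lower the value to work on a random subset instead
pdf = aml_dataset.take_sample(probability=1).to_pandas_dataframe()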

In [9]:
import pandas as pd
pd.set_option('display.max_columns', 500)

In [31]:
display(pdf[:10]) # gender, group
# todo scale cols (expected value, variation)

Unnamed: 0,CardHolder,age,cheque_count_12m_g20,cheque_count_12m_g21,cheque_count_12m_g25,cheque_count_12m_g32,cheque_count_12m_g33,cheque_count_12m_g38,cheque_count_12m_g39,cheque_count_12m_g41,cheque_count_12m_g42,cheque_count_12m_g45,cheque_count_12m_g46,cheque_count_12m_g48,cheque_count_12m_g52,cheque_count_12m_g56,cheque_count_12m_g57,cheque_count_12m_g58,cheque_count_12m_g79,cheque_count_3m_g20,cheque_count_3m_g21,cheque_count_3m_g25,cheque_count_3m_g42,cheque_count_3m_g45,cheque_count_3m_g52,cheque_count_3m_g56,cheque_count_3m_g57,cheque_count_3m_g79,cheque_count_6m_g20,cheque_count_6m_g21,cheque_count_6m_g25,cheque_count_6m_g32,cheque_count_6m_g33,cheque_count_6m_g38,cheque_count_6m_g39,cheque_count_6m_g40,cheque_count_6m_g41,cheque_count_6m_g42,cheque_count_6m_g45,cheque_count_6m_g46,cheque_count_6m_g48,cheque_count_6m_g52,cheque_count_6m_g56,cheque_count_6m_g57,cheque_count_6m_g58,cheque_count_6m_g79,children,crazy_purchases_cheque_count_12m,crazy_purchases_cheque_count_1m,crazy_purchases_cheque_count_3m,crazy_purchases_cheque_count_6m,crazy_purchases_goods_count_12m,crazy_purchases_goods_count_6m,disc_sum_6m_g34,food_share_15d,food_share_1m,gender,group,k_var_cheque_15d,k_var_cheque_3m,k_var_cheque_category_width_15d,k_var_cheque_group_width_15d,k_var_count_per_cheque_15d_g24,k_var_count_per_cheque_15d_g34,k_var_count_per_cheque_1m_g24,k_var_count_per_cheque_1m_g27,k_var_count_per_cheque_1m_g34,k_var_count_per_cheque_1m_g44,k_var_count_per_cheque_1m_g49,k_var_count_per_cheque_3m_g24,k_var_count_per_cheque_3m_g27,k_var_count_per_cheque_3m_g32,k_var_count_per_cheque_3m_g34,k_var_count_per_cheque_3m_g41,k_var_count_per_cheque_3m_g44,k_var_count_per_cheque_6m_g24,k_var_count_per_cheque_6m_g27,k_var_count_per_cheque_6m_g32,k_var_count_per_cheque_6m_g44,k_var_days_between_visits_15d,k_var_days_between_visits_1m,k_var_days_between_visits_3m,k_var_disc_per_cheque_15d,k_var_disc_share_12m_g32,k_var_disc_share_15d_g24,k_var_disc_share_15d_g34,k_var_disc_share_15d_g49,k_var_disc_share_1m_g24,k_var_disc_share_1m_g27,k_var_disc_share_1m_g34,k_var_disc_share_1m_g40,k_var_disc_share_1m_g44,k_var_disc_share_1m_g49,k_var_disc_share_1m_g54,k_var_disc_share_3m_g24,k_var_disc_share_3m_g26,k_var_disc_share_3m_g27,k_var_disc_share_3m_g32,k_var_disc_share_3m_g33,k_var_disc_share_3m_g34,k_var_disc_share_3m_g38,k_var_disc_share_3m_g40,k_var_disc_share_3m_g41,k_var_disc_share_3m_g44,k_var_disc_share_3m_g46,k_var_disc_share_3m_g48,k_var_disc_share_3m_g49,k_var_disc_share_3m_g54,k_var_disc_share_6m_g24,k_var_disc_share_6m_g27,k_var_disc_share_6m_g32,k_var_disc_share_6m_g34,k_var_disc_share_6m_g44,k_var_disc_share_6m_g46,k_var_disc_share_6m_g49,k_var_disc_share_6m_g54,k_var_discount_depth_15d,k_var_discount_depth_1m,k_var_sku_per_cheque_15d,k_var_sku_price_12m_g32,k_var_sku_price_15d_g34,k_var_sku_price_15d_g49,k_var_sku_price_1m_g24,k_var_sku_price_1m_g26,k_var_sku_price_1m_g27,k_var_sku_price_1m_g34,k_var_sku_price_1m_g40,k_var_sku_price_1m_g44,k_var_sku_price_1m_g49,k_var_sku_price_1m_g54,k_var_sku_price_3m_g24,k_var_sku_price_3m_g26,k_var_sku_price_3m_g27,k_var_sku_price_3m_g32,k_var_sku_price_3m_g33,k_var_sku_price_3m_g34,k_var_sku_price_3m_g40,k_var_sku_price_3m_g41,k_var_sku_price_3m_g44,k_var_sku_price_3m_g46,k_var_sku_price_3m_g48,k_var_sku_price_3m_g49,k_var_sku_price_3m_g54,k_var_sku_price_6m_g24,k_var_sku_price_6m_g26,k_var_sku_price_6m_g27,k_var_sku_price_6m_g32,k_var_sku_price_6m_g41,k_var_sku_price_6m_g42,k_var_sku_price_6m_g44,k_var_sku_price_6m_g48,k_var_sku_price_6m_g49,main_format,mean_discount_depth_15d,months_from_register,perdelta_days_between_visits_15_30d,promo_share_15d,response_att,response_sms,response_viber,sale_count_12m_g32,sale_count_12m_g33,sale_count_12m_g49,sale_count_12m_g54,sale_count_12m_g57,sale_count_3m_g24,sale_count_3m_g33,sale_count_3m_g57,sale_count_6m_g24,sale_count_6m_g25,sale_count_6m_g32,sale_count_6m_g33,sale_count_6m_g44,sale_count_6m_g54,sale_count_6m_g57,sale_sum_12m_g24,sale_sum_12m_g25,sale_sum_12m_g26,sale_sum_12m_g27,sale_sum_12m_g32,sale_sum_12m_g44,sale_sum_12m_g54,sale_sum_3m_g24,sale_sum_3m_g26,sale_sum_3m_g32,sale_sum_3m_g33,sale_sum_6m_g24,sale_sum_6m_g25,sale_sum_6m_g26,sale_sum_6m_g32,sale_sum_6m_g33,sale_sum_6m_g44,sale_sum_6m_g54,stdev_days_between_visits_15d,stdev_discount_depth_15d,stdev_discount_depth_1m
0,16095858,47.0,3.0,22.0,19.0,3.0,28.0,8.0,7.0,6.0,1.0,13.0,12.0,16.0,3.0,15.0,11.0,0.0,4.0,0.0,7.0,8.0,0.0,5.0,1.0,6.0,6.0,1.0,0.0,12.0,9.0,1.0,6.0,4.0,2.0,5.0,1.0,0.0,5.0,5.0,6.0,1.0,6.0,9.0,0.0,1.0,0.0,13.0,3.0,5.0,8.0,16.0,11.0,153.09,0.6488,0.3254,Ж,test,0.7288,1.8741,0.5263,0.7692,,,0.2917,,0.6682,0.5592,0.4,0.5871,0.4654,,0.6055,0.0,0.559,0.6183,0.4845,,0.5471,0.4554,0.6479,0.824,1.4055,1.408,,,,0.5208,,0.5462,,0.1559,0.0449,0.0,0.83,0.0115,0.3846,,0.7418,0.5004,1.2014,1.3485,0.0,1.2304,0.7229,0.5943,1.5156,0.0147,0.8036,0.6366,,0.7793,1.2143,1.0723,1.3947,0.0123,0.4621,0.4864,0.7067,0.0589,,,0.5946,0.0823,,0.1414,,0.8669,0.3707,0.0,0.7177,0.0866,1.3485,,0.464,0.3956,0.193,0.0,0.8019,0.1895,0.6128,2.1596,0.681,0.6546,0.13,1.2374,,,0.0,0.8756,0.6718,2.0876,0,0.6055,18.0,1.3393,0.5821,0,0.923077,0.071429,10.0,84.314,98.0,16.0,11.0,137.282,28.776,6.0,169.658,10.68,7.0,28.776,21.0,8.0,9.0,4469.86,658.85,1286.32,7736.05,418.8,3233.31,811.73,2321.61,182.82,283.84,3648.23,3141.25,356.67,237.25,283.84,3648.23,1195.37,535.42,1.7078,0.2798,0.3008
1,15906620,57.0,1.0,0.0,2.0,1.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,2.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,0.0,2.0,1.0,1.0,1.0,0.0,3.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,55.99,0.0,1.0,Ж,test,0.0,0.963,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,1.0102,0.0,0.0,,,,0.0,0.0,0.0,1.0027,0.0,,,,,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,0.1094,0.0,,,1.1289,0.0,0.6188,0.0,0.0,0.0,,0.4981,0.6382,,,,1.1289,0.0,0.0,0.4981,0.6382,0.0,0.0,0.0,,,,0.0,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,0.2072,0.0,,,0.3993,0.8333,0.0,0.0,0.0,,0.6192,0.5405,,0.2072,,,,0.0,0.0,,0.6192,1,0.0,4.0,0.0,0.0,0,1.0,0.0,1.0,1.0,2.0,2.0,0.0,0.0,1.0,0.0,1.744,2.0,1.0,1.0,0.0,2.0,0.0,113.39,62.69,58.71,93.35,87.01,0.0,122.98,0.0,58.71,87.01,179.83,113.39,62.69,58.71,87.01,179.83,0.0,122.98,0.0,0.0,0.0
2,16495466,38.0,7.0,0.0,15.0,4.0,9.0,5.0,9.0,14.0,7.0,6.0,10.0,14.0,5.0,11.0,0.0,3.0,2.0,2.0,0.0,3.0,2.0,1.0,1.0,0.0,0.0,2.0,6.0,0.0,9.0,2.0,5.0,1.0,7.0,7.0,8.0,3.0,2.0,6.0,6.0,3.0,4.0,0.0,0.0,2.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,290.0,0.3739,0.4768,М,control,,0.3295,,,0.0,,0.0,0.0,0.4159,0.8485,0.0,0.0,0.0302,0.0,0.6009,0.6205,1.0035,0.5712,0.5762,0.4714,0.983,0.0,,0.5559,,0.978,0.0,,,0.0,0.0,0.0078,,0.8362,1.3183,0.856,0.0,0.0,0.6077,0.0,0.7665,1.2056,,1.0002,0.0541,0.8461,0.5812,0.6945,1.7252,0.7579,0.6608,0.856,0.9266,1.1554,0.7782,0.7471,2.0674,0.8871,,0.1201,,0.6629,,,0.0,0.0,0.0,0.0666,,0.4668,1.3422,0.3536,0.0,0.0,0.01,0.0,0.0457,0.2615,0.5856,0.287,0.5238,0.2017,0.284,1.8758,0.6338,0.2654,0.0,0.4481,0.7673,0.2393,0.2851,0.517,0.2407,2.5227,0,0.7256,34.0,0.0,0.7256,0,1.0,0.25,5.0,21.102,50.0,109.0,0.0,0.0,7.594,0.0,25.294,11.084,3.0,11.158,31.0,59.0,0.0,1564.91,971.09,177.93,3257.49,975.21,2555.27,6351.29,0.0,0.0,0.0,783.87,1239.19,533.46,83.37,593.13,1217.43,1336.83,3709.82,0.0,,0.0803
3,16570217,65.0,6.0,3.0,25.0,2.0,10.0,14.0,11.0,8.0,1.0,0.0,2.0,6.0,7.0,2.0,0.0,0.0,0.0,1.0,0.0,5.0,0.0,0.0,1.0,0.0,0.0,0.0,2.0,1.0,11.0,2.0,3.0,5.0,5.0,4.0,2.0,1.0,0.0,1.0,3.0,1.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,3.0,0.0,51.81,0.0,1.0,Ж,test,0.0,1.4933,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,0.0,,,0.0,0.0,,,0.3295,0.0,0.0,0.0,0.0,0.7432,0.0,0.1315,,,,0.0,0.0,0.0,0.0,0.0,0.0,,0.0,,,,0.7904,0.005,,,0.0,,0.0,,0.0166,0.5362,,0.578,0.1315,0.3219,1.129,,1.7975,1.253,0.0,0.0,0.0,0.2354,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,0.0,,,,0.1655,0.656,,0.0,,0.0,,0.1326,0.1477,,0.7469,0.2352,0.2354,0.6846,,0.2671,0.1028,3.0736,1,0.0,40.0,0.0,0.0,0,0.909091,0.0,2.0,12.544,49.0,39.0,0.0,0.0,2.778,0.0,2.0,34.212,2.0,3.778,2.0,13.0,0.0,358.22,3798.18,680.93,1425.07,175.73,602.81,3544.76,0.0,119.99,73.24,346.74,139.68,1849.91,360.4,175.73,496.73,172.58,1246.21,0.0,0.0,0.0
4,16346871,61.0,0.0,1.0,2.0,0.0,2.0,1.0,0.0,3.0,2.0,1.0,1.0,5.0,5.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,2.0,0.0,0.0,1.0,0.0,1.0,2.0,0.0,2.0,1.0,0.0,8.0,2.0,2.0,1.0,1.0,4.0,3.0,0.0,0.0,0.0,1.0,2.0,4.0,1.0,1.0,2.0,4.0,2.0,161.12,0.2882,0.2882,Ж,test,0.9301,0.9014,0.8165,0.7542,0.0,,0.0,0.0,,0.0,0.0,0.0,0.6682,0.0,0.7781,0.0,0.0,0.4826,0.7526,0.0,,0.4714,0.4714,0.998,1.3497,0.0,0.0,,0.0,0.0,0.0,,,0.0,0.0,1.2273,0.0,0.0249,0.9172,0.0,,0.6549,0.0,0.6684,0.4566,0.0,0.0,0.0138,1.2899,1.607,1.071,0.9058,0.0,0.5955,,,1.6232,1.3194,0.4903,0.4903,0.9423,0.0,,0.0,0.0,0.0,0.0,,,0.0,0.0,0.2284,0.0,0.1325,0.2146,0.0,,0.1934,0.5189,0.0058,0.0,0.0,0.3868,2.0293,0.686,0.6113,1.0926,0.2211,0.0,0.0058,0.2496,,0.2195,1.4917,0,0.7128,20.0,0.0,0.7865,0,1.0,0.1,0.0,1.454,25.0,25.0,0.0,0.0,0.454,0.0,3.036,12.0,0.0,1.454,8.0,23.0,0.0,226.98,168.05,960.37,1560.21,0.0,342.45,1039.85,0.0,66.18,0.0,87.94,226.98,168.05,461.37,0.0,237.93,225.51,995.27,1.4142,0.3495,0.3495
5,15692946,27.0,0.0,0.0,7.0,0.0,2.0,1.0,0.0,2.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.0,0.0,2.0,1.0,0.0,0.0,2.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,33.62,0.994,0.9961,Ж,test,0.5292,0.6376,0.4129,0.3536,,0.0,,0.6625,0.4,0.0,0.0,,0.4777,0.0,0.3912,0.1651,0.0,,0.4777,0.0,0.0,0.8255,0.7813,1.4281,0.764,0.0,,0.0158,0.0,,0.003,0.0225,0.0,0.288,0.0,,,0.004,1.0334,0.0,0.0396,0.3889,,0.0,1.1955,0.288,,0.0,0.0,,,1.0334,0.0,0.3889,0.288,,0.0,,0.2211,0.5565,0.469,0.0,0.4149,0.0,,0.7606,0.0001,0.3357,0.0,0.1791,0.0,,,0.7606,0.0857,0.0,0.0021,0.306,0.0,0.3104,0.1791,,0.0,0.0,,,0.7606,0.0857,0.0,0.3104,,0.1791,0.0,0.0,1,0.264,20.0,0.75,0.2128,0,0.0,1.0,0.0,2.0,2.0,1.0,0.0,0.436,2.0,0.0,0.436,13.0,0.0,2.0,2.0,1.0,0.0,10.9,379.43,90.07,135.59,0.0,114.48,209.34,10.9,90.07,0.0,239.39,10.9,379.43,90.07,0.0,239.39,114.48,209.34,3.0957,0.0584,0.1171
6,16536055,57.0,1.0,3.0,0.0,0.0,0.0,1.0,2.0,6.0,2.0,0.0,1.0,2.0,2.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,3.0,0.0,0.0,0.0,0.0,2.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,Ж,control,,0.0762,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,0.9557,0.5615,0.0,,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,0.0,0.0,0.1139,0.8781,0.0,0.0,,,0.5265,0.7076,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,0.0,0.0,0.282,0.4718,0.1252,0.0,,,,,0.9513,0,1.0,34.0,0.0,1.0,0,0.0,0.111111,0.0,0.0,2.0,13.0,0.0,0.0,0.0,0.0,8.584,0.0,0.0,0.0,3.708,3.0,0.0,740.92,0.0,593.3,994.78,0.0,454.28,405.8,0.0,23.93,0.0,0.0,361.89,0.0,287.61,0.0,0.0,55.42,96.6,0.0,,
7,16388254,32.0,1.0,5.0,7.0,1.0,6.0,2.0,4.0,1.0,2.0,1.0,4.0,2.0,1.0,2.0,6.0,1.0,5.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,3.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,2.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,198.1,0.0,0.0,Ж,test,0.0,,0.0,0.0,,,,,,,,,,0.0,,0.0,0.0,0.7157,0.7716,0.0,,0.0,0.0,0.0,0.0,,,,,,,,,,,,,0.0,,0.0,0.0,,,0.0,0.0,0.0,0.0,0.0,,,0.5087,0.7576,0.0,0.7853,,,0.0389,0.0117,0.0,0.0,0.0,,,,,,,,,,,,,0.0,,0.0,0.0,,0.0,0.0,0.0,0.0,0.0,,,0.4901,,0.2224,0.0,0.0,,,,0.6881,0,0.0,64.0,0.0,0.0,0,1.0,0.0,1.0,10.526,26.0,11.0,6.0,1.0,0.0,0.0,9.464,1.0,0.0,2.746,2.0,3.0,2.0,2297.08,308.16,419.75,2501.86,59.99,492.46,445.99,69.99,0.0,0.0,0.0,825.75,19.19,195.98,0.0,343.24,73.78,100.37,0.0,0.0,0.0
8,16406288,69.0,1.0,1.0,5.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,2.0,0.0,0.0,1.0,2.0,1.0,1.0,1.0,2.0,0.0,1.0,0.0,0.0,0.0,1.0,1.0,1.0,3.0,0.0,1.0,0.0,0.0,3.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,2.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,128.38,0.0,0.0,Ж,test,0.0,0.4061,0.0,0.0,,,,,,,,,,0.0,0.5728,0.0,,0.833,0.8365,0.0,,0.0,0.0,0.202,0.0,0.0,,,,,,,,,,,,0.0034,,0.0,,0.2622,0.0,0.0716,0.0,,0.0,,0.0444,0.0,0.619,0.3102,0.0,0.259,,0.0,1.6058,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.0002,,0.0,,0.0752,0.2947,0.0,,0.0,,1.1356,0.0,0.0022,0.1908,0.1403,0.0,0.0,0.0,,,1.5719,0,0.0,17.0,0.0,0.0,0,1.0,0.0,0.0,1.364,22.0,2.0,2.0,2.432,1.364,0.0,11.836,4.436,0.0,1.364,3.0,0.0,0.0,1271.51,320.39,466.67,1123.75,0.0,458.88,117.15,249.85,79.97,0.0,115.93,1218.97,166.49,408.97,0.0,115.93,329.43,0.0,0.0,0.0,0.0
9,15779982,36.0,1.0,1.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,,,,,,,,,,1.0,1.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,7.65,0.0,0.0,М,test,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,0.0,,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,0.0,0.0,0.0,,0.0,,0,0.0,4.0,0.0,0.0,0,0.0,0.0,0.0,1.688,4.0,0.0,1.0,,,,11.312,4.294,0.0,1.688,1.0,0.0,1.0,491.43,411.23,0.0,168.74,0.0,43.0,0.0,,,,,491.43,411.23,0.0,0.0,406.45,43.0,0.0,0.0,0.0,0.0


In [3]:
# Encode the categorical columns as 0/1 flags
pdf['gender'] = pdf['gender'].apply(lambda x: 1 if x == 'М' else 0)
pdf['group'] = pdf['group'].apply(lambda x: 1 if x == 'test' else 0)

# Replace missing values with 0 in every column
pdf = pdf.fillna(0)

# Split into the target and the feature matrix
y = pdf['response_att']
X = pdf.drop(columns=['response_att'])
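
The TODO in the preview cell mentions scaling columns by expected value and variation. One possible reading of that note is standardization: subtract each column's mean and divide by its standard deviation. The sketch below is an assumption about what was intended, not the notebook's actual method. The helper name `standardize` and the toy values are made up; only the column names mirror the dataset.

```python
import pandas as pd


def standardize(frame, columns):
    """Return a copy with the given columns rescaled to zero mean and unit variance."""
    scaled = frame.copy()
    for column in columns:
        mean = scaled[column].mean()
        std = scaled[column].std(ddof=0)  # population standard deviation
        # A constant column has zero spread; set it to 0 instead of dividing by 0.
        scaled[column] = (scaled[column] - mean) / std if std > 0 else 0.0
    return scaled


# Toy stand-in for X with invented values.
toy = pd.DataFrame({
    "age": [20.0, 40.0, 60.0],
    "children": [0.0, 0.0, 0.0],
    "gender": [0, 1, 0],
})
toy_scaled = standardize(toy, ["age", "children"])
print(toy_scaled)
```

Binary flags such as `gender` and `group` are left out of the column list here because rescaling them is rarely useful. In a real experiment the mean and standard deviation would normally be computed on the training split only and then reused for validation data.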

In [6]:
# index=False avoids writing the row index as an extra unnamed column
pdf.to_csv("dataset.csv", index=False)

In [10]:
from_disk = pd.read_csv("dataset.csv")
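
The `Unnamed: 0` column in the preview above is typical of a DataFrame that was saved together with its row index and then read back. A small self-contained sketch of that effect, using an in-memory buffer and a toy frame (the two rows are taken from the preview, everything else is illustrative):

```python
import io

import pandas as pd

toy = pd.DataFrame({"CardHolder": [16095858, 15906620], "age": [47.0, 57.0]})

# Default behaviour: the row index is written as an unnamed first column.
with_index = io.StringIO()
toy.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()
print(cols_with_index)

# index=False writes only the real columns.
without_index = io.StringIO()
toy.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()
print(cols_without_index)
```

If a file was already written with the index, `pd.read_csv(path, index_col=0)` is another way to keep the extra column out of the features.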