# Cloud Workshop Microsoft
## 1. Data preparation with Azure ML service

Example: City of Chicago data


## 0. Setup

In [31]:
#!pip install --upgrade azureml-dataprep

In [32]:
from IPython.display import display
from os import path
from tempfile import mkdtemp

import pandas as pd
import azureml.dataprep as dprep

# Data files
file_crime_dirty  = 'data/crime-dirty.csv'
file_crime_spring = 'data/crime-spring.csv'
file_crime_winter = 'data/crime-winter.csv'
file_aldermen     = 'data/chicago-aldermen-2015.csv'

# Seed
RAND_SEED = 7251

<a id="Read"></a>

## 1. Reading the data

Azure ML Data Prep supports many file formats (e.g. CSV, Excel, Parquet) and can infer column types automatically. To see how powerful the `auto_read_file` capability is, let's first take a peek at `crime-dirty.csv` with the plain CSV reader:

In [33]:
dprep.read_csv(path=file_crime_dirty).head(10)

Unnamed: 0,File updated 11/2/2018,Column2
0,,
1,,
2,,
3,ID|Case Number|Date|Block|IUCR|Primary Type|De...,
4,10140490|HY329907|07/05/2015 11:50:00 PM|050XX...,-87.800174996)
5,10139776|HY329265|07/05/2015 11:30:00 PM|011XX...,-87.65955018)
6,10140270|HY329253|07/05/2015 11:20:00 PM|121XX...,
7,10139885|HY329308|07/05/2015 11:19:00 PM|051XX...,-87.754883404)
8,10140379|HY329556|07/05/2015 11:00:00 PM|012XX...,-87.657008701)
9,10140868|HY330421|07/05/2015 10:54:00 PM|118XX...,-87.644545209)


A common occurrence in many datasets is a column whose values contain commas; in our case, the last column represents location as a latitude-longitude pair. The default CSV reader interprets this comma as a delimiter and thus splits the data into two columns. Furthermore, it takes the `File updated 11/2/2018` line as the column name and reads the real header in as a data row. Normally, we would need to skip the lines above the header and specify the delimiter as `|`, but `auto_read_file` eliminates that work:

In [34]:
crime_dirty = dprep.auto_read_file(path=file_crime_dirty)

In [35]:
crime_dirty.head(10)

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,41.0,10.0,06,1129230.0,1933315.0,2015.0,07/12/2015 12:42:46 PM,41.973309,-87.800175,"(41.973309466, -87.800174996)"
1,10139776.0,HY329265,07/05/2015 11:30:00 PM,011XX W MORSE AVE,460.0,BATTERY,SIMPLE,STREET,False,True,...,49.0,1.0,08B,1167370.0,1946271.0,2015.0,07/12/2015 12:42:46 PM,42.008124,-87.65955,"(42.008124017, -87.65955018)"
2,10140270.0,HY329253,07/05/2015 11:20:00 PM,121XX S FRONT AVE,486.0,BATTERY,DOMESTIC BATTERY SIMPLE,STREET,False,True,...,9.0,53.0,08B,,,2015.0,07/12/2015 12:42:46 PM,,,
3,10139885.0,HY329308,07/05/2015 11:19:00 PM,051XX W DIVISION ST,610.0,BURGLARY,FORCIBLE ENTRY,SMALL RETAIL STORE,False,False,...,37.0,25.0,05,1141721.0,1907465.0,2015.0,07/12/2015 12:42:46 PM,41.902152,-87.754883,"(41.902152027, -87.754883404)"
4,10140379.0,HY329556,07/05/2015 11:00:00 PM,012XX W LAKE ST,930.0,MOTOR VEHICLE THEFT,THEFT/RECOVERY: AUTOMOBILE,STREET,False,False,...,27.0,28.0,07,1168413.0,1901632.0,2015.0,07/12/2015 12:42:46 PM,41.88561,-87.657009,"(41.885610142, -87.657008701)"
5,10140868.0,HY330421,07/05/2015 10:54:00 PM,118XX S PEORIA ST,1320.0,CRIMINAL DAMAGE,TO VEHICLE,VEHICLE NON-COMMERCIAL,False,False,...,34.0,53.0,14,1172409.0,1826485.0,2015.0,07/12/2015 12:42:46 PM,41.679311,-87.644545,"(41.6793109, -87.644545209)"
6,10139762.0,HY329232,07/05/2015 10:42:00 PM,026XX W 37TH PL,1020.0,ARSON,BY FIRE,VACANT LOT/LAND,False,False,...,12.0,58.0,09,1159436.0,1879658.0,2015.0,07/12/2015 12:42:46 PM,41.825501,-87.690578,"(41.825500607, -87.690578042)"
7,10139722.0,HY329228,07/05/2015 10:30:00 PM,016XX S CENTRAL PARK AVE,1811.0,NARCOTICS,POSS: CANNABIS 30GMS OR LESS,ALLEY,True,False,...,24.0,29.0,18,1152687.0,1891389.0,2015.0,07/12/2015 12:42:46 PM,41.857828,-87.715029,"(41.857827814, -87.715028789)"
8,10139774.0,HY329209,07/05/2015 10:15:00 PM,048XX N ASHLAND AVE,1310.0,CRIMINAL DAMAGE,TO PROPERTY,APARTMENT,False,False,...,46.0,3.0,14,1164821.0,1932394.0,2015.0,07/12/2015 12:42:46 PM,41.9701,-87.669324,"(41.970099796, -87.669324377)"
9,10139697.0,HY329177,07/05/2015 10:10:00 PM,058XX S ARTESIAN AVE,1320.0,CRIMINAL DAMAGE,TO VEHICLE,ALLEY,False,False,...,16.0,63.0,14,1160997.0,1865851.0,2015.0,07/12/2015 12:42:46 PM,41.78758,-87.685233,"(41.787580282, -87.685233078)"


__Advanced features:__ if you'd like to specify the file type and adjust how you want to read files in, you can see the list of our specialized file readers and how to use them [here](../../how-to-guides/data-ingestion.ipynb).
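
For comparison, the manual route described above (skipping the lines before the header and setting `|` as the delimiter) can be sketched with Python's standard `csv` module. This is only an illustration of the idea, not how `auto_read_file` works internally; the `raw` text is a hypothetical miniature of `crime-dirty.csv` with most columns left out.

```python
import csv

# Hypothetical miniature of crime-dirty.csv: a title line, blank lines,
# then a pipe-delimited table whose last field contains a comma.
raw = (
    "File updated 11/2/2018\n"
    "\n"
    "\n"
    "ID|Case Number|Primary Type|Location\n"
    "10140490|HY329907|THEFT|(41.973309466, -87.800174996)\n"
    "10139776|HY329265|BATTERY|(42.008124017, -87.65955018)\n"
)

lines = raw.splitlines()

# Skip everything before the real header (the first line containing '|').
start = next(i for i, line in enumerate(lines) if "|" in line)

# Parse the remaining lines with '|' as the delimiter, so the comma
# inside Location no longer splits the value.
reader = csv.DictReader(lines[start:], delimiter="|")
rows = list(reader)

print(reader.fieldnames)
print(rows[0]["Location"])
```

With the delimiter set explicitly, the `Location` value stays in one field instead of spilling into a second column.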

<a id="Profile"></a>

## 2. Profiling the data
Let's understand what our data looks like. Azure ML Data Prep facilitates this process by offering data profiles that give us a glimpse of column types and column summary statistics. Notice that the automatic file reader has already inferred a type for each column:

In [36]:
crime_dirty.get_profile()

Unnamed: 0,Type,Min,Max,Count,Missing Count,Not Missing Count,Percent missing,Error Count,Empty count,0.1% Quantile,1% Quantile,5% Quantile,25% Quantile,50% Quantile,75% Quantile,95% Quantile,99% Quantile,99.9% Quantile,Mean,Standard Deviation,Variance,Skewness,Kurtosis
ID,FieldType.DECIMAL,1.01397e+07,1.01409e+07,10.0,0.0,10.0,0.0,0.0,0.0,10139700.0,10139700.0,10139700.0,10139800.0,10139800.0,10140400.0,10140900.0,10140900.0,10140900.0,10140100.0,409.806,167941.0,0.688352,-1.15364
Case Number,FieldType.STRING,HY329177,HY330421,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Date,FieldType.STRING,07/05/2015 10:10:00 PM,07/05/2015 11:50:00 PM,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Block,FieldType.STRING,011XX W MORSE AVE,121XX S FRONT AVE,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
IUCR,FieldType.DECIMAL,460,1811,10.0,0.0,10.0,0.0,0.0,0.0,460.0,473.0,460.0,610.0,975.0,1320.0,1811.0,1811.0,1811.0,1008.7,435.056,189273.0,0.27388,-1.23243
Primary Type,FieldType.STRING,ARSON,THEFT,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Description,FieldType.STRING,$500 AND UNDER,TO VEHICLE,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Location Description,FieldType.STRING,ALLEY,VEHICLE NON-COMMERCIAL,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Arrest,FieldType.BOOLEAN,False,True,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
Domestic,FieldType.BOOLEAN,False,True,10.0,0.0,10.0,0.0,0.0,0.0,,,,,,,,,,,,,,
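
To make the summary statistics concrete, the `IUCR` row of the profile can be partly recomputed by hand from the ten rows displayed earlier. The sketch below uses Python's standard `statistics` module; it is not how `get_profile()` is implemented, and it covers only a handful of the profile's columns.

```python
import statistics

# IUCR values of the ten rows displayed by crime_dirty.head(10).
iucr = [820, 460, 486, 610, 930, 1320, 1020, 1811, 1310, 1320]

profile = {
    "Min": min(iucr),
    "Max": max(iucr),
    "Count": len(iucr),
    "50% Quantile": statistics.median(iucr),
    "Mean": statistics.mean(iucr),
    # Sample (n - 1) variance and standard deviation.
    "Variance": statistics.variance(iucr),
    "Standard Deviation": statistics.stdev(iucr),
}

for name, value in profile.items():
    print(f"{name}: {value}")
```

The mean (1008.7), median (975), sample variance (about 189273) and sample standard deviation (about 435.056) agree with the values shown in the profile.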


<a id="Append"></a>

## 3. Appending data
What if your data is split across multiple files? We support the ability to append multiple datasets column-wise and row-wise. Here, we demonstrate how you can coalesce datasets row-wise:

In [37]:
crime_winter = dprep.auto_read_file(path=file_crime_winter)
crime_spring = dprep.auto_read_file(path=file_crime_spring)

In [38]:
crime = crime_dirty.append_rows(dataflows=[crime_winter, crime_spring])

In [41]:
crime.head(10)

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,41.0,10.0,06,1129230.0,1933315.0,2015.0,07/12/2015 12:42:46 PM,41.973309,-87.800175,"(41.973309466, -87.800174996)"
1,10139776.0,HY329265,07/05/2015 11:30:00 PM,011XX W MORSE AVE,460.0,BATTERY,SIMPLE,STREET,False,True,...,49.0,1.0,08B,1167370.0,1946271.0,2015.0,07/12/2015 12:42:46 PM,42.008124,-87.65955,"(42.008124017, -87.65955018)"
2,10140270.0,HY329253,07/05/2015 11:20:00 PM,121XX S FRONT AVE,486.0,BATTERY,DOMESTIC BATTERY SIMPLE,STREET,False,True,...,9.0,53.0,08B,,,2015.0,07/12/2015 12:42:46 PM,,,
3,10139885.0,HY329308,07/05/2015 11:19:00 PM,051XX W DIVISION ST,610.0,BURGLARY,FORCIBLE ENTRY,SMALL RETAIL STORE,False,False,...,37.0,25.0,05,1141721.0,1907465.0,2015.0,07/12/2015 12:42:46 PM,41.902152,-87.754883,"(41.902152027, -87.754883404)"
4,10140379.0,HY329556,07/05/2015 11:00:00 PM,012XX W LAKE ST,930.0,MOTOR VEHICLE THEFT,THEFT/RECOVERY: AUTOMOBILE,STREET,False,False,...,27.0,28.0,07,1168413.0,1901632.0,2015.0,07/12/2015 12:42:46 PM,41.88561,-87.657009,"(41.885610142, -87.657008701)"
5,10140868.0,HY330421,07/05/2015 10:54:00 PM,118XX S PEORIA ST,1320.0,CRIMINAL DAMAGE,TO VEHICLE,VEHICLE NON-COMMERCIAL,False,False,...,34.0,53.0,14,1172409.0,1826485.0,2015.0,07/12/2015 12:42:46 PM,41.679311,-87.644545,"(41.6793109, -87.644545209)"
6,10139762.0,HY329232,07/05/2015 10:42:00 PM,026XX W 37TH PL,1020.0,ARSON,BY FIRE,VACANT LOT/LAND,False,False,...,12.0,58.0,09,1159436.0,1879658.0,2015.0,07/12/2015 12:42:46 PM,41.825501,-87.690578,"(41.825500607, -87.690578042)"
7,10139722.0,HY329228,07/05/2015 10:30:00 PM,016XX S CENTRAL PARK AVE,1811.0,NARCOTICS,POSS: CANNABIS 30GMS OR LESS,ALLEY,True,False,...,24.0,29.0,18,1152687.0,1891389.0,2015.0,07/12/2015 12:42:46 PM,41.857828,-87.715029,"(41.857827814, -87.715028789)"
8,10139774.0,HY329209,07/05/2015 10:15:00 PM,048XX N ASHLAND AVE,1310.0,CRIMINAL DAMAGE,TO PROPERTY,APARTMENT,False,False,...,46.0,3.0,14,1164821.0,1932394.0,2015.0,07/12/2015 12:42:46 PM,41.9701,-87.669324,"(41.970099796, -87.669324377)"
9,10139697.0,HY329177,07/05/2015 10:10:00 PM,058XX S ARTESIAN AVE,1320.0,CRIMINAL DAMAGE,TO VEHICLE,ALLEY,False,False,...,16.0,63.0,14,1160997.0,1865851.0,2015.0,07/12/2015 12:42:46 PM,41.78758,-87.685233,"(41.787580282, -87.685233078)"
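
As a rough mental model of a row-wise append, here is a plain-Python sketch that stacks lists of dict rows. The helper `append_rows` below is hypothetical and unrelated to the SDK method of the same name; filling columns that a file lacks with `None` is one possible policy and an assumption, not a statement about how Data Prep treats differing schemas.

```python
def append_rows(*tables):
    """Stack lists of dict rows; columns missing from a row become None."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    return [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]

# Hypothetical one-row extracts of the winter and spring files.
winter = [{"ID": 10374287, "Primary Type": "OTHER OFFENSE"}]
spring = [{"ID": 10535059, "Primary Type": "THEFT", "Ward": 24}]

combined = append_rows(winter, spring)
print(combined)
```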


### Sample

In [43]:
crime.take_sample(probability=0.25, seed=RAND_SEED).head(5)

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140270.0,HY329253,07/05/2015 11:20:00 PM,121XX S FRONT AVE,486.0,BATTERY,DOMESTIC BATTERY SIMPLE,STREET,False,True,...,9.0,53.0,08B,,,2015.0,2015-07-12 12:42:46,,,
1,10374287.0,HZ110730,1/10/2016 11:50,043XX W ARMITAGE AVE,5002.0,OTHER OFFENSE,OTHER VEHICLE OFFENSE,STREET,False,True,...,30.0,20.0,26,1146917.0,1912931.0,2016.0,2016-06-07 15:55:00,41.917054,-87.735658,"(41.917053561, -87.735657637)"
2,10375178.0,HZ110832,1/10/2016 14:20,057XX S KEDZIE AVE,460.0,BATTERY,SIMPLE,RESTAURANT,False,False,...,14.0,63.0,08B,1156029.0,1866379.0,2016.0,2016-02-04 15:44:00,41.789131,-87.703435,"(41.78913051, -87.703434602)"
3,10535059.0,HZ278872,2016-04-15T04:30:00.000000,004XX S KILBOURN AVE,810.0,THEFT,OVER $500,RESIDENCE,False,False,...,24.0,26.0,6,,,2016.0,2016-05-25 15:59:00,,,
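
Note that `head(5)` returned only four rows here, so the sample itself held four records. One way to picture probability-based sampling is the sketch below, which keeps each row independently with the given probability. It assumes that `take_sample` behaves this way; the local `take_sample` function is a hypothetical stand-in, not the SDK implementation.

```python
import random

def take_sample(rows, probability, seed):
    """Keep each row independently with the given probability."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < probability]

rows = list(range(1000))  # stand-in for the records of a Dataflow
sample = take_sample(rows, probability=0.25, seed=7251)

# The size varies around probability * len(rows), but a fixed seed
# makes the selection reproducible.
print(len(sample))
```

Under this model the sample size is not fixed in advance, while reusing the same seed always selects the same rows.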


__Advanced features:__ you can learn how to append column-wise and how to deal with appending data with different schemas [here](../../how-to-guides/append-columns-and-rows.ipynb).

<a id="Data-science-transforms"></a>

## 4. Other available functions

Azure ML Data Prep supports almost all common data science transforms found in other industry-standard data science libraries. Here, we'll explore the ability to `summarize`, `join`, `filter`, and `replace`. 

__Advanced features:__
* We also provide "smart" transforms not found in pandas that use machine learning to [derive new columns](../../how-to-guides/derive-column-by-example.ipynb), [split columns](../../how-to-guides/split-column-by-example.ipynb), and perform [fuzzy grouping](../../how-to-guides/fuzzy-group.ipynb).
* Finally, we also help featurize your dataset to prepare it for machine learning; learn more about our featurizers like [one-hot encoder](../../how-to-guides/one-hot-encoder.ipynb), [label encoder](../../how-to-guides/label-encoder.ipynb), [min-max scaler](../../how-to-guides/min-max-scaler.ipynb), and [random (train-test) split](../../how-to-guides/random-split.ipynb).
* Our complete list of example Notebooks for transforms can be found in our [How-to Guides](../../how-to-guides).

## 5. Aggregation

Let's see which wards had the most crimes in our sample dataset:

In [45]:
crime_summary = (crime
    .summarize(
        summary_columns=[
            dprep.SummaryColumnsValue(
                column_id='ID', 
                summary_column_name='TOTAL_par_zonegeo', 
                summary_function=dprep.SummaryFunction.COUNT
            )
        ],
        group_by_columns=['Ward']
    )
)

(crime_summary
     .sort(sort_order=[('TOTAL_par_zonegeo', True)])
     .head(5)
)

Unnamed: 0,Ward,TOTAL_par_zonegeo
0,9.0,3
1,41.0,2
2,16.0,2
3,24.0,2
4,14.0,2
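
The summarize-then-sort pipeline above amounts to a group-by count followed by a descending sort. A minimal plain-Python sketch of the same logic, using a hypothetical list of `Ward` values rather than the real dataset:

```python
from collections import Counter

# Hypothetical Ward values, one per crime record.
wards = [9, 41, 16, 9, 24, 41, 14, 9, 16, 24, 14, 49]

# Group by Ward and count the records in each group.
counts = Counter(wards)

# Sort by count, largest first, and keep the top five.
top5 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:5]
print(top5)
```

Because Python's sort is stable, wards with equal counts keep the order in which they first appeared.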


<a id="Join"></a>

## 6. Join

Let's annotate each observation with more information about the ward where the crime occurred. We do so by joining `crime` with a dataset that lists the current alderman (city council member) for each ward, loaded below as `juge`:

In [48]:
juge = dprep.auto_read_file(path=file_aldermen)

In [50]:
juge.head(5)

Unnamed: 0,Ward,Name,Took Office,Party
0,1.0,Proco Joe Moreno,2010*,Dem
1,2.0,Brian Hopkins,2015,Dem
2,3.0,Pat Dowell,2007,Dem
3,4.0,Sophia King,2016*,Dem
4,5.0,Leslie Hairston,1999,Dem


In [51]:
crime.join(
    left_dataflow=crime,
    right_dataflow=juge,
    join_key_pairs=[
        ('Ward', 'Ward')
    ]
).head(5)

Unnamed: 0,l_ID,l_Case Number,l_Date,l_Block,l_IUCR,l_Primary Type,l_Description,l_Location Description,l_Arrest,l_Domestic,...,l_Y Coordinate,l_Year,l_Updated On,l_Latitude,l_Longitude,l_Location,r_Ward,r_Name,r_Took Office,r_Party
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,1933315.0,2015.0,2015-07-12 12:42:46,41.973309,-87.800175,"(41.973309466, -87.800174996)",41.0,Anthony Napolitano,2015,Rep
1,10400131.0,HZ136171,1/10/2016 18:00,0000X W TERMINAL ST,810.0,THEFT,OVER $500,AIRPORT BUILDING NON-TERMINAL - SECURE AREA,False,False,...,,2016.0,2016-02-02 15:58:00,,,,41.0,Anthony Napolitano,2015,Rep
2,10139776.0,HY329265,07/05/2015 11:30:00 PM,011XX W MORSE AVE,460.0,BATTERY,SIMPLE,STREET,False,True,...,1946271.0,2015.0,2015-07-12 12:42:46,42.008124,-87.65955,"(42.008124017, -87.65955018)",49.0,Joe Moore,1991,Dem
3,10140270.0,HY329253,07/05/2015 11:20:00 PM,121XX S FRONT AVE,486.0,BATTERY,DOMESTIC BATTERY SIMPLE,STREET,False,True,...,,2015.0,2015-07-12 12:42:46,,,,9.0,Anthony Beale,1999,Dem
4,10498554.0,HZ239907,2016-04-15T23:56:00.000000,007XX E 111TH ST,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,OTHER,False,False,...,1831503.0,2016.0,2016-05-11 15:48:00,41.692834,-87.604319,"(41.692833841, -87.60431945)",9.0,Anthony Beale,1999,Dem


__Advanced features:__ [Learn more](../../how-to-guides/join.ipynb) about how you can do all variants of `join`, like inner-, left-, right-, anti-, and semi-joins.
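
To illustrate what the join produces, including the `l_` and `r_` column prefixes visible in the output, here is a plain-Python sketch of an inner join on `Ward`. The `inner_join` helper is hypothetical, the inner-join semantics are an assumption about the call above, and the third crime row is invented to show an unmatched key.

```python
def inner_join(left, right, left_key, right_key):
    """Inner-join two lists of dict rows, prefixing columns with l_ / r_."""
    # Index the right-hand rows by their join key.
    index = {}
    for right_row in right:
        index.setdefault(right_row[right_key], []).append(right_row)

    joined = []
    for left_row in left:
        # Rows without a matching key on the right are dropped.
        for right_row in index.get(left_row[left_key], []):
            row = {f"l_{k}": v for k, v in left_row.items()}
            row.update({f"r_{k}": v for k, v in right_row.items()})
            joined.append(row)
    return joined

crimes = [
    {"ID": 10140490, "Primary Type": "THEFT", "Ward": 41},
    {"ID": 10139776, "Primary Type": "BATTERY", "Ward": 49},
    {"ID": 1, "Primary Type": "THEFT", "Ward": 99},  # invented, no match
]
aldermen = [
    {"Ward": 41, "Name": "Anthony Napolitano"},
    {"Ward": 49, "Name": "Joe Moore"},
]

result = inner_join(crimes, aldermen, "Ward", "Ward")
for row in result:
    print(row["l_ID"], row["r_Name"])
```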

<a id="Filter"></a>

## 7. Filters

Let's look at theft crimes:

In [52]:
theft = crime.filter(crime['Primary Type'] == 'THEFT')

theft.head(5)

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,41.0,10.0,6,1129230.0,1933315.0,2015.0,2015-07-12 12:42:46,41.973309,-87.800175,"(41.973309466, -87.800174996)"
1,10374720.0,HZ110836,1/10/2016 7:30,079XX S RHODES AVE,890.0,THEFT,FROM BUILDING,OTHER,False,False,...,6.0,44.0,6,1181279.0,1852568.0,2016.0,2016-02-04 15:44:00,41.750687,-87.611277,"(41.75068679, -87.611276811)"
2,10400131.0,HZ136171,1/10/2016 18:00,0000X W TERMINAL ST,810.0,THEFT,OVER $500,AIRPORT BUILDING NON-TERMINAL - SECURE AREA,False,False,...,41.0,76.0,6,,,2016.0,2016-02-02 15:58:00,,,
3,10516598.0,HZ258664,2016-04-15T17:00:00.000000,082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,2016-05-12 15:48:00,41.744107,-87.664494,"(41.744106973, -87.664494285)"
4,10534446.0,HZ277630,2016-04-15T10:00:00.000000,055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,2016-05-25 15:59:00,,,


<a id="Replace"></a>

## 8. Replacing values

Notice that our `theft` dataset has empty strings in column `Location`. Let's replace those with a missing value:

In [53]:
theft_replaced = (theft
    .replace_na(
        columns=['Location'], 
        use_empty_string_as_na=True
    )
)

theft_replaced.head(5)

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,41.0,10.0,6,1129230.0,1933315.0,2015.0,2015-07-12 12:42:46,41.973309,-87.800175,"(41.973309466, -87.800174996)"
1,10374720.0,HZ110836,1/10/2016 7:30,079XX S RHODES AVE,890.0,THEFT,FROM BUILDING,OTHER,False,False,...,6.0,44.0,6,1181279.0,1852568.0,2016.0,2016-02-04 15:44:00,41.750687,-87.611277,"(41.75068679, -87.611276811)"
2,10400131.0,HZ136171,1/10/2016 18:00,0000X W TERMINAL ST,810.0,THEFT,OVER $500,AIRPORT BUILDING NON-TERMINAL - SECURE AREA,False,False,...,41.0,76.0,6,,,2016.0,2016-02-02 15:58:00,,,
3,10516598.0,HZ258664,2016-04-15T17:00:00.000000,082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,2016-05-12 15:48:00,41.744107,-87.664494,"(41.744106973, -87.664494285)"
4,10534446.0,HZ277630,2016-04-15T10:00:00.000000,055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,2016-05-25 15:59:00,,,


__Advanced features:__ [Learn more](../../how-to-guides/replace-fill-error.ipynb) about more advanced `replace` and `fill` capabilities.
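
The filter and replace steps can be pictured on plain dict rows as follows. This is a sketch of the logic only, using three rows abbreviated from the outputs above, with `None` standing in for a missing value.

```python
rows = [
    {"ID": 10140490, "Primary Type": "THEFT",
     "Location": "(41.973309466, -87.800174996)"},
    {"ID": 10400131, "Primary Type": "THEFT", "Location": ""},
    {"ID": 10139776, "Primary Type": "BATTERY",
     "Location": "(42.008124017, -87.65955018)"},
]

# Filter: keep only the rows whose Primary Type is THEFT.
theft = [row for row in rows if row["Primary Type"] == "THEFT"]

# Replace: turn empty strings in Location into a missing value (None).
theft_replaced = [
    {**row, "Location": row["Location"] or None}
    for row in theft
]

for row in theft_replaced:
    print(row["ID"], row["Location"])
```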

### Resulting data

In [54]:
theft_replaced.to_pandas_dataframe()

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10140490.0,HY329907,07/05/2015 11:50:00 PM,050XX N NEWLAND AVE,820.0,THEFT,$500 AND UNDER,STREET,False,False,...,41.0,10.0,6,1129230.0,1933315.0,2015.0,2015-07-12 12:42:46,41.973309,-87.800175,"(41.973309466, -87.800174996)"
1,10374720.0,HZ110836,1/10/2016 7:30,079XX S RHODES AVE,890.0,THEFT,FROM BUILDING,OTHER,False,False,...,6.0,44.0,6,1181279.0,1852568.0,2016.0,2016-02-04 15:44:00,41.750687,-87.611277,"(41.75068679, -87.611276811)"
2,10400131.0,HZ136171,1/10/2016 18:00,0000X W TERMINAL ST,810.0,THEFT,OVER $500,AIRPORT BUILDING NON-TERMINAL - SECURE AREA,False,False,...,41.0,76.0,6,,,2016.0,2016-02-02 15:58:00,,,
3,10516598.0,HZ258664,2016-04-15T17:00:00.000000,082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,2016-05-12 15:48:00,41.744107,-87.664494,"(41.744106973, -87.664494285)"
4,10534446.0,HZ277630,2016-04-15T10:00:00.000000,055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,2016-05-25 15:59:00,,,
5,10535059.0,HZ278872,2016-04-15T04:30:00.000000,004XX S KILBOURN AVE,810.0,THEFT,OVER $500,RESIDENCE,False,False,...,24.0,26.0,6,,,2016.0,2016-05-25 15:59:00,,,


<a id="Explore"></a>

## 9. Other data preparation features in Azure ML service

Congratulations on finishing your introduction to the Azure ML Data Prep SDK! If you'd like more detailed tutorials on how to construct machine learning datasets or dive deeper into all of its functionality, you can find more information in our detailed notebooks [here](https://github.com/Microsoft/PendletonDocs). There, we cover topics including how to:

* [Cache your Dataflow to speed up your iterations](../../how-to-guides/cache.ipynb)
* [Add your custom Python transforms](../../how-to-guides/custom-python-transforms.ipynb)
* [Impute missing values](../../how-to-guides/impute-missing-values.ipynb)
* [Sample your data](../../how-to-guides/subsetting-sampling.ipynb)
* [Reference and link between Dataflows](../../how-to-guides/join.ipynb)

> End