# Opening and Saving Packages
Copyright (c) Microsoft Corporation. All rights reserved.<br>
Licensed under the MIT License.

Once you have built a Dataflow, you can save it in a Package, which is written to a `.dprep` file. Saving persists everything in the Dataflow, including the steps you've added, the examples and programs from by-example steps, and computed aggregations.

You can also open `.dprep` files to access any Dataflows in those packages.

A Data Prep Package (and `.dprep` file) can contain multiple Dataflows, and each Dataflow in a Package must have a unique name.
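
The naming rule can be pictured with a small plain-Python model. This is only a sketch of the idea, not the library's implementation: `ToyPackage` and its `names()` method are hypothetical, and strings stand in for Dataflow objects.

```python
class ToyPackage:
    """Hypothetical stand-in for a Package: a collection of uniquely named items."""

    def __init__(self, named_items):
        # named_items is an iterable of (name, item) pairs.
        self._items = {}
        for name, item in named_items:
            if name in self._items:
                # Two items may not share a name within one package.
                raise ValueError(f"duplicate Dataflow name: {name!r}")
            self._items[name] = item

    def __getitem__(self, name):
        # Look an item up by its name.
        return self._items[name]

    def names(self):
        # Names in insertion order.
        return list(self._items)


# Strings stand in for real Dataflow objects.
pkg = ToyPackage([("crime0-10", "dataflow A"), ("New-Crime", "dataflow B")])
print(pkg["crime0-10"])  # dataflow A
print(pkg.names())       # ['crime0-10', 'New-Crime']
```

Because names are unique, a name is enough to pick out exactly one Dataflow from a Package.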

## Open

Use the `open()` method of the Package class to load existing `.dprep` files. You can then index into the Package by name to access a particular Dataflow.

In [1]:
import os
pkg_path = os.path.join(os.getcwd(), '..', 'data', 'crime.dprep')
print(pkg_path)

/mnt/vsts/4/s/target/Python/debug/azureml-dataprep/docs/how-to-guides/../data/crime.dprep


In [2]:
from azureml.dataprep.api.package import Package

In [3]:
pkg = Package.open(pkg_path)
dflow = pkg['crime0-10']
head = dflow.head(5)
head

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10498554.0,HZ239907,"azureml.dataprep.native.DataPrepError(""'Micros...",007XX E 111TH ST,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,OTHER,False,False,...,9.0,50.0,11,1183356.0,1831503.0,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",41.692834,-87.604319,"(41.692833841, -87.60431945)"
1,10516598.0,HZ258664,"azureml.dataprep.native.DataPrepError(""'Micros...",082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",41.744107,-87.664494,"(41.744106973, -87.664494285)"
2,10519196.0,HZ261252,"azureml.dataprep.native.DataPrepError(""'Micros...",104XX S SACRAMENTO AVE,1154.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT $300 AND UNDER,RESIDENCE,False,False,...,19.0,74.0,11,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,
3,10519591.0,HZ261534,"azureml.dataprep.native.DataPrepError(""'Micros...",113XX S PRAIRIE AVE,1120.0,DECEPTIVE PRACTICE,FORGERY,RESIDENCE,False,False,...,9.0,49.0,10,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,
4,10534446.0,HZ277630,"azureml.dataprep.native.DataPrepError(""'Micros...",055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,


## Edit

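After a Dataflow is loaded, you can edit it further. This example adds a filter that keeps only rows whose `Description` is not `SIMPLE`. None of the first five rows has that value, so the preview below matches the earlier one.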

In [4]:
from azureml.dataprep.api.expressions import col

In [5]:
dflow = dflow.filter(col('Description') != 'SIMPLE')
head = dflow.head(5)
head

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10498554.0,HZ239907,"azureml.dataprep.native.DataPrepError(""'Micros...",007XX E 111TH ST,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,OTHER,False,False,...,9.0,50.0,11,1183356.0,1831503.0,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",41.692834,-87.604319,"(41.692833841, -87.60431945)"
1,10516598.0,HZ258664,"azureml.dataprep.native.DataPrepError(""'Micros...",082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",41.744107,-87.664494,"(41.744106973, -87.664494285)"
2,10519196.0,HZ261252,"azureml.dataprep.native.DataPrepError(""'Micros...",104XX S SACRAMENTO AVE,1154.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT $300 AND UNDER,RESIDENCE,False,False,...,19.0,74.0,11,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,
3,10519591.0,HZ261534,"azureml.dataprep.native.DataPrepError(""'Micros...",113XX S PRAIRIE AVE,1120.0,DECEPTIVE PRACTICE,FORGERY,RESIDENCE,False,False,...,9.0,49.0,10,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,
4,10534446.0,HZ277630,"azureml.dataprep.native.DataPrepError(""'Micros...",055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,"azureml.dataprep.native.DataPrepError(""'Micros...",,,
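
To make the effect of the filter concrete, here is a plain-Python analogy. It is a sketch of the predicate only and does not reflect how Data Prep evaluates expressions. The five `Description` values are copied from the output above; the sixth row with `SIMPLE` is hypothetical, added so the filter has something to remove.

```python
# Description values of the first five rows in the output above.
descriptions = [
    "FINANCIAL IDENTITY THEFT OVER $ 300",
    "FROM BUILDING",
    "FINANCIAL IDENTITY THEFT $300 AND UNDER",
    "FORGERY",
    "FROM BUILDING",
]

# Hypothetical extra row, included only to show a row being dropped.
rows = [{"Description": d} for d in descriptions] + [{"Description": "SIMPLE"}]


def keep(row):
    # Same condition as col('Description') != 'SIMPLE'.
    return row["Description"] != "SIMPLE"


filtered = [row for row in rows if keep(row)]
print(len(rows), len(filtered))  # 6 5
```

The five rows taken from the output all pass the condition, which is why `head(5)` looks the same before and after the filter is added.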


## Save

Use the `save()` method of the Package class to write out the `.dprep` file. To create a Package, pass a list of Dataflow objects to the `Package` constructor.

In [6]:
import tempfile
import uuid
# Build a unique path in the system temp directory using public APIs only.
temp_dir = tempfile.gettempdir()
temp_file_name = uuid.uuid4().hex
temp_pkg_path = os.path.join(temp_dir, temp_file_name + '.dprep')

In [7]:
dflow = dflow.set_name('New-Crime')
pkg_to_save = Package([dflow])
pkg_to_save = pkg_to_save.save(temp_pkg_path)

## Round-trip

The saved Package can be opened again and the edited Dataflow used as before, in this case to produce a pandas DataFrame.

In [8]:
pkg_to_open = Package.open(temp_pkg_path)
dflow_to_open = pkg_to_open['New-Crime']
df = dflow_to_open.to_pandas_dataframe()
df

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
0,10498554.0,HZ239907,,007XX E 111TH ST,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,OTHER,False,False,...,9.0,50.0,11,1183356.0,1831503.0,2016.0,,41.692834,-87.604319,"(41.692833841, -87.60431945)"
1,10516598.0,HZ258664,,082XX S MARSHFIELD AVE,890.0,THEFT,FROM BUILDING,RESIDENCE,False,False,...,21.0,71.0,6,1166776.0,1850053.0,2016.0,,41.744107,-87.664494,"(41.744106973, -87.664494285)"
2,10519196.0,HZ261252,,104XX S SACRAMENTO AVE,1154.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT $300 AND UNDER,RESIDENCE,False,False,...,19.0,74.0,11,,,2016.0,,,,
3,10519591.0,HZ261534,,113XX S PRAIRIE AVE,1120.0,DECEPTIVE PRACTICE,FORGERY,RESIDENCE,False,False,...,9.0,49.0,10,,,2016.0,,,,
4,10534446.0,HZ277630,,055XX N KEDZIE AVE,890.0,THEFT,FROM BUILDING,"SCHOOL, PUBLIC, BUILDING",False,False,...,40.0,13.0,6,,,2016.0,,,,
5,10535059.0,HZ278872,,004XX S KILBOURN AVE,810.0,THEFT,OVER $500,RESIDENCE,False,False,...,24.0,26.0,6,,,2016.0,,,,
6,10499802.0,HZ240778,,010XX N MILWAUKEE AVE,1152.0,DECEPTIVE PRACTICE,ILLEGAL USE CASH CARD,RESIDENCE,False,False,...,27.0,24.0,11,,,2016.0,,,,
7,10522293.0,HZ264802,,019XX W DIVISION ST,1110.0,DECEPTIVE PRACTICE,BOGUS CHECK,RESTAURANT,False,False,...,1.0,24.0,11,1163094.0,1908003.0,2016.0,,41.903206,-87.676362,"(41.903206037, -87.676361925)"
8,10523111.0,HZ265911,,061XX N SHERIDAN RD,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,RESIDENCE,False,False,...,48.0,77.0,11,,,2016.0,,,,
9,10525877.0,HZ268138,,023XX W EASTWOOD AVE,1153.0,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,,False,False,...,47.0,4.0,11,,,2016.0,,,,


In [9]:
if os.path.isfile(temp_pkg_path):
    os.remove(temp_pkg_path)
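
As an alternative to removing the file by hand, the standard library's `tempfile.TemporaryDirectory` deletes the directory and everything in it when its `with` block exits. The sketch below uses only the standard library: the file name `New-Crime.dprep` is a hypothetical choice, and a placeholder write stands in for the save and reopen steps.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as temp_dir:
    # Hypothetical file name inside the temporary directory.
    temp_pkg_path = os.path.join(temp_dir, "New-Crime.dprep")

    # Placeholder write standing in for saving and reopening a package.
    with open(temp_pkg_path, "wb") as f:
        f.write(b"placeholder")

    existed_inside = os.path.isfile(temp_pkg_path)

# After the with block, the directory and its contents have been deleted.
exists_after = os.path.exists(temp_pkg_path)
print(existed_inside, exists_after)  # True False
```

With this pattern the cleanup also runs when an exception is raised inside the block, so no explicit `os.remove` call is needed.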