# Saving and Loading DataFrames

In this guide, you will learn how to save and load Woodwork DataFrames.

## Saving a Woodwork DataFrame

After defining a Woodwork DataFrame with the proper logical types and semantic tags, you can save the DataFrame and its typing information by using `DataFrame.ww.to_disk`. This method creates a directory that contains a `data` folder and a `woodwork_typing_info.json` file. To illustrate, we will use a retail DataFrame that already comes configured with Woodwork typing information.

In [None]:
from woodwork.demo import load_retail
df = load_retail(nrows=100)
df.ww.schema

In [None]:
df.head()

From the `ww` accessor, use `to_disk` to save the Woodwork DataFrame.

In [None]:
df.ww.to_disk('retail')

You should see a new directory that contains the data and typing information.

```
retail
├── data
│   └── demo_retail_data.csv
└── woodwork_typing_info.json
```

### Data Directory

The `data` directory contains the underlying data written in the specified format. The method derives the filename from `DataFrame.ww.name` and uses CSV as the default format. You can change the format by setting the method's `format` parameter to any of the following values:

- csv (default)
- pickle
- parquet

### Typing Information

The `woodwork_typing_info.json` file contains all of the typing information and metadata associated with the DataFrame. This information includes:

- the version of the schema at the time of saving the DataFrame
- the DataFrame name specified by `DataFrame.ww.name`
- the column names for the index and time index
- the column typing information, which contains the logical types with their parameters and semantic tags for each column
- the loading information required for the DataFrame type and file format
- the table metadata provided by `DataFrame.ww.metadata` (must be JSON serializable)

```text
{
    "schema_version": "10.0.2",
    "name": "demo_retail_data",
    "index": "order_product_id",
    "time_index": "order_date",
    "column_typing_info": [...],
    "loading_info": {
        "table_type": "pandas",
        "location": "data/demo_retail_data.csv",
        "type": "csv",
        "params": {
            "compression": null,
            "sep": ",",
            "encoding": "utf-8",
            "engine": "python",
            "index": false
        }
    },
    "table_metadata": {}
}
```
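
Because the table metadata must be JSON serializable, it can help to check it before saving. The following is a minimal sketch that uses only the standard library. The `find_unserializable` helper and the sample metadata are hypothetical and not part of Woodwork, and the check only covers top-level values.

```python
import json
from datetime import date


def find_unserializable(metadata):
    """Return the top-level metadata keys whose values json.dumps rejects."""
    bad_keys = []
    for key, value in metadata.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            bad_keys.append(key)
    return bad_keys


# Hypothetical metadata: the date object is not JSON serializable as-is.
metadata = {
    "source": "demo",
    "row_limit": 100,
    "exported_on": date(2024, 1, 1),
}

bad_keys = find_unserializable(metadata)
print(bad_keys)

# One possible fix: store the date as an ISO 8601 string instead.
metadata["exported_on"] = metadata["exported_on"].isoformat()
remaining = find_unserializable(metadata)
print(remaining)
```

Converting values such as dates to strings before assigning them to `DataFrame.ww.metadata` is one way to keep the metadata serializable.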

## Loading a Woodwork DataFrame

After saving a Woodwork DataFrame, you can load the DataFrame and typing information by using `woodwork.deserialize.read_woodwork_table`. This function will use the stored typing information in the specified directory to recreate the Woodwork DataFrame.

In [None]:
from woodwork.deserialize import read_woodwork_table
df = read_woodwork_table('retail')
df.ww.schema

### Loading the DataFrame and typing information separately

You can also load the Woodwork DataFrame and typing information separately by using `woodwork.read_file`. This approach is helpful if you want to save and load the typing information outside the specified directory, or if you want to read a data file directly into a Woodwork DataFrame. To illustrate, we will first load the typing information and then read data files in different formats directly into a Woodwork DataFrame.

In [None]:
from json import load

with open('retail/woodwork_typing_info.json') as file:
    typing_information = load(file)
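
Once loaded, the typing information is a plain dictionary, so the fields needed for loading can be read by key. The sketch below is self-contained: it writes a stand-in file with the same keys as the example file above into a temporary directory. The `summarize_typing_info` helper is hypothetical, and it assumes that `location` is relative to the saved directory, which matches the directory layout shown earlier.

```python
import json
import tempfile
from pathlib import Path

# Stand-in typing information with the same keys as the example file above.
sample_typing_info = {
    "name": "demo_retail_data",
    "index": "order_product_id",
    "time_index": "order_date",
    "loading_info": {
        "table_type": "pandas",
        "location": "data/demo_retail_data.csv",
        "type": "csv",
    },
}


def summarize_typing_info(directory):
    """Read woodwork_typing_info.json and return the fields a loader would need."""
    directory = Path(directory)
    with open(directory / "woodwork_typing_info.json") as file:
        info = json.load(file)
    loading_info = info["loading_info"]
    return {
        "index": info["index"],
        "time_index": info["time_index"],
        "format": loading_info["type"],
        # Assumption: "location" is relative to the saved directory.
        "data_path": directory / loading_info["location"],
    }


with tempfile.TemporaryDirectory() as tmp:
    saved_dir = Path(tmp) / "retail"
    saved_dir.mkdir()
    typing_file = saved_dir / "woodwork_typing_info.json"
    typing_file.write_text(json.dumps(sample_typing_info))
    summary = summarize_typing_info(saved_dir)
    relative_path = summary["data_path"].relative_to(saved_dir).as_posix()

print(summary["index"], summary["time_index"], summary["format"], relative_path)
```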

Let's create the data files in different formats from a pandas DataFrame.

In [None]:
import pandas as pd

pandas_df = pd.read_csv('retail/data/demo_retail_data.csv')
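# index=False keeps the pandas row index from being written as an extra unnamed column
pandas_df.to_csv('retail/data.csv', index=False)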
pandas_df.to_parquet('retail/data.parquet')
pandas_df.to_feather('retail/data.feather')

Now, you can use `read_file` to load the data directly into a Woodwork DataFrame. This function uses the `content_type` parameter to determine the file format. If `content_type` is not specified, the function will try to infer the file format from the file extension.

In [None]:
from woodwork import read_file

woodwork_df = read_file(
    filepath='retail/data.csv',
    content_type='csv',
    index=typing_information['index'],
    time_index=typing_information['time_index'],
)

woodwork_df = read_file(
    filepath='retail/data.parquet',
    content_type='parquet',
    index=typing_information['index'],
    time_index=typing_information['time_index'],
)

woodwork_df = read_file(
    filepath='retail/data.feather',
    content_type='feather',
    index=typing_information['index'],
    time_index=typing_information['time_index'],
)

woodwork_df.ww
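
To make the extension-based fallback concrete, here is a minimal sketch of how a format could be chosen when `content_type` is omitted. It is not Woodwork's implementation: the `guess_format` function and the `EXTENSION_TO_FORMAT` table are hypothetical and cover only the three formats used in this guide.

```python
from pathlib import Path

# Hypothetical lookup table; Woodwork's own mapping may differ.
EXTENSION_TO_FORMAT = {
    ".csv": "csv",
    ".parquet": "parquet",
    ".feather": "feather",
}


def guess_format(filepath, content_type=None):
    """Return content_type if given, otherwise guess from the file extension."""
    if content_type is not None:
        return content_type
    suffix = Path(filepath).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"cannot infer a format from extension {suffix!r}") from None


# The extension decides the format when content_type is omitted.
guesses = [guess_format(f"retail/data{ext}") for ext in (".csv", ".parquet", ".feather")]

# An explicit content_type takes precedence over the extension.
explicit = guess_format("retail/export.bin", content_type="parquet")
print(guesses, explicit)
```

Passing `content_type` explicitly, as in the calls above, avoids relying on the file extension at all.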

The parameters related to typing information, such as the index, time index, logical types, and semantic tags, are optional. You can therefore read data files into Woodwork DataFrames and let Woodwork infer the typing information automatically.

In [None]:
# Clean up: remove the retail directory created by this guide
from shutil import rmtree
rmtree('retail')