In [None]:
import woodwork as ww

data = ww.demo.load_retail(nrows=100, return_dataframe=True)
data.head(5)

As we can see, this is a dataframe containing several different data types, including dates, categorical values, numeric values and natural language descriptions. Let's use Woodwork to create a DataTable from this data.

## Creating a DataTable
Creating a Woodwork DataTable is as simple as passing in a dataframe with the data of interest during initialization. An optional name parameter can be specified to label the DataTable.

In [None]:
dt = ww.DataTable(data, name="retail")
dt.types

With this single call, Woodwork inferred the logical types present in our data by analyzing the dataframe dtypes as well as the information contained in the columns. Woodwork also added semantic tags to some of the columns based on the logical types that were inferred.
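To give a feel for what inference from dtypes and column contents can look like, here is a minimal pandas-only sketch. It is not Woodwork's implementation: the `guess_logical_type` helper, its `category_threshold` parameter, the toy data, and the simplified rules (for example, mapping every integer column to `WholeNumber`) are all assumptions made for illustration.

```python
import pandas as pd


def guess_logical_type(series: pd.Series, category_threshold: float = 0.2) -> str:
    """Hypothetical, simplified inference; NOT Woodwork's actual rules."""
    # First look at the dtype, which settles the easy cases.
    if pd.api.types.is_bool_dtype(series):
        return "Boolean"
    if pd.api.types.is_datetime64_any_dtype(series):
        return "Datetime"
    if pd.api.types.is_integer_dtype(series):
        return "WholeNumber"
    if pd.api.types.is_float_dtype(series):
        return "Double"
    # Otherwise look at the contents: few distinct values relative to the
    # number of rows suggests a category rather than free text.
    non_null = series.dropna()
    if len(non_null) > 0 and non_null.nunique() / len(non_null) <= category_threshold:
        return "Categorical"
    return "NaturalLanguage"


# Toy data invented for this sketch (not the retail demo dataset).
toy = pd.DataFrame({
    "quantity": [1, 2, 3, 4, 5, 6],
    "unit_price": [1.5, 2.0, 3.25, 4.0, 5.5, 6.75],
    "cancelled": [False, False, True, False, False, False],
    "country": ["UK", "UK", "UK", "UK", "UK", "UK"],
    "description": ["red mug", "blue mug", "tea towel", "glass jar", "wall clock", "door mat"],
})
inferred = {col: guess_logical_type(toy[col]) for col in toy.columns}
print(inferred)
```

The point of the sketch is the two-stage idea described above: the dtype decides where it can, and the column contents break the tie for everything else.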

## Updating Logical Types
If the initial inference was not to our liking, the logical type can be changed to a more appropriate value. Let's change some of the columns to a different logical type to illustrate this process. Below we will set the logical type for the ``quantity``, ``customer_name`` and ``country`` columns to be ``Categorical``.

In [None]:
dt.set_logical_types({
    'quantity': 'Categorical',
    'customer_name': 'Categorical',
    'country': 'Categorical'
})
dt.types

Inspecting the `types` output now shows that the logical type of all three columns has been updated to the `Categorical` logical type we specified.
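At the pandas level, the closest analogue to treating a column as categorical is the `category` dtype. The sketch below uses only pandas on invented toy data; whether Woodwork converts the underlying column in exactly this way is an assumption, not something this guide states.

```python
import pandas as pd

# Toy data invented for this sketch.
toy = pd.DataFrame({"quantity": [6, 6, 8], "country": ["UK", "UK", "France"]})

# Reinterpret both columns as categories instead of numbers / free text.
as_categories = toy.astype({"quantity": "category", "country": "category"})

# A categorical column tracks its distinct levels rather than raw magnitudes.
n_quantity_levels = len(as_categories["quantity"].cat.categories)
print(as_categories.dtypes)
```

This also illustrates why a numeric column such as ``quantity`` might reasonably be relabeled: once categorical, its values are treated as labels, not as quantities to be summed or averaged.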

## Selecting Columns

Now that we have logical types we are happy with, we can select a subset of the columns based on their logical types. Let's select only the columns that have a logical type of ``WholeNumber`` or ``Double``:

In [None]:
numeric_dt = dt.select_ltypes(['WholeNumber', 'Double'])
numeric_dt.types

This selection process has returned a new ``DataTable`` containing only the columns that match the logical types we specified. After we have selected the columns we want, we can also access a dataframe containing just those columns if we need it for additional analysis.

In [None]:
numeric_dt.to_pandas()

## Adding Semantic Tags

Next, let's add semantic tags to some of the columns. We will add the tag of ``product_details`` to the ``description`` column and tag the ``total`` column with ``currency``.

In [None]:
dt.set_semantic_tags({'description':'product_details', 'total': 'currency'})
dt.types

We can also select columns based on a semantic tag. Perhaps we want to only select the columns tagged with ``category``:

In [None]:
category_dt = dt.select_semantic_tags('category')
category_dt.types

We can also select columns using multiple semantic tags, or even a mixture of semantic tags and logical types:


In [None]:
category_numeric_dt = dt.select_semantic_tags(['numeric', 'category'])
category_numeric_dt.types

In [None]:
mixed_dt = dt.select(['Boolean', 'product_details'])
mixed_dt.types
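Conceptually, the tag-based selections above amount to keeping every column whose tag set overlaps the requested tags. The following is a minimal sketch of that idea in plain pandas, assuming a column matches when it carries at least one requested tag. The `select_by_tags` helper, the `tags` mapping, and the toy data are hypothetical and are not part of Woodwork.

```python
import pandas as pd

# Toy data invented for this sketch.
toy = pd.DataFrame({
    "country": ["UK", "France"],
    "quantity": [3, 5],
    "description": ["red mug", "tea towel"],
})

# Hypothetical tag assignments, similar in spirit to the tags used in this guide.
tags = {
    "country": {"category"},
    "quantity": {"numeric"},
    "description": {"product_details"},
}


def select_by_tags(df, column_tags, wanted):
    """Return the columns of df that carry at least one of the wanted tags."""
    wanted = {wanted} if isinstance(wanted, str) else set(wanted)
    keep = [col for col in df.columns if column_tags.get(col, set()) & wanted]
    return df[keep]


category_numeric = select_by_tags(toy, tags, ["numeric", "category"])
print(list(category_numeric.columns))
```

Accepting either a single string or a list mirrors the two call styles shown above, where one tag or several can be supplied.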

To select an individual column, we just need to specify the column name. We can then access the data in the DataColumn using the ``to_pandas`` method:

In [None]:
dc = dt['total']
dc

In [None]:
dc.to_pandas()

We can also access multiple columns by supplying a list of column names:

In [None]:
multiple_cols_dt = dt[['product_id', 'total', 'unit_price']]
multiple_cols_dt.types

## Removing Semantic Tags
We can also remove specific semantic tags from a column if they are no longer needed. Let's remove the ``product_details`` tag from the ``description`` column:

In [None]:
dt.remove_semantic_tags({'description':'product_details'})
dt.types

Notice how the ``product_details`` tag has now been removed from the ``description`` column. If we want to remove all user-added semantic tags from all columns, we can do that too:

In [None]:
dt.reset_semantic_tags()
dt.types

## Set Index and Time Index
At any point, we can designate certain columns as the DataTable's `index` and `time_index` with the methods [set_index](generated/woodwork.data_table.DataTable.set_index.rst) and [set_time_index](generated/woodwork.data_table.DataTable.set_time_index.rst). These methods can be used to assign these columns for the first time or to change the column being used as the index or time index.

Index and time index columns contain `index` and `time_index` semantic tags, respectively.

In [None]:
dt.set_index('order_product_id')
dt.index

In [None]:
dt.set_time_index('order_date')
dt.time_index

In [None]:
dt.types
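Before designating columns like these, it can help to sanity-check the candidates directly in pandas. The sketch below assumes that an index is meant to identify each row uniquely and that a time index should hold datetime values; both are assumptions for illustration rather than rules stated in this guide, and the toy data is invented.

```python
import pandas as pd

# Toy data invented for this sketch, reusing the column names from above.
toy = pd.DataFrame({
    "order_product_id": [0, 1, 2],
    "order_date": pd.to_datetime(["2010-12-01", "2010-12-01", "2010-12-02"]),
})

# A candidate index should not contain repeated values.
index_is_unique = toy["order_product_id"].is_unique

# A candidate time index should already be parsed as datetimes.
time_index_is_datetime = pd.api.types.is_datetime64_any_dtype(toy["order_date"])

print(index_is_unique, time_index_is_datetime)
```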

## List Logical Types
We can also list all the logical types available in Woodwork. The listing is useful for understanding each logical type and how it will be interpreted.

In [None]:
from woodwork.utils import list_logical_types

list_logical_types()