# Woodwork typing in Featuretools

Featuretools uses the `Woodwork` library to apply consistent typing to your data. The types assigned to each column determine which features can be generated for your dataset.

## Types of types

A physical type, or `dtype`, determines how a column's data is stored in memory.

Logical types tell Featuretools how to interpret a column, and therefore which transformations are available for it.

In [1]:
import featuretools as ft

ft.list_logical_types()

| | name | type_string | description | physical_type | standard_tags | is_default_type | is_registered | parent_type |
|---|---|---|---|---|---|---|---|---|
| 0 | Address | address | Represents Logical Types that contain address ... | string | {} | True | True | |
| 1 | Age | age | Represents Logical Types that contain whole nu... | int64 | {numeric} | True | True | Integer |
| 2 | AgeFractional | age_fractional | Represents Logical Types that contain non-nega... | float64 | {numeric} | True | True | Double |
| 3 | AgeNullable | age_nullable | Represents Logical Types that contain whole nu... | Int64 | {numeric} | True | True | IntegerNullable |
| 4 | Boolean | boolean | Represents Logical Types that contain binary v... | bool | {} | True | True | BooleanNullable |
| 5 | BooleanNullable | boolean_nullable | Represents Logical Types that contain binary v... | boolean | {} | True | True | |
| 6 | Categorical | categorical | Represents Logical Types that contain unordere... | category | {category} | True | True | |
| 7 | CountryCode | country_code | Represents Logical Types that use the ISO-3166... | category | {category} | True | True | Categorical |
| 8 | CurrencyCode | currency_code | Represents Logical Types that use the ISO-4217... | category | {category} | True | True | Categorical |
| 9 | Datetime | datetime | Represents Logical Types that contain date and... | datetime64[ns] | {} | True | True | |


Semantic tags add further information about the meaning or intended use of a column.

In [2]:
ft.list_semantic_tags()

| | name | is_standard_tag | valid_logical_types |
|---|---|---|---|
| 0 | numeric | True | [Age, AgeFractional, AgeNullable, Double, Inte... |
| 1 | category | True | [Categorical, CountryCode, CurrencyCode, Ordin... |
| 2 | index | False | Any LogicalType |
| 3 | time_index | False | [Datetime, Age, AgeFractional, AgeNullable, Do... |
| 4 | date_of_birth | False | [Datetime] |
| 5 | ignore | False | Any LogicalType |
| 6 | passthrough | False | Any LogicalType |


## Woodwork in EntitySets

Typing information is stored on each DataFrame within an EntitySet and can be accessed through that DataFrame's `ww` namespace.

In [4]:
es = ft.demo.load_retail()
es

Entityset: demo_retail_data
  DataFrames:
    order_products [Rows: 401604, Columns: 8]
    products [Rows: 3684, Columns: 4]
    orders [Rows: 22190, Columns: 6]
    customers [Rows: 4372, Columns: 3]
  Relationships:
    order_products.product_id -> products.product_id
    order_products.order_id -> orders.order_id
    orders.customer_name -> customers.customer_name

In [5]:
df = es["products"]
df.head()

| | product_id | description | first_order_products_time | _ft_last_time |
|---|---|---|---|---|
| 85123A | 85123A | WHITE HANGING HEART T-LIGHT HOLDER | 2010-12-01 08:26:00 | 2011-12-09 11:34:00 |
| 71053 | 71053 | WHITE METAL LANTERN | 2010-12-01 08:26:00 | 2011-12-07 14:12:00 |
| 84406B | 84406B | CREAM CUPID HEARTS COAT HANGER | 2010-12-01 08:26:00 | 2011-12-05 14:30:00 |
| 84029G | 84029G | KNITTED UNION FLAG HOT WATER BOTTLE | 2010-12-01 08:26:00 | 2011-12-09 11:26:00 |
| 84029E | 84029E | RED WOOLLY HOTTIE WHITE HEART. | 2010-12-01 08:26:00 | 2011-12-09 09:07:00 |


In [6]:
df.ww

| Column | Physical Type | Logical Type | Semantic Tag(s) |
|---|---|---|---|
| product_id | category | Categorical | ['index'] |
| description | string | NaturalLanguage | [] |
| first_order_products_time | datetime64[ns] | Datetime | ['time_index'] |
| _ft_last_time | datetime64[ns] | Datetime | ['last_time_index'] |


## Woodwork in DFS

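Each primitive accepts only certain input types, which are specified using the logical types and semantic tags described above. The typing information for a single column is available as its schema.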

In [7]:
products_df = es["products"]
product_ids_series = products_df.ww["product_id"]
column_schema = product_ids_series.ww.schema
column_schema

<ColumnSchema (Logical Type = Categorical) (Semantic Tags = ['index'])>