### Blog link: https://www.dataindependent.com/pandas/pandas-get-dummies/

##### When you're doing machine learning, you'll often work with algorithms that cannot process categorical variables directly. In that case, you need to turn your column of labels (e.g. ['cat', 'dog', 'bird', 'cat']) into separate columns of 0s and 1s. This is called creating dummy columns (also known as one-hot encoding).
##### Pandas get_dummies turns each categorical variable into several dummy indicator variables. A Series of labels such as ['Bob', 'Fred', 'Katie'] becomes one indicator column per distinct label, so the 'Fred' column for that Series would hold [0, 1, 0].

#### Pandas pd.get_dummies() will turn your categorical column (column of labels) into indicator columns (columns of 0s and 1s).

## Be careful: if your categorical column has many distinct values, the number of new dummy columns will explode. Before you run pd.get_dummies(), run pd.Series.nunique() on the column to see how many new columns you'll create.
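
#### A minimal sketch of that check, using the example labels from the introduction. The MAX_DUMMY_COLUMNS threshold is a hypothetical value chosen for illustration, not a pandas setting.

```python
import pandas as pd

# Example labels taken from the introduction above
labels = pd.Series(['cat', 'dog', 'bird', 'cat'])

# Number of distinct labels = number of dummy columns get_dummies would create
n_distinct = labels.nunique()
print(n_distinct)

# Hypothetical threshold (an assumption); pick whatever column count you can handle
MAX_DUMMY_COLUMNS = 50

if n_distinct <= MAX_DUMMY_COLUMNS:
    dummies = pd.get_dummies(labels)
    print(dummies.shape)  # (rows, distinct labels)
else:
    print("Too many distinct values; consider grouping rare labels first")
```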

In [1]:
import pandas as pd

In [2]:
# create a DataFrame
df = pd.DataFrame([('Foreign Cinema', 289.0),
                   ('Liho Liho', 224.0),
                   ('500 Club', 80.5),
                   ('Foreign Cinema', 25.30)],
                  columns=('name', 'Amount'))

In [3]:
df

Unnamed: 0,name,Amount
0,Foreign Cinema,289.0
1,Liho Liho,224.0
2,500 Club,80.5
3,Foreign Cinema,25.3


# 1. Creating Dummy Indicator columns

#### To create dummy columns, I need to tell pandas which DataFrame I want to use, and which columns I want to create dummies on. Here I want to create dummies on the 'name' column.

In [5]:
pd.get_dummies(df, columns=['name'])
# Each new column holds 1s and 0s showing whether that row's 'name' matched the column's value.

Unnamed: 0,Amount,name_500 Club,name_Foreign Cinema,name_Liho Liho
0,289.0,0,1,0
1,224.0,0,0,1
2,80.5,1,0,0
3,25.3,0,1,0
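
#### A note on versions: from pandas 2.0 onward, get_dummies returns boolean columns (True/False) by default rather than 0s and 1s. The sketch below assumes you want the integer output shown above and passes dtype=int to get it. The DataFrame is rebuilt here so the cell runs on its own.

```python
import pandas as pd

# Same data as the DataFrame created earlier
df = pd.DataFrame([('Foreign Cinema', 289.0),
                   ('Liho Liho', 224.0),
                   ('500 Club', 80.5),
                   ('Foreign Cinema', 25.30)],
                  columns=('name', 'Amount'))

# dtype=int forces 0/1 indicator columns regardless of the pandas default
dummies = pd.get_dummies(df, columns=['name'], dtype=int)
print(dummies)
```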


# 2. Creating Dummy Indicator columns with prefix

#### See how all of my new columns above start with "name_"? Well, I don't like it. I want to switch the prefix to something else. You can do this by specifying the "prefix" parameter.

In [6]:
pd.get_dummies(df, columns=['name'], prefix="dummy")

Unnamed: 0,Amount,dummy_500 Club,dummy_Foreign Cinema,dummy_Liho Liho
0,289.0,0,1,0
1,224.0,0,0,1
2,80.5,1,0,0
3,25.3,0,1,0
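
#### When you create dummies on more than one column, "prefix" can also be a dict that maps each column name to its own prefix. The sketch below assumes a second categorical column, 'city', which is hypothetical and not part of the original data. The prefixes "rest" and "loc" are arbitrary choices as well.

```python
import pandas as pd

# 'city' is a made-up column added only to illustrate per-column prefixes
df_multi = pd.DataFrame({
    'name': ['Foreign Cinema', 'Liho Liho', '500 Club', 'Foreign Cinema'],
    'city': ['SF', 'SF', 'Oakland', 'SF'],
    'Amount': [289.0, 224.0, 80.5, 25.30],
})

# Map each encoded column to the prefix its dummy columns should carry
dummies_multi = pd.get_dummies(
    df_multi,
    columns=['name', 'city'],
    prefix={'name': 'rest', 'city': 'loc'},
    dtype=int,
)
print(list(dummies_multi.columns))
```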


#### By specifying prefix_sep="*", we can change the separator between the prefix and the value, so "dummy_500 Club" becomes "dummy*500 Club".

In [8]:
pd.get_dummies(df, columns=['name'], prefix="dummy", prefix_sep="*")

Unnamed: 0,Amount,dummy*500 Club,dummy*Foreign Cinema,dummy*Liho Liho
0,289.0,0,1,0
1,224.0,0,0,1
2,80.5,1,0,0
3,25.3,0,1,0
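
#### One possible reason to pick a distinctive separator is that it lets you split the dummy column names back into prefix and label without ambiguity. The sketch below assumes the "*" character never appears in the labels themselves. The DataFrame is rebuilt so the cell runs on its own.

```python
import pandas as pd

# Same data as the DataFrame created earlier
df = pd.DataFrame([('Foreign Cinema', 289.0),
                   ('Liho Liho', 224.0),
                   ('500 Club', 80.5),
                   ('Foreign Cinema', 25.30)],
                  columns=('name', 'Amount'))

sep = '*'
dummies = pd.get_dummies(df, columns=['name'], prefix='dummy',
                         prefix_sep=sep, dtype=int)

# Keep only the generated dummy columns, then strip "dummy*" from each name
dummy_cols = [c for c in dummies.columns if c.startswith('dummy' + sep)]
original_labels = [c.split(sep, 1)[1] for c in dummy_cols]
print(original_labels)
```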
