# Variable Types

In [1]:
import pandas as pd

### One-Hot Encoding

Another way of encoding categorical variables is called One-Hot Encoding (OHE). With OHE, we create a new binary variable for each category in the original variable: a row gets a 1 (or `True`) in the column for its own category and a 0 (or `False`) in all the others. This technique is useful for nominal variables because it encodes the variable without imposing an order on the categories.
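
As a minimal sketch of the idea, the cell below one-hot encodes a tiny hand-made dataframe. The `toy` dataframe and its values are hypothetical and are not taken from `cereal.csv`; only `pd.get_dummies` is the same call used in this lesson.

```python
import pandas as pd

# Hypothetical toy data for illustration only (not the cereal dataset)
toy = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'mfr': ['K', 'G', 'K', 'N'],
})

# Replace `mfr` with one indicator column per category
encoded = pd.get_dummies(data=toy, columns=['mfr'])

# The original `mfr` column is gone; one new column exists per category
print(encoded.columns.tolist())

# Each row has exactly one indicator switched on
print(encoded[['mfr_G', 'mfr_K', 'mfr_N']].sum(axis=1).tolist())
```

Depending on the pandas version, the indicator columns may be shown as `True`/`False` or as `1`/`0`; in both cases each row has exactly one indicator set.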

In [None]:
# Import dataset as a Pandas Dataframe
cereal = pd.read_csv('cereal.csv', index_col=0)

# Show the first five rows of the `cereal` dataframe
print(cereal.head())

# Create a new dataframe with the `mfr` variable One-Hot Encoded
cereal = pd.get_dummies(data=cereal, columns=['mfr'])

# Show first five rows of new dataframe
print(cereal.head())

### Review

1. Return the first 10 rows of the `auto` dataframe.
2. Return the data types of the `auto` dataframe with the `.dtypes` attribute.
3. Change the `price` variable from int to float with the `.astype()` method, then recheck the data types with `.dtypes`.
4. Convert the `engine_size` variable to the category data type with an order of `['small', 'medium', 'large']`, and check the order with the `.unique()` method.
5. Create a new variable called `engine_codes` which contains the numerical codes associated with each category in the `engine_size` variable with the `.cat.codes` accessor. Check the new values with the `.head()` method.
6. One-Hot Encode the `body-style` variable in the `auto` dataframe. Then check the dataframe with `.head()`.

In [None]:
# Import pandas with alias
import pandas as pd

# Import dataset as a Pandas Dataframe
auto = pd.read_csv('autos.csv', index_col=0)

# Print the first 10 rows of the auto dataset
print(auto.head(10))

# Print the data types of the auto dataframe
print(auto.dtypes)

# Change the data type of price to float
auto['price'] = auto['price'].astype('float')

# Set the engine_size data type to category
auto['engine_size'] = pd.Categorical(auto['engine_size'], ['small', 'medium', 'large'], ordered=True)
print(auto['engine_size'].unique())
print(auto.dtypes)

# Create the engine_codes variable by encoding engine_size
auto['engine_codes'] = auto['engine_size'].cat.codes

# Check the new values
print(auto['engine_codes'].head())

# One-Hot Encode the body-style variable
auto = pd.get_dummies(data=auto, columns=['body-style'])
print(auto.head(10))
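
To see how `.cat.codes` relates to the category order, here is a small sketch on made-up values. The `sizes` data, including the `'huge'` entry, is hypothetical and is not taken from `autos.csv`.

```python
import pandas as pd

# Hypothetical values; 'huge' is deliberately not one of the listed categories
sizes = pd.Series(pd.Categorical(
    ['medium', 'small', 'large', 'huge'],
    categories=['small', 'medium', 'large'],
    ordered=True,
))

# Codes follow the position of each category in the `categories` list
print(dict(enumerate(sizes.cat.categories)))

# Values outside the listed categories become missing and get the code -1
codes = sizes.cat.codes
print(codes.tolist())
```

A code of `-1` is therefore worth checking for after encoding, since it can indicate a typo or an unexpected category in the original column.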