# The Gemika's Magical Guide to Sorting Hogwarts Students using the Decision Tree Algorithm (Part #7)

![machine-learning-03.jpg](images/machine-learning-30.jpg)

---

## **7. Casting One-Hot Encoding Spells 🏰✨**

The air is thick with the scent of parchment and the crackle of magical energy, as we delve into the depths of data wizardry. Today, we shall explore a spellbinding technique known as **One-Hot Encoding**. Imagine transforming a mischievous **Pixie**, with its vibrant personality and unpredictable nature, into a series of precise coordinates on a magical map. That, dear reader, is the essence of **One-Hot Encoding**. 🧙‍♂️✨ 

Just as Professor McGonagall's wand can turn a teacup into a `Dachshund`, this spell can turn `categorical` data – those pesky labels that refuse to conform to `numerical` calculations – into a format our models can understand. Think of it as turning a chaotic swarm of `Hufflepuffs`, `Ravenclaws`, `Gryffindors`, and `Slytherins` into a neat grid of ones and zeros, each representing a specific house. 🏰✨ 

With `One-Hot Encoding`, we create `new columns` for each `unique category`, filling them with `ones` and `zeros` to indicate `presence` or `absence`. It’s like sorting mischievous house-elves into their designated rooms, ensuring each has its own space. By the end of this chapter, you'll be able to cast this spell with the confidence of a seasoned `Charms` master, transforming your data from a tangled knot into a beautifully organized tapestry. 🪄✨
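To make the idea concrete before we touch the real dataset, here is a minimal sketch on a tiny, made-up roster (the four rows and the names `students` and `house_grid` are assumptions for illustration only). It uses pandas' `get_dummies`, one of several ways to cast this spell.

```python
import pandas as pd

# A tiny, made-up roster (illustrative only, not the real dataset)
students = pd.DataFrame({
    "name": ["Harry", "Draco", "Luna", "Cedric"],
    "house": ["Gryffindor", "Slytherin", "Ravenclaw", "Hufflepuff"],
})

# One new column per unique house; dtype=int gives 1/0 instead of True/False
house_grid = pd.get_dummies(students["house"], dtype=int)

print(house_grid)
```

Each row of `house_grid` contains a single `1`, sitting in the column of the house that student belongs to.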

---

### **7.1 Categorical Data: Sorting Hats and Magical Labels**

In the enchanting realm of data science, where numbers dance and patterns reveal themselves, we encounter a curious breed of information known as *categorical data*. Unlike their numerical counterparts, these data points don't represent quantities but rather distinct categories or groups. 

Imagine the Sorting Hat at Hogwarts, that wise old magical object that places students into their rightful houses. The houses – `Gryffindor`, `Hufflepuff`, `Ravenclaw`, and `Slytherin` – are examples of categorical data. They represent distinct groups with unique characteristics, just like the houses themselves. Similarly, the type of pet a student chooses – a loyal `owl`, a purring `cat`, or a grumpy `toad` – also falls into the category of categorical data. 

Categorical data is like placing **magical labels** on objects, helping us **differentiate** and **classify them**. Just as a Herbology student would meticulously categorize different plants based on their properties, we use categorical data to sort and understand the diverse elements within our datasets. By understanding these magical labels, we can unlock hidden patterns and cast powerful spells (analyses) to uncover the secrets of our data. Let's seek some further knowledge and see what values lie beneath the `house` column. Cast your wands, dear sorcerers. 🪄✨

In [201]:
# Importing pandas, which the rest of this chapter relies on
import pandas as pd

# Path to our dataset
dataset_path = 'data/hogwarts-students-02.csv'

# Reading the dataset
hogwarts_df = pd.read_csv(dataset_path)

# Displaying the unique categories in the 'house' column
unique_houses = hogwarts_df['house'].unique()
print(f"Unique Houses: {unique_houses}")

Unique Houses: ['Gryffindor' 'Slytherin' 'Ravenclaw' 'Hufflepuff' 'Durmstrang'
 'Beauxbatons']


Understanding these categories is crucial because our `magical algorithms` (or models) need to know how to interpret this data. However, these algorithms often struggle with `non-numerical` data, as they are more comfortable with numbers. This is where the magic of `One-Hot Encoding` comes into play.
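One way to find out which columns will need encoding is to ask pandas which ones are not numeric. The sketch below assumes a small, made-up sample that mimics a few columns of the student records; `sample`, `text_columns`, and `numeric_columns` are illustrative names.

```python
import pandas as pd

# Made-up sample rows that mimic a few columns of the student records
sample = pd.DataFrame({
    "name": ["Harry Potter", "Luna Lovegood"],
    "age": [11, 11],
    "house": ["Gryffindor", "Ravenclaw"],
    "house_points": [150.0, 120.0],
})

# Non-numeric columns hold labels and are candidates for encoding
text_columns = sample.select_dtypes(exclude="number").columns.tolist()

# Numeric columns can already be read by a model as they are
numeric_columns = sample.select_dtypes(include="number").columns.tolist()

print("Needs encoding:", text_columns)
print("Already numeric:", numeric_columns)
```

Not every text column deserves encoding (a unique `name` per student carries no pattern), so treat the first list as a starting point rather than a verdict.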

---

### **7.2 Transforming Categorical Data using One-Hot Encoding**

Imagine our Hogwarts student records, filled with enchanting details like house, wand type, and favorite subject. These qualities are like **magical sigils**, carrying unique energies. However, our brilliant data models, while capable of wondrous feats, cannot decipher these sigils directly. We must transform them into a language they understand – numbers.

Enter the spell of **One-Hot Encoding**, a powerful incantation that reveals the hidden essence of each categorical variable. It's like casting a **Lumos spell** on a **hidden chamber**, illuminating every nook and cranny. With a flick of our coding wand, we transform each category into its own standalone column. If a student belongs to Gryffindor, for instance, a `1` will magically appear in the `Gryffindor` column, while the other house columns remain dark.

This transformation is akin to creating a magical tapestry, where each thread represents a category. By weaving these threads together, we create a rich and detailed portrait of our students, ready to be analyzed by our data models. It's as if we're granting our models the ability to see the world through the eyes of a **Polyjuice Potion** drinker, experiencing each student's unique perspective. Let's go ahead and One-Hot Encode our first feature, starting with the `gender` column.

In [202]:
# Importing necessary libraries
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from IPython.display import display, HTML

# Applying One-Hot Encoding to the 'gender' column
encoder = OneHotEncoder(sparse_output=False)  # sparse_output replaces the older `sparse` argument (scikit-learn 1.2+); False returns a dense array
encoded_data = encoder.fit_transform(hogwarts_df[['gender']])

# Converting the encoded data into a DataFrame and attaching it to the original dataset
encoded_df = pd.DataFrame(encoded_data, columns=encoder.get_feature_names_out(['gender']))
hogwarts_df = pd.concat([hogwarts_df, encoded_df], axis=1)

# Dropping the original 'gender' column as it's now encoded
hogwarts_df.drop('gender', axis=1, inplace=True)

# Displaying the transformed DataFrame in a scrollable pane
html = hogwarts_df.head(5).to_html() # Convert DataFrame to HTML
scrollable_html = f"""
<div style="height: 300px; overflow: auto;">
    {html}
</div>
"""
display(HTML(scrollable_html))

| Unnamed: 0 | name | age | origin | specialty | house | blood_status | pet | wand_type | patronus | quidditch_position | boggart | favorite_class | house_points | gender_Female | gender_Male |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Harry Potter | 11 | England | Defense Against the Dark Arts | Gryffindor | Half-blood | Owl | Holly | Stag | Seeker | Dementor | Defense Against the Dark Arts | 150.0 | 0.0 | 1.0 |
| 1 | Hermione Granger | 11 | England | Transfiguration | Gryffindor | Muggle-born | Cat | Vine | Otter | Seeker | Failure | Arithmancy | 200.0 | 1.0 | 0.0 |
| 2 | Ron Weasley | 11 | England | Chess | Gryffindor | Pure-blood | Rat | Ash | Jack Russell Terrier | Keeper | Spider | Charms | 50.0 | 0.0 | 1.0 |
| 3 | Draco Malfoy | 11 | England | Potions | Slytherin | Pure-blood | Owl | Hawthorn | Non-corporeal | Seeker | Lord Voldemort | Potions | 100.0 | 0.0 | 1.0 |
| 4 | Luna Lovegood | 11 | Ireland | Creatures | Ravenclaw | Half-blood | Owl | Fir | Hare | Seeker | Her mother | Creatures | 120.0 | 1.0 | 0.0 |


If you scroll to the right, you will notice that the dataset now has two additional columns, `gender_Female` and `gender_Male`, which take the place of the original `gender` column.

---

### **7.3 The Two Great Treasures of the Data Realm 🏰✨** 

In the grand tapestry of the wizarding world of data, the columns of a table fall into two primary families: **Categorical Data** and **Numerical Data**. These are the building blocks of our enchanting spells and powerful potions.

**Categorical Data** is akin to a well-organized Herbology garden, where every plant (data point) carries a label that says which bed it belongs to. It's like the house, pet, and wand type columns of a Hogwarts student record: each value names a group rather than measuring an amount. 🌱✨

On the other hand, **Numerical Data** is the ledger of the house-points hourglasses, where every entry is a quantity that can be counted or measured. Columns such as `age` and `house_points` belong here, and because their values are true numbers we can add them, average them, and compare them with ordinary arithmetic. 🦉🌌

1. **Categorical Data (Qualitative Data)**:
   - **Definition**: Categorical data refers to information that can be sorted into distinct groups or categories based on qualitative characteristics, rather than numerical values.
   - **Types**:
     - **Nominal Data**: This type includes categories without any specific order (e.g., gender, hair color). It is often used for labeling variables without providing a numerical value.
     - **Ordinal Data**: This type has a defined order or ranking (e.g., customer satisfaction ratings, economic class ratings, movie ratings). The differences between the ranks may not be equal.
   - **Examples**: Gender, race, color, and types of products.
   - **Analysis**: Categorical data is typically analyzed using frequency counts, bar graphs, and pie charts. It does not support arithmetic operations like addition or averaging.

2. **Numerical Data (Quantitative Data)**:
   - **Definition**: Numerical data consists of values that can be measured and expressed numerically, allowing for mathematical operations.
   - **Types**:
     - **Discrete Data**: Countable values (e.g., number of students in a class).
     - **Continuous Data**: Measurable quantities that can take any value within a range (e.g., height, weight).
   - **Analysis**: Numerical data can be analyzed using various statistical methods, including mean, median, mode, and standard deviation.
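The distinctions above can be sketched in pandas. The values are made up for illustration, and the grade names and their ordering are assumptions rather than anything taken from the dataset.

```python
import pandas as pd

# Nominal: labels with no inherent order
pets = pd.Series(["Owl", "Cat", "Owl", "Toad"])
pet_counts = pets.value_counts()  # frequency counts are the natural summary

# Ordinal: labels with a ranking (hypothetical grades, listed worst to best)
grades = pd.Series(pd.Categorical(
    ["Acceptable", "Outstanding", "Poor"],
    categories=["Poor", "Acceptable", "Outstanding"],
    ordered=True,
))
best_grade = grades.max()  # the declared order makes min/max meaningful

# Numerical: arithmetic such as the mean is meaningful
house_points = pd.Series([150.0, 200.0, 50.0, 100.0, 120.0])
average_points = house_points.mean()

print(pet_counts)
print(best_grade, average_points)
```

Asking for the mean of `pets` would make no sense, which is exactly why categorical columns need a different treatment before a model can use them.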

---

### **7.4 Transforming Text into Numbers: The Magic of One-Hot Encoding (Categorical Data) ✨**

Imagine a world where numbers and words could converse, where the language of magic flowed seamlessly from one to the other. This is the realm of **one-hot encoding**, a powerful spell that transforms the mysterious world of text into the concrete world of numbers. 

Just as a skilled `Herbologist` categorizes plants by their properties, `one-hot encoding` sorts `textual data` into `distinct categories`. Consider the `Sorting Hat`, which assigns students to houses based on their unique qualities. Similarly, one-hot encoding creates separate columns for each category, with values of `0` or `1` indicating whether a data point belongs to that category or not.

For instance, if you have a column representing the houses of Hogwarts students (`Gryffindor`, `Slytherin`, `Ravenclaw`, and `Hufflepuff`), one-hot encoding would conjure four new columns, one for each house. A value of `1` in the Gryffindor column would signify a student belonging to that house, while other columns would be filled with `0`s. This numerical representation allows our `magical models` to understand and process textual information with ease. With that being said, let's transform the remaining categorical columns in our dataset and see what we have to work with. 🪄
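Since every unique value becomes its own column, it can help to estimate how wide the table will grow before encoding. This is a rough sketch on made-up rows; `sample`, `unique_counts`, and `expected_new_columns` are illustrative names.

```python
import pandas as pd

# Made-up rows mimicking two categorical columns
sample = pd.DataFrame({
    "pet": ["Owl", "Cat", "Rat", "Owl", "Owl"],
    "blood_status": ["Half-blood", "Muggle-born", "Pure-blood", "Pure-blood", "Half-blood"],
})

# Number of distinct labels per column
unique_counts = sample.nunique()

# One-hot encoding adds one column per distinct label
expected_new_columns = int(unique_counts.sum())

# Encoding confirms the estimate
encoded = pd.get_dummies(sample, dtype=int)

print(unique_counts)
print(expected_new_columns, encoded.shape)
```

Columns with many distinct labels, such as wand woods or boggarts, are the ones that widen the table the most.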

In [203]:
# Importing necessary libraries
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from IPython.display import display, HTML

# Assuming hogwarts_df is already defined and contains the necessary columns
columns_to_encode = [
    'origin', 'specialty', 'blood_status', 'pet', 'wand_type', 'patronus', 'quidditch_position', 'boggart', 'favorite_class'
]

# Creating an instance of OneHotEncoder (dense output fits straight into a DataFrame)
encoder = OneHotEncoder(sparse_output=False)

# Encoding all listed columns in a single pass; the encoder treats each column independently
encoded_data = encoder.fit_transform(hogwarts_df[columns_to_encode])

# Wrapping the result in a DataFrame that shares the original index, so rows stay aligned
encoded_df_combined = pd.DataFrame(
    encoded_data,
    columns=encoder.get_feature_names_out(columns_to_encode),
    index=hogwarts_df.index,
)

# Concatenating the encoded DataFrame with the original DataFrame
hogwarts_df = pd.concat([hogwarts_df, encoded_df_combined], axis=1)

# Dropping the original columns that were encoded
hogwarts_df.drop(columns=columns_to_encode, inplace=True)

# Displaying the transformed DataFrame in a scrollable pane
html = hogwarts_df.head(5).to_html()  # Convert DataFrame to HTML
scrollable_html = f"""
<div style="height: 300px; overflow: auto;">
    {html}
</div>
"""
display(HTML(scrollable_html))

Unnamed: 0,name,age,house,house_points,gender_Female,gender_Male,origin_Bulgaria,origin_England,origin_Europe,origin_France,origin_Indonesia,origin_Ireland,origin_Scotland,origin_USA,origin_Wales,specialty_Auror,specialty_Baking,specialty_Charms,specialty_Chess,specialty_Creatures,specialty_Dark Arts,specialty_Defense Against the Dark Arts,specialty_Dueling,specialty_Goat Charming,specialty_Gossip,specialty_Herbology,specialty_History of Magic,specialty_Household Charms,specialty_Legilimency,specialty_Magical Creatures,specialty_Memory Charms,specialty_Metamorphmagus,specialty_Muggle Artifacts,specialty_Obscurus,specialty_Potions,specialty_Quidditch,specialty_Strength,specialty_Transfiguration,specialty_Transformation,blood_status_Half-blood,blood_status_Muggle-born,blood_status_No-mag,blood_status_Pure-blood,pet_Cat,pet_Demiguise,pet_Dog,pet_Goat,pet_Owl,pet_Phoenix,pet_Rat,pet_Snake,pet_Toad,wand_type_Alder,wand_type_Ash,wand_type_Birch,wand_type_Blackthorn,wand_type_Cedar,wand_type_Cherry,wand_type_Chestnut,wand_type_Cypress,wand_type_Ebony,wand_type_Elder,wand_type_Elm,wand_type_Fir,wand_type_Hawthorn,wand_type_Hazel,wand_type_Hemlock,wand_type_Holly,wand_type_Hornbeam,wand_type_Maple,wand_type_Oak,wand_type_Pine,wand_type_Rosewood,wand_type_Rowan,wand_type_Sword,wand_type_Teak,wand_type_Vine,wand_type_Walnut,wand_type_Willow,wand_type_Yew,patronus_Cat,patronus_Doe,patronus_Dog,patronus_Eagle,patronus_Hare,patronus_Horse,patronus_Jack Russell Terrier,patronus_Lion,patronus_Non-corporeal,patronus_Otter,patronus_Phoenix,patronus_Serpent,patronus_Stag,patronus_Swan,patronus_Wolf,quidditch_position_Azkaban,quidditch_position_Beater,quidditch_position_Chaser,quidditch_position_Keeper,quidditch_position_Seeker,boggart_Ariana's death,boggart_Dementor,boggart_Dueling,boggart_Failure,boggart_Full Moon,boggart_Her mother,boggart_Lily Potter,boggart_Lord Voldemort,boggart_Severus Snape,boggart_Spider,boggart_Tom 
Riddle,favorite_class_Arithmancy,favorite_class_Baking,favorite_class_Charms,favorite_class_Creatures,favorite_class_Dark Arts,favorite_class_Defense Against the Dark Arts,favorite_class_Dueling,favorite_class_Goat Charming,favorite_class_Gossip,favorite_class_Herbology,favorite_class_Household Charms,favorite_class_Legilimency,favorite_class_Memory Charms,favorite_class_Muggle Studies,favorite_class_Obscurus,favorite_class_Potions,favorite_class_Quidditch,favorite_class_Strength,favorite_class_Transfiguration,favorite_class_Transformation
0,Harry Potter,11,Gryffindor,150.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Hermione Granger,11,Gryffindor,200.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Ron Weasley,11,Gryffindor,50.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Draco Malfoy,11,Slytherin,100.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
4,Luna Lovegood,11,Ravenclaw,120.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


---

### **7.5 Discretizing the Numerical Values: Uncovering the Numerical Data ✨**

In the magical realm of data science, where numbers hold secrets and patterns dance in the shadows, there exists a particularly enchanting spell: *one-hot encoding*. This spell is a `transfiguration charm`, capable of transforming seemingly ordinary text into a `numerical language` that our magical computers can understand. Imagine a bustling `Diagon Alley`, filled with shops selling `wands`, `cauldrons`, and `robes` of every color. Each shop has a unique name, a string of letters that defines its identity. Now, picture these shop names as magical creatures, wild and untamed. To harness their power for our spells, we must transform them into something more manageable – **numbers**. 

One-hot encoding is the spell that accomplishes this feat. It takes each unique shop name and creates a separate magical dimension (column) for it. Within these dimensions, we cast a binary spell, assigning a value of 1 to the shop that exists in that dimension and 0 to all others. It's like creating a magical grid, where each shop has its own spotlight moment. With this transformation, our once chaotic collection of shop names becomes an orderly array of numbers, ready to be analyzed and explored. The same spell works on numerical columns too, provided we first sort the numbers into ranges. 🪄✨

### **7.5.1 Converting Columns with Numerical Values**

To convert a numerical column with values ranging from 100 to 200 into a more machine learning-friendly format using one-hot encoding, you typically need to discretize the numerical values into categorical bins first. Here’s how to do it step by step:

You can create bins (categories) for the numerical values. For example, you might define bins like this:

- **100-120**
- **121-140**
- **141-160**
- **161-180**
- **181-200**

### **7.5.2 Assign Categories**

Next, assign each numerical value to its corresponding bin. For example:

| **Original Value** | **Category** |
|---------------------|--------------|
| 100                 | 100-120      |
| 110                 | 100-120      |
| 125                 | 121-140      |
| 145                 | 141-160      |
| 165                 | 161-180      |
| 180                 | 161-180      |
| 200                 | 181-200      |
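The assignment in the table can be sketched with `pd.cut`. The values and labels are taken from the table; the variable names are illustrative. Note the assumption about the edges: `pd.cut` closes each interval on the right by default, so the first bin only includes exactly 100 when `include_lowest=True` is passed.

```python
import pandas as pd

values = pd.Series([100, 110, 125, 145, 165, 180, 200])
bins = [100, 120, 140, 160, 180, 200]
labels = ["100-120", "121-140", "141-160", "161-180", "181-200"]

# Intervals are closed on the right; include_lowest=True also closes
# the first interval on the left so that exactly 100 is included
categories = pd.cut(values, bins=bins, labels=labels, right=True, include_lowest=True)

# Without include_lowest, the first interval is (100, 120] and 100 is left unassigned (NaN)
without_lowest = pd.cut(values, bins=bins, labels=labels, right=True)

print(pd.DataFrame({"value": values, "category": categories}))
print(without_lowest.isna().tolist())
```

Values outside the outermost edges (below 100 or above 200 here) are likewise left unassigned, so it is worth checking the range of a column before choosing the edges.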

### **7.5.3 One-Hot Encode the Categories**

Now, you can apply one-hot encoding to the categorical column. Each category will be represented as a binary vector:

| **Original Value** | **100-120** | **121-140** | **141-160** | **161-180** | **181-200** |
|---------------------|--------------|--------------|--------------|--------------|--------------|
| 100                 | 1            | 0            | 0            | 0            | 0            |
| 110                 | 1            | 0            | 0            | 0            | 0            |
| 125                 | 0            | 1            | 0            | 0            | 0            |
| 145                 | 0            | 0            | 1            | 0            | 0            |
| 165                 | 0            | 0            | 0            | 1            | 0            |
| 180                 | 0            | 0            | 0            | 1            | 0            |
| 200                 | 0            | 0            | 0            | 0            | 1            |

By discretizing the numerical values into categories and applying one-hot encoding, you turn a continuous column into a handful of binary range indicators. The trade-off is precision: the encoded columns record which range a value falls in, not the exact value, so 165 and 180 become indistinguishable.
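Putting both steps together, the following sketch reproduces the one-hot table for the seven example values. It is an illustration under the same assumptions as the tables (labels named after the ranges, 100 included in the first bin), and the names `points` and `result` are hypothetical.

```python
import pandas as pd

points = pd.DataFrame({"value": [100, 110, 125, 145, 165, 180, 200]})
bins = [100, 120, 140, 160, 180, 200]
labels = ["100-120", "121-140", "141-160", "161-180", "181-200"]

# Step 1: assign every value to its range
points["category"] = pd.cut(points["value"], bins=bins, labels=labels, include_lowest=True)

# Step 2: one column per range; empty prefix keeps the range labels as column names
result = pd.get_dummies(points, columns=["category"], prefix="", prefix_sep="", dtype=int)

print(result)
```

The rows for 165 and 180 come out identical, which illustrates how binning trades exact values for simpler range indicators.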

In [204]:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from IPython.display import display, HTML

# Step 1: Define bins and labels
bins = [100, 120, 140, 160, 180, 200]  # Define the bin edges
labels = ['hp_100_120', 'hp_121_140', 'hp_141_160', 'hp_161_180', 'hp_181_200']  # Define the bin labels

# Step 2: Create a new categorical column based on the bins
# Note: with right=True the intervals are (100, 120], (120, 140], ... so a value of
# exactly 100, or anything outside 100-200, matches no bin and ends up with all zeros.
# Pass include_lowest=True to pd.cut if 100 should fall into the first bin.
hogwarts_df['house_category'] = pd.cut(hogwarts_df['house_points'], bins=bins, labels=labels, right=True)

# Step 3: One-hot encode the categorical column; dtype=int yields 1/0 instead of True/False
hogwarts_df_encoded = pd.get_dummies(hogwarts_df, columns=['house_category'], prefix='', prefix_sep='', dtype=int)

# Drop the house_points column
hogwarts_df_encoded.drop('house_points', axis=1, inplace=True)

# Displaying the transformed DataFrame in a scrollable pane
html = hogwarts_df_encoded.head(5).to_html()  # Convert the first 5 rows to HTML
scrollable_html = f"""
<div style="height: 300px; overflow: auto;">
    {html}
</div>
"""
display(HTML(scrollable_html))  

Unnamed: 0,name,age,house,gender_Female,gender_Male,origin_Bulgaria,origin_England,origin_Europe,origin_France,origin_Indonesia,origin_Ireland,origin_Scotland,origin_USA,origin_Wales,specialty_Auror,specialty_Baking,specialty_Charms,specialty_Chess,specialty_Creatures,specialty_Dark Arts,specialty_Defense Against the Dark Arts,specialty_Dueling,specialty_Goat Charming,specialty_Gossip,specialty_Herbology,specialty_History of Magic,specialty_Household Charms,specialty_Legilimency,specialty_Magical Creatures,specialty_Memory Charms,specialty_Metamorphmagus,specialty_Muggle Artifacts,specialty_Obscurus,specialty_Potions,specialty_Quidditch,specialty_Strength,specialty_Transfiguration,specialty_Transformation,blood_status_Half-blood,blood_status_Muggle-born,blood_status_No-mag,blood_status_Pure-blood,pet_Cat,pet_Demiguise,pet_Dog,pet_Goat,pet_Owl,pet_Phoenix,pet_Rat,pet_Snake,pet_Toad,wand_type_Alder,wand_type_Ash,wand_type_Birch,wand_type_Blackthorn,wand_type_Cedar,wand_type_Cherry,wand_type_Chestnut,wand_type_Cypress,wand_type_Ebony,wand_type_Elder,wand_type_Elm,wand_type_Fir,wand_type_Hawthorn,wand_type_Hazel,wand_type_Hemlock,wand_type_Holly,wand_type_Hornbeam,wand_type_Maple,wand_type_Oak,wand_type_Pine,wand_type_Rosewood,wand_type_Rowan,wand_type_Sword,wand_type_Teak,wand_type_Vine,wand_type_Walnut,wand_type_Willow,wand_type_Yew,patronus_Cat,patronus_Doe,patronus_Dog,patronus_Eagle,patronus_Hare,patronus_Horse,patronus_Jack Russell Terrier,patronus_Lion,patronus_Non-corporeal,patronus_Otter,patronus_Phoenix,patronus_Serpent,patronus_Stag,patronus_Swan,patronus_Wolf,quidditch_position_Azkaban,quidditch_position_Beater,quidditch_position_Chaser,quidditch_position_Keeper,quidditch_position_Seeker,boggart_Ariana's death,boggart_Dementor,boggart_Dueling,boggart_Failure,boggart_Full Moon,boggart_Her mother,boggart_Lily Potter,boggart_Lord Voldemort,boggart_Severus Snape,boggart_Spider,boggart_Tom 
Riddle,favorite_class_Arithmancy,favorite_class_Baking,favorite_class_Charms,favorite_class_Creatures,favorite_class_Dark Arts,favorite_class_Defense Against the Dark Arts,favorite_class_Dueling,favorite_class_Goat Charming,favorite_class_Gossip,favorite_class_Herbology,favorite_class_Household Charms,favorite_class_Legilimency,favorite_class_Memory Charms,favorite_class_Muggle Studies,favorite_class_Obscurus,favorite_class_Potions,favorite_class_Quidditch,favorite_class_Strength,favorite_class_Transfiguration,favorite_class_Transformation,hp_100_120,hp_121_140,hp_141_160,hp_161_180,hp_181_200
0,Harry Potter,11,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,1,0,0
1,Hermione Granger,11,Gryffindor,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,1
2,Ron Weasley,11,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0
3,Draco Malfoy,11,Slytherin,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0,0,0,0,0
4,Luna Lovegood,11,Ravenclaw,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,0


---

### **7.6 Discretizing the Numerical Values: Binning the Age Column ✨**

The same two-step spell, binning followed by one-hot encoding, applies to the `age` column.
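Before binning ages, it is worth checking which ages actually land in each band. The sketch below reuses the bin edges and labels that this chapter applies to the `age` column, on a made-up range of ages; `band_members` is an illustrative name.

```python
import pandas as pd

# Made-up ages covering a typical school career
ages = pd.Series([11, 12, 13, 14, 15, 16, 17, 18])

# Edges and labels as used for the age column in this chapter
bins = [10, 12, 14, 16, 18]
labels = ["age_11", "age_12", "age_13", "age_14"]

age_category = pd.cut(ages, bins=bins, labels=labels, right=True)

# Collect the ages that fall under each label
band_members = {label: ages[age_category == label].tolist() for label in labels}

print(band_members)
```

Each label covers a two-year band rather than a single age (for example, `age_12` holds ages 13 and 14), so names that spell out the range might read more clearly if you adapt this to your own data.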

In [205]:
import pandas as pd
from IPython.display import display, HTML

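# Step 1: Define bins and labels for the age column
# Note: each bin spans two years, so 'age_11' covers (10, 12] (ages 11-12),
# 'age_12' covers (12, 14], 'age_13' covers (14, 16], and 'age_14' covers (16, 18]
bins = [10, 12, 14, 16, 18]  # Define the bin edges
labels = ['age_11', 'age_12', 'age_13', 'age_14']  # Define the bin labels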

# Step 2: Create a new categorical column based on the bins
hogwarts_df_encoded['age_category'] = pd.cut(hogwarts_df_encoded['age'], bins=bins, labels=labels, right=True)

# Step 3: One-hot encode the categorical column; dtype=int yields 1/0 instead of True/False
hogwarts_df_encoded_age = pd.get_dummies(hogwarts_df_encoded, columns=['age_category'], prefix='', prefix_sep='', dtype=int)

# Drop the age column
hogwarts_df_encoded_age.drop('age', axis=1, inplace=True)

# Displaying the transformed DataFrame in a scrollable pane
html = hogwarts_df_encoded_age.head(5).to_html()  # Convert the first 5 rows to HTML
scrollable_html = f"""
<div style="height: 300px; overflow: auto;">
    {html}
</div>
"""
display(HTML(scrollable_html))

Unnamed: 0,name,house,gender_Female,gender_Male,origin_Bulgaria,origin_England,origin_Europe,origin_France,origin_Indonesia,origin_Ireland,origin_Scotland,origin_USA,origin_Wales,specialty_Auror,specialty_Baking,specialty_Charms,specialty_Chess,specialty_Creatures,specialty_Dark Arts,specialty_Defense Against the Dark Arts,specialty_Dueling,specialty_Goat Charming,specialty_Gossip,specialty_Herbology,specialty_History of Magic,specialty_Household Charms,specialty_Legilimency,specialty_Magical Creatures,specialty_Memory Charms,specialty_Metamorphmagus,specialty_Muggle Artifacts,specialty_Obscurus,specialty_Potions,specialty_Quidditch,specialty_Strength,specialty_Transfiguration,specialty_Transformation,blood_status_Half-blood,blood_status_Muggle-born,blood_status_No-mag,blood_status_Pure-blood,pet_Cat,pet_Demiguise,pet_Dog,pet_Goat,pet_Owl,pet_Phoenix,pet_Rat,pet_Snake,pet_Toad,wand_type_Alder,wand_type_Ash,wand_type_Birch,wand_type_Blackthorn,wand_type_Cedar,wand_type_Cherry,wand_type_Chestnut,wand_type_Cypress,wand_type_Ebony,wand_type_Elder,wand_type_Elm,wand_type_Fir,wand_type_Hawthorn,wand_type_Hazel,wand_type_Hemlock,wand_type_Holly,wand_type_Hornbeam,wand_type_Maple,wand_type_Oak,wand_type_Pine,wand_type_Rosewood,wand_type_Rowan,wand_type_Sword,wand_type_Teak,wand_type_Vine,wand_type_Walnut,wand_type_Willow,wand_type_Yew,patronus_Cat,patronus_Doe,patronus_Dog,patronus_Eagle,patronus_Hare,patronus_Horse,patronus_Jack Russell Terrier,patronus_Lion,patronus_Non-corporeal,patronus_Otter,patronus_Phoenix,patronus_Serpent,patronus_Stag,patronus_Swan,patronus_Wolf,quidditch_position_Azkaban,quidditch_position_Beater,quidditch_position_Chaser,quidditch_position_Keeper,quidditch_position_Seeker,boggart_Ariana's death,boggart_Dementor,boggart_Dueling,boggart_Failure,boggart_Full Moon,boggart_Her mother,boggart_Lily Potter,boggart_Lord Voldemort,boggart_Severus Snape,boggart_Spider,boggart_Tom 
Riddle,favorite_class_Arithmancy,favorite_class_Baking,favorite_class_Charms,favorite_class_Creatures,favorite_class_Dark Arts,favorite_class_Defense Against the Dark Arts,favorite_class_Dueling,favorite_class_Goat Charming,favorite_class_Gossip,favorite_class_Herbology,favorite_class_Household Charms,favorite_class_Legilimency,favorite_class_Memory Charms,favorite_class_Muggle Studies,favorite_class_Obscurus,favorite_class_Potions,favorite_class_Quidditch,favorite_class_Strength,favorite_class_Transfiguration,favorite_class_Transformation,hp_100_120,hp_121_140,hp_141_160,hp_161_180,hp_181_200,age_11,age_12,age_13,age_14
0,Harry Potter,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,1,0,0,1,0,0,0
1,Hermione Granger,Gryffindor,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,1,1,0,0,0
2,Ron Weasley,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,1,0,0,0
3,Draco Malfoy,Slytherin,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0,0,0,0,0,1,0,0,0
4,Luna Lovegood,Ravenclaw,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,0,1,0,0,0


---

### **7.7 Housekeeping the Columns with a Naming Convention ✨**

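Once we're satisfied with the results, let's tidy the column names into one consistent convention (lowercase, with underscores instead of spaces and no apostrophes) and save the current dataset so the next enchanting journey is easier to navigate.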

In [206]:
import pandas as pd
from IPython.display import display, HTML

# Assuming hogwarts_df_encoded_age is already defined and contains the encoded columns

# Normalize column names: lowercase, spaces become underscores, apostrophes are removed (hyphens are kept)
hogwarts_df_encoded_age.columns = hogwarts_df_encoded_age.columns.str.lower().str.replace(' ', '_').str.replace("'", "")

# Display the transformed DataFrame in a scrollable pane
html = hogwarts_df_encoded_age.head(5).to_html()  # Convert first 5 rows to HTML
scrollable_html = f"""
<div style="height: 300px; overflow: auto;">
    {html}
</div>
"""
display(HTML(scrollable_html))  # Display first 5 rows in a scrollable pane

Unnamed: 0,name,house,gender_female,gender_male,origin_bulgaria,origin_england,origin_europe,origin_france,origin_indonesia,origin_ireland,origin_scotland,origin_usa,origin_wales,specialty_auror,specialty_baking,specialty_charms,specialty_chess,specialty_creatures,specialty_dark_arts,specialty_defense_against_the_dark_arts,specialty_dueling,specialty_goat_charming,specialty_gossip,specialty_herbology,specialty_history_of_magic,specialty_household_charms,specialty_legilimency,specialty_magical_creatures,specialty_memory_charms,specialty_metamorphmagus,specialty_muggle_artifacts,specialty_obscurus,specialty_potions,specialty_quidditch,specialty_strength,specialty_transfiguration,specialty_transformation,blood_status_half-blood,blood_status_muggle-born,blood_status_no-mag,blood_status_pure-blood,pet_cat,pet_demiguise,pet_dog,pet_goat,pet_owl,pet_phoenix,pet_rat,pet_snake,pet_toad,wand_type_alder,wand_type_ash,wand_type_birch,wand_type_blackthorn,wand_type_cedar,wand_type_cherry,wand_type_chestnut,wand_type_cypress,wand_type_ebony,wand_type_elder,wand_type_elm,wand_type_fir,wand_type_hawthorn,wand_type_hazel,wand_type_hemlock,wand_type_holly,wand_type_hornbeam,wand_type_maple,wand_type_oak,wand_type_pine,wand_type_rosewood,wand_type_rowan,wand_type_sword,wand_type_teak,wand_type_vine,wand_type_walnut,wand_type_willow,wand_type_yew,patronus_cat,patronus_doe,patronus_dog,patronus_eagle,patronus_hare,patronus_horse,patronus_jack_russell_terrier,patronus_lion,patronus_non-corporeal,patronus_otter,patronus_phoenix,patronus_serpent,patronus_stag,patronus_swan,patronus_wolf,quidditch_position_azkaban,quidditch_position_beater,quidditch_position_chaser,quidditch_position_keeper,quidditch_position_seeker,boggart_arianas_death,boggart_dementor,boggart_dueling,boggart_failure,boggart_full_moon,boggart_her_mother,boggart_lily_potter,boggart_lord_voldemort,boggart_severus_snape,boggart_spider,boggart_tom_riddle,favorite_class_arithmancy,favorite_class_baking,favorite_class_charms,favorite_class_creatures,favorite_class_dark_arts,favorite_class_defense_against_the_dark_arts,favorite_class_dueling,favorite_class_goat_charming,favorite_class_gossip,favorite_class_herbology,favorite_class_household_charms,favorite_class_legilimency,favorite_class_memory_charms,favorite_class_muggle_studies,favorite_class_obscurus,favorite_class_potions,favorite_class_quidditch,favorite_class_strength,favorite_class_transfiguration,favorite_class_transformation,hp_100_120,hp_121_140,hp_141_160,hp_161_180,hp_181_200,age_11,age_12,age_13,age_14
0,Harry Potter,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,1,0,0,1,0,0,0
1,Hermione Granger,Gryffindor,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,1,1,0,0,0
2,Ron Weasley,Gryffindor,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,1,0,0,0
3,Draco Malfoy,Slytherin,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0,0,0,0,0,1,0,0,0
4,Luna Lovegood,Ravenclaw,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,0,1,0,0,0


In [207]:
hogwarts_df_encoded_age.to_csv('data/hogwarts-students-03.csv', index=False)
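
The renaming charm above can also be wrapped in a small helper so it is easy to reuse and check. What follows is only a minimal sketch: `clean_column_names` and `toy_df` are hypothetical names invented for illustration, and the handful of column names are borrowed from the output above rather than read from the real dataset.

```python
import pandas as pd


def clean_column_names(columns: pd.Index) -> pd.Index:
    """Lowercase names, turn spaces into underscores, drop apostrophes (hyphens stay)."""
    return (
        columns.str.lower()
        .str.replace(" ", "_", regex=False)
        .str.replace("'", "", regex=False)
    )


# Hypothetical one-row frame; only the column names matter here
toy_df = pd.DataFrame(
    [[1.0, 0.0, 1.0, 0.0]],
    columns=[
        "gender_Female",
        "blood_status_Half-blood",
        "boggart_Ariana's death",
        "patronus_Jack Russell Terrier",
    ],
)

toy_df.columns = clean_column_names(toy_df.columns)
print(list(toy_df.columns))
```

After any renaming it is worth confirming that no two columns have collapsed into the same name, for example with `toy_df.columns.is_unique`.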

### **7.6 Mapping the Correlation for Each Column**

In [208]:
# Importing necessary libraries
import pandas as pd
from IPython.display import display, HTML

# Assuming hogwarts_df_encoded_age is already defined and contains the encoded columns

# Selecting only numerical columns
numerical_df = hogwarts_df_encoded_age.select_dtypes(include=['number'])

# Calculating the correlation matrix
correlation_matrix = numerical_df.corr()

# Displaying the correlation matrix in a scrollable pane
correlation_html = correlation_matrix.to_html()  # Convert correlation matrix to HTML
scrollable_correlation_html = f"""
<div style="height: 300px; overflow: auto;">
    {correlation_html}
</div>
"""
display(HTML(scrollable_correlation_html))  # Display correlation matrix in a scrollable pane

Unnamed: 0,gender_female,gender_male,origin_bulgaria,origin_england,origin_europe,origin_france,origin_indonesia,origin_ireland,origin_scotland,origin_usa,origin_wales,specialty_auror,specialty_baking,specialty_charms,specialty_chess,specialty_creatures,specialty_dark_arts,specialty_defense_against_the_dark_arts,specialty_dueling,specialty_goat_charming,specialty_gossip,specialty_herbology,specialty_history_of_magic,specialty_household_charms,specialty_legilimency,specialty_magical_creatures,specialty_memory_charms,specialty_metamorphmagus,specialty_muggle_artifacts,specialty_obscurus,specialty_potions,specialty_quidditch,specialty_strength,specialty_transfiguration,specialty_transformation,blood_status_half-blood,blood_status_muggle-born,blood_status_no-mag,blood_status_pure-blood,pet_cat,pet_demiguise,pet_dog,pet_goat,pet_owl,pet_phoenix,pet_rat,pet_snake,pet_toad,wand_type_alder,wand_type_ash,wand_type_birch,wand_type_blackthorn,wand_type_cedar,wand_type_cherry,wand_type_chestnut,wand_type_cypress,wand_type_ebony,wand_type_elder,wand_type_elm,wand_type_fir,wand_type_hawthorn,wand_type_hazel,wand_type_hemlock,wand_type_holly,wand_type_hornbeam,wand_type_maple,wand_type_oak,wand_type_pine,wand_type_rosewood,wand_type_rowan,wand_type_sword,wand_type_teak,wand_type_vine,wand_type_walnut,wand_type_willow,wand_type_yew,patronus_cat,patronus_doe,patronus_dog,patronus_eagle,patronus_hare,patronus_horse,patronus_jack_russell_terrier,patronus_lion,patronus_non-corporeal,patronus_otter,patronus_phoenix,patronus_serpent,patronus_stag,patronus_swan,patronus_wolf,quidditch_position_azkaban,quidditch_position_beater,quidditch_position_chaser,quidditch_position_keeper,quidditch_position_seeker,boggart_arianas_death,boggart_dementor,boggart_dueling,boggart_failure,boggart_full_moon,boggart_her_mother,boggart_lily_potter,boggart_lord_voldemort,boggart_severus_snape,boggart_spider,boggart_tom_riddle,favorite_class_arithmancy,favorite_class_baking,favorite_class_charms,favorite_class_creatures,favorite_class_dark_arts,favorite_class_defense_against_the_dark_arts,favorite_class_dueling,favorite_class_goat_charming,favorite_class_gossip,favorite_class_herbology,favorite_class_household_charms,favorite_class_legilimency,favorite_class_memory_charms,favorite_class_muggle_studies,favorite_class_obscurus,favorite_class_potions,favorite_class_quidditch,favorite_class_strength,favorite_class_transfiguration,favorite_class_transformation,hp_100_120,hp_121_140,hp_141_160,hp_161_180,hp_181_200,age_11,age_12,age_13,age_14
gender_female,1.0,-1.0,-0.134742,-0.396062,-0.19245,0.257143,0.145521,0.145521,0.3,0.092063,0.207846,0.007698,-0.134742,0.297107,-0.134742,0.145521,-0.106574,-0.073016,-0.19245,-0.134742,0.145521,0.092063,0.145521,0.145521,0.145521,-0.134742,-0.134742,0.145521,-0.134742,0.145521,-0.052727,-0.19245,-0.073016,0.01111111,0.007698,0.075556,0.207846,-0.134742,-0.118783,0.409878,-0.134742,-0.19245,-0.134742,-0.02566,-0.134742,-0.19245,0.145521,-0.134742,0.007698,-0.133333,0.007698,0.145521,0.007698,-0.19245,-0.19245,0.007698,0.007698,-0.238095,-0.19245,0.3,-0.073016,0.145521,0.145521,-0.134742,-0.19245,0.145521,-0.19245,0.207846,0.145521,0.145521,-0.134742,0.145521,0.145521,0.007698,0.207846,0.007698,0.145521,0.007698,-0.134742,0.145521,0.145521,0.145521,-0.134742,-0.134742,-0.02566,0.145521,-0.134742,-0.134742,-0.19245,0.145521,0.145521,0.145521,-0.134742,0.007698,-0.134742,0.052727,-0.134742,-0.134742,0.145521,0.161628,-0.19245,0.207846,-0.134742,-0.134742,-0.134742,-0.134742,0.145521,0.145521,-0.134742,0.311768,0.007698,-0.183289,-0.09026709,-0.19245,-0.134742,0.145521,0.092063,0.145521,0.145521,-0.134742,-0.134742,0.145521,-0.052727,-0.134742,-0.073016,0.007698,0.007698,0.33896,0.254851,-0.196946,0.011111,-0.041205,-0.1111111,0.3,-0.017972,-0.048686
gender_male,-1.0,1.0,0.134742,0.396062,0.19245,-0.257143,-0.145521,-0.145521,-0.3,-0.092063,-0.207846,-0.007698,0.134742,-0.297107,0.134742,-0.145521,0.106574,0.073016,0.19245,0.134742,-0.145521,-0.092063,-0.145521,-0.145521,-0.145521,0.134742,0.134742,-0.145521,0.134742,-0.145521,0.052727,0.19245,0.073016,-0.01111111,-0.007698,-0.075556,-0.207846,0.134742,0.118783,-0.409878,0.134742,0.19245,0.134742,0.02566,0.134742,0.19245,-0.145521,0.134742,-0.007698,0.133333,-0.007698,-0.145521,-0.007698,0.19245,0.19245,-0.007698,-0.007698,0.238095,0.19245,-0.3,0.073016,-0.145521,-0.145521,0.134742,0.19245,-0.145521,0.19245,-0.207846,-0.145521,-0.145521,0.134742,-0.145521,-0.145521,-0.007698,-0.207846,-0.007698,-0.145521,-0.007698,0.134742,-0.145521,-0.145521,-0.145521,0.134742,0.134742,0.02566,-0.145521,0.134742,0.134742,0.19245,-0.145521,-0.145521,-0.145521,0.134742,-0.007698,0.134742,-0.052727,0.134742,0.134742,-0.145521,-0.161628,0.19245,-0.207846,0.134742,0.134742,0.134742,0.134742,-0.145521,-0.145521,0.134742,-0.311768,-0.007698,0.183289,0.09026709,0.19245,0.134742,-0.145521,-0.092063,-0.145521,-0.145521,0.134742,0.134742,-0.145521,0.052727,0.134742,0.073016,-0.007698,-0.007698,-0.33896,-0.254851,0.196946,-0.011111,0.041205,0.1111111,-0.3,0.017972,0.048686
origin_bulgaria,-0.134742,0.134742,1.0,-0.200921,-0.028006,-0.034648,-0.019608,-0.019608,-0.040423,-0.034648,-0.028006,-0.028006,-0.019608,-0.055228,-0.019608,-0.019608,-0.050572,-0.034648,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,0.70014,-0.034648,-0.0404226,-0.028006,-0.134742,-0.028006,-0.019608,0.151248,-0.055228,-0.019608,-0.028006,-0.019608,0.093352,-0.019608,-0.028006,-0.019608,-0.019608,-0.028006,-0.040423,-0.028006,-0.019608,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.034648,-0.028006,-0.0404226,0.565916,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.028006,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.028006,-0.028006,-0.028006,-0.019608,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,0.093352,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,0.045672,-0.019608,-0.019608,-0.019608,0.076696,-0.028006,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.068327,-0.028006,-0.045672,-0.05970814,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,1.0,-0.034648,-0.028006,-0.028006,-0.045672,-0.050572,-0.059708,0.485071,-0.055228,-0.08084521,-0.040423,-0.089158,0.177123
origin_england,-0.396062,0.396062,-0.200921,1.0,-0.286972,-0.355036,-0.200921,-0.200921,-0.414208,-0.355036,-0.286972,-0.073793,-0.200921,-0.325691,0.09759,-0.200921,-0.004935,0.172446,0.139386,0.09759,0.09759,-0.003381,0.09759,0.09759,-0.200921,0.09759,0.09759,-0.200921,0.09759,0.09759,0.227314,-0.073793,-0.003381,0.04733811,-0.073793,-0.231957,0.139386,-0.200921,0.234055,0.154761,0.09759,0.139386,0.09759,-0.286972,0.09759,0.139386,-0.200921,0.09759,-0.073793,0.047338,-0.073793,-0.200921,-0.073793,0.139386,0.139386,-0.073793,0.139386,-0.003381,0.139386,-0.2603596,-0.003381,-0.200921,0.09759,0.09759,0.139386,-0.200921,0.139386,0.139386,-0.200921,-0.200921,0.09759,-0.200921,0.09759,0.139386,0.139386,-0.073793,-0.200921,0.139386,0.09759,-0.200921,-0.200921,0.09759,0.09759,0.09759,-0.020498,0.09759,0.09759,0.09759,0.139386,-0.200921,-0.200921,0.09759,0.09759,0.139386,0.09759,-0.227314,0.09759,0.09759,0.09759,-0.18712,0.139386,-0.286972,0.09759,0.09759,0.09759,0.09759,0.09759,0.09759,-0.200921,-0.180036,-0.073793,-0.050811,0.06992302,0.139386,0.09759,0.09759,-0.003381,0.09759,-0.200921,0.09759,0.09759,0.09759,0.227314,-0.200921,-0.003381,-0.073793,-0.073793,-0.189874,-0.261573,0.183548,-0.106511,-0.085465,0.3076977,-0.26036,0.353257,-0.460225
origin_europe,-0.19245,0.19245,-0.028006,-0.286972,1.0,-0.049487,-0.028006,-0.028006,-0.057735,-0.049487,-0.04,-0.04,-0.028006,-0.078881,-0.028006,-0.028006,0.553775,-0.049487,-0.04,-0.028006,-0.028006,-0.049487,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.065233,-0.04,-0.049487,-0.05773503,-0.04,0.007698,-0.04,-0.028006,0.01543,-0.078881,-0.028006,-0.04,-0.028006,0.133333,-0.028006,-0.04,-0.028006,-0.028006,-0.04,-0.057735,-0.04,-0.028006,-0.04,-0.04,-0.04,-0.04,-0.04,0.379402,-0.04,-0.05773503,-0.049487,-0.028006,-0.028006,-0.028006,-0.04,-0.028006,-0.04,-0.04,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.04,-0.04,0.48,-0.028006,-0.04,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,0.133333,-0.028006,-0.028006,-0.028006,-0.04,-0.028006,-0.028006,-0.028006,-0.028006,-0.04,-0.028006,0.065233,-0.028006,-0.028006,-0.028006,0.109545,-0.04,-0.04,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.09759,-0.04,0.613188,-0.08528029,-0.04,-0.028006,-0.028006,-0.049487,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.065233,-0.028006,-0.049487,-0.04,-0.04,-0.065233,-0.072232,-0.08528,-0.057735,0.214106,-0.1154701,-0.057735,-0.127343,0.252982
origin_france,0.257143,-0.257143,-0.034648,-0.355036,-0.049487,1.0,-0.034648,-0.034648,-0.071429,-0.061224,-0.049487,-0.049487,-0.034648,0.385713,-0.034648,-0.034648,-0.089363,-0.061224,-0.049487,-0.034648,-0.034648,-0.061224,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.080705,-0.049487,0.292517,-0.07142857,-0.049487,0.257143,-0.049487,-0.034648,-0.229081,-0.09759,-0.034648,-0.049487,-0.034648,0.164957,-0.034648,-0.049487,-0.034648,-0.034648,0.379402,-0.071429,-0.049487,-0.034648,-0.049487,-0.049487,-0.049487,-0.049487,-0.049487,-0.061224,-0.049487,0.2380952,-0.061224,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,-0.049487,-0.049487,0.565916,-0.034648,-0.034648,-0.034648,-0.034648,-0.049487,-0.049487,-0.049487,-0.034648,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,0.164957,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,0.080705,-0.034648,-0.034648,-0.034648,0.135526,-0.049487,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,0.297816,-0.049487,-0.080705,-0.105507,-0.049487,-0.034648,-0.034648,-0.061224,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.080705,-0.034648,0.292517,-0.049487,-0.049487,0.199072,0.168797,-0.105507,-0.071429,-0.09759,-0.1428571,0.238095,-0.157546,0.143451
origin_indonesia,0.145521,-0.145521,-0.019608,-0.200921,-0.028006,-0.034648,1.0,-0.019608,-0.040423,-0.034648,-0.028006,-0.028006,-0.019608,-0.055228,-0.019608,-0.019608,-0.050572,-0.034648,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,-0.028006,-0.034648,-0.0404226,0.70014,0.145521,-0.028006,-0.019608,-0.129641,-0.055228,-0.019608,-0.028006,-0.019608,-0.210042,-0.019608,-0.028006,1.0,-0.019608,-0.028006,-0.040423,-0.028006,-0.019608,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.034648,-0.028006,-0.0404226,-0.034648,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.028006,-0.028006,-0.019608,-0.019608,-0.019608,1.0,-0.019608,-0.028006,-0.028006,-0.028006,-0.019608,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,0.093352,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,0.045672,-0.019608,-0.019608,-0.019608,0.076696,-0.028006,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.068327,-0.028006,-0.045672,-0.05970814,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,-0.019608,-0.034648,-0.028006,0.70014,-0.045672,-0.050572,-0.059708,-0.040423,-0.055228,-0.08084521,-0.040423,-0.089158,0.177123
origin_ireland,0.145521,-0.145521,-0.019608,-0.200921,-0.028006,-0.034648,-0.019608,1.0,-0.040423,-0.034648,-0.028006,-0.028006,-0.019608,-0.055228,-0.019608,1.0,-0.050572,-0.034648,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,-0.028006,-0.034648,-0.0404226,-0.028006,0.145521,-0.028006,-0.019608,-0.129641,-0.055228,-0.019608,-0.028006,-0.019608,0.093352,-0.019608,-0.028006,-0.019608,-0.019608,-0.028006,-0.040423,-0.028006,-0.019608,-0.028006,-0.028006,-0.028006,-0.028006,-0.028006,-0.034648,-0.028006,0.4850713,-0.034648,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.028006,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.028006,-0.028006,-0.028006,-0.019608,-0.028006,-0.019608,-0.019608,1.0,-0.019608,-0.019608,-0.019608,-0.210042,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,-0.019608,-0.019608,-0.019608,-0.028006,-0.019608,0.045672,-0.019608,-0.019608,-0.019608,-0.255655,-0.028006,0.70014,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.068327,0.70014,-0.045672,-0.05970814,-0.028006,-0.019608,-0.019608,-0.034648,-0.019608,-0.019608,-0.019608,-0.019608,-0.019608,-0.045672,-0.019608,-0.034648,-0.028006,-0.028006,0.429318,-0.050572,-0.059708,-0.040423,-0.055228,0.2425356,-0.040423,-0.089158,-0.110702
origin_scotland,0.3,-0.3,-0.040423,-0.414208,-0.057735,-0.071429,-0.040423,-0.040423,1.0,-0.071429,-0.057735,-0.057735,-0.040423,0.52048,-0.040423,-0.040423,-0.104257,-0.071429,-0.057735,-0.040423,-0.040423,-0.071429,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.094155,-0.057735,-0.071429,0.1875,-0.057735,0.011111,-0.057735,-0.040423,0.022272,0.09759,-0.040423,-0.057735,-0.040423,0.036084,-0.040423,-0.057735,-0.040423,-0.040423,-0.057735,-0.083333,-0.057735,-0.040423,-0.057735,-0.057735,-0.057735,-0.057735,-0.057735,-0.071429,-0.057735,0.1875,-0.071429,0.485071,-0.040423,-0.040423,-0.057735,0.485071,-0.057735,-0.057735,-0.040423,0.485071,-0.040423,-0.040423,-0.040423,-0.057735,-0.057735,-0.057735,0.485071,-0.057735,-0.040423,0.485071,-0.040423,-0.040423,-0.040423,-0.040423,-0.276647,-0.040423,-0.040423,-0.040423,-0.057735,0.485071,-0.040423,-0.040423,-0.040423,-0.057735,-0.040423,0.094155,-0.040423,-0.040423,-0.040423,-0.013176,-0.057735,0.317543,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,0.408491,-0.057735,-0.094155,-0.1230915,-0.057735,-0.040423,-0.040423,-0.071429,-0.040423,-0.040423,-0.040423,-0.040423,-0.040423,-0.094155,-0.040423,-0.071429,0.317543,-0.057735,0.150649,-0.104257,0.076932,0.1875,0.09759,-0.1666667,0.1875,-0.024507,0.068465
origin_usa,0.092063,-0.092063,-0.034648,-0.355036,-0.049487,-0.061224,-0.034648,-0.034648,-0.071429,1.0,-0.049487,0.379402,0.565916,-0.09759,-0.034648,-0.034648,-0.089363,-0.061224,-0.049487,-0.034648,-0.034648,-0.061224,-0.034648,-0.034648,0.565916,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.080705,-0.049487,-0.061224,-0.07142857,-0.049487,0.092063,-0.049487,0.565916,-0.229081,-0.09759,-0.034648,-0.049487,-0.034648,0.164957,-0.034648,-0.049487,-0.034648,-0.034648,-0.049487,0.238095,0.379402,-0.034648,-0.049487,-0.049487,-0.049487,0.379402,-0.049487,-0.061224,-0.049487,-0.07142857,-0.061224,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,-0.049487,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.049487,-0.049487,-0.049487,-0.034648,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,0.164957,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.049487,-0.034648,0.080705,-0.034648,-0.034648,-0.034648,0.135526,-0.049487,-0.049487,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,-0.034648,0.565916,-0.120736,-0.049487,-0.080705,0.1230915,-0.049487,-0.034648,-0.034648,-0.061224,-0.034648,0.565916,-0.034648,-0.034648,-0.034648,-0.080705,-0.034648,-0.061224,-0.049487,-0.049487,-0.080705,0.426958,-0.105507,-0.071429,-0.09759,-0.1428571,-0.071429,-0.157546,0.312984


In [209]:
correlation_matrix.to_csv('data/correlation-matrix.csv')
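
A correlation matrix this wide is hard to read by eye, so it can help to flatten it into a ranked list of column pairs. The sketch below is one possible way to do that, not the method used in this guide: the `toy` frame and its six students are invented for illustration. It also shows why `gender_female` and `gender_male` correlate at exactly -1.0 in the output above: when a category has only two values, each one-hot column is the mirror image of the other.

```python
from itertools import combinations

import pandas as pd

# Hypothetical one-hot columns for six students (not the real dataset)
toy = pd.DataFrame({
    "gender_female": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "gender_male":   [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    "pet_owl":       [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    "pet_cat":       [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
})

corr = toy.corr()

# Visit each unordered pair of columns once, skipping the diagonal
pairs = pd.Series(
    {(a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)}
)

# Rank the pairs by absolute strength, strongest first
strongest = pairs.sort_values(key=lambda s: s.abs(), ascending=False)
print(strongest.head(3))
```

Applied to the real `correlation_matrix`, the same ranking would surface the mirrored one-hot pairs first, which is a useful reminder that such columns carry redundant information.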

---

In [210]:
# Importing necessary libraries
import pandas as pd
from IPython.display import display, HTML

# Assuming hogwarts_df is already defined and contains the necessary columns

# Setting display options to show all columns and prevent truncation
pd.set_option('display.max_columns', None)  # Show all columns
pd.set_option('display.expand_frame_repr', False)  # Prevent truncation in output

# Checking the data types of each column
data_types_df = hogwarts_df.dtypes.to_frame(name='Data Type')  # Convert data types to a DataFrame

# Displaying the data types in a scrollable pane
data_types_html = data_types_df.to_html()  # Convert DataFrame to HTML
scrollable_data_types_html = f"""
<div style="height: 150px; overflow: auto;">
    {data_types_html}
</div>
"""
display(HTML(scrollable_data_types_html))  # Display data types in a scrollable pane

Unnamed: 0,Data Type
name,object
age,int64
house,object
house_points,float64
gender_Female,float64
gender_Male,float64
origin_Bulgaria,float64
origin_England,float64
origin_Europe,float64
origin_France,float64


---

### **7.7 Visualizing Relationships between Features**

Next, we weave a more intricate spell, exploring the relationships between different features in our dataset. For instance, does a student’s heritage influence their choice of pet, or is there a connection between a student’s age and the type of wand they use? This step is like exploring the Forbidden Forest, uncovering the connections and mysteries that lie within.

In [211]:
# Importing necessary libraries
import pandas as pd
from IPython.display import display, HTML

# Assuming hogwarts_df is already defined and contains the necessary columns

# Setting display options to show all columns and prevent truncation
pd.set_option('display.max_columns', None)  # Show all columns
pd.set_option('display.expand_frame_repr', False)  # Prevent truncation in output

# Checking the data types of each column to identify numerical and categorical data
data_types_html = hogwarts_df.dtypes.to_frame(name='Data Type').to_html()  # Convert data types to HTML
scrollable_data_types_html = f"""
<div style="height: 150px; overflow: auto;">
    {data_types_html}
</div>
"""
display(HTML(scrollable_data_types_html))  # Display data types in a scrollable pane

Unnamed: 0,Data Type
name,object
age,int64
house,object
house_points,float64
gender_Female,float64
gender_Male,float64
origin_Bulgaria,float64
origin_England,float64
origin_Europe,float64
origin_France,float64
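
For a question such as whether a student's heritage goes hand in hand with their choice of pet, a cross-tabulation is one simple way to peek at the relationship before drawing any charts. The sketch below assumes un-encoded `origin` and `pet` columns and uses a handful of invented students, so the counts say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical students; the rows are invented for illustration only
students = pd.DataFrame({
    "origin": ["England", "England", "Scotland", "Ireland", "England", "Scotland"],
    "pet":    ["Owl", "Cat", "Cat", "Owl", "Owl", "Cat"],
})

# Count how often each origin/pet combination occurs
counts = pd.crosstab(students["origin"], students["pet"])

# Normalize each row so an origin's pet shares sum to 1
shares = pd.crosstab(students["origin"], students["pet"], normalize="index")

print(counts)
print(shares.round(2))
```

With so few rows per origin, any pattern here is anecdotal at best; the same caution applies to the small Hogwarts dataset, where many categories appear only once.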


---

### **7.8 Gemika's Pop-Up Quiz: Spotting the Trends**

And now, dear apprentices, Gemika Haziq Nugroho has prepared a quiz to test your understanding of this powerful encoding spell. Are you ready to decode the mysteries of One-Hot Encoding?

- What is categorical data, and why is it important in data analysis?
- How does One-Hot Encoding transform categorical data for machine learning models?
- Why do we drop the original categorical column after applying One-Hot Encoding?

Answer these questions to prove your mastery over the art of data transformation. As we continue our journey through the magical world of data science, remember that each spell we learn brings us closer to unveiling the secrets of our enchanted dataset. 🌟🧙‍♂️✨

With these newfound skills, you're well-equipped to handle categorical data in any dataset. The path to becoming a master data wizard is full of wonder and discovery, and with each step, we draw closer to the heart of the magical data that surrounds us. Let us press on, eager to learn and ready to explore! 🌌🔍

---