# 🚀 Pandas Adventure - Day 5: Mastering Data Transformation! 🚀
---
Welcome to the grand finale! Today we'll explore two powerful data transformation techniques, `melt` and `pivot`. Let's start by importing our essential libraries and building a small DataFrame of batting statistics.

In [1]:
import pandas as pd
import numpy as np

# Batting stats in wide format: one row per player, one column per metric.
data = {
    'name': ['sachin', 'virat', 'rohit', 'dhoni', 'rahul', 'yuvraj'],
    'runs': [1000, 900, 800, 700, 500, 700],
    'sixes': [100, 120, 200, 160, 90, 120],
}
df = pd.DataFrame(data)

### ✨ Step 1: The `melt()` Magic - From Wide to Long! ✨
The `melt()` function transforms a DataFrame from wide format to long format. Here we keep `name` as the identifier and 'unpivot' the `runs` and `sixes` columns: the old column names go into a new `performance` column, and their values go into a new `score` column.

In [2]:
# Keep 'name' fixed; stack 'runs' and 'sixes' into performance/score pairs.
df1 = df.melt(id_vars='name', value_vars=['runs', 'sixes'], var_name='performance', value_name='score')

Look at the transformed data! Each player now has two rows, one for runs and one for sixes, so the frame has 12 rows in total (6 players × 2 metrics). Much tidier!

In [3]:
df1

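|    | name   | performance | score |
|---:|:-------|:------------|------:|
|  0 | sachin | runs        |  1000 |
|  1 | virat  | runs        |   900 |
|  2 | rohit  | runs        |   800 |
|  3 | dhoni  | runs        |   700 |
|  4 | rahul  | runs        |   500 |
|  5 | yuvraj | runs        |   700 |
|  6 | sachin | sixes       |   100 |
|  7 | virat  | sixes       |   120 |
|  8 | rohit  | sixes       |   200 |
|  9 | dhoni  | sixes       |   160 |
| 10 | rahul  | sixes       |    90 |
| 11 | yuvraj | sixes       |   120 |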
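To see each player's two rows next to each other, one option is to sort the melted frame by `name`. The sketch below rebuilds the same data so that it runs on its own; `long_df` and `by_player` are illustrative names and are not part of the original notebook.

```python
import pandas as pd

# Same data as the tutorial's first cell.
df = pd.DataFrame({
    'name': ['sachin', 'virat', 'rohit', 'dhoni', 'rahul', 'yuvraj'],
    'runs': [1000, 900, 800, 700, 500, 700],
    'sixes': [100, 120, 200, 160, 90, 120],
})

long_df = df.melt(id_vars='name', value_vars=['runs', 'sixes'],
                  var_name='performance', value_name='score')

# One row per (player, metric) pair: 6 players x 2 metrics.
n_rows = len(long_df)

# A stable sort keeps 'runs' ahead of 'sixes' within each player,
# because that is the order melt() produced them in.
by_player = long_df.sort_values('name', kind='stable', ignore_index=True)
print(by_player.head(4))
```

The long layout is usually the more convenient one for grouping and plotting, since the metric becomes an ordinary column that can be filtered or grouped on.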


### ✨ Step 2: The `pivot()` Power-Up - Back to Wide! ✨
`pivot()` is the inverse of `melt()`: it takes long-format data and makes it wide. Player names become the index and each distinct value in `performance` becomes a column, which recreates our original layout. Note that the rows come back sorted alphabetically by name.

In [4]:
df1.pivot(index='name', columns='performance', values='score')

| performance | runs | sixes |
|:------------|-----:|------:|
| **name**    |      |       |
| dhoni       |  700 |   160 |
| rahul       |  500 |    90 |
| rohit       |  800 |   200 |
| sachin      | 1000 |   100 |
| virat       |  900 |   120 |
| yuvraj      |  700 |   120 |
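The pivoted result is not quite identical to the frame we started with: `name` is now the index, the column axis carries the label `performance`, and the rows are sorted by name. The sketch below shows one way to undo those three differences, assuming we want the exact original layout back. It also demonstrates that `pivot()` raises a `ValueError` when the same index/column pair appears twice, in which case `pivot_table()` with an aggregation function is one alternative. The names `restored`, `dupes`, and `aggregated` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['sachin', 'virat', 'rohit', 'dhoni', 'rahul', 'yuvraj'],
    'runs': [1000, 900, 800, 700, 500, 700],
    'sixes': [100, 120, 200, 160, 90, 120],
})
long_df = df.melt(id_vars='name', value_vars=['runs', 'sixes'],
                  var_name='performance', value_name='score')

wide = long_df.pivot(index='name', columns='performance', values='score')

restored = (
    wide.reindex(df['name'].tolist())   # back to the original row order
        .reset_index()                  # 'name' becomes a regular column again
        .rename_axis(columns=None)      # drop the 'performance' axis label
)
print(restored)

# pivot() needs each (index, column) pair to be unique.
dupes = pd.concat([long_df, long_df.iloc[[0]]], ignore_index=True)
pivot_failed = False
try:
    dupes.pivot(index='name', columns='performance', values='score')
except ValueError as exc:
    pivot_failed = True
    print('pivot failed:', exc)

# pivot_table() resolves repeated pairs with an aggregation (here: max).
aggregated = dupes.pivot_table(index='name', columns='performance',
                               values='score', aggfunc='max')
print(aggregated)
```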


### 🧹 Bonus Round 1: Cleaning Up Duplicates! 🧹
Let's create a new DataFrame that contains some duplicate rows and see how easy it is to clean them up with `drop_duplicates()`.

In [5]:
data2 = {
    'Name': ['Sachin', 'Sneha', 'Amit', 'Sneha', 'Rahul', 'Amit'],
    'Age': [23, 25, 30, 25, 22, 30],
    'City': ['Pune', 'Delhi', 'Mumbai', 'Delhi', 'Pune', 'Mumbai']
}
sm = pd.DataFrame(data2)

`duplicated()` marks every row that repeats an earlier row, so summing the resulting booleans counts the duplicates. The second `Sneha` and `Amit` rows repeat earlier ones, so we expect a count of 2.

In [6]:
sm.duplicated().sum()

np.int64(2)

And... poof! They're gone. `drop_duplicates()` keeps the first occurrence of each row, so rows 3 and 5 are dropped and the remaining rows keep their original index labels (0, 1, 2, 4).

In [7]:
new = sm.drop_duplicates()
new

|    | Name   | Age | City   |
|---:|:-------|----:|:-------|
|  0 | Sachin |  23 | Pune   |
|  1 | Sneha  |  25 | Delhi  |
|  2 | Amit   |  30 | Mumbai |
|  4 | Rahul  |  22 | Pune   |
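`drop_duplicates()` has a few options that are handy in practice. The sketch below, which rebuilds the same data so it runs standalone, shows how to inspect the duplicated rows first, how `ignore_index=True` renumbers the result, and how `subset` and `keep` change what counts as a duplicate. The variable names `mask`, `clean`, and `last_per_city` are illustrative only.

```python
import pandas as pd

sm = pd.DataFrame({
    'Name': ['Sachin', 'Sneha', 'Amit', 'Sneha', 'Rahul', 'Amit'],
    'Age': [23, 25, 30, 25, 22, 30],
    'City': ['Pune', 'Delhi', 'Mumbai', 'Delhi', 'Pune', 'Mumbai'],
})

# True for every row that repeats an earlier row (the first copy stays False).
mask = sm.duplicated()
print(sm[mask])

# ignore_index=True renumbers the result 0..n-1 instead of keeping 0, 1, 2, 4.
clean = sm.drop_duplicates(ignore_index=True)
print(clean)

# subset limits the comparison to chosen columns; keep='last' retains the
# final occurrence instead of the first.
last_per_city = sm.drop_duplicates(subset='City', keep='last')
print(last_per_city)
```

Looking at `sm[mask]` before dropping anything is a cheap way to confirm that the rows about to disappear really are redundant.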


### 📧 Bonus Round 2: Mastering String and Email Operations! 📧
Now, let's play with email addresses. We'll extract the domain from each email using `.str.extract()`, a regex-based tool that returns whatever the capture group matches. The pattern `@(.*)` captures everything after the `@`.

In [8]:
data3 = {
    'Name': ['Sachin', 'Sneha', 'Amit', 'Riya', 'Ravi', 'Suresh'],
    'Email': ['sachin@gmail.com', 'sneha@yahoo.com', 'amit@gmail.com', 
              'riya@hotmail.com', 'ravi@yahoo.com', 'suresh@gmail.com']
}
rk = pd.DataFrame(data3)
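# Raw-string regex; expand=False returns a Series rather than a one-column DataFrame.
rk['domain'] = rk['Email'].str.extract(r'@(.*)', expand=False)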
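There is more than one way to pull the domain out of an address. The sketch below compares the regex approach with a plain `str.split`, and shows how named capture groups can return the user and domain parts together. It assumes every address contains exactly one `@`; the names `emails`, `via_regex`, `via_split`, and `parts` are hypothetical.

```python
import pandas as pd

emails = pd.Series(['sachin@gmail.com', 'sneha@yahoo.com', 'amit@gmail.com',
                    'riya@hotmail.com', 'ravi@yahoo.com', 'suresh@gmail.com'])

# Regex capture group: everything after the '@'.
via_regex = emails.str.extract(r'@(.*)', expand=False)

# No regex needed: split on '@' and take the last piece.
via_split = emails.str.split('@').str[-1]

# Named groups become column names, giving both parts in one DataFrame.
parts = emails.str.extract(r'^(?P<user>[^@]+)@(?P<domain>.+)$')
print(parts)
```

For well-formed addresses the first two approaches give the same result; the regex version is easier to tighten later if the pattern needs to reject malformed input.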

Let's create a summary table showing the count and percentage of each email domain. `value_counts()` gives the counts, and `normalize=True` turns them into fractions, which we multiply by 100. Super useful!

In [9]:
rk_counts = rk['domain'].value_counts()
domain_percentage = rk['domain'].value_counts(normalize=True) * 100
summary = pd.DataFrame({'Count': rk_counts, 'Percentage': domain_percentage.round(2)})
print(summary)

             Count  Percentage
domain                        
gmail.com        3       50.00
yahoo.com        2       33.33
hotmail.com      1       16.67
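As a variation, the percentages can be derived from the counts themselves, which avoids calling `value_counts()` twice. This is only a sketch: the domains are typed in directly so the block runs on its own, and `counts`, `share`, and `total` are illustrative names.

```python
import pandas as pd

# Domains as extracted in the previous step.
domains = pd.Series(
    ['gmail.com', 'yahoo.com', 'gmail.com', 'hotmail.com', 'yahoo.com', 'gmail.com'],
    name='domain',
)

counts = domains.value_counts()
# Each count divided by the total, expressed as a percentage.
share = (counts / counts.sum() * 100).round(2)

summary = pd.DataFrame({'Count': counts, 'Percentage': share})
print(summary)

# Sanity check: the rounded percentages should add up to roughly 100.
total = summary['Percentage'].sum()
print(total)
```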


# 🎉 Congratulations! 🎉
---
You've completed the 5-day Pandas challenge! You've learned a ton, from basic filtering to advanced transformations. Keep practicing!