## 1. NumPy 

### `where()`

**Scenario:** Imagine you're a school teacher and you have a list of student scores. You want to give a "Pass" or "Fail" grade based on whether the score is above 50.

**Real-world analogy:** It's like separating apples into two baskets: one for the ripe ones (Pass) and another for the unripe ones (Fail).

**Function:** `where()` does this element by element. You give it a condition (`scores > 50`), a value to use where the condition is true, and a value to use where it is false.


In [1]:
import numpy as np

# Sample scores
scores = np.array([45, 55, 65, 40, 90, 52, 48])

# Label each score "Pass" if it is greater than 50, otherwise "Fail"
grades = np.where(scores > 50, "Pass", "Fail")
grades

array(['Fail', 'Pass', 'Pass', 'Fail', 'Pass', 'Pass', 'Fail'],
      dtype='<U4')
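
`where()` handles a two-way split. For more than two labels, one option is `np.select()`, which checks a list of conditions in order. The sketch below is only an illustration: the "Distinction" band and its cutoff of 80 are assumptions, not part of the original grading rule.

```python
import numpy as np

scores = np.array([45, 55, 65, 40, 90, 52, 48])

# Conditions are checked in order; the first one that matches wins.
# The 80-mark "Distinction" cutoff is a hypothetical extra band.
conditions = [scores >= 80, scores > 50]
labels = ["Distinction", "Pass"]

# Scores that match no condition get the default label.
bands = np.select(conditions, labels, default="Fail")
print(bands)
```

Because the conditions are evaluated in order, the stricter test (`scores >= 80`) has to come before the looser one (`scores > 50`).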

## 2. Pandas

### `rename()`

**Scenario:** You have a list of student names, but instead of "Full Name", the column is labeled "Name".

**Real-world analogy:** It's like having a jar labeled "Sugar" when you want it to say "White Sugar".

**Function:** `rename()` lets you change column labels.

In [2]:
import pandas as pd

# Sample data: one row per student, with marks per subject and the total 'Score' stored as text
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Math': [85, 90, 30],
    'English': [88, 82, 32],
    'Science': [80, 85, 31],
    'Score': ['253', '257', '93'],
    'Favorite Color': ['Red', 'Blue', 'Green'],
    'Gender': ['Female', 'Male', 'Male'],
    'Result': ['Pass', 'Pass', 'Fail']
}
df = pd.DataFrame(data)
df

,Name,Math,English,Science,Score,Favorite Color,Gender,Result
0,Alice,85,88,80,253,Red,Female,Pass
1,Bob,90,82,85,257,Blue,Male,Pass
2,Charlie,30,32,31,93,Green,Male,Fail


In [3]:
# a) rename() "Name" ==> "Full Name"
df = df.rename(columns={'Name': 'Full Name'})
df

,Full Name,Math,English,Science,Score,Favorite Color,Gender,Result
0,Alice,85,88,80,253,Red,Female,Pass
1,Bob,90,82,85,257,Blue,Male,Pass
2,Charlie,30,32,31,93,Green,Male,Fail
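
One detail worth knowing about `rename()`: a key that matches no column is silently ignored by default, so a typo changes nothing and raises no error. The sketch below uses a small stand-in frame (`df_demo` and the misspelled key `'Nmae'` are made up for illustration) to show this, along with renaming several columns at once and the `errors="raise"` option.

```python
import pandas as pd

# Small stand-in frame, used only for this illustration.
df_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Favorite Color": ["Red", "Blue"]})

# Several columns can be renamed in one call by adding more dictionary entries.
renamed = df_demo.rename(columns={"Name": "Full Name", "Favorite Color": "Colour"})
print(list(renamed.columns))

# A key that matches no column ('Nmae' is a deliberate typo) is ignored by default.
unchanged = df_demo.rename(columns={"Nmae": "Full Name"})
print(list(unchanged.columns))

# errors="raise" turns the silent typo into a KeyError.
typo_caught = False
try:
    df_demo.rename(columns={"Nmae": "Full Name"}, errors="raise")
except KeyError:
    typo_caught = True
print("Typo caught:", typo_caught)
```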


### `apply()`

**Scenario:** You want to give a bonus 5 marks to every student in your list.

**Real-world analogy:** It's like adding a little extra scoop of ice cream to each cone.

**Function:** `apply()` lets you apply a function (like adding 5) to each item in a column.

In [4]:
# b) apply()
df['Score'] = df['Score'].astype(int)  # Convert scores to integer
df['Score'] = df['Score'].apply(lambda x: x + 5)
print('5 is added to each score value')
df

5 is added to each score value


,Full Name,Math,English,Science,Score,Favorite Color,Gender,Result
0,Alice,85,88,80,258,Red,Female,Pass
1,Bob,90,82,85,262,Blue,Male,Pass
2,Charlie,30,32,31,98,Green,Male,Fail
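
For simple arithmetic like this, `apply()` is not strictly needed: pandas can add a number to a whole column in one step. The sketch below compares the two approaches on a stand-in Series (the name `scores` is chosen only for this example). `apply()` is more useful when the per-item logic cannot be written as a single column operation.

```python
import pandas as pd

# Stand-in for the Score column after conversion to integers.
scores = pd.Series([253, 257, 93], name="Score")

# apply() calls the lambda once for every element.
with_apply = scores.apply(lambda x: x + 5)

# The same result as one operation on the whole column.
vectorized = scores + 5

print(with_apply.equals(vectorized))
```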


### `astype()`

**Scenario:** Your student scores are stored as text, but you want them as numbers to calculate the average.

**Real-world analogy:** It's like converting your shoe size from European to US format.

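**Function:** `astype()` lets you change the data type of a column. The `Score` column started out as text and was already converted to integers with `astype(int)` in the `apply()` cell, so the cell below converts it back to text to show the dtype changing.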

In [5]:
print('Score datatype:', df['Score'].dtypes)

Score datatype: int64


In [6]:
# c) astype()

df['Score'] = df['Score'].astype(str)
print('Score datatype as str:', df['Score'].dtypes)

df

Score datatype as str: object


,Full Name,Math,English,Science,Score,Favorite Color,Gender,Result
0,Alice,85,88,80,258,Red,Female,Pass
1,Bob,90,82,85,262,Blue,Male,Pass
2,Charlie,30,32,31,98,Green,Male,Fail
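
The scenario above goes from text to numbers so that an average can be calculated. A minimal sketch of that direction follows, using the original text scores. The second part is an assumption about messier data: the value `"absent"` is invented to show that `astype(int)` would fail on it, while `pd.to_numeric(..., errors="coerce")` turns it into a missing value instead.

```python
import pandas as pd

# The scores as they were originally stored: text, not numbers.
text_scores = pd.Series(["253", "257", "93"], name="Score")

# Convert to integers, then the average can be computed.
numeric_scores = text_scores.astype(int)
average = numeric_scores.mean()
print(average)  # 201.0

# Hypothetical messy column: "absent" cannot be converted by astype(int).
dirty = pd.Series(["253", "absent", "93"])

# errors="coerce" replaces unparseable entries with NaN, which mean() skips.
cleaned = pd.to_numeric(dirty, errors="coerce")
print(cleaned.mean())  # 173.0
```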


### `get_dummies()`

**Scenario:** You have a column for each student's favorite color, and you want a separate column for each color, with a 1 or 0 indicating whether that student chose it.

**Real-world analogy:** It's like having separate baskets for each type of fruit.

**Function:** `get_dummies()` converts categorical data into a format called "one-hot encoding".

In [7]:
# d) get_dummies()

# dtype=int gives the 1/0 output shown below; pandas 2.0+ returns True/False by default
color_dummies = pd.get_dummies(df['Favorite Color'], dtype=int)
color_dummies

,Blue,Green,Red
0,0,0,1
1,1,0,0
2,0,1,0
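
The dummy columns usually need to be attached back to the original table. One way to do that, sketched below on a stand-in frame, is to give the new columns a prefix and join them with `pd.concat()`. The prefix `"Color"` and the name `df_demo` are choices made for this example.

```python
import pandas as pd

# Stand-in frame with just the columns needed for this illustration.
df_demo = pd.DataFrame({
    "Full Name": ["Alice", "Bob", "Charlie"],
    "Favorite Color": ["Red", "Blue", "Green"],
})

# prefix labels the new columns so their origin stays clear.
dummies = pd.get_dummies(df_demo["Favorite Color"], prefix="Color", dtype=int)

# axis=1 places the dummy columns side by side with the original ones.
combined = pd.concat([df_demo, dummies], axis=1)
print(combined)
```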


### `crosstab()`

**Scenario:** You want to see how many male and female students passed or failed.

**Real-world analogy:** It's like counting how many red and green apples you have.

**Function:** `crosstab()` gives you a table showing the frequency of combinations.

In [8]:
# e) crosstab()

tab = pd.crosstab(df['Gender'], df['Result'])
tab

Result,Fail,Pass
Gender,,
Female,0,1
Male,1,1
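
If row and column totals are also wanted, `crosstab()` accepts `margins=True`. The sketch below rebuilds only the two relevant columns in a stand-in frame. The label `"Total"` is a choice for this example (the pandas default label is `"All"`).

```python
import pandas as pd

# Stand-in frame with the same Gender and Result values as the sample data.
df_demo = pd.DataFrame({
    "Gender": ["Female", "Male", "Male"],
    "Result": ["Pass", "Pass", "Fail"],
})

# margins=True appends a totals row and a totals column.
tab_totals = pd.crosstab(
    df_demo["Gender"], df_demo["Result"], margins=True, margins_name="Total"
)
print(tab_totals)
```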


### `sort_values()`

**Scenario:** You want to arrange student scores from highest to lowest.

**Real-world analogy:** It's like arranging books on a shelf from tallest to shortest.

**Function:** `sort_values()` lets you sort a DataFrame based on a column.

In [9]:
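# f) sort_values()
# Score is stored as text after the astype(str) cell, so sort on its integer value;
# sorting the raw strings would place '98' ahead of '262'.
sorted_df = df.sort_values(by='Score', ascending=False, key=lambda s: s.astype(int))
sorted_df

,Full Name,Math,English,Science,Score,Favorite Color,Gender,Result
1,Bob,90,82,85,262,Blue,Male,Pass
0,Alice,85,88,80,258,Red,Female,Pass
2,Charlie,30,32,31,98,Green,Male,Fail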
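
Sorting depends on the column's data type. Text is compared character by character, so `"98"` ranks above `"262"` because `"9"` comes after `"2"`. The sketch below shows the difference on a stand-in Series indexed by student name (the variable names are chosen for this example).

```python
import pandas as pd

# The scores stored as text, indexed by student name for readability.
scores_text = pd.Series(["258", "262", "98"], index=["Alice", "Bob", "Charlie"])

# Text sort: compares character by character, so "98" comes first.
text_order = scores_text.sort_values(ascending=False)
print(list(text_order.index))     # ['Charlie', 'Bob', 'Alice']

# Numeric sort: convert to integers before sorting.
numeric_order = scores_text.astype(int).sort_values(ascending=False)
print(list(numeric_order.index))  # ['Bob', 'Alice', 'Charlie']
```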


### `melt()`

**Scenario:** You have separate columns for Math, English, and Science scores. You want them in two columns: "Subject" (which subject) and "Total Scores" (the mark in that subject).

**Real-world analogy:** It's like melting different chocolates into one pot.

**Function:** `melt()` reshapes data from wide format to long format.

In [10]:
# g) melt()
melted_df = pd.melt(
    df,
    id_vars=['Full Name', 'Gender', 'Result'],     # columns to keep as they are
    value_vars=['Math', 'English', 'Science'],     # columns to stack into rows
    var_name='Subject',
    value_name='Total Scores',
)
melted_df

,Full Name,Gender,Result,Subject,Total Scores
0,Alice,Female,Pass,Math,85
1,Bob,Male,Pass,Math,90
2,Charlie,Male,Fail,Math,30
3,Alice,Female,Pass,English,88
4,Bob,Male,Pass,English,82
5,Charlie,Male,Fail,English,32
6,Alice,Female,Pass,Science,80
7,Bob,Male,Pass,Science,85
8,Charlie,Male,Fail,Science,31
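
One reason to prefer the long format is that per-subject summaries become a single `groupby()`. The sketch below is an illustration on a stand-in frame with the same marks. The names `wide`, `long_df`, and the value column `"Score"` are choices for this example.

```python
import pandas as pd

# Stand-in wide table with the same marks as the sample data.
wide = pd.DataFrame({
    "Full Name": ["Alice", "Bob", "Charlie"],
    "Math": [85, 90, 30],
    "English": [88, 82, 32],
    "Science": [80, 85, 31],
})

# With value_vars omitted, every column not listed in id_vars is melted.
long_df = wide.melt(id_vars="Full Name", var_name="Subject", value_name="Score")

# In long format, the average mark per subject is one groupby away.
subject_means = long_df.groupby("Subject")["Score"].mean()
print(subject_means.round(2))
```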


### `pivot()`

**Scenario:** The opposite of `melt()`. You have a long list of student scores with "Subject" and "Total Scores" columns. You want a separate column for each subject.

**Real-world analogy:** It's like separating mixed chocolates back into their original bars.

**Function:** `pivot()` reshapes data from long format to wide format.

In [14]:
# h) pivot()

# Using the melted data to pivot back
pivoted_df = melted_df.pivot(index='Full Name', columns='Subject', values='Total Scores')
pivoted_df

Subject,English,Math,Science
Full Name,,,
Alice,88,85,80
Bob,82,90,85
Charlie,32,30,31
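
`pivot()` needs each index and column pair to appear only once. If a student has two rows for the same subject, it raises a `ValueError`. In that case `pivot_table()` can combine the duplicates with an aggregation function. The sketch below uses invented data (Alice's second Math score of 95 is hypothetical) to show both behaviours.

```python
import pandas as pd

# Hypothetical long table where Alice has two Math entries.
long_df = pd.DataFrame({
    "Full Name": ["Alice", "Alice", "Bob"],
    "Subject": ["Math", "Math", "Math"],
    "Score": [85, 95, 90],
})

# pivot() cannot decide which of Alice's two Math scores to keep.
pivot_failed = False
try:
    long_df.pivot(index="Full Name", columns="Subject", values="Score")
except ValueError:
    pivot_failed = True
print("pivot() failed on duplicates:", pivot_failed)

# pivot_table() resolves duplicates by aggregating them, here with the mean.
wide = long_df.pivot_table(
    index="Full Name", columns="Subject", values="Score", aggfunc="mean"
)
print(wide)
```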
