In [23]:
import pandas as pd
!gdown 1Rfubz9-Cn-WXL7qMd79fwNN4vI1TidB7
df=pd.read_csv('/content/netlix_class' , index_col=0)

Downloading...
From: https://drive.google.com/uc?id=1Rfubz9-Cn-WXL7qMd79fwNN4vI1TidB7
To: /content/netlix_class
  0% 0.00/5.07k [00:00<?, ?B/s]100% 5.07k/5.07k [00:00<00:00, 15.9MB/s]


#MAP, APPLY AND APPLYMAP

**pandas: `map`, `apply`, and `applymap`**
- These three methods are used to apply functions to pandas data structures, but they differ in **where** and **how** they operate.

---

---

## `map`

- **Applies to:** `Series` only.
- **Purpose:** Used to map values in a Series according to an input correspondence like a dictionary, Series, or function.
- **Behavior:** Transforms each element of the Series individually based on the mapping or function.
- **Use case:** Value substitution or element-wise transformation in a Series.

---

## `apply`

- **Applies to:** Both `Series` and `DataFrame`.
- **Purpose:** Applies a function along a specified axis of a DataFrame (rows or columns) or applies a function element-wise on a Series.
- **Behavior:** For DataFrames, the function is applied to each row or column (depending on the axis). For Series, it applies to the entire Series.
- **Use case:** Aggregations, row/column-wise transformations, or applying complex functions to DataFrame rows/columns or Series.

---

## `applymap`

- **Applies to:** `DataFrame` only.
- **Purpose:** Applies a function element-wise to every single value in the DataFrame.
- **Behavior:** Transforms each individual element in the DataFrame independently.
- **Use case:** Element-wise transformations that need to be applied to all cells of a DataFrame.

---

# Summary

| Method    | Applicable To       | Operation Level              | Primary Purpose                       |
|-----------|--------------------|------------------------------|-------------------------------------|
| `map`     | Series             | Element-wise                  | Value mapping/substitution           |
| `apply`   | Series, DataFrame   | Function applied on axis or entire Series | Aggregation, row/column-wise function application |
| `applymap`| DataFrame          | Element-wise                  | Element-wise transformation of DataFrame |



In [24]:
df

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type
0,Willow,Lucifer,Action,65,3.6,52.9,Basic
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard
...,...,...,...,...,...,...,...
95,Nora,13 Reasons Why,Action,34,4.7,28.3,Premium
96,Lisa,Ozark,Action,21,3.1,15.6,Standard
97,Ava,Bridgerton,Romance,70,4.1,44.5,Basic
98,Jake,The Office,Thriller,20,3.0,16.9,Standard


##**MAP**

In [25]:
price_map = {'Basic' : 100 , 'Standard' : 150 , 'Premium' : 200 }

In [26]:
df['Monthly_cost'] = df['subscription_type'].map(price_map)

In [27]:
df

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,Monthly_cost
0,Willow,Lucifer,Action,65,3.6,52.9,Basic,100
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,200
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,150
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,200
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,150
...,...,...,...,...,...,...,...,...
95,Nora,13 Reasons Why,Action,34,4.7,28.3,Premium,200
96,Lisa,Ozark,Action,21,3.1,15.6,Standard,150
97,Ava,Bridgerton,Romance,70,4.1,44.5,Basic,100
98,Jake,The Office,Thriller,20,3.0,16.9,Standard,150


##**APPLY**

In [28]:
def feelings(rating):
  if rating >= 4.5:
    return 'awsome'
  elif rating >= 4.0 and rating <3.5:
    return'good'
  else:
    return('dissapointing')

In [29]:
df['feelings'] = df['rating_given'].apply(feelings)

In [30]:
df.head()

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,Monthly_cost,feelings
0,Willow,Lucifer,Action,65,3.6,52.9,Basic,100,dissapointing
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,200,dissapointing
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,150,dissapointing
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,200,dissapointing
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,150,awsome


In [31]:
def binge_status(row):
  hours = row['hours_watched']
  episodes = row['episodes_watched']

  if hours > 40 and episodes > 40 :
    return 'Super binge watcher'
  elif hours >25 or episodes >30 :
    return 'Casual binge watcher'
  else:
    return 'Light watcher'


In [32]:
df['binge_status'] = df.apply(binge_status , axis = 1)

In [33]:
df

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,Monthly_cost,feelings,binge_status
0,Willow,Lucifer,Action,65,3.6,52.9,Basic,100,dissapointing,Super binge watcher
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,200,dissapointing,Light watcher
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,150,dissapointing,Super binge watcher
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,200,dissapointing,Super binge watcher
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,150,awsome,Super binge watcher
...,...,...,...,...,...,...,...,...,...,...
95,Nora,13 Reasons Why,Action,34,4.7,28.3,Premium,200,awsome,Casual binge watcher
96,Lisa,Ozark,Action,21,3.1,15.6,Standard,150,dissapointing,Light watcher
97,Ava,Bridgerton,Romance,70,4.1,44.5,Basic,100,dissapointing,Super binge watcher
98,Jake,The Office,Thriller,20,3.0,16.9,Standard,150,dissapointing,Light watcher


In [34]:
#Calculate money spent per hour
def money(row):
  monthly_cost = row['Monthly_cost']
  hours_watched = row['hours_watched']

  if hours_watched > 0:
    cost = monthly_cost/ hours_watched
    return round(cost,2)
  else:
    return 0

In [35]:
df['cost_per_hour'] = df.apply(money , axis = 1)

In [36]:
df.head()

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,Monthly_cost,feelings,binge_status,cost_per_hour
0,Willow,Lucifer,Action,65,3.6,52.9,Basic,100,dissapointing,Super binge watcher,1.89
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,200,dissapointing,Light watcher,8.4
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,150,dissapointing,Super binge watcher,2.05
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,200,dissapointing,Super binge watcher,4.73
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,150,awsome,Super binge watcher,3.03


#Deep copy vs Shallow copy

## 🧠 Understanding Shallow Copy vs Deep Copy in Python

When we want to make a copy of an object like a list, dictionary, or DataFrame, it's important to know the difference between **shallow copy** and **deep copy**.

---

## 1️⃣ What is Shallow Copy?

- A **shallow copy** creates a new object, but it **shares the inner (nested) objects** with the original.
- Changes made to the nested data in the original will **also affect** the copied object.

### 🔍 Example:
```python
import copy

a = [1, 2, [3, 4]]
b = copy.copy(a)

a[2][0] = 99

print("Original:", a)  # [1, 2, [99, 4]]
print("Shallow Copy:", b)  # [1, 2, [99, 4]]


## 🧬  What is Deep Copy?

A **deep copy** creates a new object.

It also creates **completely new copies of all nested objects**.  
The original and the copy are **100% independent**.

### 📌 Example:
```python
import copy

a = [1, 2, [3, 4]]
b = copy.deepcopy(a)  # deep copy

a[2][0] = 99

print("Original:", a)  # [1, 2, [99, 4]]
print("Deep Copy:", b)  # [1, 2, [3, 4]]


**🔄  Key Differences Between Shallow Copy and Deep Copy**

| Feature                 | Shallow Copy (`copy.copy()`) | Deep Copy (`copy.deepcopy()`) |
| ----------------------- | ---------------------------- | ----------------------------- |
| New outer object        | ✅ Yes                        | ✅ Yes                         |
| Inner objects shared?   | ✅ Yes                        | ❌ No                          |
| Completely independent? | ❌ No                         | ✅ Yes                         |
| Use case                | Simple structures, faster    | Nested structures, safer      |


**Shallow Copy - Like a Linked Copy 🔗**

In [37]:
print("📋 Creating copies of our Netflix data...")

# Make a shallow copy (like a shortcut on your computer)
netflix_shallow = df.copy(deep=False)

print("Original data rows:", len(df))
print("Shallow copy rows:", len(netflix_shallow))

# They look the same but...
print("Are they the exact same object?", df is netflix_shallow)

📋 Creating copies of our Netflix data...
Original data rows: 100
Shallow copy rows: 100
Are they the exact same object? False


**Deep Copy - Like a Real Photocopy 📄**

In [38]:
# Make a deep copy (like a real photocopy)
netflix_deep = df.copy(deep=True)

print("Deep copy rows:", len(netflix_deep))
print("Are original and deep copy the same?", df is netflix_deep)

Deep copy rows: 100
Are original and deep copy the same? False


**See the Difference**

In [39]:
print("\n🧪 Testing the difference...")

# Let's modify each copy
print("Original Alex's hours:", df.loc[0, 'hours_watched'])

# Change shallow copy
netflix_shallow.loc[0, 'hours_watched'] = 999
print("After changing shallow copy - Original Alex's hours:", df.loc[0, 'hours_watched'])

# Change deep copy
netflix_deep.loc[0, 'hours_watched'] = 777
print("After changing deep copy - Original Alex's hours:", df.loc[0, 'hours_watched'])

# Reset for clarity
netflix_shallow.loc[0, 'hours_watched'] = 20
netflix_deep.loc[0, 'hours_watched'] = 20


🧪 Testing the difference...
Original Alex's hours: 52.9
After changing shallow copy - Original Alex's hours: 999.0
After changing deep copy - Original Alex's hours: 999.0


In [40]:
shallow_df = df.copy(deep = False)
shallow_df

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,Monthly_cost,feelings,binge_status,cost_per_hour
0,Willow,Lucifer,Action,65,3.6,20.0,Basic,100,dissapointing,Super binge watcher,1.89
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,200,dissapointing,Light watcher,8.40
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,150,dissapointing,Super binge watcher,2.05
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,200,dissapointing,Super binge watcher,4.73
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,150,awsome,Super binge watcher,3.03
...,...,...,...,...,...,...,...,...,...,...,...
95,Nora,13 Reasons Why,Action,34,4.7,28.3,Premium,200,awsome,Casual binge watcher,7.07
96,Lisa,Ozark,Action,21,3.1,15.6,Standard,150,dissapointing,Light watcher,9.62
97,Ava,Bridgerton,Romance,70,4.1,44.5,Basic,100,dissapointing,Super binge watcher,2.25
98,Jake,The Office,Thriller,20,3.0,16.9,Standard,150,dissapointing,Light watcher,8.88


In [41]:
df.loc[0 , 'user_name'] ='syed'

In [42]:
df=pd.read_csv('/content/netlix_class' , index_col=0)

In [43]:
df

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type
0,Willow,Lucifer,Action,65,3.6,52.9,Basic
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard
...,...,...,...,...,...,...,...
95,Nora,13 Reasons Why,Action,34,4.7,28.3,Premium
96,Lisa,Ozark,Action,21,3.1,15.6,Standard
97,Ava,Bridgerton,Romance,70,4.1,44.5,Basic
98,Jake,The Office,Thriller,20,3.0,16.9,Standard


#**Custom Functions in Python**

##**What are Custom Functions? 🤔**
- Think of custom functions like recipes - you write them once, then use them whenever you need that specific "dish" (result). In pandas, custom functions help you:

- Do calculations that pandas doesn't have built-in
Apply complex logic to your data
Keep your code clean and reusable


###**Basic Custom Functions - Your First Recipe! 👨‍🍳**
**Simple Function (One Input, One Output)**

- Remember our example we have done earlier That would be a perfect for understanding


In [45]:
def feelings(rating):
  if rating >= 4.5:
    return 'awsome'
  elif rating >= 4.0 and rating <3.5:
    return'good'
  else:
    return('dissapointing')

In [46]:
df['feelings'] = df['rating_given'].apply(feelings)

**Functions Using Multiple Columns 🔧**

In [47]:
def binge_status(row):
  hours = row['hours_watched']
  episodes = row['episodes_watched']

  if hours > 40 and episodes > 40 :
    return 'Super binge watcher'
  elif hours >25 or episodes >30 :
    return 'Casual binge watcher'
  else:
    return 'Light watcher'

In [48]:
df['binge_status'] = df.apply(binge_status , axis = 1)

**Summary**:
When to Use Custom Functions 📋
**Use Custom Functions When**:
- ✅ Complex Logic: Multiple conditions, calculations
- ✅ Multiple Columns: Need to look at several columns together
- ✅ Business Rules: Specific domain logic (like our Netflix recommendations)
- ✅ Reusability: Same logic needed multiple times
- ✅ Readability: Makes code cleaner and more understandable

**Don't Use Custom Functions When**:
- ❌ Simple Operations: Basic math, single conditions
- ❌ Built-in Methods: Pandas already has the function
- ❌ Performance Critical: Simple operations are faster with vectorization

Practise Questions:
- Write a function that categorizes users as "New" (<10 episodes) or "Experienced" (≥10 episodes)
- Create a function that calculates "value score" (rating × hours / cost)
- Write a function to calculate the value score to estimate how much value users are getting from their subscription
    - `value_score = (hours_watched × rating_given) / subscription_weight`
  
  *Use these weights:*

      - Basic: 1.0

      - Standard: 0.8

      - Premium: 0.6

- Write a function to create a new column 'genre_engagement' based on how much a user liked a show in a specific genre

      | Condition                           | Label          |
      | ----------------------------------- | -------------- |
      | Genre = Drama **and** rating ≥ 4.0  | Drama Fan      |
      | Genre = Comedy **and** rating ≥ 4.0 | Comedy Fan     |
      | rating\_given < 3.0                 | Not Engaged    |
      | All other cases                     | Neutral Viewer |


#Binning

- Binning, also known as bucketing, is a data preprocessing technique used to group continuous numerical data into discrete intervals or bins. This process transforms numerical data into categorical data, which can be useful for various analysis and modeling purposes.

**Problem** : Creating a new column watchtime bin int the netflix data DataFrame that categorizes users based on their hours_watched:

    | Hours Watched Range | Label       |
    | ------------------- | ----------- |
    | 0 – 15              | "Very Low"  |
    | 15 – 30             | "Low"       |
    | 30 – 45             | "Moderate"  |
    | 45 – 60             | "High"      |
    | 60+                 | "Very High" |


In [49]:
bins = [0,15,30,45,60,float('inf')]
labels = ['Very Low' , 'low' , 'Moderate' , 'High' , 'Very High']

In [52]:
df['watch_time_bin'] = pd.cut(df['hours_watched'] , bins = bins , labels = labels)

In [53]:
df.head()

Unnamed: 0,user_name,show_name,genre,episodes_watched,rating_given,hours_watched,subscription_type,feelings,binge_status,watch_time_bin
0,Willow,Lucifer,Action,65,3.6,52.9,Basic,dissapointing,Super binge watcher,High
1,Asher,The Good Place,Romance,28,2.6,23.8,Premium,dissapointing,Light watcher,low
2,Lily,Sweet Magnolias,Drama,68,4.0,73.3,Standard,dissapointing,Super binge watcher,Very High
3,Forest,Chef's Table,Animation,64,3.6,42.3,Premium,dissapointing,Super binge watcher,Moderate
4,Carter,Outer Banks,Documentary,68,4.6,49.5,Standard,awsome,Super binge watcher,High


---

### 📄 pandas Documentation

[pandas Official Documentation](https://pandas.pydata.org/docs/)

---

### 🔗 `map` (Series method)

[pandas.Series.map documentation](https://pandas.pydata.org/docs/reference/api/pandas.Series.map.html)

---

### 🔗 `apply` (Series/DataFrame method)

[pandas.DataFrame.apply documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html)

---

### 🔗 `applymap` (DataFrame method)

[pandas.DataFrame.applymap documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.applymap.html)

---