import numpy as np
import pandas as pd

The following **lab-style Pandas questions** are based on the structure of the `netflix.csv` file and cover the main Pandas topics:

---

### ✅ **Pandas Questions on Netflix Dataset**

#### 1. **Basic Exploration**

* Q1. Load the dataset and show the first 5 rows.
* Q2. Print the column names and their data types.
* Q3. How many rows and columns are present in the dataset?
* Q4. Show summary statistics for numeric columns.

#### 2. **Data Selection & Filtering**

* Q5. Display only the titles of all TV Shows.
* Q6. Show all movies released in 2020.
* Q7. List all records where the rating is `'TV-MA'`.

#### 3. **Handling Missing Values**

* Q8. Check for missing values in each column.
* Q9. Fill all missing `director` names with `"Not Available"`.
* Q10. Drop all rows where the `country` is missing.

#### 4. **Sorting & Grouping**

* Q11. Sort the dataset by `release_year` in descending order.
* Q12. Group by `type` and count how many entries of each type exist.
* Q13. Find the most common `rating` using value counts.

#### 5. **Column Operations**

* Q14. Create a new column called `content_category` where:

  * If `type` is "Movie", it's "Film"
  * If `type` is "TV Show", it's "Series"
* Q15. Extract the year from `date_added` into a new column called `added_year`.

#### 6. **Text Matching & String Operations**

* Q16. Display all rows where `title` contains the word `"Love"`.
* Q17. Count how many shows include `"International"` in `listed_in`.

#### 7. **File Export**

* Q18. Save the modified dataframe as `"netflix_cleaned.csv"` using `to_csv()`.

---



## q1

In [4]:
data = pd.read_csv("netflix_titles.csv")
data.head(5)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


In [36]:
columns = data.shape[1]  # data.shape is (rows, columns); index 1 is the column count
print(columns)
data.dtypes  # data type of each column

12


show_id         object
type            object
title           object
director        object
cast            object
country         object
date_added      object
release_year     int64
rating          object
duration        object
listed_in       object
description     object
dtype: object

In [27]:
data.describe()

Unnamed: 0,release_year
count,8807.0
mean,2014.180198
std,8.819312
min,1925.0
25%,2013.0
50%,2017.0
75%,2019.0
max,2021.0
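
Q2 and Q3 also ask for the column names and the row count, which the cells above do not print. A minimal sketch of how both can be read from a DataFrame follows; `sample` is a hypothetical three-row stand-in, not the real Netflix file.

```python
import pandas as pd

# Hypothetical three-row stand-in for the Netflix data (not the real file).
sample = pd.DataFrame({
    "show_id": ["s1", "s2", "s3"],
    "type": ["Movie", "TV Show", "TV Show"],
    "release_year": [2020, 2021, 2021],
})

n_rows, n_cols = sample.shape            # shape is a (rows, columns) tuple
column_names = sample.columns.tolist()   # column labels as a plain Python list

print(f"{n_rows} rows x {n_cols} columns")
print(column_names)
print(sample.dtypes)                     # data type of each column
```

On the real dataset the same calls would be made on `data` instead of `sample`.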


## q2

In [42]:
# Select only the title column for rows whose type is "TV Show".
tv_titles = data.loc[data["type"] == "TV Show", "title"]
print(tv_titles)

1               Blood & Water
2                   Ganglands
3       Jailbirds New Orleans
4                Kota Factory
5               Midnight Mass
                ...          
8795          Yu-Gi-Oh! Arc-V
8796               Yunus Emre
8797                Zak Storm
8800       Zindagi Gulzar Hai
8803              Zombie Dumb
Name: title, Length: 2675, dtype: object

Show all movies released in 2020.

In [46]:
movies = data[
(data['release_year'] == 2020) &
(data['type'] == 'Movie') 
]
print (movies)


     show_id   type                                              title  \
0         s1  Movie                               Dick Johnson Is Dead   
16       s17  Movie  Europe's Most Dangerous Man: Otto Skorzeny in ...   
78       s79  Movie                                     Tughlaq Durbar   
84       s85  Movie                               Omo Ghetto: the Saga   
103     s104  Movie                                     Shadow Parties   
...      ...    ...                                                ...   
3046   s3047  Movie                      All the Freckles in the World   
3060   s3061  Movie                                      Ghost Stories   
5972   s5973  Movie                                   #cats_the_mewvie   
7594   s7595  Movie                 Norm of the North: Family Vacation   
8099   s8100  Movie                                        Straight Up   

                                               director  \
0                                       Kirsten John

In [49]:
ratings = data[
(data['rating']== 'TV-MA') 
]

print (ratings[['title', 'type']])

                       title     type
1              Blood & Water  TV Show
2                  Ganglands  TV Show
3      Jailbirds New Orleans  TV Show
4               Kota Factory  TV Show
5              Midnight Mass  TV Show
...                      ...      ...
8762         Wrong Side Raju    Movie
8769  Y.M.I.: Yeh Mera India    Movie
8788            You Carry Me    Movie
8798                Zed Plus    Movie
8801                 Zinzana    Movie

[3207 rows x 2 columns]


In [52]:
data.isnull().sum()

show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64
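
Raw counts are easier to judge as a share of all rows. The sketch below assumes a small hypothetical frame, `sample`, and turns the missing-value mask into a percentage per column.

```python
import pandas as pd

# Hypothetical four-row frame; half of the director values are missing.
sample = pd.DataFrame({
    "title": ["a", "b", "c", "d"],
    "director": ["X", None, None, "Y"],
})

# isnull() gives a boolean mask; its column mean is the fraction of missing values.
missing_pct = sample.isnull().mean().mul(100).round(1)
print(missing_pct)
```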

In [57]:
data["director"] = data["director"].fillna("Not Available")

In [58]:
data["director"]

0       Kirsten Johnson
1         Not Available
2       Julien Leclercq
3         Not Available
4         Not Available
             ...       
8802      David Fincher
8803      Not Available
8804    Ruben Fleischer
8805       Peter Hewitt
8806        Mozez Singh
Name: director, Length: 8807, dtype: object

Drop all rows where the country is missing.

In [59]:
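# Drop whole rows with a missing country, then show the surviving country column.
# The result is not assigned back, so `data` itself keeps all of its rows.
data.dropna(subset=["country"])["country"]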

0                                           United States
1                                            South Africa
4                                                   India
7       United States, Ghana, Burkina Faso, United Kin...
8                                          United Kingdom
                              ...                        
8801                         United Arab Emirates, Jordan
8802                                        United States
8804                                        United States
8805                                        United States
8806                                                India
Name: country, Length: 7976, dtype: object
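
Calling `dropna()` on a single column returns only that column, while `DataFrame.dropna(subset=...)` removes the whole rows. A small sketch of the difference, using a hypothetical three-row frame named `sample`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing country.
sample = pd.DataFrame({
    "title": ["A", "B", "C"],
    "country": ["India", np.nan, "United States"],
})

country_only = sample["country"].dropna()     # Series: just the non-missing countries
cleaned = sample.dropna(subset=["country"])   # DataFrame: rows with a country, all columns kept

print(country_only)
print(cleaned)
print(len(sample))  # the original frame is unchanged unless the result is assigned back
```

To keep the change, the result would be assigned back, for example `sample = sample.dropna(subset=["country"])`.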

Sort the dataset by release_year in descending order.

In [60]:
data["release_year"].sort_values(ascending = False)

1       2021
2       2021
3       2021
31      2021
30      2021
        ... 
8739    1943
8660    1943
7790    1942
8205    1942
4250    1925
Name: release_year, Length: 8807, dtype: int64

In [63]:
data = data.sort_values(by="release_year", ascending=False)
print (data)

     show_id     type                                          title  \
1380   s1381  TV Show                                      Go Dog Go   
68       s69    Movie                                     Schumacher   
1         s2  TV Show                                  Blood & Water   
2         s3  TV Show                                      Ganglands   
3         s4  TV Show                          Jailbirds New Orleans   
...      ...      ...                                            ...   
8739   s8740    Movie             Why We Fight: The Battle of Russia   
8660   s8661    Movie  Undercover: How to Operate Behind Enemy Lines   
7790   s7791    Movie                                 Prelude to War   
8205   s8206    Movie                           The Battle of Midway   
4250   s4251  TV Show              Pioneers: First Women Filmmakers*   

                                               director  \
1380                                      Not Available   
68    Hanns-Bruno

Group by type and count how many entries of each type exist.

In [69]:
type_count = data.groupby("type").size()
print (type_count)

type
Movie      6132
TV Show    2675
dtype: int64


Find the most common rating using value counts.

In [75]:
common = data["rating"].value_counts().head(1)
common

rating
TV-MA    3207
Name: count, dtype: int64
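
If only the label is needed rather than a one-row Series, `idxmax()` on the counts is one option. A minimal sketch on a hypothetical `ratings` Series:

```python
import pandas as pd

# Hypothetical ratings column with one missing entry.
ratings = pd.Series(["TV-MA", "PG-13", "TV-MA", "TV-14", None], name="rating")

counts = ratings.value_counts()   # sorted descending; missing values are excluded by default
top_rating = counts.idxmax()      # label with the highest count
top_count = int(counts.max())     # how often that label occurs

print(top_rating, top_count)
```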

Create a new column called content_category where:

If type is "Movie", it's "Film"
If type is "TV Show", it's "Series"

In [80]:
# Map each type to the label requested in the question.
data["content_category"] = data["type"].map({"Movie": "Film", "TV Show": "Series"})

In [82]:
print(data['content_category'])

1380    Series
68        Film
1       Series
2       Series
3       Series
         ...  
8739      Film
8660      Film
7790      Film
8205      Film
4250    Series
Name: content_category, Length: 8807, dtype: object


Extract the year from date_added into a new column called added_year.

In [None]:
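# Parse date_added (e.g. "September 25, 2021"); missing or unparseable dates become NaN.
data["added_year"] = pd.to_datetime(data["date_added"].str.strip(), format="%B %d, %Y", errors="coerce").dt.year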
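
Q16 to Q18 are not answered in the cells above. The following is a rough sketch of one way to approach them, run on a hypothetical three-row frame named `sample` with made-up titles, and writing to an in-memory buffer instead of `netflix_cleaned.csv` so that it is self-contained.

```python
import io

import pandas as pd

# Hypothetical rows with made-up titles (not taken from the real dataset).
sample = pd.DataFrame({
    "title": ["Love in Test City", "Example Heist", "Feel the love"],
    "listed_in": ["Romantic Movies", "Crime TV Shows, International TV Shows", None],
})

# Q16: rows whose title contains "Love" (case-sensitive); na=False treats missing text as no match.
love_rows = sample[sample["title"].str.contains("Love", na=False)]

# Q17: number of rows whose listed_in mentions "International".
international_count = int(sample["listed_in"].str.contains("International", na=False).sum())

# Q18: to_csv accepts a path or a file-like object; index=False leaves out the row index.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

print(love_rows)
print(international_count)
print(buffer.getvalue())
```

On the real data the last step would be `data.to_csv("netflix_cleaned.csv", index=False)`. Passing `case=False` to `str.contains` would also match lowercase "love".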