![Logo.png](attachment:c9188f9f-cd30-4e98-bacc-726ee45a48e9.png)

# SOFT40161 - Introduction to Computer Programming: Lab 07

<font  color= 'red'> **Note: Try to build up the concepts! Don't just execute the code without knowing its meaning.** </font>

## Data Manipulation with Pandas and Python Libraries


Please visit the lecture materials first!

## Lab Learning Outcomes

By the end of this lab, you will be able to:
1. LO-01: Understand and use Pandas for efficient data manipulation.
2. LO-02: Perform exploratory data analysis.
3. LO-03: Integrate Pandas with NumPy and other libraries for comprehensive data handling tasks.
4. LO-04: Demonstrate practical skills in handling structured and semi-structured data.
5. LO-05: Integrate Python libraries for advanced data analysis.



# Example, Explanation and Exercise

## Example 1: Create a Data Frame from Dictionary

A DataFrame in Python is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It is one of the most commonly used data structures in the Pandas library, which is widely used for data manipulation and analysis in Python.

Key Characteristics of a DataFrame:
- Two-Dimensional Structure: A DataFrame consists of rows and columns, similar to a table in a database or a spreadsheet.
- Labels: Each row and column in a DataFrame is labeled. The row labels are called the index, and the column labels are the column names. This makes accessing data by labels very efficient.
- Heterogeneous Data: A DataFrame can hold different types of data in different columns (e.g., integers, strings, floats). This is in contrast to arrays (such as NumPy arrays), which require all elements to be of the same type.

In [1]:
import pandas as pd

# Create a dictionary with data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

# Create a DataFrame from the dictionary
df = pd.DataFrame(data)

# Display the DataFrame
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston


**Explanation:**

- Importing Pandas: The pandas library is imported as pd.
- Creating the Dictionary: The dictionary `data` contains three keys (`'Name'`, `'Age'`, and `'City'`), each with a list of values. The values in each list become the data for one column of the DataFrame.
- Creating the DataFrame: `pd.DataFrame(data)` converts the dictionary into a DataFrame. Each key of the dictionary becomes a column name, and the values in the lists are the data entries for those columns.
- Displaying the DataFrame: `print(df)` is used to display the created DataFrame.

**Key Points:**
- In the resulting DataFrame, each row corresponds to an index (starting from 0 by default), and each column has a label (the keys of the dictionary).
- The DataFrame structure is tabular, making it easy to analyze and manipulate data.
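The key points above can be checked directly on the DataFrame. The following is a minimal sketch that rebuilds the DataFrame from Example 1 and inspects its shape, labels, and column types; it uses only standard DataFrame attributes.

```python
import pandas as pd

# Same data as in Example 1
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)

# (number of rows, number of columns)
print(df.shape)

# Column labels come from the dictionary keys
print(list(df.columns))

# Row labels (the index) start from 0 by default
print(list(df.index))

# Each column has its own data type
print(df.dtypes)
```

Checking `df.dtypes` early is a useful habit, because a column that looks numeric when printed may actually be stored as text.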

## Example 2: Create a Data Frame from List

In [2]:
import pandas as pd

# Create a list of lists (data)
data = [
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
]

# Create a DataFrame from the list
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])

# Display the DataFrame
print(df)


      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


**Explanation:**
- Importing Pandas: The pandas library is imported as pd. This is the standard practice when using Pandas in Python.
- Creating the List: `data` is a list of lists. Each inner list represents a row of data. The first list `['Alice', 25, 'New York']` is the first row, the second list `['Bob', 30, 'Los Angeles']` is the second row, and so on. Each element within an inner list corresponds to a column in the DataFrame.
- Creating the DataFrame: `pd.DataFrame(data, columns=['Name', 'Age', 'City'])` converts the list of lists into a DataFrame. Each inner list becomes a row, and the `columns` parameter provides the column labels (`'Name'`, `'Age'`, and `'City'`).
- Displaying the DataFrame: `print(df)` displays the DataFrame.

**Key Points:**
- Data from List: In this case, each list inside the main list is considered a row in the DataFrame, and the elements within each sublist represent the values for each column.
- Column Labels: By specifying the columns parameter, we provide names for each column. If the column names are not provided, Pandas will automatically assign numeric labels (0, 1, 2, etc.) to the columns.
- Index: The index (row labels) is automatically assigned by Pandas. In this case, the default index starts from 0. You can customize the index if needed using the index parameter in the pd.DataFrame() function.
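The last two key points can be tried out as follows. This is a small sketch using the rows from Example 2; the row labels `'a'`, `'b'`, and `'c'` are made up for illustration.

```python
import pandas as pd

rows = [
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
]

# Without the columns argument, Pandas labels the columns 0, 1, 2
df_default = pd.DataFrame(rows)
print(df_default.columns.tolist())

# The index argument replaces the default row labels 0, 1, 2
# ('a', 'b', 'c' are hypothetical labels chosen for this sketch)
df_labeled = pd.DataFrame(
    rows,
    columns=['Name', 'Age', 'City'],
    index=['a', 'b', 'c']
)
print(df_labeled)

# Rows can now be looked up by their custom label
print(df_labeled.loc['b', 'Name'])
```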

## Example 3: Create a Data Frame from Array

In [3]:
import pandas as pd
import numpy as np

# Create a 2D NumPy array (rows and columns of data)
data = np.array([
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
])

# Create a DataFrame from the NumPy array
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])

# Display the DataFrame
print(df)


      Name Age         City
0    Alice  25     New York
1      Bob  30  Los Angeles
2  Charlie  35      Chicago


**Explanation:**

- Creating a 2D NumPy Array: The variable `data` is a 2D NumPy array with 3 rows and 3 columns. Each row represents a data entry, and each column represents a different feature (name, age, and city). Because a NumPy array holds a single data type, mixing text and numbers makes NumPy store every element as a string, so the ages in this array are text rather than integers.
- Creating the DataFrame: `pd.DataFrame(data, columns=['Name', 'Age', 'City'])` converts the NumPy array into a Pandas DataFrame. The `columns` argument specifies the column names: `'Name'`, `'Age'`, and `'City'`. The `'Age'` column keeps the string values from the array until it is converted explicitly.
- Displaying the DataFrame: `print(df)` prints the DataFrame.
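One thing to watch for when building a DataFrame from a mixed NumPy array is the data type of the numeric column. The sketch below repeats Example 3 and then converts `'Age'` to integers with `astype`, which is one possible way to do it; the variable name `age_before` is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

data = np.array([
    ['Alice', 25, 'New York'],
    ['Bob', 30, 'Los Angeles'],
    ['Charlie', 35, 'Chicago']
])
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])

# NumPy stored every element as text, so the age is a string here
age_before = df.loc[0, 'Age']
print(type(age_before))

# Convert the column to integers so that arithmetic works
df['Age'] = df['Age'].astype(int)
print(df['Age'].mean())
```

Without the conversion, numeric operations such as `mean()` on the `'Age'` column would not behave as expected, because the values are text.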

## Example 4: DataFrame with NaN
This example creates a DataFrame that contains NaN (Not a Number) values and shows two ways of dealing with them using the Pandas library.

In [4]:
import pandas as pd
import numpy as np

# Create a DataFrame with NaN values
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, np.nan, 35, np.nan],
    'City': ['New York', 'Los Angeles', 'Chicago', np.nan]
}

df = pd.DataFrame(data)

# Display the original DataFrame with NaN values
print("Original DataFrame:")
print(df)

# Handling NaN values:

# 1. Fill NaN values with a specific value (e.g., 0 for numerical columns and 'Unknown' for string columns)
df_filled = df.fillna({'Age': 0, 'City': 'Unknown'})

# Display the DataFrame after filling NaN values
print("\nDataFrame after filling NaN values:")
print(df_filled)

# 2. Drop rows that contain any NaN values
df_dropped = df.dropna()

# Display the DataFrame after dropping rows with NaN values
print("\nDataFrame after dropping rows with NaN values:")
print(df_dropped)


Original DataFrame:
      Name   Age         City
0    Alice  25.0     New York
1      Bob   NaN  Los Angeles
2  Charlie  35.0      Chicago
3    David   NaN          NaN

DataFrame after filling NaN values:
      Name   Age         City
0    Alice  25.0     New York
1      Bob   0.0  Los Angeles
2  Charlie  35.0      Chicago
3    David   0.0      Unknown

DataFrame after dropping rows with NaN values:
      Name   Age      City
0    Alice  25.0  New York
2  Charlie  35.0   Chicago


**Explanation:**
- Importing Libraries: We import Pandas as `pd` and NumPy as `np`. NumPy provides the `np.nan` value used to mark missing entries, while Pandas provides the DataFrame manipulation tools.
- Creating a DataFrame with NaN Values: The `data` dictionary contains some `NaN` values:
  - In the `'Age'` column, the second and fourth entries are `NaN`.
  - In the `'City'` column, the last entry is `NaN`.

  We create the DataFrame using `pd.DataFrame(data)`.
- Filling NaN Values: `df.fillna({'Age': 0, 'City': 'Unknown'})` is used to fill `NaN` values with specific values. For the `'Age'` column, all `NaN` values are replaced with 0. For the `'City'` column, all `NaN` values are replaced with `'Unknown'`. This operation creates a new DataFrame, `df_filled`, with the `NaN` values replaced.
- Dropping Rows with NaN Values: `df.dropna()` is used to remove any rows that contain `NaN` values. This creates a new DataFrame, `df_dropped`, where rows with `NaN` in any column are dropped.
- Displaying the DataFrames:
  - The original DataFrame, `df`, is printed first, showing the `NaN` values.
  - After filling `NaN` values, the DataFrame `df_filled` is printed.
  - After dropping rows with `NaN` values, the DataFrame `df_dropped` is printed.
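Filling with 0 and dropping every incomplete row are not the only options. The sketch below, using the same data as Example 4, counts the missing values per column, fills the missing ages with the column mean, and drops rows only when `'City'` is missing. Whether the mean is a sensible replacement depends on the data; it is used here purely as an illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, np.nan, 35, np.nan],
    'City': ['New York', 'Los Angeles', 'Chicago', np.nan]
})

# Count the missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Fill missing ages with the mean of the known ages
df_mean = df.fillna({'Age': df['Age'].mean()})
print(df_mean)

# Drop a row only if its 'City' value is missing
df_city = df.dropna(subset=['City'])
print(df_city)
```

Both `fillna` and `dropna` return new DataFrames here, so the original `df` still contains its `NaN` values afterwards.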


## Example 5: Reading CSV Data in Pandas, Removing NaN Values and Fixing the Index

The following example reads a messy CSV file containing many `NaN` values, removes the rows that contain `NaN`, and resets the index for further processing. Download the `netflix_titles.csv` dataset and keep it in the same folder as this notebook before running the cells.

In [5]:
# Read the data from external data source
raw_csv_data = pd.read_csv('netflix_titles.csv')

In [10]:
# Display the content that we have just read
raw_csv_data

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
...,...,...,...,...,...,...,...,...,...,...,...,...
8802,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
8803,s8804,TV Show,Zombie Dumb,,,,"July 1, 2019",2018,TV-Y7,2 Seasons,"Kids' TV, Korean TV Shows, TV Comedies","While living alone in a spooky town, a young g..."
8804,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."


### Descriptive Statistics of the CSV Data Frame

In [11]:
raw_csv_data.describe(include = 'all')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
count,8807,8807,8807,6173,7982,7976,8797,8807.0,8803,8804,8807,8807
unique,8807,2,8807,4528,7692,748,1767,,17,220,514,8775
top,s1,Movie,Dick Johnson Is Dead,Rajiv Chilaka,David Attenborough,United States,"January 1, 2020",,TV-MA,1 Season,"Dramas, International Movies","Paranormal activity at a lush, abandoned prope..."
freq,1,6131,1,19,19,2818,109,,3207,1793,362,4
mean,,,,,,,,2014.180198,,,,
std,,,,,,,,8.819312,,,,
min,,,,,,,,1925.0,,,,
25%,,,,,,,,2013.0,,,,
50%,,,,,,,,2017.0,,,,
75%,,,,,,,,2019.0,,,,


### Checking for Null Values in the DataFrame and Removing Them

In [12]:
df_dropped = raw_csv_data.dropna()

In [13]:
# Display the content after removing the rows with NaN values. Note that the index still shows the original row labels.
df_dropped

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
7,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
8,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
9,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
12,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
24,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
...,...,...,...,...,...,...,...,...,...,...,...,...
8801,s8802,Movie,Zinzana,Majid Al Ansari,"Ali Suliman, Saleh Bakri, Yasa, Ali Al-Jabri, ...","United Arab Emirates, Jordan","March 9, 2016",2015,TV-MA,96 min,"Dramas, International Movies, Thrillers",Recovering alcoholic Talal wakes up inside a s...
8802,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
8804,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
8805,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."


In [14]:
# Fix the index
df_cleaned = df_dropped.reset_index(drop=True)

netflixCleanedData = df_cleaned

In [15]:
# Display the clean data.
df_cleaned

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
1,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
2,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
3,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
4,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
...,...,...,...,...,...,...,...,...,...,...,...,...
5327,s8802,Movie,Zinzana,Majid Al Ansari,"Ali Suliman, Saleh Bakri, Yasa, Ali Al-Jabri, ...","United Arab Emirates, Jordan","March 9, 2016",2015,TV-MA,96 min,"Dramas, International Movies, Thrillers",Recovering alcoholic Talal wakes up inside a s...
5328,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
5329,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
5330,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."


In [16]:
netflixCleanedData

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
1,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
2,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
3,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
4,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
...,...,...,...,...,...,...,...,...,...,...,...,...
5327,s8802,Movie,Zinzana,Majid Al Ansari,"Ali Suliman, Saleh Bakri, Yasa, Ali Al-Jabri, ...","United Arab Emirates, Jordan","March 9, 2016",2015,TV-MA,96 min,"Dramas, International Movies, Thrillers",Recovering alcoholic Talal wakes up inside a s...
5328,s8803,Movie,Zodiac,David Fincher,"Mark Ruffalo, Jake Gyllenhaal, Robert Downey J...",United States,"November 20, 2019",2007,R,158 min,"Cult Movies, Dramas, Thrillers","A political cartoonist, a crime reporter and a..."
5329,s8805,Movie,Zombieland,Ruben Fleischer,"Jesse Eisenberg, Woody Harrelson, Emma Stone, ...",United States,"November 1, 2019",2009,R,88 min,"Comedies, Horror Movies",Looking to survive in a world taken over by zo...
5330,s8806,Movie,Zoom,Peter Hewitt,"Tim Allen, Courteney Cox, Chevy Chase, Kate Ma...",United States,"January 11, 2020",2006,PG,88 min,"Children & Family Movies, Comedies","Dragged from civilian life, a former superhero..."
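The effect of `reset_index(drop=True)` is easier to see on a tiny table. The sketch below uses a made-up four-row DataFrame (the titles and directors are placeholders, not rows from `netflix_titles.csv`) to show what happens to the index after `dropna()`, and what changes when `drop=True` is left out.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the Netflix data
sample = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'director': ['X', np.nan, 'Y', np.nan]
})

# dropna() removes rows 1 and 3 but keeps the old labels
dropped = sample.dropna()
print(dropped.index.tolist())

# drop=True discards the old labels and renumbers from 0
cleaned = dropped.reset_index(drop=True)
print(cleaned.index.tolist())

# Without drop=True, the old labels are kept as a new 'index' column
kept = dropped.reset_index()
print(kept.columns.tolist())
```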


## Example 6: Uses of .loc[] and .iloc[]

In [20]:
# Create a DataFrame with custom row labels (names) as the index
import pandas as pd
df = pd.DataFrame({'age':[30, 2, 12, 4, 32, 33, 69],
                   'color':['blue', 'green', 'red', 'white', 'gray', 'black', 'red'],
                   'food':['Steak', 'Lamb', 'Mango', 'Apple', 'Cheese', 'Melon', 'Beans'],
                   'height':[165, 70, 120, 80, 180, 172, 150],
                   'score':[4.6, 8.3, 9.0, 3.3, 1.8, 9.5, 2.2],
                   'state':['NY', 'TX', 'FL', 'AL', 'AK', 'TX', 'TX']
                   },
                  index=['Jane', 'Nick', 'Aaron', 'Penelope', 'Dean', 'Christina', 'Cornelia'])

In [21]:
df.loc['Jane']

age          30
color      blue
food      Steak
height      165
score       4.6
state        NY
Name: Jane, dtype: object

In [19]:
df.iloc[:,1]

Jane          blue
Nick         green
Aaron          red
Penelope     white
Dean          gray
Christina    black
Cornelia       red
Name: color, dtype: object

In [22]:
df.iloc[2:4,1:4]

|          | color | food  | height |
|----------|-------|-------|--------|
| Aaron    | red   | Mango | 120    |
| Penelope | white | Apple | 80     |
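In short, `.loc[]` selects by label and `.iloc[]` selects by integer position. The sketch below uses a shortened version of the DataFrame above (three rows and three columns) to select the same cell both ways and to pick several rows and columns by label.

```python
import pandas as pd

df = pd.DataFrame(
    {'age': [30, 2, 12],
     'color': ['blue', 'green', 'red'],
     'food': ['Steak', 'Lamb', 'Mango']},
    index=['Jane', 'Nick', 'Aaron']
)

# Same cell, selected by label and by position
by_label = df.loc['Nick', 'color']
by_position = df.iloc[1, 1]
print(by_label, by_position)

# Lists of labels select several rows and columns at once
subset = df.loc[['Jane', 'Aaron'], ['age', 'food']]
print(subset)
```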


### Find the `description` of all the movies that we have previously read and cleaned for Netflix Dataset.

In [25]:
netflixCleanedData.iloc[:,11]

0       On a photo shoot in Ghana, an American model s...
1       A talented batch of amateur bakers face off in...
2       A woman adjusting to life after a loss contend...
3       After most of her family is murdered in a terr...
4       When the father of the man she loves insists t...
                              ...                        
5327    Recovering alcoholic Talal wakes up inside a s...
5328    A political cartoonist, a crime reporter and a...
5329    Looking to survive in a world taken over by zo...
5330    Dragged from civilian life, a former superhero...
5331    A scrappy but poor boy worms his way into a ty...
Name: description, Length: 5332, dtype: object

### Find the `duration` of all the movies that we have previously read and cleaned for Netflix Dataset.

In [24]:
netflixCleanedData.loc[:,'duration']

0         125 min
1       9 Seasons
2         104 min
3         127 min
4         166 min
          ...    
5327       96 min
5328      158 min
5329       88 min
5330       88 min
5331      111 min
Name: duration, Length: 5332, dtype: object

### Find the `duration` of the first seven movies that we have previously read and cleaned for Netflix Dataset.

In [26]:
netflixCleanedData.loc[0:6,'duration']

0      125 min
1    9 Seasons
2      104 min
3      127 min
4      166 min
5      103 min
6       97 min
Name: duration, dtype: object
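Note that `.loc[0:6]` returned seven rows: a label slice with `.loc[]` includes the end label, while a position slice with `.iloc[]` excludes the end position, as ordinary Python slicing does. A minimal sketch with a small stand-in DataFrame (the four values are taken from the output above):

```python
import pandas as pd

s = pd.DataFrame({'duration': ['125 min', '9 Seasons', '104 min', '127 min']})

# Label slice: rows labelled 0, 1 and 2 (end label included)
with_loc = s.loc[0:2, 'duration']
print(len(with_loc))

# Position slice: positions 0 and 1 only (end position excluded)
with_iloc = s.iloc[0:2, 0]
print(len(with_iloc))
```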

## Example 7: Filtering Data in Pandas

The following example filters the data using some basic conditions. The same approach can be extended to a real dataset with other conditions.

In [27]:
import pandas as pd

# Sample DataFrame
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 15, 20, 25, 30]}
df = pd.DataFrame(data)

# Condition 1: Filter rows where A > 2
filtered_A = df[df['A'] > 2]

# Condition 2: Filter rows where B is even
filtered_B = df[df['B'] % 2 == 0]

# Combined Conditions: A > 2 and B is even
combined_filter = df[(df['A'] > 2) & (df['B'] % 2 == 0)]

print("Original DataFrame:")
print(df)
print("\nFiltered by A > 2:")
print(filtered_A)
print("\nFiltered by B is even:")
print(filtered_B)
print("\nFiltered by A > 2 and B is even:")
print(combined_filter)


Original DataFrame:
   A   B
0  1  10
1  2  15
2  3  20
3  4  25
4  5  30

Filtered by A > 2:
   A   B
2  3  20
3  4  25
4  5  30

Filtered by B is even:
   A   B
0  1  10
2  3  20
4  5  30

Filtered by A > 2 and B is even:
   A   B
2  3  20
4  5  30
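Conditions can also be combined with `|` (or), negated with `~` (not), or written as a membership test with `isin`. The sketch below reuses the sample DataFrame from this example; each condition must be wrapped in parentheses when combined.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 15, 20, 25, 30]})

# OR: rows where A > 4 or B < 15
either = df[(df['A'] > 4) | (df['B'] < 15)]
print(either)

# NOT: rows where B is not even
not_even = df[~(df['B'] % 2 == 0)]
print(not_even)

# Membership: rows where A is one of the listed values
in_set = df[df['A'].isin([2, 3])]
print(in_set)
```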


### Find all the movies released in 2007 from the previously cleaned Netflix Dataset.

In [28]:
# df_cleaned[df_cleaned['release_year']==2007]
# netflixCleanedData
netflixCleanedData[netflixCleanedData['release_year']==2007]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
22,s60,Movie,Naruto Shippuden: The Movie,Hajime Kamegaki,"Junko Takeuchi, Chie Nakamura, Yoichi Masukawa...",Japan,"September 15, 2021",2007,TV-PG,95 min,"Action & Adventure, Anime Features, Internatio...",The adventures of adolescent ninja Naruto Uzum...
53,s143,Movie,Freedom Writers,Richard LaGravenese,"Hilary Swank, Patrick Dempsey, Scott Glenn, Im...","Germany, United States","September 1, 2021",2007,PG-13,124 min,Dramas,While her at-risk students are reading classic...
106,s216,Movie,Shootout at Lokhandwala,Apoorva Lakhia,"Amitabh Bachchan, Sanjay Dutt, Sunil Shetty, A...",India,"August 27, 2021",2007,TV-MA,116 min,"Action & Adventure, Dramas, International Movies","Based on a true story, this action film follow..."
147,s328,Movie,Beowulf,Robert Zemeckis,"Ray Winstone, Anthony Hopkins, John Malkovich,...","United States, United Kingdom","August 1, 2021",2007,PG-13,114 min,"Action & Adventure, Sci-Fi & Fantasy",This deftly animated take on a legendary Old E...
155,s338,Movie,Good Luck Chuck,Mark Helfrich,"Dane Cook, Jessica Alba, Dan Fogler, Ellia Eng...","United States, Canada","August 1, 2021",2007,R,99 min,"Comedies, Romantic Movies","Every time Chuck breaks up with a girlfriend, ..."
...,...,...,...,...,...,...,...,...,...,...,...,...
5152,s8554,Movie,The Water Horse: Legend of the Deep,Jay Russell,"Emily Watson, Alex Etel, Ben Chaplin, David Mo...","New Zealand, United Kingdom, Australia","November 1, 2018",2007,PG,112 min,Children & Family Movies,A boy watches an egg he found hatch into somet...
5190,s8612,Movie,Traffic Signal,Madhur Bhandarkar,"Kunal Khemu, Neetu Chandra, Konkona Sen Sharma...",India,"December 31, 2019",2007,TV-MA,130 min,"Dramas, Independent Movies, International Movies",Four characters coexist at an urban traffic si...
5191,s8613,Movie,Train of the Dead,Sukum Maetawanitch,"Kett Thantup, Savika Chaiyadej, Sura Theerakon...",Thailand,"July 25, 2018",2007,TV-MA,88 min,"Horror Movies, International Movies",Five teenagers on the lam hide out in an empty...
5256,s8695,Movie,War,Philip G. Atwell,"Jet Li, Jason Statham, John Lone, Devon Aoki, ...","United States, Canada","April 1, 2019",2007,R,103 min,Action & Adventure,When his partner is killed and all clues point...


### Find all the movies released in India in 2000 from the previously cleaned Netflix Dataset.

In [29]:
# Let's find all the movies whose country is exactly 'India'
# netflixCleanedData
moviesAtIndia = netflixCleanedData[netflixCleanedData['country']=='India']
moviesAtIndia

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
4,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
33,s106,Movie,Angamaly Diaries,Lijo Jose Pellissery,"Antony Varghese, Reshma Rajan, Binny Rinky Ben...",India,"September 5, 2021",2017,TV-14,128 min,"Action & Adventure, Comedies, Dramas",After growing up amidst the gang wars of his h...
35,s115,Movie,Anjaam,Rahul Rawail,"Madhuri Dixit, Shah Rukh Khan, Tinnu Anand, Jo...",India,"September 2, 2021",1994,TV-14,143 min,"Dramas, International Movies, Thrillers",A wealthy industrialist’s dangerous obsession ...
37,s117,Movie,Dhanak,Nagesh Kukunoor,"Krrish Chhabria, Hetal Gada, Vipin Sharma, Gul...",India,"September 2, 2021",2015,TV-PG,114 min,"Comedies, Dramas, Independent Movies",A movie-loving 10-year-old and her blind littl...
38,s119,Movie,Gurgaon,Shanker Raman,"Akshay Oberoi, Pankaj Tripathi, Ragini Khanna,...",India,"September 2, 2021",2017,TV-14,106 min,"Dramas, International Movies, Thrillers",When the daughter of a wealthy family returns ...
...,...,...,...,...,...,...,...,...,...,...,...,...
5307,s8773,Movie,Yamla Pagla Deewana 2,Sangeeth Sivan,"Dharmendra, Sunny Deol, Bobby Deol, Neha Sharm...",India,"May 1, 2017",2013,TV-14,147 min,"Action & Adventure, Comedies, International Mo...","Up to his old tricks, con man Dharam poses as ..."
5308,s8774,Movie,Yanda Kartavya Aahe,Kedar Shinde,"Ankush Choudhary, Smita Shewale, Mohan Joshi, ...",India,"January 1, 2018",2006,TV-PG,151 min,"Comedies, Dramas, International Movies",Thanks to an arranged marriage that was design...
5325,s8799,Movie,Zed Plus,Chandra Prakash Dwivedi,"Adil Hussain, Mona Singh, K.K. Raina, Sanjay M...",India,"December 31, 2019",2014,TV-MA,131 min,"Comedies, Dramas, International Movies",A philandering small-town mechanic's political...
5326,s8800,Movie,Zenda,Avadhoot Gupte,"Santosh Juvekar, Siddharth Chandekar, Sachit P...",India,"February 15, 2018",2009,TV-14,120 min,"Dramas, International Movies",A change in the leadership of a political part...


In [30]:
# Let's find all the movies released in 2000.
moviesIn2000 = netflixCleanedData[netflixCleanedData['release_year']==2000]
moviesIn2000

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
166,s351,Movie,Space Cowboys,Clint Eastwood,"Clint Eastwood, Tommy Lee Jones, Donald Suther...",United States,"August 1, 2021",2000,PG-13,130 min,"Action & Adventure, Dramas, Sci-Fi & Fantasy",A retired engineer agrees to help NASA prevent...
175,s360,Movie,The Original Kings of Comedy,Spike Lee,"Steve Harvey, D.L. Hughley, Cedric the Enterta...",United States,"August 1, 2021",2000,R,111 min,Stand-Up Comedy,"Comedians Steve Harvey, Cedric the Entertainer..."
234,s567,Movie,Charlie's Angels,McG,"Cameron Diaz, Drew Barrymore, Lucy Liu, Bill M...","United States, Germany","July 1, 2021",2000,PG-13,98 min,"Action & Adventure, Comedies",A tight-knit trio of specially trained agents ...
253,s594,Movie,Snow Day,Chris Koch,"Chris Elliott, Mark Webber, Jean Smart, Schuyl...",United States,"July 1, 2021",2000,PG,89 min,"Children & Family Movies, Comedies","When a snow day shuts down the whole town, the..."
334,s780,Movie,Battlefield Earth,Roger Christian,"John Travolta, Barry Pepper, Forest Whitaker, ...",United States,"June 2, 2021",2000,PG-13,118 min,"Action & Adventure, Cult Movies, Sci-Fi & Fantasy","In the year 3000, an alien race known as the P..."
431,s953,Movie,The Whole Nine Yards,Jonathan Lynn,"Bruce Willis, Matthew Perry, Rosanna Arquette,...",United States,"May 1, 2021",2000,R,99 min,"Action & Adventure, Comedies",An unhappily married dentist becomes mixed up ...
1889,s3473,Movie,Rugrats in Paris: The Movie,"Stig Bergqvist, Paul Demeyer","Elizabeth Daily, Tara Strong, Cheryl Chase, Ch...","Germany, United States","October 1, 2019",2000,G,79 min,"Children & Family Movies, Comedies",The Rugrats take to the big screen and visit P...
2449,s4546,Movie,Monty Python: Before the Flying Circus,Will Yapp,"Graham Chapman, Eric Idle, Terry Jones, Michae...",United Kingdom,"October 2, 2018",2000,TV-MA,56 min,"Comedies, Documentaries",Discover how six seemingly ordinary but suprem...
2563,s4724,Movie,Fiza,Khalid Mohamed,"Karisma Kapoor, Jaya Bhaduri, Hrithik Roshan, ...",India,"August 2, 2018",2000,TV-14,163 min,"Dramas, International Movies, Music & Musicals",Fiza's brother disappears during Mumbai's horr...
2711,s4957,Movie,Phir Bhi Dil Hai Hindustani,Aziz Mirza,"Shah Rukh Khan, Juhi Chawla, Paresh Rawal, Sat...",India,"April 1, 2018",2000,TV-14,159 min,"Comedies, Dramas, International Movies","In this Bollywood entertainment, two journalis..."


In [31]:
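# Let's combine both conditions: country is 'India' and release year is 2000.
# Both masks are built from the same DataFrame so that they are guaranteed to align.
moviesInIndiaIn2000 = netflixCleanedData[(netflixCleanedData['country']=='India')&(netflixCleanedData['release_year']==2000)]
moviesInIndiaIn2000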

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
2563,s4724,Movie,Fiza,Khalid Mohamed,"Karisma Kapoor, Jaya Bhaduri, Hrithik Roshan, ...",India,"August 2, 2018",2000,TV-14,163 min,"Dramas, International Movies, Music & Musicals",Fiza's brother disappears during Mumbai's horr...
2711,s4957,Movie,Phir Bhi Dil Hai Hindustani,Aziz Mirza,"Shah Rukh Khan, Juhi Chawla, Paresh Rawal, Sat...",India,"April 1, 2018",2000,TV-14,159 min,"Comedies, Dramas, International Movies","In this Bollywood entertainment, two journalis..."
3673,s6441,Movie,Chal Mere Bhai,David Dhawan,"Sanjay Dutt, Salman Khan, Karisma Kapoor, Dali...",India,"December 31, 2019",2000,TV-14,132 min,"Comedies, International Movies, Romantic Movies","When a secretary saves her tycoon boss's life,..."
4001,s6913,Movie,Hamara Dil Aapke Paas Hai,Satish Kaushik,"Anil Kapoor, Aishwarya Rai Bachchan, Sonali Be...",India,"March 1, 2018",2000,TV-14,158 min,"Dramas, International Movies, Music & Musicals",Love blooms when kind-hearted Avinash takes in...
4237,s7248,Movie,Kya Kehna,Kundan Shah,"Preity Zinta, Saif Ali Khan, Anupam Kher, Fari...",India,"April 1, 2018",2000,TV-PG,149 min,"Dramas, International Movies, Romantic Movies",A young university student's world is shaken a...
4545,s7703,Movie,Papa the Great,Bhagyaraj,"Krishan Kumar, Nagma, Satya Prakash, Master Bo...",India,"December 8, 2017",2000,TV-PG,137 min,"Comedies, Dramas, International Movies","After witnessing a murder, a meek family man m..."
4597,s7802,Movie,Pukar,Rajkumar Santoshi,"Anil Kapoor, Madhuri Dixit, Namrata Shirodkar,...",India,"March 1, 2018",2000,TV-14,165 min,"Action & Adventure, Dramas, International Movies",A notorious terrorist manipulates an Indian ar...
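The comparison `== 'India'` matches only rows whose `country` is exactly `'India'`. As the earlier outputs show, some rows list several countries in one string (for example `"Germany, United States"`), and those are skipped by an exact match. One possible way to include them is `str.contains`, sketched below on a made-up three-row DataFrame; the titles are placeholders and not rows from the dataset.

```python
import pandas as pd

# Hypothetical rows illustrating single and multi-country values
titles = pd.DataFrame({
    'title': ['Film A', 'Film B', 'Film C'],
    'country': ['India', 'United States, India', 'Japan'],
    'release_year': [2000, 2000, 2000]
})

# Exact match: only rows where country is exactly 'India'
exact = titles[titles['country'] == 'India']
print(exact['title'].tolist())

# Substring match: any row whose country text mentions 'India'
partial = titles[titles['country'].str.contains('India', regex=False)]
print(partial['title'].tolist())
```

Which of the two is appropriate depends on the question being asked: "produced only in India" versus "produced in India, possibly with other countries".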


## Example 8: GroupBy in Pandas

In [32]:
import pandas as pd

# Sample DataFrame
data = {
    'Category': ['Electronics', 'Electronics', 'Clothing', 'Clothing', 'Books'],
    'Product': ['Laptop', 'Smartphone', 'Jeans', 'T-shirt', 'Novel'],
    'Price': [1000, 700, 50, 20, 15],
    'Quantity': [3, 5, 10, 7, 15]
}
df = pd.DataFrame(data)

# Add a Total Sales column
df['Total Sales'] = df['Price'] * df['Quantity']

# Group by Category and calculate the total sales per category
grouped = df.groupby('Category')['Total Sales'].sum()

print("Original DataFrame:")
print(df)
print("\nTotal Sales by Category:")
print(grouped)


Original DataFrame:
      Category     Product  Price  Quantity  Total Sales
0  Electronics      Laptop   1000         3         3000
1  Electronics  Smartphone    700         5         3500
2     Clothing       Jeans     50        10          500
3     Clothing     T-shirt     20         7          140
4        Books       Novel     15        15          225

Total Sales by Category:
Category
Books           225
Clothing        640
Electronics    6500
Name: Total Sales, dtype: int64


In [33]:
df

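|   | Category    | Product    | Price | Quantity | Total Sales |
|---|-------------|------------|-------|----------|-------------|
| 0 | Electronics | Laptop     | 1000  | 3        | 3000        |
| 1 | Electronics | Smartphone | 700   | 5        | 3500        |
| 2 | Clothing    | Jeans      | 50    | 10       | 500         |
| 3 | Clothing    | T-shirt    | 20    | 7        | 140         |
| 4 | Books       | Novel      | 15    | 15       | 225         |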


In [34]:
grouped

Category
Books           225
Clothing        640
Electronics    6500
Name: Total Sales, dtype: int64
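`groupby` can compute more than one statistic per group in a single step. The sketch below uses the same sample data with `agg` and named aggregations; the output column names `total_sales`, `mean_price`, and `n_products` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Clothing', 'Clothing', 'Books'],
    'Product': ['Laptop', 'Smartphone', 'Jeans', 'T-shirt', 'Novel'],
    'Price': [1000, 700, 50, 20, 15],
    'Quantity': [3, 5, 10, 7, 15]
})
df['Total Sales'] = df['Price'] * df['Quantity']

# One row per category, one column per named aggregation
summary = df.groupby('Category').agg(
    total_sales=('Total Sales', 'sum'),
    mean_price=('Price', 'mean'),
    n_products=('Product', 'count')
)
print(summary)
```

The result is a DataFrame indexed by `Category`, so a single value can be read with, for example, `summary.loc['Clothing', 'total_sales']`.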

# Challenge (H/W): 

<font  color= 'red'> **Start exploration with your Data for CourseWork.** </font>