# Instructions for Pandas Practice
---
Welcome to the Pandas Practice! In this exercise, you will work with the Rick and Morty dataset to practice your data manipulation and analysis skills using Pandas. Please follow the instructions below carefully:

1. **Understand the Dataset**: The dataset used in this Practice is related to the Rick and Morty TV show. Familiarize yourself with the dataset's structure and content before starting.

2. **Read Each Question Thoroughly**: Each question in this Practice is designed to test different Pandas functionalities. Make sure you understand the requirements of each question before you begin coding.

3. **Write Your Code in the Provided Cells**: For each question, a code cell is provided where you should write your solution. Do not modify any other cells or sections of the notebook.

4. **Execute Your Code**: After writing your code in each cell, run the cell to check if your code produces the expected results. Verify that the output matches the requirements specified in the question.

5. **Complete All Questions**: Ensure that you attempt and complete all the questions provided in the notebook. Each question is designed to test different Pandas skills and concepts.

6. **Review Your Work**: Before submitting, double-check your answers and make sure all questions are addressed. Ensure that the notebook runs without errors and that all outputs are correct.

7. **Download Your Notebook**: Once you have completed all the questions and verified your solutions, download the notebook file (.ipynb). You can do this by selecting `File` > `Download` > `Download .ipynb` from the Google Colab menu.

8. **Submit Your Practice**: Upload the downloaded .ipynb file to the designated learning platform for submission.

9. **Verify Submission**: Ensure that you have uploaded the correct file and that it is not corrupted. If you encounter any issues with the file, you may need to resubmit.

---

Good luck with your Practice!

In [3]:
import numpy as np
import pandas as pd

locations_data_url = "https://s3.ap-south-1.amazonaws.com/new-assets.ccbp.in/frontend/content/aiml/classical-ml/locations_dataset.csv"
episodes_data_url = "https://s3.ap-south-1.amazonaws.com/new-assets.ccbp.in/frontend/content/aiml/classical-ml/episodes_dataset.csv"
character_data_url = "https://s3.ap-south-1.amazonaws.com/new-assets.ccbp.in/frontend/content/aiml/classical-ml/characters_dataset.csv"

characters = pd.read_csv(character_data_url,index_col=0)
episodes = pd.read_csv(episodes_data_url,index_col=0)
locations = pd.read_csv(locations_data_url,index_col=0)

Q1. You are given a DataFrame locations containing a column 'name' with the names of different locations, find only the location name with the maximum number of characters (longest name) and print it.

In [None]:
# write your code here
max_length =locations.sort_values(by="name", key=lambda x:x.str.len())
longest_locations =max_length.iloc[-1]["name"]
print(longest_locations)

Earth (Replacement Dimension)


Q2. From the characters DataFrame, count how many characters have a `status` of `unknown` and print the result as a frequency table, display the result as a Series.

In [86]:
# write your code here
status_distribution =characters.query('status=="unknown"')["status"].value_counts()
status_distribution.name="count"
status_distribution.index.name="status"
print(status_distribution)

status
unknown    6
Name: count, dtype: int64


Q3. From the `episodes` DataFrame, find the episode with the earliest `created` date and print its name.

In [53]:
# write your code here
earliest_date =episodes.sort_values(by="created")
earliest_episode =earliest_date.iloc[0]["name"]
print(earliest_episode)

Pilot


Q4. From the `characters` DataFrame, count how many characters belong to each species, set the index name to `species` and the Series name to `count`, then print the result.

In [85]:
# write your code here
num_of_species_counts =characters["species"].value_counts()
num_of_species_counts.name="count"
num_of_species_counts.index.name="species"
print(num_of_species_counts)

species
Human    15
Alien     5
Name: count, dtype: int64


Q5. Group the episodes DataFrame by `air_date`, count how many episodes aired on each date, and then print the total number of air dates.


In [64]:
# write your code here
episodes_count =episodes.groupby(by="air_date")["air_date"].value_counts()
print(episodes_count.sum())

20


Q.6 Using the `characters` DataFrame, group the data by `gender` and count how many characters belong to each gender based on id. Print the result.

In [70]:
# write your code here
characters_count =characters.groupby(by="gender")['id'].count()
print(characters_count)

gender
Female      4
Male       15
unknown     1
Name: id, dtype: int64


Q.7 Convert all episode names in the `episodes` DataFrame to uppercase and print the `name` column.

In [15]:
# write your code here
episodes["name"]=episodes["name"].str.upper()
print(episodes[["name"]])

                                       name
0                                     PILOT
1                             LAWNMOWER DOG
2                              ANATOMY PARK
3                    M. NIGHT SHAYM-ALIENS!
4                      MEESEEKS AND DESTROY
5                            RICK POTION #9
6                       RAISING GAZORPAZORP
7                             RIXTY MINUTES
8           SOMETHING RICKED THIS WAY COMES
9      CLOSE RICK-COUNTERS OF THE RICK KIND
10                          RICKSY BUSINESS
11                         A RICKLE IN TIME
12                           MORTYNIGHT RUN
13                 AUTO EROTIC ASSIMILATION
14                            TOTAL RICKALL
15                             GET SCHWIFTY
16                  THE RICKS MUST BE CRAZY
17            BIG TROUBLE IN LITTLE SANCHEZ
18  INTERDIMENSIONAL CABLE 2: TEMPTING FATE
19                   LOOK WHO'S PURGING NOW


Q.8 From the `characters` DataFrame, select all characters whose species is **Human**. Then count how many characters have gender **Male**, set the index name to `gender` and the Series name to `count`, and print the result.

In [13]:
# write your code here
humans = characters[characters['species'] == 'Human']
gender = characters[characters['gender'] == 'Male']['gender'].value_counts()
gender.name = 'count'
gender.index.name = 'gender'
print(gender)

gender
Male    15
Name: count, dtype: int64


Q.9 From the `characters` data, select all rows where the **name** contains the word **Citadel** (case-insensitive) and print the resulting DataFrame.


In [46]:
# write your code here
citadel_characters =characters[characters["name"].str.contains("Citadel",case=False)]
print(citadel_characters)

Empty DataFrame
Columns: [id, name, status, species, type, gender, origin, location, image, episode, url, created]
Index: []


Q.10 From the `episodes` DataFrame, extract the **names of all episodes aired in 2013** and print them as a Python list.

In [45]:
# write your code here
episodes_2013 =episodes[episodes["air_date"].str.endswith("2013")]
print(list(episodes_2013.name))

['Anatomy Park', 'Pilot', 'Lawnmower Dog']


Q.11 Find the three `characters` who appeared in the most episodes by applying len to the `episode` column, and print their episode counts.

In [44]:
# write your code here
character_episodes_count =characters["episode"].apply(len).sort_values(ascending=False)
print(character_episodes_count[:3])

0    2337
1    2337
2    1928
Name: episode, dtype: int64


Q.12 From the `characters` data, extract the **name** value from each **origin** dictionary, find how many unique origins there are, and print the count.

In [9]:
# write your code here
unique_characters =characters["origin"].apply(lambda x:eval(x)["name"])
print(unique_characters.nunique())


4


Q. 13 Convert the **air_date** column of the episodes DataFrame to datetime, find the most recent episode (latest air date), and print its name.


In [43]:
# write your code here
episodes =episodes.sort_values(by="air_date")
most_recent_episode =episodes.iloc[-1]
print(most_recent_episode["name"])

Look Who's Purging Now


Q. 14 From the `characters` data, determine the **most common species** and print it.

In [42]:
# write your code here
most_common_species =characters['species'].mode()
print(most_common_species[0])

Human


Q. 15 From the `characters` data, select and print all character names that **start with the letter "A"**.

In [41]:
# write your code here
name_count =characters[characters["name"].str.startswith("A")]['name']
print(name_count)

5     Abadango Cluster Princess
6              Abradolf Lincler
7              Adjudicator Rick
8               Agency Director
9                    Alan Rails
10              Albert Einstein
11                    Alexander
12                 Alien Googah
13                  Alien Morty
14                   Alien Rick
15                 Amish Cyborg
16                        Annie
17                Antenna Morty
18                 Antenna Rick
19      Ants in my Eyes Johnson
Name: name, dtype: object


# End!