<a href="https://colab.research.google.com/github/Hanish-DotCom/pandas/blob/main/Pandas_Practice_Assignment.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Pandas Practice Notebook
Dear students, this notebook will help you practice important Pandas operations step by step.


## 1. File Import using Pandas
The examples below show how to import files with Pandas, followed by practice questions.

In [None]:
# Example: Importing CSV, Excel, and JSON files
import pandas as pd

# Read CSV
df_csv = pd.read_csv('data.csv')

# Read Excel
df_excel = pd.read_excel('data.xlsx')

# Read JSON
df_json = pd.read_json('data.json')

### Practice Questions
1. Import the file `sales.csv` and display the first 5 rows.
2. Import the Excel file `students.xlsx` and check its columns.
3. Import any JSON file and find its shape.

In [9]:
# 1

import pandas as pd

data = {
    'product': ['laptop', 'desktop', 'mobile'],
    'price': [1000, 2000, 500],
    'sales': [100, 200, 300]
}
sales_df = pd.DataFrame(data)
sales_df.to_csv('sales.csv', index=False)

sales_df = pd.read_csv('sales.csv')
print(sales_df.head())


   product  price  sales
0   laptop   1000    100
1  desktop   2000    200
2   mobile    500    300


In [7]:
# 2

data_students = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [20, 21, 22],
    'Major': ['Computer Science', 'Physics', 'Chemistry']
}
students_df = pd.DataFrame(data_students)
students_df.to_excel('students.xlsx', index=False)

students_df = pd.read_excel('students.xlsx')
print(students_df.columns)

Index(['Name', 'Age', 'Major'], dtype='object')


In [8]:
# 3

data_json = {
    'name': ['Alice', 'Bob'],
    'age': [20, 21]
}
json_df = pd.DataFrame(data_json)
json_df.to_json('data.json')

json_df = pd.read_json('data.json')
print(json_df.shape)

(2, 2)


## 2. File Export using Pandas
You can export data to different formats like CSV, Excel, and JSON.

In [11]:
# Example: Exporting DataFrame to different formats
sales_df.to_csv('output.csv', index=False)
students_df.to_excel('output.xlsx', index=False)
json_df.to_json('output.json')

### Practice Questions
1. Export any given DataFrame to CSV format without index.
2. Export the same DataFrame to Excel with the sheet name 'Sheet1'.
3. Export to JSON format with indentation.

In [15]:

sales_df.to_csv('exported_sales.csv', index=False)

sales_df.to_excel('exported_sales.xlsx', sheet_name='Sheet1', index=False)

sales_df.to_json('exported_sales.json', indent=4)

## 3. Creating Series and DataFrames

In [None]:
# Example: Creating Series and DataFrame
series = pd.Series([10, 20, 30], name='Sales')
data = {'Name': ['Amit', 'Riya', 'John'], 'Age': [22, 24, 26]}
df = pd.DataFrame(data)
print(series)
print(df)

### Practice Questions
1. Create a Series of 5 numbers named 'Marks'.
2. Create a DataFrame of 3 students with columns 'Name', 'Score', 'City'.
3. Print shape, head, and describe of your DataFrame.

In [19]:
# 1
series = pd.Series([80, 90, 75, 95, 85], name='Marks')
print(series)

0    80
1    90
2    75
3    95
4    85
Name: Marks, dtype: int64


In [33]:
# 2
data = {
    'Name': ['Hanish','Kunal','Nikhil'],
    'Score': [80, 90, 95],
    'City' : ['Mumbai', 'Noida','Noida'],
    'Age': [25, 21, 23]
}
student_data = pd.DataFrame(data, index = [1,2,3])
print(student_data)

     Name  Score    City  Age
1  Hanish     80  Mumbai   25
2   Kunal     90   Noida   21
3  Nikhil     95   Noida   23


In [34]:
# 3
print( 'Shape', student_data.shape)
print()
print('Head',student_data.head())
print()
print('Description',student_data.describe())

Shape (3, 4)

Head      Name  Score    City  Age
1  Hanish     80  Mumbai   25
2   Kunal     90   Noida   21
3  Nikhil     95   Noida   23

Description            Score   Age
count   3.000000   3.0
mean   88.333333  23.0
std     7.637626   2.0
min    80.000000  21.0
25%    85.000000  22.0
50%    90.000000  23.0
75%    92.500000  24.0
max    95.000000  25.0


## 4. Indexing and Slicing in Pandas

In [35]:
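# Example: Indexing and Slicing
# Define df here so the cell runs on its own (same data as the Section 3 example)
df = pd.DataFrame({'Name': ['Amit', 'Riya', 'John'], 'Age': [22, 24, 26]})

print(df['Name'])              # single column (returns a Series)
print(df[['Name', 'Age']])     # several columns (returns a DataFrame)
print(df.iloc[0:2])            # first two rows, selected by position
print(df.loc[df['Age'] > 22])  # rows where Age is greater than 22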

### Practice Questions
1. Select only the 'Score' column from your DataFrame.
2. Retrieve rows from index 1 to 3.
3. Filter rows where 'Age' > 23.

In [37]:

# 1.
print(student_data['Score'])
# 2.
print(student_data.loc[1:3])
# 3.
print(student_data[student_data['Age'] > 23])

1    80
2    90
3    95
Name: Score, dtype: int64
     Name  Score    City  Age
1  Hanish     80  Mumbai   25
2   Kunal     90   Noida   21
3  Nikhil     95   Noida   23
     Name  Score    City  Age
1  Hanish     80  Mumbai   25
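
The answer to question 2 uses `.loc[1:3]`, which selects by index label and includes both endpoints. A minimal sketch of how that differs from position-based `.iloc`, reusing the same student data (the variable name `students` is only illustrative):

```python
import pandas as pd

students = pd.DataFrame(
    {'Name': ['Hanish', 'Kunal', 'Nikhil'], 'Score': [80, 90, 95], 'Age': [25, 21, 23]},
    index=[1, 2, 3],
)

by_label = students.loc[1:3]      # labels 1, 2 and 3 (end label is included)
by_position = students.iloc[1:3]  # positions 1 and 2 (end position is excluded)

print(len(by_label))     # 3
print(len(by_position))  # 2
print(by_position['Name'].tolist())
```

Because the index here starts at 1, the same slice `1:3` returns three rows with `.loc` but only the last two rows with `.iloc`.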


## 5. Merging and Concatenation

In [None]:
# Example: Merge and Concat
df1 = pd.DataFrame({'ID': [1,2], 'Name': ['A', 'B']})
df2 = pd.DataFrame({'ID': [1,2], 'Score': [85, 90]})
merged = pd.merge(df1, df2, on='ID')
print(merged)

# Concat: stack rows vertically; columns missing from one frame are filled with NaN
df3 = pd.concat([df1, df2], axis=0, ignore_index=True)
print(df3)

### Practice Questions
1. Merge two DataFrames 'students' and 'marks' on 'ID'.
2. Perform left join between two given DataFrames.
3. Concatenate two DataFrames vertically and reset the index.

In [None]:
# Answer here
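
One possible answer, sketched with small made-up `students` and `marks` frames (the data, and the extra `more_students` frame, are assumptions for illustration):

```python
import pandas as pd

students = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Amit', 'Riya', 'John']})
marks = pd.DataFrame({'ID': [1, 2, 4], 'Score': [85, 90, 70]})

# 1. Inner merge on 'ID': keeps only IDs present in both frames (1 and 2)
inner = pd.merge(students, marks, on='ID')

# 2. Left join: keeps every student; John (ID 3) has no marks, so Score is NaN
left = pd.merge(students, marks, on='ID', how='left')

# 3. Vertical concatenation with a fresh 0..n-1 index
more_students = pd.DataFrame({'ID': [5, 6], 'Name': ['Sara', 'Dev']})
stacked = pd.concat([students, more_students], axis=0, ignore_index=True)

print(inner)
print(left)
print(stacked)
```

`pd.merge` defaults to `how='inner'`, so the left join has to be requested explicitly.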


## 6. Pivot Tables and GroupBy

In [None]:
# Example: Pivot and GroupBy
df = pd.DataFrame({'Region': ['North', 'South', 'North'], 'Month': ['Jan', 'Jan', 'Feb'], 'Sales': [250, 300, 400]})

pivot = df.pivot_table(values='Sales', index='Region', columns='Month', aggfunc='sum', fill_value=0)
grouped = df.groupby('Region')['Sales'].sum()
print(pivot)
print(grouped)

### Practice Questions
1. Group the DataFrame by 'Month' and find total sales.
2. Create a pivot table to show Region-wise total sales.
3. Group data by 'Region' and count number of entries.

In [None]:
# Answer here
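
One possible answer, sketched against the same three-row example data used above (the result variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'South', 'North'],
    'Month': ['Jan', 'Jan', 'Feb'],
    'Sales': [250, 300, 400],
})

# 1. Total sales per month
sales_by_month = df.groupby('Month')['Sales'].sum()

# 2. Pivot table of total sales per region
sales_by_region = df.pivot_table(values='Sales', index='Region', aggfunc='sum')

# 3. Number of entries per region
entries_by_region = df.groupby('Region').size()

print(sales_by_month)
print(sales_by_region)
print(entries_by_region)
```

`size()` counts rows per group, whereas `count()` would count non-missing values column by column.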


## Optional Practice (Medium Level)
These questions test your overall understanding of Pandas.

1. Import a dataset and clean all missing values, rename columns properly, and export it back to Excel.

Dataset- https://drive.google.com/file/d/1kw_Sdib3WCo_acQYA851poCUdKcWgfYs/view?usp=sharing

In [None]:
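# Answer here
# NOTE: `df` is the Section 6 example DataFrame, standing in for the linked dataset.
# 'Product' is not a column of that frame, so rename() silently skips it.
df_cleaned = df.dropna().rename(columns={'Product': 'Item', 'Sales': 'Rev'})
print("\nCleaned DataFrame:\n", df_cleaned)

# Export back to Excel ('cleaned_data.xlsx' is a placeholder file name)
df_cleaned.to_excel('cleaned_data.xlsx', index=False)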
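
Dropping every row that has a missing value can discard a lot of data. A minimal sketch of a gentler cleaning pass on a made-up frame (the column names, values, and median fill strategy are assumptions, not taken from the linked dataset):

```python
import pandas as pd

raw = pd.DataFrame({
    ' Product Name ': ['Laptop', 'Mobile', None],
    'Unit Sales': [100.0, float('nan'), 300.0],
})

cleaned = (
    raw
    # Normalise column names: trim spaces, lower-case, use underscores
    .rename(columns=lambda c: c.strip().lower().replace(' ', '_'))
    # Drop rows that have no product name at all
    .dropna(subset=['product_name'])
    # Fill the remaining missing sales with the column median
    .assign(unit_sales=lambda d: d['unit_sales'].fillna(d['unit_sales'].median()))
)

print(cleaned)
# To export: cleaned.to_excel('cleaned.xlsx', index=False)  (requires openpyxl)
```

Whether to drop or fill depends on the dataset; the median is just one common choice for numeric columns.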

2. Combine multiple CSV files from a folder into one DataFrame and find top 5 entries by a numeric column like 'Sales'.

Dataset- https://drive.google.com/drive/folders/1zgJs_6r8Aa_-93YXmYXfxOK5L4hVUI9Q?usp=sharing

In [None]:

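# Stand-in for the CSV files in the folder: reuse df1 and df2 from Section 5
combined = pd.concat([df1, df2], axis=0, ignore_index=True)

# Top 5 entries by the numeric 'Score' column
print(combined.nlargest(5, 'Score'))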
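
With real files, the CSVs would be read from a folder rather than reused from earlier cells. A minimal sketch, assuming every file shares the same columns; the temporary folder, the file names, and the sample values are made up so the example runs on its own:

```python
import glob
import os
import tempfile

import pandas as pd

# Create a throwaway folder with two small CSV files so the sketch is self-contained
folder = tempfile.mkdtemp()
pd.DataFrame({'Product': ['A', 'B', 'C'], 'Sales': [120, 340, 90]}).to_csv(
    os.path.join(folder, 'jan.csv'), index=False)
pd.DataFrame({'Product': ['D', 'E', 'F'], 'Sales': [500, 60, 210]}).to_csv(
    os.path.join(folder, 'feb.csv'), index=False)

# Read every CSV in the folder and stack them into one DataFrame
csv_paths = sorted(glob.glob(os.path.join(folder, '*.csv')))
combined = pd.concat([pd.read_csv(path) for path in csv_paths], ignore_index=True)

# Top 5 rows by the numeric 'Sales' column
top5 = combined.nlargest(5, 'Sales')
print(top5)
```

For the linked Drive folder, `folder` would point at wherever the files were downloaded.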