# Pandas Exercises

## Overview

This module covers essential Pandas operations, including data manipulation, analysis, and basic statistical functions. It provides hands-on experience with real-world data using the Pandas library.

## Learning Objectives

- Convert lists of dictionaries and CSV files to DataFrames
- Perform data access operations using Pandas
- Handle missing data with the `fillna` function
- Apply descriptive statistics functions to analyze data
- Utilize Pandas for data slicing and dicing

## Prerequisites

- Basic understanding of Python
- Familiarity with Jupyter notebooks
- Installed libraries: numpy, pandas

## Get Started

### Install required packages

In [3]:
%pip install numpy pandas

Note: you may need to restart the kernel to use updated packages.


### Import necessary libraries

In [4]:
import numpy as np
import pandas as pd

## Convert list of dictionaries to DataFrame

In [5]:
d = [
    {"city": "Delhi", "data": 1000},
    {"city": "Bangalore", "data": 2000},
    {"city": "Mumbai", "data": 1000},
]
d

[{'city': 'Delhi', 'data': 1000},
 {'city': 'Bangalore', 'data': 2000},
 {'city': 'Mumbai', 'data': 1000}]

Convert the list of dictionaries `d` into a DataFrame.

In [6]:
# Your code goes here
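
As a hedged illustration on separate, hypothetical records (not the exercise's `d`), passing a list of dictionaries to the `pd.DataFrame` constructor produces one row per dictionary, with the keys as column names. The names `records` and `records_df` are made up for this sketch.

```python
import pandas as pd

# Hypothetical records, unrelated to the exercise data
records = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
]

# Each dictionary becomes a row; the keys become column names
records_df = pd.DataFrame(records)
print(records_df)
```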

## Convert CSV files to DataFrame

Read a CSV file into a DataFrame.

In [7]:
# Replace `None` with your code

city_data = None
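
The exercise's CSV path is not given here, so the following is only a minimal sketch of `pd.read_csv` on in-memory text. The CSV content, the `city` column, and the names `csv_text` and `sample_df` are assumptions for illustration. With a real file, a path string would take the place of the `io.StringIO` object.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = "city,lat,lng,pop\nAlpha,10.0,20.0,500\nBeta,-5.5,30.25,1500\n"

# pd.read_csv accepts a file path or any file-like object
sample_df = pd.read_csv(io.StringIO(csv_text))
print(sample_df.dtypes)
```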

Show the first 10 rows of the converted DataFrame.

In [8]:
# Your code goes here

## Data Access

### Head and Tail

Get the last 10 rows of `city_data`:

In [9]:
# Your code goes here
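
For reference, a small sketch on a hypothetical 25-row frame (`demo`, not `city_data`) shows how `head(n)` and `tail(n)` return the first and last `n` rows:

```python
import pandas as pd

# Hypothetical frame with rows numbered 0..24
demo = pd.DataFrame({"n": range(25)})

first_ten = demo.head(10)  # rows 0 through 9
last_ten = demo.tail(10)   # rows 15 through 24
print(last_ten)
```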

### Slicing and Dicing

In [10]:
series_es = city_data.lat
type(series_es)

AttributeError: 'NoneType' object has no attribute 'lat'

(This error appears until `city_data` is assigned a DataFrame in the CSV step above.)

Get the first 5 odd-numbered rows of `series_es`:

In [None]:
# Your code goes here

Get the first 8 rows of `series_es`:

In [None]:
# Your code goes here
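
The two Series exercises above can be sketched with positional slicing via `iloc`. This example uses a hypothetical Series `s` and assumes "odd-numbered" means zero-based positions 1, 3, 5, and so on. If the intended meaning is the 1st, 3rd, 5th rows, the slice would start at 0 instead.

```python
import pandas as pd

# Hypothetical Series with values 100..119
s = pd.Series(range(100, 120))

# start:stop:step slicing; positions 1, 3, 5, 7, 9
odd_rows = s.iloc[1:10:2]

# The first 8 rows, positions 0 through 7
first_eight = s.iloc[:8]
print(odd_rows)
```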

Get the first 8 rows of `city_data`:

In [None]:
# Your code goes here

Get the first 4 columns of the first 5 rows of `city_data`:

In [None]:
# Your code goes here
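
On a DataFrame, `iloc` takes a row selector and an optional column selector. The sketch below uses a hypothetical 10x6 frame named `grid`; the column labels are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-row, 6-column frame
grid = pd.DataFrame(np.arange(60).reshape(10, 6), columns=list("abcdef"))

first_rows = grid.iloc[:8]   # first 8 rows, all columns
corner = grid.iloc[:5, :4]   # first 5 rows, first 4 columns
print(corner)
```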

Select cities with a population of more than 10 million, keeping only the columns whose names start with the letter `p`:

In [None]:
# Your code goes here
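
One possible approach combines a boolean row mask with a boolean column mask in `loc`. The frame `towns` and its columns are hypothetical. Note that a column named `pop` is best accessed with brackets (`towns["pop"]`), because `towns.pop` refers to the built-in `DataFrame.pop` method.

```python
import pandas as pd

# Hypothetical data; only the "pop" column name mirrors the exercise
towns = pd.DataFrame({
    "name": ["X", "Y", "Z"],
    "pop": [12_000_000, 3_000_000, 25_000_000],
    "province": ["P1", "P2", "P3"],
    "lat": [1.0, 2.0, 3.0],
})

row_mask = towns["pop"] > 10_000_000         # rows with more than 10 million
col_mask = towns.columns.str.startswith("p")  # columns starting with "p"

big = towns.loc[row_mask, col_mask]
print(big)
```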

## Data Operations

### Missing data and the `fillna` function

In [None]:
df = pd.DataFrame(np.random.randn(8, 3), columns=["A", "B", "C"])
df.iloc[4, 2] = np.nan
df

Replace all `NaN` values in `df` with `0`:

In [None]:
# Your code goes here
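
As a small sketch on a hypothetical frame (`frame`, not the random `df` above), `fillna` returns a new DataFrame with missing values replaced and leaves the original unchanged:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two missing values
frame = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, 4.0]})

# fillna returns a copy; `frame` itself still contains NaN
filled = frame.fillna(0)
print(filled)
```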

## Descriptive Statistics Functions

In [None]:
columns_numeric = ["lat", "lng", "pop"]

Get the average `lat`, `lng`, and `pop` values:

In [None]:
# Your code goes here

Get the sum of `lat`, `lng`, and `pop` values:

In [None]:
# Your code goes here

Get the total number (count) of `lat`, `lng`, and `pop` values:

In [None]:
# Your code goes here
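
The three column-wise aggregations above might look like the following on a hypothetical frame, `stats_df`. One `lat` value is deliberately missing to show that `mean`, `sum`, and `count` skip `NaN` by default.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing latitude
stats_df = pd.DataFrame({
    "lat": [10.0, 20.0, 30.0, np.nan],
    "lng": [1.0, 2.0, 3.0, 4.0],
    "pop": [100, 200, 300, 400],
})
cols = ["lat", "lng", "pop"]

averages = stats_df[cols].mean()   # per-column mean, NaN skipped
totals = stats_df[cols].sum()      # per-column sum, NaN skipped
counts = stats_df[cols].count()    # per-column number of non-missing values
print(counts)
```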

Get the 75th percentile of `lat`, `lng`, and `pop` values:

In [None]:
# Your code goes here
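
Percentiles can be computed with `quantile`, which takes a fraction between 0 and 1. This sketch uses a hypothetical frame, `q_df`, and the default linear interpolation.

```python
import pandas as pd

# Hypothetical data with five evenly spaced values per column
q_df = pd.DataFrame({
    "lat": [0.0, 10.0, 20.0, 30.0, 40.0],
    "lng": [1.0, 2.0, 3.0, 4.0, 5.0],
    "pop": [100, 200, 300, 400, 500],
})

# 0.75 corresponds to the 75th percentile of each column
p75 = q_df[["lat", "lng", "pop"]].quantile(0.75)
print(p75)
```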

Get the sum of each row:

In [None]:
# Your code goes here
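
Row-wise aggregation uses `axis=1`. The sketch below assumes the sum is taken over numeric columns only and uses a hypothetical frame, `rows_df`.

```python
import pandas as pd

# Hypothetical numeric data
rows_df = pd.DataFrame({
    "lat": [1.0, 2.0],
    "lng": [10.0, 20.0],
    "pop": [100, 200],
})

# axis=1 sums across the columns, giving one value per row
row_sums = rows_df.sum(axis=1)
print(row_sums)
```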

Calculate the most important statistics for the numerical data in one go, so that individual functions are not needed:

In [None]:
# Your code goes here
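
For numeric columns, `describe` reports the count, mean, standard deviation, minimum, quartiles, and maximum in a single call. A minimal sketch on a hypothetical frame, `small_df`:

```python
import pandas as pd

# Hypothetical numeric data
small_df = pd.DataFrame({
    "lat": [10.0, 20.0, 30.0],
    "pop": [100, 200, 300],
})

# One row per statistic, one column per numeric column
summary = small_df.describe()
print(summary)
```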

## Conclusion

In this module, you've learned how to:

- Convert different data formats to Pandas DataFrames
- Access and manipulate data using Pandas
- Handle missing data
- Perform basic statistical analysis on datasets
- Use various Pandas functions for data exploration and manipulation

These skills form a foundation for more advanced data analysis and machine learning tasks using Python and Pandas.

## Clean up

Remember to shut down your Jupyter notebook kernel when you're done to free up resources.