# Pandas Fundamentals

Pandas is a Python library for data manipulation and analysis. Its two core data structures, the Series and the DataFrame, hold labeled one-dimensional and two-dimensional data. This notebook introduces the fundamental concepts of Pandas and walks through creating and manipulating both structures.

## 1. Introduction to Pandas
To use Pandas, import the library. By convention it is aliased as `pd`:

In [1]:
import pandas as pd

## 2. Series
A Series is a one-dimensional, labeled array that can hold any data type (integers, strings, floats, etc.). It is similar to a single column in a spreadsheet. If you do not supply an index, Pandas labels the values with integers starting at 0.
Let's create a simple Series:

In [None]:
# Creating a Series
s = pd.Series([10, 20, 30, 40, 50])
print(s)

You can also create a Series with custom indices:

In [None]:
# Creating a Series with custom indices
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])
print(s)
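
Once a Series has labels, values can be looked up either by label or by position. The short sketch below rebuilds the same Series so that it runs on its own; the variable names `by_label`, `by_position`, and `doubled` are illustrative only.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

by_label = s.loc['b']      # look up by index label
by_position = s.iloc[1]    # look up by integer position (0-based)
doubled = s * 2            # arithmetic is applied element-wise; the index is kept

print(by_label, by_position)
print(doubled)
```

Using `.loc` and `.iloc` explicitly avoids ambiguity between labels and positions, which matters most when the index itself consists of integers.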

## 3. DataFrame
A DataFrame is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns).
Here’s how to create a DataFrame:

In [None]:
# Creating a DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
print(df)
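
After building a DataFrame it is often useful to check its size and column types before doing anything else. The following is a minimal sketch that recreates the same data so it can run independently.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)

print(df.shape)    # tuple of (number of rows, number of columns)
print(df.dtypes)   # data type of each column
print(df.head(2))  # first two rows
```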

## 4. Basic Operations on DataFrame
You can perform a variety of operations on a DataFrame. Here are a few basic ones:

- Accessing columns:

In [None]:
# Accessing a column
print(df['Name'])

- Adding a new column:

In [None]:
# Adding a new column
df['Salary'] = [50000, 60000, 70000, 80000]
print(df)

- Filtering data:

In [None]:
# Filtering rows based on a condition
print(df[df['Age'] > 30])
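
Filters can also combine several conditions. The sketch below rebuilds the DataFrame, including the `Salary` column, so that it runs on its own; `mask` and `result` are illustrative names. Each condition needs its own parentheses, and the element-wise operators `&` and `|` are used instead of Python's `and` and `or`.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David'],
                   'Age': [25, 30, 35, 40],
                   'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
                   'Salary': [50000, 60000, 70000, 80000]})

# Boolean mask: True for rows where both conditions hold
mask = (df['Age'] > 30) & (df['Salary'] < 80000)

# .loc selects the matching rows and, optionally, a subset of columns
result = df.loc[mask, ['Name', 'Age']]
print(result)
```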

## 5. Handling Missing Data
Pandas makes it easy to handle missing data. The `isna()` method detects missing values and returns a Boolean mask of the same shape, and the `fillna()` method replaces missing values with a value you choose.

In [None]:
# Creating a DataFrame with missing values (None in a numeric column is stored as NaN)
data_with_nan = {'A': [1, 2, None], 'B': [None, 5, 6], 'C': [7, 8, 9]}
df_nan = pd.DataFrame(data_with_nan)
print(df_nan)

In [None]:
# Filling missing values
df_filled = df_nan.fillna(0)
print(df_filled)
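
The `isna()` method is described in this section but not demonstrated in the cells above. Here is a minimal sketch of how it might be used to count missing values, together with `dropna()`, which is an alternative to filling when incomplete rows should be discarded instead. The data is recreated so the example runs on its own.

```python
import pandas as pd

df_nan = pd.DataFrame({'A': [1, 2, None], 'B': [None, 5, 6], 'C': [7, 8, 9]})

# Boolean DataFrame: True wherever a value is missing
missing_mask = df_nan.isna()
print(missing_mask)

# Summing the mask counts the missing values in each column
missing_per_column = missing_mask.sum()
print(missing_per_column)

# Drop every row that contains at least one missing value
df_dropped = df_nan.dropna()
print(df_dropped)
```

Whether to fill or drop depends on the analysis: filling with 0 keeps every row but can distort averages, while dropping keeps only complete rows.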

## 6. Grouping Data
Grouping splits a DataFrame into groups of rows that share the same value in one or more columns, so that each group can be aggregated separately. Use the `groupby()` method for this:

In [None]:
# Grouping data by a column
grouped = df.groupby('City')

# Calculating the mean for numeric columns only
print(grouped.mean(numeric_only=True))
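
In the DataFrame above every city appears only once, so each group holds a single row and the mean simply repeats that row's values. The sketch below uses a small hypothetical dataset (`people`, with repeated cities) to show aggregation across several rows; the data and the output column names `mean_age` and `total_salary` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data in which each city occurs more than once
people = pd.DataFrame({
    'City': ['New York', 'Chicago', 'New York', 'Chicago'],
    'Age': [25, 35, 30, 45],
    'Salary': [50000, 70000, 60000, 90000],
})

# Named aggregation: one output column per (input column, function) pair
summary = people.groupby('City').agg(
    mean_age=('Age', 'mean'),
    total_salary=('Salary', 'sum'),
)
print(summary)
```

Named aggregation makes it possible to apply a different function to each column and to give the results descriptive names in one step.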
