# 1. Introduction to Pandas
Pandas is a Python library for data manipulation and analysis. It provides two main data structures:

*   Series: 1D labeled array (like a column in a table).
*   DataFrame: 2D labeled data structure (like a table with rows and columns).

# 2. Installation
If you haven't installed Pandas, you can do so using pip:

`pip install pandas`

# 3. Importing Pandas
Start by importing the library:


In [1]:
import pandas as pd

# 4. Pandas Series
A Series is a one-dimensional, array-like object in which every value is paired with an index label.

## **a. Creating a Series**

In [2]:
data = [10, 20, 30, 40]
s = pd.Series(data)
print(s)

0    10
1    20
2    30
3    40
dtype: int64


## **b. Accessing Elements**

In [3]:
print(s[0])  # Output: 10

10
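With the default integer index, `s[0]` looks up the label `0`, which happens to match the first position. The sketch below assumes a Series with string labels (the labels `a` to `d` are made up for illustration) to show the difference between label-based access with `.loc` and position-based access with `.iloc`.

```python
import pandas as pd

# Hypothetical string labels, chosen only for illustration.
s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

first_by_label = s.loc["a"]      # label-based lookup
first_by_position = s.iloc[0]    # position-based lookup
middle = s.iloc[1:3]             # positional slice, end position excluded

print(first_by_label, first_by_position)
print(middle.tolist())
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the index labels are themselves integers.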


## **c. Series Attributes**

In [4]:
print(s.index)  # Output: RangeIndex(start=0, stop=4, step=1)
print(s.values) # Output: [10 20 30 40]

RangeIndex(start=0, stop=4, step=1)
[10 20 30 40]


# 5. Pandas DataFrame
A DataFrame is a 2D data structure with rows and columns.

## **a. Creating a DataFrame**

In [5]:
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
df = pd.DataFrame(data)
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago


## **b. Accessing Columns**

In [6]:
print(df['Name'])  # Access the 'Name' column

0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object


## **c. Accessing Rows**

In [8]:
print(df.iloc[0])  # Access the first row

Name       Alice
Age           25
City    New York
Name: 0, dtype: object
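Rows can also be selected by index label with `.loc`, and rows and columns can be combined in one lookup. A minimal sketch, reusing the sample data from this section:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

row_by_label = df.loc[1]             # row whose index label is 1
first_two = df.iloc[:2]              # first two rows by position
bob_city = df.loc[1, "City"]         # single cell: row label 1, column "City"
name_and_age = df[["Name", "Age"]]   # several columns at once (returns a DataFrame)

print(bob_city)
print(name_and_age.shape)
```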


## **d. DataFrame Attributes**

In [9]:
print(df.columns)  # Output: Index(['Name', 'Age', 'City'], dtype='object')
print(df.index)    # Output: RangeIndex(start=0, stop=3, step=1)
print(df.shape)    # Output: (3, 3)
print(df.dtypes)   # Output: Name: object, Age: int64, City: object

Index(['Name', 'Age', 'City'], dtype='object')
RangeIndex(start=0, stop=3, step=1)
(3, 3)
Name    object
Age      int64
City    object
dtype: object


# 6. Basic Operations
## **a. Reading Data**

In [10]:
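# Both files must exist in the working directory; otherwise pandas raises
# FileNotFoundError, as in the output below.
df = pd.read_csv('data.csv')  # Read a CSV file
df = pd.read_excel('data.xlsx')  # Read an Excel file (.xlsx needs the openpyxl package); this overwrites df

FileNotFoundError: [Errno 2] No such file or directory: 'data.csv'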
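To try `read_csv` without a file on disk, the CSV text can be supplied through an in-memory buffer. This is only a sketch: the CSV content below is an assumption that mirrors the sample table used earlier, not the contents of any real `data.csv`.

```python
import io

import pandas as pd

# Assumed CSV content, mirroring the Name/Age/City sample from earlier.
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Charlie,35,Chicago
"""

# read_csv accepts any file-like object, not only a path.
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)
print(df.dtypes)
```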

## **b. Viewing Data**


In [None]:
print(df.head())  # First 5 rows
print(df.tail())  # Last 5 rows
df.info()  # Summary of the DataFrame (prints directly and returns None, so no print() is needed)
print(df.describe())  # Statistical summary

## **c. Filtering Data**

In [None]:
# Filter rows where Age > 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)
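Several conditions can be combined in one filter. A short sketch on the sample data, assuming the same `Name`, `Age`, and `City` columns; each condition needs its own parentheses, and `&` / `|` are used instead of `and` / `or`.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["New York", "Los Angeles", "Chicago"],
})

# Rows where Age > 25 AND City is not Chicago.
older_not_chicago = df[(df["Age"] > 25) & (df["City"] != "Chicago")]

# Rows whose City is one of several values.
in_ny_or_la = df[df["City"].isin(["New York", "Los Angeles"])]

print(older_not_chicago)
print(in_ny_or_la)
```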

## **d. Adding a Column**

In [None]:
df['Salary'] = [50000, 60000, 70000]  # The list length must equal the number of rows (3 here)
print(df)

## **e. Dropping a Column**

In [None]:
df = df.drop('Salary', axis=1)
print(df)

# 7. Advanced Operations
## **a. Grouping Data**

In [None]:
grouped = df.groupby('City')['Age'].mean()  # Mean age per city; select the numeric column first, since text columns cannot be averaged
print(grouped)
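Grouping is more informative when a group holds more than one row. The sketch below assumes a small made-up table in which two people share each city, and computes several aggregates at once with `agg`.

```python
import pandas as pd

# Hypothetical data: two people per city so each group has more than one row.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],
    "Age": [25, 30, 35, 45],
    "City": ["New York", "Chicago", "Chicago", "New York"],
})

# One row per city, one column per aggregate.
summary = df.groupby("City")["Age"].agg(["mean", "max", "count"])
print(summary)
```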

## **b. Combining DataFrames**


In [None]:
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})
merged_df = pd.concat([df1, df2])  # Stacks the rows; the original index labels (0, 1, 0, 1) are kept
print(merged_df)
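`pd.concat` stacks frames on top of each other. Joining two frames on a shared key column, in the style of a SQL join, is done with `pd.merge`. A minimal sketch, assuming two made-up tables that share a `Name` column:

```python
import pandas as pd

# Hypothetical tables sharing the key column "Name".
people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]})
salaries = pd.DataFrame({"Name": ["Alice", "Bob", "Dana"], "Salary": [50000, 60000, 70000]})

# Inner join: only names present in both tables.
inner = pd.merge(people, salaries, on="Name", how="inner")

# Left join: every row of `people`; missing salaries become NaN.
left = pd.merge(people, salaries, on="Name", how="left")

print(inner)
print(left)
```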

## **c. Pivot Tables**


In [None]:
pivot = df.pivot_table(values='Age', index='City', aggfunc='mean')
print(pivot)
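A pivot table can also spread a second category across the columns. The sketch below assumes a hypothetical `Dept` column, which is not part of the sample data, to show the `columns` parameter.

```python
import pandas as pd

# Hypothetical data; "Dept" is an assumed column added for illustration.
df = pd.DataFrame({
    "City": ["New York", "New York", "Chicago", "Chicago"],
    "Dept": ["Sales", "IT", "Sales", "IT"],
    "Age": [25, 35, 30, 40],
})

# Rows are cities, columns are departments, cells hold the mean age.
pivot = df.pivot_table(values="Age", index="City", columns="Dept", aggfunc="mean")
print(pivot)
```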

## **d. Handling Missing Data**


In [None]:
df['Age'] = df['Age'].fillna(0)  # Fill missing ages with 0 (assign back; inplace on a selected column may not update df)
df = df.dropna()  # Drop rows that still contain missing values
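Before filling or dropping, it usually helps to count the missing values first. A short sketch on made-up data with one missing age; filling with the column mean is just one possible choice, not a recommendation.

```python
import pandas as pd

# Hypothetical data with one missing Age.
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, float("nan"), 35],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Fill the gap with the mean of the known ages (25 and 35).
df["Age"] = df["Age"].fillna(df["Age"].mean())
print(df)
```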

## **e. Applying Functions**


In [None]:
df['Age'] = df['Age'].apply(lambda x: x + 1)  # Increment age by 1
print(df)
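For simple arithmetic, a vectorized expression gives the same result as `apply` and is usually faster. `apply` with `axis=1` remains useful when a function needs several columns of the same row. A sketch on the sample data; the `Label` column is a made-up name for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

via_apply = df["Age"].apply(lambda x: x + 1)   # element by element
vectorized = df["Age"] + 1                     # whole column at once

# Row-wise apply: build a label from two columns ("Label" is a hypothetical name).
df["Label"] = df.apply(lambda row: f"{row['Name']} ({row['Age']})", axis=1)

print(via_apply.equals(vectorized))
print(df["Label"].tolist())
```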

# 8. Time Series Operations
Pandas has built-in support for time series data, including generating date ranges and resampling values by time period.

## **a. Creating a Date Range**

In [None]:
dates = pd.date_range('20230101', periods=6)  # Six consecutive days from 2023-01-01 (daily frequency is the default)
df = pd.DataFrame({'Date': dates, 'Value': [1, 2, 3, 4, 5, 6]})
print(df)

## **b. Resampling**


In [None]:
df.set_index('Date', inplace=True)
resampled = df.resample('ME').mean()  # Resample by month end ('ME' in pandas 2.2+; older versions use 'M')
print(resampled)
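Resampling is easier to see when the data spans more than one period. The sketch below assumes four made-up daily values that straddle the end of January and sums them per month. It uses the month-start alias `'MS'`, so each result row is labeled with the first day of its month.

```python
import pandas as pd

# Hypothetical daily values: Jan 30, Jan 31, Feb 1, Feb 2.
dates = pd.date_range("2023-01-30", periods=4)
df = pd.DataFrame({"Value": [1, 2, 3, 4]}, index=dates)

# One row per calendar month, labeled by the first day of the month.
monthly = df.resample("MS").sum()
print(monthly)
```

January collects the values 1 and 2, and February collects 3 and 4.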

# 9. Input/Output
## Saving Data

In [None]:
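# index=False leaves the index out of the file; omit it when the index holds data (such as the Date index set earlier)
df.to_csv('output.csv', index=False)  # Save to CSV
df.to_excel('output.xlsx', index=False)  # Save to Excel (.xlsx needs the openpyxl package)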
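A quick way to check what a saved file will contain is to write to an in-memory buffer and read it back. This sketch assumes a small date-indexed frame like the one from the time series section and keeps the index, since it holds the dates.

```python
import io

import pandas as pd

# Small date-indexed frame, similar to the time series example.
df = pd.DataFrame({
    "Date": pd.date_range("20230101", periods=3),
    "Value": [1, 2, 3],
}).set_index("Date")

buffer = io.StringIO()
df.to_csv(buffer)          # index is written because it carries the dates
buffer.seek(0)             # rewind before reading

# Restore the Date column as a parsed datetime index.
restored = pd.read_csv(buffer, index_col="Date", parse_dates=True)
print(restored)
```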