# Pandas Library: Introduction and Usage
pandas is an open-source Python library for manipulating and analyzing tabular and time-series data. Its two core data structures are the one-dimensional `Series` and the two-dimensional `DataFrame`.

## Installation
```bash
pip install pandas
```

## Importing Pandas
```python
import pandas as pd
```


## Series and DataFrame Creation
Create a `Series` or a `DataFrame` from lists, dicts, or NumPy arrays, then inspect the result with `head()`, `tail()`, `info()`, and `describe()`.

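```python
import numpy as np
import pandas as pd

# A Series built from a list; np.nan marks a missing value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])
print('Series:', s, sep='\n')
# A DataFrame built from a dict; scalar values are broadcast to every row.
df = pd.DataFrame({
    'A': 1.0,
    'B': pd.Timestamp('20230101'),
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(['test', 'train', 'test', 'train']),
    'F': 'foo',
})
print(df.head())      # first rows
print(df.tail())      # last rows
df.info()             # prints dtypes and non-null counts itself
print(df.describe())  # summary statistics
```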
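
The text above also mentions NumPy arrays as a source. A minimal sketch of that case follows; the names `values`, `frame`, and `prices` and the row and column labels are made up for illustration.

```python
import numpy as np
import pandas as pd

# A 2-D NumPy array becomes a DataFrame; rows and columns get explicit labels.
values = np.arange(6).reshape(3, 2)  # [[0, 1], [2, 3], [4, 5]]
frame = pd.DataFrame(values, index=['r0', 'r1', 'r2'], columns=['x', 'y'])

# A dict becomes a Series whose index is the dict's keys.
prices = pd.Series({'apple': 1.5, 'pear': 2.0})

print(frame)
print(prices)
```

Passing `index` and `columns` explicitly is optional; without them pandas falls back to integer labels starting at 0.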



## Indexing and Selection
Select data by label, position, or boolean indexing.

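```python
# Six rows of standard-normal random values, so the numbers differ on every run.
df = pd.DataFrame(np.random.randn(6, 4), columns=['A', 'B', 'C', 'D'])
print('First 3 rows:', df.head(3), sep='\n')
print('Select column A:', df['A'], sep='\n')
print('Select by label:', df.loc[0, 'B'])    # row label 0, column 'B'
print('Select by position:', df.iloc[0, 1])  # first row, second column
print('Boolean indexing:', df[df['A'] > 0], sep='\n')
```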
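
With a default integer index, label and position look the same. The sketch below uses string row labels to make the difference between `.loc` (labels) and `.iloc` (positions) visible. The `scores` table and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical small table with string row labels.
scores = pd.DataFrame(
    {'math': [90, 72, 85], 'art': [60, 88, 79]},
    index=['ann', 'bob', 'cy'],
)

by_label = scores.loc['bob', 'art']            # row label 'bob', column label 'art'
by_position = scores.iloc[1, 1]                # second row, second column: the same cell
label_slice = scores.loc['ann':'bob', 'math']  # label slices include both endpoints
high_math = scores.loc[scores['math'] > 80, ['math']]  # boolean mask plus a column list

print(by_label, by_position)
print(label_slice)
print(high_math)
```

Note that a label slice such as `'ann':'bob'` includes the end label, whereas a positional slice with `.iloc` excludes the end position.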


## Handling Missing Data
Detect, remove, or fill missing values.

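```python
df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [4, 5, np.nan]})
print('Original:', df, sep='\n')
# dropna() and fillna() return new DataFrames; df itself is unchanged.
print('Drop rows with NaN:', df.dropna(), sep='\n')
print('Fill NaN with 0:', df.fillna(0), sep='\n')
```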
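
The cell above covers removal and filling. Detection can be sketched with `isna()`, here on the same values as above; filling with the column mean is shown as one possible alternative to a constant, not as a recommendation.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': [1, np.nan, 3], 'B': [4, 5, np.nan]})

mask = data.isna()                           # True where a value is missing
missing_per_column = mask.sum()              # number of missing values in each column
filled_with_mean = data.fillna(data.mean())  # replace NaN with that column's mean

print(mask)
print(missing_per_column)
print(filled_with_mean)
```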


## Aggregation and GroupBy
Aggregate data and group by categories.

```python
df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B'], 'Value': [10, 20, 30, 50]})
print('Group by Category and sum:')
print(df.groupby('Category').sum())
```
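
A group can be summarized with several statistics at once. The sketch below reuses the same values and passes a list of function names to `agg()`; the names `sales` and `summary` are illustrative.

```python
import pandas as pd

sales = pd.DataFrame({'Category': ['A', 'B', 'A', 'B'], 'Value': [10, 20, 30, 50]})

# One row per category, one column per aggregation.
summary = sales.groupby('Category')['Value'].agg(['sum', 'mean', 'count'])
print(summary)
```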


## Merging and Joining
Combine multiple DataFrames.

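```python
df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'value': [4, 5, 6]})
# Both frames have a 'value' column, so the suffixes tell the two apart.
# how='outer' keeps keys found in either frame and fills the gaps with NaN.
merged = pd.merge(df1, df2, on='key', how='outer', suffixes=('_left', '_right'))
print('Merged DataFrame:')
print(merged)
```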
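
The `how` argument decides which keys survive a merge. As a rough comparison on the same two frames (renamed `left` and `right` here for clarity), the sketch below contrasts inner, left, and outer joins, and uses `indicator=True` to record where each row came from.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'value': [1, 2, 3]})
right = pd.DataFrame({'key': ['B', 'C', 'D'], 'value': [4, 5, 6]})
suffixes = ('_left', '_right')

# inner: only keys present in both frames
inner = pd.merge(left, right, on='key', how='inner', suffixes=suffixes)
# left: every key from the left frame; unmatched right-hand values become NaN
left_join = pd.merge(left, right, on='key', how='left', suffixes=suffixes)
# outer: keys from either frame; the _merge column records each row's origin
outer = pd.merge(left, right, on='key', how='outer', suffixes=suffixes, indicator=True)

print(inner)
print(left_join)
print(outer)
```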


## Reading Data from Files
Pandas can read data from many file formats, such as CSV and Excel.

```python
# Read from a CSV file
df_csv = pd.read_csv('mail_data.csv')  # mail_data.csv must exist in the working directory
print('CSV DataFrame:')
print(df_csv.head())
```
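
To try `read_csv` without a file on disk, the text can be wrapped in `io.StringIO`. In the sketch below the CSV content and its column names are made up for illustration and are not taken from `mail_data.csv`.

```python
import io

import pandas as pd

# Hypothetical CSV content held in memory instead of a file.
csv_text = """Category,Message,Length
ham,See you at noon,15
spam,Win a free prize now,20
"""

# read_csv accepts a path, a URL, or any file-like object.
df_demo = pd.read_csv(io.StringIO(csv_text))

# usecols limits parsing to the listed columns.
df_subset = pd.read_csv(io.StringIO(csv_text), usecols=['Category', 'Length'])

print(df_demo)
print(df_subset)
```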



## Saving Data to Files
You can save DataFrames to CSV, Excel, and other formats.

```python
# Save to a CSV file; index=False leaves the row index out of the file
df.to_csv('output.csv', index=False)

# Save to an Excel file (uncomment to use; needs an Excel engine such as openpyxl)
# df.to_excel('output.xlsx', index=False)
```
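
One way to check that a saved file holds what was intended is to read it back. The sketch below writes to a temporary directory so that nothing is left behind; the `table` frame and the variable names are illustrative.

```python
import os
import tempfile

import pandas as pd

table = pd.DataFrame({'Category': ['A', 'B'], 'Value': [10, 20]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.csv')
    table.to_csv(path, index=False)  # index=False: row labels are not written
    restored = pd.read_csv(path)     # read the file back before the directory is removed

print(restored)
```

Without `index=False`, the row index is written as an extra unnamed first column, which then reappears as a data column on reading.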


## Documentation
- [Pandas Documentation](https://pandas.pydata.org/docs/)
- [Pandas User Guide](https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html)


## Project: Using Pandas in AI
Analyze a dataset (e.g., Titanic) to find insights for a machine learning model.

### Example
```python
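import pandas as pd

# Load the Titanic dataset from GitHub (requires network access)
df = pd.read_csv('https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv')
# Compare mean age and fare between passengers who died (0) and those who survived (1)
print(df.groupby('Survived')[['Age', 'Fare']].mean())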
```
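
Before a table like this can feed a model, missing values and text columns usually need handling. The following is a rough sketch of that step on a few invented rows that imitate Titanic-style columns. The rows are not real passenger data, and median filling and one-hot encoding are only one reasonable choice.

```python
import numpy as np
import pandas as pd

# Hypothetical rows that mimic a few Titanic-style columns; not real passenger data.
passengers = pd.DataFrame({
    'Survived': [0, 1, 1, 0],
    'Sex': ['male', 'female', 'female', 'male'],
    'Age': [22.0, np.nan, 26.0, 35.0],
    'Fare': [7.25, 71.28, 7.92, 8.05],
})

# Fill the missing age with the median of the known ages.
passengers['Age'] = passengers['Age'].fillna(passengers['Age'].median())

# One-hot encode the text column so that every feature is numeric or boolean.
features = pd.get_dummies(passengers[['Sex', 'Age', 'Fare']], columns=['Sex'])
target = passengers['Survived']

print(features)
print(target)
```

`features` and `target` are then in the shape most machine learning libraries expect: one row per example, one column per input, and a separate label column.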
