## Introduction to Pandas

Pandas is an open-source Python library for data analysis and manipulation. Its two core data structures, the one-dimensional Series and the two-dimensional DataFrame, store labeled data and support fast, vectorized operations on it.

In [None]:
import pandas as pd
print(pd.__version__)

## Creating a Series

A Series is a one-dimensional labeled array capable of holding any data type. When no index is given, pandas assigns the integer labels 0 through n-1.

In [None]:
import pandas as pd
s = pd.Series([1, 3, 5, 7, 9])
print(s)
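
The labels of a Series do not have to be integers. As a minimal sketch, assuming the hypothetical labels 'a', 'b', and 'c' (they are not part of the original example), the cell below shows label-based and position-based access and how arithmetic keeps the labels.

```python
import pandas as pd

# Hypothetical labels chosen for illustration; any hashable values can serve as an index.
labeled = pd.Series([1, 3, 5], index=['a', 'b', 'c'])

# .loc looks a value up by label, .iloc by integer position.
by_label = labeled.loc['b']
by_position = labeled.iloc[0]

# Arithmetic is applied element-wise and the labels are preserved.
doubled = labeled * 2
print(doubled)
```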

## Creating a DataFrame

A DataFrame is a two-dimensional labeled data structure with columns of potentially different types.

In [2]:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
df

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
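
To illustrate "columns of potentially different types", the sketch below builds a DataFrame from a list of row dictionaries instead of a dictionary of columns, then inspects its shape and per-column dtypes. The Height column and its values are made up for this example.

```python
import pandas as pd

# Hypothetical records for illustration: one dict per row.
records = [
    {'Name': 'Alice', 'Age': 25, 'Height': 1.65},
    {'Name': 'Bob', 'Age': 30, 'Height': 1.80},
]
people = pd.DataFrame(records)

print(people.shape)   # (number of rows, number of columns)
print(people.dtypes)  # one dtype per column
```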


## Reading a CSV File

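The read_csv function loads a CSV file into a DataFrame, and head() returns its first five rows by default. Pandas has similar readers for other formats, such as Excel and JSON.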

In [2]:
import pandas as pd
df = pd.read_csv('./data.csv')
df.head()

   car_ID  symboling                   CarName  fueltype  aspiration  doornumber      carbody  drivewheel  enginelocation  wheelbase  ...  enginesize  fuelsystem  boreratio  stroke  compressionratio  horsepower  peakrpm  citympg  highwaympg    price
0       1          3        alfa-romero giulia       gas         std         two  convertible         rwd           front       88.6  ...         130        mpfi       3.47    2.68               9.0         111     5000       21          27  13495.0
1       2          3       alfa-romero stelvio       gas         std         two  convertible         rwd           front       88.6  ...         130        mpfi       3.47    2.68               9.0         111     5000       21          27  16500.0
2       3          1  alfa-romero Quadrifoglio       gas         std         two    hatchback         rwd           front       94.5  ...         152        mpfi       2.68    3.47               9.0         154     5000       19          26  16500.0
3       4          2               audi 100 ls       gas         std        four        sedan         fwd           front       99.8  ...         109        mpfi       3.19    3.40              10.0         102     5500       24          30  13950.0
4       5          2                audi 100ls       gas         std        four        sedan         4wd           front       99.4  ...         136        mpfi       3.19    3.40               8.0         115     5500       18          22  17450.0
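
The cell above depends on a local data.csv file. As a self-contained sketch, the example below reads a small, made-up CSV string through io.StringIO instead of a file, and uses the index_col and usecols parameters of read_csv to choose the row index and the columns to load. An empty field is read as NaN.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk; the last price is left empty.
csv_text = "car_ID,fueltype,price\n1,gas,13495.0\n2,gas,16500.0\n3,diesel,\n"

cars = pd.read_csv(
    io.StringIO(csv_text),                    # any file-like object or path works here
    usecols=['car_ID', 'fueltype', 'price'],  # load only these columns
    index_col='car_ID',                       # use car_ID as the row index
)
print(cars)
```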


## Basic DataFrame Operations

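A DataFrame supports viewing, selecting, filtering, and modifying data. Two quick ways to inspect one are describe(), which returns summary statistics for the numeric columns, and info(), which prints the index range, column dtypes, non-null counts, and memory usage.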

In [3]:
import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
print(df.describe())
# info() prints its report directly and returns None, so it is not wrapped in print()
df.info()

         A    B
count  3.0  3.0
mean   2.0  5.0
std    1.0  1.0
min    1.0  4.0
25%    1.5  4.5
50%    2.0  5.0
75%    2.5  5.5
max    3.0  6.0
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   A       3 non-null      int64
 1   B       3 non-null      int64
dtypes: int64(2)
memory usage: 180.0 bytes
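
The section text also mentions selecting and modifying data. A minimal sketch of both, reusing the same two-column frame, might look like this; the derived column name C is an assumption made for the example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

col_a = df['A']          # a single column, returned as a Series
first_two = df.iloc[:2]  # the first two rows, selected by position
cell = df.loc[1, 'B']    # one value, selected by row label and column name

# assign() returns a new DataFrame with the extra column; df itself is unchanged.
with_sum = df.assign(C=df['A'] + df['B'])
print(with_sum)
```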


## Filtering Data

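To filter rows, index a DataFrame with a Boolean Series. A comparison such as df['Age'] > 25 yields True or False for each row, and only the rows marked True are kept.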

In [None]:
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
filtered_df = df[df['Age'] > 25]
print(filtered_df)
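
Conditions can also be combined. The sketch below, on the same data, shows three common variants: joining conditions with &, testing membership with isin, and writing the condition as a string with query.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
between = df[(df['Age'] > 25) & (df['Age'] < 35)]

# isin() keeps rows whose value appears in the given list.
selected = df[df['Name'].isin(['Alice', 'Charlie'])]

# query() expresses a filter as a string that refers to column names.
older = df.query('Age >= 30')

print(between)
print(selected)
print(older)
```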

## Handling Missing Data

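Missing data is a common issue in real-world datasets. Pandas stores a missing numeric value as NaN (a None in a numeric column is converted to NaN), and fillna replaces each missing value with the value you supply.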

In [None]:
import pandas as pd
data = {'A': [1, 2, None], 'B': [4, None, 6]}
df = pd.DataFrame(data)
print(df.fillna(0))
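
Filling with zero is only one option. As a sketch on the same data, the cell below counts the missing values per column, drops incomplete rows, and fills each gap with its column mean instead. Which strategy is appropriate depends on the dataset.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, None], 'B': [4, None, 6]})

# isna() marks missing cells; summing the Booleans counts them per column.
missing_per_column = df.isna().sum()

# dropna() removes every row that contains at least one missing value.
complete_rows = df.dropna()

# fillna() also accepts a Series; here each column is filled with its own mean.
filled_with_mean = df.fillna(df.mean())

print(missing_per_column)
print(complete_rows)
print(filled_with_mean)
```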

## Grouping and Aggregation

The groupby method splits the rows into groups that share a key value, and an aggregation such as sum() then reduces each group to a single row.

In [None]:
import pandas as pd
data = {'Category': ['A', 'B', 'A', 'B'], 'Values': [10, 20, 30, 40]}
df = pd.DataFrame(data)
print(df.groupby('Category').sum())
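
More than one aggregation can be computed in a single pass. A minimal sketch, using the same data, passes a list of function names to agg so that each becomes a column of the result.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'A', 'B'], 'Values': [10, 20, 30, 40]})

# Select the column to aggregate, then request several statistics per group.
summary = df.groupby('Category')['Values'].agg(['sum', 'mean', 'count'])
print(summary)
```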

## Merging and Joining DataFrames

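The pd.merge function combines two DataFrames on a shared key column, much like a SQL join. With how='inner', only keys present in both frames are kept, so IDs 3 and 4 do not appear in the result below.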

In [4]:
import pandas as pd
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Age': [25, 30, 40]})
merged_df = pd.merge(df1, df2, on='ID', how='inner')
print(merged_df)

   ID   Name  Age
0   1  Alice   25
1   2    Bob   30
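
Other join types keep the unmatched rows. As a sketch with the same two frames, how='outer' keeps every ID from either side and how='left' keeps every row of df1. The indicator=True option adds a _merge column that records where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Charlie']})
df2 = pd.DataFrame({'ID': [1, 2, 4], 'Age': [25, 30, 40]})

# Outer join: all IDs from both frames; missing fields become NaN.
outer = pd.merge(df1, df2, on='ID', how='outer', indicator=True)

# Left join: every row of df1, with Age filled in where an ID matches.
left = pd.merge(df1, df2, on='ID', how='left')

print(outer)
print(left)
```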


## Pivot Tables

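A pivot table summarizes data by grouping the rows on the index column and applying aggfunc to the values column. If aggfunc is omitted, pivot_table computes the mean.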

In [None]:
import pandas as pd
data = {'Category': ['A', 'A', 'B', 'B'], 'Values': [10, 20, 30, 40]}
df = pd.DataFrame(data)
pivot = df.pivot_table(values='Values', index='Category', aggfunc='sum')
print(pivot)
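
With only an index, a pivot table gives the same result as a groupby. Its added value shows when a second key is spread across the columns. The sketch below assumes a hypothetical Region column, which is not part of the original data, to show this.

```python
import pandas as pd

# Region is a hypothetical second key added for illustration.
df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B'],
    'Region': ['East', 'West', 'East', 'West'],
    'Values': [10, 20, 30, 40],
})

# One row per Category, one column per Region, each cell the sum of Values.
pivot = df.pivot_table(values='Values', index='Category', columns='Region', aggfunc='sum')
print(pivot)
```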

## Exporting Data

DataFrames can be exported to formats such as CSV, Excel, and JSON with to_csv, to_excel, and to_json. Passing index=False to to_csv omits the row index from the output file.

In [5]:
import pandas as pd
data = {'Name': ['Alice', 'Bob'], 'Age': [25, 30]}
df = pd.DataFrame(data)
df.to_csv('output.csv', index=False)
print('CSV file saved.')

CSV file saved.
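
When no path is given, to_csv and to_json return the serialized text instead of writing a file. The sketch below uses that to show both formats without touching the disk and to check that the CSV text reads back into an equivalent DataFrame.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# With no path argument, the exporters return a string.
csv_text = df.to_csv(index=False)
json_text = df.to_json(orient='records')  # a JSON list with one object per row

# Read the CSV text back to confirm the round trip.
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(json_text)
```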
