# Introduction to Pandas

### Importing and Installing pandas

First you need to install and import the pandas library.

- To install the pandas library:
    `pip install pandas`  or  `conda install pandas`

- To import pandas under its conventional alias `pd`:

In [None]:
import pandas as pd

## Data Structures in Pandas

Pandas provides two main data structures: "Series" and "DataFrame".

- Series: A one-dimensional labeled array.
- DataFrame: A two-dimensional table of data with columns that can be of different data types.

### Series

Creating a series from a list

In [None]:
data = [10, 20, 30, 40, 50]

series = pd.Series(data)
print(series)

Creating a Series with a custom index

In [None]:
numb = [12, 65, 23, 84, 28, 94, 58]
ind = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
series_1 = pd.Series(numb, index = ind)
print(series_1)
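As a small illustration of what the custom index makes possible, the sketch below rebuilds the same Series and reads values by label and by position. The variable names `by_label`, `by_position`, and `label_slice` are introduced here for illustration only.

```python
import pandas as pd

# Same values and labels as the cell above
numb = [12, 65, 23, 84, 28, 94, 58]
ind = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
series_1 = pd.Series(numb, index=ind)

by_label = series_1.loc['c']          # label-based lookup
by_position = series_1.iloc[2]        # position-based lookup (third element)
label_slice = series_1.loc['b':'d']   # label slices include both endpoints

print(by_label, by_position)
print(label_slice)
```

Note that a label slice such as `'b':'d'` includes the end label, unlike a positional slice.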

### DataFrame

Creating a DataFrame

In [None]:
normal_df = { 'Name': ['Alice', 'Bob', 'Carol', 'David'],
      'Age': [54, 83, 23, 33]
      }

df = pd.DataFrame(normal_df)
print(df)

Specifying a custom index in a DataFrame

In [None]:
normal_1 = { 'Name': ['Alice', 'Bob', 'Carol', 'David'],
      'Age': [54, 83, 23, 33]
      }
ind_1 = ['person 1', 'person 2', 'person 3', 'person 4']

df_1 = pd.DataFrame(normal_1, index = ind_1)
print(df_1)

## Reading and Writing Data

Reading data from a CSV file into a DataFrame.

In [None]:
df = pd.read_csv('data/data.csv')
print(df)

Writing a DataFrame to a CSV file.

In [13]:
df.to_csv('data/new_data.csv', index=False)
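The two calls above can be tried end to end without an existing `data/` folder. The following is a minimal sketch that writes a small DataFrame to a temporary directory and reads it back; the file name `people.csv` and the names `people` and `restored` are assumptions made for this example.

```python
import os
import tempfile

import pandas as pd

# A small illustrative DataFrame (not the notebook's data file)
people = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [54, 83]})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'people.csv')  # hypothetical file name
    # index=False keeps the row index out of the file
    people.to_csv(csv_path, index=False)
    restored = pd.read_csv(csv_path)

print(restored)
```

Without `index=False`, the index would be written as an extra unnamed column and would reappear as a regular column when the file is read back.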

## Basic DataFrame Operations

### Basic DataFrame Information

Getting basic information about the DataFrame.

In [None]:
# Display the first five rows
print(df.head())

In [None]:
# Display the last five rows
print(df.tail())

In [None]:
# Display column names
print(df.columns)

In [None]:
# Display the data type of each column
print(df.dtypes)

In [14]:
# Display summary statistics
print(df.describe())

Accessing and selecting data from the DataFrame.

In [None]:
# Selecting a single column
ages = df['Age']

In [None]:
# Selecting rows by position with a slice (rows at positions 1 and 2)
rows = df[1:3]

In [None]:
# Select rows based on a condition
adult = df[df['Age']>=18]

In [None]:
# Select a row using loc (label-based indexing)
# The 'person 2' label exists on df_1, which was created with a custom index
row_loc = df_1.loc['person 2']
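To make the difference between label-based and position-based selection concrete, here is a short sketch that rebuilds `df_1` from the earlier cell, since the `person N` labels only exist on that DataFrame. The result variable names are illustrative.

```python
import pandas as pd

# Rebuild df_1 with its custom index so the example is self-contained
df_1 = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Carol', 'David'], 'Age': [54, 83, 23, 33]},
    index=['person 1', 'person 2', 'person 3', 'person 4'],
)

row_by_label = df_1.loc['person 2']       # select by index label
row_by_position = df_1.iloc[1]            # select by integer position
age_of_carol = df_1.loc['person 3', 'Age']  # one cell: row label, column name
older_names = df_1.loc[df_1['Age'] > 30, 'Name']  # boolean mask plus column

print(row_by_label)
print(older_names)
```

Here `loc` and `iloc` return the same row because `'person 2'` sits at position 1.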

## Data Manipulation with Pandas

Filtering data

In [None]:
# filtering rows based on conditions
adults = df[df['Age'] >= 18]

### Checking Missing Data

In [None]:
# Checking for missing values in the DataFrame
missing_values = df.isnull().sum()

### Handling Missing Data

In [None]:
# Fill missing ages with the column mean
# Assigning the result back avoids chained in-place modification of a column
df['Age'] = df['Age'].fillna(df['Age'].mean())
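The checking and filling steps can be seen together on a tiny example. This sketch assumes a hypothetical DataFrame `df_missing` with one missing age; it also shows `dropna()` as an alternative to filling.

```python
import pandas as pd

# Hypothetical data with one missing age
df_missing = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Carol', 'David'],
    'Age': [54, float('nan'), 23, 33],
})

# Count missing values per column
missing_counts = df_missing.isnull().sum()

# mean() skips NaN by default, so this averages 54, 23 and 33
mean_age = df_missing['Age'].mean()

# Option 1: fill the gap with the mean (returns a new DataFrame)
filled = df_missing.assign(Age=df_missing['Age'].fillna(mean_age))

# Option 2: drop rows where Age is missing
dropped = df_missing.dropna(subset=['Age'])

print(missing_counts)
print(filled)
```

Whether to fill or drop depends on the analysis; filling with the mean keeps every row but reduces the column's variance.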

## Merging and Joining DataFrames

### Concatenating DataFrames

In [15]:
# Combining multiple DataFrames vertically (axis=0) or horizontally (axis=1)
combined_df = pd.concat([df1, df2], axis=0)
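The cell above relies on `df1` and `df2` already existing. A minimal sketch with two hypothetical DataFrames shows what each `axis` value does:

```python
import pandas as pd

# Hypothetical inputs with the same columns
df1 = pd.DataFrame({'ID': [1, 2], 'Name': ['Alice', 'Bob']})
df2 = pd.DataFrame({'ID': [3, 4], 'Name': ['Carol', 'David']})

# axis=0 stacks rows; ignore_index=True renumbers the result 0..n-1
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

# axis=1 places the frames side by side, aligned on the row index
side_by_side = pd.concat([df1, df2], axis=1)

print(stacked)
print(side_by_side)
```

Without `ignore_index=True`, the stacked result would keep the original row labels (0, 1, 0, 1), which is often not what is wanted.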

### Merging DataFrames

In [None]:
# Merging DataFrames on a common column
merged_df = pd.merge(df1, df2, on='ID', how='inner')
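As a rough illustration of how the `how` argument changes the result, the sketch below merges two hypothetical DataFrames that share only some `ID` values. The column `Revenue` and the data are made up for the example.

```python
import pandas as pd

# Hypothetical inputs: IDs 2 and 3 appear in both frames
df1 = pd.DataFrame({'ID': [1, 2, 3], 'Name': ['Alice', 'Bob', 'Carol']})
df2 = pd.DataFrame({'ID': [2, 3, 4], 'Revenue': [200, 300, 400]})

# inner: keep only IDs present in both frames
inner = pd.merge(df1, df2, on='ID', how='inner')

# left: keep every row of df1; unmatched rows get NaN for Revenue
left = pd.merge(df1, df2, on='ID', how='left')

print(inner)
print(left)
```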

## Working with Date and Time

### Converting to date and time

In [None]:
# Converting a column to a DateTime format
df['Date'] = pd.to_datetime(df['Date'])

### Extracting Date Components

In [None]:
# Extracting components like day, month, year
df['Year'] = df['Date'].dt.year

df['Month'] = df['Date'].dt.month
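The conversion and extraction steps can be tried on a small column of date strings. This is a sketch under the assumption that the dates are in ISO `YYYY-MM-DD` form; `df_dates` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical date strings
df_dates = pd.DataFrame({'Date': ['2023-01-15', '2023-06-30', '2024-12-01']})

# Convert strings to datetime64 values so the .dt accessor becomes available
df_dates['Date'] = pd.to_datetime(df_dates['Date'])

# Extract individual components
df_dates['Year'] = df_dates['Date'].dt.year
df_dates['Month'] = df_dates['Date'].dt.month
df_dates['Day'] = df_dates['Date'].dt.day
df_dates['Weekday'] = df_dates['Date'].dt.day_name()

print(df_dates)
```

The `.dt` accessor only works on datetime columns, which is why the `pd.to_datetime` conversion has to come first.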

## Plotting with Pandas

### Line Plot

In [None]:
# Creating a line plot from data
df.plot(x='Year', y='Sales', kind='line')

### Bar Plot

In [None]:
# Creating a bar plot from data
df.plot(x='Category', y='Revenue', kind='bar')

## Reshaping and Pivot Tables

### Reshaping Data

In [None]:
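# Reshaping long data to wide form with pivot (melt and stack reshape in other directions)
# Each (Date, Category) pair must be unique, otherwise pivot raises an error
pivoted_df = df.pivot(index='Date', columns='Category', values='Revenue')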
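To show what the reshape actually produces, here is a minimal sketch using a hypothetical `sales` DataFrame. It pivots long data to wide form and then uses `melt` to go back to long form.

```python
import pandas as pd

# Hypothetical long-format data: one row per (Date, Category) pair
sales = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
    'Category': ['Books', 'Games', 'Books', 'Games'],
    'Revenue': [100, 150, 120, 90],
})

# Long to wide: one row per Date, one column per Category
wide = sales.pivot(index='Date', columns='Category', values='Revenue')

# Wide back to long: melt turns the category columns into rows again
long_again = wide.reset_index().melt(
    id_vars='Date', var_name='Category', value_name='Revenue'
)

print(wide)
print(long_again)
```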

### Pivot Tables

In [None]:
pivot_table = df.pivot_table(index='Category', values='Revenue', aggfunc='sum')
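Unlike `pivot`, `pivot_table` aggregates rows that share the same index value. The sketch below uses a hypothetical `sales` DataFrame with repeated categories and compares the result with an equivalent `groupby`.

```python
import pandas as pd

# Hypothetical data with repeated categories
sales = pd.DataFrame({
    'Category': ['Books', 'Games', 'Books', 'Games', 'Books'],
    'Revenue': [100, 150, 120, 90, 30],
})

# Sum Revenue per Category
totals = sales.pivot_table(index='Category', values='Revenue', aggfunc='sum')

# The same totals expressed with groupby, for comparison
grouped = sales.groupby('Category')['Revenue'].sum()

print(totals)
print(grouped)
```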

## Working with Text Data

### String Operations

In [None]:
# Applying string operations on text columns
df['Title_Length'] = df['Title'].str.len()
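The `.str` accessor offers many more vectorized string methods than `len()`. A short sketch on a hypothetical `books` DataFrame (the titles and the extra column names are made up for illustration):

```python
import pandas as pd

# Hypothetical text column
books = pd.DataFrame({'Title': ['Dune', 'Emma', 'Brave New World']})

# Each .str method is applied element by element
books['Title_Length'] = books['Title'].str.len()
books['Title_Upper'] = books['Title'].str.upper()
books['Has_New'] = books['Title'].str.contains('New')

print(books)
```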