# Pandas DataFrame Tutorial

This notebook covers essential operations and concepts for working with pandas DataFrames.

## Table of Contents
1. [Importing pandas and creating DataFrames](#1.-Importing-pandas-and-creating-DataFrames)
2. [Basic DataFrame operations](#2.-Basic-DataFrame-operations)
3. [Indexing and selection](#3.-Indexing-and-selection)
4. [Handling missing data](#4.-Handling-missing-data)
5. [Data manipulation](#5.-Data-manipulation)
6. [Grouping and aggregation](#6.-Grouping-and-aggregation)
7. [Merging and joining DataFrames](#7.-Merging-and-joining-DataFrames)
8. [Basic data visualization](#8.-Basic-data-visualization)

## 1. Importing pandas and creating DataFrames

In [1]:
import pandas as pd
import numpy as np

# Creating a DataFrame from a dictionary
data = {
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04']
}
df = pd.DataFrame(data)
print(df)

    Name  cash   birthdate
0   Rami    25  1990-01-01
1  Osama    30  1951-02-02
2   Nael    35  1892-03-03
3   Yara    28  1970-04-04


## 2. Basic DataFrame operations

In [None]:
# Display basic information about the DataFrame


In [None]:
# Display summary statistics
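
One possible way to fill in the two inspection cells above, sketched on a rebuilt copy of the tutorial's `df`. The variable name `summary` is only illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# info() prints the index range, column names, non-null counts and dtypes
df.info()

# describe() summarizes numeric columns (count, mean, std, min, quartiles, max)
summary = df.describe()
print(summary)
```

Only `cash` shows up in the summary here, because `describe()` looks at numeric columns by default.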


In [None]:
# Display the first few rows


In [None]:
# Display the last few rows
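
A minimal sketch for the first-rows and last-rows cells, assuming the same `df` as in section 1. The names `first_two` and `last_two` are illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# head(n) returns the first n rows; without an argument it returns up to 5
first_two = df.head(2)
print(first_two)

# tail(n) returns the last n rows
last_two = df.tail(2)
print(last_two)
```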


In [None]:
# Get column names


In [None]:
# Get data types of columns
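
The column-name and dtype cells could be completed along these lines. This is a sketch on a rebuilt copy of `df`; `column_names` is an illustrative name.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# df.columns is an Index object; tolist() turns it into a plain Python list
column_names = df.columns.tolist()
print(column_names)

# dtypes lists one dtype per column; 'birthdate' is still stored as text here
print(df.dtypes)
```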


In [None]:
# Convert the birthdate column from strings to datetime
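
The `birthdate` column is created from strings, so one plausible reading of this cell is a conversion with `pd.to_datetime`. The sketch below assumes that reading and the `YYYY-MM-DD` layout used in section 1.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# Parse the strings into datetime values; the explicit format is an assumption
# based on the sample data
df['birthdate'] = pd.to_datetime(df['birthdate'], format='%Y-%m-%d')
print(df.dtypes)

# Once converted, the .dt accessor exposes date parts such as the year
birth_years = df['birthdate'].dt.year.tolist()
print(birth_years)
```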


## 3. Indexing and selection

In [None]:
# Select a single column


In [None]:
# Select multiple columns
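
A short sketch of column selection for the two cells above, using a rebuilt copy of `df`. The names `names` and `subset` are illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# A single column label returns a Series
names = df['Name']
print(names)

# A list of labels returns a DataFrame with just those columns
subset = df[['Name', 'cash']]
print(subset)
```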



In [None]:
# Select rows by index
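
Row selection by index can mean position (`iloc`) or label (`loc`); the sketch below shows both on a rebuilt copy of `df`. Variable names are illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# iloc selects by integer position; a single position returns a Series
first_row = df.iloc[0]

# Position slices exclude the end, like Python lists: positions 1 and 2
by_position = df.iloc[1:3]

# loc selects by index label; label slices include both endpoints
by_label = df.loc[1:2]

print(first_row)
print(by_position)
print(by_label)
```

With the default integer index the two slices return the same rows, but `iloc[1:3]` and `loc[1:2]` are written differently because only `loc` includes its end label.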



In [None]:
# Select rows by condition
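
A sketch of boolean filtering on a rebuilt copy of `df`. The threshold of 27 and the variable names are arbitrary choices for illustration.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# A comparison yields a boolean Series; indexing with it keeps the True rows
over_27 = df[df['cash'] > 27]
print(over_27)

# Combine conditions with & (and) or | (or); each condition needs parentheses
combined = df[(df['cash'] > 27) & (df['Name'] != 'Nael')]
print(combined)
```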


In [None]:
# Select specific rows and columns
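
Rows and columns can be selected together by passing both to `loc` or `iloc`. A minimal sketch on a rebuilt copy of `df`, with illustrative names and an arbitrary threshold:

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# loc[row_selector, column_selector]: a boolean mask for rows, labels for columns
rows_and_cols = df.loc[df['cash'] >= 30, ['Name', 'cash']]
print(rows_and_cols)

# iloc[row_position, column_position] returns a single value
cell = df.iloc[0, 1]
print(cell)
```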


## 4. Handling missing data

In [2]:
# Create a DataFrame with missing values
df_missing = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12]
})

df_missing

     A    B   C
0  1.0  5.0   9
1  2.0  NaN  10
2  NaN  NaN  11
3  4.0  8.0  12


In [None]:
# Check for missing values
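
A sketch of how missing values might be checked, on a rebuilt copy of `df_missing`. The names `mask` and `missing_per_column` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df_missing = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12],
})

# isna() returns a same-shaped DataFrame of booleans (True where a value is missing)
mask = df_missing.isna()
print(mask)

# Summing the booleans counts the missing values in each column
missing_per_column = mask.sum()
print(missing_per_column)
```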


In [None]:
# Drop rows with missing values
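
Dropping incomplete rows can be sketched with `dropna`; the column-wise variant is included for comparison. Variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df_missing = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12],
})

# dropna() removes every row that contains at least one missing value
complete_rows = df_missing.dropna()
print(complete_rows)

# axis=1 removes columns that contain missing values instead
complete_cols = df_missing.dropna(axis=1)
print(complete_cols)
```

Both calls return new DataFrames and leave `df_missing` unchanged.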


In [None]:
# Fill missing values
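
Two common fill strategies, sketched on a rebuilt copy of `df_missing`: a constant and the per-column mean. Which one suits real data depends on the analysis, so treat both as illustrations.

```python
import numpy as np
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df_missing = pd.DataFrame({
    'A': [1, 2, np.nan, 4],
    'B': [5, np.nan, np.nan, 8],
    'C': [9, 10, 11, 12],
})

# Replace every missing value with a constant
filled_zero = df_missing.fillna(0)
print(filled_zero)

# Replace missing values with the mean of their own column
# (mean() skips missing values by default)
filled_mean = df_missing.fillna(df_missing.mean())
print(filled_mean)
```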


## 5. Data manipulation

In [3]:
# Add a new column
df['Country'] = ['JOR', 'JOR', 'JOR', 'JOR']
df


    Name  cash   birthdate  Country
0   Rami    25  1990-01-01      JOR
1  Osama    30  1951-02-02      JOR
2   Nael    35  1892-03-03      JOR
3   Yara    28  1970-04-04      JOR


In [None]:
# Rename columns
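
A sketch of renaming with `DataFrame.rename`, on a rebuilt copy of `df` including the `Country` column. The new column names are arbitrary examples.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame (with Country) so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
    'Country': ['JOR', 'JOR', 'JOR', 'JOR'],
})

# rename() takes an {old_name: new_name} mapping and returns a new DataFrame
renamed = df.rename(columns={'cash': 'Cash', 'birthdate': 'Birthdate'})
print(renamed.columns.tolist())
```

To keep the change, assign the result back to `df`.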



In [None]:
# Sort DataFrame
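
Sorting can be sketched with `sort_values`. The choice of sort columns and the names `by_cash` and `by_name` are illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# Sort by a numeric column, largest first; the original index labels are kept
by_cash = df.sort_values('cash', ascending=False)
print(by_cash)

# Sort alphabetically and renumber the index from 0
by_name = df.sort_values('Name').reset_index(drop=True)
print(by_name)
```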


In [None]:
# Apply a function to a column
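
A minimal sketch of `Series.apply`, on a rebuilt copy of `df`. The new column names `name_length` and `cash_doubled` are hypothetical and not part of the tutorial's data.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df = pd.DataFrame({
    'Name': ['Rami', 'Osama', 'Nael', 'Yara'],
    'cash': [25, 30, 35, 28],
    'birthdate': ['1990-01-01', '1951-02-02', '1892-03-03', '1970-04-04'],
})

# apply() calls the function once per element; here the built-in len
df['name_length'] = df['Name'].apply(len)

# A lambda works the same way
df['cash_doubled'] = df['cash'].apply(lambda value: value * 2)

print(df[['Name', 'name_length', 'cash', 'cash_doubled']])
```

For simple arithmetic, the vectorized form `df['cash'] * 2` gives the same result and is usually faster than `apply`.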


## 6. Grouping and aggregation

In [4]:
# Create a larger DataFrame for grouping
df_large = pd.DataFrame({
    'Category': ['red ball', 'blue ball', 'red ball', 'blue ball', 'red ball', 'blue ball'],
    'Value': [10, 21, 13, 34, 18, 27]
})
df_large

    Category  Value
0   red ball     10
1  blue ball     21
2   red ball     13
3  blue ball     34
4   red ball     18
5  blue ball     27


In [None]:
# Group by category and calculate sum
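
A sketch of the group-and-sum step on a rebuilt copy of `df_large`. The name `totals` is illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df_large = pd.DataFrame({
    'Category': ['red ball', 'blue ball', 'red ball', 'blue ball', 'red ball', 'blue ball'],
    'Value': [10, 21, 13, 34, 18, 27],
})

# Split the rows by Category, then sum the Value column within each group
totals = df_large.groupby('Category')['Value'].sum()
print(totals)
```

The result is a Series indexed by category, with the group keys sorted alphabetically by default.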


In [None]:
# Group by category and calculate multiple aggregations
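
Several aggregations can be computed in one pass with `agg`. The sketch below picks sum, mean and max as examples; the name `stats` is illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this sketch runs on its own
df_large = pd.DataFrame({
    'Category': ['red ball', 'blue ball', 'red ball', 'blue ball', 'red ball', 'blue ball'],
    'Value': [10, 21, 13, 34, 18, 27],
})

# agg() with a list of function names yields one output column per function
stats = df_large.groupby('Category')['Value'].agg(['sum', 'mean', 'max'])
print(stats)
```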


## 7. Merging and joining DataFrames

In [5]:
# Create two DataFrames to merge
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['Osama', 'Yara', 'Nael', 'Rami']})
df2 = pd.DataFrame({'ID': [2, 3, 4, 5], 'Salary': [50000, 60000, 70000, 80000]})

display(df1)
display(df2)

   ID   Name
0   1  Osama
1   2   Yara
2   3   Nael
3   4   Rami


   ID  Salary
0   2   50000
1   3   60000
2   4   70000
3   5   80000


In [None]:
# Perform an inner join
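
An inner join keeps only the `ID` values present in both frames. A sketch on rebuilt copies of `df1` and `df2`; the name `inner` is illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrames so this sketch runs on its own
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['Osama', 'Yara', 'Nael', 'Rami']})
df2 = pd.DataFrame({'ID': [2, 3, 4, 5], 'Salary': [50000, 60000, 70000, 80000]})

# how='inner' keeps rows whose ID appears in both df1 and df2
inner = pd.merge(df1, df2, on='ID', how='inner')
print(inner)
```

IDs 1 and 5 are dropped because each appears in only one of the two frames.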


In [None]:
# Perform a left join
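
A left join keeps every row of the left frame and fills unmatched columns with `NaN`. A sketch on rebuilt copies of `df1` and `df2`; the name `left` is illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrames so this sketch runs on its own
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['Osama', 'Yara', 'Nael', 'Rami']})
df2 = pd.DataFrame({'ID': [2, 3, 4, 5], 'Salary': [50000, 60000, 70000, 80000]})

# how='left' keeps all rows of df1; IDs missing from df2 get NaN in Salary
left = pd.merge(df1, df2, on='ID', how='left')
print(left)
```

Because of the `NaN` for ID 1, the `Salary` column is stored as floats in the result.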


In [None]:
# Concatenate DataFrames
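
Concatenation stacks frames without matching on a key. The sketch below shows both directions on rebuilt copies of `df1` and `df2`; variable names are illustrative.

```python
import pandas as pd

# Rebuild the tutorial's DataFrames so this sketch runs on its own
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 'Name': ['Osama', 'Yara', 'Nael', 'Rami']})
df2 = pd.DataFrame({'ID': [2, 3, 4, 5], 'Salary': [50000, 60000, 70000, 80000]})

# Stack rows; columns missing from one frame are filled with NaN,
# and ignore_index=True renumbers the rows from 0
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)

# axis=1 places the frames side by side, aligned on the row index
side_by_side = pd.concat([df1, df2], axis=1)
print(side_by_side)
```

Unlike `merge`, `concat` does not match rows on `ID`, so the side-by-side result contains two `ID` columns whose values differ row by row.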


## 8. Basic data visualization

In [6]:
import plotly.express as px

df_large = pd.DataFrame({
    'Category': ['red ball', 'blue ball', 'red ball', 'blue ball', 'red ball', 'blue ball'],
    'Value': [10, 21, 13, 34, 18, 27]
})
df_large

    Category  Value
0   red ball     10
1  blue ball     21
2   red ball     13
3  blue ball     34
4   red ball     18
5  blue ball     27


In [None]:
# Create a bar chart


In [None]:
# Create a pie chart


In [None]:
# Create a histogram
