# Pandas DataFrame Basics

<!--
Author: RSK World
Website: https://rskworld.in
Email: help@rskworld.in
Phone: +91 93305 39277
Description: Introduction to Pandas DataFrames
-->

## Introduction

This notebook introduces Pandas DataFrames: how to create them, how to inspect their structure, and how to perform basic operations such as selecting columns, adding columns, and computing summary statistics.



In [None]:
import pandas as pd
import numpy as np

print("Pandas version:", pd.__version__)
print("NumPy version:", np.__version__)



## Creating DataFrames

A DataFrame can be built from several kinds of input. The cells below use a dictionary of columns, a list of rows, a CSV file, and a NumPy array:


In [None]:
# Method 1: From a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['New York', 'London', 'Tokyo', 'Paris', 'Sydney'],
    'Salary': [50000, 60000, 70000, 55000, 65000]
}

df1 = pd.DataFrame(data)
print("DataFrame from dictionary:")
print(df1)
print("\nDataFrame shape:", df1.shape)
print("DataFrame columns:", df1.columns.tolist())
print("DataFrame index:", df1.index.tolist())



In [None]:
# Method 2: From a list of lists
data_list = [
    ['Alice', 25, 'New York', 50000],
    ['Bob', 30, 'London', 60000],
    ['Charlie', 35, 'Tokyo', 70000],
    ['David', 28, 'Paris', 55000],
    ['Eve', 32, 'Sydney', 65000]
]

df2 = pd.DataFrame(data_list, columns=['Name', 'Age', 'City', 'Salary'])
print("DataFrame from list of lists:")
print(df2)
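
Methods 1 and 2 describe the same five records, once column by column and once row by row, so the two constructors should produce identical frames. The sketch below checks this; it restates the sample data so that it runs on its own, and `from_dict` and `from_rows` are illustrative names that do not appear elsewhere in this notebook.

```python
import pandas as pd

columns = ['Name', 'Age', 'City', 'Salary']
rows = [
    ['Alice', 25, 'New York', 50000],
    ['Bob', 30, 'London', 60000],
    ['Charlie', 35, 'Tokyo', 70000],
    ['David', 28, 'Paris', 55000],
    ['Eve', 32, 'Sydney', 65000],
]

# Column-oriented input: one list of values per column name (as in Method 1)
from_dict = pd.DataFrame(
    {col: [row[i] for row in rows] for i, col in enumerate(columns)}
)

# Row-oriented input: one inner list per record (as in Method 2)
from_rows = pd.DataFrame(rows, columns=columns)

# equals() compares shape, labels, values, and dtypes
same = from_dict.equals(from_rows)
print("Identical frames:", same)
```

Which form to use mostly depends on how the data arrives: records read one at a time fit the row-oriented form, while precomputed columns fit the dictionary form.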



In [None]:
# Method 3: From a CSV file
# (left commented out; uncomment once data/sample_data.csv is available)
# df3 = pd.read_csv('data/sample_data.csv')

# Method 4: Using NumPy array
np_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df4 = pd.DataFrame(np_array, columns=['A', 'B', 'C'])
print("DataFrame from NumPy array:")
print(df4)
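
Method 3 is left commented out above, so here is a minimal sketch of `pd.read_csv` that runs without any file on disk. It reads from an in-memory string through `io.StringIO`. The CSV text is an assumed stand-in that mirrors the first two sample rows, not the actual contents of `data/sample_data.csv`.

```python
import io

import pandas as pd

# Assumed stand-in for data/sample_data.csv (the real file's contents are unknown)
csv_text = """Name,Age,City,Salary
Alice,25,New York,50000
Bob,30,London,60000
"""

# read_csv accepts a path or any file-like object; the first line becomes the header
df3 = pd.read_csv(io.StringIO(csv_text))
print("DataFrame from CSV text:")
print(df3)
print("\nInferred column types:")
print(df3.dtypes)
```

With a real file, the same call takes the path instead of the `StringIO` object.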



## Basic DataFrame Operations


In [None]:
# Using the first DataFrame
df = df1.copy()

# Basic information about the DataFrame
print("=== DataFrame Info ===")
df.info()  # info() prints its summary itself and returns None, so no print() is needed
print("\n=== DataFrame Description ===")
print(df.describe())
print("\n=== First few rows ===")
print(df.head())
print("\n=== Last 2 rows ===")
print(df.tail(2))



In [None]:
# Accessing columns
print("=== Accessing Columns ===")
print("Name column:")
print(df['Name'])
print("\nMultiple columns:")
print(df[['Name', 'Age']])
print("\nColumn data types:")
print(df.dtypes)
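
The two selections above return different types: a single label in brackets gives a one-dimensional `Series`, while a list of labels gives a `DataFrame`, even when the list holds only one name. A small sketch of the difference, using a shortened copy of the sample data (`name_series` and `name_frame` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

name_series = df['Name']    # single label -> Series
name_frame = df[['Name']]   # list of labels -> DataFrame with one column

print(type(name_series).__name__, name_series.shape)
print(type(name_frame).__name__, name_frame.shape)
```

The distinction matters when later code expects a two-dimensional table, for example when passing the selection to a function that reads `.columns`.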



In [None]:
# Adding new columns
df['Bonus'] = df['Salary'] * 0.1
df['Total_Compensation'] = df['Salary'] + df['Bonus']
print("DataFrame with new columns:")
print(df)
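
Assigning with `df['Bonus'] = ...` changes `df` in place. When the original frame should stay untouched, `DataFrame.assign` is one alternative: it returns a new frame with the extra columns. The sketch below uses a shortened copy of the sample data, and `with_bonus` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Salary': [50000, 60000, 70000],
})

# Each keyword becomes a column; a later lambda can use a column created earlier
with_bonus = df.assign(
    Bonus=lambda d: d['Salary'] * 0.1,
    Total_Compensation=lambda d: d['Salary'] + d['Bonus'],
)

print("Original columns:", df.columns.tolist())
print("New frame columns:", with_bonus.columns.tolist())
print(with_bonus)
```

Both approaches compute the same values; the choice is about whether mutating the existing frame is acceptable.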



In [None]:
# Basic statistics
print("=== Basic Statistics ===")
print(f"Mean salary: ${df['Salary'].mean():.2f}")
print(f"Max salary: ${df['Salary'].max():.2f}")
print(f"Min salary: ${df['Salary'].min():.2f}")
print(f"Standard deviation: ${df['Salary'].std():.2f}")
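
The four statistics above can also be requested in a single call with `agg`. Note that `std()` returns the sample standard deviation by default (`ddof=1`); passing `ddof=0` gives the population value instead. A short sketch on the same salary figures, where `stats` and `population_std` are illustrative names:

```python
import pandas as pd

salary = pd.Series([50000, 60000, 70000, 55000, 65000], name='Salary')

# agg with a list of function names returns a Series indexed by those names
stats = salary.agg(['mean', 'max', 'min', 'std'])
print(stats)

# Population standard deviation divides by n instead of n - 1
population_std = salary.std(ddof=0)
print(f"Sample std: {stats['std']:.2f}")
print(f"Population std: {population_std:.2f}")
```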

