# Pandas
Pandas is a Python library for data manipulation and analysis. Its two core data structures, `Series` (one-dimensional) and `DataFrame` (two-dimensional), together with the functions built around them, handle structured data such as tables and time series efficiently. The examples below walk through some key features of Pandas:


1. Importing Pandas:

In [1]:
import pandas as pd

2. Creating a DataFrame:

In [2]:
data = {'Name': ['John', 'Emma', 'Ryan'],
        'Age': [25, 28, 32],
        'City': ['New York', 'London', 'Sydney']}
df = pd.DataFrame(data)
print(df)

   Name  Age      City
0  John   25  New York
1  Emma   28    London
2  Ryan   32    Sydney
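
A dictionary of columns is only one way to build a DataFrame. As a minimal sketch, the same table can also be built from a list of row records; the `records` and `df_records` names are illustrative and not part of the original notebook.

```python
import pandas as pd

# Illustrative records: one dictionary per row, mirroring the table above
records = [
    {"Name": "John", "Age": 25, "City": "New York"},
    {"Name": "Emma", "Age": 28, "City": "London"},
    {"Name": "Ryan", "Age": 32, "City": "Sydney"},
]
df_records = pd.DataFrame(records)

print(df_records.shape)    # (number of rows, number of columns)
print(df_records.dtypes)   # the inferred dtype of each column
```

Checking `shape` and `dtypes` right after construction is a quick way to confirm that the data was interpreted as intended.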


3. Accessing and Manipulating Data:

In [3]:
# Accessing columns
print(df['Name'])   # Access 'Name' column
print(df.Age)       # Attribute-style access; works only when the column name is a valid Python identifier

# Accessing rows
print(df.iloc[0])   # Access first row

# Filtering data
adults = df[df['Age'] >= 28]   # Filter rows where Age is greater than or equal to 28
print(adults)

# Adding a new column
df['Profession'] = ['Engineer', 'Manager', 'Doctor']
print(df)

0    John
1    Emma
2    Ryan
Name: Name, dtype: object
0    25
1    28
2    32
Name: Age, dtype: int64
Name        John
Age           25
City    New York
Name: 0, dtype: object
   Name  Age    City
1  Emma   28  London
2  Ryan   32  Sydney
   Name  Age      City Profession
0  John   25  New York   Engineer
1  Emma   28    London    Manager
2  Ryan   32    Sydney     Doctor
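
The access patterns above can be extended with label-based selection and combined filters. The following is a small sketch that rebuilds the sample table so it runs on its own; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Emma", "Ryan"],
    "Age": [25, 28, 32],
    "City": ["New York", "London", "Sydney"],
})

# Label-based selection: the row labelled 1, restricted to two columns
emma = df.loc[1, ["Name", "City"]]
print(emma)

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
older_non_londoners = df[(df["Age"] >= 28) & (df["City"] != "London")]
print(older_non_londoners)

# Filter rows and pick a column in a single step
names_of_adults = df.loc[df["Age"] >= 28, "Name"]
print(names_of_adults)
```

Note that `iloc` selects by integer position while `loc` selects by label; with the default index the two happen to coincide.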


4. Data Summary:

In [4]:
# Basic statistics
print(df.describe())

# Counting unique values
print(df['City'].value_counts())

# Grouping and aggregation
city_group = df.groupby('City')
print(city_group.mean(numeric_only=True))    # Mean of the numeric columns by City (text columns are skipped)

             Age
count   3.000000
mean   28.333333
std     3.511885
min    25.000000
25%    26.500000
50%    28.000000
75%    30.000000
max    32.000000
New York    1
London      1
Sydney      1
Name: City, dtype: int64
           Age
City          
London    28.0
New York  25.0
Sydney    32.0
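
Because every city appears only once in the sample table, each group mean equals the single value in that group. As a hedged sketch, the example below uses a hypothetical table with a repeated city and computes several aggregates at once; the `people` data and the output column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: London appears twice, so its group has two rows
people = pd.DataFrame({
    "Name": ["John", "Emma", "Ryan", "Mia"],
    "Age": [25, 28, 32, 30],
    "City": ["New York", "London", "Sydney", "London"],
})

# Named aggregation: output_column=(input_column, aggregation)
summary = people.groupby("City").agg(
    mean_age=("Age", "mean"),
    oldest=("Age", "max"),
    n_people=("Name", "count"),
)
print(summary)
```

Naming the input column for each aggregate avoids applying a numeric operation such as `mean` to text columns.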




5. Reading and Writing Data:

In [7]:
# Reading from a CSV file
data = pd.read_csv('./data/data.csv')
print(data)

# Writing to a CSV file
df.to_csv('./data/output.csv', index=False)   # index=False leaves the row index out of the file

   Name  Age      City Profession
0  John   25  New York   Engineer
1  Emma   28    London    Manager
2  Ryan   32    Sydney     Doctor
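
The read and write calls above depend on files under `./data/`. As a minimal sketch that runs without any files, the same round trip can be done through an in-memory text buffer; the buffer approach and the variable names are assumptions for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Emma", "Ryan"],
    "Age": [25, 28, 32],
    "City": ["New York", "London", "Sydney"],
})

# Write the DataFrame as CSV text into an in-memory buffer instead of a file
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read the CSV text back into a new DataFrame
restored = pd.read_csv(io.StringIO(csv_text))
print(restored)
```

Without `index=False`, the row index would be written as an extra unnamed column and would reappear as a data column on the next read.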


These are just a few examples of the features provided by Pandas. It also offers advanced functionality for data alignment, merging and joining datasets, handling missing data, time series analysis, and more. Pandas is widely used in data exploration, data preprocessing, and data analysis tasks in various domains.
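
As a brief sketch of two of the features mentioned above, the following example joins two tables and then fills in the missing value the join produces. The `countries` lookup table is hypothetical and deliberately has no entry for Sydney.

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["John", "Emma", "Ryan"],
    "City": ["New York", "London", "Sydney"],
})

# Hypothetical lookup table; Sydney is left out on purpose
countries = pd.DataFrame({
    "City": ["New York", "London"],
    "Country": ["USA", "UK"],
})

# A left join keeps every row of `people`; unmatched rows get NaN in 'Country'
merged = people.merge(countries, on="City", how="left")
missing = int(merged["Country"].isna().sum())
print(missing)

# Replace the missing value with a placeholder
merged["Country"] = merged["Country"].fillna("Unknown")
print(merged)
```

Whether to fill, drop, or keep missing values depends on the analysis; `fillna` is just one of the options.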