This file covers the important concepts of pandas.

# Topics

Introduction to pandas

    1. What is pandas?

        Overview of pandas library
        Key features and benefits
        Installation of pandas

Data Structures

    2. Series
    
        Creating a Series
        Accessing data from Series
        Operations on Series

    3. DataFrame

        Creating DataFrames
        Accessing data from DataFrames
        Basic operations (selection, filtering, etc.)
    
Data Input/Output

    4. Reading and Writing Data

        Reading data from CSV, Excel, JSON, SQL, etc.
        Writing data to CSV, Excel, JSON, SQL, etc.
    
Data Manipulation

    5. Indexing and Selection

        Indexing and selecting data using loc, iloc, at, iat
        Boolean indexing
        Setting and resetting index
    
    6. Handling Missing Data

        Identifying missing data
        Handling missing data using dropna(), fillna(), etc.
    
    7. Data Cleaning

        Removing duplicates
        Replacing values
        Applying functions using apply() and map()
    
    8. Data Transformation

        Sorting data
        Renaming columns
        Discretization and binning

Data Aggregation and Grouping

    9. GroupBy

        Grouping data
        Aggregation functions
        Applying multiple functions at once
        Grouping by multiple columns

Data Merging and Concatenation

    10. Combining DataFrames

        Concatenating DataFrames
        Merging and joining DataFrames
        Append method (DataFrame.append() was removed in pandas 2.0; use concat() instead)

Time Series Analysis

    11. Time Series Data
    
        Working with datetime objects
        Date range generation
        Resampling time series data

Visualization

    12. Data Visualization with pandas
    
        Plotting with pandas (using plot())
        Integration with matplotlib

Advanced Topics

    13. Advanced Data Operations

        Pivot tables
        MultiIndex and hierarchical indexing
        Using query() method

    14. Performance Optimization

        Efficient use of memory
        Vectorization
        Using eval() and query() for performance

Practical Applications

    15. Case Studies and Projects

        Real-world data analysis projects
        Hands-on practice with datasets

Resources and Further Learning

    16. Additional Resources

        Documentation and tutorials
        Books and online courses
        Community and forums

# Installation steps

Open a command prompt (or terminal) and run one of these commands:

1. With pip: pip install pandas
2. With conda: conda install pandas
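
To check that the installation worked, you can import pandas and print its version. This is a minimal sketch; the version string you see depends on your environment.

```python
import pandas as pd

# Any version string printed here means the import succeeded.
print(pd.__version__)
```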


# Topic 1 : Introduction to pandas

### Overview of pandas library


pandas is an open-source data manipulation and analysis library for Python. It provides the data structures and functions needed to work with structured (tabular) data. With pandas, you can handle tasks such as data cleaning, transformation, aggregation, and visualization.

### Key features and benefits

1. Data Structures: Provides two primary data structures, Series (1-dimensional) and DataFrame (2-dimensional).
2. Easy Handling: Simplifies handling of missing data, data alignment, and data manipulation.
3. Integration: Works well with other Python libraries such as NumPy, matplotlib, and scikit-learn.
4. Performance: Built on NumPy arrays, so operations applied to whole columns at once (vectorized operations) are much faster than plain Python loops.
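
The "data alignment" feature mentioned above can be illustrated with a small sketch. The two Series and their labels are made up for this example; the point is that arithmetic matches values by index label, not by position.

```python
import pandas as pd

# Two Series whose indexes only partly overlap (illustrative data)
q1 = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
q2 = pd.Series([1, 2, 3], index=['b', 'c', 'd'])

# Addition aligns on index labels; labels found on only one side give NaN
total = q1 + q2
print(total)

# add() with fill_value treats a label missing on one side as 0
total_filled = q1.add(q2, fill_value=0)
print(total_filled)
```

Here 'b' and 'c' are summed, while 'a' and 'd' become NaN in the first result and keep their single value in the second.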


In [4]:
# Import pandas library
import pandas as pd

# Create a simple DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'San Francisco', 'Los Angeles']
}

# Convert the dictionary to a DataFrame
df = pd.DataFrame(data)

# Display the DataFrame
print("DataFrame:")
print(df)

# Perform basic operations
# Select a single column
ages = df['Age']
print("\nAges column:")
print(ages)

# Select multiple columns
name_city = df[['Name', 'City']]
print("\nName and City columns:")
print(name_city)

# Filter rows based on a condition
age_above_28 = df[df['Age'] > 28]
print("\nRows where Age is above 28:")
print(age_above_28)


DataFrame:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles

Ages column:
0    25
1    30
2    35
Name: Age, dtype: int64

Name and City columns:
      Name           City
0    Alice       New York
1      Bob  San Francisco
2  Charlie    Los Angeles

Rows where Age is above 28:
      Name  Age           City
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles


# Topic 2 : Data Structures in Pandas

### Series

A Series is a one-dimensional array-like object that can hold data of any type (integers, strings, floating point numbers, Python objects, etc.). It is similar to a column in a table or an Excel spreadsheet. Each element in a Series is associated with an index, which can be either implicitly assigned or explicitly specified.

In [2]:
import pandas as pd

# Create a Series from a list
data_list = [10, 20, 30, 40]
series_from_list = pd.Series(data_list)
print("Series from list:")
print(series_from_list)

# Create a Series from a dictionary
data_dict = {'a': 1, 'b': 2, 'c': 3}
series_from_dict = pd.Series(data_dict)
print("\nSeries from dictionary:")
print(series_from_dict)

# Create a Series from a scalar value
scalar_value = 5
series_from_scalar = pd.Series(scalar_value, index=['x', 'y', 'z'])
print("\nSeries from scalar:")
print(series_from_scalar)


Series from list:
0    10
1    20
2    30
3    40
dtype: int64

Series from dictionary:
a    1
b    2
c    3
dtype: int64

Series from scalar:
x    5
y    5
z    5
dtype: int64


In [3]:
# Accessing data from a Series
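# You can access elements in a Series by label (index) or by position (iloc).

# Accessing data by label
print("\nElement with label 'b':", series_from_dict['b'])

# Accessing data by position
# (iloc is explicit; series_from_list[2] would look up the label 2,
# which only matches position 2 because of the default 0, 1, 2, ... index)
print("Element at position 2:", series_from_list.iloc[2])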


Element with label 'b': 2
Element at position 2: 30
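
The difference between label and position becomes visible when the index does not run 0, 1, 2, and so on. The sketch below uses a hypothetical Series with reversed integer labels to show that loc and iloc can return different elements for the same number.

```python
import pandas as pd

# Hypothetical Series whose integer labels are in reverse order
s = pd.Series([10, 20, 30], index=[2, 1, 0])

by_label = s.loc[2]      # label 2 is attached to the first element
by_position = s.iloc[2]  # position 2 is the third element

print("loc[2]:", by_label)
print("iloc[2]:", by_position)
```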


In [4]:
# Operations on Series
# You can perform various operations on Series, such as arithmetic operations, statistical calculations, and more.

# Arithmetic operations
series1 = pd.Series([1, 2, 3])
series2 = pd.Series([4, 5, 6])
sum_series = series1 + series2
print("\nSum of two Series:")
print(sum_series)

# Statistical operations
mean_value = series1.mean()
print("\nMean of series1:", mean_value)



Sum of two Series:
0    5
1    7
2    9
dtype: int64

Mean of series1: 2.0


In [5]:
# With the index argument, you can name your own labels.

import pandas as pd

a = [1, 7, 2]

myvar = pd.Series(a, index = ["x", "y", "z"])

print(myvar)

print(myvar["y"])

x    1
y    7
z    2
dtype: int64
7


In [6]:
# You can also use a key/value object, like a dictionary, when creating a Series.

import pandas as pd

calories = {"day1": 420, "day2": 380, "day3": 390}

myvar = pd.Series(calories)

print(myvar)

day1    420
day2    380
day3    390
dtype: int64


### DataFrame

A DataFrame is a two-dimensional, size-mutable, and heterogeneous tabular data structure with labeled axes (rows and columns). It is similar to a table in a relational database or an Excel spreadsheet.

In [7]:
# Create a DataFrame from a list of dictionaries
data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'San Francisco'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Los Angeles'}
]
df = pd.DataFrame(data)
print("DataFrame from list of dictionaries:")
print(df)

# Create a DataFrame from a dictionary of lists
data_dict = {
    'Name': ['David', 'Eve', 'Frank'],
    'Age': [40, 45, 50],
    'City': ['Chicago', 'Houston', 'Phoenix']
}
df_dict = pd.DataFrame(data_dict)
print("\nDataFrame from dictionary of lists:")
print(df_dict)


DataFrame from list of dictionaries:
      Name  Age           City
0    Alice   25       New York
1      Bob   30  San Francisco
2  Charlie   35    Los Angeles

DataFrame from dictionary of lists:
    Name  Age     City
0  David   40  Chicago
1    Eve   45  Houston
2  Frank   50  Phoenix


In [8]:
# Accessing data from DataFrames
# You can access elements in a DataFrame using labels (columns) or positional indexing.

# Accessing columns
print("\nAccessing 'Name' column:")
print(df['Name'])

# Accessing rows using iloc (positional indexing)
print("\nAccessing the first row using iloc:")
print(df.iloc[0])

# Accessing rows using loc (label-based indexing)
print("\nAccessing the row with index 1 using loc:")
print(df.loc[1])



Accessing 'Name' column:
0      Alice
1        Bob
2    Charlie
Name: Name, dtype: object

Accessing the first row using iloc:
Name       Alice
Age           25
City    New York
Name: 0, dtype: object

Accessing the row with index 1 using loc:
Name              Bob
Age                30
City    San Francisco
Name: 1, dtype: object


In [9]:
# Basic operations on DataFrames
# You can perform various operations on DataFrames, such as selecting columns, filtering rows, and adding new columns.

# Selecting specific columns
name_age_df = df[['Name', 'Age']]
print("\nDataFrame with 'Name' and 'Age' columns:")
print(name_age_df)

# Filtering rows based on a condition
age_above_30 = df[df['Age'] > 30]
print("\nRows where Age is above 30:")
print(age_above_30)

# Adding a new column
df['Salary'] = [50000, 60000, 70000]
print("\nDataFrame after adding 'Salary' column:")
print(df)



DataFrame with 'Name' and 'Age' columns:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35

Rows where Age is above 30:
      Name  Age         City
2  Charlie   35  Los Angeles

DataFrame after adding 'Salary' column:
      Name  Age           City  Salary
0    Alice   25       New York   50000
1      Bob   30  San Francisco   60000
2  Charlie   35    Los Angeles   70000


# Topic 3 : Data input/output

pandas provides a wide range of functionalities for reading and writing data from various file formats and data sources. 

This allows seamless integration of data into pandas for analysis and manipulation.

### Reading Data

pandas can read data from various sources such as CSV, Excel, JSON, SQL databases, and more.

#### Reading data from CSV files

In [13]:
# CSV (Comma-Separated Values) files are one of the most common data formats. 
# Pandas provides the read_csv function to read data from CSV files into a DataFrame.

import pandas as pd

# Reading a CSV file
df_csv = pd.read_csv('data.csv')
print("Data from CSV file:")
print(df_csv.head())


Data from CSV file:
  Product  Price  Quantity
0       A    100        30
1       B    150        45
2       C    200        20
3       D    250        60
4       E    300        50
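
If you do not have a data.csv file at hand, you can try read_csv on an in-memory string instead. The sketch below assumes a small made-up product table; read_csv accepts any file-like object, and usecols and dtype are two of its commonly used options.

```python
import io
import pandas as pd

# Inline CSV text standing in for a file on disk (made-up data)
csv_text = """Product,Price,Quantity
A,100,30
B,150,45
C,200,20
"""

# read_csv accepts a file-like object as well as a file path
df_csv = pd.read_csv(io.StringIO(csv_text))
print(df_csv)

# Read only some columns and set a column type while reading
subset = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['Product', 'Price'],
    dtype={'Price': 'float64'}
)
print(subset.dtypes)

# to_csv() without a path returns the CSV text as a string
round_trip = df_csv.to_csv(index=False)
print(round_trip)
```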


#### Reading Excel files



In [14]:
# # Reading an Excel file
# df_excel = pd.read_excel('data.xlsx', sheet_name='Sheet1')
# print("\nData from Excel file:")
# print(df_excel.head())

#### Reading JSON files

In [15]:
# # Reading a JSON file
# df_json = pd.read_json('data.json')
# print("\nData from JSON file:")
# print(df_json.head())


### Writing to CSV File 

In [16]:
# Writing to a CSV file

df_csv.to_csv('output.csv', index=False)
print("\nData written to CSV file.")



Data written to CSV file.


In [17]:
# # Writing data to an Excel file
# df_excel.to_excel('output.xlsx', sheet_name='Sheet1', index=False)
# print("\nData written to Excel file 'output.xlsx'")

In [18]:
# # Writing data to a JSON file
# df_json.to_json('output.json')
# print("\nData written to JSON file 'output.json'")


### Example code

In [19]:
# Here's a small example demonstrating the process of reading data from a CSV file,
# performing some operations, and then writing the results back to a CSV file.

import pandas as pd

# Reading data from a CSV file
df = pd.read_csv('example.csv')
print("DataFrame from CSV file:")
print(df.head())

# Performing some operations
# Filtering rows where the value in 'Age' column is greater than 25
filtered_df = df[df['Age'] > 25]
print("\nFiltered DataFrame (Age > 25):")
print(filtered_df)

# Adding a new column
# assign() returns a new DataFrame. Assigning directly with
# filtered_df['New_Column'] = ... can emit a SettingWithCopyWarning
# ("A value is trying to be set on a copy of a slice from a DataFrame"),
# because filtered_df was produced by filtering df. See
# https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
filtered_df = filtered_df.assign(New_Column=filtered_df['Age'] * 2)
print("\nDataFrame after adding 'New_Column':")
print(filtered_df)

# Writing the modified DataFrame to a new CSV file
filtered_df.to_csv('filtered_output.csv', index=False)
print("\nFiltered data written to CSV file 'filtered_output.csv'")


DataFrame from CSV file:
      Name  Age           City
0    Alice   23       New York
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles
3    David   25        Chicago
4      Eve   27        Houston

Filtered DataFrame (Age > 25):
      Name  Age           City
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles
4      Eve   27        Houston

DataFrame after adding 'New_Column':
      Name  Age           City  New_Column
1      Bob   35  San Francisco          70
2  Charlie   45    Los Angeles          90
4      Eve   27        Houston          54

Filtered data written to CSV file 'filtered_output.csv'


# Topic 4: Data Manipulation


Data manipulation is a key aspect of using pandas effectively. It involves various operations such as indexing, selecting, filtering, handling missing data, and transforming the data to prepare it for analysis.

### Indexing and Selection
pandas provides powerful and flexible ways to index and select data. The primary tools for this are the loc (label-based) and iloc (position-based) indexers.


### Using loc

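The loc indexer is used for label-based indexing. It accepts single labels, lists or slices of labels, and boolean arrays. Unlike ordinary Python slices, a label slice with loc includes its end label.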


In [20]:
import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [23, 35, 45],
    'City': ['New York', 'San Francisco', 'Los Angeles']
}
df = pd.DataFrame(data)

# Using loc for label-based indexing
# Select row with index 0
print("Row with index 0:")
print(df.loc[0])

# Select rows where Age > 30
print("\nRows where Age > 30:")
print(df.loc[df['Age'] > 30])

# Select specific rows and columns
print("\nSelect specific rows and columns (Name and City for index 1):")
print(df.loc[1, ['Name', 'City']])


Row with index 0:
Name       Alice
Age           23
City    New York
Name: 0, dtype: object

Rows where Age > 30:
      Name  Age           City
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles

Select specific rows and columns (Name and City for index 1):
Name              Bob
City    San Francisco
Name: 1, dtype: object


### Using iloc
The iloc indexer is used for positional indexing, where you specify rows and columns by their integer positions (starting at 0).

In [None]:
# Using iloc for positional indexing
# Select the first row
print("\nFirst row:")
print(df.iloc[0])

# Select the first two rows
print("\nFirst two rows:")
print(df.iloc[:2])

# Select specific rows and columns by position
print("\nSelect specific rows and columns by position (first two rows, first two columns):")
print(df.iloc[:2, :2])
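
The topic list also mentions at, iat, and setting and resetting the index. A minimal sketch of these, reusing the same sample data as above: at and iat read a single value (by label and by position respectively), set_index() turns a column into the row labels, and reset_index() turns the labels back into a column.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [23, 35, 45],
    'City': ['New York', 'San Francisco', 'Los Angeles']
})

# at: one value by row label and column label
bob_age = df.at[1, 'Age']
# iat: one value by row position and column position
first_city = df.iat[0, 2]
print(bob_age, first_city)

# set_index() makes the 'Name' column the row labels
by_name = df.set_index('Name')
charlie_city = by_name.loc['Charlie', 'City']
print(charlie_city)

# reset_index() moves the labels back into a regular column
restored = by_name.reset_index()
print(restored)
```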


### Boolean Indexing
Boolean indexing is a powerful technique to filter data based on conditions.

In [21]:
# Boolean indexing to filter rows
# Select rows where Age is greater than 30
print("\nRows where Age > 30:")
print(df[df['Age'] > 30])



Rows where Age > 30:
      Name  Age           City
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles
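
Boolean indexing also works with several conditions at once. The sketch below, using the same sample data, combines conditions with & (and), uses isin() for membership tests, and negates a condition with ~. Each condition needs its own parentheses because & and | bind more tightly than comparisons.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [23, 35, 45],
    'City': ['New York', 'San Francisco', 'Los Angeles']
})

# Both conditions must hold: Age > 30 and City is not Los Angeles
older_not_la = df[(df['Age'] > 30) & (df['City'] != 'Los Angeles')]
print(older_not_la)

# isin() keeps rows whose value appears in the given list
in_ny_or_la = df[df['City'].isin(['New York', 'Los Angeles'])]
print(in_ny_or_la)

# ~ inverts a boolean mask
elsewhere = df[~df['City'].isin(['New York', 'Los Angeles'])]
print(elsewhere)
```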


### Handling Missing Data
Handling missing data is a crucial aspect of data preparation. pandas provides several methods to identify and handle missing data.

#### Identifying Missing Data
You can use isna() (or its alias isnull()) to identify missing values in a DataFrame.



In [23]:
# Sample DataFrame with missing values
data = {
    'Name': ['Alice', 'Bob', None],
    'Age': [23, None, 45],
    'City': ['New York', 'San Francisco', None]
}
df_missing = pd.DataFrame(data)

# Identify missing data
print("\nIdentify missing data:")
print(df_missing.isna())

# Prints True if data is missing at some location, and False if not. 


Identify missing data:
    Name    Age   City
0  False  False  False
1  False   True  False
2   True  False   True


#### Dropping or Filling Missing Data
You can use methods like dropna() to remove missing data or fillna() to fill missing data with specific values.

In [24]:
# Dropping rows with any missing values
print("\nDataFrame after dropping rows with any missing values:")
print(df_missing.dropna())

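# Filling missing values with a specific value
# (note: putting the string 'Unknown' into the numeric Age column turns it into a mixed-type column)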
print("\nDataFrame after filling missing values with 'Unknown':")
print(df_missing.fillna('Unknown'))



DataFrame after dropping rows with any missing values:
    Name   Age      City
0  Alice  23.0  New York

DataFrame after filling missing values with 'Unknown':
      Name      Age           City
0    Alice     23.0       New York
1      Bob  Unknown  San Francisco
2  Unknown     45.0        Unknown
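
Filling every column with the same string mixes text into the numeric Age column. One possible alternative, sketched below on the same sample data, is to pass fillna() a dictionary so each column gets a value suited to its type. Using the column mean for Age is only an illustrative choice.

```python
import pandas as pd

df_missing = pd.DataFrame({
    'Name': ['Alice', 'Bob', None],
    'Age': [23, None, 45],
    'City': ['New York', 'San Francisco', None]
})

# Count the missing values in each column
missing_counts = df_missing.isna().sum()
print(missing_counts)

# Fill each column with its own replacement value
filled = df_missing.fillna({
    'Name': 'Unknown',
    'Age': df_missing['Age'].mean(),  # mean of the non-missing ages
    'City': 'Unknown'
})
print(filled)
```

With this approach the Age column stays numeric, so later calculations on it still work.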


# Topic 7: Data Cleaning
Data cleaning involves various tasks to ensure data quality and consistency.

### Removing Duplicates
You can use drop_duplicates() to remove duplicate rows from a DataFrame.

In [25]:
# Sample DataFrame with duplicate rows
data = {
    'Name': ['Alice', 'Bob', 'Alice'],
    'Age': [23, 35, 23],
    'City': ['New York', 'San Francisco', 'New York']
}
df_duplicates = pd.DataFrame(data)

# Remove duplicate rows
print("\nDataFrame after removing duplicate rows:")
print(df_duplicates.drop_duplicates())


DataFrame after removing duplicate rows:
    Name  Age           City
0  Alice   23       New York
1    Bob   35  San Francisco


### Replacing Values
You can use replace() to replace specific values in a DataFrame.

In [29]:
# Replacing specific values
print("\nDataFrame after replacing 'New York' with 'NYC':")
print(df.replace('New York', 'NYC'))



DataFrame after replacing 'New York' with 'NYC':
      Name  Age           City
0    Alice   23            NYC
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles
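
Series.map() is a related tool that the topic list mentions alongside apply(). The sketch below compares it with replace() using a hypothetical lookup dictionary: map() looks up every value and gives NaN for values that are not in the dictionary, while replace() leaves unlisted values unchanged.

```python
import pandas as pd

cities = pd.Series(['New York', 'San Francisco', 'Los Angeles'])

# Hypothetical lookup table; 'Los Angeles' is left out on purpose
abbreviations = {'New York': 'NYC', 'San Francisco': 'SF'}

# map(): values without a dictionary entry become NaN
mapped = cities.map(abbreviations)
print(mapped)

# replace(): values without a dictionary entry are kept as they are
replaced = cities.replace(abbreviations)
print(replaced)
```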


# Topic 8: Data Transformation

Data transformation includes operations like sorting, renaming columns, and applying functions to modify the data.

### Sorting Data
You can use sort_values() to sort the data by specific columns.

In [30]:
# Sorting by Age
print("\nDataFrame sorted by Age:")
print(df.sort_values(by='Age'))

# Sorting by multiple columns
print("\nDataFrame sorted by Age and Name:")
print(df.sort_values(by=['Age', 'Name']))



DataFrame sorted by Age:
      Name  Age           City
0    Alice   23       New York
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles

DataFrame sorted by Age and Name:
      Name  Age           City
0    Alice   23       New York
1      Bob   35  San Francisco
2  Charlie   45    Los Angeles
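
Because the sample data is already in ascending order, the sorted output above looks the same as the input. The sketch below uses slightly different made-up data (two people share an age) to show descending order and a tie-break on a second column.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'David', 'Charlie', 'Bob'],
    'Age': [23, 35, 45, 35]
})

# Sort by Age, largest first
by_age_desc = df.sort_values(by='Age', ascending=False)
print(by_age_desc)

# Age descending, then Name ascending to break the tie at 35
by_age_then_name = df.sort_values(by=['Age', 'Name'], ascending=[False, True])
print(by_age_then_name)
```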


### Renaming Columns
You can use rename() to rename columns in a DataFrame.

In [32]:
# Renaming columns
print("\nDataFrame with renamed columns:")
print(df.rename(columns={'Name': 'Full Name', 'Age': 'Years'}))



DataFrame with renamed columns:
  Full Name  Years           City
0     Alice     23       New York
1       Bob     35  San Francisco
2   Charlie     45    Los Angeles


### Applying Functions
You can use apply() to apply a function to each element of a Series, or to each column or row of a DataFrame.

In [33]:
# Applying a function to a column
print("\nDataFrame with Age column incremented by 1:")
print(df['Age'].apply(lambda x: x + 1))



DataFrame with Age column incremented by 1:
0    24
1    36
2    46
Name: Age, dtype: int64
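
The topic list for data transformation also includes discretization and binning. A minimal sketch with pd.cut() follows; the bin edges and the group names are assumptions chosen for illustration, not part of the original examples.

```python
import pandas as pd

ages = pd.Series([23, 35, 45, 25, 27])

# Illustrative bin edges and labels
bins = [0, 25, 35, 100]
labels = ['young', 'adult', 'senior']

# By default each bin includes its right edge: (0, 25], (25, 35], (35, 100]
age_group = pd.cut(ages, bins=bins, labels=labels)
print(age_group)

# Count how many values fall into each bin
print(age_group.value_counts())
```

Note that 25 lands in 'young' and 35 in 'adult' because the right edge of each bin is included.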


### Example Data for All Topics

In [34]:
# Example DataFrame for all topics
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [23, 35, 45, 25, 27],
    'City': ['New York', 'San Francisco', 'Los Angeles', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)

# Example DataFrame with missing values
data_missing = {
    'Name': ['Alice', 'Bob', None],
    'Age': [23, None, 45],
    'City': ['New York', 'San Francisco', None]
}
df_missing = pd.DataFrame(data_missing)

# Example DataFrame with duplicate rows
data_duplicates = {
    'Name': ['Alice', 'Bob', 'Alice'],
    'Age': [23, 35, 23],
    'City': ['New York', 'San Francisco', 'New York']
}
df_duplicates = pd.DataFrame(data_duplicates)


# Topic 9: Data Aggregation and Grouping

Data aggregation and grouping are powerful features in pandas that allow you to group data based on certain criteria and apply aggregate functions to each group. The primary method for this is groupby().

### GroupBy
The groupby() method is used to group data by one or more columns and perform aggregate operations.

In [35]:
import pandas as pd

# Sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob'],
    'Sales': [200, 150, 300, 400, 130],
    'Region': ['East', 'West', 'North', 'East', 'West']
}
df = pd.DataFrame(data)

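# Grouping data by 'Name'
# Note: sum() also "adds" the text column Region by joining the strings
# (for example 'EastEast'). Pass numeric_only=True or select ['Sales']
# to aggregate only the numeric data.
grouped = df.groupby('Name')
print("Grouped by 'Name':")
print(grouped.sum())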


Grouped by 'Name':
         Sales    Region
Name                    
Alice      600  EastEast
Bob        280  WestWest
Charlie    300     North


### Aggregation Functions
You can apply various aggregation functions like sum(), mean(), count(), etc., to the grouped data.

In [36]:
# Aggregating using sum and mean
print("\nSum of Sales by Name:")
print(grouped['Sales'].sum())

print("\nMean of Sales by Name:")
print(grouped['Sales'].mean())



Sum of Sales by Name:
Name
Alice      600
Bob        280
Charlie    300
Name: Sales, dtype: int64

Mean of Sales by Name:
Name
Alice      300.0
Bob        140.0
Charlie    300.0
Name: Sales, dtype: float64
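
The topic list mentions applying multiple functions at once. One way to do this is agg(), sketched below on the same sales data. The output column names total_sales and largest_sale are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Bob'],
    'Sales': [200, 150, 300, 400, 130],
    'Region': ['East', 'West', 'North', 'East', 'West']
})

# Several aggregations on one column: one output column per function
summary = df.groupby('Name')['Sales'].agg(['sum', 'mean', 'count'])
print(summary)

# Named aggregation: new_column=(input column, function)
named = df.groupby('Name').agg(
    total_sales=('Sales', 'sum'),
    largest_sale=('Sales', 'max')
)
print(named)
```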


### Grouping by Multiple Columns
You can group data by multiple columns by passing a list of column names to groupby().

In [37]:
# Grouping by 'Name' and 'Region'
grouped_multi = df.groupby(['Name', 'Region'])
print("\nGrouped by 'Name' and 'Region':")
print(grouped_multi.sum())



Grouped by 'Name' and 'Region':
                Sales
Name    Region       
Alice   East      600
Bob     West      280
Charlie North     300


# Topic 10: Data Merging and Concatenation
Data merging and concatenation are used to combine multiple DataFrames into a single DataFrame.

### Concatenating DataFrames
The concat() function is used to concatenate DataFrames along a particular axis (rows or columns).

In [38]:
# Sample DataFrames
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2'], 'B': ['B0', 'B1', 'B2']})
df2 = pd.DataFrame({'A': ['A3', 'A4', 'A5'], 'B': ['B3', 'B4', 'B5']})

# Concatenating along rows
result = pd.concat([df1, df2])
print("Concatenated along rows:")
print(result)

# Concatenating along columns
result = pd.concat([df1, df2], axis=1)
print("\nConcatenated along columns:")
print(result)


Concatenated along rows:
    A   B
0  A0  B0
1  A1  B1
2  A2  B2
0  A3  B3
1  A4  B4
2  A5  B5

Concatenated along columns:
    A   B   A   B
0  A0  B0  A3  B3
1  A1  B1  A4  B4
2  A2  B2  A5  B5
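
In the row-wise result above, the index labels 0, 1, 2 appear twice because each DataFrame keeps its own index. If you want a fresh index, ignore_index=True is one option, as sketched below. This is also the usual replacement for the older DataFrame.append() method, which was removed in pandas 2.0.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2'], 'B': ['B0', 'B1', 'B2']})
df2 = pd.DataFrame({'A': ['A3', 'A4', 'A5'], 'B': ['B3', 'B4', 'B5']})

# ignore_index=True discards the original labels and renumbers from 0
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)
```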


### Merging DataFrames
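The merge() function is used to merge DataFrames based on common columns or indices. By default it performs an inner join, so only keys present in both DataFrames appear in the result.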

In [40]:
# Sample DataFrames
df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
df2 = pd.DataFrame({'key': ['K0', 'K1', 'K3'], 'B': ['B0', 'B1', 'B3']})

# Merging DataFrames
merged = pd.merge(df1, df2, on='key')
print("\nMerged DataFrame on 'key':")
print(merged)



Merged DataFrame on 'key':
  key   A   B
0  K0  A0  B0
1  K1  A1  B1
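
The result above contains only K0 and K1, the keys found in both DataFrames. The how parameter changes which keys are kept. The sketch below shows a left join and an outer join on the same sample data; indicator=True adds a _merge column that records where each row came from.

```python
import pandas as pd

df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'A': ['A0', 'A1', 'A2']})
df2 = pd.DataFrame({'key': ['K0', 'K1', 'K3'], 'B': ['B0', 'B1', 'B3']})

# Left join: keep every key from df1; B is NaN where df2 has no match
left = pd.merge(df1, df2, on='key', how='left')
print(left)

# Outer join: keep keys from both sides and label each row's origin
outer = pd.merge(df1, df2, on='key', how='outer', indicator=True)
print(outer)
```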


### Joining DataFrames
The join() method is used to join DataFrames on their indices. By default it performs a left join, keeping every row of the DataFrame it is called on.

In [41]:
# Sample DataFrames
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=['K0', 'K1', 'K2'])
df2 = pd.DataFrame({'B': ['B0', 'B1', 'B2']}, index=['K0', 'K1', 'K2'])

# Joining DataFrames
joined = df1.join(df2)
print("\nJoined DataFrame on index:")
print(joined)



Joined DataFrame on index:
     A   B
K0  A0  B0
K1  A1  B1
K2  A2  B2


# Topic 11: Time Series Analysis
pandas provides robust functionality for working with time series data. This includes parsing dates, generating date ranges, and resampling time series data.

### Working with Datetime
You can convert strings to datetime objects using pd.to_datetime().

In [42]:
# Sample DataFrame
data = {
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Value': [100, 200, 300]
}
df = pd.DataFrame(data)

# Converting to datetime
df['Date'] = pd.to_datetime(df['Date'])
print("DataFrame with datetime objects:")
print(df)


DataFrame with datetime objects:
        Date  Value
0 2021-01-01    100
1 2021-01-02    200
2 2021-01-03    300
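
Once a column holds datetime values, the .dt accessor gives access to its parts. A short sketch on the same sample data follows; the new column names Year and Weekday are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Value': [100, 200, 300]
})
df['Date'] = pd.to_datetime(df['Date'])

# Extract parts of each date through the .dt accessor
df['Year'] = df['Date'].dt.year
df['Weekday'] = df['Date'].dt.day_name()
print(df)
```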


### Date Range Generation
You can generate a range of dates using pd.date_range().

In [43]:
# Generating a date range
date_range = pd.date_range(start='2021-01-01', end='2021-01-07')
print("\nGenerated date range:")
print(date_range)



Generated date range:
DatetimeIndex(['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04',
               '2021-01-05', '2021-01-06', '2021-01-07'],
              dtype='datetime64[ns]', freq='D')
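
date_range() can also take a number of periods and a frequency instead of an end date. The sketch below shows two variations: a step of two days, and business days only (Monday to Friday, with no holiday calendar applied).

```python
import pandas as pd

# Four dates, two days apart
every_other_day = pd.date_range(start='2021-01-01', periods=4, freq='2D')
print(every_other_day)

# Business days between two dates
business_days = pd.date_range(start='2021-01-01', end='2021-01-07', freq='B')
print(business_days)
```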


### Resampling Time Series Data
You can resample time series data to a different frequency using the resample() method.

In [44]:
# Sample time series data
data = {
    'Date': pd.date_range(start='2021-01-01', periods=6, freq='D'),
    'Value': [100, 200, 300, 400, 500, 600]
}
df = pd.DataFrame(data).set_index('Date')

# Resampling to monthly frequency
# ('ME' means month end; pandas versions before 2.2 use 'M' for the same frequency)
resampled = df.resample('ME').sum()
print("\nResampled time series data (monthly):")
print(resampled)



Resampled time series data (monthly):
            Value
Date             
2021-01-31   2100


