# 🐼 Pandas - Working with Time Series
Today we'll learn how to handle and analyze time series data.

## 1. Converting Columns to Datetime (`pd.to_datetime`)
- Use `pd.to_datetime()` to convert strings or numbers into datetime values.
- Once converted, the column has a datetime dtype (`datetime64[ns]`), which enables date-based indexing, resampling, and arithmetic.
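A minimal sketch of the two input kinds mentioned above, using made-up sample values: strings parsed with an explicit `format`, and integers interpreted as Unix seconds via `unit="s"`. Passing `format` explicitly is one way to avoid ambiguity about day/month order.

```python
import pandas as pd

# Strings: an explicit format avoids any guessing about day/month order.
dates_from_strings = pd.to_datetime(
    pd.Series(["05-01-2024", "12-01-2024"]), format="%d-%m-%Y"
)

# Numbers: integers are read as offsets from the Unix epoch in the given unit.
dates_from_numbers = pd.to_datetime(pd.Series([1704412800, 1705017600]), unit="s")

# Once the dtype is datetime, the .dt accessor exposes date components.
print(dates_from_strings.dt.strftime("%Y-%m-%d").tolist())
print(dates_from_numbers.dt.strftime("%Y-%m-%d").tolist())
```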

In [8]:
import pandas as pd

# 1. Create the dataset
data = {
    "OrderID": [1001, 1002, 1003, 1004, 1005],
    "OrderDate": ["2024-01-05", "2024/01/12", "15-01-2024", "2024-01-20", "20240125"],
    "Customer": ["Alice", "Bob", "Charlie", "David", "Emma"],
    "Amount": [250, 400, 150, 300, 500]
}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 2. Convert the 'OrderDate' column to datetime (errors="coerce" turns unparseable values into NaT)
df["OrderDate"] = pd.to_datetime(df["OrderDate"], dayfirst=True, errors="coerce")
print("\nAfter converting OrderDate to datetime:")
print(df)

# 3. Verify data types
print("\nData types after conversion:")
print(df.dtypes)

# 4. (Optional) Show any rows where conversion failed
if df["OrderDate"].isna().any():
    print("\nRows with invalid dates:")
    print(df[df["OrderDate"].isna()])


Original DataFrame:
   OrderID   OrderDate Customer  Amount
0     1001  2024-01-05    Alice     250
1     1002  2024/01/12      Bob     400
2     1003  15-01-2024  Charlie     150
3     1004  2024-01-20    David     300
4     1005    20240125     Emma     500

After converting OrderDate to datetime:
   OrderID  OrderDate Customer  Amount
0     1001 2024-05-01    Alice     250
1     1002        NaT      Bob     400
2     1003        NaT  Charlie     150
3     1004        NaT    David     300
4     1005        NaT     Emma     500

Data types after conversion:
OrderID               int64
OrderDate    datetime64[ns]
Customer             object
Amount                int64
dtype: object

Rows with invalid dates:
   OrderID OrderDate Customer  Amount
1     1002       NaT      Bob     400
2     1003       NaT  Charlie     150
3     1004       NaT    David     300
4     1005       NaT     Emma     500
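Only the first row survived the conversion, and it was read as 1 May rather than 5 January. This is consistent with how pandas 2.x behaves when no `format` is given: one format is inferred from the first value (here, with `dayfirst=True`, apparently year-day-month) and every value that does not match it is coerced to `NaT`. One possible workaround, sketched below, is to try a list of explicit formats in turn. The helper name `parse_with_formats` and the format list are assumptions for illustration, not part of pandas. pandas 2.0+ also accepts `format="mixed"`, which infers the format per element.

```python
import pandas as pd

def parse_with_formats(values, formats):
    """Hypothetical helper: try each explicit format in order and keep
    the first successful parse for every value."""
    strings = pd.Series(values, dtype="object")
    result = None
    for fmt in formats:
        parsed = pd.to_datetime(strings, format=fmt, errors="coerce")
        # Fill only positions that are still NaT, so earlier formats take priority.
        result = parsed if result is None else result.fillna(parsed)
    return result

order_dates = ["2024-01-05", "2024/01/12", "15-01-2024", "2024-01-20", "20240125"]
# Assumed layouts; "15-01-2024" is treated as day-month-year.
known_formats = ["%Y-%m-%d", "%Y/%m/%d", "%d-%m-%Y", "%Y%m%d"]

parsed_dates = parse_with_formats(order_dates, known_formats)
print(parsed_dates)
print("Unparsed rows:", int(parsed_dates.isna().sum()))
```

Listing the formats explicitly keeps the day/month interpretation under your control, at the cost of having to know which layouts occur in the data.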


## 2. Setting DateTime Index & Resampling
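- Set a datetime column as the index with `set_index()`.
- Resample the time series with `resample(rule)` followed by an aggregation, e.g. `resample('M').mean()`; other rules include `'D'` (day), `'W'` (week), and `'Y'` (year). pandas 2.2+ deprecates `'M'` and `'Y'` in favor of `'ME'` and `'YE'`.
- Useful for aggregating data by time period.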
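As a small illustration of how the rule string changes the bin size, the sketch below resamples fourteen days of made-up daily values by week and by month. It uses `'MS'` (month start) for the monthly rule because that alias is accepted across pandas versions; bins are then labeled by the first day of the month rather than the last.

```python
import pandas as pd

# Hypothetical daily series: 14 consecutive days starting on Monday 2024-01-29.
daily_index = pd.date_range("2024-01-29", periods=14, freq="D")
daily_sales = pd.Series(range(1, 15), index=daily_index, name="Sales")

# Weekly bins ('W' ends each bin on Sunday by default).
weekly_sum = daily_sales.resample("W").sum()

# Monthly bins labeled by month start.
monthly_mean = daily_sales.resample("MS").mean()

print(weekly_sum)
print(monthly_mean)
```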

In [9]:

# 1. Convert OrderDate to datetime
df["OrderDate"] = pd.to_datetime(df["OrderDate"], dayfirst=True, errors="coerce")
print("DataFrame after converting dates:")
print(df)

# 2. Set OrderDate as the index
df_idx = df.set_index("OrderDate")
print("\nDataFrame with OrderDate as index:")
print(df_idx)

# 3. Resample by month to compute total Amount per month
monthly_total = df_idx.resample("M")["Amount"].sum()
print("\nTotal Amount per month:")
print(monthly_total)

# 4. Resample by week to compute average Amount per week
weekly_avg = df_idx.resample("W")["Amount"].mean()
print("\nAverage Amount per week:")
print(weekly_avg)

DataFrame after converting dates:
   OrderID  OrderDate Customer  Amount
0     1001 2024-05-01    Alice     250
1     1002        NaT      Bob     400
2     1003        NaT  Charlie     150
3     1004        NaT    David     300
4     1005        NaT     Emma     500

DataFrame with OrderDate as index:
            OrderID Customer  Amount
OrderDate                           
2024-05-01     1001    Alice     250
NaT            1002      Bob     400
NaT            1003  Charlie     150
NaT            1004    David     300
NaT            1005     Emma     500

Total Amount per month:
OrderDate
2024-05-31    250
Name: Amount, dtype: int64

Average Amount per week:
OrderDate
2024-05-05    250.0
Name: Amount, dtype: float64
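Only one order appears in both results: rows whose index is `NaT` fall into no time bin, so `resample` leaves them out without any warning. A minimal sketch of guarding against that, assuming a small made-up frame; it counts and drops unparsed rows before indexing so the loss is explicit.

```python
import pandas as pd

# Made-up orders; the middle date is missing and becomes NaT.
orders = pd.DataFrame({
    "OrderDate": pd.to_datetime(["2024-01-05", None, "2024-01-20"]),
    "Amount": [250, 400, 300],
})

# Make the data loss visible instead of letting resample skip NaT rows silently.
n_missing = int(orders["OrderDate"].isna().sum())
print(f"Dropping {n_missing} row(s) with unparsed dates")

clean = orders.dropna(subset=["OrderDate"]).set_index("OrderDate").sort_index()
weekly_total = clean["Amount"].resample("W").sum()
print(weekly_total)
```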




## Mini-Project

This mini-project combines the two topics covered so far:
- Converting columns to datetime (`pd.to_datetime`)
- Setting a DateTime index & resampling

In [10]:
import pandas as pd

# 1. Create a dataset of website visits (string dates + visit counts)
data = {
    "VisitDate": [
        "2024-01-03", "2024/01/08", "15-01-2024", "2024-02-02",
        "2024-02-15", "2024/03/01", "2024-03-12", "01-04-2024"
    ],
    "Visitors": [120, 150, 90, 200, 180, 220, 210, 300]
}

df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# 2. Convert VisitDate to datetime
df["VisitDate"] = pd.to_datetime(df["VisitDate"], dayfirst=True, errors="coerce")
print("\nAfter converting VisitDate to datetime:")
print(df)

# 3. Set VisitDate as the index
df = df.set_index("VisitDate")
print("\nDataFrame with VisitDate as index:")
print(df)

# 4. Resample to get total visitors per month
monthly_total = df.resample("M")["Visitors"].sum()
print("\nTotal visitors per month:")
print(monthly_total)

# 5. Resample to get average visitors per week
weekly_avg = df.resample("W")["Visitors"].mean()
print("\nAverage visitors per week:")
print(weekly_avg)


Original DataFrame:
    VisitDate  Visitors
0  2024-01-03       120
1  2024/01/08       150
2  15-01-2024        90
3  2024-02-02       200
4  2024-02-15       180
5  2024/03/01       220
6  2024-03-12       210
7  01-04-2024       300

After converting VisitDate to datetime:
   VisitDate  Visitors
0 2024-03-01       120
1        NaT       150
2        NaT        90
3 2024-02-02       200
4        NaT       180
5        NaT       220
6 2024-12-03       210
7        NaT       300

DataFrame with VisitDate as index:
            Visitors
VisitDate           
2024-03-01       120
NaT              150
NaT               90
2024-02-02       200
NaT              180
NaT              220
2024-12-03       210
NaT              300

Total visitors per month:
VisitDate
2024-02-29    200
2024-03-31    120
2024-04-30      0
2024-05-31      0
2024-06-30      0
2024-07-31      0
2024-08-31      0
2024-09-30      0
2024-10-31      0
2024-11-30      0
2024-12-31    210
Name: Visitors, dtype: int64

Avera
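The monthly table above shows the effect of the mis-parsed dates: only three of the eight rows survive, two of them with day and month swapped (3 January became 1 March, and 12 March became 3 December), and the empty months between March and December are reported as 0 by `sum()`. A hedged sketch of the same pipeline with explicit formats follows. The format list is an assumption about which layouts occur in `VisitDate`, and grouping by `to_period("M")` is swapped in for `resample("M")` to sidestep the alias differences between pandas versions.

```python
import pandas as pd

visits = pd.DataFrame({
    "VisitDate": [
        "2024-01-03", "2024/01/08", "15-01-2024", "2024-02-02",
        "2024-02-15", "2024/03/01", "2024-03-12", "01-04-2024",
    ],
    "Visitors": [120, 150, 90, 200, 180, 220, 210, 300],
})

# Assumed layouts present in the column; dd-mm-YYYY is read day-first.
formats = ["%Y-%m-%d", "%Y/%m/%d", "%d-%m-%Y"]

parsed = None
for fmt in formats:
    attempt = pd.to_datetime(visits["VisitDate"], format=fmt, errors="coerce")
    # Keep earlier successful parses; fill only what is still NaT.
    parsed = attempt if parsed is None else parsed.fillna(attempt)

visits["VisitDate"] = parsed
visits = visits.set_index("VisitDate").sort_index()

# Group by calendar month (a PeriodIndex) instead of resampling.
monthly_total = visits.groupby(visits.index.to_period("M"))["Visitors"].sum()
print(monthly_total)
```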



---
## Summary
- Converted columns to datetime with `pd.to_datetime`.
- Set DateTime index and resampled data.
- Performed date-based filtering and slicing.
- Applied rolling and expanding window calculations.
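
For the last two points, a minimal sketch assuming a made-up daily series: partial-string slicing with `.loc` for date-based filtering, and `rolling` / `expanding` for window calculations.

```python
import pandas as pd

# Hypothetical daily visitor counts; 2024-01-01 is a Monday.
index = pd.date_range("2024-01-01", periods=10, freq="D")
visitors = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], index=index)

# Date-based slicing: both endpoints are inclusive on a sorted DatetimeIndex.
first_week = visitors.loc["2024-01-01":"2024-01-07"]

# Boolean filtering on a date component (Saturdays and Sundays).
weekend = visitors[visitors.index.dayofweek >= 5]

# Rolling mean over a fixed 3-row window; the first two values are NaN.
rolling_mean = visitors.rolling(window=3).mean()

# Expanding (cumulative) mean over all rows seen so far.
expanding_mean = visitors.expanding().mean()

print(first_week.sum(), weekend.tolist())
print(rolling_mean.tail(3).tolist(), expanding_mean.iloc[-1])
```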