# 5. Time Series Analysis
- Time series analysis covers methods for working with time-ordered data. Pandas provides tools for parsing, generating, resampling, and aggregating such data, which makes it well suited to finance, economics, and other fields that depend on time-dependent analysis.

## Date and Time Handling
**1. Parsing Dates**

- Parsing dates means converting date and time strings into pandas datetime objects. Time-based comparisons, arithmetic, and indexing only work reliably once this conversion is done. `pd.to_datetime()` accepts many string layouts, and an explicit `format` argument removes the ambiguity between layouts such as month-first and day-first.

#### Common Date Formats:

- `'YYYY-MM-DD'`

- `'MM/DD/YYYY'`

- `'DD-MM-YYYY'`

Example:

In [None]:
import pandas as pd

# Example data with date strings
data = {
    'Date': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'Value': [10, 20, 30]
}
df = pd.DataFrame(data)

# Parsing dates
df['Date'] = pd.to_datetime(df['Date'])
print(df)
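
The non-ISO layouts listed above are ambiguous unless the layout is stated explicitly. The following is a minimal sketch, assuming made-up input strings (`raw` and `day_first_raw` are illustrative names, not part of the original data), that passes a `format` string to `pd.to_datetime()` and uses `errors='coerce'` to turn unparseable entries into `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical month-first strings, including one invalid entry
raw = pd.Series(['01/15/2024', '02/20/2024', 'not a date'])

# An explicit format avoids guessing; errors='coerce' maps bad values to NaT
parsed = pd.to_datetime(raw, format='%m/%d/%Y', errors='coerce')
print(parsed)

# Hypothetical day-first strings in 'DD-MM-YYYY' layout
day_first_raw = pd.Series(['15-01-2024', '20-02-2024'])
day_first = pd.to_datetime(day_first_raw, format='%d-%m-%Y')
print(day_first)
```

Stating the format is usually safer than relying on inference, because a string such as `'01-02-2024'` is valid under both month-first and day-first readings.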


**2. Date Ranges and Frequency**

- A date range is a sequence of dates generated at a fixed interval, called the frequency. Date ranges are useful for building a time series index with consistent spacing, such as daily, monthly, or yearly. `pd.date_range()` creates these sequences from a start date plus either an end date or a number of periods, with the interval set by the `freq` argument.

#### Common Frequencies:

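- `'D'`: Daily
- `'W'`: Weekly
- `'M'`: Monthly (month end; spelled `'ME'` in pandas 2.2 and later)
- `'Q'`: Quarterly (quarter end; spelled `'QE'` in pandas 2.2 and later)
- `'A'`: Annually (year end; spelled `'YE'` in pandas 2.2 and later)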



Example:

In [None]:
import pandas as pd

# Generate a date range with daily frequency
date_range = pd.date_range(start='2024-01-01', end='2024-01-10', freq='D')
print(date_range)
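
Besides `start` and `end`, `pd.date_range()` also accepts a `periods` count, and frequency aliases can carry a multiplier. The sketch below is an illustration only (the variable names are assumptions); it uses the weekly alias `'W'`, the business-day alias `'B'`, and a two-day step `'2D'`.

```python
import pandas as pd

# Four weekly dates; 'W' anchors on Sundays by default
weekly = pd.date_range(start='2024-01-01', periods=4, freq='W')
print(weekly)

# Business days only (Monday to Friday) between two dates
business = pd.date_range(start='2024-01-05', end='2024-01-10', freq='B')
print(business)

# A multiplier in front of the alias changes the step size
every_other_day = pd.date_range(start='2024-01-01', periods=3, freq='2D')
print(every_other_day)
```

Because 2024-01-01 falls on a Monday, the weekly range starts on the following Sunday, 2024-01-07, rather than on the start date itself.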


## Time Series Data
**1. Resampling and Frequency Conversion**

- Resampling changes the frequency of a time series, for example by converting daily data into monthly or yearly summaries. The `resample()` method groups rows into the new time bins; you then aggregate each bin when lowering the frequency, or fill or interpolate the gaps when raising it.

#### Resampling Methods:

- `Aggregation:` Summarizing data over new time periods, e.g., daily to monthly.
- `Downsampling:` Reducing the frequency, e.g., from minute-level to hourly.
- `Upsampling:` Increasing the frequency, e.g., from daily to hourly, often requiring interpolation.



Example:


In [None]:
import pandas as pd

# Create a time series with daily frequency
date_range = pd.date_range(start='2024-01-01', periods=10, freq='D')
df = pd.DataFrame({'Value': range(10)}, index=date_range)

# Resample to monthly frequency and calculate the mean
# ('M' bins by month end; pandas 2.2 and later prefer the alias 'ME')
df_resampled = df.resample('M').mean()
print(df_resampled)
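
The downsampling and upsampling cases listed above can be sketched in the same way. This is a minimal illustration on made-up data: a series sampled every two days (the name `s` and its values are assumptions) is upsampled to daily frequency, once with forward fill and once with linear interpolation, and then downsampled into 4-day bins.

```python
import pandas as pd

# Hypothetical series with one observation every two days
idx = pd.date_range(start='2024-01-01', periods=5, freq='2D')
s = pd.Series([0.0, 2.0, 4.0, 6.0, 8.0], index=idx)

# Upsampling to daily: new rows appear and must be filled somehow
up_ffill = s.resample('D').ffill()         # repeat the last known value
up_interp = s.resample('D').interpolate()  # linear interpolation between points
print(pd.DataFrame({'ffill': up_ffill, 'interpolate': up_interp}))

# Downsampling to 4-day bins: each bin is aggregated, here with a sum
down = s.resample('4D').sum()
print(down)
```

Forward fill keeps only values that were actually observed, while interpolation invents intermediate values; which one is appropriate depends on what the data represents.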


**2. Rolling Window Calculations**

- Rolling window calculations apply a function (e.g., mean, sum) to a fixed-size window that slides over the data one step at a time. They are used to smooth a time series and to compute statistics over a moving period, such as a moving average.

#### Types of Rolling Calculations:

- `Rolling Mean:` Computes the average within the window.
- `Rolling Sum:` Computes the sum within the window.
- `Rolling Standard Deviation:` Measures the variability within the window.


Example:

In [None]:
import pandas as pd

# Create a time series with daily frequency
date_range = pd.date_range(start='2024-01-01', periods=10, freq='D')
df = pd.DataFrame({'Value': range(10)}, index=date_range)

# Calculate a 3-day rolling mean (the first two rows are NaN because the window is not yet full)
df['Rolling_Mean'] = df['Value'].rolling(window=3).mean()
print(df)
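
The other rolling calculations listed above follow the same pattern. The sketch below is an illustration on the same kind of made-up daily data (the names `s` and `stats` are assumptions); it computes a rolling sum and rolling standard deviation, and shows `min_periods=1`, which lets the window produce a result before it is full.

```python
import pandas as pd

# Hypothetical daily series with the values 0..9
idx = pd.date_range(start='2024-01-01', periods=10, freq='D')
s = pd.Series(range(10), index=idx, dtype='float64')

window = s.rolling(window=3)
stats = pd.DataFrame({
    'value': s,
    'rolling_sum': window.sum(),  # NaN until three observations are available
    'rolling_std': window.std(),  # sample standard deviation (ddof=1)
    # min_periods=1 computes the mean over however many rows exist so far
    'mean_min1': s.rolling(window=3, min_periods=1).mean(),
})
print(stats)
```

Lowering `min_periods` avoids leading `NaN` values, at the cost of early results that are based on fewer observations than the rest.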


**3. Time Zone Handling**

- Time zone handling matters when timestamps come from different time zones. Pandas can localize naive timestamps to a specific time zone and convert time-zone-aware timestamps between zones, so that values recorded in different regions refer to the same instants and can be compared directly.


#### Types of Time Zone Operations:

- `Localization:` Assigning a time zone to naive datetime objects.
- `Conversion:` Changing the time zone of datetime objects to a different time zone.


Example:

In [None]:
import pandas as pd

# Create a time series with naive datetime (no time zone information)
date_range = pd.date_range(start='2024-01-01', periods=3, freq='D')
df = pd.DataFrame({'Value': [1, 2, 3]}, index=date_range)

# Localize to a specific time zone
df = df.tz_localize('America/New_York')
print(df)

# Convert to another time zone
df_utc = df.tz_convert('UTC')
print(df_utc)
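
Localization and conversion also work on a datetime column through the `.dt` accessor, not only on the index. The following is a minimal sketch with made-up timestamps (the column names `Timestamp`, `Local`, and `UTC` are assumptions); it also shows that the UTC offset of `America/New_York` differs between winter and summer because of daylight saving time.

```python
import pandas as pd

# Hypothetical naive timestamps: one in winter, one in summer
df = pd.DataFrame({
    'Timestamp': pd.to_datetime(['2024-01-01 09:00', '2024-07-01 09:00']),
    'Value': [1, 2],
})

# Localization: attach a time zone to the naive column
df['Local'] = df['Timestamp'].dt.tz_localize('America/New_York')

# Conversion: express the same instants in UTC
df['UTC'] = df['Local'].dt.tz_convert('UTC')
print(df[['Local', 'UTC']])
```

The same 09:00 local time maps to 14:00 UTC in January and 13:00 UTC in July, which is why a fixed hour offset is not a reliable substitute for a named time zone.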
