# Assignment: Time-Series Data Analysis

This assignment guides you through analyzing a time-series dataset with pandas to identify underlying trends, patterns, and seasonal variations.

#### Setup: Generate the Time-Series Dataset

Run the following code in this Jupyter Notebook to generate your synthetic time-series dataset:


In [2]:
import pandas as pd
import numpy as np

# Seed for reproducibility
np.random.seed(0)

# Generate a date range
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")

# Generate synthetic time-series data:
# uniform noise in [0, 200) + a sine wave with amplitude 50 + a constant offset of 100
data = {
    "Date": dates,
    "Sales": np.random.rand(len(dates)) * 200
    + np.sin(np.linspace(-3, 3, len(dates))) * 50
    + 100,
}

# Create DataFrame
df = pd.DataFrame(data)
df.set_index("Date", inplace=True)

#### Task 1: Initial Exploration

Begin with an initial exploration to understand your dataset's structure and main components.

1. **Display the first few rows of the dataset to get an idea of its structure.** Insert your code below:


In [3]:
# INSERT CODE HERE to display the first 5 rows of the dataframe
print(df.head())

                 Sales
Date                  
2020-01-01  202.706700
2020-01-02  235.169170
2020-01-03  211.873396
2020-01-04  199.489126
2020-01-05  174.437782


2. **Generate a quick statistical summary of the 'Sales' column.** Insert your code below:


In [4]:
# INSERT CODE HERE to generate a statistical summary for 'Sales'
print(df["Sales"].describe())

count    366.000000
mean     198.310915
std       67.669889
min       51.917902
25%      153.612585
50%      195.180813
75%      247.201742
max      345.940247
Name: Sales, dtype: float64

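Before resampling, it can help to confirm that the index is datetime-like and that no dates are missing or duplicated. The following is a minimal sketch, not part of the original assignment. It rebuilds the dataset from the setup cell so that it runs on its own, and the variable names (`is_datetime_index`, `n_missing`, `n_duplicate_dates`, `n_gaps`) are illustrative choices.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic dataset as in the setup cell
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = (
    np.random.rand(len(dates)) * 200
    + np.sin(np.linspace(-3, 3, len(dates))) * 50
    + 100
)
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# resample() needs a datetime-like index, so check the index type first
is_datetime_index = isinstance(df.index, pd.DatetimeIndex)

# Count missing sales values and duplicated dates
n_missing = int(df["Sales"].isna().sum())
n_duplicate_dates = int(df.index.duplicated().sum())

# Count calendar days between the first and last date that have no row
expected_days = pd.date_range(df.index.min(), df.index.max(), freq="D")
n_gaps = len(expected_days.difference(df.index))

print(is_datetime_index, n_missing, n_duplicate_dates, n_gaps, len(df))
```

Because 2020 is a leap year, a complete daily index for this range has 366 rows, which matches the `count` in the summary above.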

#### Task 2: Time-Series Analysis

Dive deeper into the time-series data to identify trends and patterns.


1. **Calculate the monthly average sales.** Insert your code below:


In [5]:
# INSERT CODE HERE to resample the data by month and calculate average sales
# Note: the "ME" (month-end) alias requires pandas 2.2 or later; older versions use "M"
monthly_sales = df["Sales"].resample("ME").mean()
print(monthly_sales)

Date
2020-01-31    195.460937
2020-02-29    153.416845
2020-03-31    134.496599
2020-04-30    162.798116
2020-05-31    176.783977
2020-06-30    180.825404
2020-07-31    205.664190
2020-08-31    226.765056
2020-09-30    253.600081
2020-10-31    257.958790
2020-11-30    216.756263
2020-12-31    212.977238
Freq: ME, Name: Sales, dtype: float64


2. **Identify any obvious trends in monthly average sales.** (For now, describe the trend in a markdown cell. Later, you'll learn how to visualize these trends for better insight.)


In [6]:
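# INSERT YOUR OBSERVATION HERE about any trends in monthly average sales
# The monthly averages do not rise steadily over the year. They fall from
# January (about 195.5) to a low in March (about 134.5), climb to a peak in
# October (about 258.0), and then decline through November and December
# (about 213.0). This matches the single sine cycle built into the data.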

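One way to check a written observation against the numbers is to locate the lowest and highest monthly averages and look at the sign of each month-over-month change. The sketch below is an illustration rather than the assignment's required solution. It groups by calendar month with `to_period("M")` instead of `resample("ME")` so that it does not depend on which resample alias the installed pandas version accepts, and the names `low_month`, `high_month`, and `changes` are assumptions introduced here.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic dataset as in the setup cell
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = (
    np.random.rand(len(dates)) * 200
    + np.sin(np.linspace(-3, 3, len(dates))) * 50
    + 100
)
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# Monthly means, grouped by calendar month (equivalent to a monthly resample here)
monthly_sales = df["Sales"].groupby(df.index.to_period("M")).mean()

# Months with the lowest and highest average sales
low_month = monthly_sales.idxmin()
high_month = monthly_sales.idxmax()

# Month-over-month change: negative values mark declines, positive values rises
changes = monthly_sales.diff().dropna()

print("Lowest month: ", low_month, round(monthly_sales.min(), 2))
print("Highest month:", high_month, round(monthly_sales.max(), 2))
print(changes.round(2))
```

With the monthly averages printed earlier, this picks out March as the low point and October as the high point, so the series is better described as a dip followed by a rise and a late-year decline than as a single upward trend.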
3. **Calculate the rolling 7-day average of sales to smooth out any short-term fluctuations.** Insert your code below:


In [13]:
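# INSERT CODE HERE to calculate 7-day rolling average of sales
# min_periods=1 averages over however many days are available,
# so the first six values are based on fewer than 7 days
print(df["Sales"].rolling(window=7, min_periods=1).mean())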


Date
2020-01-01    202.706700
2020-01-02    218.937935
2020-01-03    216.583089
2020-01-04    212.309598
2020-01-05    204.735235
                 ...    
2020-12-27    213.487064
2020-12-28    208.759776
2020-12-29    213.083616
2020-12-30    233.800616
2020-12-31    234.879050
Name: Sales, Length: 366, dtype: float64

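The `min_periods` argument changes how the start of the series is handled. As a minimal sketch (the names `rolling_partial` and `rolling_full` are illustrative, and the dataset is rebuilt so the block runs on its own), the comparison below shows that the default requires a full 7-day window and leaves the first six values as `NaN`, while `min_periods=1` fills them with averages over shorter windows.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic dataset as in the setup cell
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = (
    np.random.rand(len(dates)) * 200
    + np.sin(np.linspace(-3, 3, len(dates))) * 50
    + 100
)
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# Partial windows allowed: every date gets a value
rolling_partial = df["Sales"].rolling(window=7, min_periods=1).mean()

# Default behavior: a value is produced only once 7 observations are available
rolling_full = df["Sales"].rolling(window=7).mean()

n_nan_partial = int(rolling_partial.isna().sum())
n_nan_full = int(rolling_full.isna().sum())

print("NaN count with min_periods=1:", n_nan_partial)
print("NaN count with default min_periods:", n_nan_full)
print(pd.DataFrame({"partial": rolling_partial, "full": rolling_full}).head(8))
```

From the seventh day onward the two series agree, so the choice only affects how the first week is reported.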