# Assignment: Time-Series Data Analysis

This assignment guides you through analyzing a synthetic time-series dataset with Pandas to identify underlying trends, patterns, and seasonal variations.

#### Setup: Generate the Time-Series Dataset

Run the following code in this Jupyter Notebook to generate your synthetic time-series dataset:


In [4]:
import pandas as pd
import numpy as np

# Seed for reproducibility
np.random.seed(0)

# Generate a date range
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")

# Generate synthetic time-series data
data = {
    "Date": dates,
    "Sales": np.random.rand(len(dates)) * 200
    + np.sin(np.linspace(-3, 3, len(dates))) * 50
    + 100,
}

# Create DataFrame
df = pd.DataFrame(data)
df.set_index("Date", inplace=True)
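A quick sanity check on the generated data can catch setup problems before the analysis starts. The sketch below rebuilds the same dataset and confirms two things that follow from the setup code: 2020 is a leap year, so the daily range has 366 rows, and every value must fall within the bounds implied by the formula (uniform noise in [0, 200), a sinusoid of amplitude 50, and an offset of 100). The names `noise`, `wave`, `lower`, and `upper` are illustrative and not part of the assignment.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
noise = np.random.rand(len(dates)) * 200              # uniform noise in [0, 200)
wave = np.sin(np.linspace(-3, 3, len(dates))) * 50    # smooth component in [-50, 50]
df = pd.DataFrame({"Date": dates, "Sales": noise + wave + 100}).set_index("Date")

# Bounds implied by the formula: offset - amplitude, offset + max noise + amplitude.
lower, upper = 100 - 50, 100 + 200 + 50

print(len(df))                                   # number of daily rows
print(df["Sales"].between(lower, upper).all())   # True if all values are in range
```

Splitting the formula into `noise` and `wave` also makes it easier to see later why the monthly averages follow a smooth curve while the daily values jump around.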

#### Task 1: Initial Exploration

Begin with an initial exploration to understand your dataset's structure and main components.

1. **Display the first few rows of the dataset to get an idea of its structure.** Insert your code below:


In [11]:
# INSERT CODE HERE to display the first 5 rows of the dataframe
df.head()

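# Other methods and attributes that help with initial exploration:
# df.head()      # first 5 rows
# df.info()      # index range, column dtypes, non-null counts, memory usage
# df.describe()  # summary statistics for numeric columns
# df.shape       # (number of rows, number of columns)
# df.values      # underlying data as a NumPy array
# df.columns     # column labels
# df.index       # row labels (here, the dates)

Output of `df.info()` and `df.index` from the list above:

<class 'pandas.core.frame.DataFrame'>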
DatetimeIndex: 366 entries, 2020-01-01 to 2020-12-31
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Sales   366 non-null    float64
dtypes: float64(1)
memory usage: 5.7 KB


DatetimeIndex(['2020-01-01', '2020-01-02', '2020-01-03', '2020-01-04',
               '2020-01-05', '2020-01-06', '2020-01-07', '2020-01-08',
               '2020-01-09', '2020-01-10',
               ...
               '2020-12-22', '2020-12-23', '2020-12-24', '2020-12-25',
               '2020-12-26', '2020-12-27', '2020-12-28', '2020-12-29',
               '2020-12-30', '2020-12-31'],
              dtype='datetime64[ns]', name='Date', length=366, freq=None)
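The remaining attributes from the list each answer a different structural question. The following sketch, which rebuilds the dataset so it can run on its own, prints the shape, the column labels, the index type, and the number of missing values. It also calls `pd.infer_freq`, which may be useful here because the index above reports `freq=None` even though the dates are evenly spaced.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = np.random.rand(len(dates)) * 200 + np.sin(np.linspace(-3, 3, len(dates))) * 50 + 100
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

print(df.shape)                    # (rows, columns)
print(df.columns.tolist())         # column labels
print(type(df.index).__name__)     # kind of index
print(df["Sales"].isna().sum())    # count of missing values
print(pd.infer_freq(df.index))     # spacing inferred from the dates themselves
```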

2. **Generate a quick statistical summary of the 'Sales' column.** Insert your code below:


In [None]:
# INSERT CODE HERE to generate a statistical summary for 'Sales'
df["Sales"].describe()

count    366.000000
mean     198.310915
std       67.669889
min       51.917902
25%      153.612585
50%      195.180813
75%      247.201742
max      345.940247
Name: Sales, dtype: float64
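Because `describe()` returns a Series, individual statistics can be pulled out by label and combined. As a rough sketch, the block below derives the interquartile range and the coefficient of variation from the summary. The variable names `iqr` and `cv` are illustrative, and the dataset is rebuilt so the example is self-contained.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = np.random.rand(len(dates)) * 200 + np.sin(np.linspace(-3, 3, len(dates))) * 50 + 100
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

stats = df["Sales"].describe()

# Spread of the middle 50% of daily values.
iqr = stats["75%"] - stats["25%"]

# Standard deviation relative to the mean (unitless measure of variability).
cv = stats["std"] / stats["mean"]

print(f"IQR: {iqr:.2f}")
print(f"Coefficient of variation: {cv:.2f}")
```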

#### Task 2: Time-Series Analysis

Dive deeper into the time-series data to identify trends and patterns.


1. **Calculate the monthly average sales.** Insert your code below:


In [None]:
# INSERT CODE HERE to resample the data by month and calculate average sales
# Note: the "ME" (month-end) alias requires pandas 2.2 or later; older versions use "M".
df.resample("ME").mean()

                 Sales
Date                  
2020-01-31  195.460937
2020-02-29  153.416845
2020-03-31  134.496599
2020-04-30  162.798116
2020-05-31  176.783977
2020-06-30  180.825404
2020-07-31  205.664190
2020-08-31  226.765056
2020-09-30  253.600081
2020-10-31  257.958790
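The month-end alias differs between pandas versions, so a notebook that runs on one installation may fail on another. The sketch below tries `"ME"` first and falls back to the older `"M"` alias, then cross-checks the result against a plain `groupby` on the month number. The `groupby` form is only equivalent here because the data covers a single calendar year; with several years it would pool all Januaries together.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = np.random.rand(len(dates)) * 200 + np.sin(np.linspace(-3, 3, len(dates))) * 50 + 100
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# "ME" is accepted by pandas 2.2+; older releases raise ValueError and need "M".
try:
    monthly = df["Sales"].resample("ME").mean()
except ValueError:
    monthly = df["Sales"].resample("M").mean()

# Alternative: group by calendar month number (1-12) taken from the index.
by_month = df["Sales"].groupby(df.index.month).mean()

print(len(monthly))                                           # one row per month
print(np.allclose(monthly.to_numpy(), by_month.to_numpy()))   # both approaches agree
```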


2. **Identify any obvious trends in monthly average sales.** (For now, describe the trend in a markdown cell. Later, you'll learn how to visualize these trends for better insight.)


In [None]:
# INSERT YOUR OBSERVATION HERE about any trends in monthly average sales
# (the monthly averages are shown again as a basis for the observation)
df["Sales"].resample("ME").mean()
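Before plotting is available, one way to make a trend easier to describe is to look at how each monthly average differs from the previous one. The sketch below computes that month-over-month change with `diff()` and reports which months have the lowest and highest averages. The variable name `change` is illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = np.random.rand(len(dates)) * 200 + np.sin(np.linspace(-3, 3, len(dates))) * 50 + 100
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# Monthly averages ("ME" on pandas 2.2+, "M" on older versions).
try:
    monthly = df["Sales"].resample("ME").mean()
except ValueError:
    monthly = df["Sales"].resample("M").mean()

# Difference from the previous month; the first entry is NaN (nothing to compare with).
change = monthly.diff()

print(change.round(1))
print("Lowest month: ", monthly.idxmin().strftime("%B"))
print("Highest month:", monthly.idxmax().strftime("%B"))
```

A run of negative values followed by a run of positive values in `change` would indicate a dip and recovery rather than a steady one-directional trend.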

3. **Calculate the rolling 7-day average of sales to smooth out any short-term fluctuations.** Insert your code below:


In [None]:
# INSERT CODE HERE to calculate a rolling 7-day average of sales
df.rolling(window=7).mean()

                 Sales
Date                  
2020-01-01         NaN
2020-01-02         NaN
2020-01-03         NaN
2020-01-04         NaN
2020-01-05         NaN
...                ...
2020-12-27  213.487064
2020-12-28  208.759776
2020-12-29  213.083616
2020-12-30  233.800616
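The empty values at the start of the rolling output appear because a window of 7 rows needs 7 observations before it can produce a mean. As a sketch of how the smoothing relates to volatility, the block below counts those leading gaps, compares the standard deviation of the daily values with that of the rolling average, and shows a time-based `"7D"` window as an alternative that returns a value from the first day onward. The names `rolling_7`, `rolling_7d`, `daily_std`, and `rolling_std` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the assignment's dataset so the example runs on its own.
np.random.seed(0)
dates = pd.date_range(start="2020-01-01", end="2020-12-31", freq="D")
sales = np.random.rand(len(dates)) * 200 + np.sin(np.linspace(-3, 3, len(dates))) * 50 + 100
df = pd.DataFrame({"Date": dates, "Sales": sales}).set_index("Date")

# Row-count window: the first 6 results are NaN because the window is not yet full.
rolling_7 = df["Sales"].rolling(window=7).mean()
print(rolling_7.isna().sum())

# Time-based window: covers the last 7 calendar days and needs only 1 observation.
rolling_7d = df["Sales"].rolling("7D").mean()
print(rolling_7d.isna().sum())

# Compare variability before and after smoothing.
daily_std = df["Sales"].std()
rolling_std = rolling_7.std()
print(f"Daily std:   {daily_std:.2f}")
print(f"Rolling std: {rolling_std:.2f}")
```

On gap-free daily data the two window types agree wherever the row-count window is full; they differ only in how they treat the first six days.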


#### Task 3: Insights Reporting

Reflect on the analysis performed and answer the following questions in a markdown cell in your Jupyter Notebook:

1. What are the overall trends that you can observe in the data?
2. Did you notice any seasonal variations in monthly average sales?
3. How did the 7-day rolling average compare to the daily sales figures? What does this tell you about the volatility of the sales data?


1. INSERT YOUR ANSWER HERE
2. INSERT YOUR ANSWER HERE
3. INSERT YOUR ANSWER HERE
