In [1]:
# q1 

In [None]:
Time-dependent seasonal components refer to patterns or variations in data that occur periodically over time, with the characteristics of these patterns changing over different time periods. These components are typically observed in time series data and are influenced by recurring events or factors that follow a seasonal or cyclical pattern.

To better understand time-dependent seasonal components, let's consider an example. Suppose we have a dataset that records the daily sales of ice cream in a beach town over several years. The data would consist of a timestamp (date) and the corresponding sales value for each day.

Upon analyzing this dataset, we might observe certain seasonal patterns that repeat each year. For example, we may notice that the sales of ice cream are generally higher during the summer months compared to the rest of the year. Within each summer season, we may also observe smaller cyclical patterns, such as increased sales during weekends or holidays.

These patterns can be considered time-dependent seasonal components. They exhibit a periodic nature that occurs within specific time intervals (e.g., yearly or within a particular season) and can be attributed to factors like weather conditions, tourist influx, or vacation periods.

By identifying and modeling these time-dependent seasonal components, we can gain insights into the underlying patterns and make more accurate predictions or forecasts. This can be particularly useful for businesses to optimize inventory management, marketing strategies, and resource allocation based on seasonal fluctuations.

In summary, time-dependent seasonal components represent recurring patterns or variations in data that occur periodically over time, with the characteristics of these patterns changing across different time periods. They provide valuable information about seasonal trends and can be leveraged for forecasting and decision-making purposes.
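
The ice cream example can be made concrete with a small synthetic series. This is a minimal sketch: every number in it (baseline of 200, amplitude growing by 20 per year, weekend uplift of 15) is an assumption chosen for illustration, not real sales data. It shows a yearly pattern whose strength changes over time, which is what makes the seasonal component time-dependent.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Three years of hypothetical daily sales (all values are synthetic)
dates = pd.date_range("2020-01-01", "2022-12-31", freq="D")
day_of_year = dates.dayofyear.to_numpy()
years_elapsed = (dates.year - dates.year[0]).to_numpy()

# Yearly cycle peaking in mid-July, with an amplitude that grows each year
amplitude = 50 + 20 * years_elapsed
yearly = amplitude * np.cos(2 * np.pi * (day_of_year - 196) / 365.25)

# Weekend uplift (dayofweek: Saturday=5, Sunday=6)
weekend = np.where(dates.dayofweek >= 5, 15.0, 0.0)

noise = rng.normal(0, 5, len(dates))
sales = pd.Series(200 + yearly + weekend + noise, index=dates)

# Gap between average July and average January sales, per year
july = sales[sales.index.month == 7]
january = sales[sales.index.month == 1]
seasonal_gap = july.groupby(july.index.year).mean() - january.groupby(january.index.year).mean()
print(seasonal_gap)
```

The July-minus-January gap widens from year to year because the seasonal amplitude was built to grow, so a single fixed seasonal profile would not describe all three years equally well.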

In [4]:
#q2

In [None]:
To identify time-dependent seasonal components in time series data, you can use various techniques such as visual inspection, autocorrelation analysis, and time series decomposition. Here's an overview of each method:

1. Visual Inspection:
   - Plot the time series data and examine it for recurring patterns or cycles. Look for regular fluctuations that occur at fixed intervals, such as daily, weekly, monthly, or yearly patterns.

2. Autocorrelation Analysis:
   - Autocorrelation measures the correlation between a time series and a lagged version of itself. By calculating the autocorrelation function (ACF) or partial autocorrelation function (PACF) of the time series, you can identify significant lags that indicate seasonal patterns.
   - In Python, you can use the `statsmodels` library to calculate autocorrelation functions. The `plot_acf` and `plot_pacf` functions can help visualize the autocorrelation and partial autocorrelation plots, respectively.

3. Time Series Decomposition:
   - Decomposition separates a time series into its constituent components: trend, seasonal, and residual (or error) components.
   - The decomposition methods commonly used are:
     - Additive decomposition: Assumes that the seasonal component has a constant magnitude throughout the time series.
     - Multiplicative decomposition: Assumes that the seasonal component's magnitude varies with the level of the time series.
   - Python libraries like `statsmodels` and `seasonal` offer functions to perform time series decomposition. The resulting seasonal component can be analyzed and visualized.

It's important to note that identifying time-dependent seasonal components may require a combination of these methods, and the choice of approach depends on the characteristics of your data and the nature of the seasonality you expect to find. Additionally, domain knowledge and further analysis might be needed to validate and interpret the identified seasonal components.
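
As an illustration of the autocorrelation approach, the sketch below computes a sample ACF with plain NumPy on a synthetic monthly series and looks for the lag with the strongest correlation. The series, its 12-step period, and the helper name `autocorrelation` are assumptions made for this example; in practice `plot_acf` from `statsmodels` gives the same information as a plot.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


rng = np.random.default_rng(1)

# Synthetic monthly series: a 12-step seasonal cycle plus noise
t = np.arange(480)
series = 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, t.size)

acf_values = autocorrelation(series, max_lag=18)

# Skip lag 0 (always 1) and find the lag with the highest autocorrelation
seasonal_lag = int(np.argmax(acf_values[1:]) + 1)
print("strongest lag:", seasonal_lag)
print("ACF at lag 6:", acf_values[6], "ACF at lag 12:", acf_values[12])
```

A strong positive peak at the seasonal lag, together with a strong negative value at half that lag, is the typical signature of a regular cycle.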

In [None]:
import pandas as pd
import statsmodels.api as sm

# Load your time series data into a pandas DataFrame
# Assuming your data has two columns: 'date' and 'value'
data = pd.read_csv('your_data.csv')

# Convert the 'date' column to datetime type
data['date'] = pd.to_datetime(data['date'])

# Set the 'date' column as the DataFrame index
data.set_index('date', inplace=True)

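# Perform seasonal decomposition using the additive model.
# `period` is the number of observations per seasonal cycle; it must be given
# when the index carries no frequency. 12 is an assumption (monthly data, yearly cycle).
decomposition = sm.tsa.seasonal_decompose(data['value'], model='additive', period=12)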

# Retrieve the seasonal component from the decomposition
seasonal_component = decomposition.seasonal

# Plot the seasonal component
seasonal_component.plot()


In [None]:
#q3

In [None]:
Time-dependent seasonal components in time series data can be influenced by various factors. Here are some key factors that can affect the presence and characteristics of seasonal patterns:

1. Nature of the Phenomenon: The underlying nature of the phenomenon being observed can significantly impact the presence and type of seasonal patterns. For example, in retail sales data, you might expect to see strong seasonal variations due to holidays and shopping seasons. In weather data, seasonal patterns can be influenced by changes in temperature or precipitation levels throughout the year.

2. Human Behavior and Culture: Human behavior, cultural events, and societal factors can introduce seasonal patterns in data. For instance, consumer behavior during holiday seasons, festivals, or vacation periods can lead to regular patterns in sales, travel, or other economic indicators.

3. Calendar Effects: The structure of the calendar, including the presence of holidays, weekends, and special events, can introduce seasonal components. For example, weekly patterns might arise due to variations in behavior between weekdays and weekends, while annual patterns might be influenced by holidays falling on specific dates.

4. Climate and Weather: Time series related to climate and weather often exhibit strong seasonal components. The change in weather conditions across different seasons can lead to predictable variations, such as temperature fluctuations, precipitation patterns, or daylight hours.

5. Business and Industry Factors: Industry-specific factors can introduce seasonal components. For instance, the demand for certain products or services may vary seasonally, such as swimwear in summer or ski equipment in winter. Agricultural data might exhibit seasonal patterns related to planting and harvesting seasons.

6. Economic Factors: Economic conditions can impact seasonal patterns. For example, seasonal fluctuations in employment rates, consumer spending, or tourism can affect various industries and lead to seasonality in corresponding time series.

7. Policy and Regulations: Seasonal patterns can arise due to policy changes or regulatory factors. For example, tax seasons, changes in interest rates, or government policies related to subsidies or incentives can influence seasonal components in economic or financial data.

8. Demographic Factors: Demographic factors, such as population changes or migration patterns, can introduce seasonality in data. For instance, tourist arrivals might exhibit seasonal patterns due to vacation seasons or cultural events in different regions.

It's important to consider these factors and domain-specific knowledge when analyzing time-dependent seasonal components in time series data, as they can provide valuable insights into the underlying causes and help in modeling and forecasting future patterns.

In [None]:
#q4

In [None]:
Autoregressive (AR) models are commonly used in time series analysis and forecasting to capture the dependence of a variable on its past values. An autoregressive model predicts the future values of a time series based on its own previous values.

The general form of an autoregressive model of order p, denoted as AR(p), is as follows:

Y(t) = c + φ1 * Y(t-1) + φ2 * Y(t-2) + ... + φp * Y(t-p) + ε(t)

In this equation:

- Y(t) represents the value of the time series at time t.
- c is a constant term.
- φ1, φ2, ..., φp are the autoregressive coefficients that determine the impact of previous values on the current value.
- Y(t-1), Y(t-2), ..., Y(t-p) are the lagged values of the time series.
- ε(t) is the error term or residual, representing the random noise or unexplained variation.

To use an autoregressive model in time series analysis and forecasting, you typically follow these steps:

1. Data Preparation: Ensure your time series data is stationary, meaning its mean, variance, and autocorrelation structure do not change over time. If necessary, apply transformations such as differencing or a log transformation to achieve stationarity.

2. Model Identification: Determine the appropriate order p for the autoregressive model by examining the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots. For an AR(p) process the ACF decays gradually while the PACF cuts off after lag p, so the last significant PACF lag suggests the order.

3. Model Estimation: Estimate the autoregressive coefficients (φ1, φ2, ..., φp) using methods like least squares estimation or maximum likelihood estimation. The estimation technique depends on the specific implementation or library you're using.

4. Model Diagnostics: Assess the goodness of fit of the autoregressive model. This involves checking that the residuals resemble white noise, for example with the Ljung-Box test for remaining autocorrelation in the residuals.

5. Forecasting: Once the autoregressive model is validated, you can use it for forecasting future values. To forecast, provide the model with the most recent values of the time series and use the estimated constant and autoregressive coefficients to predict the next values.
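
The estimation step can be sketched with ordinary least squares. The example below simulates an AR(2) series with hypothetical parameters (c = 1.0, φ1 = 0.6, φ2 = -0.3, chosen only for illustration) and then recovers them by regressing Y(t) on a constant, Y(t-1), and Y(t-2). It is a sketch of the idea, not a replacement for a library estimator.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate Y(t) = c + φ1*Y(t-1) + φ2*Y(t-2) + ε(t) with hypothetical parameters
c, phi1, phi2 = 1.0, 0.6, -0.3
n = 5000
y = np.zeros(n)
for t in range(2, n):
    y[t] = c + phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

# Least squares: regress Y(t) on [1, Y(t-1), Y(t-2)]
design = np.column_stack([np.ones(n - 2), y[1:-1], y[:-2]])
target = y[2:]
c_hat, phi1_hat, phi2_hat = np.linalg.lstsq(design, target, rcond=None)[0]

print("estimated c, φ1, φ2:", c_hat, phi1_hat, phi2_hat)
```

With a few thousand observations the estimates should land close to the values used in the simulation; shorter series give noisier estimates.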

In [None]:
import pandas as pd
import statsmodels.api as sm

# Load your time series data into a pandas DataFrame
# Assuming your data has two columns: 'date' and 'value'
data = pd.read_csv('your_data.csv')

# Convert the 'date' column to datetime type
data['date'] = pd.to_datetime(data['date'])

# Set the 'date' column as the DataFrame index
data.set_index('date', inplace=True)

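# Create an AR(1) model. The old sm.tsa.AR class is no longer available in
# recent statsmodels releases; AutoReg replaces it and takes the lag order here.
ar_model = sm.tsa.AutoReg(data['value'], lags=1)

# Fit the AR(1) model
ar_result = ar_model.fit()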

# Print the estimated parameters
print(ar_result.params)

# Forecast next 5 values
forecast = ar_result.predict(start=len(data), end=len(data)+4)

# Print the forecasted values
print(forecast)


In [None]:
#q5

In [None]:
To predict future time points with an autoregressive model, follow these steps:

1. Determine the Lag Order: Choose the lag order (p) for the autoregressive model by examining the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots of the time series. For an AR(p) process the PACF cuts off after lag p, so the last significant PACF lag suggests the order.

2. Fit the Autoregressive Model: Fit a model of that order to your historical time series data. This involves estimating the constant and the autoregressive coefficients using methods like least squares estimation or maximum likelihood estimation. The specific approach depends on the library or tool you're using for modeling.

3. Obtain the Lagged Values: To predict future time points, you need to provide the model with the most recent values of the time series. Collect the last p observations, where p is the lag order.

4. Make Predictions: Once the model is fitted and you have the lagged values, use the estimated parameters to make predictions. For a single-step-ahead prediction, multiply each lagged value by its corresponding coefficient, sum the products, and add the constant term c. The result is the predicted value for the next time point. For forecasts further ahead, feed each prediction back in as the newest lagged value and repeat.
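
The prediction step reduces to simple arithmetic once the parameters are known. The sketch below assumes a hypothetical, already-fitted AR(2) model with c = 2.0, φ1 = 0.5, and φ2 = 0.3; the function name `ar_forecast` and all numbers are made up for illustration.

```python
def ar_forecast(history, c, phis, steps):
    """Recursive AR(p) point forecast.

    phis[0] multiplies the most recent value Y(t-1), phis[1] multiplies
    Y(t-2), and so on. history is ordered oldest first.
    """
    window = list(history[-len(phis):])
    forecasts = []
    for _ in range(steps):
        lagged = window[::-1]  # most recent first, to line up with phis
        next_value = c + sum(phi * y for phi, y in zip(phis, lagged))
        forecasts.append(next_value)
        # Drop the oldest value and treat the forecast as the newest observation
        window = window[1:] + [next_value]
    return forecasts


# Hypothetical AR(2): Y(t) = 2 + 0.5*Y(t-1) + 0.3*Y(t-2)
# Last two observations: Y(t-2) = 10, Y(t-1) = 12
forecasts = ar_forecast(history=[10.0, 12.0], c=2.0, phis=[0.5, 0.3], steps=2)
print(forecasts)
```

The first step works out to 2 + 0.5*12 + 0.3*10 = 11, and the second step reuses that forecast: 2 + 0.5*11 + 0.3*12 = 11.1.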

In [6]:
import numpy as np
from sklearn.linear_model import LinearRegression


class AR:
  """AR(p) model fitted by least squares, with Monte Carlo forecasting."""

  def __init__(self, p):
    self.p = p
    self.model = LinearRegression()
    self.sigma = None

  def generate_train_x(self, X):
    # Row t holds the p consecutive values X[t], ..., X[t+p-1] (oldest first)
    n = len(X)
    return np.column_stack([X[k:n - self.p + k] for k in range(self.p)])

  def generate_train_y(self, X):
    # The target for row t is the value that follows its window, X[t+p]
    return X[self.p:]

  def fit(self, X):
    X = np.asarray(X, dtype=float)
    train_x = self.generate_train_x(X)
    train_y = self.generate_train_y(X)
    self.model.fit(train_x, train_y)
    # Noise scale for simulation: standard deviation of the fit residuals
    self.sigma = np.std(train_y - self.model.predict(train_x))

  def predict(self, X, num_predictions, mc_depth):
    # Average mc_depth simulated paths, each num_predictions steps long
    X = np.asarray(X, dtype=float)
    ans = np.zeros(num_predictions)

    for _ in range(mc_depth):
      a = X[-self.p:].copy()  # last p observations, oldest first

      for i in range(num_predictions):
        mean = self.model.predict(a.reshape(1, -1))[0]
        next_value = mean + np.random.normal(loc=0, scale=self.sigma)
        ans[i] += next_value

        # Slide the window forward and append the simulated value
        a = np.roll(a, -1)
        a[-1] = next_value

    return ans / mc_depth

In [None]:
#q6

In [None]:
A Moving Average (MA) model is a time series model that aims to capture the dependence on the past forecast errors (residuals) rather than the past values of the variable itself. It is different from other time series models such as Autoregressive (AR) and Autoregressive Moving Average (ARMA) models. 

The key idea behind the MA model is to model the current value of a time series as a linear combination of the past residual errors. The general form of an MA model of order q, denoted as MA(q), is as follows:

Y(t) = μ + ε(t) + θ1 * ε(t-1) + θ2 * ε(t-2) + ... + θq * ε(t-q)

In this equation:
- Y(t) represents the value of the time series at time t.
- μ is the constant term or mean of the time series.
- ε(t) is the current error term (white noise), and ε(t-1), ..., ε(t-q) are the past residual errors (forecast errors) at lags 1 through q.
- θ1, θ2, ..., θq are the parameters (coefficients) that represent the influence of the past residual errors on the current value of the time series.
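
A short simulation makes the MA(q) equation tangible. The sketch below generates an MA(1) series with hypothetical parameters (μ = 5.0, θ1 = 0.8, chosen only for illustration) and checks a well-known property: the lag-1 autocorrelation is θ1 / (1 + θ1²), and autocorrelations beyond lag q are zero in theory.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulate Y(t) = μ + ε(t) + θ1*ε(t-1) with hypothetical parameters
mu, theta1 = 5.0, 0.8
n = 20000
eps = rng.normal(size=n + 1)
y = mu + eps[1:] + theta1 * eps[:-1]


def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a single positive lag."""
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)


rho1 = sample_autocorr(y, 1)
rho2 = sample_autocorr(y, 2)
theoretical_rho1 = theta1 / (1 + theta1**2)

print("lag 1:", rho1, "theory:", theoretical_rho1)
print("lag 2:", rho2, "theory: 0")
```

This cut-off in the ACF after lag q is what distinguishes an MA(q) series from an AR series, whose ACF decays gradually.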

The key differences between MA models and other time series models are:

1. Autoregressive (AR) Models: AR models capture the dependence of the current value on its past values. They assume that the current value depends on a linear combination of its previous values. In contrast, MA models assume that the current value depends on the past forecast errors.

2. Autoregressive Moving Average (ARMA) Models: ARMA models combine both autoregressive and moving average components. They consider the past values of the time series (autoregressive component) and the past forecast errors (moving average component) to model the current value. ARMA models can capture both short-term dependencies and dependencies due to past forecast errors.

3. Autoregressive Integrated Moving Average (ARIMA) Models: ARIMA models include an additional differencing step to achieve stationarity before applying the ARMA modeling. Ordinary differencing removes trends; seasonal patterns are usually handled with seasonal differencing, as in seasonal ARIMA (SARIMA) models.

MA models are useful for modeling time series whose current value depends on recent random shocks (past forecast errors) rather than on its own past values. Combining an MA component with an AR component gives an Autoregressive Moving Average (ARMA) model, and adding a differencing step gives an Autoregressive Integrated Moving Average (ARIMA) model; which one to use depends on the characteristics of the time series and the specific modeling requirements.

In [None]:
#q7

In [None]:
A mixed Autoregressive Moving Average (ARMA) model, also known as an ARMA(p, q) model, combines both autoregressive (AR) and moving average (MA) components to capture the dependence on both the past values and the past forecast errors of a time series. It differs from pure AR or MA models in that it incorporates both components simultaneously.

The general form of an ARMA(p, q) model is as follows:

Y(t) = c + φ1 * Y(t-1) + φ2 * Y(t-2) + ... + φp * Y(t-p) + θ1 * ε(t-1) + θ2 * ε(t-2) + ... + θq * ε(t-q) + ε(t)

In this equation:
- Y(t) represents the value of the time series at time t.
- c is a constant term.
- φ1, φ2, ..., φp are the autoregressive coefficients that determine the impact of previous values on the current value.
- θ1, θ2, ..., θq are the moving average coefficients that determine the impact of past forecast errors on the current value.
- ε(t-1), ..., ε(t-q) are the past forecast errors at lags 1 through q.
- ε(t) is the current error term or residual, representing the random noise or unexplained variation.

The key differences between mixed ARMA models and pure AR or MA models are as follows:

1. AR Models: AR models capture the dependence of the current value on its own past values. They assume that the current value depends on a linear combination of its previous values.

2. MA Models: MA models capture the dependence of the current value on past forecast errors. They assume that the current value depends on a linear combination of the past forecast errors.

3. ARMA Models: ARMA models combine both AR and MA components. They consider the past values of the time series (AR component) and the past forecast errors (MA component) to model the current value. ARMA models can capture both short-term dependencies and dependencies due to past forecast errors.

Mixed ARMA models provide a more comprehensive framework for modeling time series data that exhibit both autoregressive and moving average properties. They allow for capturing different types of dependencies in the data and can be useful for forecasting, anomaly detection, and understanding the underlying dynamics of the time series.
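
To see how the two components interact, the sketch below simulates an ARMA(1, 1) series directly from the equation above, using hypothetical parameters (c = 0, φ1 = 0.5, θ1 = 0.4). It then checks two standard properties: the lag-1 autocorrelation depends on both φ1 and θ1, while from lag 2 onward each autocorrelation is φ1 times the previous one.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical ARMA(1, 1): Y(t) = c + φ1*Y(t-1) + θ1*ε(t-1) + ε(t)
c, phi1, theta1 = 0.0, 0.5, 0.4
n = 20000
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = c + phi1 * y[t - 1] + theta1 * eps[t - 1] + eps[t]


def sample_autocorr(x, lag):
    """Sample autocorrelation of x at a single positive lag."""
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)


rho1 = sample_autocorr(y, 1)
rho2 = sample_autocorr(y, 2)

# Theoretical lag-1 autocorrelation of an ARMA(1, 1) process
theoretical_rho1 = (1 + phi1 * theta1) * (phi1 + theta1) / (1 + 2 * phi1 * theta1 + theta1**2)

print("lag 1:", rho1, "theory:", theoretical_rho1)
print("rho2 / rho1:", rho2 / rho1, "theory (φ1):", phi1)
```

The MA term shapes only the first q autocorrelations, after which the AR term alone governs the decay. A pure AR(1) with the same φ1 would have a lag-1 autocorrelation of 0.5 instead.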

In [None]:
import pandas as pd
from matplotlib import pyplot
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
from math import sqrt

# load dataset as a Series (first column is the index)
series = pd.read_csv('shampoo-sales.csv', header=0, index_col=0).squeeze('columns')
# the month labels are assumed to look like '1-01'; prefixing '190' gives '1901-01'
series.index = pd.to_datetime('190' + series.index.astype(str), format='%Y-%m').to_period('M')
# split into train and test sets
X = series.values
size = int(len(X) * 0.66)
train, test = X[:size], X[size:]
history = list(train)
predictions = []
# walk-forward validation: refit on all data seen so far, then forecast one step ahead
for t in range(len(test)):
    # order=(p, d, q): (5, 1, 0) is an AR(5) on first differences;
    # a pure ARMA(p, q) corresponds to order=(p, 0, q)
    model = ARIMA(history, order=(5, 1, 0))
    model_fit = model.fit()
    yhat = model_fit.forecast()[0]
    predictions.append(yhat)
    obs = test[t]
    history.append(obs)
    print('predicted=%f, expected=%f' % (yhat, obs))
# evaluate forecasts
rmse = sqrt(mean_squared_error(test, predictions))
print('Test RMSE: %.3f' % rmse)
# plot forecasts against actual outcomes
pyplot.plot(test)
pyplot.plot(predictions, color='red')
pyplot.show()