# Install & Import tscausalinference

```python
!pip install tscausalinference
```

```python
from tscausalinference import tscausalinference

import pandas as pd
import numpy as np
```

## Generating synthetic time series data
The data is simulated in Python as two time series, a control series and a treatment series, which are then merged into a single dataset. This mimics the scenarios that can occur in real-world experiments.

The synthetic series is the sum of three components. The trend component is a linear function that captures the general behavior of the data over time. The seasonal component captures periodic fluctuations that recur at fixed intervals. The noise component captures the random variation that neither the trend nor the seasonality explains. This additive structure is the one commonly assumed by time series regression models, so we adopt it here.

The formula for generating synthetic time series data is: 

`y(t) = trend(t) + seasonality(t) + noise(t)`

Where `y(t)` is the value of the time series at time t, `trend(t)` is the value of the trend component at time t, `seasonality(t)` is the value of the seasonal component at time t, and `noise(t)` is the value of the noise component at time t.

To generate the trend component, we assume that the trend is a linear function of time. That is, the trend can be represented by the equation:

`trend(t) = alpha + beta * t`

Where alpha is the intercept of the trend, beta is the slope of the trend, and t is the time index.

To generate the seasonal component, we assume that the seasonality is a periodic function of time. That is, the seasonality can be represented by the equation:

`seasonality(t) = delta + sum_{i=1}^{m} gamma_i * cos(2 * pi * i * t / m)`

Where `gamma_i` is the amplitude of the i-th seasonal harmonic, `m` is the length of the seasonal cycle in time steps (the number of seasons in a year when `t` counts seasons), and `delta` is the baseline level of the seasonality.
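
The seasonal formula can be sketched in a few lines of NumPy. This is only an illustration: the function name `seasonal_component` and the example amplitudes are assumptions, not part of `tscausalinference`, and harmonics without a supplied amplitude are treated as having amplitude zero.

```python
import numpy as np

def seasonal_component(t, gammas, m, delta=0.0):
    """Evaluate delta + sum_i gamma_i * cos(2*pi*i*t/m) for i = 1..len(gammas)."""
    t = np.asarray(t, dtype=float)
    i = np.arange(1, len(gammas) + 1)  # harmonic indices 1, 2, ...
    # One row per harmonic, one column per time step
    harmonics = np.cos(2 * np.pi * np.outer(i, t) / m)
    # Weight each harmonic by its amplitude and add the baseline
    return delta + np.asarray(gammas, dtype=float) @ harmonics

t = np.arange(28)  # four weeks of daily time steps
weekly = seasonal_component(t, gammas=[1.0, 0.3], m=7, delta=0.5)  # illustrative values
```

The result repeats every `m` steps. With `gammas=[1.0]`, `m=7`, and `delta=0`, it reduces to the single weekly cosine used in the simulation cell later in this notebook.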

To generate the noise component, we assume that the noise is a random variable that follows a normal distribution with zero mean and a standard deviation sigma.

The main assumptions in the generation of synthetic time series data are that the trend and seasonality components are deterministic functions of time, and that the noise component is a random variable that is independent and identically distributed over time.
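
These assumptions can be checked on a simulated series: once the deterministic part is subtracted, what remains should look like zero-mean noise with standard deviation sigma. The sketch below assumes a single seasonal harmonic and uses a seeded NumPy generator. The helper `simulate_series` and its parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_series(n, alpha, beta, gamma, m, sigma, seed=None):
    """Return (deterministic, noisy) arrays for y(t) = trend + seasonality + noise."""
    rng = np.random.default_rng(seed)                  # seeded generator for reproducibility
    t = np.arange(n)
    trend = alpha + beta * t                           # deterministic linear trend
    seasonality = gamma * np.cos(2 * np.pi * t / m)    # single-harmonic seasonality
    noise = rng.normal(loc=0.0, scale=sigma, size=n)   # i.i.d. Gaussian noise
    deterministic = trend + seasonality
    return deterministic, deterministic + noise

deterministic, y = simulate_series(
    n=5000, alpha=0.0, beta=0.05, gamma=1.0, m=7, sigma=0.2, seed=42
)
residual = y - deterministic
# The mean should be close to 0 and the standard deviation close to sigma
print(residual.mean(), residual.std())
```

Fixing `seed` makes the run reproducible, which helps when comparing scenarios.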

Overall, the generation of synthetic time series data is a critical step in the sensitivity analysis methodology, as it allows us to simulate different scenarios and assess the robustness of our analysis to different levels of noise and effect sizes.
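
One way to set up such scenarios is to loop over a grid of noise levels and effect sizes. The sketch below is an assumption about how that grid could look, not the library's own procedure: `simulate_scenario`, the grid values, and the lift-to-noise ratio are illustrative. In a full analysis, each simulated series would be passed to the model instead of being summarized by a ratio.

```python
import itertools
import numpy as np

def simulate_scenario(noise_sd, effect, n=200, eff_n=15, seed=0):
    """Return (control, y) where the last eff_n points of y are lifted by `effect`."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    control = 0.05 * t + np.cos(2 * np.pi * t / 7) + rng.normal(scale=noise_sd, size=n)
    y = control.copy()
    y[-eff_n:] *= 1 + effect  # multiplicative lift in the treatment window
    return control, y

noise_levels = [0.1, 0.2, 0.5]     # illustrative noise standard deviations
effect_sizes = [0.05, 0.16, 0.30]  # illustrative relative lifts

for noise_sd, effect in itertools.product(noise_levels, effect_sizes):
    control, y = simulate_scenario(noise_sd, effect)
    mean_lift = (y[-15:] - control[-15:]).mean()  # average absolute lift in the window
    print(f"noise={noise_sd:.2f} effect={effect:.2f} lift/noise={mean_lift / noise_sd:.2f}")
```

Scenarios with a low lift-to-noise ratio are the ones where an effect is hardest to distinguish from random variation.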

```python
# Define the length of the time series and the parameters for the trend, seasonality, and noise
n = 200                  # number of daily observations
eff_n = 15               # length of the treatment window, in days
trend = 0.05             # slope of the linear trend
seasonality = 7          # seasonal period, in days
noise_power = 0.2        # standard deviation of the Gaussian noise
simulated_effect = 0.16  # relative lift applied during the treatment window

# Create the time index for the full (control) series
control_index = pd.date_range('2022-01-01', periods=n, freq='D')
# Create the time index for the treatment window: the last eff_n days of the series
treatment_index = pd.date_range(control_index.max() - pd.Timedelta(days=eff_n - 1), periods=eff_n, freq='D')

# Create the control time series: trend + seasonality + noise
trend_component = np.arange(n) * trend
seasonality_component = np.cos(np.arange(n) * 2 * np.pi / seasonality)
data_control = trend_component + seasonality_component + np.random.normal(scale=noise_power, size=n)

# Create the treatment time series: a lift proportional to the control values in the treatment window
data_treatment = data_control[-len(treatment_index):] * simulated_effect

# Merge both series on the date; days outside the treatment window get a treatment of 0
df = pd.merge(
    pd.DataFrame({'control': data_control, 'ds': control_index}),
    pd.DataFrame({'treatment': data_treatment, 'ds': treatment_index}),
    on='ds',
    how='left'
).fillna(0)

# Observed series: control plus the simulated effect
df['y'] = df.control + df.treatment

# Rename the control and treatment columns
df = df.rename(columns={'control': 'yhat', 'treatment': 'treatment_test'})
df['ds'] = pd.to_datetime(df.ds)

df.info()
```

```python
df.y.plot()
```

```python
# Start and end dates of the intervention period
intervention = ['2022-07-04', '2022-07-19']
```
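
The intervention dates can also be derived from the simulation parameters instead of being typed by hand. The sketch below rebuilds the date indexes with the same `n`, `eff_n`, and start date as the simulation; the variable name `derived_intervention` is illustrative.

```python
import pandas as pd

n, eff_n = 200, 15
control_index = pd.date_range('2022-01-01', periods=n, freq='D')
treatment_index = control_index[-eff_n:]  # last eff_n days of the series

# Format the first and last treated days as 'YYYY-MM-DD' strings
derived_intervention = [
    treatment_index.min().strftime('%Y-%m-%d'),
    treatment_index.max().strftime('%Y-%m-%d'),
]
print(derived_intervention)  # ['2022-07-05', '2022-07-19']
```

With these parameters the simulated effect starts on 2022-07-05, one day after the hard-coded start of 2022-07-04. The source does not say whether the extra day is intentional.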

```python
data = tscausalinference(data = df, intervention = intervention)
```

```python
data.summary_intervention()
```

```python
data.summary()
```

```python
data.plot_intervention()
```

```python
data.plot_simulations()
```