# _ASSOCIATOR_

The _ASSOCIATOR_ provides a high-level overview of numerous time series by identifying clusters of similar time series patterns. In addition, it analyzes the most recent trend behaviour of each time series, resulting in a clear trend categorization.

In [None]:
from futureexpert import ExpertClient
import dotenv

dotenv.load_dotenv()
client = ExpertClient()

First, upload your data to the _future_ platform using the `check_in_time_series` method. For further information on how to upload your data, see the corresponding [check-in notebook](checkin_configuration_options.ipynb).

In [None]:
from futureexpert import DataDefinition, TsCreationConfig
import futureexpert.checkin as checkin
data_definition = DataDefinition(
    value_columns=[
        checkin.ValueColumn(name="Value")
    ],
    group_columns=[checkin.GroupColumn(name="Index")],
    date_column=checkin.DateColumn(name="Date", format="%Y-%m-%d")
)

ts_creation_config = TsCreationConfig(
    time_granularity="monthly",
    grouping_level=["Index"],
    value_columns_to_save=["Value"],
)

version_id = client.check_in_time_series(
    raw_data_source="../example_data/consumer_index.csv",
    data_definition=data_definition,
    config_ts_creation=ts_creation_config,
)

Once the data is uploaded, you can start the _ASSOCIATOR_. For a minimal configuration, provide the `version_id` and a `report_note`.

In [None]:
from futureexpert import AssociatorConfig, DataSelection, ClusteringConfiguration
config = AssociatorConfig(data_selection=DataSelection(version=version_id),
                          clustering=ClusteringConfiguration(),
                          report_note="Consumer Indices Exploration")
associator_id = client.start_associator(config=config)

Once the _ASSOCIATOR_ has finished, you can retrieve the results with the `get_associator_results` method.

In [None]:
from futureexpert.associator import export_associator_results_to_pandas
results = client.get_associator_results(associator_id)
export_associator_results_to_pandas(results)

## Using _ASSOCIATOR_ Results in _MATCHER_

The clustering results can be used in a post-selection step to refine the initial matcher ranking. This process condenses the output by removing redundant covariates.

Specifically, if multiple covariates from the same cluster are selected with the same lag, only the highest-ranked covariate is retained. The others are discarded, based on the assumption that they provide similar information for forecasting. Additionally, this process helps to prevent multicollinearity among the covariates in the forecast models.
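
The rule described above can be illustrated with a small, self-contained sketch. It is not the _MATCHER_ implementation: the `RankedCovariate` class, the `drop_redundant_covariates` function, and the example covariate names are hypothetical and exist only to show how keeping the best-ranked covariate per cluster and lag might work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedCovariate:
    """Hypothetical stand-in for one entry of a matcher ranking."""
    name: str
    rank: int     # 1 is the best rank
    lag: int      # lag with which the covariate was selected
    cluster: int  # cluster label assigned by the clustering step


def drop_redundant_covariates(ranking):
    """Keep only the best-ranked covariate for each (cluster, lag) pair."""
    seen = set()
    kept = []
    # Walk through the ranking from best to worst rank.
    for cov in sorted(ranking, key=lambda c: c.rank):
        key = (cov.cluster, cov.lag)
        if key in seen:
            # A better-ranked covariate from the same cluster with the
            # same lag was already kept, so this one is treated as redundant.
            continue
        seen.add(key)
        kept.append(cov)
    return kept


ranking = [
    RankedCovariate("food_index", rank=1, lag=2, cluster=0),
    RankedCovariate("beverage_index", rank=2, lag=2, cluster=0),  # same cluster and lag: dropped
    RankedCovariate("energy_index", rank=3, lag=2, cluster=1),    # different cluster: kept
    RankedCovariate("tobacco_index", rank=4, lag=3, cluster=0),   # different lag: kept
]

selected = drop_redundant_covariates(ranking)
print([cov.name for cov in selected])
# -> ['food_index', 'energy_index', 'tobacco_index']
```

Note that a covariate is only discarded when both the cluster and the lag match a better-ranked one; the same cluster with a different lag is kept.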

In [None]:
sales_data_definition = DataDefinition(
    value_columns=[checkin.ValueColumn(name='Sales')],
    date_column=checkin.DateColumn(name='Date', format='%Y-%m-%d')
)

sales_ts_creation_config = TsCreationConfig(
    time_granularity='monthly',
    value_columns_to_save=['Sales']
)

actuals_version_id = client.check_in_time_series(
    raw_data_source="../example_data/consumer_sales.csv",
    data_definition=sales_data_definition,
    config_ts_creation=sales_ts_creation_config
)

In [None]:
from futureexpert import MatcherConfig

matcher_config = MatcherConfig(
    actuals_version=actuals_version_id,
    covs_versions=[version_id],
    associator_report_id=associator_id.report_id,
    use_clustering_results=True,
    title='Consumer Sales Matcher with Associator Insights'
)

matcher_id = client.start_matcher(config=matcher_config)

In [None]:
import time

# Watch the current status of the matcher report
while not (current_status := client.get_report_status(id=matcher_id)).is_finished:
    current_status.print()
    print('Waiting another 30 seconds for the matcher to finish...')
    time.sleep(30)  # Wait between status requests
current_status.print()


In [None]:
matcher_results = client.get_matcher_results(matcher_id)