## AI-assisted Data Transformation for CHECK-IN
This notebook helps you streamline your data transformation to the CHECK-IN data format using an AI assistant. We provide a prompt template that guides a chatbot (like ChatGPT, GitHub Copilot, or Gemini) to write the necessary Python code for you, saving you from having to learn the specific format requirements.

### Workflow

- **Use the Prompt Template**: Copy the content of [notebooks/prompt_templates/PromptTemplate.docx](notebooks/prompt_templates/PromptTemplate.docx) and paste it into the AI chatbot of your choice.

- **Provide Metadata**: The chatbot will start an interactive dialogue. Follow its instructions to provide the required metadata (file and column names). The prompt is designed to work without needing your actual data content.

- **Review the Generated Code:** Treat the Python script from the chatbot as a draft and check it for accuracy, security and compliance with your internal policies.

- **Execute the Code**: Paste the validated script into the next cell and run it.
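
For the metadata step above, the column names can be read from the header row alone, so no data values need to be shared with the chatbot. The following is a minimal sketch using only the Python standard library; the helper name `read_column_names` and the in-memory sample file are illustrative assumptions and not part of futureEXPERT.

```python
import csv
import io


def read_column_names(csv_file, delimiter=","):
    """Return the header row of a CSV file without reading any data rows."""
    reader = csv.reader(csv_file, delimiter=delimiter)
    return next(reader, [])


# Hypothetical stand-in for a real file; in practice pass open(path, newline="").
sample = io.StringIO("ArtikelNr,Menge,LieferdatumPositionJahr\n4711,3,2024\n")
columns = read_column_names(sample)
print(columns)
```

Only the resulting list of names, together with the file name, would then be pasted into the chat.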

### Important Considerations

- **Third-Party Service**: Please be aware that by using an external chatbot, you are interacting with a third-party service outside the futureEXPERT environment.

- **Confidentiality**: If your metadata (file or column names) is confidential, use a company-approved, private AI system.

- **Your Responsibility**: You are solely responsible for any prompts you share and for the code you execute. By using this template, you agree that our organisation accepts no liability for the output generated by third-party AI services.

### Paste your code snippet here
The cell below contains an example generated by ChatGPT (GPT-4o). Replace it with your own code snippet.



In [1]:
# Import required libraries
import pandas as pd

# Step 1: Load input CSV files with specified delimiter and decimal
ordr = pd.read_csv("../example_data/ORDR.csv", delimiter=',', decimal='.')
itm = pd.read_csv("../example_data/OITM.csv", delimiter=',', decimal='.')

# Step 2: Left-join ORDR with OITM on 'ArtikelNr' (keeps every row of ORDR)
df = pd.merge(ordr, itm, how='left', on='ArtikelNr')

# Step 3: Construct a proper datetime column from LieferdatumPositionTag, LieferdatumPositionMonat, LieferdatumPositionJahr
df['Date'] = pd.to_datetime(df[['LieferdatumPositionJahr', 'LieferdatumPositionMonat', 'LieferdatumPositionTag']].rename(
    columns={
        'LieferdatumPositionJahr': 'year',
        'LieferdatumPositionMonat': 'month',
        'LieferdatumPositionTag': 'day'
    }
), errors='coerce')

# Step 4: Select columns to keep for final output
final_cols = ['Date', 'Menge', 'ArtikelGruppe', 'ArtikelBez']
output_df = df[final_cols]

# Optional: Drop rows with missing Date or Menge
output_df = output_df.dropna(subset=['Date', 'Menge'])

# Step 5: Export to CSV with defaults: comma delimiter, period decimal
output_df.to_csv("prepared_timeseries.csv", index=False)

# Explanation:
# - This script reads your two CSV files with proper parsing.
# - Joins them on 'ArtikelNr'.
# - Creates a single 'Date' column by combining year, month, day.
# - Keeps your target column 'Menge' and extra columns 'ArtikelGruppe' and 'ArtikelBez'.
# - Outputs the prepared file as 'prepared_timeseries.csv' ready for timeseries forecasting.
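
Before handing the prepared file to CHECK-IN, a quick sanity check can catch problems introduced by a generated script. The sketch below uses only the standard library and assumes that the `Date` column is written as `YYYY-MM-DD` (the format passed to `DateColumn` in the CHECK-IN cell) and that `Menge` is numeric. The helper name `find_problems` and the sample rows, including the group name, are made up for illustration.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = ["Date", "Menge", "ArtikelGruppe", "ArtikelBez"]


def find_problems(csv_file, date_format="%Y-%m-%d"):
    """Return a list of human-readable problems found in the prepared CSV."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        date_text = row["Date"] or ""
        value_text = row["Menge"] or ""
        try:
            datetime.strptime(date_text, date_format)
        except ValueError:
            problems.append(f"line {line_no}: unparseable Date {date_text!r}")
        try:
            float(value_text)
        except ValueError:
            problems.append(f"line {line_no}: non-numeric Menge {value_text!r}")
    return problems


# Made-up sample; in practice pass open("prepared_timeseries.csv", newline="").
sample = io.StringIO(
    "Date,Menge,ArtikelGruppe,ArtikelBez\n"
    "2024-01-15,3.0,Pumpen,Hydraulikpumpe HYP-201\n"
    "15.01.2024,2.0,Pumpen,Hydraulikpumpe HYP-201\n"
)
problems = find_problems(sample)
print(problems)
```

An empty list means the file passed these two checks. It does not replace the manual review of the generated code described in the workflow.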


After preparing the dataset, we connect to the client using our future username and password.

In [2]:
from futureexpert import (DataDefinition,
                          ExpertClient,
                          FileSpecification,
                          TsCreationConfig)
import futureexpert.checkin as checkin

client = ExpertClient()

INFO:futureexpert.expert_client:Successfully logged in for group group-expert.


### *CHECK-IN* Configuration

Now, let’s use *CHECK-IN* to transform our data and upload the resulting time series to our database. We define the location of the CSV file and its specifications (`delimiter`, `decimal`). For our use case, we need to consider the following settings:
- The data contains some missing values, which are equivalent to 'no demand' observations. Therefore, we set the `missing_value_handler` to `setToZero`.
- Material `Hydraulikpumpe HYP-201` is an article that is no longer sold but is still part of the dataset. We do not need any forecasts for this material, so we exclude it with a corresponding `FilterSettings` entry.
- The columns `ArtikelGruppe` and `ArtikelBez` contain structural information about the data, so we define them as `GroupColumn`s. For this use case, however, we only want to create forecasts at the material level, so we set `ArtikelBez` as the `grouping_level` in the `TsCreationConfig`.
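
To make the effect of these three settings concrete, here is a conceptual sketch in plain Python. It is not how CHECK-IN is implemented. It only illustrates the idea of excluding an article, grouping by `ArtikelBez`, and treating months without data as zero demand. Summing values within a month is an assumption made for this illustration, and the sample rows and the name `Article A` are made up.

```python
from collections import defaultdict
from datetime import date

# Made-up sample rows: (delivery date, Menge, ArtikelBez).
rows = [
    (date(2024, 1, 15), 3.0, "Article A"),
    (date(2024, 1, 20), 2.0, "Article A"),
    (date(2024, 3, 5), 4.0, "Article A"),
    (date(2024, 2, 1), 9.0, "Hydraulikpumpe HYP-201"),
]


def month_range(start, end):
    """Yield the first day of each month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def build_monthly_series(rows, start, end, excluded):
    """Sum values per article and month; months without data become 0."""
    totals = defaultdict(float)
    for day, value, article in rows:
        if article in excluded:  # mirrors the exclusion filter
            continue
        totals[(article, date(day.year, day.month, 1))] += value

    articles = sorted({article for article, _ in totals})
    return {
        article: [totals.get((article, m), 0.0) for m in month_range(start, end)]
        for article in articles
    }


series = build_monthly_series(
    rows, date(2024, 1, 1), date(2024, 3, 1), excluded={"Hydraulikpumpe HYP-201"}
)
print(series)  # {'Article A': [5.0, 0.0, 4.0]}
```

February has no rows for `Article A`, so it appears as `0.0` rather than as a gap, which is the behaviour the `setToZero` handler is meant to provide.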

In [None]:
actuals_version_id = client.check_in_time_series(raw_data_source='prepared_timeseries.csv',
                                                  file_specification=FileSpecification(delimiter=',', decimal='.'),
                                                  data_definition=DataDefinition(date_columns=checkin.DateColumn(name='Date', format='%Y-%m-%d', name_new='Date'),
                                                                                 value_columns=[checkin.ValueColumn(name='Menge', name_new='Menge')],
                                                                                 group_columns=[checkin.GroupColumn(name='ArtikelGruppe', name_new='ArtikelGruppe'),
                                                                                                checkin.GroupColumn(name='ArtikelBez', name_new='ArtikelBez')]),
                                                  config_ts_creation=TsCreationConfig(time_granularity='monthly',
                                                                                      start_date='2007-10-01',
                                                                                      end_date='2024-06-01',
                                                                                      value_columns_to_save=['Menge'],
                                                                                      grouping_level=['ArtikelBez'],
                                                                                      missing_value_handler='setToZero',
                                                                                      filter=[checkin.FilterSettings(type='exclusion', variable='ArtikelBez', items=['Hydraulikpumpe HYP-201'])]))

INFO:futureexpert.expert_client:Transforming input data...
INFO:futureexpert.expert_client:Creating time series using CHECK-IN...
INFO:futureexpert.expert_client:Finished time series creation.


## Next Steps
You've successfully created and checked in your new data file. It contains a date column, one or more value columns, and any optional grouping or additional columns you included.

You can now use the ``actuals_version_id`` to continue with your forecast. For further guidance, please use our templates, such as the [Getting-Started-Notebook](getting_started.ipynb).