# Multi-Date Expected Proceeds Prediction Workflow (Local Version)

This notebook runs the expected proceeds prediction workflow for multiple inference dates and combines the results into a single DataFrame, which is then pushed to Snowflake.

In [1]:
!pip3 install numpy==1.23.5
!pip3 install pandas==1.5.3
!pip3 install pyarrow==10.0.1
!pip3 install "snowflake-connector-python[pandas]"
!pip3 install snowflake-snowpark-python
!pip3 install tqdm

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.

In [2]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import os
from tqdm import tqdm

# Import custom modules
from country_utils import add_signup_country_group
from data_utils import split_data_by_date, split_data_by_user_type
from trial_predictions import TrialPredictionModel
from direct_purchase_predictions import DirectPurchasePredictionModel
from lag_purchase_predictions import LagPurchasePredictionModel

## Connect to Snowflake

We'll use the configuration from our `config.py` file to connect to Snowflake.
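
The contents of `config.py` are not shown in this notebook. As a minimal sketch, assuming the settings come from environment variables, it could be built along these lines. The helper name `load_snowflake_config` and the `SNOWFLAKE_*` environment variable names are assumptions for illustration; only the parameter keys and the browser-based login are taken from the cell below and its output.

```python
import os

# Connection parameters that must be supplied explicitly (keys match the cell below).
REQUIRED = ("account", "user", "role", "warehouse", "database", "schema")


def load_snowflake_config(env=None):
    """Build Snowpark connection parameters from a mapping of environment variables.

    Hypothetical helper: the real config.py may be organised differently.
    """
    env = os.environ if env is None else env
    missing = [
        f"SNOWFLAKE_{key.upper()}"
        for key in REQUIRED
        if f"SNOWFLAKE_{key.upper()}" not in env
    ]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")

    params = {key: env[f"SNOWFLAKE_{key.upper()}"] for key in REQUIRED}
    # Assumption: default to browser-based SSO, which is what the login output suggests.
    params["authenticator"] = env.get("SNOWFLAKE_AUTHENTICATOR", "externalbrowser")
    return params


# Placeholder values only; real credentials should never be committed to the notebook.
demo_env = {f"SNOWFLAKE_{key.upper()}": f"<{key}>" for key in REQUIRED}
demo_params = load_snowflake_config(demo_env)
print(sorted(demo_params))
```

Failing early on a missing variable gives a clearer error than a rejected login several seconds later.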

In [3]:
# Import Snowflake connection config
import config

# Snowflake connection
from snowflake.snowpark import Session

def get_snowflake_session():
    """Create and return a Snowflake session"""
    connection_parameters = {
        "account": config.SNOWFLAKE_ACCOUNT,
        "user": config.SNOWFLAKE_USER,
        "role": config.SNOWFLAKE_ROLE,
        "warehouse": config.SNOWFLAKE_WAREHOUSE,
        "database": config.SNOWFLAKE_DATABASE,
        "schema": config.SNOWFLAKE_SCHEMA,
        "authenticator": config.SNOWFLAKE_AUTHENTICATOR
    }
    
    session = Session.builder.configs(connection_parameters).create()
    print(f"Connected to Snowflake as {config.SNOWFLAKE_USER}")
    return session

# Create a Snowflake session
session = get_snowflake_session()

2025-03-18 19:07:38,860 - snowflake.connector.connection - INFO - Snowflake Connector for Python Version: 3.14.0, Python Version: 3.9.6, Platform: macOS-15.2-arm64-arm-64bit
2025-03-18 19:07:38,860 - snowflake.connector.connection - INFO - Connecting to GLOBAL Snowflake domain
2025-03-18 19:07:38,861 - snowflake.connector.connection - INFO - This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.


Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Going to open: https://blinkist-useast_1_virginia.snowflakecomputing.com/console/login?login_name=meri-kris.jaama%40go1.com&browser_mode_redirect_port=64537&proof_key=%2BG6D3kg%2Brzw2bLvo1JjeVoL%2F8Q6hCQb9XRaRAwocKXI%3D to authenticate...


2025-03-18 19:07:45,481 - snowflake.snowpark.session - INFO - Snowpark Session information: 
"version" : 1.29.1,
"python.version" : 3.9.6,
"python.connector.version" : 3.14.0,
"python.connector.session.id" : 9893985661975982,
"os.name" : Darwin



Connected to Snowflake as meri-kris.jaama@go1.com


## 1. Define Inference Dates

Specify the dates for which you want to run the workflow.

In [4]:
# Define inference dates
# Option 1: Manually specify dates
inference_dates = [
    '2025-01-01',
    '2024-12-01',
    '2024-12-02'
]

# Option 2: Generate a range of dates (this overrides the Option 1 list above)
start_date = '2024-12-01'
end_date = '2025-01-31'
start = datetime.strptime(start_date, '%Y-%m-%d')
end = datetime.strptime(end_date, '%Y-%m-%d')
inference_dates = [(start + timedelta(days=i)).strftime('%Y-%m-%d') for i in range((end - start).days + 1)]

print(f"Running workflow for {len(inference_dates)} dates: {inference_dates}")

Running workflow for 62 dates: ['2024-12-01', '2024-12-02', '2024-12-03', '2024-12-04', '2024-12-05', '2024-12-06', '2024-12-07', '2024-12-08', '2024-12-09', '2024-12-10', '2024-12-11', '2024-12-12', '2024-12-13', '2024-12-14', '2024-12-15', '2024-12-16', '2024-12-17', '2024-12-18', '2024-12-19', '2024-12-20', '2024-12-21', '2024-12-22', '2024-12-23', '2024-12-24', '2024-12-25', '2024-12-26', '2024-12-27', '2024-12-28', '2024-12-29', '2024-12-30', '2024-12-31', '2025-01-01', '2025-01-02', '2025-01-03', '2025-01-04', '2025-01-05', '2025-01-06', '2025-01-07', '2025-01-08', '2025-01-09', '2025-01-10', '2025-01-11', '2025-01-12', '2025-01-13', '2025-01-14', '2025-01-15', '2025-01-16', '2025-01-17', '2025-01-18', '2025-01-19', '2025-01-20', '2025-01-21', '2025-01-22', '2025-01-23', '2025-01-24', '2025-01-25', '2025-01-26', '2025-01-27', '2025-01-28', '2025-01-29', '2025-01-30', '2025-01-31']
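
The same inclusive range can also be generated with `pandas.date_range`, which avoids the manual `timedelta` arithmetic. This is an alternative sketch, not the code the notebook ran; the helper name `build_inference_dates` is made up for illustration.

```python
import pandas as pd


def build_inference_dates(start_date, end_date):
    """Return every calendar day from start_date to end_date (inclusive) as 'YYYY-MM-DD' strings."""
    days = pd.date_range(start=start_date, end=end_date, freq="D")
    return days.strftime("%Y-%m-%d").tolist()


inference_dates = build_inference_dates("2024-12-01", "2025-01-31")
print(len(inference_dates), inference_dates[0], inference_dates[-1])
# prints: 62 2024-12-01 2025-01-31
```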


## 2. Load Data

Load the necessary data from Snowflake.

In [5]:
# Fetch data from Snowflake
print("Loading data from Snowflake...")
input_query = """
    SELECT 
        *
    FROM blinkist_dev.dbt_mjaama.exp_proceeds_input
    """
    
input_df = session.sql(input_query).to_pandas()

product_query = """
    select sku as product_name, price 
    from BLINKIST_PRODUCTION.reference_tables.product_dim
    where is_purchasable;
    """
    
product_df = session.sql(product_query).to_pandas()

print(f"Loaded {len(input_df)} input records and {len(product_df)} product records")

# Convert column names to lowercase for compatibility with utility functions
input_df.columns = input_df.columns.str.lower()
print("Column names converted to lowercase")

Loading data from Snowflake...
Loaded 8060581 input records and 833 product records
Column names converted to lowercase
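
With roughly eight million rows loaded, it can be worth confirming that the columns used later are present before starting a multi-hour run. The sketch below uses a toy frame and a hypothetical helper, `missing_columns`. The column names come from this notebook, but the `required` list is an illustrative assumption, not the full schema. Note that only `input_df` is lowercased above; `product_df` likely keeps the uppercase names Snowflake returns for unquoted identifiers.

```python
import pandas as pd


def missing_columns(df, required):
    """Return the required column names that are absent from df, in the order given."""
    return [col for col in required if col not in df.columns]


# Toy frame standing in for input_df as returned by Snowflake (uppercase names).
toy_input = pd.DataFrame(columns=["REPORT_DATE", "SIGNUP_COUNTRY", "CLICKS"])
toy_input.columns = toy_input.columns.str.lower()

# Illustrative subset of columns referenced elsewhere in this notebook.
required = ["report_date", "signup_country", "impressions"]
print(missing_columns(toy_input, required))
# prints: ['impressions']
```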


## 3. Add Country Groups

Add country groups to the input data to help with analysis.

In [6]:
# Add country groups
print("Adding country groups...")
input_df = add_signup_country_group(input_df)
print("Country groups added")

Adding country groups...


  us_df = temp_df.loc[(temp_df.report_date >= six_months_ago) & (temp_df.signup_country == "US") & (
  row_df = temp_df.loc[(temp_df.report_date >= six_months_ago) & (


Country groups added


## 4. Define Workflow Function

Define a function to run the workflow for a single inference date.

In [7]:
def run_workflow_for_date(input_df, product_df, inference_date, training_window_days=180):
    """Run the workflow for a single inference date"""
    print(f"\nProcessing inference date: {inference_date}")
    
    # 1. Split data by date
    inference_df, training_d8_df, training_d100_df = split_data_by_date(
        input_df,
        inference_date=inference_date,
        training_window_days=training_window_days,
        date_column='report_date'
    )
    
    print(f"Inference data: {len(inference_df)} records")
    print(f"Training d8 data: {len(training_d8_df)} records")
    print(f"Training d100 data: {len(training_d100_df)} records")
    
    # Skip this date if there is no inference data at all
    if len(inference_df) == 0:
        print(f"WARNING: No inference data found for {inference_date}. Skipping.")
        return None
    
    # 2. Split data by user type
    trial_inference, day0_payers_inference, other_inference = split_data_by_user_type(inference_df)
    trial_training_d8, day0_payers_training_d8, other_training_d8 = split_data_by_user_type(training_d8_df)
    trial_training_d100, day0_payers_training_d100, other_training_d100 = split_data_by_user_type(training_d100_df)
    
    print(f"Trial users: {len(trial_inference)} inference, {len(trial_training_d8)} training d8, {len(trial_training_d100)} training d100")
    print(f"Day 0 payers: {len(day0_payers_inference)} inference, {len(day0_payers_training_d8)} training d8, {len(day0_payers_training_d100)} training d100")
    print(f"Other users: {len(other_inference)} inference, {len(other_training_d8)} training d8, {len(other_training_d100)} training d100")
    
    # 3. Train models
    print("Training models...")
    
    print("Train trial model")
    # Trial model
    trial_model = TrialPredictionModel(product_dim_df=product_df)
    trial_model.fit(trial_training_d8, trial_training_d100)
    
    print("Train direct purchase model")
    # Direct purchase model
    direct_model = DirectPurchasePredictionModel()
    direct_model.fit(day0_payers_training_d100)
    
    print("Train lag purchase model")
    # Lag purchase model
    lag_model = LagPurchasePredictionModel(product_dim_df=product_df)
    lag_model.fit(other_training_d8, other_training_d100)
    
    # 4. Make predictions
    print("Making predictions...")
    predictions = []
    
    if not trial_inference.empty:
        trial_predictions = trial_model.predict(trial_inference)
        trial_predictions['user_type'] = 'trial'
        predictions.append(trial_predictions)
        print(f"Generated {len(trial_predictions)} predictions for trial users")
    
    if not day0_payers_inference.empty:
        direct_predictions = direct_model.predict(day0_payers_inference)
        direct_predictions['user_type'] = 'day0_payer'
        predictions.append(direct_predictions)
        print(f"Generated {len(direct_predictions)} predictions for day 0 payers")
    
    if not other_inference.empty:
        lag_predictions = lag_model.predict(other_inference)
        lag_predictions['user_type'] = 'other'
        predictions.append(lag_predictions)
        print(f"Generated {len(lag_predictions)} predictions for other users")
    
    # Check if we have any predictions
    if not predictions:
        print("No predictions generated - inference data may be empty")
        return None
    
    # Combine predictions
    all_predictions = pd.concat(predictions, ignore_index=True)
    
    # Add inference date as a column
    all_predictions['inference_date'] = inference_date
    
    print(f"Total predictions: {len(all_predictions)}")
    return all_predictions
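
The body of `split_data_by_date` lives in `data_utils` and is not shown here. The sketch below is one plausible reading of what it returns, under a loudly stated assumption: the names `d8` and `d100` are taken to mean that a cohort becomes usable for training once it is at least 8 or 100 days old, and each training set covers `training_window_days` days ending at that point. The function `sketch_split_by_date` and its `horizons` parameter are hypothetical, and the real implementation may differ.

```python
import pandas as pd


def sketch_split_by_date(df, inference_date, training_window_days=180,
                         date_column="report_date", horizons=(8, 100)):
    """Hypothetical stand-in for data_utils.split_data_by_date (assumed logic only)."""
    dates = pd.to_datetime(df[date_column])
    cutoff = pd.Timestamp(inference_date)

    # Rows to score: the cohort reported on the inference date itself.
    inference_df = df[dates == cutoff]

    training = []
    for horizon in horizons:
        # Assumption: a cohort can be used for the d<horizon> model only once
        # it is at least `horizon` days old on the inference date.
        window_end = cutoff - pd.Timedelta(days=horizon)
        window_start = window_end - pd.Timedelta(days=training_window_days)
        training.append(df[(dates > window_start) & (dates <= window_end)])

    return (inference_df, *training)


# Toy data: one row per day for most of 2024.
toy = pd.DataFrame({"report_date": pd.date_range("2024-01-01", "2024-12-01", freq="D")})
inf_df, d8_df, d100_df = sketch_split_by_date(toy, "2024-12-01")
print(len(inf_df), len(d8_df), len(d100_df))
# prints: 1 180 180
```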

## 5. Run Workflow for All Dates

Run the workflow for each inference date and collect the results.
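
Each date takes around five minutes in the output further down, so a full run spans several hours and a failure is easy to miss in the log. As a sketch of an alternative to the loop in the next cell, the helper below records which dates failed instead of only printing the error. The name `run_all_dates` and its `run_one` callback are illustrative assumptions.

```python
import pandas as pd


def run_all_dates(dates, run_one):
    """Call run_one(date) for each date.

    Returns (combined DataFrame, list of (date, error message) for failed dates).
    """
    results, failures = [], []
    for date in dates:
        try:
            result = run_one(date)
        except Exception as exc:
            # Keep going, but remember what failed so it can be re-run later.
            failures.append((date, str(exc)))
            continue
        if result is not None:
            results.append(result)

    combined = pd.concat(results, ignore_index=True) if results else pd.DataFrame()
    return combined, failures


# Stub workflow: one date fails, one has no data, the rest return a single row.
def fake_workflow(date):
    if date == "2024-12-02":
        raise ValueError("model failed to fit")
    if date == "2024-12-03":
        return None
    return pd.DataFrame({"inference_date": [date], "prediction": [1.0]})


combined, failed = run_all_dates(["2024-12-01", "2024-12-02", "2024-12-03", "2024-12-04"], fake_workflow)
print(len(combined), failed)
# prints: 2 [('2024-12-02', 'model failed to fit')]
```

In this notebook the callback would be something like `lambda d: run_workflow_for_date(input_df, product_df, d)`.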

In [8]:
# Run workflow for all dates
all_results = []

# Use standard tqdm instead of the notebook version
for date in tqdm(inference_dates, desc="Processing dates"):
    try:
        result = run_workflow_for_date(input_df, product_df, date)
        if result is not None:
            all_results.append(result)
    except Exception as e:
        print(f"Error processing date {date}: {str(e)}")

# Combine all results
if all_results:
    combined_results = pd.concat(all_results, ignore_index=True)
    print(f"\nCombined results: {len(combined_results)} records across {len(all_results)} dates")
else:
    # Keep combined_results defined so later cells do not fail with a NameError
    combined_results = pd.DataFrame()
    print("No results to combine")

Processing dates:   0%|          | 0/62 [00:00<?, ?it/s]


Processing inference date: 2024-12-01
Inference data: 10398 records
Training d8 data: 1524968 records
Training d100 data: 1228202 records
Trial users: 290 inference, 85090 training d8, 110275 training d100
Day 0 payers: 1766 inference, 50951 training d8, 26998 training d100
Other users: 8326 inference, 1388151 training d8, 1090691 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24948 records with non-zero proceeds
Global average proceeds: 56.1808405867729
Calculated average proceeds for 68 products
Calculated average proceeds for 105 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 60142, 1s: 24948
Removed 0 outliers from 85090 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:   2%|▏         | 1/62 [05:01<5:06:38, 301.62s/it]


Processing inference date: 2024-12-02
Inference data: 9856 records
Training d8 data: 1526960 records
Training d100 data: 1229944 records
Trial users: 252 inference, 84714 training d8, 110011 training d100
Day 0 payers: 1780 inference, 51352 training d8, 27045 training d100
Other users: 7815 inference, 1390119 training d8, 1092645 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24796 records with non-zero proceeds
Global average proceeds: 56.25669739517142
Calculated average proceeds for 68 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 59918, 1s: 24796
Removed 0 outliers from 84714 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:   3%|▎         | 2/62 [10:10<5:05:45, 305.76s/it]


Processing inference date: 2024-12-03
Inference data: 9624 records
Training d8 data: 1530130 records
Training d100 data: 1232896 records
Trial users: 260 inference, 84423 training d8, 109907 training d100
Day 0 payers: 1541 inference, 51978 training d8, 27150 training d100
Other users: 7820 inference, 1392952 training d8, 1095593 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24683 records with non-zero proceeds
Global average proceeds: 56.32570767194321
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 59740, 1s: 24683
Removed 0 outliers from 84423 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:   5%|▍         | 3/62 [15:23<5:03:58, 309.13s/it]


Processing inference date: 2024-12-04
Inference data: 8856 records
Training d8 data: 1532800 records
Training d100 data: 1236232 records
Trial users: 228 inference, 84112 training d8, 109989 training d100
Day 0 payers: 1282 inference, 52420 training d8, 27303 training d100
Other users: 7343 inference, 1395491 training d8, 1098686 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24569 records with non-zero proceeds
Global average proceeds: 56.4091747887588
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 59543, 1s: 24569
Removed 0 outliers from 84112 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:   6%|▋         | 4/62 [20:34<4:59:26, 309.77s/it]


Processing inference date: 2024-12-05
Inference data: 8032 records
Training d8 data: 1534995 records
Training d100 data: 1238994 records
Trial users: 198 inference, 83772 training d8, 109931 training d100
Day 0 payers: 1013 inference, 53081 training d8, 27429 training d100
Other users: 6817 inference, 1397364 training d8, 1101378 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24421 records with non-zero proceeds
Global average proceeds: 56.49352965801398
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 59351, 1s: 24421
Removed 0 outliers from 83772 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:   8%|▊         | 5/62 [25:37<4:52:12, 307.59s/it]


Processing inference date: 2024-12-06
Inference data: 7334 records
Training d8 data: 1537489 records
Training d100 data: 1242201 records
Trial users: 193 inference, 83451 training d8, 109832 training d100
Day 0 payers: 841 inference, 53741 training d8, 27529 training d100
Other users: 6299 inference, 1399516 training d8, 1104579 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24297 records with non-zero proceeds
Global average proceeds: 56.54521173493742
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 59154, 1s: 24297
Removed 0 outliers from 83451 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  10%|▉         | 6/62 [31:00<4:51:58, 312.83s/it]


Processing inference date: 2024-12-07
Inference data: 7176 records
Training d8 data: 1538720 records
Training d100 data: 1245723 records
Trial users: 220 inference, 83129 training d8, 109806 training d100
Day 0 payers: 824 inference, 54661 training d8, 27643 training d100
Other users: 6124 inference, 1400143 training d8, 1108006 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24176 records with non-zero proceeds
Global average proceeds: 56.60107402633172
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 58953, 1s: 24176
Removed 0 outliers from 83129 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  11%|█▏        | 7/62 [36:29<4:51:25, 317.91s/it]


Processing inference date: 2024-12-08
Inference data: 7662 records
Training d8 data: 1537563 records
Training d100 data: 1249165 records
Trial users: 249 inference, 82633 training d8, 109733 training d100
Day 0 payers: 1025 inference, 55927 training d8, 27687 training d100
Other users: 6380 inference, 1398214 training d8, 1111473 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 24017 records with non-zero proceeds
Global average proceeds: 56.67727068240964
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 58616, 1s: 24017
Removed 0 outliers from 82633 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  13%|█▎        | 8/62 [43:02<5:07:41, 341.89s/it]


Processing inference date: 2024-12-09
Inference data: 6912 records
Training d8 data: 1538694 records
Training d100 data: 1251569 records
Trial users: 259 inference, 82228 training d8, 109501 training d100
Day 0 payers: 479 inference, 57258 training d8, 27635 training d100
Other users: 6168 inference, 1398407 training d8, 1114155 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23876 records with non-zero proceeds
Global average proceeds: 56.749193844705395
Calculated average proceeds for 70 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 58352, 1s: 23876
Removed 0 outliers from 82228 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  15%|█▍        | 9/62 [49:28<5:14:19, 355.84s/it]


Processing inference date: 2024-12-10
Inference data: 7697 records
Training d8 data: 1540763 records
Training d100 data: 1254986 records
Trial users: 289 inference, 81905 training d8, 109458 training d100
Day 0 payers: 304 inference, 58891 training d8, 27668 training d100
Other users: 7095 inference, 1399151 training d8, 1117578 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23761 records with non-zero proceeds
Global average proceeds: 56.82975960156266
Calculated average proceeds for 70 products
Calculated average proceeds for 105 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 58144, 1s: 23761
Removed 0 outliers from 81905 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  16%|█▌        | 10/62 [55:58<5:17:29, 366.34s/it]


Processing inference date: 2024-12-11
Inference data: 8715 records
Training d8 data: 1542620 records
Training d100 data: 1258774 records
Trial users: 278 inference, 81547 training d8, 109332 training d100
Day 0 payers: 366 inference, 60535 training d8, 27756 training d100
Other users: 8066 inference, 1399714 training d8, 1121400 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23625 records with non-zero proceeds
Global average proceeds: 56.884834782022374
Calculated average proceeds for 69 products
Calculated average proceeds for 106 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 57922, 1s: 23625
Removed 0 outliers from 81547 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  18%|█▊        | 11/62 [1:02:30<5:17:58, 374.09s/it]


Processing inference date: 2024-12-12
Inference data: 7860 records
Training d8 data: 1544633 records
Training d100 data: 1261894 records
Trial users: 251 inference, 81144 training d8, 109271 training d100
Day 0 payers: 340 inference, 61980 training d8, 27839 training d100
Other users: 7265 inference, 1400682 training d8, 1124496 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23482 records with non-zero proceeds
Global average proceeds: 56.96484345643776
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 57662, 1s: 23482
Removed 0 outliers from 81144 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  19%|█▉        | 12/62 [1:07:56<4:59:35, 359.51s/it]


Processing inference date: 2024-12-13
Inference data: 6948 records
Training d8 data: 1546418 records
Training d100 data: 1264621 records
Trial users: 197 inference, 80731 training d8, 109135 training d100
Day 0 payers: 367 inference, 63145 training d8, 27932 training d100
Other users: 6381 inference, 1401714 training d8, 1127266 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23300 records with non-zero proceeds
Global average proceeds: 57.02393237620464
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 57431, 1s: 23300
Removed 0 outliers from 80731 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  21%|██        | 13/62 [1:13:01<4:40:11, 343.09s/it]


Processing inference date: 2024-12-14
Inference data: 7663 records
Training d8 data: 1547201 records
Training d100 data: 1267397 records
Trial users: 224 inference, 80218 training d8, 108869 training d100
Day 0 payers: 392 inference, 64067 training d8, 28063 training d100
Other users: 7036 inference, 1402085 training d8, 1130169 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 23083 records with non-zero proceeds
Global average proceeds: 57.11317466947797
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 57135, 1s: 23083
Removed 0 outliers from 80218 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  23%|██▎       | 14/62 [1:19:05<4:39:29, 349.37s/it]


Processing inference date: 2024-12-15
Inference data: 8305 records
Training d8 data: 1546815 records
Training d100 data: 1269566 records
Trial users: 305 inference, 79554 training d8, 108527 training d100
Day 0 payers: 462 inference, 64791 training d8, 28255 training d100
Other users: 7536 inference, 1401640 training d8, 1132487 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22863 records with non-zero proceeds
Global average proceeds: 57.19976109985827
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 56691, 1s: 22863
Removed 0 outliers from 79554 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  24%|██▍       | 15/62 [1:25:11<4:37:25, 354.16s/it]


Processing inference date: 2024-12-16
Inference data: 7861 records
Training d8 data: 1546662 records
Training d100 data: 1270553 records
Trial users: 273 inference, 79096 training d8, 107948 training d100
Day 0 payers: 359 inference, 65506 training d8, 28398 training d100
Other users: 7226 inference, 1401222 training d8, 1133908 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22713 records with non-zero proceeds
Global average proceeds: 57.29551985000457
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 56383, 1s: 22713
Removed 0 outliers from 79096 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  26%|██▌       | 16/62 [1:31:03<4:31:00, 353.49s/it]


Processing inference date: 2024-12-17
Inference data: 9905 records
Training d8 data: 1547801 records
Training d100 data: 1272701 records
Trial users: 294 inference, 78642 training d8, 107551 training d100
Day 0 payers: 376 inference, 66437 training d8, 28649 training d100
Other users: 9228 inference, 1401877 training d8, 1136196 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22516 records with non-zero proceeds
Global average proceeds: 57.44002683367148
Calculated average proceeds for 69 products
Calculated average proceeds for 104 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 56126, 1s: 22516
Removed 0 outliers from 78642 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  27%|██▋       | 17/62 [1:36:23<4:17:45, 343.68s/it]


Processing inference date: 2024-12-18
Inference data: 10547 records
Training d8 data: 1549123 records
Training d100 data: 1276266 records
Trial users: 329 inference, 78297 training d8, 107166 training d100
Day 0 payers: 422 inference, 66814 training d8, 28957 training d100
Other users: 9787 inference, 1403161 training d8, 1139825 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22388 records with non-zero proceeds
Global average proceeds: 57.53324690570087
Calculated average proceeds for 69 products
Calculated average proceeds for 103 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 55909, 1s: 22388
Removed 0 outliers from 78297 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  29%|██▉       | 18/62 [1:41:25<4:02:45, 331.04s/it]


Processing inference date: 2024-12-19
Inference data: 10202 records
Training d8 data: 1551696 records
Training d100 data: 1279752 records
Trial users: 321 inference, 78037 training d8, 106858 training d100
Day 0 payers: 409 inference, 67027 training d8, 29165 training d100
Other users: 9466 inference, 1405773 training d8, 1143408 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22269 records with non-zero proceeds
Global average proceeds: 57.61263241300884
Calculated average proceeds for 69 products
Calculated average proceeds for 102 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 55768, 1s: 22269
Removed 0 outliers from 78037 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  31%|███       | 19/62 [1:46:30<3:51:41, 323.28s/it]


Processing inference date: 2024-12-20
Inference data: 11184 records
Training d8 data: 1555420 records
Training d100 data: 1283800 records
Trial users: 284 inference, 77770 training d8, 106514 training d100
Day 0 payers: 412 inference, 67314 training d8, 29414 training d100
Other users: 10486 inference, 1409472 training d8, 1147550 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 22137 records with non-zero proceeds
Global average proceeds: 57.67996652382237
Calculated average proceeds for 69 products
Calculated average proceeds for 102 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 55633, 1s: 22137
Removed 0 outliers from 77770 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  32%|███▏      | 20/62 [1:51:40<3:43:21, 319.09s/it]


Processing inference date: 2024-12-21
Inference data: 11192 records
Training d8 data: 1557814 records
Training d100 data: 1288835 records
Trial users: 314 inference, 77405 training d8, 106256 training d100
Day 0 payers: 437 inference, 67555 training d8, 29651 training d100
Other users: 10433 inference, 1411986 training d8, 1152599 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21966 records with non-zero proceeds
Global average proceeds: 57.775067782905055
Calculated average proceeds for 69 products
Calculated average proceeds for 100 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 55439, 1s: 21966
Removed 0 outliers from 77405 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting

Processing dates:  34%|███▍      | 21/62 [1:57:07<3:39:44, 321.56s/it]


Processing inference date: 2024-12-22
Inference data: 12093 records
Training d8 data: 1559063 records
Training d100 data: 1292808 records
Trial users: 337 inference, 76951 training d8, 105811 training d100
Day 0 payers: 506 inference, 67820 training d8, 29870 training d100
Other users: 11243 inference, 1413422 training d8, 1156793 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21786 records with non-zero proceeds
Global average proceeds: 57.85472457988247
Calculated average proceeds for 69 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 55165, 1s: 21786
Removed 0 outliers from 76951 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  35%|███▌      | 22/62 [2:02:33<3:35:16, 322.92s/it]


Processing inference date: 2024-12-23
Inference data: 11854 records
Training d8 data: 1561304 records
Training d100 data: 1294847 records
Trial users: 338 inference, 76611 training d8, 105211 training d100
Day 0 payers: 447 inference, 68114 training d8, 29981 training d100
Other users: 11065 inference, 1415699 training d8, 1159319 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21650 records with non-zero proceeds
Global average proceeds: 57.92686674783234
Calculated average proceeds for 69 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54961, 1s: 21650
Removed 0 outliers from 76611 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  37%|███▋      | 23/62 [2:08:04<3:31:22, 325.20s/it]


Processing inference date: 2024-12-24
Inference data: 11894 records
Training d8 data: 1564217 records
Training d100 data: 1297625 records
Trial users: 330 inference, 76365 training d8, 104740 training d100
Day 0 payers: 445 inference, 68482 training d8, 30201 training d100
Other users: 11117 inference, 1418488 training d8, 1162350 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21524 records with non-zero proceeds
Global average proceeds: 58.03500068194918
Calculated average proceeds for 69 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54841, 1s: 21524
Removed 0 outliers from 76365 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  39%|███▊      | 24/62 [2:13:23<3:24:55, 323.57s/it]


Processing inference date: 2024-12-25
Inference data: 12026 records
Training d8 data: 1566549 records
Training d100 data: 1300867 records
Trial users: 389 inference, 76076 training d8, 104348 training d100
Day 0 payers: 464 inference, 68768 training d8, 30522 training d100
Other users: 11168 inference, 1420821 training d8, 1165662 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21395 records with non-zero proceeds
Global average proceeds: 58.12689583708886
Calculated average proceeds for 69 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54681, 1s: 21395
Removed 0 outliers from 76076 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  40%|████      | 25/62 [2:18:34<3:17:14, 319.86s/it]


Processing inference date: 2024-12-26
Inference data: 15336 records
Training d8 data: 1570956 records
Training d100 data: 1302206 records
Trial users: 518 inference, 75730 training d8, 103828 training d100
Day 0 payers: 586 inference, 69056 training d8, 30700 training d100
Other users: 14231 inference, 1425280 training d8, 1167343 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21219 records with non-zero proceeds
Global average proceeds: 58.2053867394533
Calculated average proceeds for 69 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54511, 1s: 21219
Removed 0 outliers from 75730 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  42%|████▏     | 26/62 [2:23:59<3:12:49, 321.37s/it]


Processing inference date: 2024-12-27
Inference data: 15297 records
Training d8 data: 1576091 records
Training d100 data: 1301048 records
Trial users: 520 inference, 75490 training d8, 103318 training d100
Day 0 payers: 582 inference, 69380 training d8, 30886 training d100
Other users: 14192 inference, 1430322 training d8, 1166504 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 21085 records with non-zero proceeds
Global average proceeds: 58.287497339865595
Calculated average proceeds for 70 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54405, 1s: 21085
Removed 0 outliers from 75490 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting

Processing dates:  44%|████▎     | 27/62 [2:29:27<3:08:30, 323.16s/it]


Processing inference date: 2024-12-28
Inference data: 16447 records
Training d8 data: 1580479 records
Training d100 data: 1299954 records
Trial users: 537 inference, 75187 training d8, 102841 training d100
Day 0 payers: 734 inference, 69674 training d8, 31034 training d100
Other users: 15166 inference, 1434713 training d8, 1165735 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 20887 records with non-zero proceeds
Global average proceeds: 58.362878822783344
Calculated average proceeds for 70 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54300, 1s: 20887
Removed 0 outliers from 75187 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting

Processing dates:  45%|████▌     | 28/62 [2:34:58<3:04:30, 325.60s/it]


Processing inference date: 2024-12-29
Inference data: 17984 records
Training d8 data: 1585128 records
Training d100 data: 1298120 records
Trial users: 584 inference, 74764 training d8, 102206 training d100
Day 0 payers: 899 inference, 69958 training d8, 31093 training d100
Other users: 16492 inference, 1439500 training d8, 1164478 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 20649 records with non-zero proceeds
Global average proceeds: 58.487834606959815
Calculated average proceeds for 70 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54115, 1s: 20649
Removed 0 outliers from 74764 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  47%|████▋     | 29/62 [2:40:20<2:58:27, 324.47s/it]


Processing inference date: 2024-12-30
Inference data: 17260 records
Training d8 data: 1590382 records
Training d100 data: 1295160 records
Trial users: 586 inference, 74406 training d8, 101364 training d100
Day 0 payers: 738 inference, 70300 training d8, 31037 training d100
Other users: 15929 inference, 1444762 training d8, 1162417 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 20467 records with non-zero proceeds
Global average proceeds: 58.57511667154512
Calculated average proceeds for 70 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53939, 1s: 20467
Removed 0 outliers from 74406 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  48%|████▊     | 30/62 [2:45:50<2:54:02, 326.33s/it]


Processing inference date: 2024-12-31
Inference data: 15617 records
Training d8 data: 1596644 records
Training d100 data: 1293868 records
Trial users: 491 inference, 74061 training d8, 100765 training d100
Day 0 payers: 733 inference, 70701 training d8, 31106 training d100
Other users: 14377 inference, 1450963 training d8, 1161643 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 20307 records with non-zero proceeds
Global average proceeds: 58.662992400268024
Calculated average proceeds for 70 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53754, 1s: 20307
Removed 0 outliers from 74061 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  50%|█████     | 31/62 [2:51:12<2:47:47, 324.76s/it]


Processing inference date: 2025-01-01
Inference data: 20189 records
Training d8 data: 1602771 records
Training d100 data: 1293924 records
Trial users: 689 inference, 73760 training d8, 100238 training d100
Day 0 payers: 1050 inference, 71068 training d8, 31308 training d100
Other users: 18434 inference, 1457020 training d8, 1162017 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 20160 records with non-zero proceeds
Global average proceeds: 58.72823042025364
Calculated average proceeds for 71 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53600, 1s: 20160
Removed 0 outliers from 73760 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  52%|█████▏    | 32/62 [2:56:54<2:45:02, 330.07s/it]


Processing inference date: 2025-01-02
Inference data: 21824 records
Training d8 data: 1608995 records
Training d100 data: 1293187 records
Trial users: 787 inference, 73444 training d8, 99700 training d100
Day 0 payers: 986 inference, 71415 training d8, 31438 training d100
Other users: 20040 inference, 1463211 training d8, 1161681 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19972 records with non-zero proceeds
Global average proceeds: 58.84432258864497
Calculated average proceeds for 71 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53472, 1s: 19972
Removed 0 outliers from 73444 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  53%|█████▎    | 33/62 [3:02:30<2:40:24, 331.88s/it]


Processing inference date: 2025-01-03
Inference data: 22097 records
Training d8 data: 1615850 records
Training d100 data: 1292371 records
Trial users: 780 inference, 73265 training d8, 99167 training d100
Day 0 payers: 913 inference, 71807 training d8, 31592 training d100
Other users: 20402 inference, 1469849 training d8, 1161241 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19860 records with non-zero proceeds
Global average proceeds: 58.92247205015578
Calculated average proceeds for 71 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53405, 1s: 19860
Removed 0 outliers from 73265 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  55%|█████▍    | 34/62 [3:08:02<2:34:54, 331.94s/it]


Processing inference date: 2025-01-04
Inference data: 21143 records
Training d8 data: 1625535 records
Training d100 data: 1291320 records
Trial users: 733 inference, 73124 training d8, 98582 training d100
Day 0 payers: 915 inference, 72291 training d8, 31690 training d100
Other users: 19483 inference, 1479191 training d8, 1160673 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19701 records with non-zero proceeds
Global average proceeds: 59.03863694621826
Calculated average proceeds for 72 products
Calculated average proceeds for 97 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53423, 1s: 19701
Removed 0 outliers from 73124 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  56%|█████▋    | 35/62 [3:13:28<2:28:31, 330.07s/it]


Processing inference date: 2025-01-05
Inference data: 19874 records
Training d8 data: 1633772 records
Training d100 data: 1291116 records
Trial users: 798 inference, 72764 training d8, 98043 training d100
Day 0 payers: 996 inference, 72736 training d8, 31856 training d100
Other users: 18066 inference, 1487341 training d8, 1160838 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19503 records with non-zero proceeds
Global average proceeds: 59.17044612295697
Calculated average proceeds for 72 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53261, 1s: 19503
Removed 0 outliers from 72764 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  58%|█████▊    | 36/62 [3:18:52<2:22:15, 328.27s/it]


Processing inference date: 2025-01-06
Inference data: 16979 records
Training d8 data: 1643909 records
Training d100 data: 1288228 records
Trial users: 733 inference, 72622 training d8, 97345 training d100
Day 0 payers: 795 inference, 73374 training d8, 31837 training d100
Other users: 15446 inference, 1496972 training d8, 1158664 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19362 records with non-zero proceeds
Global average proceeds: 59.32632704674099
Calculated average proceeds for 73 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53260, 1s: 19362
Removed 0 outliers from 72622 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  60%|█████▉    | 37/62 [3:24:12<2:15:46, 325.85s/it]


Processing inference date: 2025-01-07
Inference data: 13749 records
Training d8 data: 1655772 records
Training d100 data: 1286965 records
Trial users: 570 inference, 72557 training d8, 96725 training d100
Day 0 payers: 636 inference, 74182 training d8, 31974 training d100
Other users: 12531 inference, 1508083 training d8, 1157881 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19248 records with non-zero proceeds
Global average proceeds: 59.44626970003952
Calculated average proceeds for 73 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53309, 1s: 19248
Removed 0 outliers from 72557 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  61%|██████▏   | 38/62 [3:29:50<2:11:47, 329.49s/it]


Processing inference date: 2025-01-08
Inference data: 13708 records
Training d8 data: 1667031 records
Training d100 data: 1287169 records
Trial users: 538 inference, 72445 training d8, 96263 training d100
Day 0 payers: 590 inference, 74815 training d8, 32202 training d100
Other users: 12570 inference, 1518814 training d8, 1158312 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 19124 records with non-zero proceeds
Global average proceeds: 59.53937836100511
Calculated average proceeds for 73 products
Calculated average proceeds for 100 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53321, 1s: 19124
Removed 0 outliers from 72445 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  63%|██████▎   | 39/62 [3:35:27<2:07:12, 331.83s/it]


Processing inference date: 2025-01-09
Inference data: 12448 records
Training d8 data: 1676909 records
Training d100 data: 1287685 records
Trial users: 525 inference, 72342 training d8, 95773 training d100
Day 0 payers: 562 inference, 75452 training d8, 32361 training d100
Other users: 11354 inference, 1528143 training d8, 1159148 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18985 records with non-zero proceeds
Global average proceeds: 59.65699457935536
Calculated average proceeds for 73 products
Calculated average proceeds for 100 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53357, 1s: 18985
Removed 0 outliers from 72342 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  65%|██████▍   | 40/62 [3:41:00<2:01:43, 331.98s/it]


Processing inference date: 2025-01-10
Inference data: 12160 records
Training d8 data: 1691392 records
Training d100 data: 1289871 records
Trial users: 454 inference, 72430 training d8, 95323 training d100
Day 0 payers: 496 inference, 76407 training d8, 32531 training d100
Other users: 11203 inference, 1541567 training d8, 1161608 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18914 records with non-zero proceeds
Global average proceeds: 59.81240203940302
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53516, 1s: 18914
Removed 0 outliers from 72430 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  66%|██████▌   | 41/62 [3:46:31<1:56:09, 331.87s/it]


Processing inference date: 2025-01-11
Inference data: 11970 records
Training d8 data: 1706935 records
Training d100 data: 1292968 records
Trial users: 460 inference, 72439 training d8, 95014 training d100
Day 0 payers: 583 inference, 77296 training d8, 32872 training d100
Other users: 10911 inference, 1556201 training d8, 1164662 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18775 records with non-zero proceeds
Global average proceeds: 59.991876257259484
Calculated average proceeds for 74 products
Calculated average proceeds for 102 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53664, 1s: 18775
Removed 0 outliers from 72439 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  68%|██████▊   | 42/62 [3:52:12<1:51:27, 334.37s/it]


Processing inference date: 2025-01-12
Inference data: 12574 records
Training d8 data: 1722133 records
Training d100 data: 1295453 records
Trial users: 525 inference, 72323 training d8, 94678 training d100
Day 0 payers: 669 inference, 78100 training d8, 33179 training d100
Other users: 11371 inference, 1570710 training d8, 1167165 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18599 records with non-zero proceeds
Global average proceeds: 60.26402749522449
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53724, 1s: 18599
Removed 0 outliers from 72323 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  69%|██████▉   | 43/62 [3:58:14<1:48:34, 342.84s/it]


Processing inference date: 2025-01-13
Inference data: 11746 records
Training d8 data: 1736605 records
Training d100 data: 1297207 records
Trial users: 445 inference, 72347 training d8, 94245 training d100
Day 0 payers: 520 inference, 78920 training d8, 33462 training d100
Other users: 10773 inference, 1584326 training d8, 1169058 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18490 records with non-zero proceeds
Global average proceeds: 60.469292011808164
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53857, 1s: 18490
Removed 0 outliers from 72347 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting 

Processing dates:  71%|███████   | 44/62 [4:03:56<1:42:42, 342.39s/it]


Processing inference date: 2025-01-14
Inference data: 11534 records
Training d8 data: 1750005 records
Training d100 data: 1299331 records
Trial users: 431 inference, 72446 training d8, 93896 training d100
Day 0 payers: 475 inference, 79826 training d8, 33800 training d100
Other users: 10624 inference, 1596707 training d8, 1171190 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18423 records with non-zero proceeds
Global average proceeds: 60.69393397574629
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54023, 1s: 18423
Removed 0 outliers from 72446 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  73%|███████▎  | 45/62 [4:09:37<1:36:53, 341.98s/it]


Processing inference date: 2025-01-15
Inference data: 11183 records
Training d8 data: 1760698 records
Training d100 data: 1302379 records
Trial users: 408 inference, 72522 training d8, 93566 training d100
Day 0 payers: 444 inference, 80550 training d8, 34168 training d100
Other users: 10324 inference, 1606595 training d8, 1174188 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18353 records with non-zero proceeds
Global average proceeds: 60.8658110477033
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54169, 1s: 18353
Removed 0 outliers from 72522 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  74%|███████▍  | 46/62 [4:15:25<1:31:43, 343.94s/it]


Processing inference date: 2025-01-16
Inference data: 11749 records
Training d8 data: 1768752 records
Training d100 data: 1305126 records
Trial users: 362 inference, 72419 training d8, 93273 training d100
Day 0 payers: 427 inference, 81116 training d8, 34458 training d100
Other users: 10956 inference, 1614176 training d8, 1176933 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18230 records with non-zero proceeds
Global average proceeds: 61.00571695999672
Calculated average proceeds for 74 products
Calculated average proceeds for 102 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54189, 1s: 18230
Removed 0 outliers from 72419 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting p

Processing dates:  76%|███████▌  | 47/62 [4:21:17<1:26:33, 346.25s/it]


Processing inference date: 2025-01-17
Inference data: 9200 records
Training d8 data: 1776147 records
Training d100 data: 1307724 records
Trial users: 279 inference, 72292 training d8, 93012 training d100
Day 0 payers: 411 inference, 81625 training d8, 34789 training d100
Other users: 8507 inference, 1621181 training d8, 1179453 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 18062 records with non-zero proceeds
Global average proceeds: 61.16009759016014
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54230, 1s: 18062
Removed 0 outliers from 72292 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  77%|███████▋  | 48/62 [4:27:00<1:20:36, 345.44s/it]


Processing inference date: 2025-01-18
Inference data: 8925 records
Training d8 data: 1781383 records
Training d100 data: 1309977 records
Trial users: 297 inference, 72146 training d8, 92774 training d100
Day 0 payers: 418 inference, 82115 training d8, 35070 training d100
Other users: 8200 inference, 1626067 training d8, 1181656 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17907 records with non-zero proceeds
Global average proceeds: 61.29524133458141
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54239, 1s: 17907
Removed 0 outliers from 72146 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  79%|███████▉  | 49/62 [4:32:42<1:14:36, 344.35s/it]


Processing inference date: 2025-01-19
Inference data: 9641 records
Training d8 data: 1785951 records
Training d100 data: 1312307 records
Trial users: 307 inference, 71742 training d8, 92529 training d100
Day 0 payers: 484 inference, 82502 training d8, 35354 training d100
Other users: 8849 inference, 1630646 training d8, 1183937 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17645 records with non-zero proceeds
Global average proceeds: 61.48017976142294
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54097, 1s: 17645
Removed 0 outliers from 71742 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  81%|████████  | 50/62 [4:38:22<1:08:35, 342.99s/it]


Processing inference date: 2025-01-20
Inference data: 8380 records
Training d8 data: 1790895 records
Training d100 data: 1314327 records
Trial users: 276 inference, 71682 training d8, 92182 training d100
Day 0 payers: 368 inference, 82912 training d8, 35632 training d100
Other users: 7733 inference, 1635225 training d8, 1186020 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17558 records with non-zero proceeds
Global average proceeds: 61.60930293649977
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54124, 1s: 17558
Removed 0 outliers from 71682 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  82%|████████▏ | 51/62 [4:43:56<1:02:25, 340.46s/it]


Processing inference date: 2025-01-21
Inference data: 8093 records
Training d8 data: 1795519 records
Training d100 data: 1317708 records
Trial users: 298 inference, 71556 training d8, 91880 training d100
Day 0 payers: 390 inference, 83422 training d8, 35932 training d100
Other users: 7401 inference, 1639458 training d8, 1189393 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17495 records with non-zero proceeds
Global average proceeds: 61.73710484770531
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 54061, 1s: 17495
Removed 0 outliers from 71556 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  84%|████████▍ | 52/62 [4:49:42<57:00, 342.00s/it]  


Processing inference date: 2025-01-22
Inference data: 7718 records
Training d8 data: 1799129 records
Training d100 data: 1322548 records
Trial users: 278 inference, 71353 training d8, 91725 training d100
Day 0 payers: 351 inference, 83794 training d8, 36300 training d100
Other users: 7085 inference, 1642892 training d8, 1194005 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17422 records with non-zero proceeds
Global average proceeds: 61.851340536030364
Calculated average proceeds for 74 products
Calculated average proceeds for 101 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53931, 1s: 17422
Removed 0 outliers from 71353 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  85%|████████▌ | 53/62 [4:55:19<51:05, 340.58s/it]


Processing inference date: 2025-01-23
Inference data: 7671 records
Training d8 data: 1802463 records
Training d100 data: 1327687 records
Trial users: 242 inference, 71204 training d8, 91520 training d100
Day 0 payers: 345 inference, 84087 training d8, 36642 training d100
Other users: 7081 inference, 1646080 training d8, 1198998 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17330 records with non-zero proceeds
Global average proceeds: 61.95617499855132
Calculated average proceeds for 74 products
Calculated average proceeds for 100 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53874, 1s: 17330
Removed 0 outliers from 71204 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  87%|████████▋ | 54/62 [5:01:03<45:32, 341.54s/it]


Processing inference date: 2025-01-24
Inference data: 7387 records
Training d8 data: 1805680 records
Training d100 data: 1332673 records
Trial users: 231 inference, 71039 training d8, 91394 training d100
Day 0 payers: 356 inference, 84317 training d8, 36979 training d100
Other users: 6798 inference, 1649227 training d8, 1203763 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17234 records with non-zero proceeds
Global average proceeds: 62.06423118424153
Calculated average proceeds for 74 products
Calculated average proceeds for 99 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53805, 1s: 17234
Removed 0 outliers from 71039 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting prep

Processing dates:  89%|████████▊ | 55/62 [5:06:44<39:49, 341.41s/it]


Processing inference date: 2025-01-25
Inference data: 9253 records
Training d8 data: 1809104 records
Training d100 data: 1338243 records
Trial users: 308 inference, 70744 training d8, 91291 training d100
Day 0 payers: 384 inference, 84515 training d8, 37294 training d100
Other users: 8548 inference, 1652746 training d8, 1209113 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 17092 records with non-zero proceeds
Global average proceeds: 62.18970362287199
Calculated average proceeds for 74 products
Calculated average proceeds for 98 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53652, 1s: 17092
Removed 0 outliers from 70744 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting prep

Processing dates:  90%|█████████ | 56/62 [5:12:36<34:26, 344.47s/it]


Processing inference date: 2025-01-26
Inference data: 10997 records
Training d8 data: 1808779 records
Training d100 data: 1345173 records
Trial users: 334 inference, 70183 training d8, 91141 training d100
Day 0 payers: 415 inference, 84681 training d8, 37570 training d100
Other users: 10244 inference, 1652819 training d8, 1215907 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16895 records with non-zero proceeds
Global average proceeds: 62.35092182990397
Calculated average proceeds for 74 products
Calculated average proceeds for 97 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53288, 1s: 16895
Removed 0 outliers from 70183 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pr

Processing dates:  92%|█████████▏| 57/62 [5:18:19<28:40, 344.03s/it]


Processing inference date: 2025-01-27
Inference data: 9606 records
Training d8 data: 1809315 records
Training d100 data: 1351781 records
Trial users: 287 inference, 69810 training d8, 90974 training d100
Day 0 payers: 285 inference, 84935 training d8, 37798 training d100
Other users: 9029 inference, 1653468 training d8, 1222449 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16789 records with non-zero proceeds
Global average proceeds: 62.45620225325805
Calculated average proceeds for 74 products
Calculated average proceeds for 96 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 53021, 1s: 16789
Removed 0 outliers from 69810 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting prep

Processing dates:  94%|█████████▎| 58/62 [5:24:06<23:00, 345.07s/it]


Processing inference date: 2025-01-28
Inference data: 9177 records
Training d8 data: 1811620 records
Training d100 data: 1359656 records
Trial users: 265 inference, 69469 training d8, 90869 training d100
Day 0 payers: 287 inference, 85243 training d8, 38130 training d100
Other users: 8620 inference, 1655811 training d8, 1230086 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16666 records with non-zero proceeds
Global average proceeds: 62.59715956290246
Calculated average proceeds for 73 products
Calculated average proceeds for 96 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 52803, 1s: 16666
Removed 0 outliers from 69469 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting prep

Processing dates:  95%|█████████▌| 59/62 [5:30:18<17:39, 353.02s/it]


Processing inference date: 2025-01-29
Inference data: 8935 records
Training d8 data: 1812623 records
Training d100 data: 1367369 records
Trial users: 236 inference, 69136 training d8, 90793 training d100
Day 0 payers: 276 inference, 85427 training d8, 38499 training d100
Other users: 8414 inference, 1656961 training d8, 1237495 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16555 records with non-zero proceeds
Global average proceeds: 62.6965305607158
Calculated average proceeds for 73 products
Calculated average proceeds for 95 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 52581, 1s: 16555
Removed 0 outliers from 69136 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting prepr

Processing dates:  97%|█████████▋| 60/62 [5:36:11<11:46, 353.07s/it]


Processing inference date: 2025-01-30
Inference data: 8964 records
Training d8 data: 1812454 records
Training d100 data: 1373409 records
Trial users: 242 inference, 68828 training d8, 90641 training d100
Day 0 payers: 269 inference, 85651 training d8, 38769 training d100
Other users: 8443 inference, 1656873 training d8, 1243412 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16433 records with non-zero proceeds
Global average proceeds: 62.817657515099285
Calculated average proceeds for 72 products
Calculated average proceeds for 95 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 52395, 1s: 16433
Removed 0 outliers from 68828 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates:  98%|█████████▊| 61/62 [5:42:06<05:53, 353.46s/it]


Processing inference date: 2025-01-31
Inference data: 9272 records
Training d8 data: 1811920 records
Training d100 data: 1379328 records
Trial users: 280 inference, 68501 training d8, 90527 training d100
Day 0 payers: 248 inference, 85845 training d8, 39006 training d100
Other users: 8739 inference, 1656468 training d8, 1249199 training d100
Training models...
Train trial model
Starting trial model fitting...
Calculating historical proceeds...
Found 16309 records with non-zero proceeds
Global average proceeds: 62.937066821183656
Calculated average proceeds for 72 products
Calculated average proceeds for 94 product-country combinations
Target values before outlier removal: [0 1]
Target distribution: 0s: 52192, 1s: 16309
Removed 0 outliers from 68501 samples (0.00%)
Numerical features: ['eur_marketing_spend', 'impressions', 'clicks']
Categorical features: ['channel_group', 'marketing_network_id', 'target_market', 'signup_country_group', 'signup_client_platform', 'plan_tier']
Fitting pre

Processing dates: 100%|██████████| 62/62 [5:47:52<00:00, 336.65s/it]


Combined results: 707467 records across 62 dates
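
The 62 iterations reported by the progress bar match one inference date per day from 2024-12-01 through 2025-01-31, the range that appears in the saved CSV file name. The sketch below shows one way such a list could be built, assuming daily granularity. `build_inference_dates` is a hypothetical helper for illustration, not the function the notebook uses.

```python
from datetime import date, timedelta


def build_inference_dates(start, end):
    """Return one ISO-formatted date string per day from start to end, inclusive."""
    if end < start:
        raise ValueError("end date must not precede start date")
    n_days = (end - start).days + 1
    return [(start + timedelta(days=i)).isoformat() for i in range(n_days)]


# Date range taken from the CSV file name in the notebook output
inference_dates = build_inference_dates(date(2024, 12, 1), date(2025, 1, 31))
print(len(inference_dates), inference_dates[0], inference_dates[-1])
# prints: 62 2024-12-01 2025-01-31
```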





## 6. Save Results

Write the combined results to a local CSV file, then append them to a Snowflake table.

In [9]:

# Create output directory if it doesn't exist
os.makedirs('output', exist_ok=True)

# Save to CSV
date_range = f"{inference_dates[0]}_to_{inference_dates[-1]}"
output_path = f"output/predictions_{date_range}.csv"
combined_results.to_csv(output_path, index=False)
print(f"Predictions saved locally to {output_path}")

# Save to Snowflake
print("\nSaving results to Snowflake...")
# Reconnect to Snowflake
session = get_snowflake_session()


# Convert to Snowpark DataFrame
snowpark_df = session.create_dataframe(combined_results)

# Save to Snowflake table
table_name = "BLINKIST_DEV.DBT_MJAAMA.MULTI_DATE_EXPECTED_PROCEEDS_20250319"

# Append to the existing table, or create it if it does not exist.
# Note: re-running this cell appends the same rows again, duplicating them.
snowpark_df.write.mode("append").save_as_table(table_name)

print(f"Predictions for {len(inference_dates)} dates saved to {table_name}")


2025-03-19 01:00:13,956 - snowflake.connector.connection - INFO - Snowflake Connector for Python Version: 3.14.0, Python Version: 3.9.6, Platform: macOS-15.2-arm64-arm-64bit
2025-03-19 01:00:13,957 - snowflake.connector.connection - INFO - Connecting to GLOBAL Snowflake domain
2025-03-19 01:00:13,957 - snowflake.connector.connection - INFO - This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.


Predictions saved locally to output/predictions_2024-12-01_to_2025-01-31.csv

Saving results to Snowflake...
Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Going to open: https://blinkist-useast_1_virginia.snowflakecomputing.com/console/login?login_name=meri-kris.jaama%40go1.com&browser_mode_redirect_port=52619&proof_key=zTvIpLRx%2F%2BcvJvPBwcRfSHmyC08FKBDG%2BRTYUCG1VMI%3D to authenticate...


2025-03-19 04:27:57,876 - snowflake.snowpark.session - INFO - Snowpark Session information: 
"version" : 1.29.1,
"python.version" : 3.9.6,
"python.connector.version" : 3.14.0,
"python.connector.session.id" : 9893985662164914,
"os.name" : Darwin



Connected to Snowflake as meri-kris.jaama@go1.com
Predictions for 62 dates saved to BLINKIST_DEV.DBT_MJAAMA.MULTI_DATE_EXPECTED_PROCEEDS_20250319
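
Because the write uses `mode("append")`, running the cell more than once for the same dates adds the same rows again. The aggregation cell later reads 1076388 rows from this table although only 707467 records were combined in this run, which would be consistent with rows left over from an earlier append. A minimal sketch of one way to make the write repeatable is to delete the date range before appending. The helper name `build_delete_statement` and the column name `INFERENCE_DATE` are assumptions for illustration; the actual date column of the predictions table is not shown in this notebook.

```python
import re

# Accept only plain identifiers (optionally qualified as DB.SCHEMA.TABLE)
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")
_ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def build_delete_statement(table_name, date_column, start_date, end_date):
    """Build a DELETE statement that clears an inclusive date range (hypothetical helper)."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    if "." in date_column or not _IDENTIFIER.match(date_column):
        raise ValueError(f"unexpected column name: {date_column!r}")
    for value in (start_date, end_date):
        if not _ISO_DATE.match(value):
            raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return (
        f"DELETE FROM {table_name} "
        f"WHERE {date_column} BETWEEN '{start_date}' AND '{end_date}'"
    )


stmt = build_delete_statement(
    "BLINKIST_DEV.DBT_MJAAMA.MULTI_DATE_EXPECTED_PROCEEDS_20250319",
    "INFERENCE_DATE",  # assumed column name
    "2024-12-01",
    "2025-01-31",
)
print(stmt)
```

In the notebook, the statement could be run with `session.sql(stmt).collect()` just before the append. This only works once the table exists, so a first run would still need to skip the delete.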


## 7. Close Snowflake Session

Close the Snowflake session when done.

In [10]:
# Close Snowflake session
session.close()
print("Snowflake session closed")

2025-03-19 04:28:40,894 - snowflake.snowpark.session - INFO - Closing session: 9893985662164914
2025-03-19 04:28:40,895 - snowflake.snowpark.session - INFO - Canceling all running queries
2025-03-19 04:28:41,110 - snowflake.connector.connection - INFO - closed
2025-03-19 04:28:41,250 - snowflake.connector.connection - INFO - No async queries seem to be running, deleting session
2025-03-19 04:28:41,427 - snowflake.snowpark.session - INFO - Closed session: 9893985662164914


Snowflake session closed
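
Closing the session in a separate cell means it stays open if an earlier cell fails partway through. One option is to wrap the work in `contextlib.closing`, which calls `close()` on any object when the block exits, including on error. The sketch below uses a stand-in `FakeSession` class so that it runs without a Snowflake connection. `run_with_session` is a hypothetical helper, not part of the notebook.

```python
from contextlib import closing


class FakeSession:
    """Stand-in for a Snowpark session so the sketch runs without Snowflake."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_with_session(session_factory, work):
    """Create a session, run work(session), and close the session even if work raises."""
    with closing(session_factory()) as session:
        return work(session)


created = []


def factory():
    session = FakeSession()
    created.append(session)
    return session


try:
    # Simulate a failure in the middle of the upload
    run_with_session(factory, lambda session: 1 / 0)
except ZeroDivisionError:
    pass

print(created[0].closed)  # prints: True
```

In the notebook, `get_snowflake_session` could be passed as the factory, with the save logic as `work`.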


In [12]:
# Import necessary libraries
import pandas as pd
import numpy as np
from snowflake.snowpark import Session
import config
from prediction_processor import PredictionProcessor

# Connect to Snowflake
def get_snowflake_session():
    connection_parameters = {
        "account": config.SNOWFLAKE_ACCOUNT,
        "user": config.SNOWFLAKE_USER,
        "role": config.SNOWFLAKE_ROLE,
        "warehouse": config.SNOWFLAKE_WAREHOUSE,
        "database": config.SNOWFLAKE_DATABASE,
        "schema": config.SNOWFLAKE_SCHEMA,
        "authenticator": config.SNOWFLAKE_AUTHENTICATOR
    }
    session = Session.builder.configs(connection_parameters).create()
    print(f"Connected to Snowflake as {config.SNOWFLAKE_USER}")
    return session

# Initialize Snowflake session
session = get_snowflake_session()

# Define source and target table names
source_table = "BLINKIST_DEV.DBT_MJAAMA.MULTI_DATE_EXPECTED_PROCEEDS_20250319"
target_table = "BLINKIST_DEV.DBT_MJAAMA.AGGREGATED_EXPECTED_PROCEEDS_20250320"

# Read data from Snowflake
print(f"Reading data from {source_table}...")
query = f"SELECT * FROM {source_table}"
predictions_df = session.sql(query).to_pandas()
print(f"Read {len(predictions_df)} rows from source table")

# Initialize the PredictionProcessor with the fixed implementation
processor = PredictionProcessor()

# Process the predictions
print("Processing predictions...")
aggregated_df = processor.process_predictions(predictions_df)
print(f"Processed data into {len(aggregated_df)} aggregated rows")

# Check for NaN values in the calculated columns.
# Columns absent from aggregated_df (roi_d8 and roi_d100 in this run) are skipped.
roi_columns = ['roi_d8', 'exp_roi_d8', 'roi_d100', 'exp_roi_d100',
               'actual_roi_d8', 'actual_roi_d100', 'cpa']
nan_counts = {col: aggregated_df[col].isna().sum() for col in roi_columns if col in aggregated_df.columns}

print("\nNaN value check:")
total_rows = len(aggregated_df)
for col, count in nan_counts.items():
    # Guard against division by zero when the aggregation returns no rows
    pct = count / total_rows * 100 if total_rows else 0.0
    print(f"{col}: {count} NaN values out of {total_rows} rows ({pct:.2f}%)")

if any(count > 0 for count in nan_counts.values()):
    print("\nWARNING: NaN values detected in calculated columns. Review the data before proceeding.")
    # Optional: Uncomment to stop execution if NaN values are found
    # raise ValueError("NaN values detected in calculated columns")
else:
    print("\nNo NaN values found in calculated columns. Proceeding with Snowflake upload.")

# Display sample of the aggregated data
print("\nSample of aggregated data:")
display(aggregated_df.head())

# Write the aggregated data back to Snowflake
print(f"\nWriting aggregated data to {target_table}...")

# No explicit existence check or DROP TABLE is needed here: mode("overwrite")
# below replaces the target table if it already exists. (SHOW TABLES LIKE matches
# the bare table name only, so a fully qualified name would never match.)

# Convert pandas DataFrame to Snowpark DataFrame
aggregated_snowpark_df = session.create_dataframe(aggregated_df)

# Write to Snowflake with overwrite mode
aggregated_snowpark_df.write.mode("overwrite").save_as_table(target_table)

print(f"Successfully wrote {len(aggregated_df)} rows to {target_table}")

# Close Snowflake session
session.close()
print("Snowflake session closed")

2025-03-19 09:07:27,287 - snowflake.connector.connection - INFO - Snowflake Connector for Python Version: 3.14.0, Python Version: 3.9.6, Platform: macOS-15.2-arm64-arm-64bit
2025-03-19 09:07:27,288 - snowflake.connector.connection - INFO - Connecting to GLOBAL Snowflake domain
2025-03-19 09:07:27,289 - snowflake.connector.connection - INFO - This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.


Initiating login request with your identity provider. A browser window should have opened for you to complete the login. If you can't see it, check existing browser windows, or your OS settings. Press CTRL+C to abort and try again...
Going to open: https://blinkist-useast_1_virginia.snowflakecomputing.com/console/login?login_name=meri-kris.jaama%40go1.com&browser_mode_redirect_port=56700&proof_key=%2F2jbKVPBsfLxbTYRj2Fd6rTZ7TvZiMs3F79%2FaKe%2B4t0%3D to authenticate...


2025-03-19 09:07:33,854 - snowflake.snowpark.session - INFO - Snowpark Session information: 
"version" : 1.29.1,
"python.version" : 3.9.6,
"python.connector.version" : 3.14.0,
"python.connector.session.id" : 9893985662438446,
"os.name" : Darwin



Connected to Snowflake as meri-kris.jaama@go1.com
Reading data from BLINKIST_DEV.DBT_MJAAMA.MULTI_DATE_EXPECTED_PROCEEDS_20250319...
Read 1076388 rows from source table
Processing predictions...
Processed data into 4817 aggregated rows

NaN value check:
exp_roi_d8: 0 NaN values out of 4817 rows (0.00%)
exp_roi_d100: 0 NaN values out of 4817 rows (0.00%)
actual_roi_d8: 0 NaN values out of 4817 rows (0.00%)
actual_roi_d100: 0 NaN values out of 4817 rows (0.00%)
cpa: 0 NaN values out of 4817 rows (0.00%)

No NaN values found in calculated columns. Proceeding with Snowflake upload.

Sample of aggregated data:


| | report_date | channel_group | marketing_network_id | account_id | campaign_name | campaign_id | adgroup_name | adgroup_id | target_market | total_users | ... | eur_proceeds_d0 | eur_proceeds_d8 | eur_proceeds_d100 | expected_proceeds_d8 | expected_proceeds_d100 | exp_roi_d8 | exp_roi_d100 | actual_roi_d8 | actual_roi_d100 | cpa |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-12-01 | paid_content | facebook | 927736373941355 | 20240705_FB_PRO_ES_Web2App_iOS_S2S_MAN_MNC_Sub... | 6595671883259 | 20240705_FB_PRO_ES_Web2App_iOS_S2S_MAN_MNC_Sub... | 6595671882659 | row | 1 | ... | 0.0 | 0.0 | 0.0 | 0.330357 | 0.511162 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | 2024-12-01 | paid_content | facebook | 927736373941355 | 20240705_FB_PRO_ES_Web2App_iOS_S2S_MAN_MNC_Sub... | 6595671883259 | 20240815_FB_PRO_ES_Web2App_ALL_S2S_MAN_MNC_Sub... | 6602384189259 | row | 4 | ... | 103.332614 | 103.332614 | 103.332614 | 103.400599 | 103.400599 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | 2024-12-01 | paid_content | facebook | 927736373941355 | 20240705_FB_PRO_ES_Web2App_iOS_S2S_MAN_MNC_Sub... | 6595671883259 | 20241014_FB_PRO_ES_Web2App_ALL_S2S_MAN_MNC_Sub... | 6614643557859 | row | 1 | ... | 0.0 | 0.0 | 0.0 | 0.067985 | 0.067985 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | 2024-12-01 | paid_content | facebook | 927736373941355 | 20240712_FB_PRO_T1_Web2App_ALL_S2S_MAN_MNC_Sub... | 6596549145459 | 20240712_FB_PRO_T1_Web2App_ALL_S2S_MAN_MNC_Sub... | 6596549145259 | row | 1 | ... | 0.0 | 0.0 | 0.0 | 0.143144 | 0.229921 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | 2024-12-01 | paid_content | facebook | 927736373941355 | 20240712_FB_PRO_T1_Web2App_ALL_S2S_MAN_MNC_Sub... | 6596549145459 | 20241121_FB_PRO_T1_Web2App_ALL_S2S_MAN_MNC_Sub... | 6622384529259 | row | 3 | ... | 25.268073 | 25.268073 | 25.268073 | 25.781309 | 26.405292 | 0.076004 | 0.077843 | 0.074491 | 0.074491 | 113.07 |



Writing aggregated data to BLINKIST_DEV.DBT_MJAAMA.AGGREGATED_EXPECTED_PROCEEDS_20250320...


2025-03-19 09:07:59,812 - snowflake.snowpark.session - INFO - Closing session: 9893985662438446
2025-03-19 09:07:59,813 - snowflake.snowpark.session - INFO - Canceling all running queries


Successfully wrote 4817 rows to BLINKIST_DEV.DBT_MJAAMA.AGGREGATED_EXPECTED_PROCEEDS_20250320


2025-03-19 09:08:00,043 - snowflake.connector.connection - INFO - closed
2025-03-19 09:08:00,176 - snowflake.connector.connection - INFO - No async queries seem to be running, deleting session
2025-03-19 09:08:00,363 - snowflake.snowpark.session - INFO - Closed session: 9893985662438446


Snowflake session closed
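
The implementation of `PredictionProcessor` is not shown, but the last sample row (index 4) is consistent with ROI being proceeds divided by spend and CPA being spend divided by `total_users`. The rows with positive expected proceeds but ROI and CPA of 0.0 would fit a zero-spend guard that returns 0.0 instead of NaN. The sketch below reproduces the row under those assumptions. `safe_ratio` is a hypothetical helper, and spend is back-derived from the rounded `cpa` value because the spend column is hidden in the truncated output.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Values from sample row 4; spend is an assumption derived as cpa * total_users
total_users = 3
spend = 113.07 * total_users

actual_roi_d8 = safe_ratio(25.268073, spend)  # eur_proceeds_d8 / spend
exp_roi_d8 = safe_ratio(25.781309, spend)     # expected_proceeds_d8 / spend
exp_roi_d100 = safe_ratio(26.405292, spend)   # expected_proceeds_d100 / spend

print(round(actual_roi_d8, 4), round(exp_roi_d8, 4), round(exp_roi_d100, 4))

# With zero spend the guard returns 0.0 rather than raising or producing NaN
print(safe_ratio(0.330357, 0))
```

The recomputed ratios agree with the displayed `actual_roi_d8`, `exp_roi_d8`, and `exp_roi_d100` to roughly five decimal places. Exact agreement is not expected because the spend figure carries the rounding of `cpa`.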
