# BAföG Prosimos Simulation

- Group 10
- Module/Course: Business Process Management

Simulation overview (brief)
- Units: All times in seconds.
- Arrivals (exponential by calendar):
  - Mon–Fri 08:00–12:00: mean every 20 s
  - Mon–Fri 12:00–14:00: mean every 60 s
  - Mon–Fri 14:00–17:30: mean every 30 s
  - Mon–Fri 17:30–24:00: mean every 180 s
  - Sat–Sun 00:00–24:00: mean every 300 s
- Resources:
  - System (24/7), high capacity, automated tasks (e.g., send/generate)
  - BAföG Office (3 caseworkers), calendar: Mon–Fri 08:00–17:30
- Task durations (excerpt):
  - Generate Confirmation Mail: uniform [1, 3]
  - Send Confirmation Mail: uniform [1, 3]
  - Store Application: uniform [2, 5]
  - Send Mail To Parents: uniform [2, 5]
  - Review Documents: normal (mean 600, std 120), truncated per config
  - Request Resubmission: normal (mean 300, std 60)
  - Assess Application: normal (mean 1800, std 300)
  - Calculate Claim: normal (mean 900, std 180)
  - Write Notification: normal (mean 600, std 120)
  - Write Rejection: normal (mean 600, std 120)
  - Receive Parents Data: waiting time (see config, e.g., exponential/normal in days)
- Gateways: branching probabilities per JSON (gateway_branching_probabilities).
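
The arrival calendar above can be illustrated with a small standalone sketch. It is not part of the Prosimos configuration and does not use the Prosimos API; the names `WEEKDAY_WINDOWS`, `mean_interarrival`, and `sample_interarrival` are hypothetical helpers introduced only for this illustration. The overview lists no window for Mon-Fri 00:00-08:00, so the sketch returns `None` there rather than guessing a value.

```python
import random
from datetime import datetime

# (start_minute, end_minute, mean_seconds) for Mon-Fri, copied from the overview.
WEEKDAY_WINDOWS = [
    (8 * 60, 12 * 60, 20),
    (12 * 60, 14 * 60, 60),
    (14 * 60, 17 * 60 + 30, 30),
    (17 * 60 + 30, 24 * 60, 180),
]
WEEKEND_MEAN = 300  # Sat-Sun, all day


def mean_interarrival(ts):
    """Mean inter-arrival time in seconds at `ts`, or None if the overview
    lists no window for that moment (Mon-Fri before 08:00)."""
    if ts.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return WEEKEND_MEAN
    minute = ts.hour * 60 + ts.minute
    for start, end, mean in WEEKDAY_WINDOWS:
        if start <= minute < end:
            return mean
    return None


def sample_interarrival(ts, rng=random):
    """Draw one exponential inter-arrival time (seconds) for the window at `ts`."""
    mean = mean_interarrival(ts)
    if mean is None:
        return None
    # expovariate takes the rate (1 / mean), not the mean itself.
    return rng.expovariate(1.0 / mean)
```

For example, `mean_interarrival(datetime(2024, 9, 16, 9, 0))` falls into the Monday morning window and returns 20, while 2024-09-15, the start date used for the simulation run, is a Sunday and therefore maps to the 300 s weekend mean.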

Celonis export
- File: prosimos_output_for_celonis.csv
- Mapping: Case = CASE_KEY, Activity = ACTIVITY, Event Time = EVENTTIME (start time, yyyy-MM-dd HH:mm:ss), Sorting Column = SORTING, Resource = RESOURCE

## 1. Setup

In [92]:
# Optional: Installation
# !pip install -q prosimos pm4py graphviz

In [None]:
import os
import json
from datetime import datetime
import logging

# Reduce log noise
logging.basicConfig(level=logging.WARNING, format='%(message)s')
logger = logging.getLogger(__name__)


## 2. Verify files

In [None]:
def verify_files():
    req = {'bpmn': 'bafoeg_process.bpmn', 'config': 'bafoeg_prosimos_config.json'}
    missing = [v for v in req.values() if not os.path.exists(v)]
    if missing:
        print("Missing files:", ", ".join(missing))
        return False
    print("OK: required files present")
    return True

verify_files()


OK: required files present


True

## 3. Load configuration

In [None]:
def load_and_validate_config():
    try:
        with open('bafoeg_prosimos_config.json', 'r', encoding='utf-8') as f:
            cfg = json.load(f)
        print("Config loaded")
        return cfg
    except Exception as e:
        print(f"Config error: {e}")
        return None

config = load_and_validate_config()


Config loaded


## 4. Run simulation

In [None]:
def run_simulation(num_cases=2000, start_date="2024-09-15T00:00:00Z"):
    try:
        from prosimos.simulation_engine import run_simulation as prosimos_run
        print(f"Running simulation: {num_cases} cases starting at {start_date}...")
        result = prosimos_run(
            bpmn_path='bafoeg_process.bpmn',
            json_path='bafoeg_prosimos_config.json',
            total_cases=num_cases,
            starting_at=start_date,
            log_out_path='./prosimos_output.csv'
        )
        print("Done: prosimos_output.csv")
        return result
    except ImportError:
        print("Prosimos not installed. Please install with: pip install prosimos")
        return None
    except Exception as e:
        print(f"Error: {e}")
        return None


In [97]:
# Start
result = run_simulation(num_cases=2000, start_date="2024-09-15T00:00:00Z")

Running simulation: 2000 cases starting at 2024-09-15T00:00:00Z...
Done: prosimos_output.csv


## 5. Quick overview

In [None]:
import os
import pandas as pd

# Summarize the simulation log itself; an arbitrary CSV in the folder
# may lack the case_id/activity columns and raise a KeyError.
log_path = 'prosimos_output.csv'
if os.path.exists(log_path):
    df = pd.read_csv(log_path)
    print(f"Events: {len(df)}, Cases: {df['case_id'].nunique()}, Activities: {df['activity'].nunique()}")
else:
    print(f"File not found: {log_path}")


Events: 22903, Cases: 2000, Activities: 14
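
Beyond the raw counts, the log can be compared against the configured task durations. The following is a minimal sketch, assuming the log contains the `case_id`, `activity`, `start_time`, and `end_time` columns used elsewhere in this notebook; `summarize_durations` is a hypothetical helper name. Durations are computed as wall-clock `end_time - start_time`, so they may exceed the configured processing time if a task spans a period in which its resource is off calendar.

```python
import pandas as pd


def summarize_durations(df):
    """Return (per-activity duration stats in seconds, per-case cycle time in seconds)."""
    df = df.copy()
    for col in ("start_time", "end_time"):
        df[col] = pd.to_datetime(df[col], utc=True, errors="coerce")

    # Wall-clock duration of each event in seconds.
    df["duration_s"] = (df["end_time"] - df["start_time"]).dt.total_seconds()
    per_activity = df.groupby("activity")["duration_s"].agg(["count", "mean", "std"])

    # Cycle time: first start to last end within each case.
    per_case = df.groupby("case_id").agg(
        start=("start_time", "min"), end=("end_time", "max")
    )
    cycle_s = (per_case["end"] - per_case["start"]).dt.total_seconds()
    return per_activity, cycle_s
```

Calling `per_activity, cycle_s = summarize_durations(df)` on the loaded log makes it possible to check, for instance, whether the mean of `Review Documents` is close to the configured 600 s.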


## 6. Postprocessing for Celonis

In [None]:
# Sort the CSV by start time and produce a single Celonis file (with resource)
import pandas as pd
import os

def postprocess_prosimos_csv(src='prosimos_output.csv', dst_celonis='prosimos_output_for_celonis.csv'):
    if not os.path.exists(src):
        print(f"⚠ File not found: {src}")
        return None

    df = pd.read_csv(src)

    # Parse timestamps (keep UTC)
    for col in ['enable_time', 'start_time', 'end_time']:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col], utc=True, errors='coerce')

    # Event time preference: start_time > end_time > enable_time
    if 'start_time' in df.columns:
        df['event_time'] = df['start_time']
    else:
        df['event_time'] = pd.NaT
    if 'end_time' in df.columns:
        df['event_time'] = df['event_time'].fillna(df['end_time'])
    if 'enable_time' in df.columns:
        df['event_time'] = df['event_time'].fillna(df['enable_time'])

    # Stable sort per case: event_time, end_time (if present), then original row
    df['_row'] = range(len(df))
    sort_cols = ['case_id', 'event_time']
    if 'end_time' in df.columns:
        sort_cols.append('end_time')
    sort_cols.append('_row')
    df = df.sort_values(sort_cols).reset_index(drop=True)

    # Sequence per case (1..n) as Celonis sorting column
    df['sorting'] = df.groupby('case_id').cumcount() + 1
    df['sorting'] = df['sorting'].astype('int64')

    # Celonis export (single file, include resource)
    cel = pd.DataFrame({
        'CASE_KEY': df['case_id'],
        'ACTIVITY': df['activity'],
        'EVENTTIME': df['event_time'],
        'SORTING': df['sorting'],
        'RESOURCE': df.get('resource')
    })

    # Format: yyyy-MM-dd HH:mm:ss (no ms, no timezone)
    cel['EVENTTIME'] = (
        pd.to_datetime(cel['EVENTTIME'], utc=True, errors='coerce')
          .dt.tz_convert('UTC')
          .dt.tz_localize(None)
          .dt.strftime('%Y-%m-%d %H:%M:%S')
    )

    cel.to_csv(dst_celonis, index=False)

    print(f"✅ Celonis file (with SORTING & RESOURCE) written to: {dst_celonis}")
    print("Note: In Celonis map 'Event Time' = EVENTTIME, 'Sorting Column' = SORTING, 'Case ID' = CASE_KEY, 'Activity' = ACTIVITY, 'Resource' = RESOURCE.")
    return dst_celonis

celonis_csv = postprocess_prosimos_csv('prosimos_output.csv', 'prosimos_output_for_celonis.csv')


✅ Celonis file (with SORTING & RESOURCE) written to: prosimos_output_for_celonis.csv
Note: In Celonis map 'Event Time' = EVENTTIME, 'Sorting Column' = SORTING, 'Case ID' = CASE_KEY, 'Activity' = ACTIVITY, 'Resource' = RESOURCE.


## 7. Import instructions for Celonis
- Event Time: `EVENTTIME` (format: yyyy-MM-dd HH:mm:ss, based on start time)
- Sorting Column: `SORTING` (INTEGER, strictly increasing per case)
- Case ID: `CASE_KEY`
- Activity: `ACTIVITY`
- Resource: `RESOURCE`

Steps in Celonis:
1) Import `prosimos_output_for_celonis.csv`.
2) Map columns as described above.
3) Enable the Sorting Column and select `SORTING`.
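
Before uploading, the export can be checked against the properties listed above. This is a minimal sketch of such a pre-import check; `check_celonis_export` and `EXPECTED_COLUMNS` are hypothetical names introduced here, and the check only covers what this notebook states about the file (column names, timestamp format, strictly increasing `SORTING` per case), not any Celonis-side validation.

```python
import pandas as pd

EXPECTED_COLUMNS = ["CASE_KEY", "ACTIVITY", "EVENTTIME", "SORTING", "RESOURCE"]


def check_celonis_export(cel):
    """Return a list of problems found in the export frame (empty if none)."""
    missing = [c for c in EXPECTED_COLUMNS if c not in cel.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []

    # EVENTTIME must match yyyy-MM-dd HH:mm:ss exactly; anything else becomes NaT.
    parsed = pd.to_datetime(
        cel["EVENTTIME"], format="%Y-%m-%d %H:%M:%S", errors="coerce"
    )
    n_bad = int(parsed.isna().sum())
    if n_bad:
        problems.append(f"{n_bad} EVENTTIME values not in yyyy-MM-dd HH:mm:ss")

    # SORTING must be strictly increasing within each case, in file order.
    steps = cel.groupby("CASE_KEY")["SORTING"].diff().dropna()
    if (steps <= 0).any():
        problems.append("SORTING is not strictly increasing within every case")

    return problems
```

A typical call would be `check_celonis_export(pd.read_csv('prosimos_output_for_celonis.csv'))`; an empty list means the file matches the mapping described in this section.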
