# BAföG Prosimos Simulation

- Group 10
- Module/Course: Business Process Management

Simulation overview (brief)
- Units: All times in seconds.
- Arrivals (exponential by calendar):
  - Mon–Fri 08:00–12:00: mean every 20 s
  - Mon–Fri 12:00–14:00: mean every 60 s
  - Mon–Fri 14:00–17:30: mean every 30 s
  - Mon–Fri 17:30–24:00: mean every 180 s
  - Sat–Sun 00:00–24:00: mean every 300 s
- Resources:
  - System (24/7), high capacity, automated tasks (e.g., send/generate)
  - BAföG Office (3 caseworkers), calendar: Mon–Fri 08:00–17:30
- Task durations (excerpt):
  - Generate Confirmation Mail: uniform [1, 3]
  - Send Confirmation Mail: uniform [1, 3]
  - Store Application: uniform [2, 5]
  - Send Mail To Parents: uniform [2, 5]
  - Review Documents: normal (mean 600, std 120), truncated per config
  - Request Resubmission: normal (mean 300, std 60)
  - Assess Application: normal (mean 1800, std 300)
  - Calculate Claim: normal (mean 900, std 180)
  - Write Notification: normal (mean 600, std 120)
  - Write Rejection: normal (mean 600, std 120)
  - Receive Parents Data: waiting time (see config, e.g., exponential/normal in days)
- Gateways: branching probabilities per JSON (gateway_branching_probabilities).
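
The arrival calendar above can be read as a lookup from a timestamp to a mean inter-arrival time. The following is a minimal sketch of that reading, not Prosimos' internal logic; `ARRIVAL_CALENDAR`, `mean_interarrival_seconds`, and `expected_arrivals_per_day` are hypothetical names introduced here. It assumes arrivals within each interval form a Poisson process, so the expected number of arrivals is the interval length divided by the mean.

```python
from datetime import datetime, time

# (weekdays, start, end, mean inter-arrival time in seconds), copied from the overview.
# Weekdays follow datetime.weekday(): Monday = 0 ... Sunday = 6. end=None means "until midnight".
ARRIVAL_CALENDAR = [
    (range(0, 5), time(8, 0), time(12, 0), 20),
    (range(0, 5), time(12, 0), time(14, 0), 60),
    (range(0, 5), time(14, 0), time(17, 30), 30),
    (range(0, 5), time(17, 30), None, 180),
    (range(5, 7), time(0, 0), None, 300),
]

def mean_interarrival_seconds(ts):
    """Return the mean inter-arrival time at `ts`, or None if no interval covers it."""
    for days, start, end, mean_s in ARRIVAL_CALENDAR:
        in_window = ts.time() >= start and (end is None or ts.time() < end)
        if ts.weekday() in days and in_window:
            return mean_s
    return None

def expected_arrivals_per_day(weekday):
    """Expected arrivals on one day, assuming a Poisson process per interval."""
    total = 0.0
    for days, start, end, mean_s in ARRIVAL_CALENDAR:
        if weekday in days:
            start_s = start.hour * 3600 + start.minute * 60
            end_s = 24 * 3600 if end is None else end.hour * 3600 + end.minute * 60
            total += (end_s - start_s) / mean_s
    return total

print(mean_interarrival_seconds(datetime(2024, 9, 16, 9, 0)))  # Monday morning
print(expected_arrivals_per_day(0), expected_arrivals_per_day(5))
```

Under these assumptions a weekday yields about 1390 expected arrivals and a weekend day 288. The overview lists no interval for Mon–Fri 00:00–08:00, so the lookup returns `None` there.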

Celonis export
- File: prosimos_output_for_celonis.csv
- Mapping: Case = CASE_KEY, Activity = ACTIVITY, Event Time = EVENTTIME (start time, yyyy-MM-dd HH:mm:ss), Sorting Column = SORTING, Resource = RESOURCE

## 1. Setup

In [23]:
# Optional: Installation
# !pip install -q prosimos pm4py graphviz

In [None]:
import sys
from pathlib import Path

# if your notebook is in /notebooks
PROJECT_ROOT = Path.cwd().parent
sys.path.insert(0, str(PROJECT_ROOT))

from config import get_asset, CSV_DIR

bpmn_asset = get_asset("bafoeg_process")
bpmn_path = bpmn_asset.bpmn_path
prosimos_cfg = bpmn_asset.prosimos_config_path

csv_file_name = "prosimos_output.csv"
csv_celonis_file_name = "prosimos_output_for_celonis.csv"
output_csv_file_path = CSV_DIR / csv_file_name
output_csv_celonis_file_path = CSV_DIR / csv_celonis_file_name


print(CSV_DIR)
print(output_csv_celonis_file_path)

# Path diagnostics: confirm where the simulation log is expected
print("CWD:", Path.cwd())
print("given:", output_csv_file_path)
print("resolved:", Path(output_csv_file_path).resolve())
print("exists:", Path(output_csv_file_path).exists())




C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv
C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv\prosimos_output_for_celonis.csv
CWD: c:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\notebooks
given: C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv\prosimos_output.csv
resolved: C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv\prosimos_output.csv
exists: True


In [25]:
import os
import json
from datetime import datetime
import logging

# Reduce log noise
logging.basicConfig(level=logging.WARNING, format='%(message)s')
logger = logging.getLogger(__name__)


## 2. Verify files

In [26]:
import os

def verify_files():
    req = {
        "bpmn": str(bpmn_path),
        "config": str(prosimos_cfg),
    }
    missing = [p for p in req.values() if not os.path.exists(p)]
    if missing:
        print("Missing files:", ", ".join(missing))
        return False
    print("OK: required files present")
    return True

verify_files()


OK: required files present


True

In [47]:
import os

# Separate check for the generated CSVs, named differently so it does not
# shadow verify_files() from the previous cell.
def verify_output_files():
    req = {
        "csv": str(output_csv_file_path),
        "csv_celonis": str(output_csv_celonis_file_path),
    }
    missing = [p for p in req.values() if not os.path.exists(p)]
    if missing:
        print("Missing files:", ", ".join(missing))
        return False
    print("OK: required files present")
    return True

verify_output_files()

OK: required files present


True

## 3. Load configuration

In [27]:
import json

def load_and_validate_config():
    """Load the Prosimos JSON config; return None if it cannot be read or parsed."""
    try:
        with open(str(prosimos_cfg), 'r', encoding='utf-8') as f:
            cfg = json.load(f)
    except (OSError, json.JSONDecodeError) as e:
        print(f"Config error: {e}")
        return None
    print("Config loaded")
    return cfg

config = load_and_validate_config()


Config loaded
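
Loading the JSON only proves that it parses. A light structural check could look like the sketch below. Only `gateway_branching_probabilities` is named in this notebook; the other section names in `ASSUMED_SECTIONS` and the `gateway_id` / `probabilities` / `value` layout are assumptions about the Prosimos config format and should be adjusted to the actual file. `check_config` is a hypothetical helper.

```python
import math

# Assumed top-level sections of a Prosimos config (adjust to the real file).
ASSUMED_SECTIONS = [
    "arrival_time_distribution",
    "arrival_time_calendar",
    "resource_profiles",
    "resource_calendars",
    "task_resource_distribution",
    "gateway_branching_probabilities",
]

def check_config(cfg):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = [f"missing section: {k}" for k in ASSUMED_SECTIONS if k not in cfg]
    # Assumed layout: [{"gateway_id": ..., "probabilities": [{"path_id": ..., "value": ...}]}]
    for gw in cfg.get("gateway_branching_probabilities", []):
        total = sum(float(p["value"]) for p in gw.get("probabilities", []))
        if not math.isclose(total, 1.0, abs_tol=1e-6):
            problems.append(
                f"gateway {gw.get('gateway_id')}: probabilities sum to {total:.3f}"
            )
    return problems

# Toy config for illustration only (not the project's real config).
toy_cfg = {
    "gateway_branching_probabilities": [
        {"gateway_id": "g1", "probabilities": [
            {"path_id": "a", "value": 0.7},
            {"path_id": "b", "value": 0.2},
        ]},
    ]
}
for problem in check_config(toy_cfg):
    print(problem)
```

In the notebook this would be called as `check_config(config)` after the load step, provided `config` is not `None`.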


## 4. Run simulation

In [28]:
def run_simulation(num_cases=2000, start_date="2024-09-15T00:00:00Z"):
    try:
        from prosimos.simulation_engine import run_simulation as prosimos_run
        print(f"Running simulation: {num_cases} cases starting at {start_date}...")
        result = prosimos_run(
            bpmn_path,
            prosimos_cfg,
            total_cases=num_cases,
            starting_at=start_date,
            log_out_path=output_csv_file_path
        )
        print(f"Done: {output_csv_file_path}")
        return result
    except ImportError:
        print("Prosimos not installed. Please install with: pip install prosimos")
        return None
    except Exception as e:
        print(f"Error: {e}")
        return None


In [29]:
# Start
result = run_simulation(num_cases=2000, start_date="2024-09-15T00:00:00Z")

Prosimos not installed. Please install with: pip install prosimos


## 5. Quick overview

In [61]:
from pathlib import Path
import pandas as pd

p = Path(output_csv_file_path)  # can be "prosimos_output.csv" or "output/prosimos_output.csv"

if p.exists():
    df = pd.read_csv(p)
    print(f"Loaded: {p}")
    print(f"Events: {len(df)}, Cases: {df['case_id'].nunique()}, Activities: {df['activity'].nunique()}")
else:
    print(f"No CSV found at: {p.resolve()}")


Loaded: C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv\prosimos_output.csv
Events: 22903, Cases: 2000, Activities: 14
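
Beyond raw counts, the same log supports a quick look at cycle times and activity frequencies. The sketch below assumes the columns `case_id`, `activity`, `start_time`, and `end_time` used elsewhere in this notebook; `summarize_log` is a hypothetical helper and the toy rows are made up for illustration.

```python
import pandas as pd

def summarize_log(df):
    """Return (cycle time per case in seconds, event count per activity)."""
    df = df.copy()
    df["start_time"] = pd.to_datetime(df["start_time"], utc=True)
    df["end_time"] = pd.to_datetime(df["end_time"], utc=True)
    per_case = df.groupby("case_id").agg(
        first_start=("start_time", "min"),
        last_end=("end_time", "max"),
    )
    # Cycle time = last end minus first start within a case
    cycle_s = (per_case["last_end"] - per_case["first_start"]).dt.total_seconds()
    counts = df["activity"].value_counts()
    return cycle_s, counts

# Toy log for illustration only
toy_log = pd.DataFrame({
    "case_id": [1, 1, 2],
    "activity": ["Store Application", "Review Documents", "Store Application"],
    "start_time": ["2024-09-16T08:00:00+00:00", "2024-09-16T08:05:00+00:00",
                   "2024-09-16T09:00:00+00:00"],
    "end_time": ["2024-09-16T08:00:04+00:00", "2024-09-16T08:15:00+00:00",
                 "2024-09-16T09:00:03+00:00"],
})
cycle_s, counts = summarize_log(toy_log)
print(cycle_s)
print(counts)
```

On the real log, `summarize_log(df)[0].describe()` gives a compact view of the cycle-time distribution.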


## 6. Postprocessing for Celonis

In [92]:
# Sort the CSV by start time and produce a single Celonis file (with resource)
import pandas as pd
from pathlib import Path

def postprocess_prosimos_csv(input_path, output_path):
    # Use Path instead of os.path so the cell does not depend on an earlier `import os`
    input_path = Path(input_path)
    if not input_path.exists():
        print(f"⚠ File not found: {input_path}")
        return None

    df = pd.read_csv(input_path)

    # Parse timestamps (keep UTC)
    for col in ['enable_time', 'start_time', 'end_time']:
        if col in df.columns:
            df[col] = pd.to_datetime(df[col], utc=True, errors='coerce')

    # Event time preference: start_time > end_time > enable_time
    if 'start_time' in df.columns:
        df['event_time'] = df['start_time']
    else:
        df['event_time'] = pd.NaT
    if 'end_time' in df.columns:
        df['event_time'] = df['event_time'].fillna(df['end_time'])
    if 'enable_time' in df.columns:
        df['event_time'] = df['event_time'].fillna(df['enable_time'])

    # Stable sort per case: event_time, end_time (if present), then original row
    df['_row'] = range(len(df))
    sort_cols = ['case_id', 'event_time']
    if 'end_time' in df.columns:
        sort_cols.append('end_time')
    sort_cols.append('_row')
    df = df.sort_values(sort_cols).reset_index(drop=True)

    # Sequence per case (1..n) as Celonis sorting column
    df['sorting'] = (df.groupby('case_id').cumcount() + 1).astype('int64')

    # Celonis export (single file, include resource; empty column if absent)
    cel = pd.DataFrame({
        'CASE_KEY': df['case_id'],
        'ACTIVITY': df['activity'],
        'EVENTTIME': df['event_time'],
        'SORTING': df['sorting'],
        'RESOURCE': df.get('resource')
    })

    # Format: yyyy-MM-dd HH:mm:ss (UTC wall-clock time, no ms, no timezone suffix)
    cel['EVENTTIME'] = (
        pd.to_datetime(cel['EVENTTIME'], utc=True, errors='coerce')
          .dt.tz_localize(None)
          .dt.strftime('%Y-%m-%d %H:%M:%S')
    )

    cel.to_csv(output_path, index=False)

    print(f"✅ Celonis file (with SORTING & RESOURCE) written to: {output_path}")
    print("Note: In Celonis map 'Event Time' = EVENTTIME, 'Sorting Column' = SORTING, 'Case ID' = CASE_KEY, 'Activity' = ACTIVITY, 'Resource' = RESOURCE.")
    return output_path

celonis_csv = postprocess_prosimos_csv(output_csv_file_path, output_csv_celonis_file_path)


✅ Celonis file (with SORTING & RESOURCE) written to: C:\Users\abodu\Desktop\Clutter Desktop\اوراق الجامعة\Semesters\WinterSemester 25&26\Buisness Process Management\pm4py\Mining-tests\data\outputs\event_logs\csv\prosimos_output_for_celonis.csv
Note: In Celonis map 'Event Time' = EVENTTIME, 'Sorting Column' = SORTING, 'Case ID' = CASE_KEY, 'Activity' = ACTIVITY, 'Resource' = RESOURCE.
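
Why a separate sorting column is useful: automated tasks last only a few seconds, so several events in one case can share the same timestamp once it is formatted to whole seconds. The toy example below (made-up rows, same column names as the log) illustrates how the stable sort plus `cumcount` keeps their original order; it is a sketch of the idea, not a replacement for the function above.

```python
import pandas as pd

# Toy events: two events of case 1 share the same start second
toy = pd.DataFrame({
    "case_id": [1, 1, 1, 2],
    "activity": [
        "Send Confirmation Mail",
        "Store Application",
        "Generate Confirmation Mail",
        "Store Application",
    ],
    "start_time": pd.to_datetime(
        ["2024-09-16 08:00:05", "2024-09-16 08:00:01",
         "2024-09-16 08:00:01", "2024-09-16 08:00:02"],
        utc=True,
    ),
})

# Original row position breaks ties between identical timestamps
toy["_row"] = range(len(toy))
toy = toy.sort_values(["case_id", "start_time", "_row"]).reset_index(drop=True)

# 1..n per case, the value exported as SORTING
toy["sorting"] = toy.groupby("case_id").cumcount() + 1

print(toy[["case_id", "activity", "start_time", "sorting"]])
```

Within case 1 the two events at 08:00:01 receive `sorting` 1 and 2 in their original row order, so the sequence stays unambiguous even though the timestamps are identical.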


## 7. Import instructions for Celonis
- Event Time: `EVENTTIME` (format: yyyy-MM-dd HH:mm:ss, based on start time)
- Sorting Column: `SORTING` (INTEGER, strictly increasing per case)
- Case ID: `CASE_KEY`
- Activity: `ACTIVITY`
- Resource: `RESOURCE`

Steps in Celonis:
1) Import `prosimos_output_for_celonis.csv`.
2) Map columns as described above.
3) Enable the Sorting Column and select `SORTING`.
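
Before uploading, the export can be checked against the expectations listed above. The following is a minimal sketch; `check_celonis_export` is a hypothetical helper, and the toy frame stands in for `pd.read_csv(output_csv_celonis_file_path)`.

```python
import pandas as pd

EXPECTED_COLUMNS = ["CASE_KEY", "ACTIVITY", "EVENTTIME", "SORTING", "RESOURCE"]

def check_celonis_export(cel):
    """Return a list of problems found in the export; an empty list means none."""
    missing = [c for c in EXPECTED_COLUMNS if c not in cel.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    # SORTING must be strictly increasing within each case (in file order)
    step = cel.groupby("CASE_KEY")["SORTING"].diff().dropna()
    if (step <= 0).any():
        problems.append("SORTING is not strictly increasing per case")

    # EVENTTIME must match yyyy-MM-dd HH:mm:ss
    ordered = cel.sort_values(["CASE_KEY", "SORTING"], kind="stable")
    parsed = pd.to_datetime(ordered["EVENTTIME"], format="%Y-%m-%d %H:%M:%S",
                            errors="coerce")
    if parsed.isna().any():
        problems.append("EVENTTIME has missing or malformed values")

    # Event times should not go backwards when following SORTING
    time_step = parsed.groupby(ordered["CASE_KEY"]).diff()
    if (time_step < pd.Timedelta(0)).any():
        problems.append("EVENTTIME decreases within a case")
    return problems

# Toy export for illustration only
toy_export = pd.DataFrame({
    "CASE_KEY": [1, 1, 2],
    "ACTIVITY": ["Store Application", "Generate Confirmation Mail", "Store Application"],
    "EVENTTIME": ["2024-09-16 08:00:01", "2024-09-16 08:00:01", "2024-09-16 09:00:00"],
    "SORTING": [1, 2, 1],
    "RESOURCE": ["System", "System", "System"],
})
print(check_celonis_export(toy_export))
```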
