# Reports #

After constructing a Synthorus system, several reports on the system are available. In general, the reports are created by reading the model definition files and are written to the "reports" subdirectory of the model definition directory.

Some reports can be made by the `make_model_definition_files` script, and some can be made by running a separate script function after the model definition files are generated.

The following table summarises the reports and how they may be generated.

| report                      | file name                | make_model_definition_files | manual script function   |
|:----------------------------|:-------------------------|:----------------------------|:-------------------------|
| detailed cross-table report | crosstabs.csv            | ✓                           | ✗                        |
| model specification report  | report_on_model_spec.txt | ✓                           | `make_model_spec_report` |
| privacy report              | report_on_privacy.txt    | ✓                           | `make_privacy_report`    |
| utility report              | report_on_utility.txt    | ✗                           | `make_utility_report`    |
| detailed utility results    | utility_results.csv      | ✗                           | `make_utility_report`    |
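As a quick check against the table above, the following minimal sketch lists which of the expected report files are present in a "reports" directory. The file names come from the table; the helper name `list_report_files` and the temporary stand-in directory are illustrative assumptions and not part of Synthorus.

```python
import tempfile
from pathlib import Path

# Report file names, taken from the table above.
REPORT_FILES = {
    'detailed cross-table report': 'crosstabs.csv',
    'model specification report': 'report_on_model_spec.txt',
    'privacy report': 'report_on_privacy.txt',
    'utility report': 'report_on_utility.txt',
    'detailed utility results': 'utility_results.csv',
}


def list_report_files(reports_dir: Path) -> dict[str, bool]:
    """Hypothetical helper: map each report name to whether its file exists."""
    return {
        report: (reports_dir / file_name).is_file()
        for report, file_name in REPORT_FILES.items()
    }


# Demonstrate on a stand-in directory that holds only the cross-table report.
with tempfile.TemporaryDirectory() as tmp:
    demo_reports_dir = Path(tmp) / 'reports'
    demo_reports_dir.mkdir()
    (demo_reports_dir / 'crosstabs.csv').touch()
    status = list_report_files(demo_reports_dir)

for report, present in status.items():
    print(f"{'present' if present else 'missing':8} {report}")
```

With a real system, the same helper could be pointed at the "reports" subdirectory of the model definition directory.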


The following code uses `make_model_definition_files` to create a Synthorus system, creating only the cross-table report.

Note that the demo output directory is not used in a `with` context. This keeps the directory available across the cells of this running example; it is cleaned up explicitly at the end.

In [1]:
from synthorus_demos.utils.file_helper import print_file_tree
from synthorus_demos.utils.file_helper import cat
from synthorus.utils.print_function import NO_LOG
from synthorus.workflows.make_model_definition_files import make_model_definition_files
from synthorus_demos.demo_files import SPEC_FILES
from synthorus.spec_file.interpret_spec_file import load_spec_file
from synthorus.model.model_spec import ModelSpec
from synthorus_demos.utils.output_directory import output_directory


# Create a managed demo output directory for the output model definition files.
model_definition_dir = output_directory('demo_reports')

model_spec: ModelSpec = load_spec_file(SPEC_FILES / 'spec_simple_pjm.py')
make_model_definition_files(
    model_spec,
    model_definition_dir,
    make_crosstab_report=True,
    make_privacy_report=False,
    make_model_spec_report=False,
    overwrite=True,
    log=NO_LOG,
)

# Show what files were created
print('-------------------------------------------')
print_file_tree(model_definition_dir)
print('-------------------------------------------')


-------------------------------------------
demo_reports/
  clean_cross_tables/
    _event_duration.pkl
    _event_duration_since_last.pkl
    _event_type.pkl
    _patient_age.pkl
  model_index.json
  model_spec.json
  noisy_cross_tables/
    _event_duration.pkl
    _event_duration_since_last.pkl
    _event_type.pkl
    _patient_age.pkl
  pgms/
    event.py
    patient.py
  reports/
    crosstabs.csv
  simulator_spec.json
-------------------------------------------


## Cross-table report ##

Here is the resulting cross-table report. In this example we use pandas to load the CSV file and format it for display.

In [2]:
import pandas as pd

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

pd.read_csv(model_definition_dir / 'reports' / 'crosstabs.csv')

Unnamed: 0,Cross-table,Data-source,Number-of-rvs,RVs,State-space-size,Number-of-rows,Number-of-suppressed-rows,Min-weight,Max-weight,Total-weight,Sensitivity,Epsilon,Min-cell-size,Max-add-rows,Orig-rows,Lost-rows,Lost-rows%,Added-rows,Added-rows%,Final-rows,Final-rows%
0,_patient_age,patient_age__event_type,1,'patient_age',3,3,0,30.0,48.0,108.0,0.0,0.0,0.0,1000000,3,0,0.0,0,0.0,3,100.0
1,_event_type,patient_age__event_type,1,'event_type',4,4,0,3.0,63.0,108.0,0.0,0.0,0.0,1000000,4,0,0.0,0,0.0,4,100.0
2,_event_duration,event_duration,1,'event_duration',9,9,0,0.01,2.56,5.11,0.0,0.0,0.0,1000000,9,0,0.0,0,0.0,9,100.0
3,_event_duration_since_last,event_duration_since_last,1,'event_duration_since_last',9,9,0,0.01,2.56,5.11,0.0,0.0,0.0,1000000,9,0,0.0,0,0.0,9,100.0
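The report can also be inspected without pandas. The following minimal sketch uses the standard `csv` module to find cross-tables whose final row count differs from the original row count. The column names and the two sample rows are copied from the output shown above (the header of the actual file may label the first column differently); the function name `changed_cross_tables` is a hypothetical helper, and reading `Orig-rows` and `Final-rows` as row counts before and after processing is an assumption.

```python
import csv
import io

# Header and first two rows, copied from the displayed report above.
CROSSTABS_CSV = """\
Unnamed: 0,Cross-table,Data-source,Number-of-rvs,RVs,State-space-size,Number-of-rows,Number-of-suppressed-rows,Min-weight,Max-weight,Total-weight,Sensitivity,Epsilon,Min-cell-size,Max-add-rows,Orig-rows,Lost-rows,Lost-rows%,Added-rows,Added-rows%,Final-rows,Final-rows%
0,_patient_age,patient_age__event_type,1,'patient_age',3,3,0,30.0,48.0,108.0,0.0,0.0,0.0,1000000,3,0,0.0,0,0.0,3,100.0
1,_event_type,patient_age__event_type,1,'event_type',4,4,0,3.0,63.0,108.0,0.0,0.0,0.0,1000000,4,0,0.0,0,0.0,4,100.0
"""


def changed_cross_tables(csv_text: str) -> list[str]:
    """Hypothetical helper: names of cross-tables whose row count changed."""
    changed = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row['Final-rows']) != int(row['Orig-rows']):
            changed.append(row['Cross-table'])
    return changed


all_names = [row['Cross-table'] for row in csv.DictReader(io.StringIO(CROSSTABS_CSV))]
changed = changed_cross_tables(CROSSTABS_CSV)

print('cross-tables read:', all_names)
print('cross-tables with changed row counts:', changed)
```

In this example no rows are lost or added, so the list of changed cross-tables is empty.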


## Model specification report ##

This shows how to manually create a model specification report.

In [3]:
from synthorus.workflows.report_spec import make_model_spec_report

make_model_spec_report(model_definition_dir)

# Show the report
print('-------------------------------------------')
print_file_tree(model_definition_dir / 'reports')
print('-------------------------------------------')
print()
cat(model_definition_dir / 'reports' / 'report_on_model_spec.txt')


-------------------------------------------
reports/
  crosstabs.csv
  report_on_model_spec.txt
-------------------------------------------

Model Spec Report

Report date: 2025-11-11 16:34:50 (+1100)
Report author: user "barry"

Model name: spec_simple_pjm
Model author: Barry Drake

Cross-tables:
    Cross-table: _event_duration
        Random variables: event_duration
        Datasource: event_duration
        Sensitivity: 0
        Epsilon: 0
        Min cell size: 0
        State space size: 9
        Number of rows: 9
        Number of suppressed rows: 0
        Min weight: 0.01
        Max weight: 2.56
        Total weight: 5.11

    Cross-table: _event_duration_since_last
        Random variables: event_duration_since_last
        Datasource: event_duration_since_last
        Sensitivity: 0
        Epsilon: 0
        Min cell size: 0
        State space size: 9
        Number of rows: 9
        Number of suppressed rows: 0
        Min weight: 0.01
        Max weight: 2.56
      

## Privacy report ##

This shows how to manually create a privacy report.

In [4]:
from synthorus.workflows.report_privacy import make_privacy_report

make_privacy_report(model_definition_dir)

# Show the report
print('-------------------------------------------')
print_file_tree(model_definition_dir / 'reports')
print('-------------------------------------------')
print()
cat(model_definition_dir / 'reports' / 'report_on_privacy.txt')


-------------------------------------------
reports/
  crosstabs.csv
  report_on_model_spec.txt
  report_on_privacy.txt
-------------------------------------------

Privacy Report

Report date: 2025-11-11 16:34:50 (+1100)
Report author: barry

Model name: spec_simple_pjm
Model author: Barry Drake

Random number generator security level: 4 (equivalent to AES128 - adequate security)
Privacy budget: 0 (no privacy risk)

Datasources with zero sensitivity (3 of 3):
    event_duration
    event_duration_since_last
    patient_age__event_type

Datasources (3):
    Datasource name: event_duration
    Sensitivity: 0
    Random variables: event_duration
    Distinct rows: 9
    Total weight: 5.11
    Entropy: 1.9795669564757818

    Datasource name: event_duration_since_last
    Sensitivity: 0
    Random variables: event_duration_since_last
    Distinct rows: 9
    Total weight: 5.11
    Entropy: 1.979566956475782

    Datasource name: patient_age__event_type
    Sensitivity: 0
    Random variab
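The two `Entropy` values reported above agree to within floating-point rounding. They are consistent with Shannon entropy measured in bits, which the following minimal sketch checks. The weight sequence used here is an assumption: the report states only the number of distinct rows (9) and the total weight (5.11), and the minimum (0.01) and maximum (2.56) weights appear in the model specification report. Nine weights doubling from 0.01 to 2.56 is one sequence that fits those figures. Whether Synthorus computes entropy this way is not confirmed by the report.

```python
import math

# ASSUMPTION: nine weights that double from 0.01 to 2.56. This sequence is
# consistent with the reported row count, min weight, max weight and total
# weight, but the actual weights are not shown in the report.
weights = [0.01 * 2 ** i for i in range(9)]
total = sum(weights)

# Shannon entropy in bits of the normalised weights.
entropy = -sum((w / total) * math.log2(w / total) for w in weights)

print(f'total weight: {total:.2f}')
print(f'entropy: {entropy:.6f}')
```

Under this assumption the computed entropy matches the reported value to well within rounding error.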

## Utility report ##

This shows how to create a utility report.

In [5]:
from synthorus.workflows.report_utility import make_utility_report

make_utility_report(model_definition_dir)

# Show the report
print('-------------------------------------------')
print_file_tree(model_definition_dir / 'reports')
print('-------------------------------------------')
print()
cat(model_definition_dir / 'reports' / 'report_on_utility.txt')

generating report on utility
entity: patient
loading PGM
compiling PGM
constructing evaluation trials
entity: event
loading PGM
compiling PGM
constructing evaluation trials
running evaluation trials
trial 1 of 8, event HI event_duration: 0.0
trial 2 of 8, event KL event_duration: 1.0
trial 3 of 8, event HI event_duration_since_last: 0.0
trial 4 of 8, event KL event_duration_since_last: 1.0
trial 5 of 8, event HI event_type: 0.0
trial 6 of 8, event KL event_type: 1.0
trial 7 of 8, patient HI patient_age: 0.0
trial 8 of 8, patient KL patient_age: 1.0
-------------------------------------------
reports/
  crosstabs.csv
  report_on_model_spec.txt
  report_on_privacy.txt
  report_on_utility.txt
  utility_results.csv
-------------------------------------------

Utility Report

Report date: 2025-11-11 16:34:50 (+1100)
Report author: user "barry"

Model name: spec_simple_pjm
Model author: Barry Drake

Analysis summary:

    Number of entities: 2

    Entity: event
        Worst histogram-inter

And here are the detailed utility results.

In [6]:
pd.read_csv(model_definition_dir / 'reports' / 'utility_results.csv')


Unnamed: 0,entity,number_of_rvs,rvs,measure_name,measurement
0,event,1,event_duration,histogram-intersection,0.0
1,event,1,event_duration,KL-divergence,1.0
2,event,1,event_duration_since_last,histogram-intersection,0.0
3,event,1,event_duration_since_last,KL-divergence,1.0
4,event,1,event_type,histogram-intersection,0.0
5,event,1,event_type,KL-divergence,1.0
6,patient,1,patient_age,histogram-intersection,0.0
7,patient,1,patient_age,KL-divergence,1.0
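The detailed results hold one row per measure, so it can help to group them by entity and random variable. The following minimal sketch does this with the standard `csv` module. The rows are copied from the output shown above (the header of the actual file may label the first column differently), and the function name `measurements_by_rvs` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the displayed utility results above.
UTILITY_RESULTS_CSV = """\
Unnamed: 0,entity,number_of_rvs,rvs,measure_name,measurement
0,event,1,event_duration,histogram-intersection,0.0
1,event,1,event_duration,KL-divergence,1.0
2,event,1,event_duration_since_last,histogram-intersection,0.0
3,event,1,event_duration_since_last,KL-divergence,1.0
4,event,1,event_type,histogram-intersection,0.0
5,event,1,event_type,KL-divergence,1.0
6,patient,1,patient_age,histogram-intersection,0.0
7,patient,1,patient_age,KL-divergence,1.0
"""


def measurements_by_rvs(csv_text: str) -> dict[tuple[str, str], dict[str, float]]:
    """Hypothetical helper: group measurements by (entity, rvs)."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row['entity'], row['rvs'])
        grouped[key][row['measure_name']] = float(row['measurement'])
    return dict(grouped)


grouped = measurements_by_rvs(UTILITY_RESULTS_CSV)
for (entity, rvs), measures in grouped.items():
    print(entity, rvs, measures)
```

Each key then carries both measures side by side, which makes it easier to compare random variables within an entity.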


In [7]:
# Clean up the managed demo output directory now that the running example is finished.
model_definition_dir.cleanup()
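The explicit `cleanup()` call above takes the place of leaving a `with` block. Python's standard `tempfile.TemporaryDirectory` supports the same two styles and is used below purely as an analogy; that `output_directory` behaves in exactly the same way is an assumption.

```python
import tempfile
from pathlib import Path

# Style 1: a `with` block removes the directory as soon as the block exits.
with tempfile.TemporaryDirectory() as scoped_name:
    scoped_path = Path(scoped_name)
    (scoped_path / 'scratch.txt').write_text('short-lived')
scoped_exists_after = scoped_path.exists()

# Style 2: keep the directory alive across several steps, then clean up
# explicitly, as the running example does.
managed = tempfile.TemporaryDirectory()
managed_path = Path(managed.name)
(managed_path / 'scratch.txt').write_text('available across steps')
managed_exists_before = managed_path.exists()
managed.cleanup()
managed_exists_after = managed_path.exists()

print(scoped_exists_after, managed_exists_before, managed_exists_after)
```

The second style suits a notebook, where the directory has to outlive any single cell.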