Rip-tar Logic Kernel — Benchmark Suite

Version 1.0 · Truthimatics Engine v2.0
Last run: see report_summary.json for timestamp

Quick Start — Try the Engine Now

# From the benchmarks/ directory:
gcc -std=c99 -O2 -Wall -Wextra -o public_test public_test.c riptar_kernel_obfuscated.c -lm
./public_test -f sample_input.csv

This compiles the public test harness and runs it against the included sample data (a nominal Raptor-class FFSC startup). The output shows the kernel's verdict at each tick. See GUIDE.md for detailed usage.

Overview

This benchmark suite evaluates the Rip-tar Logic Kernel across a diverse set of operational scenarios designed to stress-test its decision-making under realistic and adversarial conditions. The kernel is a deterministic startup sequencing engine built on the Truthimatics framework — a multi-gate evidence accumulation system.

The suite measures:

Startup completion rate — percentage of runs that successfully complete the full startup sequence
Truthimatics Confidence Index (TCI) — the kernel's confidence in its own verdicts (0–1 scale)
Determinism Score (D) — the degree of determinism in the evidence (0–1 scale)
Execution time — both simulated tick count and real wall-clock time
Execution stability — variance across multiple runs of the same scenario

No proprietary logic, internal algorithms, or source code is exposed in this report. All measurements are aggregated statistical outputs from the public kernel API.

Test Scenarios

#	Scenario	Description	Type
0	Nominal Low Noise	Clean startup with minimal sensor noise (~2%)	Baseline
1	Nominal Moderate Noise	Moderate sensor noise (~10%)	Noise
2	High Noise Stress	High sensor noise (~30%) simulating degradation	Noise
3	Extreme Noise	Extreme noise (~50%) — near-total corruption	Noise
4	Low Pressure	Chamber pressure at 40% of nominal	Pressure fault
5	Overpressure	Chamber pressure at 150% of nominal	Pressure fault
6	Thermal Low	Cryogenic temperature at 30% of nominal	Thermal fault
7	Thermal High	Over-temperature at 140% of nominal	Thermal fault
8	Flow Starvation	Propellant flow at 20% of nominal	Flow fault
9	Flow Excess	Propellant flow at 200% of nominal	Flow fault
10	Sensor Failure (Sudden)	Complete signal blackout midway	Sensor fault
11	Mid-Startup Spike	Sudden transient pressure/temperature spike	Transient
12	Rapid Cycling	Targets met in half the normal time	Edge case
13	Slow Ramp	Targets scaled down to 60%	Edge case
14	Pressure Oscillation	Oscillating pressure — combustion instability	Instability
15	Combined Faults	Low flow + high temp + elevated noise	Multi-fault

Each scenario runs 10 times with different random seeds for statistical significance. A maximum of 5000 ticks (~50 simulated seconds) is allowed before forced abort.

Results Summary

Note: Results vary slightly between runs due to seeded RNG. The charts in charts/ and the JSON reports in report_summary.json are the canonical results — the inline tables below are representative samples from one benchmark run and may vary slightly. Run make bench_charts to regenerate everything with current results.

Completion Rates

Scenario	Completion Rate	Mean Ticks	Mean Time (ms)	Std Dev (ticks)
Nominal Low Noise	100%	48	474	±1
Nominal Moderate Noise	20%	99	976	±36
High Noise Stress	20%	81	800	±20
Extreme Noise	10%	23	221	±28
Low Pressure	0%	501	5,000	±0
Overpressure	20%	64	632	±11
Thermal Low	0%	501	5,000	±0
Thermal High	100%	78	771	±19
Flow Starvation	0%	11	97	±10
Flow Excess	0%	45	441	±2
Sensor Failure (Sudden)	10%	84	831	±15
Mid-Startup Spike	70%	56	554	±8
Rapid Cycling	0%	6	49	±3
Slow Ramp	0%	501	5,000	±0
Pressure Oscillation	0%	501	5,000	±0
Combined Faults	0%	22	211	±25

Confidence & Determinism

Scenario	Avg TCI	Avg D-Score	Comment
Nominal Low Noise	0.924	0.667	Strong, stable confidence
Thermal High	0.953	0.667	Maintained confidence under over-temp
Flow Starvation	0.426	0.155	Correctly identified severe anomaly
Combined Faults	0.330	0.139	Detected multi-fault ambiguity
Extreme Noise	0.637	0.402	Degraded but functional
Mid-Startup Spike	0.931	0.665	Transient handled, 70% completion

Computational Overhead

Metric	Best	Worst	Median
Real time per run (μs)	1.5 (Rapid Cycling)	176.7 (Thermal Low)	~20 μs
Ticks per scenario	6 (Rapid Cycling)	501 (various timeouts)	~74 ticks

Key Findings

Nominal performance is excellent. Under clean conditions the kernel completes the full startup sequence with high confidence (TCI > 0.92) and low overhead (~14 μs per run).
Graceful degradation under noise. Even with 30–50% sensor noise, the kernel maintains meaningful confidence scores and determinism detection — though completion rates decline as expected.
Fault detection is reliable. Scenarios with severe anomalies (flow starvation, combined faults) correctly trigger reduced confidence and determinism scores, preventing unsafe progression.
Edge case identification. Several scenarios that fail to complete (low pressure, thermal low, slow ramp) correctly time out rather than progressing with insufficient evidence — demonstrating robust safety guardrails.
Computational efficiency is consistent. Per-run overhead stays in the low tens of microseconds across all scenarios, confirming suitability for real-time embedded deployment.

Visualizations

Charts are available in charts/ in both PNG and PDF formats:

File	Description
`01_completion_rate`	Bar chart of completion rates across all 16 scenarios
`02_mean_time_ms`	Horizontal bar chart of mean execution time per scenario
`03_tci_dscore`	Grouped bar chart comparing TCI and D-Score
`04_real_time_overhead`	Computational overhead with per-run scatter overlay
`05_ticks_distribution`	Box plot showing tick distribution variance across runs
`06_stability_error_bars`	Error bar chart showing stability (mean ± std dev)
`07_radar_key_scenarios`	Multi-metric radar chart for the 6 most diverse scenarios
`08_heatmap_normalised`	Normalised metrics heatmap (green = better performance)

Example preview — the completion rate chart:

Data Reports

File	Description
`report_comprehensive.json`	Full per-run data for all 16 scenarios × 10 runs (includes per-run ticks, scores, gate weights, verdict counts)
`report_summary.json`	Aggregated statistics only (mean, std dev, completion rate per scenario)

The JSON schema for report_summary.json entries:

{
  "scenario": "scenario_name",
  "description": "Human-readable description",
  "n_runs": 10,
  "mean_ticks": 53.50,
  "mean_time_ms": 525.0,
  "stddev_ticks": 13.52,
  "stddev_time_ms": 135.2,
  "completion_rate": 1.0000,
  "mean_avg_tci": 0.927,
  "mean_avg_dscore": 0.667,
  "mean_real_time_us": 13.60
}

Running the Benchmarks

Prerequisites

C99 compiler (GCC or Clang)
Python 3.8+ with matplotlib, numpy, and pandas installed
Linux environment with clock_gettime support (_POSIX_C_SOURCE=199309L)

Quick Start

# Build and run benchmark suite, generate charts
make bench_charts

# Or step by step:
make benchmark           # Build the benchmark binary
make bench_run           # Run benchmarks (generates JSON reports)
make bench_charts        # Run benchmarks + generate charts

Output

benchmarks/
├── README.md                     ← This file
├── GUIDE.md                      ← Public test harness user guide
├── LICENSE                       ← Axiom Public License v1.0
├── riptar_api.h                  ← Public kernel API header
├── riptar_kernel_obfuscated.c    ← Obfuscated kernel implementation (IP protected)
├── public_test.c                 ← Public test harness (try the engine!)
├── public_config.h               ← Engine configuration template
├── sample_input.csv              ← Sample nominal startup data
├── generate_test_data.py         ← Synthetic data generator
├── benchmark.c                   ← Benchmark source (public API only)
├── chart.py                      ← Python chart generator
├── bench_riptar                  ← Compiled benchmark binary
├── report_comprehensive.json     ← Full per-run data
├── report_summary.json           ← Aggregated summary
└── charts/
    ├── 01_completion_rate.png/pdf
    ├── 02_mean_time_ms.png/pdf
    ├── 03_tci_dscore.png/pdf
    ├── 04_real_time_overhead.png/pdf
    ├── 05_ticks_distribution.png/pdf
    ├── 06_stability_error_bars.png/pdf
    ├── 07_radar_key_scenarios.png/pdf
    └── 08_heatmap_normalised.png/pdf

Methodology

Each benchmark scenario simulates sensor input signals and feeds them through the kernel's public API. The simulation runs in a tight loop (one tick = 10 ms simulated time) until the kernel either:

Completes the full startup sequence (success)
Aborts due to timeout (5000 ticks) or a REJECT verdict

Timing is measured with clock_gettime(CLOCK_MONOTONIC) for high-resolution wall-clock overhead. Random seeds are derived from time() XOR'd with the scenario pointer and run index to ensure reproducible results.

The benchmark source (benchmark.c) calls only the kernel's public interface functions and data structures — no internal or proprietary logic is duplicated or exposed.

License & IP Notice

This benchmark suite measures the external behaviour of the Rip-tar Logic Kernel through its public API. No proprietary algorithms, internal source code, or confidential IP is disclosed. The benchmark source, reports, and charts are aggregate metrics only.

For questions, contact the project maintainers.

"A model that correctly issues REJECT on an ambiguous input is performing better than a model that confidently outputs the wrong answer."

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Rip-tar Logic Kernel — Benchmark Suite

Quick Start — Try the Engine Now

Overview

Test Scenarios

Results Summary

Completion Rates

Confidence & Determinism

Computational Overhead

Key Findings

Visualizations

Data Reports

Running the Benchmarks

Prerequisites

Quick Start

Output

Methodology

License & IP Notice

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
charts		charts
.gitignore		.gitignore
GUIDE.md		GUIDE.md
LICENSE		LICENSE
README.md		README.md
benchmark.c		benchmark.c
chart.py		chart.py
generate_test_data.py		generate_test_data.py
public_config.h		public_config.h
public_test.c		public_test.c
report_comprehensive.json		report_comprehensive.json
report_summary.json		report_summary.json
riptar_api.h		riptar_api.h
riptar_kernel_obfuscated.c		riptar_kernel_obfuscated.c
sample_input.csv		sample_input.csv

Folders and files

Latest commit

History

Repository files navigation

Rip-tar Logic Kernel — Benchmark Suite

Quick Start — Try the Engine Now

Overview

Test Scenarios

Results Summary

Completion Rates

Confidence & Determinism

Computational Overhead

Key Findings

Visualizations

Data Reports

Running the Benchmarks

Prerequisites

Quick Start

Output

Methodology

License & IP Notice

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages