The Security Policy Evaluation Framework is a testing and benchmarking system designed to evaluate the robustness and correctness of various authorization policy engines. It provides a consistent, automated environment for executing policy test cases across multiple languages. This framework is primarily intended for researchers, security engineers, and policy developers who want to benchmark how different policy engines behave under predefined test conditions.
Currently, the framework supports the following policy engines:

- Cedar
- OpenFGA
- Rego
- Teleport ACD
The goal is to provide a common interface to evaluate each language's response to a series of security-related scenarios.
To set up the framework:

```bash
git clone https://github.com/doyensec/policy-languages-framework.git
cd policy-languages-framework
pip install -r requirements.txt
```
Alternatively, you might want to use a Python virtual environment (the steps are consolidated in the sketch below):

- Create a virtual environment using `python3 -m venv path/to/venv`. Make sure you have `python3-full` installed.
- Then, use `path/to/venv/bin/pip` to install all dependencies.
- Finally, run the software using `path/to/venv/bin/python`.
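Taken together, and assuming the repository root as the working directory, those steps look roughly like this:

```bash
# Create a dedicated virtual environment ("path/to/venv" is a placeholder;
# the python3-full package is needed on Debian/Ubuntu-style systems).
python3 -m venv path/to/venv

# Install the dependencies into the virtual environment.
path/to/venv/bin/pip install -r requirements.txt

# Run the framework with the virtual environment's interpreter.
path/to/venv/bin/python main.py
```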
Please note that this tool has been tested on macOS 15.4.1 (arm64).
Note: Docker must be running on your system. Docker is required to execute policy evaluations within isolated containerized environments.
To start running the framework:

```bash
python main.py
```
Note: This assumes that Docker has been installed with the post-installation steps to allow non-privileged users to run Docker commands.
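A quick, non-authoritative way to confirm that the current user can reach the Docker daemon without `sudo`:

```bash
# Should print client and daemon details; a "permission denied" error usually
# means the non-root post-installation steps are missing.
docker info
```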
The following command-line options are available:

- `--start`: (Optional) Integer ID of the first test case to execute. Defaults to the first available.
- `--max`: (Optional) Integer ID of the last test case to execute.
- `--only`: (Optional) Comma-separated list of specific test case IDs to run (e.g., `--only 01,03,07`).
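For example, assuming the options behave as described above, a run can be restricted like this:

```bash
# Run test cases 3 through 10 (IDs are illustrative).
python main.py --start 3 --max 10

# Run only a specific subset of test cases.
python main.py --only 01,03,07
```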
Note: This framework spawns containers to evaluate test cases in isolated environments. For proper execution, the following ports must be free and available on the host system, as they are assigned to the respective policy engine containers:
- `8911` → Cedar
- `8910`, `8912` → OpenFGA
- `8913` → Rego
- `8914` → Teleport ACD

If needed, these ports can be customized in `main.py`.
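One way to verify that none of these ports are already taken (this check is not part of the framework itself):

```bash
# List any process already listening on the container ports; no output means
# the ports are free.
lsof -nP -iTCP:8910-8914 -sTCP:LISTEN
```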
The framework produces a final HTML report summarizing test case results. Each test case is evaluated independently per policy engine, and results are recorded in a matrix table. Results are saved within the `policy-languages-framework/results` folder and can be easily displayed with any browser.
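For instance, on macOS the most recent report could be opened like this (the `*.html` file name pattern is an assumption; adjust it to whatever the framework actually generates):

```bash
# Open the newest HTML report from the results folder in the default browser
# (`open` is macOS-specific; use xdg-open or a browser directly on Linux).
open "$(ls -t results/*.html | head -n 1)"
```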
Possible test outcomes include:
- PASS: The policy engine produced the expected output under correct conditions.
- FAIL: The policy engine produced an output that contradicts the expected result.
- NOT APPLICABLE: The test case is not relevant or cannot be executed for the given engine.
- ERROR: An internal error occurred during test execution, such as malformed input or unsupported constructs.
Each result is presented in a tabular HTML file automatically generated at the end of execution.
The following table presents the current results of all implemented test cases evaluated across the supported policy engines. Each row represents a specific test case, while each column corresponds to a policy engine. The results indicate whether the engine passed, failed, errored, or was not applicable for the given scenario.
- NOT APPLICABLE: The test case is not relevant or implementable for this specific policy engine, either due to architectural limitations or incompatibility with the engine's capabilities. No logical test case could be meaningfully defined.
- PASS (Predefined Result) / FAIL (Predefined Result): The policy engine was not capable of executing a meaningful logic-based test for this case. A static result was assigned based on known behavior, documentation, or limitations, instead of a runtime-evaluated policy scenario.
To add a new test case:

- Create a new folder under `testcases/` named using the pattern `testcase-XX`, where `XX` is the next available numerical index.
- Inside this folder, create a `manifest.yaml` file with the following structure:
```yaml
id: testcase-XX
scenario: <short scenario name>
description: <detailed description>
rego:
  - query: <query_file>
    policy: <policy_file>
    expected_result:
      - status: success|error
        condition: <condition to check>
cedar:
  - entities: <entities_file>
    query: <query_file>
    expected_result:
      - status: success|error
        condition: <optional condition>
openfga:
  - authorization_model: <model_file>
    tuples: <tuples_file>
    query: <query_file>
    expected_result:
      - status: success|error
        condition: <optional condition>
teleportacd:
  - type: <evaluation_type>
    config: <config_file>
    expected_result:
      - status: success|error
```
Each field (e.g., `query`, `policy`, `entities`) should reference a file located within a subdirectory named after the policy engine (e.g., `rego/`, `cedar/`, etc.).
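A hypothetical directory layout for a new test case might look like this (the test case number and file placement are placeholders; only the engines you implement need to exist):

```bash
# Hypothetical layout:
#
#   testcases/testcase-07/
#   ├── manifest.yaml
#   ├── rego/         <- files referenced by the rego entry
#   ├── cedar/        <- files referenced by the cedar entry
#   ├── openfga/      <- files referenced by the openfga entry
#   └── teleportacd/  <- files referenced by the teleportacd entry
#
mkdir -p testcases/testcase-07/{rego,cedar,openfga,teleportacd}
```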
Each engine entry must define an `expected_result` with:

- `status`: Indicates whether the evaluation is a success (evaluation completed and produced output) or an error (evaluation failed).
- `condition` (optional): A logical condition to assert on the evaluation result JSON.
Example condition:

```yaml
condition: decision["result"] == "allow"
```

Supports functions like `contains()`, `isdigit()`, `endswith()`, `doesnotcontain()`, etc.
If the test case is not applicable to an engine, omit the engine entry from the manifest.
| Test Case ID | Description |
|---|---|
| testcase-01 | Policy Engine Must Enforce Deny Rules Even When Runtime Errors Occur |
| testcase-02 | Arithmetic Overflow/Underflow and Missing Entities Cause Validation Errors |
| testcase-03 | Handling Undefined Values in Deny/Allow Rules Without Impacting Policy Decisions |
| testcase-04 | Negations on Undefined Values Do Not Cause Expected Denials |
| testcase-05 | Policy Must Produce Explicit Forbid/Allow |
| testcase-06 | Built-in Functions Do Not Introduce Side-Effects or Non-Deterministic Behavior |

This is only a preview. For a full list, see `testcases.md`.
See the `LICENSE` file for details.
This framework builds upon concepts and threat modeling research by Trail of Bits.
This project was a collaboration between Teleport and Doyensec. The framework was created by the Doyensec team with inspiration and funding from Teleport.