
Copilot AI commented Nov 9, 2025

Static MITRE ATT&CK and D3FEND data files (127KB) were shipped in the package and required manual updates. This PR replaces them with dynamic loaders that fetch the latest data from the official sources on first use and cache it in memory for the lifetime of the process.

Changes

  • New modules: sigma/data/mitre_attack_data.py and sigma/data/mitre_d3fend_data.py

    • Download data from github.com/mitre-attack/attack-stix-data and D3FEND ontology
    • Parse STIX 2.0 and JSON-LD formats respectively
    • Use __getattr__ for backward-compatible module-level attribute access
    • Cache loaded data globally in _cache variable
  • Updated validators: Modified imports in sigma/validators/core/tags.py

    # Old
    from sigma.data.mitre_attack import mitre_attack_tactics, ...
    
    # New
    from sigma.data import mitre_attack_data
    # Access via: mitre_attack_data.mitre_attack_tactics
  • Test mocking: Added tests/conftest.py with autouse fixture

    • Monkeypatches _get_cached_data() to return mock data
    • Prevents network calls in test suite
  • Removed files:

    • sigma/data/mitre_attack.py (2471 lines)
    • sigma/data/mitre_d3fend.py (882 lines)
    • tools/update-*.py scripts (298 lines)

API Compatibility

Existing code continues to work unchanged. Validators initialize identically:

validator = ATTACKTagValidator()  # Triggers data load on first instantiation

Data loading is transparent to callers. Downloads use a 30-second timeout, and network failures are reported as a RuntimeError.
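One way such a download step with a timeout and a RuntimeError on failure might look is sketched below. It is an assumption-laden illustration rather than the PR's implementation: the helper name `fetch_json`, the constant name and the use of the standard library's `urllib` are choices made for this example, and the actual HTTP client used by the loaders is not stated in the description. The demo at the end reads from a temporary local file URL so that it runs without network access.

```python
import json
import pathlib
import tempfile
import urllib.request
from typing import Any

# 30 seconds, matching the timeout stated in the PR description.
DOWNLOAD_TIMEOUT_SECONDS = 30


def fetch_json(url: str, timeout: float = DOWNLOAD_TIMEOUT_SECONDS) -> Any:
    """Hypothetical helper: fetch a URL and decode its body as JSON.

    Connection problems, timeouts, malformed URLs and invalid JSON are all
    re-raised as RuntimeError so callers handle one exception type.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # URLError and TimeoutError derive from OSError; JSON and URL
        # parsing errors derive from ValueError.
        raise RuntimeError(f"Failed to load data from {url}: {exc}") from exc


# Offline demo: one readable file URL and one that does not exist.
failure_message = None
with tempfile.TemporaryDirectory() as tmp:
    sample = pathlib.Path(tmp) / "sample.json"
    sample.write_text('{"objects": []}', encoding="utf-8")
    loaded = fetch_json(sample.as_uri())
    try:
        fetch_json((pathlib.Path(tmp) / "missing.json").as_uri())
    except RuntimeError as exc:
        failure_message = str(exc)
```

Chaining with `from exc` keeps the underlying cause in the traceback, which helps when diagnosing whether a failure came from DNS, a timeout or malformed data.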

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • d3fend.mitre.org
    • Triggering command (dns block): python3 -c, running this script:

          import requests
          try:
              # Check if the D3FEND JSON ontology is accessible
              response = requests.get('REDACTED', timeout=10)
              print(f'D3FEND JSON status: {response.status_code}')
              print(f'Content-Type: {response.headers.get("Content-Type")}')
              if response.status_code == 200:
                  data = response.json()
                  print(f'Keys in response: {list(data.keys())[:5]}')
          except Exception as e:
              print(f'Error: {e}')

    • Triggering command: /home/REDACTED/.cache/pypoetry/virtualenvs/pysigma-Tjs8INP8-py3.12/bin/python /home/REDACTED/.cache/pypoetry/virtualenvs/pysigma-Tjs8INP8-py3.12/bin/pytest tests/ -q (dns block)


Original prompt

This section details the original issue you should resolve

<issue_title>Replace static MITRE ATT&CK and D3FEND data with on-demand access via API</issue_title>
<issue_description>Currently MITRE data is shipped statically in the pySigma library in the sigma.data package and must be updated regularly and is only used for the tag validators. The static data should be removed and instead the validator should access the latest live MITRE data by downloading it on demand from their respective repositories. The data should not be downloaded multiple times on instantiation of the validator that uses it. Instead the data should be cached in memory and reused on further instantiations of the validator. In the pySigma tests the data shouldn't be downloaded but mocked instead.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits November 9, 2025 10:02
- Created mitre_attack_data.py with on-demand ATT&CK data loading
- Created mitre_d3fend_data.py with on-demand D3FEND data loading
- Updated tag validators to use new data loaders
- Added test fixtures to mock data loading in tests
- All tests passing with mocked data

Co-authored-by: thomaspatzke <1845601+thomaspatzke@users.noreply.github.com>
- Removed sigma/data/mitre_attack.py (static data)
- Removed sigma/data/mitre_d3fend.py (static data)
- Removed tools/update-data-online.py (no longer needed)
- Removed tools/update-mitre_attack.py (no longer needed)
- Added tests/conftest.py with global mock fixtures
- All 1237 tests passing

Co-authored-by: thomaspatzke <1845601+thomaspatzke@users.noreply.github.com>
- Added proper type hints for Dict and List types
- Fixed return type for _get_external_id to handle Any properly
- All mypy checks passing

Co-authored-by: thomaspatzke <1845601+thomaspatzke@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Replace static MITRE ATT&CK and D3FEND data with API access" to "Replace static MITRE data with on-demand API loading" Nov 9, 2025
Copilot AI requested a review from thomaspatzke November 9, 2025 10:13
thomaspatzke marked this pull request as ready for review November 9, 2025 11:12
thomaspatzke merged commit 07eed3b into main Nov 9, 2025
48 checks passed
