
YAML administration data sources

Ruben Bouman edited this page Oct 4, 2022 · 21 revisions

In this YAML file you can administrate your available data sources and score their quality. You can find more information on data quality here.

Sample file: data-sources-endpoints.yaml

Current version: version 1.1

Version 1.0 can be found here.

File content

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| version | string | yes | Version of this data source administration file. The current version is 1.1. |
| file_type | string | yes | Indicates the type of YAML file. Possible values: data-source-administration, technique-administration and group-administration. For data source administration the value should be data-source-administration. |
| name | string | yes | Describes the type of assets whose data sources you administrate, e.g. endpoints. It is just a name that is used in several places in the output. |
| domain | string | yes (defaults to enterprise-attack) | Specifies the ATT&CK domain using the value enterprise-attack, ics-attack or mobile-attack. |
| systems | list of system objects | yes | Contains a list of applicable_to values, as used within the data source details object, combined with the relevant ATT&CK platforms. |
| data_sources | list of data source global objects | yes | Contains all the data sources for which information is administrated. See the description of the data source global object. |
| exceptions | list of ATT&CK technique IDs | no | Contains a list of ATT&CK technique IDs. Adding an ID removes that technique from any derived output. For example, it will be excluded when generating a technique YAML administration file using: python dettect.py ds -f sample-data/data-sources-endpoints.yaml --yaml |
| notes | string | no | An optional field for notes on this data source administration file. |
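
As a rough illustration of how the top-level keys fit together, the skeleton below shows one possible layout. The values (the asset name, the system type) are assumptions chosen for this sketch and are not taken from the shipped sample file. The version is left unquoted here, which is also an assumption.

```yaml
# Illustrative skeleton only; values are hypothetical.
version: 1.1
file_type: data-source-administration
name: endpoints                          # free-form name used in the output
domain: enterprise-attack                # or ics-attack / mobile-attack
systems:
  - applicable_to: Windows workstations  # hypothetical system type
    platform: Windows
data_sources:
  - data_source_name: Process Creation
    data_source: []                      # placeholder; normally holds data source details objects
# The optional keys exceptions and notes are omitted in this sketch.
```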

Systems

This object specifies the ATT&CK platforms that apply to the applicable_to values as used within the data source details object.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| applicable_to | string | yes | Specifies the type of system. This is a free-format text field, so any string value is valid except 'all'. The value 'all' is reserved and cannot be used here. It can be used in the applicable_to key-value pair of a data source details object, where it makes that object apply to every type of system (i.e. every applicable_to value) specified in this Systems object. |
| platform | string or list of strings | yes | Indicates the ATT&CK platform(s) relevant for this particular applicable_to value. Possible values (in the list) are the MITRE ATT&CK platform values, or 'all' to select all platforms: PRE, Windows, Linux, macOS, Office 365, Azure AD, Google Workspace, IaaS, SaaS, Network, Containers. |
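
A minimal sketch of a systems list, assuming three hypothetical system types. It shows the three forms the platform value can take according to the description above: a single string, a list of strings, and 'all'.

```yaml
systems:
  - applicable_to: Windows workstations   # hypothetical free-text system type
    platform: Windows                     # a single platform as a string
  - applicable_to: Linux servers          # hypothetical
    platform: [Linux, Containers]         # several platforms as a list
  - applicable_to: Crown jewel X          # hypothetical
    platform: all                         # selects every platform
```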

Data source global object

This object only contains the name of the data source and the key-value pair data_source, which includes all the details regarding the data source within a data source details object.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| data_source_name | string | yes | The name of the data source according to MITRE ATT&CK. E.g. Process Creation. A specific data source name value can only be part of a data source administration file once. |
| data_source | list of data source details objects | yes | Contains the details regarding data sources divided using the key-value pair applicable_to. |
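
The nesting might look as follows: one data source global object per data source name, holding one data source details object per group of systems. This is a sketch; the applicable_to values are hypothetical and the remaining details fields are left out for brevity.

```yaml
data_sources:
  - data_source_name: Process Creation      # may appear only once in the file
    data_source:
      - applicable_to: [Windows workstations]   # hypothetical system type
        # ...remaining data source details fields
      - applicable_to: [Linux servers]          # hypothetical system type
        # ...remaining data source details fields
```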

Data source details object

You can have multiple data source details objects to score the same data source (e.g. Process monitoring), but for different types of systems (e.g. Windows endpoints, Crown jewel X, Linux servers, Azure IaaS, etc.). The key-value pair applicable_to indicates the type of system(s) the details apply to. Please note that a system can only be part of one applicable_to key-value pair for the same data source.

We recommend using the same applicable_to values between your data source and technique administration file. More info on this topic can be found on the Wiki.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| applicable_to | list of strings | yes | Specifies the type(s) of system this data source applies to, chosen from the applicable_to values of the systems objects. Use the value ['all'] to let it apply to every type of system. When the value is set to ['all'], no other values should be used. |
| date_registered | date yyyy-mm-dd | yes | Date of registration of the data source in this YAML file. |
| date_connected | date yyyy-mm-dd | yes | Date when the data source was connected to your security data lake. This date is used to draw a graph indicating the progress of connected data sources. |
| products | list | yes | A list with one or more products where the data source data is located. E.g. Windows event log. |
| available_for_data_analytics | boolean | yes | Indicates if the data source is available in such a way that blue teamers can use it for data analytics. This property has no impact on the output generated by the DeTTECT CLI. It can influence the output when used within an EQL query. |
| comment | string | yes | An option to comment on this data source. For a multiline comment in the Excel output, we recommend using the YAML pipe character (\|). For more info have a look at: https://yaml-multiline.info/. |
| data_quality | data quality object | yes | The scores on the five different data quality dimensions. See the description of the data quality object. |
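
A single data source details object could be written roughly like this. The dates, product and system type are made up for illustration. The comment uses the YAML literal block style so that the line breaks are kept.

```yaml
# One entry of the data_source list; all values are hypothetical.
- applicable_to: [Windows workstations]
  date_registered: 2022-01-10
  date_connected: 2022-02-01
  products: [Windows event log]
  available_for_data_analytics: true
  comment: |
    First line of the comment.
    Second line of the comment.
  data_quality: {}   # placeholder; the five scores are described in the data quality object
```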

Data quality object

The five data quality dimensions are explained here.

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| device_completeness | int | yes | Score between 0-5. Scoring this aspect is explained in a separate section. |
| data_field_completeness | int | yes | Score between 0-5. Scoring this aspect is explained in a separate section. |
| timeliness | int | yes | Score between 0-5. Scoring this aspect is explained in a separate section. |
| consistency | int | yes | Score between 0-5. Scoring this aspect is explained in a separate section. |
| retention | int | yes | Score between 0-5. Scoring this aspect is explained in a separate section. |
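
For orientation, a data quality object with all five dimensions scored might look like this. The scores are arbitrary example values, not recommendations.

```yaml
data_quality:
  device_completeness: 3        # each score is an integer from 0 to 5
  data_field_completeness: 4
  timeliness: 5
  consistency: 2
  retention: 3
```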