[1.x] Introduce --strict flag (#937) (#975)
ebeahan committed Sep 23, 2020
1 parent d5820b9 commit e6ba4c4
Showing 7 changed files with 99 additions and 10 deletions.
5 changes: 3 additions & 2 deletions CHANGELOG.next.md
@@ -10,8 +10,6 @@ Thanks, you're awesome :-) -->

### Schema Changes

* Added `threat.technique.subtechnique` to capture MITRE ATT&CK® subtecqhniques. #951

#### Breaking changes

#### Bugfixes
@@ -21,6 +19,7 @@ Thanks, you're awesome :-) -->
#### Added

* Added Mime Type fields to HTTP request and response. #944
* Added `threat.technique.subtechnique` to capture MITRE ATT&CK® subtechniques. #951

#### Improvements

@@ -38,6 +37,8 @@ Thanks, you're awesome :-) -->

#### Added

* Introduced `--strict` flag to perform stricter schema validation when running the generator script. #937

#### Improvements

* Field details Jinja2 template components have been consolidated into one template #897
2 changes: 1 addition & 1 deletion Makefile
@@ -61,7 +61,7 @@ generate: legacy_use_cases codegen generator
# Run the new generator
.PHONY: generator
generator:
$(PYTHON) scripts/generator.py --include "${INCLUDE}"
$(PYTHON) scripts/generator.py --strict --include "${INCLUDE}"

# Generate Go code from the schema.
.PHONY: gocodegen
45 changes: 45 additions & 0 deletions USAGE.md
@@ -29,6 +29,7 @@ relevant artifacts for their unique set of data sources.
+ [Subset](#subset)
+ [Ref](#ref)
+ [Mapping & Template Settings](#mapping--template-settings)
+ [Strict Mode](#strict-mode)
+ [Intermediate-Only](#intermediate-only)

## Terminology
@@ -294,6 +295,50 @@ The `--template-settings` argument defines [index level settings](https://www.el

For `template.json`, the `mappings` object is left empty: `{}`. Likewise the `properties` object remains empty in the `mapping.json` example. This will be filled in automatically by the script.

#### Strict Mode

The `--strict` argument enables "strict mode". Strict mode performs a stricter validation step against the schema's contents.

Basic usage:

```
$ python scripts/generator.py --strict
```

Strict mode requires the following conditions; otherwise the script exits with an exception:

* Short descriptions must be less than or equal to 120 characters.
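
The check described above can be sketched as follows. This is a minimal illustration, not the generator's actual code: the function name `check_short_description` and its return value are hypothetical, and the multi-line test is included on the assumption that a short description should also fit on one line.

```python
import warnings

SHORT_LIMIT = 120  # limit quoted in the condition above


def check_short_description(name, short, strict=False):
    """Hypothetical helper: raise in strict mode, warn otherwise.

    Returns True when the description passes, False when it only warned.
    """
    if "\n" in short or len(short) > SHORT_LIMIT:
        msg = (
            f"Short description of '{name}' must be a single line of at most "
            f"{SHORT_LIMIT} characters (current length: {len(short)})."
        )
        if strict:
            # Strict mode: stop the run with an exception.
            raise ValueError(msg)
        # Non-strict mode: report the problem and carry on.
        warnings.warn(msg)
        return False
    return True
```

A description of exactly 120 characters passes in both modes; one of 121 characters raises `ValueError` with `strict=True` and emits a `UserWarning` otherwise.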

The current artifacts generated and published in the ECS repo will always be created using strict mode. However, attempting to generate older ECS versions (pre `v1.5.0`) with `--strict` will raise
an exception, because the schema validation checks were introduced after those versions were released.

Example:

```
$ python scripts/generator.py --ref v1.4.0 --strict
Loading schemas from git ref v1.4.0
Running generator. ECS version 1.4.0
...
ValueError: Short descriptions must be single line, and under 120 characters (current length: 134).
Offending field or field set: number
Short description:
Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.
```

Removing `--strict` will display a warning message, but the script will finish its run successfully:

```
$ python scripts/generator.py --ref v1.4.0
Loading schemas from git ref v1.4.0
Running generator. ECS version 1.4.0
/Users/ericbeahan/dev/ecs/scripts/generators/ecs_helpers.py:176: UserWarning: Short descriptions must be single line, and under 120 characters (current length: 134).
Offending field or field set: number
Short description:
Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet.
This will cause an exception when running in strict mode.
```
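
Because the non-strict message is reported as a `UserWarning`, it goes through Python's standard `warnings` machinery and can be escalated by a warning filter. The sketch below illustrates that relationship only; `emit_schema_warning` is a hypothetical stand-in for the generator's check, and `run_with_warnings_as_errors` is not part of the ECS scripts.

```python
import warnings


def emit_schema_warning():
    # Hypothetical stand-in for a non-strict schema check.
    warnings.warn("Short descriptions must be single line, and under 120 characters")
    return "finished"


def run_with_warnings_as_errors(func):
    """Run func with UserWarning promoted to an exception.

    catch_warnings() restores the previous filters on exit, so the
    escalation does not leak into the rest of the program.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error", UserWarning)
        return func()
```

Called directly, `emit_schema_warning()` completes its run; wrapped in `run_with_warnings_as_errors`, the same call raises `UserWarning`.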

#### Intermediate-Only

The `--intermediate-only` argument is used for debugging purposes. It only generates the ["intermediate files"](generated/ecs), `ecs_flat.yml` and `ecs_nested.yml`, without generating the rest of the artifacts.
4 changes: 3 additions & 1 deletion scripts/generator.py
@@ -41,7 +41,7 @@ def main():
# ecs_helpers.yaml_dump('ecs.yml', fields)

fields = loader.load_schemas(ref=args.ref, included_files=args.include)
cleaner.clean(fields)
cleaner.clean(fields, strict=args.strict)
finalizer.finalize(fields)
fields = subset_filter.filter(fields, args.subset, out_dir)
nested, flat = intermediate_files.generate(fields, os.path.join(out_dir, 'ecs'), default_dirs)
@@ -72,6 +72,8 @@ def argument_parser():
help='index template settings to use when generating elasticsearch template')
parser.add_argument('--mapping-settings', action='store',
help='mapping settings to use when generating elasticsearch template')
parser.add_argument('--strict', action='store_true',
help='enforce stricter checking at schema cleanup')
args = parser.parse_args()
# Clean up empty include of the Makefile
if args.include and [''] == args.include:
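
As an aside on the `--strict` argument added in this hunk: with `action='store_true'`, the parsed attribute defaults to `False` and becomes `True` only when the flag is given. A minimal standalone sketch, where `build_parser` is a hypothetical name and the real `argument_parser()` defines more options than shown:

```python
import argparse


def build_parser():
    """Hypothetical parser containing only the --strict flag."""
    parser = argparse.ArgumentParser(description="sketch of a strict flag")
    # store_true: args.strict is False unless --strict is on the command line
    parser.add_argument('--strict', action='store_true',
                        help='enforce stricter checking at schema cleanup')
    return parser
```

`build_parser().parse_args([]).strict` is `False`, and `build_parser().parse_args(['--strict']).strict` is `True`, which is the value `main()` passes on to `cleaner.clean()`.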
15 changes: 15 additions & 0 deletions scripts/generators/ecs_helpers.py
@@ -2,6 +2,7 @@
import os
import yaml
import git
import warnings

from collections import OrderedDict
from copy import deepcopy
@@ -159,3 +160,17 @@ def list_extract_keys(lst, key_name):
def is_intermediate(field):
'''Encapsulates the check to see if a field is an intermediate field or a "real" field.'''
return ('intermediate' in field['field_details'] and field['field_details']['intermediate'])


# Warning helper


def strict_warning(msg):
"""Call warnings.warn(msg) for operations that would throw an Exception
if operating in `--strict` mode. Allows a custom message to be passed.
:param msg: custom text which will be displayed with wrapped boilerplate
for strict warning messages.
"""
warn_message = f"{msg}\n\nThis will cause an exception when running in strict mode."
warnings.warn(warn_message)
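
One way to inspect what `strict_warning` emits is to record warnings while it runs. The sketch below repeats the helper's body from the hunk above so that it is self-contained; `collect_strict_warnings` is a hypothetical utility, not part of `ecs_helpers.py`.

```python
import warnings


def strict_warning(msg):
    # Same behaviour as the helper added in this commit.
    warn_message = f"{msg}\n\nThis will cause an exception when running in strict mode."
    warnings.warn(warn_message)


def collect_strict_warnings(func, *args):
    """Call func and return the text of every warning it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        func(*args)
    return [str(w.message) for w in caught]
```

Calling `collect_strict_warnings(strict_warning, "Too long")` returns a one-element list whose message begins with the custom text and ends with the strict-mode notice.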
18 changes: 12 additions & 6 deletions scripts/schema/cleaner.py
@@ -19,7 +19,9 @@
# deal with final field names either.


def clean(fields):
def clean(fields, strict=False):
global strict_mode
strict_mode = strict
visitor.visit_fields(fields, fieldset_func=schema_cleanup, field_func=field_cleanup)


@@ -46,7 +48,7 @@ def schema_cleanup(schema):
else:
schema['schema_details']['prefix'] = schema['field_details']['name'] + '.'
normalize_reuse_notation(schema)
# Final validity check
# Final validity check if in strict mode
schema_assertions_and_warnings(schema)


@@ -73,7 +75,7 @@ def schema_mandatory_attributes(schema):

def schema_assertions_and_warnings(schema):
'''Additional checks on a fleshed out schema'''
single_line_short_description(schema)
single_line_short_description(schema, strict=strict_mode)


def normalize_reuse_notation(schema):
@@ -165,7 +167,8 @@ def field_mandatory_attributes(field):
def field_assertions_and_warnings(field):
'''Additional checks on a fleshed out field'''
if not ecs_helpers.is_intermediate(field):
single_line_short_description(field)
# check short description length if in strict mode
single_line_short_description(field, strict=strict_mode)
if field['field_details']['level'] not in ACCEPTABLE_FIELD_LEVELS:
msg = "Invalid level for field '{}'.\nValue: {}\nAcceptable values: {}".format(
field['field_details']['name'], field['field_details']['level'],
@@ -178,12 +181,15 @@ def field_assertions_and_warnings(field):
SHORT_LIMIT = 120


def single_line_short_description(schema_or_field):
def single_line_short_description(schema_or_field, strict=True):
short_length = len(schema_or_field['field_details']['short'])
if "\n" in schema_or_field['field_details']['short'] or short_length > SHORT_LIMIT:
msg = "Short descriptions must be single line, and under {} characters (current length: {}).\n".format(
SHORT_LIMIT, short_length)
msg += "Offending field or field set: {}\nShort description:\n {}".format(
schema_or_field['field_details']['name'],
schema_or_field['field_details']['short'])
raise ValueError(msg)
if strict:
raise ValueError(msg)
else:
ecs_helpers.strict_warning(msg)
20 changes: 20 additions & 0 deletions scripts/tests/unit/test_schema_cleaner.py
@@ -262,6 +262,26 @@ def test_multiline_short_description_raises(self):
with self.assertRaisesRegex(ValueError, 'single line'):
cleaner.single_line_short_description(schema)

def test_very_long_short_description_warns_strict_disabled(self):
schema = {'field_details': {
'name': 'fake_schema',
'short': "Single line but really long. " * 10}}
try:
with self.assertWarnsRegex(UserWarning, r'under 120 characters \(current length: 290\)'):
cleaner.single_line_short_description(schema, strict=False)
except Exception:
self.fail("cleaner.single_line_short_description() raised Exception unexpectedly.")

def test_multiline_short_description_warns_strict_disabled(self):
schema = {'field_details': {
'name': 'fake_schema',
'short': "multiple\nlines"}}
try:
with self.assertWarnsRegex(UserWarning, 'single line'):
cleaner.single_line_short_description(schema, strict=False)
except Exception:
self.fail("cleaner.single_line_short_description() raised Exception unexpectedly.")

def test_clean(self):
'''A high level sanity test'''
fields = self.schema_process()