Hunting - add generate-json command #4613
Conversation
Enhancement - Guidelines

These guidelines serve as a reminder of considerations to address when adding a feature to the code.

- Documentation and Context
- Code Standards and Practices
- Testing
- Additional Checks
`hunting/json.py` (Outdated)

```python
        "license": hunt_config.license
    }

    return json_data
```
Just checking: is there a particular reason to use this effectively customized JSON format? The code generally looks fine if we want to use this format; however, if not, it could be substantially simpler.

For instance, in your `generate_json` you could load each TOML file as a `Hunt` dataclass object and use the built-in `json.dumps` function to convert the dataclass to JSON.
E.g.

```python
import json
from dataclasses import asdict
from hunting.definitions import Hunt

# Assuming you have a Hunt object
hunt = Hunt(
    author="Elastic",
    description="Example hunt",
    integration=["integration1", "integration2"],
    uuid="123e4567-e89b-12d3-a456-426614174000",
    name="Example Hunt",
    language=["esql"],
    license="Elastic License",
    query=["from logs | stats count() by host.name"],
    notes=["Example note"],
    mitre=["T1003"],
    references=["https://example.com"]
)

# Convert Hunt object to JSON
hunt_json = json.dumps(asdict(hunt), indent=4)
print(hunt_json)
```

Would result in
```
❯ python test_hunt.py
{
    "author": "Elastic",
    "description": "Example hunt",
    "integration": [
        "integration1",
        "integration2"
    ],
    "uuid": "123e4567-e89b-12d3-a456-426614174000",
    "name": "Example Hunt",
    "language": [
        "esql"
    ],
    "license": "Elastic License",
    "query": [
        "from logs | stats count() by host.name"
    ],
    "notes": [
        "Example note"
    ],
    "mitre": [
        "T1003"
    ],
    "references": [
        "https://example.com"
    ]
}
```

This could be called in `generate_json` on a glob of the TOML files in the provided directory, and you could write the JSON objects in a similar way to how you are writing them now.
You're right, I was over-complicating it because I had copied the Markdown converter and was mimicking the Markdown structure, which was completely unnecessary. I've implemented your recommendation 👍
If we are OK with the less complex JSON structure, have you considered not using any of the code in `hunting/json.py` and adding the functionality with something similar to the following?
```python
@hunting.command('generate-json')
@click.option('--path', type=Path, default=None, help="Path to a TOML file or directory containing TOML files.")
@click.option(
    '--output-folder',
    type=Path,
    default=Path("json"),
    show_default=True,
    help="Output folder to save the generated JSON files. Defaults to './json'."
)
def generate_json(path: Path = None, output_folder: Path = None):
    """Convert TOML hunting queries to JSON format and save to output folder."""
    output_folder = Path(output_folder)
    output_folder.mkdir(parents=True, exist_ok=True)

    # Determine the list of files to process
    if path:
        path = Path(path)
        if path.is_file() and path.suffix == '.toml':
            files_to_process = [path]
        elif path.is_dir():
            files_to_process = list(path.glob('*.toml'))
        else:
            raise ValueError(f"Invalid path provided: {path}")
    else:
        raise ValueError("Path must be provided as a file or directory.")

    # Process each file
    for file_path in files_to_process:
        hunt_contents = load_toml(file_path)
        json_hunt_contents = json.dumps(asdict(hunt_contents), indent=4)
        output_file = output_folder / f"{file_path.stem}.json"
        with open(output_file, 'w') as f:
            f.write(json_hunt_contents)
        click.echo(f"Generated JSON: {output_file}")
```

```python
markdown_generator.update_index_md()


@hunting.command('generate-json')
@click.argument('path', required=False)
```
```python
from pathlib import Path
...

@click.argument("path", type=click.Path(dir_okay=True, path_type=Path, exists=True))
```

would ensure the argument `path` has the correct type, so there will be no need for the forced conversion below:

```python
path = Path(path)
```

```python
from .definitions import Hunt
from .utils import load_index_file, load_toml


class JSONGenerator:
```
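Not from the PR, but the `click.Path(path_type=Path)` suggestion can be verified with a standalone sketch using `click.testing.CliRunner` (the `show` command here is illustrative, not part of the repo):

```python
from pathlib import Path

import click
from click.testing import CliRunner


@click.command()
@click.argument("path", type=click.Path(dir_okay=True, path_type=Path, exists=True))
def show(path: Path):
    # click has already converted the argument to a pathlib.Path,
    # so no manual Path(path) conversion is needed in the callback
    click.echo(type(path).__name__)


result = CliRunner().invoke(show, ["."])
print(result.output)
```

With `exists=True`, click also rejects nonexistent paths before the callback ever runs, which replaces the manual `ValueError` for invalid paths.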
There is really no need to create a class here: the grouping of the logic is achieved by using a separate module (though it's better to rename it to avoid confusion with the default `json` package), and we have no use for state beyond this module.
I suggest refactoring this as separate stateless functions.
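The stateless refactor suggested above could look something like the following sketch. The `Hunt` dataclass here is a trimmed-down, hypothetical stand-in for `hunting.definitions.Hunt`, and the function names are illustrative, not taken from the PR:

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


# Hypothetical trimmed-down stand-in for hunting.definitions.Hunt
@dataclass
class Hunt:
    author: str
    name: str
    uuid: str
    language: list = field(default_factory=list)
    query: list = field(default_factory=list)


def convert_hunt_to_json(hunt_config: Hunt) -> str:
    """Serialize a Hunt dataclass to a JSON string via asdict."""
    return json.dumps(asdict(hunt_config), indent=4)


def save_json(hunt_config: Hunt, source_path: Path, output_folder: Path) -> Path:
    """Write the JSON file, mirroring the source TOML file's stem."""
    output_folder.mkdir(parents=True, exist_ok=True)
    output_file = output_folder / f"{source_path.stem}.json"
    output_file.write_text(convert_hunt_to_json(hunt_config))
    return output_file
```

Plain functions like these carry no state between calls, so the module boundary alone provides the grouping the class was used for.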
Pull Request

Summary - What I changed

Add a `generate-json` command which converts the hunting TOML files to JSON. JSON files are added to the `json` directory in the same way docs generates the `docs` folder. I have added this output to the git ignore.

I have added this because I am investigating adding these prebuilt queries to Kibana and wanted them in JSON format.
How To Test
Checklist

- `bug`, `enhancement`, `schema`, `maintenance`, `Rule: New`, `Rule: Deprecation`, `Rule: Tuning`, `Hunt: New`, or `Hunt: Tuning` so guidelines can be generated