Automatic analysis tasks #663
…etch into 662-auto-similarity-scorer
LG, mostly just some small things.
timesketch/api/v1/resources.py
Outdated
task_directory = {u'plaso': run_plaso,
                  u'csv': run_csv_jsonl,
                  u'jsonl': run_csv_jsonl}
from timesketch.lib import tasks
Why not just import this at the top?
This is how Celery works together with Flask: the import has to happen here, inside the function, to avoid circular imports.
Ok, cool.
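For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of a function-level import breaking an import cycle. The module names `app_mod` and `tasks_mod` are hypothetical stand-ins written to a temporary directory; this is not the Timesketch layout, only an illustration of the idea.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical module: imports tasks_mod lazily, inside the function.
APP_SOURCE = textwrap.dedent('''
    APP_NAME = "demo"

    def upload():
        # Deferred import: tasks_mod imports app_mod at module level, so
        # importing it at the top of this file would create a cycle.
        import tasks_mod
        return tasks_mod.run_task()
''')

# Hypothetical module: needs app_mod at import time.
TASKS_SOURCE = textwrap.dedent('''
    from app_mod import APP_NAME

    def run_task():
        return "task ran for " + APP_NAME
''')


def demo():
    """Writes both modules to a temp dir, imports app_mod and calls upload()."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("app_mod", APP_SOURCE),
                             ("tasks_mod", TASKS_SOURCE)):
            with open(os.path.join(tmp, name + ".py"), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            app_mod = importlib.import_module("app_mod")
            return app_mod.upload()
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("app_mod", None)
            sys.modules.pop("tasks_mod", None)
```

By the time `upload()` runs, `app_mod` is fully initialized, so the `from app_mod import APP_NAME` in `tasks_mod` succeeds.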
timesketch/api/v1/resources.py
Outdated
# If enabled, run sketch analyzers when timeline is added.
try:
    if current_app.config[u'ENABLE_SKETCH_ANALYZERS']:
        from timesketch.lib import tasks
Same import question here.
Same as above
    sketch_id: The Sketch ID.
"""
self.id = sketch_id
self.sql_sketch = SQLSketch.query.get(sketch_id)
How safe is this? Should this get wrapped in a try/except block, or will an appropriate exception already be raised?
SQLAlchemy returns None if the sketch doesn't exist. That would raise later in the code path anyway, but I have added an early raise here as well.
Cool. Maybe add a Raises: section in the docstring too then? :)
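A minimal sketch of the early raise plus the matching `Raises:` docstring section, under stated assumptions: `_FAKE_SKETCHES` and `_lookup` are hypothetical stand-ins for the database query, and `RuntimeError` is an assumed exception type, not necessarily what the PR uses.

```python
# Hypothetical stand-in for the database: maps sketch IDs to records.
_FAKE_SKETCHES = {1: {"name": "Incident 42"}}


def _lookup(sketch_id):
    """Mimics a primary-key query that returns None when nothing matches."""
    return _FAKE_SKETCHES.get(sketch_id)


class Sketch(object):
    """Illustrative wrapper around a stored sketch."""

    def __init__(self, sketch_id):
        """Initializes the sketch wrapper.

        Args:
            sketch_id: The Sketch ID.

        Raises:
            RuntimeError: If no sketch with the given ID exists.
        """
        self.id = sketch_id
        self.sql_sketch = _lookup(sketch_id)
        # Fail early instead of hitting an AttributeError on None later.
        if self.sql_sketch is None:
            raise RuntimeError("No such sketch: {0!s}".format(sketch_id))
```

Raising in the constructor keeps the failure close to its cause, and the docstring tells callers what to catch.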
timesketch/lib/tasks.py
Outdated
analyzer_class = manager.AnalysisManager.get_analyzer(analyzer_name)
analyzer = analyzer_class(index_name=index_name, **kwargs)
result = analyzer.run_wrapper()
logging.info('[%s] result: %s' % (analyzer_name, result))
Any reason you're not using {0:s} and .format() here? I thought it might be because it's a logging line, but there are instances of using format() in logging lines in this PR.
Fixed.
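For reference, the `.format()` variant suggested above could look roughly like this. The helper `format_result_line` and the analyzer name used in the call are hypothetical, added only so the line can be exercised on its own.

```python
import logging


def format_result_line(analyzer_name, result):
    """Builds the log line with explicit positional string fields.

    The {N:s} format spec only accepts strings, so passing any other
    type raises instead of being silently converted.
    """
    return '[{0:s}] result: {1:s}'.format(analyzer_name, result)


# Hypothetical analyzer name and result, for illustration only.
logging.info(format_result_line('similarity_scorer', 'done'))
```

The standard library also accepts deferred arguments, as in `logging.info('[%s] result: %s', analyzer_name, result)`, which skips the formatting work when the log level is disabled; which style to use is a project convention.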
  # Run psort.py
  try:
-     cmd_output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
+     subprocess.check_output(cmd, stderr=subprocess.STDOUT)
Don't you need to add shell=True here in order for psort_path to work when it is set to just psort.py?
Good point, but no, it seems to work without shell=True.
Interesting. OK.
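A likely explanation for the behavior observed in this thread: on POSIX, when `subprocess` is given a list of arguments without a shell, a program name that contains no directory part is resolved against `PATH`, much like `execvp`. The sketch below illustrates this with the running Python interpreter standing in for `psort.py`. The function name `run_by_bare_name` is hypothetical, and the behavior described is POSIX-specific.

```python
import os
import subprocess
import sys


def run_by_bare_name():
    """Runs the current interpreter by bare name, without shell=True."""
    interpreter_dir, interpreter_name = os.path.split(sys.executable)

    # Make sure the interpreter's directory is on the child's PATH so the
    # bare name can be resolved there.
    env = dict(os.environ)
    env["PATH"] = interpreter_dir + os.pathsep + env.get("PATH", "")

    # No directory component in cmd[0], and no shell involved.
    cmd = [interpreter_name, "-c", "print('ok')"]
    output = subprocess.check_output(cmd, stderr=subprocess.STDOUT, env=env)
    return output.decode("utf-8").strip()
```

For a script such as `psort.py` this still requires the file to be on `PATH`, to be executable, and to start with a valid shebang line.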
- return cmd_output
+ return index_name
I guess the return type should be changed in the docstrings then?
Done
- return {u'Events processed': total_events}
+ return index_name
Change the return type in docstrings?
Done
db_session.commit()

# If enabled, run sketch analyzers when timeline is added.
from timesketch.lib import tasks
Why not import at the top of the file? Maybe add a comment about that?
Same comment as above. Added comment to explain as well.
@aarontp PTAL
LGTM
Re-approved :)
This PR implements automatic analysis of timelines added to Timesketch. It supports running analysis plugins after indexing or when a new timeline is added to a sketch.
It implements a simple abstraction for creating new analysis workers that takes care of background task scheduling, etc.
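To make the abstraction concrete, here is a rough sketch of how a name-to-class analyzer registry might fit together. Only `get_analyzer`, `run_wrapper` and the `index_name` keyword appear in the diff excerpts above. `register_analyzer`, `BaseAnalyzer`, `EventCounter` and the plain function `run_analyzer` are hypothetical, and the Celery task scheduling is left out entirely.

```python
class AnalysisManager(object):
    """Hypothetical registry mapping analyzer names to classes."""

    _registry = {}

    @classmethod
    def register_analyzer(cls, analyzer_class):
        """Registers an analyzer class under its NAME attribute."""
        cls._registry[analyzer_class.NAME] = analyzer_class

    @classmethod
    def get_analyzer(cls, analyzer_name):
        """Returns the class registered under analyzer_name."""
        return cls._registry[analyzer_name]


class BaseAnalyzer(object):
    """Hypothetical base class that concrete analyzers extend."""

    NAME = 'base'

    def __init__(self, index_name, **kwargs):
        self.index_name = index_name
        self.kwargs = kwargs

    def run(self):
        """Does the actual analysis; implemented by subclasses."""
        raise NotImplementedError

    def run_wrapper(self):
        """Single entry point, a natural place for shared bookkeeping."""
        return self.run()


class EventCounter(BaseAnalyzer):
    """Toy analyzer used only for this illustration."""

    NAME = 'event_counter'

    def run(self):
        return 'analyzed index {0:s}'.format(self.index_name)


AnalysisManager.register_analyzer(EventCounter)


def run_analyzer(index_name, analyzer_name, **kwargs):
    """Looks up an analyzer by name and runs it against an index."""
    analyzer_class = AnalysisManager.get_analyzer(analyzer_name)
    analyzer = analyzer_class(index_name=index_name, **kwargs)
    return analyzer.run_wrapper()
```

With this shape, adding a new analysis worker comes down to subclassing the base class, implementing `run()` and registering the class.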