Composite timelines: Change the behavior of uploaded data #1567

Closed
12 tasks done
kiddinn opened this issue Jan 27, 2021 · 9 comments

kiddinn commented Jan 27, 2021

Problem

Elastic has a limit on the number of open resources, which means that each deployment can only have a limited number of indices open at any given time.

The current upload behavior is that every new data source gets assigned a new SearchIndex (SI) and a new Elastic index. This means that a sketch with 10 data sources will have 10 open indices.

Elastic prefers fewer, larger indices. The solution is therefore to change the behavior of file uploads in the following manner:

  1. Each sketch will have one SI per label.
  2. Each sketch will have a separate Timeline Object (TO) per data source, each one pointing to the appropriate SI.

This means that instead of each data source having its own SI, all data sources in the sketch will share the same SI.

This will in part solve #1200, but the ability to merge older timelines in a sketch also needs to be added to this design.

This requires multiple changes spread out over the codebase, including a major change in the UI and the API client to make sure that searches can still be limited to each data source, or TO, instead of the current behavior that limits them to hits within each SI.
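To illustrate how a search could stay limited to individual data sources once they share an index, here is a minimal sketch of an Elasticsearch query body that filters on a per-event timeline identifier. It assumes the `__ts_timeline_id` field named in the task list of this issue; the function name and the example IDs are hypothetical and are not taken from the Timesketch codebase.

```python
def build_timeline_query(query_string, timeline_ids):
    """Build a query body restricted to the given timeline IDs.

    Hypothetical helper: assumes every event in the shared index
    carries a `__ts_timeline_id` field.
    """
    return {
        "query": {
            "bool": {
                # The user's search expression.
                "must": [{"query_string": {"query": query_string}}],
                # Keep only hits that belong to the selected timelines.
                "filter": [
                    {"terms": {"__ts_timeline_id": list(timeline_ids)}}
                ],
            }
        }
    }


# Example: search two timelines that live in the same index.
query = build_timeline_query("data_type:syslog", [12, 15])
```

Putting the restriction in `filter` rather than `must` keeps it out of relevance scoring, which seems appropriate for a pure membership check.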

A list of items to be done:

  • Change uploads to add a __ts_timeline_id field
  • Change psort upload behavior to use elastic_ts output module and move TS specific things to lib/tasks.py
  • Add a way to identify whether a timeline is a new-world or an old-world timeline.
  • Change the UI so that it can correctly query timelines
  • Change the counts in the UI
  • Make sure labels still work
  • Make sure analyzers still work
  • Make sure aggregations still work
  • Make sure graphs still work
  • Make sure stories still work
  • Make sure saved searches still work
  • This also implies that we can't support tsctl import or uploading data via tsctl anymore, deprecate that.
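The first item in the list, adding a `__ts_timeline_id` field on upload, could look roughly like the sketch below. It only shows the tagging step; the function name, the event dictionaries, and the ID value are illustrative assumptions, and the actual indexing call is left out.

```python
def tag_events(events, timeline_id):
    """Yield copies of the events with the timeline ID attached.

    Hypothetical helper: the field name comes from this issue,
    everything else is an assumption for illustration.
    """
    for event in events:
        tagged = dict(event)  # Copy so the caller's data is not mutated.
        tagged["__ts_timeline_id"] = timeline_id
        yield tagged


# Made-up events standing in for one uploaded data source.
events = [
    {"message": "user login", "datetime": "2021-01-27T10:00:00"},
    {"message": "user logout", "datetime": "2021-01-27T10:05:00"},
]
tagged = list(tag_events(events, timeline_id=7))
```

With every event tagged this way, several data sources can be written to the same index and still be told apart at query time.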

kiddinn commented Jan 31, 2021

The first phase is in. It fixes the upload REST API, the importer, the explore REST API, and the search API object. Search now works across old sketches, new sketches, and mixed ones.
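One way a single search could cover both old and new timelines is sketched below. It rests on an assumption that is not confirmed in this thread: that old-world events never carry a `__ts_timeline_id` field, so they can be matched by the absence of that field while new-world events are matched by ID. The function name is hypothetical.

```python
def build_mixed_timeline_filter(timeline_ids):
    """Build a filter matching new-world and old-world events.

    Assumption: old-world events have no `__ts_timeline_id` field and
    are already scoped by the list of indices being searched.
    """
    return {
        "bool": {
            "should": [
                # New world: event belongs to one of the selected timelines.
                {"terms": {"__ts_timeline_id": list(timeline_ids)}},
                # Old world: event has no timeline ID at all.
                {
                    "bool": {
                        "must_not": [
                            {"exists": {"field": "__ts_timeline_id"}}
                        ]
                    }
                },
            ],
            # At least one of the two branches has to match.
            "minimum_should_match": 1,
        }
    }


mixed_filter = build_mixed_timeline_filter([3, 4])
```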

The next phase will be to fix the UI so that filtering by individual timeline and counting of events both work.
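Counting events per timeline inside a shared index could be done with a terms aggregation on the timeline ID, as in the sketch below. The aggregation name, the helper names, and the sample response are assumptions for illustration; the response only mimics the general shape of an Elasticsearch terms aggregation result and contains made-up numbers.

```python
def build_count_request():
    """Build a request body that counts events per timeline ID."""
    return {
        "size": 0,  # No hits needed, only the aggregation buckets.
        "aggs": {
            "per_timeline": {"terms": {"field": "__ts_timeline_id"}}
        },
    }


def parse_counts(response):
    """Turn the aggregation buckets into a {timeline_id: count} dict."""
    buckets = response["aggregations"]["per_timeline"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written sample response, not real data.
sample_response = {
    "aggregations": {
        "per_timeline": {
            "buckets": [
                {"key": 12, "doc_count": 340},
                {"key": 15, "doc_count": 25},
            ]
        }
    }
}
counts = parse_counts(sample_response)
```

Note that a terms aggregation returns only the top buckets by default, so a sketch with many timelines would need an explicit `size` on the aggregation.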

Another parallel stream is to fix psort so that plaso ingestion works.


kiddinn commented Feb 1, 2021

#1573 submitted as phase II, the first UI fixes for the new indexing.

kiddinn added a commit that referenced this issue Feb 1, 2021
…d fixing build_query (#1574)

* Changing __timeline_id to __ts_timeline_id and fixing build_query

* More changes from last PR

kiddinn commented Feb 1, 2021

The needed change in psort can be found here: log2timeline/plaso#3463


kiddinn commented Feb 2, 2021

fixing UI counts: #1576


kiddinn commented Feb 4, 2021

Adding aggregation support in #1588


kiddinn commented Feb 6, 2021

What is left here is to deprecate tsctl uploading of data and then to add more tests.


kiddinn commented Feb 6, 2021

What is also left is to update docker containers to point to the dev PPA once plaso gets a pre-release candidate out in the PPA. This would make plaso uploads work again.


kiddinn commented Feb 9, 2021

We've now moved to the testing stage; we just need to test and make sure everything works again.


kiddinn commented Feb 18, 2021

This has been completed. Bug fixes have been made, but overall the changes required are completed.

kiddinn closed this as completed Feb 18, 2021
berggren changed the title from "Change the behavior of uploaded data" to "Composite timelines: Change the behavior of uploaded data" Oct 5, 2022