Internal server error - issue with 150+ timelines in a sketch. #1687

jacurutu1984 · 2021-03-13T15:34:38Z

Describe the bug
I imported approximately 150 timelines (plaso files) into one sketch with timesketch_importer. Now, when I try to access the data through the UI, I get a blank page and the message "Internal server error".

To Reproduce
Steps to reproduce the behavior:
Import 150 timelines and go to the sketch.

Expected behavior
I expected to access the data and to explore it.

Screenshots

Desktop (please complete the following information):

OS: Ubuntu 20.04
Browser: Mozilla Firefox 85
Version: latest version of Timesketch

Additional context
wsgi_error.log :
[2021-03-13 15:22:27,298] timesketch.app/ERROR Exception on /api/v1/sketches/1/count/ [GET]
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.8/dist-packages/flask/app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/usr/local/lib/python3.8/dist-packages/flask_restful/__init__.py", line 458, in wrapper
resp = resource(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/flask/views.py", line 89, in view
return self.dispatch_request(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/flask_restful/__init__.py", line 573, in dispatch_request
resp = meth(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/flask_login/utils.py", line 261, in decorated_view
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/timesketch/api/v1/resources/event.py", line 665, in get
count, bytes_on_disk = self.datastore.count(indices)
File "/usr/local/lib/python3.8/dist-packages/timesketch/lib/datastores/elastic.py", line 767, in count
es_stats = self.client.indices.stats(
File "/usr/local/lib/python3.8/dist-packages/elasticsearch/client/utils.py", line 84, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/elasticsearch/client/indices.py", line 769, in stats
return self.transport.perform_request(
File "/usr/local/lib/python3.8/dist-packages/elasticsearch/transport.py", line 351, in perform_request
status, headers_response, data = connection.perform_request(
File "/usr/local/lib/python3.8/dist-packages/elasticsearch/connection/http_urllib3.py", line 261, in perform_request
self._raise_error(response.status, raw_data)
File "/usr/local/lib/python3.8/dist-packages/elasticsearch/connection/base.py", line 181, in _raise_error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
elasticsearch.exceptions.RequestError: RequestError(400, 'too_long_frame_exception', 'An HTTP line is larger than 4096 bytes.')

kiddinn · 2021-03-15T19:47:47Z

ok, I see the error is:

elasticsearch.exceptions.RequestError: RequestError(400, 'too_long_frame_exception', 'An HTTP line is larger than 4096 bytes.')

A few questions:

When you say the latest version, is that the latest release or the latest head?
How did you import the 150 timelines? All using the import client? (what options)

Second, using the API client can you execute the following code:

(I'm assuming your sketch ID is 1, since that is what it looks like from the logs.)

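from timesketch_api_client import config

# Connect using the locally stored API client configuration.
ts_client = config.get_client()
sketch = ts_client.get_sketch(1)

# Collect the distinct Elasticsearch index names used by the sketch's timelines.
indices = set([t.index_name for t in sketch.list_timelines()])
print(len(indices))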

The reason for this error is that the sketch contains too many ES indices, so an HTTP request to get a list of the timelines becomes too long (the index names are part of the request).
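To make the size limit concrete, here is a rough sketch of how the request line grows with the number of indices. It assumes each timeline has its own index named with a 32-character hex string, and that the index names are sent as a comma-separated list in the URL path; the exact path Timesketch requests may differ.

```python
import uuid

# Assumption: one index per timeline, each named with a 32-character hex string.
index_names = [uuid.uuid4().hex for _ in range(150)]

# Index-level requests carry the target indices as a comma-separated list
# in the URL path, so the request line grows with every extra index.
request_line = "GET /" + ",".join(index_names) + "/_stats HTTP/1.1"

HTTP_LINE_LIMIT = 4096  # the limit quoted in the error message

print(len(request_line), len(request_line) > HTTP_LINE_LIMIT)
```

With 150 distinct names the line is well past 4096 bytes, which matches the too_long_frame_exception in the log.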

The solution to this was #1567, with several PRs following, hence my question about the version. In the new world, imported timelines should be stored in the same ES index, each with a separate timeline ID, and thus not be subject to this limitation.
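As an illustration of the shared-index idea, a search can target one index and narrow the results by timeline ID in the query body instead of listing one index per timeline in the URL. This is only a sketch: the field name `timeline_id` and the helper `build_query` are hypothetical and are not taken from the Timesketch code.

```python
def build_query(query_string, timeline_ids, id_field="timeline_id"):
    """Build a search body restricted to the given timeline IDs.

    `id_field` is a hypothetical field name used for illustration only.
    """
    return {
        "query": {
            "bool": {
                # The user's search expression.
                "must": [{"query_string": {"query": query_string}}],
                # Restrict matches to selected timelines inside the shared index.
                "filter": [{"terms": {id_field: list(timeline_ids)}}],
            }
        }
    }


body = build_query("*", [12, 13])
print(body["query"]["bool"]["filter"])
```

Because the timeline IDs travel in the request body, the URL stays short no matter how many timelines the sketch holds.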

kiddinn · 2021-03-15T19:48:24Z

And in this case, you should be able to get more information using the API client rather than the UI, since the UI loads up too many things that could fail in this case.

kiddinn · 2021-03-15T20:06:41Z

Adding a bit more error handling to the ES datastore to catch this error in #1691
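The kind of error handling meant here could look roughly like the sketch below. It is not the code from #1691: `safe_count` and `fetch_stats` are hypothetical names, and `RequestError` is a local stand-in for the client's exception so the example runs without Elasticsearch installed.

```python
import logging

logger = logging.getLogger("datastore_sketch")


class RequestError(Exception):
    """Local stand-in for the Elasticsearch client's request error."""


def safe_count(fetch_stats, indices):
    """Return (event_count, bytes_on_disk), or (0, 0) if the request is rejected."""
    if not indices:
        return 0, 0
    try:
        stats = fetch_stats(indices)
    except RequestError as exc:
        # Log and degrade gracefully instead of surfacing a 500 to the caller.
        logger.error("Unable to count indices (request too long?): %s", exc)
        return 0, 0
    primaries = stats.get("_all", {}).get("primaries", {})
    count = primaries.get("docs", {}).get("count", 0)
    size = primaries.get("store", {}).get("size_in_bytes", 0)
    return count, size


def rejected(indices):
    """Simulate the server refusing an over-long request."""
    raise RequestError("too_long_frame_exception")


print(safe_count(rejected, ["a", "b"]))
```

Returning zeros keeps the sketch page loadable while the underlying index problem is investigated.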

jacurutu1984 · 2021-03-15T20:17:52Z

When you say the latest version, is that the latest release or the latest head?

I upgraded the Timesketch Docker deployment on Saturday (https://github.com/google/timesketch/blob/master/docs/Upgrading.md), so if I understand your question correctly, it's the latest release.
Timesketch importer version:
API Client Version: 20210226
Importer Client Version: 20210225

How did you import the 150 timelines? All using the import client? (what options)
Yes, I imported all the timelines using the import client. I wrote a for loop to go through the 150 files:
for file in $(ls *.plaso); do echo $file; d=$(echo $file| cut -d'.' -f1); timesketch_importer --sketch_id 1 --timeline_name $d --host http://plaso-System-Product-Name $file; sleep 5m; done

- Second, using the API client can you execute the following code:
I receive an error after this part: indices = set([t.index_name for t in sketch.list_timelines()])
WARNING:timesketch_api.client:Failed response: [500] Internal Server Error INTERNAL SERVER ERROR
Traceback (most recent call last):
File "", line 1, in
File "/home/plaso/.local/lib/python3.8/site-packages/timesketch_api_client/sketch.py", line 963, in list_timelines
for timeline_dict in sketch['objects'][0]['timelines']:
KeyError: 'objects'
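The KeyError above comes from indexing into a response that does not contain an `objects` key, because the server answered with a 500. A defensive version of that lookup might look like the sketch below; `timelines_from_response` is a hypothetical helper, not the API client's actual code.

```python
def timelines_from_response(sketch_response):
    """Return the timeline dicts from a sketch response, with a clear error on failure."""
    objects = sketch_response.get("objects")
    if not objects:
        # A failed request yields no 'objects'; report that instead of a bare KeyError.
        raise RuntimeError(
            "Sketch data could not be fetched; the server response has no 'objects'."
        )
    return objects[0].get("timelines", [])


good = {"objects": [{"timelines": [{"name": "host1"}]}]}
print(timelines_from_response(good))
```

Calling it with an empty dict raises a RuntimeError with a readable message rather than `KeyError: 'objects'`.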

kiddinn · 2021-03-16T09:23:40Z

OK, this needs further investigation. I'll need to test this more on my side to see if I can reproduce the issue.

The draft PR I've got should at least prevent the sketch object from failing like this. I'll try to reproduce.

kiddinn · 2021-03-19T13:35:57Z

One quick question: can you try again with the latest importer client?

The reason I ask is that the latest importer client waits until the file has been ingested before it moves on to the next one.

kiddinn · 2021-03-19T13:52:53Z

So the issue here is the loop. The logic for allocating indices only compared against active timelines in the sketch. A plaso file often takes some time to ingest, so when you upload plaso files into TS in a loop like this, the next file is uploaded before the previous one has completed its ingestion. The previous timeline was therefore not considered active (by definition, since it was still being processed), so its index was not reused.
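The effect described above can be reproduced with a small simulation. This is a simplified model, not the Timesketch allocation code: the status names `ready` and `processing` and both helper functions are hypothetical.

```python
def allocate_index(timelines, new_index, statuses):
    """Reuse the index of a timeline whose status is in `statuses`, else use a new one."""
    for timeline in timelines:
        if timeline["status"] in statuses:
            return timeline["index"]
    return new_index


def upload_all(count, statuses):
    """Upload `count` files back to back; return the distinct indices that result."""
    timelines = []
    for number in range(count):
        index = allocate_index(timelines, f"index-{number}", statuses)
        # The upload is still being ingested when the next file arrives.
        timelines.append({"index": index, "status": "processing"})
    return {timeline["index"] for timeline in timelines}


# Only finished timelines are considered: every upload gets its own index.
print(len(upload_all(150, {"ready"})))
# Timelines still being processed are considered too: one shared index.
print(len(upload_all(150, {"ready", "processing"})))
```

The first call yields 150 distinct indices and the second yields 1, which is the difference between hitting the HTTP line limit and staying well under it.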

So the solution to this is:

1. Change the API so that not only active timelines are considered, but also timelines that are still being processed.
2. Change the importer client to wait until the last timeline uploaded has been ingested before exiting the tool.
3. Change your script so that, instead of an arbitrary 5 minute sleep, you wait until the file has been ingested before the next one is uploaded.

I've already implemented nr 1 in that list in a PR that will be out soon; I'm testing it right now before I send it for review. Nr 2 has already been implemented and is in the latest importer release, and nr 3 might not be needed once 1 and 2 are in place.

(Regarding nr 2, I see that I haven't pushed the latest version out to PyPI; I'm about to do that now.)
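For nr 3, waiting on the ingestion status instead of sleeping for a fixed time could be sketched as a simple polling loop. The helper `wait_until_ingested` is hypothetical, and so are the status strings; `get_status` stands for whatever call returns the current state of the uploaded timeline.

```python
import time


def wait_until_ingested(get_status, poll_seconds=5.0, timeout_seconds=3600.0,
                        sleep=time.sleep):
    """Poll `get_status` until it reports a final state, then return that state."""
    waited = 0.0
    while True:
        status = get_status()
        if status in ("ready", "fail"):  # hypothetical final states
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Timeline still '{status}' after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status sequence; the no-op sleep keeps the demonstration instant.
states = iter(["processing", "processing", "ready"])
print(wait_until_ingested(lambda: next(states), sleep=lambda seconds: None))
```

An upload loop would call this after each file and only move on to the next file once it returns.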

kiddinn · 2021-03-24T09:17:12Z

So change nr 2 has been implemented, as well as nr 1.

Can you test again?

berggren · 2021-03-24T19:44:05Z

The last fix for this has been identified. The issue occurred when we sent a list of many repeated index names. I have a fix in the works that will get merged tomorrow.
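Once timelines share an index, a list built from one index name per timeline repeats the same name many times. A minimal sketch of removing the repeats before building the request follows; `unique_indices` is a hypothetical helper and not the code from the fix.

```python
def unique_indices(indices):
    """Drop repeated index names while keeping first-seen order."""
    return list(dict.fromkeys(indices))


# 150 timelines that all live in the same index.
repeated = ["abc123"] * 150
print(unique_indices(repeated))  # ['abc123']
```

Using `dict.fromkeys` rather than `set` keeps the order stable, which makes the resulting request reproducible.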

jacurutu1984 added the Bug label Mar 13, 2021

kiddinn self-assigned this Mar 15, 2021

kiddinn changed the title ~~Internal server error~~ Internal server error - issue with 150+ timelines in a sketch. Mar 15, 2021

kiddinn mentioned this issue Mar 15, 2021 in "Fixed few bugs in the API and alpha sorted saved searches" #1691 (merged)

berggren mentioned this issue Mar 24, 2021 in "Always uniq index lists" #1718 (merged)

kiddinn closed this as completed in #1718 Mar 25, 2021
