
Improve handling of large mappings #1540

Closed
avleen opened this issue Oct 6, 2014 · 26 comments
Labels
bug Fixes for quality problems that affect the customer experience high hanging fruit

Comments

@avleen

avleen commented Oct 6, 2014

When Kibana 4 first starts up, it looks for all indices named logstash-*, and then fetches the names of all fields in those indices.

In our case, we have ~28 indices, each with hundreds, or sometimes thousands, of fields.
My browser tab hit ~1 GB of RAM before it died.
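
For scale, here is a rough way to measure that payload yourself. This is only an illustrative sketch, not Kibana code: it assumes Elasticsearch is reachable on http://localhost:9200, the logstash-* index pattern, the old (pre-7.x, typed) mapping layout, and a runtime with fetch() available (e.g. Node 18+ or a browser):

    // Illustrative sketch only (not Kibana code): measure the size of the
    // _mapping payload and count unique field paths for an index pattern.
    // Assumes http://localhost:9200 and the index -> mappings -> type ->
    // properties layout used by Elasticsearch 1.x/2.x.
    const pattern = "logstash-*";

    async function measureMapping(): Promise<void> {
      const resp = await fetch(`http://localhost:9200/${pattern}/_mapping`);
      const body = await resp.text();
      const mappings = JSON.parse(body);

      const fields = new Set<string>();
      const walk = (props: Record<string, any>, prefix: string): void => {
        for (const [name, def] of Object.entries(props ?? {})) {
          const path = prefix ? `${prefix}.${name}` : name;
          fields.add(path);
          if (def.properties) walk(def.properties, path); // recurse into object fields
        }
      };

      for (const index of Object.values<any>(mappings)) {
        for (const type of Object.values<any>(index.mappings ?? {})) {
          walk(type.properties, "");
        }
      }

      console.log(`payload size (approx): ${(body.length / 1024 / 1024).toFixed(1)} MB`);
      console.log(`unique field paths: ${fields.size}`);
    }

    measureMapping().catch(console.error);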

@rashidkpc rashidkpc added the bug Fixes for quality problems that affect the customer experience label Oct 6, 2014
@rashidkpc
Contributor

Thanks. The reason we fetch all of the fields is so that we can figure out if there are any mapping inconsistencies that Kibana needs to know about when making fields available for aggregations and search. Of course, as in your case, this can be a HUGE request. We cache the results of the post-processing in Elasticsearch so that we don't have to do it again, but the first hit could be big.

The plan right now is to allow Elasticsearch to script responses so that we can pre-process the mapping on the Elasticsearch side. See here: elastic/elasticsearch#7401
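
To make the "mapping inconsistencies" part concrete, here is a rough sketch (not the actual Kibana implementation) of the kind of post-processing this involves: collapsing the per-index field types into one list and flagging fields that are mapped differently in different indices. The index and field names in the example are made up.

    // Illustrative sketch only -- not Kibana's implementation. Collapses a
    // per-index mapping into a single field list and marks type conflicts.
    interface FieldInfo {
      type: string;
      conflict: boolean;
    }

    function mergeMappings(
      // shape: { [index: string]: { [field: string]: string /* es type */ } }
      perIndexFields: Record<string, Record<string, string>>
    ): Record<string, FieldInfo> {
      const merged: Record<string, FieldInfo> = {};
      for (const fields of Object.values(perIndexFields)) {
        for (const [name, type] of Object.entries(fields)) {
          const existing = merged[name];
          if (!existing) {
            merged[name] = { type, conflict: false };
          } else if (existing.type !== type) {
            // Same field name mapped differently in two indices: such fields
            // can't be aggregated on safely, so Kibana needs to know.
            existing.conflict = true;
          }
        }
      }
      return merged;
    }

    // Hypothetical example: "status" is a string in one daily index and a long in another.
    console.log(
      mergeMappings({
        "logstash-2014.10.05": { "@timestamp": "date", status: "string" },
        "logstash-2014.10.06": { "@timestamp": "date", status: "long" },
      })
    );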

@avleen
Author

avleen commented Oct 6, 2014

Excellent :)

Yes, right now we're looking at a 22 MB response with ~250,000 fields; my browser quickly runs out of memory before it can finish processing that. I managed to sneak in a quick change to the index name (set it to logstash-2014.10.06) and hit "create" before it ran out of memory. But that means I can only see today's data :-)


@rashidkpc rashidkpc changed the title Kibana 4 blows up browser memory fetching fields High browser memory when fetching large mappings Oct 6, 2014
@rashidkpc rashidkpc added help wanted adoptme and removed help wanted adoptme labels Oct 6, 2014
@otisg

otisg commented Oct 7, 2014

Ouch! One more use-case to keep in mind might be multi-tenant clusters, where one user using Kibana should not load all other users' fields/mappings into the browser.

@avleen
Author

avleen commented Nov 11, 2014

Hey folks, any updates on this? Really want to try out Kibana 4, but it's a non-starter due to this issue.

@garyelephant

This problem also exists in Kibana 3. I was wondering why Kibana loaded es_host:9200/_all/mapping from Elasticsearch, causing my browser to die. I have 110+ indices.

@avleen
Author

avleen commented Dec 12, 2014

There might be an easier way to fix this.
I noticed that I can get the mapping into my browser (using about 2 GB of RAM) to pick the timestamp field.
Interestingly, only the fields which have "timestamp" in the name are shown!
This is great - why don't we do this filtering on the server side, and then only send those fields which have "timestamp" in the name to the browser?

I don't know if this is a half-assed fix, because the problem might present itself elsewhere.
But if someone is able to point me in the direction of the code responsible for this, I'd be happy to try and patch it.

@rashidkpc
Contributor

It's not only fields that have "timestamp" in the name, but rather any date fields. Also, we're going to need to fetch the entire mapping anyway, as we need to process and cache it. It's possible the bulk of this work could be done on the backend though.

Note that we'll still need a way to display these in the field list; 250,000 fields is just a lot of fields.
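
Building on that, a minimal sketch of what "done on the backend" could look like for the time-field picker specifically: filter the merged field list down to date-typed fields before sending anything to the browser. dateFieldsOnly is a hypothetical helper for illustration, not an existing Kibana function.

    // Sketch of the server-side filtering idea discussed above (hypothetical
    // helper): given a merged field list, return only date-typed fields for
    // the "time field" dropdown.
    function dateFieldsOnly(
      fields: Record<string, { type: string }>
    ): string[] {
      return Object.entries(fields)
        .filter(([, info]) => info.type === "date")
        .map(([name]) => name)
        .sort();
    }

    // With 250,000 mixed fields in, only the handful of date fields go out.
    console.log(dateFieldsOnly({
      "@timestamp": { type: "date" },
      received_at: { type: "date" },
      message: { type: "string" },
    }));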

@rashidkpc
Contributor

via @maguec

When loading a Kibana setup with thousands of fields, JavaScript causes Chrome to crash with the following error. In Firefox the script takes too long to unmarshal all of the fields, asks if I wish to continue, and pins the CPU.

Is there a way to turn off the field typing?

RangeError: Maximum call stack size exceeded
    at IndexedArray.pop push shift splice unshift reverse.split.forEach.IndexedArray.(anonymous function) [as push] (https://telemetry.REDACTED/stats/index.js?_b=5888:84049:12)
    at new IndexedArray (https://telemetry.REDACTED/stats/index.js?_b=5888:83967:17)
    at setIndexedValue (https://telemetry.REDACTED/stats/index.js?_b=5888:85229:21)
    at IndexPattern.self._indexFields (https://telemetry.REDACTED/stats/index.js?_b=5888:85279:13)
    at applyESResp (https://telemetry.REDACTED/stats/index.js?_b=5888:85215:18)
    at deferred.promise.then.wrappedCallback (https://telemetry.REDACTED/stats/index.js?_b=5888:20888:81)
    at https://telemetry.REDACTED/stats/index.js?_b=5888:20974:26
    at Scope.$get.Scope.$eval (https://telemetry.REDACTED/stats/index.js?_b=5888:22017:28)
    at Scope.$get.Scope.$digest (https://telemetry.REDACTED/stats/index.js?_b=5888:21829:31)
    at Scope.$get.Scope.$apply (https://telemetry.REDACTED/stats/index.js?_b=5888:22121:24)

Heap and CPU profile information are available below

http://shokunin.co/upload/kibana4.heapsnapshot
http://shokunin.co/upload/kibana4.cpuprofile
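
For anyone debugging this: a "Maximum call stack size exceeded" raised from a push call is typically what you get when a very large array is passed as the argument list of a variadic call (Function.prototype.apply or spread), since every element occupies a stack slot. Below is a standalone reproduction and a chunked workaround; it is not taken from the Kibana source, just an illustration of the failure mode.

    // Standalone reproduction (not Kibana code): pushing a huge array via
    // spread/apply puts every element on the call stack at once.
    const fields: string[] = Array.from({ length: 500_000 }, (_, i) => `field_${i}`);
    const target: string[] = [];

    try {
      // Each element becomes one argument of push(), so this blows the stack.
      target.push(...fields);
    } catch (e) {
      console.log((e as Error).message); // "Maximum call stack size exceeded"
    }

    // Workaround: append in bounded chunks so the argument list stays small.
    function pushChunked<T>(dst: T[], src: T[], chunkSize = 10_000): void {
      for (let i = 0; i < src.length; i += chunkSize) {
        dst.push(...src.slice(i, i + chunkSize));
      }
    }

    pushChunked(target, fields);
    console.log(target.length); // 500000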

@antondollmaier

Got the same error as well, directly after opening up Kibana 4 after setup (/#/discover?_g=()):

RangeError: Maximum call stack size exceeded
    at new IndexedArray (https://kibana.REDACTED/index.js?_b=5930:83857:17)
    at setIndexedValue (https://kibana.REDACTED/index.js?_b=5930:85119:21)
    at IndexPattern.self._indexFields (https://kibana.REDACTED/index.js?_b=5930:85169:13)
    at applyESResp (https://kibana.REDACTED/index.js?_b=5930:85105:18)
    at wrappedCallback (https://kibana.REDACTED/index.js?_b=5930:20873:81)
    at https://kibana.REDACTED/index.js?_b=5930:20959:26
    at Scope.$eval (https://kibana.REDACTED/index.js?_b=5930:22002:28)
    at Scope.$digest (https://kibana.REDACTED/index.js?_b=5930:21814:31)
    at Scope.$apply (https://kibana.REDACTED/index.js?_b=5930:22106:24)
    at done (https://kibana.REDACTED/index.js?_b=5930:17641:45)

We currently have 14 days of indices, with multiple different applications piping into Logstash.

@webmstr

webmstr commented Apr 3, 2015

Seems like this info should be cached by the K4 server, rather than thrust on the browser.

Either way, our mapping is also too big for K4 to be useful, so +1.

@simmel

simmel commented Apr 28, 2015

@rashidkpc as I asked in #3674, do you know where the pain starts? I went down from +1k to 250 and it's still unusable. Will it lag even if closed indexes have +1k fields?

@ppf2
Member

ppf2 commented Sep 1, 2015

+1

@eunachen

I had the same problem.
I input many text files into an index in ES.
When I query some strings with Kibana, it is very slow.
Sometimes it shows an error.

@avleen
Author

avleen commented Oct 13, 2015

I was able to work around the issue simply by reducing the number of fields.
When I had > 20,000 fields in each day's index, my browser would freeze each time I tried to search, or open kibana, or do most things.
With around 5000 fields/day, it's quite a bit more responsive.

This is with Kibana 4.1.

@eunachen

In the index, the number of fields is small, but the content is very large.
I input all text files (e.g. pdf, ppt, ...) into ES with the mapper-attachments plugin, using ES to create a search engine on the server.
When I query with the ES-head plugin, the results show in the browser quickly.
But I need a user interface to query easily (without using the search API directly), so I use Kibana. But it is too slow.

Now, I set discover:sampleSize=5. It is faster than before, but I cannot view all hits.

@jnbala

jnbala commented Oct 28, 2015

One reason I have seen this happen is unescaped backslashes ("\") in the field values.

For example, the field value "c:\users" is interpreted as a unicode escape (\u...) and therefore consumes a lot of time.
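
For reference, a tiny standalone illustration of the backslash-escaping rule described above (hypothetical field value, not taken from this issue):

    // Hypothetical illustration of JSON backslash escaping.
    // unescaped holds the literal text {"path":"c:\users"}  (invalid JSON)
    // escaped   holds the literal text {"path":"c:\\users"} (valid JSON)
    const unescaped = '{"path":"c:\\users"}';
    const escaped = '{"path":"c:\\\\users"}';

    try {
      JSON.parse(unescaped);
    } catch (e) {
      // In JSON, \u begins a unicode escape, so "\users" is rejected.
      console.log("unescaped value rejected:", (e as Error).message);
    }

    console.log(JSON.parse(escaped).path); // c:\users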

@gmoskovicz
Contributor

I have added some details around how to replicate this issue in #5331. It looks related to the way that we load fields using the jQuery.map function, which can't handle more than a certain number of fields, regardless of the amount of memory available.

@runesl

runesl commented Nov 20, 2015

+1 for a fix

@ESamir

ESamir commented Jan 11, 2016

+1 for a fix

@DimitryDushkin

DimitryDushkin commented May 5, 2016

+1
I had a dataset with about 660 fields. The Discover page loads for 2-3 minutes in Chrome Canary on a Core i5.

@0063292

0063292 commented May 27, 2016

I have a similar problem. In our case, ~1800 indexes are all in the same index pattern. Each index has ~380 fields. The indexes are created one per tenant, but all tenants have the same mapping schema, so this is not a problem.

My problem manifests itself where the payload exceeds the allowable size of the JavaScript object it is being stuffed into on the client, which throws an internal JavaScript exception.

However, the WORST issue is that the "refresh mapping" completely ignores the fact that it had an exception on the GET and POSTs back an empty array for the Kibana field mappings, wiping out all prior field mappings!

Minimally, a fix should be made to not POST back the empty data.

I have found a workaround by creating an index pattern that just targets one index. As I noted, all our indexes have the same mappings, so I can then use Fiddler to capture the POST for the single-index pattern, replace it with my "all indexes" pattern, and successfully post back the field mappings for the ~380 unique fields among all the indexes.

Please address this; it is a huge maintenance problem and prone to wiping out our Kibana mappings frequently by those unaware of the "killer" POST-back-nothing "feature".
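
For illustration, the minimal fix suggested above might look something like this. safeRefreshFields, fetchFields, and save are hypothetical names, not Kibana APIs; the point is only to skip the save when the fetch fails or returns nothing, so the stored field list is never wiped.

    // Hypothetical sketch of the "don't save an empty field list" guard.
    interface IndexPattern {
      title: string;
      fields: { name: string; type: string }[];
    }

    async function safeRefreshFields(
      pattern: IndexPattern,
      fetchFields: (title: string) => Promise<IndexPattern["fields"]>,
      save: (pattern: IndexPattern) => Promise<void>
    ): Promise<void> {
      let fresh: IndexPattern["fields"];
      try {
        fresh = await fetchFields(pattern.title);
      } catch (err) {
        // The GET failed (e.g. the response was too large to process);
        // keep the previously stored fields instead of wiping them.
        console.warn(`field refresh failed for ${pattern.title}:`, err);
        return;
      }
      if (fresh.length === 0) {
        console.warn(`field refresh returned no fields for ${pattern.title}; not saving`);
        return;
      }
      await save({ ...pattern, fields: fresh });
    }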

@rashmivkulkarni
Contributor

@elastic/kibana-operations - this issue dates back to K4. Any idea what to do with this?

@jbudz jbudz added the Team:Operations Team label for Operations Team label Aug 14, 2018
@jbudz
Member

jbudz commented Aug 14, 2018

Confirming still an issue.

@0063292

0063292 commented Sep 26, 2018

FYI: We are opening a support ticket with Elastic, as the workaround I noted above when we were on a 2.4 cluster is no longer viable in 5.x: the call for mappings now happens every time, even if current mappings already exist and without using the refresh field list button.

@epixa epixa added Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed Team:Operations Team label for Operations Team labels Oct 1, 2018
@epixa
Contributor

epixa commented Oct 1, 2018

@elastic/kibana-app I think this one was mislabeled, so I switched to you.

@ppisljar ppisljar added :Management and removed Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Oct 10, 2018
@ppisljar
Member

I created a feature request to track this: #23947
