"Java heap space" error thrown exporting a large csv file #5187

Open
infomofo opened this Issue May 24, 2017 · 13 comments


infomofo commented May 24, 2017

Thanks for being part of the Metabase project!

  • Mac OS 10.12.3
  • Chrome 58.0.3029
  • Metabase v0.24.1
  • Redshift
  • Elastic Beanstalk

Exporting a large file breaks Metabase and shows a JSON error in the browser.

{
message: "Java heap space",
stacktrace: [
"api.dataset$export_to_csv.invokeStatic(dataset.clj:55)",
"api.dataset$export_to_csv.invoke(dataset.clj:52)",
"api.dataset$as_format.invokeStatic(dataset.clj:102)",
"api.dataset$as_format.invoke(dataset.clj:94)",
"api.dataset$fn__28513$fn__28514.invoke(dataset.clj:123)",
"api.common.internal$do_with_caught_api_exceptions.invokeStatic(internal.clj:229)",
"api.common.internal$do_with_caught_api_exceptions.invoke(internal.clj:224)",
"api.dataset$fn__28513.invokeStatic(dataset.clj:116)",
"api.dataset$fn__28513.invoke(dataset.clj:116)"
]
}

Contributor

salsakran commented Jun 3, 2017

We should catch this and display a friendlier error page.

jamesbloomer commented Jul 19, 2017

Is there a particular reason this error happens? We're trying to return a table of ~200K rows with a few joins and are seeing this error.

Member

camsaul commented Jul 19, 2017

It happens because the JVM doesn't have enough memory to process the request. What size instance are you using on Elastic Beanstalk?
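To see how much heap the JVM actually has to work with, something like the following can help. It is only a minimal sketch using the standard `Runtime` API; the class name `HeapCheck` is made up for illustration and is not part of Metabase.

```java
// Hypothetical helper (not Metabase code): reports the JVM's heap limits.
class HeapCheck {
    // Upper bound the JVM will try to use for the heap, in megabytes.
    // This is the value controlled by the -Xmx option.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Heap currently in use (allocated minus free), in megabytes.
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB):  " + maxHeapMb());
        System.out.println("used heap (MB): " + usedHeapMb());
    }
}
```

If the maximum turns out to be small relative to the export, and assuming you launch the jar directly, the heap can be raised with the standard `-Xmx` JVM option, for example `java -Xmx2g -jar metabase.jar`. The 2g figure is only an example; pick a value that fits the instance.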

jamesbloomer commented Jul 21, 2017

We're running on a t2.medium. After some testing, it appears that the CSV export works fine but the XLSX export fails; I guess that's creating more objects.

demorenoc commented Jul 26, 2017

I just ran into the same error for a query of ~34K rows x 20 columns when downloading XLSX. I am also running on AWS EB with a t2.medium.

thibault-rouby commented Aug 9, 2017

When exporting a query to XLSX, my Metabase instance's memory usage skyrockets until the "Java heap space" error is thrown.
When exporting a query to CSV, it works fine.
The output file is less than 200MB, but with the attempted XLSX export, memory usage goes over 15GB...
It's running on Docker, on a stand-alone server.

thucnc commented Aug 25, 2017

Hello, I got the same issue when exporting raw data (20k rows * 11 columns) to csv/xlsx on the Metabase v0.25.1 Docker container.

In addition, when exporting a smaller data set, say 5k or 1k rows, I observed that RAM usage increased but was not released afterwards (Redash released memory with the same test case). This makes RAM usage gradually increase and reach the memory limit late in the workday.

Rgds,

thucnc commented Aug 28, 2017

@salsakran could you give this bug a higher priority? It slows down our team's productivity a lot. Anyway, Metabase is really outstanding; thanks to your team.

marcelmfs commented Aug 30, 2017

@thucnc Metabase relies on docjure to export XLSX, and docjure doesn't have a streaming output writer implemented yet, as it's a somewhat low-activity project (last commit was in Oct/2016).
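To illustrate what a streaming writer buys you, here is a rough sketch in plain Java. It is not docjure or Metabase code: the class `StreamingCsvSketch` and its methods are hypothetical, and it writes CSV rather than XLSX to stay within the standard library. The point is only that each row is written and then discarded, so memory use stays roughly constant regardless of row count, instead of the whole document being built in memory first.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of streaming output; not docjure's or Metabase's API.
class StreamingCsvSketch {
    // Quote a field only when it contains a comma, a quote, or a line break.
    static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Pull rows one at a time and write each immediately; nothing is
    // accumulated, so only the current row is held in memory.
    static long writeRows(Iterator<String[]> rows, Writer out) {
        long count = 0;
        try {
            while (rows.hasNext()) {
                String[] row = rows.next();
                for (int i = 0; i < row.length; i++) {
                    if (i > 0) out.write(',');
                    out.write(escape(row[i]));
                }
                out.write('\n');
                count++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Stand-in for a lazy result set: rows are generated on demand.
    static Iterator<String[]> generatedRows(final int n) {
        return new Iterator<String[]>() {
            private int i = 0;
            public boolean hasNext() { return i < n; }
            public String[] next() {
                if (i >= n) throw new NoSuchElementException();
                String[] row = { Integer.toString(i), "name," + i };
                i++;
                return row;
            }
        };
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        long written = writeRows(generatedRows(3), out);
        System.out.print(out);
        System.out.println("rows written: " + written);
    }
}
```

In a real export the `Writer` would wrap the HTTP response stream rather than a `StringWriter`, which is used here only to keep the example self-contained.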

thucnc commented Oct 6, 2017

Hello @salsakran, the docjure library causing this issue hasn't been fixed yet, so are we still waiting for it?

hai-ld commented Jan 5, 2018

Is there any workaround for this issue? Can we limit the number of rows in one export? Or can we selectively disable XLSX export?

thucnc commented Mar 17, 2018

We are still suffering from this issue (many non-tech staff still need to download results), so this bug is really annoying. Is there any workaround?

Contributor

jornh commented Mar 18, 2018

@thucnc, @hai-ld it looks like the PR mjul/docjure#47, which adds streaming support, got stuck on missing test code demonstrating that it supports existing features of the docjure library. I think that’s a fair request from the docjure maintainer.

Until that fix can happen (which in turn will hopefully also improve .xlsx export memory consumption in Metabase), here are a few workaround suggestions:

  • Buy more memory
  • Teach users to apply (stricter) filters - or reduce number of columns - or export as CSV
  • Look at your business processes and eliminate or reduce the need for .xlsx exports. It is IMO a sign of suboptimal processes if you have to manually move data. Maybe you can bring the Excel consumers to Metabase. Maybe you can automate import of data in another system if that is what the Excel files are used for.