Recovery.db.mv.db size crashes Mist #339

Closed
utkarshsaraf19 opened this issue Nov 6, 2017 · 9 comments
utkarshsaraf19 commented Nov 6, 2017

I have set up a VM with the following configuration: Red Hat 7.4, 4 GB RAM.

I have observed that the size of Recovery.db.mv.db grows as I run more jobs, which is expected.

Mist crashes with a Java heap space error once the file reaches about 37 MB.

I would like to understand why. Is the whole file being loaded by the browser or by Mist itself, and what configuration changes/factors should I keep in mind while deploying it?

dos65 commented Nov 6, 2017

Does it always crash after restart?

utkarshsaraf19 commented Nov 6, 2017

Yes. Once this starts, it never recovers on its own. I have to delete Recovery.db.mv.db manually to make it work again.

dos65 commented Nov 6, 2017

As a workaround you could increase Xmx for the Mist master process.
But we should definitely reconsider the way we store job history.
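For illustration, a minimal sketch of that workaround, assuming the master is launched through a start script that picks up extra JVM options from an environment variable; the variable name and script path below are assumptions about the deployment, not Mist's documented interface:

```
# Assumed setup: give the master JVM a larger heap before starting Mist.
export JAVA_OPTS="-Xmx2g"
./bin/mist-master start
```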

dos65 added the bug label Nov 6, 2017
utkarshsaraf19 commented Nov 6, 2017

I have some suggestions for this:

a. Bucketing the recovery DB based on the number of jobs and their output size.
b. Storing the recovery DB in folders named after the endpoint.
c. Showing it in the front end with a pagination feature, so one page only uses one recovery DB at a time (see the sketch after this list).
d. A separate JVM for history logs.
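For suggestion (c), a minimal sketch of what a paginated history store could look like; `JobRecord` and `HistoryStore` are illustrative names, not Mist's actual API:

```scala
// Illustrative only: the front end requests one page of job history at a time,
// so only `limit` records are materialised in the master's heap per request.
case class JobRecord(id: String, endpoint: String, status: String)

trait HistoryStore {
  def page(offset: Int, limit: Int): Seq[JobRecord]
  def countByEndpoint(endpoint: String): Long
}
```

A SQL-backed implementation can serve `page` with LIMIT/OFFSET, so a single UI request never needs the whole recovery DB in memory.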

dos65 commented Nov 6, 2017

Yes, bucketing or limiting could help.
I think we should at least provide a way to configure history storage (e.g. use an external database).

There is also a more complicated question: should we continue storing job results inside the database? They can be very large, and keeping them in a database may be inefficient.

@spushkarev @mkf-simpson - this may be interesting for you: if we find another way to store job results, it could become possible to build pipelines over datasets with that feature.
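To make that idea concrete, here is a minimal sketch of the alternative being discussed: keep only metadata in the history database and write the potentially large result payload to external storage, storing just a reference. All names here are illustrative assumptions, not Mist's implementation:

```scala
import java.nio.file.{Files, Path}

// Only this small record would live in the history database.
case class JobResultRef(jobId: String, resultPath: String)

// The large payload goes to a file (or an object store) outside the database,
// so reading job history never pulls whole result blobs into the master's heap.
def persistResult(jobId: String, payload: Array[Byte], dir: Path): JobResultRef = {
  val target = dir.resolve(s"$jobId.result.json")
  Files.write(target, payload)
  JobResultRef(jobId, target.toString)
}
```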

mkf-simpson commented:

I don't quite understand how job history relates to pipelines.

dos65 commented Nov 6, 2017

To invoke pipeline stages on different Spark contexts, we need to store job results somewhere.

mkf-simpson commented:

Ok, MistWarehouse? :) But this discussion is for another ticket I guess.

utkarshsaraf19 commented Nov 7, 2017

@mkf-simpson @dos65 @spushkarev Whichever way you choose, one thing should be kept in mind: the job should not be redeployed every time we hit its endpoint because of this separation. As of now, it takes 25 seconds for the first run and less than 2 seconds for subsequent runs on the same endpoint, which is better for production use.

dos65 closed this as completed Jul 30, 2019