
[Bug] ElasticSearch Backup #15

Closed
toxisch opened this issue Nov 25, 2019 · 14 comments
toxisch commented Nov 25, 2019

On large DBs the backup does not work, where "large" is relative: sometimes a little data plus system data is enough to make the backup fail.

I also experimented with the sources and limited the elasticdump call to single indices. In that case the backup works fine, so I assume the failure is related to the size of the backup.

It is also not a timeout issue: I ran my tests with a 10-hour timeout.

@toxisch toxisch changed the title ElasticSearch Backup [Bug] ElasticSearch Backup Nov 25, 2019
@JamesClonk (Member)

@toxisch In what way exactly does the backup fail? Do you have any error messages in the app log? Does the app run out of memory, perhaps? Does it work if you run elasticdump yourself manually against your DB?
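For reference, a manual run could be sketched like this (hypothetical host, credentials, and index name; substitute your own service binding values, and elasticdump must be installed, e.g. via npm):

```sh
# Dump a single index to a local gzip file, roughly mirroring what
# backman does internally ("--output=$" writes the dump to stdout).
elasticdump --quiet \
  --input=https://user:password@my-es-host.example.com:443/my-index \
  --output=$ | gzip > my-index.gz
```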


toxisch commented Dec 20, 2019

Hi @JamesClonk, sorry for the long response time. There is no difference between automatic and manual backup, and there is no memory problem either. Here is an Elasticsearch backup log; it ends with the S3 upload being canceled. But S3 itself works fine: Mongo and Maria backups work on this system without problems.

```
Dec 19, 2019 @ 16:15:09.812 level=error msg="could not upload service backup [analytics-els] to S3: Put https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx: context canceled"
Dec 19, 2019 @ 16:15:09.778 level=error
Dec 19, 2019 @ 16:15:09.778 level=error msg="requested backup for service [analytics-els] failed: elasticdump: signal: killed"
Dec 19, 2019 @ 16:15:09.778 level=error msg="could not backup service [analytics-els]: elasticdump: signal: killed"
Dec 19, 2019 @ 15:27:48.878 level=debug msg="upload S3 object [elasticsearch/analytics-els/analytics-els_20191219152748.gz]"
Dec 19, 2019 @ 15:27:48.877 level=debug msg="executing elasticsearch backup command: elasticdump --quiet --input=https://xxxxxxxxxxxxxxxxxxxxxxxxxx --output=$"
```


pvolkemer commented Mar 5, 2020

Hi guys, I'm having the same issue trying to back up my elastic instances with backman to an S3 storage.
The log says this:

```
2020-02-11T13:54:41.31+0100 [APP/PROC/WEB/0] OUT level=debug msg="executing elasticsearch backup command: elasticdump --quiet --input=https://full-access-<Username>:<Password>@<ElasticID>.<mydomain> --output=$"
2020-02-11T13:54:41.31+0100 [APP/PROC/WEB/0] OUT level=debug msg="upload S3 object [elasticsearch/elasticsearch-dev/elasticsearch-dev_20200211135441.gz]"
```

But the upload to S3 never completes, nor is anything stored there at all.


eatsan commented May 4, 2020

Hi everyone,

I am having a similar issue with the Elasticsearch backups in backman. In my case, I found out that there are (by default) seven system indices named ".monitoring-es-7-YYYY.MM.DD" that are used by the ES "Stack monitoring" feature (https://www.elastic.co/guide/en/kibana/current/xpack-monitoring.html), which is part of X-Pack. Each of these seven indices was relatively large in my case (about 1M docs and 1.1 GB each), and elasticdump was taking quite some time to read and gzip them. Meanwhile, CPU usage was around 4% and memory usage was relatively low (~100 MB). So I am planning to test the --concurrency and --limit options of the elasticdump executable to see how much benefit they bring.

Are you also using an ES with the Stack monitoring feature? Could this issue be affecting you as well?
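A sketch of such a test run, with a hypothetical endpoint and tuning values (--limit controls the batch size per request, --concurrency the number of requests in flight; both are elasticdump options, but useful values depend on the cluster):

```sh
# Dump one of the monitoring indices with larger batches and
# several concurrent requests, then gzip the result locally.
elasticdump --quiet \
  --limit=10000 \
  --concurrency=4 \
  --input=https://user:password@my-es-host.example.com:443/.monitoring-es-7-2020.05.04 \
  --output=$ | gzip > monitoring-es-7.gz
```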

@pvolkemer

To me it seems elasticdump is never even started, or it ends immediately. From my two log lines above you can see that "executing elasticsearch backup command:" and "upload S3 object" happen at the exact same time.
@JamesClonk, is there anything we/you can do about this?
Also, my S3 instance gets the service name dynstrg-2, so I need to configure this in my backman config.
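If I read backman's config layout correctly (a sketch, not verified against the docs), the S3 service binding can be selected via a service_label entry in the JSON config:

```json
{
  "s3": {
    "service_label": "dynstrg-2"
  }
}
```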


toxisch commented Jul 15, 2020

Hi @pvolkemer , for me the problem was solved by using the latest backman version.

@pvolkemer

@toxisch Your problem was different from mine. In my case, elasticdump doesn't seem to do anything, so there is nothing that could be uploaded to S3.
In your case, elasticdump seemed to create some file but the upload to S3 failed.
I tried with 1.15.0 but it still doesn't do anything.

@somehowchris

I do not really need backman to back up an ES instance, since our parser Vector (logstash is a waste for parsing logs, in my opinion) can also push logs to other destinations.

Now, because some colleagues chose an oversized instance, I was forced to downgrade, which would mean 1. backing up, 2. recreating the service, and 3. replaying the backup. But it seems that service never got backed up. There seems to be no config for the ES instance, yet backman should give it a default cron schedule.

Did that ever work? @JamesClonk

akovov commented Apr 19, 2021

I am constantly getting this error:

```
level=error msg="could not backup service [*********]: elasticdump: timeout: context deadline exceeded"
```

I have currently changed the timeout to 7 days. But could backman be improved to give the ability to choose between a complete backup and only the indices matching some pattern? It should not be too hard, as the underlying elasticdump supports this.
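For the timeout part, backman allows per-service settings in its JSON config. A sketch with assumed values (the service name and schedule are placeholders, and the exact keys should be checked against the backman README):

```json
{
  "services": {
    "analytics-els": {
      "schedule": "0 0 2 * * *",
      "timeout": "168h"
    }
  }
}
```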

akovov commented Apr 22, 2021

It failed before the 7-day timeout was reached. @JamesClonk, any suggestions on when this could be fixed?

@JamesClonk (Member)

I've created a new release, https://github.com/swisscom/backman/releases/tag/v1.28.0, which adds a direct_s3 configuration option for Elasticsearch backups. This makes elasticdump stream directly from/to S3 itself instead of going through backman internally. Maybe this helps solve the problem; you could try enabling it in your configs.

Unfortunately, I do not use Elasticsearch myself, and there are currently no integration tests for it in the CI workflow either. I can't test or support it if it does not work.
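Assuming the option is set per service like other Elasticsearch settings (a sketch; the exact nesting should be checked against the v1.28.0 release notes and README), enabling it could look like:

```json
{
  "services": {
    "analytics-els": {
      "direct_s3": true
    }
  }
}
```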

akovov commented May 2, 2021

I checked, and for me it worked as expected. I haven't checked with big volumes yet, but I plan to do so at the beginning of June.

akovov commented Nov 2, 2021

It has been working well for the last half year.

@JamesClonk (Member)

thanks 👍️

michaelbeutler pushed a commit to michaelbeutler/backman that referenced this issue Mar 28, 2024