
Recover data after docker-compose down #37

Closed
pedropalb opened this issue Apr 1, 2020 · 13 comments


@pedropalb

Hello!

To get rid of the error below, I ran docker-compose -f .\docker-compose-win10.yml down and then docker-compose -f .\docker-compose-win10.yml up -d.

Failed logging task to backend (2 lines, <500/100: events.add_batch/v1.0 (General data error: err=('2 document(s) failed to index.', [{'index': {'_index': 'events-log-d1bd92a3b039400cbafc60a7a5b1e52b', '_type': 'event', '_id': '45d72e8d79724292a7f7b0a5f58fb681', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0]] containing [2] requests and a refresh]'}}}, {'index': {'_index': 'events-log-d1bd92a3b039400cbafc60a7a5b1e52b', '_type': 'event', '_id': 'e61e98adaff94753afb633ef67afc017', 'status': 503, 'error': {'type': 'unavailable_shards_exception', 'reason': '[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0]] containing [2] requests and a refresh]'}, 'data': {'timestamp': 158576280385, 'type': 'log', 'task': '63f6480b0a9a4d078a80c8748f27fc65', 'level': 'info', 'worker': 'pa-barbosa01', 'msg': 'Train for 10 steps, validate for 263 steps\nEpoch 1/5', '@timestamp': '2020-04-01T17:40:04.236Z', 'metric': '', 'variant': ''}}}]), extra_info=[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[events-log-d1bd92a3b039400cbafc60a7a5b1e52b][0]] containing [2] requests and a refresh])>)

After that, I can't see any of the experiments I had run until then. I thought the services' containers had all their data mapped into the host filesystem (c:\opt\trains in my case).

How can I recover my experiments' data?

@bmartinn
Member

bmartinn commented Apr 1, 2020

Hi @pedropalb,

The services' containers indeed have their data folders mapped to the host file system... This might have something to do with an issue during the Elasticsearch container initialization - can you please share the container's log? You can get the log using the following command:

$ docker logs trains-elastic

@pedropalb
Author

It seems to be a problem with free disk space. But I believe that 23.2 GB (the current free space on my disk) should be enough. The used space in c:\opt\trains is only 250 MB, and the space allocated for the Docker disk image is 16 GB (of which only 3.1 GB is used). The log file follows:

trains-elastic-logs.txt

@bmartinn
Member

bmartinn commented Apr 1, 2020

Hi @pedropalb,

By default, the high watermark is 90% (see here), so the question is not how much free space you have on your disk, but what percentage of the disk is used - try freeing up some space to see if it helps.
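
If you want to see the percentages Elasticsearch itself calculates, one option (a sketch, assuming the Elasticsearch HTTP port is published on localhost:9200) is the cat allocation API, which reports disk.used, disk.avail and disk.percent per node:

$ curl -s "http://localhost:9200/_cat/allocation?v"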

Alternatively, you can configure the Elasticsearch container with a different watermark, set either as a percentage or as a hard-coded number of bytes - simply edit your docker-compose file and add a new line under the services / elasticsearch / environment section:

services:
  ...
  elasticsearch:
    ...
    environment:
      ...
      cluster.routing.allocation.disk.watermark.high: "15gb"

In the example above, Elasticsearch will hit the high watermark only when you have less than 15 gigabytes free on your disk.

Please let me know if that works for you :)

@pedropalb
Author

Oh, I see! I managed to free some space and it seems to have fixed the Elasticsearch issue. But still, all my data is gone after the docker-compose down and up.

Do we have to create a new user and credentials every time we restart the server? Can't I recover my data anymore?

trains-elastic-logs.txt

@bmartinn
Member

bmartinn commented Apr 2, 2020

Hi @pedropalb,

Do we have to create a new user and credentials every time we restart the server? Can't I recover my data anymore?

The user and credentials are stored in the configuration files, not in the Elasticsearch data - did you lose those as well?

Regarding Elasticsearch, the data should still be there - can you find and send the directory contents of the nodes folder inside the Elasticsearch data folder? It should be located in c:/opt/trains/data/elastic/nodes or thereabouts.
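
For example, from a Windows command prompt (adjust the path if your data folder is elsewhere):

dir /s c:\opt\trains\data\elastic\nodes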

@pedropalb
Author

It seems the Elasticsearch data is still there in the path you mentioned, but MongoDB is almost empty. I tried to query tasks, projects, users, etc. Everything is empty except the user collection, which has only the newest user I created. What goes to MongoDB and what goes to Elasticsearch?

I didn't lose the configuration file, but it only has the old credentials. I didn't specify a user in the config file; I did that through the Web UI. So after the restart, I had to create new credentials and replace them in the config file. With the old credentials I couldn't even use the APIClient().

@bmartinn
Member

bmartinn commented Apr 2, 2020

Hi @pedropalb,

You are correct in assuming that tasks, projects, etc. (including user credentials) are stored in MongoDB.
I now realize that your MongoDB data has somehow disappeared, which is very strange - I previously thought the issue was only Elastic-related.

Can you please share the trains-mongo docker container log?
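You can get it the same way as before:

$ docker logs trains-mongo
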
Also, can you see if there's anything in your C:\Users\Public\Documents\Hyper-V\Virtual hard disks folder? Maybe a file or sub-folder named mongodata?

A few other thoughts:

  1. Is it possible that your docker-compose file was somehow modified so that the mount point for the mongodb data folder changed? Did you update the docker-compose file or download a new one?
  2. Did you upgrade your Docker Desktop? To use Linux containers, you usually need to add mapped drives to the Shared Drives list in the Docker Desktop settings. However, Docker Desktop seems to be inconsistent about this feature: I can't find this setting anymore in the Docker Desktop settings page, but their troubleshooting page still says it's required (see Troubleshoot, under VOLUME MOUNTING REQUIRES SHARED DRIVES FOR LINUX CONTAINERS - the link there now points to nowhere...). If the mount silently failed (it shouldn't, but still), what you're seeing now is probably the result of the mongo data folder not being mapped outside of the docker container, in which case mongo simply creates a new, empty database inside the container that has nothing to do with the outside world - one way to check the mounts is sketched below.
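
A quick way to verify what is actually mounted into the container (assuming the container is named trains-mongo, as above):

$ docker inspect -f "{{ json .Mounts }}" trains-mongo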

@pedropalb
Author

Here is the MongoDB log:
trains-mongo-logs.txt

There is nothing in C:\Users\Public\Documents\Hyper-V\Virtual hard disks.

  1. I did docker-compose down, downloaded a newer docker-compose file, and did docker-compose up. But now, without changing the docker-compose file, every time I restart the server all the data is gone (I tested by creating a new project and restarting the server).

  2. I upgraded Docker. I'm using Docker Desktop 2.2.0.4 with Docker Engine 19.03.8. I believe this volume mapping is the source of my problems. I will check it.

Thanks!

@pedropalb
Author

pedropalb commented Apr 3, 2020

Hi @bmartinn!

The issue is the volume mapping for the MongoDB service in the docker-compose file. By default, MongoDB writes its data to /data/db, but in the docker-compose file we have mongodata:/data. The mount at /data captures no data at all and, consequently, neither does the volume trains_mongodata. When the container dies, all the mongo data in /data/db dies with it.

So the solution - I hope - is to change the mapping from mongodata:/data to mongodata:/data/db, so that MongoDB's records are mapped to the volume trains_mongodata.
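
For reference, the intended mapping would look something like this (a sketch following the structure of the watermark example above; assuming the service is named mongo as in the win10 compose file, with other keys elided):

services:
  ...
  mongo:
    ...
    volumes:
      - mongodata:/data/db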

I noticed that docker-compose-win10.yml has this issue but docker-compose.yml does not. The latter has no volume named trains_mongodata; it maps /data/db directly to a host directory, as is done with the other data sources.

Would this change impact any other TRAINS functionality?

Thanks!

@bmartinn
Member

bmartinn commented Apr 6, 2020

Hi @pedropalb,

I just found where the shared drives feature was moved to in the new Docker Desktop: please go to Settings / Resources / File Sharing and make sure your drive (the C drive, I assume) is marked for file sharing - let me know if that changes anything.

@pedropalb
Author

Hi @bmartinn,
I figured out what the problem was and reported the solution in my previous comment above.
Thanks.

@pedropalb
Author

Hi,
I noticed that in the docker-compose-win10.yml file, the volume mapping of the mongo container is still using /data. Have you tested this on Windows? As I reported above (#37 (comment)), it only worked for me after changing - mongodata:/data to - mongodata:/data/db.
https://github.com/allegroai/trains-server/blob/3bf5126d84a67e7dec193bec0f6eff165e25665f/docker-compose-win10.yml#L90

Is there any other place where you change Mongo's default data directory to /data?
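
One way to check from the outside (a sketch; mongod's data directory would show up as a --dbpath argument if it were overridden in the container's command):

$ docker inspect -f "{{ json .Config.Cmd }}" trains-mongo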

@pedropalb pedropalb reopened this Jun 16, 2020
@bmartinn
Member

bmartinn commented Jun 16, 2020

Hi @pedropalb
It seems the new Docker for Windows removed the need for a specific volume for the mongodb docker (which was the reason for mapping the parent /data folder instead of /data/db and /data/configdb). The docker-compose file for Windows has now been updated with a similar fix.
Let me know if the issue persists.
