
Error: Backend down when using gateway to aws s3 #6934

Closed
tomerzel87 opened this issue Dec 6, 2018 · 12 comments

Comments

@tomerzel87

tomerzel87 commented Dec 6, 2018

When writing multiple objects to a minio instance acting as a gateway to AWS S3 with a local cache, we get the following errors:

API: PutObject(bucket=data, object=agents/group-ADCDPL/agent-S0W1/resources/chunks/full_vol:DISK01:2018-12-06_114645.676Z:2805a94d-be66-47a9-b726-df3w82f57188:00001911.chunk.bin)
Time: 12:01:46 UTC 12/06/2018
RequestID: 156DBCAEBF30D7AF
RemoteHost: 192.1.1.2
UserAgent: jclouds/2.1.0 java/1.8.0_181
Error: Backend down
1: cmd/fs-v1-helpers.go:349:cmd.fsCreateFile()
2: cmd/disk-cache-fs.go:396:cmd.(*cacheFSObjects).PutObject()
3: cmd/disk-cache-fs.go:277:cmd.(*cacheFSObjects).Put()
4: cmd/disk-cache.go:696:cmd.cacheObjects.PutObject.func2()

The local cache is not close to being full.

Your Environment

  • Version used (minio version): Version: 2018-11-22T02:51:56Z
  • Using minio docker container
  • Operating System and version (uname -a): Linux ubuntu16 4.4.0-139-generic
@nitisht nitisht added this to the Next Release milestone Dec 6, 2018
@kannappanr
Contributor

@tomerzel87 We will try to reproduce this issue and get back to you with more questions.

@poornas
Contributor

poornas commented Dec 6, 2018

@tomerzel87, it looks like you lost network connectivity to the S3 backend while this object was being written. Cache writes occur simultaneously with backend writes, so a network issue would cause the cache write to fail as well. Is this happening consistently on all uploads, or is it a one-off issue?
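
For context, here is a minimal sketch of that write-through idea (illustrative only, not MinIO's actual code; putWithCache, the backend writer, and the cache path are made up for this example). Because both destinations consume the same stream, a stalled backend upload surfaces as a failed PutObject even when the cache disk itself is healthy.

// Illustrative write-through sketch, not MinIO's implementation.
// The object body is streamed to the backend and the cache at once,
// so a network stall on the backend side fails the whole PutObject.
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

func putWithCache(body io.Reader, backend io.Writer, cachePath string) error {
	f, err := os.Create(cachePath)
	if err != nil {
		return err
	}
	defer f.Close()

	// MultiWriter fans the same bytes out to both destinations; if the
	// backend write blocks or errors, the copy fails and the partial
	// cache file is discarded.
	if _, err := io.Copy(io.MultiWriter(backend, f), body); err != nil {
		os.Remove(cachePath)
		return fmt.Errorf("backend down: %w", err)
	}
	return nil
}

func main() {
	err := putWithCache(strings.NewReader("object data"), io.Discard, "/tmp/cache-object.bin")
	fmt.Println("put result:", err)
}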

@tomerzel87
Author

@poornas we wrote several blobs to AWS, which caused our internet connection to become a bit slow.
I'd assume that if there is an internet problem, or the connection is congested, minio would write the blobs to AWS asynchronously once the line is clear. Does that sound reasonable?

@poornas
Contributor

poornas commented Dec 10, 2018

@tomerzel87, the underlying minio-go client is configured by default to retry the upload up to 5 times before it returns the Backend Down error.

@tomerzel87
Author

@poornas I can tell you that we tested both with minio as a gateway to AWS and without it.
When we ran without minio, the writes to AWS were fine and no timeouts occurred.
We only got the timeouts once we set up minio as a gateway to see whether it could help us. Also, what are the default timeouts?

@poornas
Contributor

poornas commented Dec 11, 2018

@tomerzel87, the retry starts at 30 seconds with exponential backoff. You mentioned that writes to AWS without minio as a gateway had no timeouts - was that also the case with the cache turned off? To help debug further, can you paste your cache settings and the command you use to start the gateway with docker?
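
To make that concrete, here is a rough sketch of a retry loop with exponential backoff (illustrative only; the real logic lives inside the minio-go client, and the helper name, initial delay, and attempt count here are assumptions taken from this thread).

// Illustrative retry-with-backoff sketch, not the minio-go client code.
package main

import (
	"errors"
	"fmt"
	"time"
)

// putWithRetry retries the given upload function, doubling the delay
// between attempts, and gives up with a "backend down" style error.
func putWithRetry(put func() error, initialDelay time.Duration, maxAttempts int) error {
	delay := initialDelay
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = put(); err == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2 // exponential backoff
		}
	}
	return fmt.Errorf("backend down after %d attempts: %w", maxAttempts, err)
}

func main() {
	// Short delays here just to keep the demo fast; the thread above
	// mentions a 30 second starting delay in the real client.
	err := putWithRetry(func() error { return errors.New("connection reset") },
		10*time.Millisecond, 5)
	fmt.Println(err)
}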

@tomerzel87
Author

@poornas our cache settings:
MINIO_CACHE_EXPIRY=30
MINIO_CACHE_MAXUSE=75

I've mounted an external directory as the caching directory and it had ~150GB of free space to utilize.

@poornas
Contributor

poornas commented Dec 13, 2018

@tomerzel87, I tried to reproduce this issue by starting the gateway with a docker container like so:

docker run -p 9000:9000  -e "MINIO_ACCESS_KEY=XXXXX" -e "MINIO_SECRET_KEY=XXXXXXX"  -e "MINIO_CACHE_DRIVES=/var/log/testcache" -v ~/testcache:/var/log/testcache -e "MINIO_CACHE_EXPIRY=30" -e "MINIO_CACHE_MAXUSE=75" minio/minio gateway s3

and ran some automated tests that upload objects to the gateway. I was able to simulate the error you are seeing by killing my network connection - after the retries, the client fails with Object storage backend is unreachable.

-- server log --
API: PutObject(bucket=aws-sdk-ruby-bucket-00499837ac00, object=datafile-1-MB)
Time: 07:05:05 UTC 12/13/2018
RequestID: 156FD2875255414E
RemoteHost: 172.17.0.3
UserAgent: aws-sdk-ruby3/3.43.0 ruby/2.3.1 x86_64-linux-gnu aws-sdk-s3/1.30.0
Error: Backend down
       1: cmd/fs-v1-helpers.go:349:cmd.fsCreateFile()
       2: cmd/disk-cache-fs.go:396:cmd.(*cacheFSObjects).PutObject()
       3: cmd/disk-cache-fs.go:277:cmd.(*cacheFSObjects).Put()
       4: cmd/disk-cache.go:696:cmd.cacheObjects.PutObject.func2()


-- client log --
(1/3) Running aws-sdk-ruby tests ... FAILED in 2 minutes and 37 seconds
{
  "name": "aws-sdk-ruby",
  "function": "putObject(bucket_name, file)",
  "args": {
    "bucket_name": "aws-sdk-ruby-bucket-00499837ac00",
    "file": "/mint/data/datafile-1-MB"
  },
  "duration": 153298,
  "error": "Object storage backend is unreachable",
  "status": "FAIL"
}

Executed 0 out of 3 tests successfully.

Can you run tcpdump while uploading to the gateway with the cache on, and attach the packet trace? Also, which client is being used for uploading to the gateway?
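
For example, something along these lines would capture the relevant traffic, assuming the gateway listens on port 9000 and talks to S3 over 443 (adjust the interface, ports, and output file to your setup):

tcpdump -i any -w gateway-cache.pcap 'port 9000 or port 443'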

@tomerzel87
Author

@poornas It will take me a while to reproduce this in our lab due to other responsibilities.
Regarding the client, it is part of a test program that uses Apache JClouds.

@poornas
Contributor

poornas commented Jan 7, 2019

@tomerzel87, were you able to reproduce this with tcpdump enabled?

@deekoder
Contributor

PUTs will no longer be cached, so we will close this as a won't-fix. @tomerzel87, let us know your thoughts.

@harshavardhana
Member

We discussed this again; this time we are going to cache PUTs, but we will be doing it with a new backend format.

poornas pushed a commit to poornas/minio that referenced this issue Jul 26, 2019
only on download.

Fixes: minio#7458, minio#7573, minio#6265, minio#6630, minio#7938 and minio#6934

This will allow cache to consistently work for
server and gateways. Range GET requests will
be cached in the background after the request
is served from the backend.

- All cached content is automatically bitrot protected.

- Avoid ETag verification if cache-control header
is set and cached content is still valid.

- This PR changes the cache backend format, and all existing
content will be migrated to the new format. Until the data is
migrated completely, all content will be served from the backend.
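
Roughly, the behaviour described in that commit message looks like the following sketch (simplified, not the actual MinIO code; fetch, client, and cachePath are placeholders). The GET is served straight from the backend, and the cache is filled afterwards in the background, so cache problems can no longer fail the request.

// Simplified sketch of cache-on-download, not MinIO's implementation.
package main

import (
	"io"
	"os"
	"strings"
	"sync"
)

// getAndCache serves the object to the client directly from the backend,
// then fills the local cache in a background goroutine. Any cache error
// is ignored and never affects the response that was already sent.
func getAndCache(fetch func() io.Reader, client io.Writer, cachePath string, wg *sync.WaitGroup) error {
	// Serve the request to the client straight from the backend stream.
	if _, err := io.Copy(client, fetch()); err != nil {
		return err
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		f, err := os.Create(cachePath)
		if err != nil {
			return // cache stays cold; the client is unaffected
		}
		defer f.Close()
		io.Copy(f, fetch()) // re-read the object from the backend into the cache
	}()
	return nil
}

func main() {
	var wg sync.WaitGroup
	fetch := func() io.Reader { return strings.NewReader("object bytes\n") }
	_ = getAndCache(fetch, os.Stdout, "/tmp/cached-object.bin", &wg)
	wg.Wait() // wait for the background cache fill in this demo
}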
poornas pushed further commits to poornas/minio referencing this issue on Jul 30, Aug 6, Aug 8, and Aug 9, 2019, with the same commit message as above.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 10, 2022