fix: storage info output with unordered endpoints arguments #9610

Merged
merged 2 commits into from May 19, 2020

Conversation

vadmeste
Member

Description

Shuffling the arguments passed to the MinIO server is supported. However,
when that happens, Prometheus reports wrong disk usage and wrong
online/offline status.

The commit fixes the issue by no longer relying on xl.endpoints, since
it is not ordered.
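
To illustrate the class of bug described above, here is a minimal Go sketch. It is not the code from this PR: the `disk` type and both helper functions are hypothetical, and it only assumes that each disk handle can report its own endpoint. Pairing a separately kept endpoint list with disks by position goes wrong as soon as the two orders differ, while reading the endpoint from the disk itself does not depend on argument order.

```go
package main

import "fmt"

// disk is a hypothetical stand-in for a storage handle that knows its own endpoint.
type disk struct {
	endpoint string
	online   bool
}

// statusByIndex pairs a separately kept endpoint list with disks by position.
// It is only correct when both slices share the same order (and length).
func statusByIndex(endpoints []string, disks []disk) map[string]bool {
	out := make(map[string]bool, len(disks))
	for i, d := range disks {
		out[endpoints[i]] = d.online
	}
	return out
}

// statusByDisk reads the endpoint from each disk, so argument order does not matter.
func statusByDisk(disks []disk) map[string]bool {
	out := make(map[string]bool, len(disks))
	for _, d := range disks {
		out[d.endpoint] = d.online
	}
	return out
}

func main() {
	// node2 is offline; the endpoint arguments were passed in shuffled order.
	disks := []disk{{"http://node1/data", true}, {"http://node2/data", false}}
	shuffled := []string{"http://node2/data", "http://node1/data"}

	fmt.Println("by index:", statusByIndex(shuffled, disks)["http://node2/data"]) // reports online, which is wrong
	fmt.Println("by disk: ", statusByDisk(disks)["http://node2/data"])            // reports offline, which is right
}
```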

Motivation and Context

Fixes #9512

How to test this PR?

  1. export MINIO_PROMETHEUS_AUTH_TYPE="public"
  2. Deploy a four-node MinIO cluster
  3. Restart the cluster while shuffling the arguments
  4. curl http://localhost:9002/minio/prometheus/metrics 2>/dev/null | grep minio_disk | grep -v '^#'

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • Fixes a regression (If yes, please add commit-id or PR # here)
  • Documentation needed
  • Unit tests needed
  • Functional tests needed (If yes, add mint PR # here: )

cmd/xl-v1.go: review thread resolved
@vadmeste vadmeste force-pushed the fix-when-endpoints-unordered branch 2 times, most recently from c0d8e10 to 50b1746 Compare May 15, 2020 17:32
@kannappanr kannappanr requested a review from donatello May 15, 2020 18:02
@vadmeste vadmeste force-pushed the fix-when-endpoints-unordered branch from 50b1746 to 8948520 Compare May 15, 2020 19:47
@minio-trusted
Contributor

Mint Automation

Test Result
mint-xl.sh ✔️
mint-large-bucket.sh ✔️
mint-dist-xl.sh ✔️
mint-gateway-s3.sh ✔️
mint-gateway-azure.sh ✔️
mint-fs.sh failed (see log below)
mint-gateway-nas.sh failed (see log below)

9610-412b0db/mint-gateway-nas.sh.log:

Running with
SERVER_ENDPOINT:      minio-dev6.minio.io:30559
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp e4a68b3cffc9:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 0 seconds
(2/15) Running aws-sdk-java tests ... done in 1 seconds
(3/15) Running aws-sdk-php tests ... done in 42 seconds
(4/15) Running aws-sdk-ruby tests ... done in 1 seconds
(5/15) Running awscli tests ... done in 2 minutes and 5 seconds
(6/15) Running healthcheck tests ... done in 0 seconds
(7/15) Running mc tests ... FAILED in 3 seconds
{
  "name": "mc",
  "duration": "370",
  "function": "test_get_object",
  "status": "FAIL",
  "error": "/mint/run/core/mc/mc --config-dir /tmp/.mc-23852 --quiet --no-color cp myminio/mc-test-bucket-30718/mc-test-object-11619 mc-test-object-11619.downloaded  >>> \n`myminio/mc-test-bucket-30718/mc-test-object-11619` -> `mc-test-object-11619.downloaded`\nmc: <ERROR> Failed to copy `http://minio-dev6.minio.io:30559/mc-test-bucket-30718/mc-test-object-11619`. read tcp 172.17.0.4:33270->72.28.97.55:30559: read: connection reset by peer\nTotal: 0 B, Transferred: 384.00 KiB, Speed: 2.51 MiB/s"
}
(7/15) Running minio-dotnet tests ... done in 33 seconds
(8/15) Running minio-go tests ... done in 1 minutes and 13 seconds
(9/15) Running minio-java tests ... done in 34 seconds
(10/15) Running minio-js tests ... done in 35 seconds
(11/15) Running minio-py tests ... done in 1 minutes and 17 seconds
(12/15) Running s3cmd tests ... done in 19 seconds
(13/15) Running s3select tests ... done in 1 seconds
(14/15) Running security tests ... done in 0 seconds

Executed 14 out of 15 tests successfully.

9610-412b0db/mint-fs.sh.log:

Running with
SERVER_ENDPOINT:      minio-dev7.minio.io:31704
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp a9ef7f01ef9d:/mint/log /tmp/mint-logs'

(1/15) Running aws-sdk-go tests ... done in 0 seconds
(2/15) Running aws-sdk-java tests ... done in 1 seconds
(3/15) Running aws-sdk-php tests ... done in 41 seconds
(4/15) Running aws-sdk-ruby tests ... done in 2 seconds
(5/15) Running awscli tests ... done in 2 minutes and 3 seconds
(6/15) Running healthcheck tests ... done in 0 seconds
(7/15) Running mc tests ... done in 29 seconds
(8/15) Running minio-dotnet tests ... done in 35 seconds
(9/15) Running minio-go tests ... done in 47 seconds
(10/15) Running minio-java tests ... done in 21 seconds
(11/15) Running minio-js tests ... done in 41 seconds
(12/15) Running minio-py tests ... FAILED in 55 seconds
{
  "name": "minio-py:test_thread_safe",
  "function": "put_object(bucket_name, object_name, data, length, content_type, metadata, sse, progress, part_size)",
  "args": {
    "bucket_name": "minio-py-test-2f184f1f-882e-4bf9-be6e-c040944bd22d",
    "object_name": "e3bba8c2-3936-4d2c-8ac2-6e1ffae97842"
  },
  "duration": 440,
  "message": "(\"Connection broken: ConnectionResetError(104, 'Connection reset by peer')\", ConnectionResetError(104, 'Connection reset by peer'))",
  "error": "Traceback (most recent call last):\n  File \"/mint/run/core/minio-py/tests.py\", line 1638, in test_thread_safe\n    raise exceptions[0]\nException: (\"Connection broken: ConnectionResetError(104, 'Connection reset by peer')\", ConnectionResetError(104, 'Connection reset by peer'))\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/mint/run/core/minio-py/tests.py\", line 2128, in main\n    test_thread_safe(client, testfile, log_output)\n  File \"/mint/run/core/minio-py/tests.py\", line 1640, in test_thread_safe\n    raise Exception(err)\nException: (\"Connection broken: ConnectionResetError(104, 'Connection reset by peer')\", ConnectionResetError(104, 'Connection reset by peer'))\n",
  "status": "FAIL"
}
(12/15) Running s3cmd tests ... done in 19 seconds
(13/15) Running s3select tests ... done in 2 seconds
(14/15) Running security tests ... done in 0 seconds

Executed 14 out of 15 tests successfully.

Deleting image on docker hub
Deleting image locally

Member

@donatello donatello left a comment

LGTM

@harshavardhana harshavardhana changed the title xl: Fix storage info output with unordered endpoints arguments fix: storage info output with unordered endpoints arguments May 19, 2020
@harshavardhana harshavardhana merged commit 9baeda7 into minio:master May 19, 2020
blaenk pushed a commit to blaenk/minio that referenced this pull request Aug 26, 2020
Development

Successfully merging this pull request may close these issues.

Missing Prometheus metrics when restarting server with different Endpoint order
4 participants