Allow the server to start if one of the local nodes in a swarm/kube setup shows up and successfully resolves. #7452

Merged
merged 1 commit on Apr 19, 2019

Conversation

Praveenrajmani
Contributor

Description

  • The rule is that we need at least one local node to be up; we don't need to resolve
    the rest at that point.

  • In a non-orchestrated setup, we fail if we do not have at least one local node up
    and running.

  • In an orchestrated setup (Docker Swarm and Kubernetes), we retry with a sleep of 5
    seconds until any one local node shows up. (A standalone sketch of this rule follows the list.)
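
The rule above, as a minimal standalone Go sketch. This is not the actual change in cmd/endpoint.go; the helpers isOrchestrated, anyLocalNodeUp, and waitForLocalNode are hypothetical names used only for illustration:

```go
// Sketch only: retry until at least one local node resolves, but fail fast
// outside of orchestrated (swarm/kubernetes) deployments.
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// isOrchestrated is a stand-in for deployment detection. Kubernetes pods set
// KUBERNETES_SERVICE_HOST; detecting docker swarm is elided here.
func isOrchestrated() bool {
	return os.Getenv("KUBERNETES_SERVICE_HOST") != ""
}

// anyLocalNodeUp reports whether at least one of the given host names
// resolves. (The real check also verifies the address belongs to this host.)
func anyLocalNodeUp(hosts []string) bool {
	for _, host := range hosts {
		if _, err := net.LookupHost(host); err == nil {
			return true
		}
	}
	return false
}

// waitForLocalNode applies the rule from the description: fail immediately in
// a non-orchestrated setup, otherwise retry every 5 seconds until any one
// local node shows up.
func waitForLocalNode(hosts []string) error {
	for {
		if anyLocalNodeUp(hosts) {
			return nil
		}
		if !isOrchestrated() {
			return errors.New("none of the local nodes are up")
		}
		fmt.Println("waiting for at least one local node; retrying in 5s")
		time.Sleep(5 * time.Second)
	}
}

func main() {
	if err := waitForLocalNode([]string{"minio1", "minio2"}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("at least one local node resolved; continuing start-up")
}
```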

Motivation and Context

MinIO should not halt start-up on such occasions. Fixes #6995

Regression

No

How Has This Been Tested?

  • Bring up 4 Docker containers with the compose file below (see the note after these steps):
version: '2'

# starts 4 docker containers running minio server instances. Each
# minio server's web interface will be accessible on the host at port
# 9001 through 9004.
services:
 minio1:
  image: praveenminio/restart-fix:latest
  volumes:
   - data1:/data
  ports:
   - "9001:9000"
  environment:
   MINIO_ACCESS_KEY: minio
   MINIO_SECRET_KEY: minio123
  command: server http://minio1/data http://minio2/data http://minio3/data http://minio4/data
 minio2:
  image: praveenminio/restart-fix:latest
  volumes:
   - data2:/data
  ports:
   - "9002:9000"
  environment:
   MINIO_ACCESS_KEY: minio
   MINIO_SECRET_KEY: minio123
  command: server http://minio1/data http://minio2/data http://minio3/data http://minio4/data
 minio3:
  image: praveenminio/restart-fix:latest
  volumes:
   - data3:/data
  ports:
   - "9003:9000"
  environment:
   MINIO_ACCESS_KEY: minio
   MINIO_SECRET_KEY: minio123
  command: server http://minio1/data http://minio2/data http://minio3/data http://minio4/data
 minio4:
  image: praveenminio/restart-fix:latest
  volumes:
   - data4:/data
  ports:
   - "9004:9000"
  environment:
   MINIO_ACCESS_KEY: minio
   MINIO_SECRET_KEY: minio123
  command: server http://minio1/data http://minio2/data http://minio3/data http://minio4/data

## By default this config uses default local driver,
## For custom volumes replace with volume driver configuration.
volumes:
  data1:
  data2:
  data3:
  data4:
  • Bring down one container with docker stop <container-id>.
  • Run mc admin service restart.
  • The setup comes back up successfully without waiting for the stopped node.
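
Note: the compose file above is assumed to be saved locally (a file name such as docker-compose.yml is assumed here), so the four containers come up with docker-compose up -d; the restart step assumes mc is already configured against one of the nodes exposed on ports 9001 through 9004.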

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added unit tests to cover my changes.
  • I have added/updated functional tests in mint. (If yes, add mint PR # here: )
  • All new and existing tests passed.

@codecov

codecov bot commented Apr 1, 2019

Codecov Report

Merging #7452 into master will increase coverage by 0.02%.
The diff coverage is 50.74%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7452      +/-   ##
==========================================
+ Coverage      48%   48.02%   +0.02%     
==========================================
  Files         296      296              
  Lines       46796    46828      +32     
==========================================
+ Hits        22463    22490      +27     
+ Misses      22247    22246       -1     
- Partials     2086     2092       +6
Impacted Files Coverage Δ
cmd/server-main.go 10.23% <0%> (-0.1%) ⬇️
cmd/net.go 77.48% <0%> (+8.1%) ⬆️
cmd/endpoint.go 70.49% <53.12%> (-2.64%) ⬇️
cmd/bitrot-streaming.go 79.43% <0%> (-3.74%) ⬇️
cmd/posix.go 63.53% <0%> (-0.32%) ⬇️
pkg/certs/certs.go 58.76% <0%> (+4.12%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 720ed3f...6d5cfbe.

@nitisht nitisht requested review from balamurugana and removed request for nitisht April 1, 2019 10:07
cmd/endpoint.go: review thread (outdated, resolved)
@Praveenrajmani Praveenrajmani changed the title Allow the server to start if one of the local nodes in a swarm/kube setup shows up and successfully resolves. [WIP] Allow the server to start if one of the local nodes in a swarm/kube setup shows up and successfully resolves. Apr 1, 2019
@Praveenrajmani Praveenrajmani changed the title [WIP] Allow the server to start if one of the local nodes in a swarm/kube setup shows up and successfully resolves. Allow the server to start if one of the local nodes in a swarm/kube setup shows up and successfully resolves. Apr 3, 2019
cmd/endpoint.go: review thread (outdated, resolved)
cmd/endpoint.go: review thread (outdated, resolved)
cmd/endpoint.go: review thread (resolved)
Allow the server to start if one of the local nodes in a swarm/kubernetes setup is successfully resolved

- The rule is that we need at least one local node to be up; we don't need to resolve
  the rest at that point.

- In a non-orchestrated setup, we fail if we do not have at least one local node up
  and running.

- In an orchestrated setup (Docker Swarm and Kubernetes), we retry with a sleep of 5
  seconds until any one local node shows up.

fixes minio#6995
@minio-ops

Mint Automation

Test Result
mint-compression-xl.sh ✔️
mint-xl.sh ✔️
mint-large-bucket.sh ✔️
mint-compression-dist-xl.sh ✔️
mint-compression-fs.sh ✔️
mint-worm.sh ✔️
mint-dist-xl.sh ✔️
mint-gateway-nas.sh ✔️
mint-fs.sh (see log excerpt below)

7452-6d5cfbe/mint-fs.sh.log:

Running with
SERVER_ENDPOINT:      72.28.97.61:30975
ACCESS_KEY:           minio
SECRET_KEY:           ***REDACTED***
ENABLE_HTTPS:         0
SERVER_REGION:        us-east-1
MINT_DATA_DIR:        /mint/data
MINT_MODE:            full
ENABLE_VIRTUAL_STYLE: 0

To get logs, run 'docker cp a647b4187304:/mint/log /tmp/mint-logs'

(1/14) Running aws-sdk-go tests ... done in 0 seconds
(2/14) Running aws-sdk-java tests ... done in 1 seconds
(3/14) Running aws-sdk-php tests ... done in 41 seconds
(4/14) Running aws-sdk-ruby tests ... done in 2 seconds
(5/14) Running awscli tests ... done in 1 minutes and 46 seconds
(6/14) Running healthcheck tests ... done in 0 seconds
(7/14) Running mc tests ... done in 23 seconds
(8/14) Running minio-dotnet tests ... done in 24 seconds
(9/14) Running minio-go tests ... done in 27 seconds
(10/14) Running minio-java tests ... done in 56 seconds
(11/14) Running minio-js tests ... done in 35 seconds
(12/14) Running minio-py tests ... FAILED in 2 minutes and 9 seconds
{
  "status": "FAIL",
  "args": {
    "bucket_name": "minio-py-test-7aeec91e-16c2-424b-9d69-7781dc31c662",
    "object_name": "1e481743-9bef-4c12-9d3d-bc98f9f8a8b9"
  },
  "function": "presigned_get_object(bucket_name, object_name, expires, response_headers, request_date)",
  "duration": 38,
  "error": "Traceback (most recent call last):\n  File \"/mint/run/core/minio-py/tests.py\", line 1220, in test_presigned_get_object_expiry_5sec\n    object_name).get_exception()\nminio.error.AccessDenied: AccessDenied: message: Access Denied\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/mint/run/core/minio-py/tests.py\", line 1885, in main\n    test_presigned_get_object_expiry_5sec(client, log_output)\n  File \"/mint/run/core/minio-py/tests.py\", line 1228, in test_presigned_get_object_expiry_5sec\n    raise Exception(err)\nException: AccessDenied: message: Access Denied\n",
  "name": "minio-py:test_presigned_get_object_expiry_5sec",
  "message": "AccessDenied: message: Access Denied"
}

Executed 11 out of 14 tests successfully.

@Praveenrajmani
Contributor Author

PTAL @harshavardhana @balamurugana

Member

@harshavardhana harshavardhana left a comment

LGTM and tested

@harshavardhana
Member

ping @balamurugana

Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The cluster is restarted successfully only when all nodes are started.
5 participants