Startup can fail if mongo server, database, and replica aren't available #4745

Closed
focusaurus opened this Issue Oct 16, 2018 · 3 comments

focusaurus commented Oct 16, 2018

This issue is the same as reaction-platform issue 16, but since it's really a problem within the reaction codebase, I'm filing this here so we can track it with our normal process.

Issue Description

After running make from the clean reaction-platform folder, the created containers reaction_reaction_1, reaction_mongo-init-replica_1 and reaction-hydra_hydra-migrate_1 show as exited.

The reaction container fails, as it can't connect to the replica set:
docker logs -f reaction_reaction_1

npm WARN react-addons-test-utils@15.6.2 requires a peer of react-dom@^15.4.2 but none is installed. You must install peer dependencies yourself.
npm WARN react-taco-table@0.5.1 requires a peer of react@^15.3 but none is installed. You must install peer dependencies yourself.
npm WARN react-taco-table@0.5.1 requires a peer of react-dom@^15.3 but none is installed. You must install peer dependencies yourself.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@1.2.4 (node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@1.2.4: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})

up to date in 18.481s
ERROR: MongoDB replica set not ready in time.
no replset config has been received..%

The mongo-init-replica container fails to connect to the main mongo container:
docker logs -f reaction_mongo-init-replica_1

MongoDB shell version v3.6.3
connecting to: mongodb://mongo:27017/reaction
2018-10-04T11:25:18.048+0000 W NETWORK  [thread1] Failed to connect to 172.30.0.2:27017, in(checking socket for error after poll), reason: Connection refused
2018-10-04T11:25:18.062+0000 E QUERY    [thread1] Error: couldn't connect to server mongo:27017, connection attempt failed :
connect@src/mongo/shell/mongo.js:251:13
@(connect):1:6
exception: connect failed

I checked the internal IP, and it seems to be right:
docker inspect reaction_mongo_1 | grep IPAddress

"SecondaryIPAddresses": null,
"IPAddress": "",
"IPAddress": "172.30.0.2",

Steps to Reproduce

  1. clone the repo
  2. run make and wait for it to finish
  3. run docker ps -a | grep reaction and observe that both reaction_reaction_1 and reaction_mongo-init-replica_1 exited

Possible Solution

Running docker start reaction_mongo-init-replica_1 afterwards seems to make it start. Maybe the build script needs to wait until the main mongodb container has finished loading.
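
One way to "wait until the main mongodb container has finished loading" is to poll the TCP port until it accepts connections. The following is a minimal sketch under that assumption, not the fix that was adopted; canConnect, waitForPort, and the option names are hypothetical and only use Node's built-in net module.

```javascript
// Sketch only: canConnect, waitForPort and their options are hypothetical names.
const net = require("net");

// Try a single TCP connection; resolve true if it succeeds, false otherwise.
function canConnect(host, port, timeoutMs) {
  return new Promise((resolve) => {
    const socket = net.createConnection({ host, port });
    const finish = (result) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

// Poll until the port accepts connections or the attempts run out.
// Resolves true when the port is reachable, false if it never was.
async function waitForPort(host, port, { attempts = 30, delayMs = 1000, timeoutMs = 1000 } = {}) {
  for (let i = 1; i <= attempts; i += 1) {
    if (await canConnect(host, port, timeoutMs)) return true;
    if (i < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

In the setup described here, a call such as waitForPort("mongo", 27017) would run before the replica set is initialized. An open port only shows that the server accepts connections; it says nothing about the replica set state.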

@focusaurus focusaurus added the bug label Oct 16, 2018

@focusaurus focusaurus added this to the Oxford milestone Oct 16, 2018

@focusaurus focusaurus self-assigned this Oct 16, 2018

focusaurus commented Oct 16, 2018

CC @janus-reith just FYI I'll be working on a fix for this issue here in the reaction repo as each service managed by reaction-platform should take responsibility for its own startup dependencies.

focusaurus commented Oct 16, 2018

Mechanics of the issue

  • User runs make start in reaction-platform to bring up the system
  • reaction-platform runs docker compose up -d for the reaction project
  • docker compose dependencies start the reaction services in the following order, with no delay in between
    • mongo
    • mongo-init-replica
    • reaction
  • There's a race condition here: mongo-init-replica starts and tries to connect to mongo to initialize the replica set before the mongo service is up and accepting connections, so the connection fails.
    • mongo-init-replica has no wait/retry logic currently so one failure causes it to exit nonzero without retry
      • Log from above is 2018-10-04T11:25:18.048+0000 W NETWORK [thread1] Failed to connect to 172.30.0.2:27017, in(checking socket for error after poll), reason: Connection refused
  • When reaction starts up it will wait for the replica set to be available, but since the command to create it was never issued, we hit the timeout and the reaction service also exits nonzero
    • Log from above: ERROR: MongoDB replica set not ready in time
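
To illustrate the missing wait/retry logic, here is a minimal sketch of a generic retry wrapper. It is not the existing logic from the reaction codebase; retry, sleep, and the option names are hypothetical.

```javascript
// Sketch only: retry and its option names are hypothetical.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call an async function until it succeeds or the attempts are exhausted.
// The last error is rethrown so the caller can still exit nonzero.
async function retry(fn, { attempts = 10, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) await sleep(delayMs);
    }
  }
  throw lastError;
}
```

With a wrapper like this, a single "Connection refused" during startup costs one delay instead of causing the container to exit. The process only fails once every attempt has failed.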

Proposed solution

I would like to gather all the setup logic into the reaction docker container such that the new startup steps are

  • npm install
  • wait for mongo server
  • ensure the mongo database exists, creating it if necessary
  • ensure the replica set is initialized, initializing it if necessary
  • wait for the replica set to be ready
    • all these mongo steps would be scripted in a waitForMongo.js, which will be renamed from waitForReplica.js. I think they are mostly one-liners, either on the command line or via JavaScript.
    • I have existing wait/retry logic, so I should be able to just plug in a few more async function calls
  • start the main reaction server
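
The "ensure the replica set is initialized" and "wait for the replica set to be ready" steps could look roughly like the sketch below. This is an illustration under stated assumptions, not the actual waitForMongo.js. adminDb is assumed to be any object with an async command(doc) method, such as the admin database handle of the MongoDB Node.js driver. The function names are hypothetical. The check for error code 94 (NotYetInitialized) is an assumption based on the "no replset config has been received" message in the log above.

```javascript
// Sketch only: function names are hypothetical. `adminDb` is assumed to expose
// an async command(doc) method that runs a MongoDB admin command.

// True when the error says the replica set has never been configured.
// Assumption: reported as code 94 or with the message seen in the log above.
function isNotInitialized(error) {
  return error.code === 94 || /no replset config/i.test(String(error.message));
}

// Initialize the replica set only if it is not initialized yet.
// Returns true if this call initiated it, false if it already existed.
async function ensureReplicaSet(adminDb) {
  try {
    await adminDb.command({ replSetGetStatus: 1 });
    return false;
  } catch (error) {
    if (!isNotInitialized(error)) throw error;
    // Assumption: an empty config asks the server for its default config.
    await adminDb.command({ replSetInitiate: {} });
    return true;
  }
}

// Poll until this member reports itself as PRIMARY (myState === 1).
async function waitForPrimary(adminDb, { attempts = 30, delayMs = 1000 } = {}) {
  for (let i = 1; i <= attempts; i += 1) {
    try {
      const status = await adminDb.command({ replSetGetStatus: 1 });
      if (status.myState === 1) return;
    } catch (error) {
      if (!isNotInitialized(error)) throw error;
    }
    if (i < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error("MongoDB replica set not ready in time.");
}
```

Because ensureReplicaSet checks the status first, it is safe to run on every container start. That is the property the "initializing it if necessary" wording asks for.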

focusaurus commented Oct 16, 2018

Notes on existing startup process

In digging into this I needed to understand the existing startup process, so let me note it down here for reference.

  • If you are using reaction-platform, make start maps to docker-compose up in the reaction project directory
    • But if not, you can do the docker-compose up yourself and the same sequence happens
  • The docker-compose up brings up the reaction container based on the docker-compose.yml file
    • The relevant line is: command: bash -c "npm install && node ./.reaction/waitForReplica.js && reaction"
  • So when the reaction container starts, it does those commands
    • npm install
    • waitForReplica.js
    • reaction
      • this is /usr/local/bin/reaction which is the reaction-cli/src/main.js file
      • This runs reaction-cli/src/commands/run.js
      • a meteor command line is built up and executed