New distribute registrations not loaded until after service restart #276

Open
joehobson opened this issue Apr 7, 2014 · 1 comment

@joehobson (Member) commented Apr 7, 2014

While testing the distribute service, I ran into a problem: the node does not pick up newly registered distribute connections. My guess is that you have to bounce the service, or CouchDB, to load changes to connections, whether you are adding new ones or removing old ones.

to reproduce:

  1. remove any existing distribute registrations from the node db in couchdb (I just deleted the documents manually; they have a key/id like 70c01a08028a4d55814e4fd2b9f2de52)

  2. set your logger_lr level = DEBUG in your LR ini file (probably development.ini)

  3. restart the LR/uwsgi/couchdb services to make sure you're starting with fresh data

  4. manually start the distribute service:

    curl -XPOST -H "Content-Type:application/json" http://localhost/distribute

  5. check the log to see if anything distributed. It will probably just have "INFO [lr.controllers.distribute] [MainThread] No connection present for distribution"

    sudo tail -fn 1000 /var/log/learningregistry/uwsgi.log

  6. add a new distribute registration: http://localhost/register. You can use http://alpha.learningregistry.org/incoming with admin/password for credentials.

  7. manually start the distribute service again

    curl -XPOST -H "Content-Type:application/json" http://localhost/distribute

  8. check the uwsgi log again, or verify that /distribute returned information on the connection you registered.

The new connection only shows up in the output if you restart the LR/uwsgi/couchdb services after registering it.

@jimklo (Contributor) commented Apr 7, 2014

This is a long-outstanding problem, and the original rationale for the behavior was terrible. There was a belief that changing the replication membership meant you were changing the node, and that the node identifier therefore had to change too, since it is no longer the same node. That makes some sense if the network is intended to have a static, finite set of members: if membership changed, you'd want some signal that the makeup of the data it contains is no longer what it was before. IMO, though, it's not a well-founded reason for LR (maybe a DoD reason), which is supposed to encourage more sharing and more networking.

I thought Walt had fixed this some time ago, but when we tested it the results were hit or miss. Essentially, the problem stems from the node configuration being loaded once at boot and never re-read from CouchDB.

To fix this, if it isn't fixed already, the service just needs to refresh the 'static configuration' so that newly added nodes are loaded when /distribute is run.
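
The difference between the load-once behavior and the proposed refresh can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual LR code: `FakeNodeDb`, `CachedConnections`, `RefreshingConnections`, `distribute_to`, and the `destination_url` field are all hypothetical stand-ins for the CouchDB node database and the distribute controller.

```python
"""Sketch only: every name here is hypothetical, not taken from the LR codebase."""


class FakeNodeDb:
    """In-memory stand-in for the CouchDB node database (assumption)."""

    def __init__(self):
        self._docs = {}

    def save(self, doc_id, doc):
        # Store a copy so later changes to the caller's dict do not leak in.
        self._docs[doc_id] = dict(doc)

    def connections(self):
        # Return a fresh list of the connection documents currently stored.
        return [dict(doc) for doc in self._docs.values()]


def distribute_to(connections):
    # Placeholder for real replication: just report the targets.
    if not connections:
        return "No connection present for distribution"
    return [c["destination_url"] for c in connections]


class CachedConnections:
    """Reads the connection list once at construction ("boot"), never again."""

    def __init__(self, db):
        self._connections = db.connections()

    def distribute(self):
        return distribute_to(self._connections)


class RefreshingConnections:
    """Re-reads the connection list from the database on every call."""

    def __init__(self, db):
        self._db = db

    def distribute(self):
        return distribute_to(self._db.connections())
```

With both objects built before a registration is saved, `CachedConnections.distribute()` keeps reporting no connections, while `RefreshingConnections.distribute()` returns the new destination. The cost is one extra database read per /distribute call, which is likely acceptable if distribution is only triggered occasionally.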

- JK

