Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify IPython.parallel connections and enable Controller Resume #1471

Merged
merged 6 commits into from Jul 21, 2012

Conversation

minrk
Copy link
Member

@minrk minrk commented Mar 5, 2012

Rolls back two-stage connection info, putting more information into the connection files. This makes it easier to use hand-crafted ssh tunnels, as all ports are read from the file, rather than from the reply to registration/connection requests.

It is no longer possible to connect to the Controller without a connection file.

Adding the serialization method to the connection file also makes it harder for custom serialization to result in a mismatch in configuration between the various objects.

@minrk
Copy link
Member Author

minrk commented Mar 5, 2012

Sample connection file before change:

{
  "url": "tcp://127.0.0.1:56224", 
  "exec_key": "e257a5bb-154f-41ee-a8aa-c51cea42b54a", 
  "ssh": "", 
  "location": "10.71.0.113"
}

and after:

{
  "control": 62418, 
  "task": 62424, 
  "hb_ping": 62421, 
  "exec_key": "fde6ebb4-8310-4fa1-b97a-dfacbbdaa2e7", 
  "mux": 62420, 
  "hb_pong": 62422, 
  "ssh": "", 
  "registration": 62415, 
  "interface": "tcp://127.0.0.1", 
  "iopub": 62426, 
  "pack": "msgpack.packb", 
  "unpack": "msgpack.unpackb", 
  "location": "192.168.1.188"
}

@minrk
Copy link
Member Author

minrk commented Mar 6, 2012

This PR now contains a simple first implementation of controller-resume. Engine state is stored to JSON on register/unregister, and loaded on startup if the --restore flag is given.

This should behave as expected as long as the cluster is idle, but is untested for resuming the Controller during active execution.

@minrk
Copy link
Member Author

minrk commented Mar 6, 2012

This enables resume of controller, but also removes explicit identities from the Client.

We need to make a decision on the explicit client identities, because there is a reason to go either way:

  1. in libzmq-2.1, setting an explicit identity results in a 'durable socket', which means that the Schedulers act on the assumption that clients never go away, and queue up replies to disconnected clients in memory. For users doing 'fire&forget' task submission, this will result in an apparent memory leak in the Schedulers.
  2. The durable socket side-effect of explicit identities is removed from libzmq-3.1.
  3. removing explicit identities means that client sockets will get new identities when using Controller resume introduced here, breaking the propagation of results of tasks submitted prior to controller crash.

So we have two principal choices:

  1. Continue to use explicit identities, and just state that the apparent memory leak is fixed by upgrading to libzmq-3.1.
  2. Remove explicit identities, and break client-replies across controller restart.

I think 2. makes the most sense, because fire&forget seems more important than preserving interactivity across controller crashes. The code here reflects this.

@minrk
Copy link
Member Author

minrk commented Mar 6, 2012

I should note that the changes here are highly experimental, and shouldn't go into 0.13.

@fperez
Copy link
Member

fperez commented Apr 14, 2012

@minrk, do you still think we should hold off on this one for after 0.13? I understand if they aren't robust yet, but do they introduce any potential problems? Because if it's only a matter of robustness, we can merge it and flag this explicitly as an experimental feature; that at least will give it some field-testing...

@minrk
Copy link
Member Author

minrk commented Apr 14, 2012

This would break connection-file compatibility between 0.13 and 0.12 clients/clusters, which I know there are people using right now. But the main reason is just that this is a big enough and tricky enough change that I want it to bake in master for longer than we have between now and 0.13. Especially the controller-resume part, which is far from complete, though it certainly works in some simple cases.

@fperez
Copy link
Member

fperez commented Apr 14, 2012

Understood; I'll update the description at the top with a more visible note about this.

@minrk
Copy link
Member Author

minrk commented May 31, 2012

Test results for commit 7d7a06b merged into master
Platform: darwin

  • python2.6: Failed, log at https://gist.github.com/2839954 (libraries not available: matplotlib pygments pymongo qt tornado wx wx.aui)
  • python2.7: OK (libraries not available: wx wx.aui zmq)
  • python3.2: OK (libraries not available: matplotlib pymongo qt wx wx.aui)

Not available for testing:

@minrk
Copy link
Member Author

minrk commented Jun 6, 2012

As we continue to move the goalposts on 0.13, I'm starting to feel like I want to split this one, and merge the new connection files into 0.13, and only leave controller resume until after release. Thoughts on that @fperez?

@fperez
Copy link
Member

fperez commented Jun 6, 2012

I was just thinking about pinging the list for 0.13 plans... We have so much new stuff that it's high time we start settling it down with eye towards a release... Do you think we'd be able to pull off an 0.13 release before we head out to SciPy?

@minrk
Copy link
Member Author

minrk commented Jun 6, 2012

Not sure about a Final, but at least a beta or RC. We have 70 issues marked for 0.13 and 35 PRs, but I think very few of them actually need to get in to 0.13 (0 of mine, for instance, though I would like the file/shell cell magics). So if we switch into stabilization-only mode this week, I think we can make it.

It's a bit of an unfortunate time as we are just now exploring what we can/should be doing with Cell Magics.

We still haven't done the cell-metadata bit, which I think is a blocker so we reduce the need to increment the nbformat in the future.

@fperez
Copy link
Member

fperez commented Jun 6, 2012

That sounds like a good plan. I'm going to try to finish this lovely report now, and will then ping the list.

fperez added a commit that referenced this pull request Jun 13, 2012
Add git-mpr, to merge PR(s) from github just by number(s).

Inspired by git-mrb and test_pr, I thougth it would be usefull to be able to merge PR only by giving their number. Hence `git merge-pull-request`or `git-mpr`.

usage: 
```bash
$ git mpr -h                                                                          ~/ipython/tools
usage: git-mpr [-h] [-l | -a | -m [pr-number [pr-number ...]]]

Merge (one|many) github pull request by their number. If pull request can't be
merge as is, cancel merge, and continue to the next if any.

optional arguments:
  -h, --help            show this help message and exit
  -l, --list            list PR, their number and their mergeability
  -a, --merge-all       try to merge as many PR as possible, one by one
  -m [pr-number [pr-number ...]], --merge [pr-number [pr-number ...]]
                        The pull request numbers
```

examples :
```bash
$ git mpr --list  
* #1758 [√]:  test_pr, fallback on http if git protocol fail, and SSL errors...
* #1755 [√]:  test for pygments before running qt tests
* #1715 [√]:  Fix for #1688, traceback-unicode issue
[...]
* #1493 [√]:  Selenium web tests proof-of-concept
* #1471 [ ]:  simplify IPython.parallel connections and enable Controller Resume
* #1343 [ ]:  Add prototype shutdown button to Notebook dashboard
* #1285 [√]:  Implementation of grepping the input history using several patterns
* #1215 [√]:  updated %quickref to show short-hand for %sc and %sx
```
PR number, mergeability and title
Quite slow, as it does 1 api call by PR, since api does not give mergeability anymore if you ask for the list of all PRs at once.

merge one or more PR (skip the ones with conflict and present a nice list to copy and past to do it by hand) 

```bash
$ git mpr --merge [pr-number [pr-number ...]]]
[...]
*************************************************************************************
the following branch have not been merged automatically, considere doing it by hand :
PR 1630: git pull https://github.com/minrk/ipython.git mergekernel
PR 1343: git pull https://github.com/takluyver/ipython.git notebook-shutdown
PR 1715: git pull https://github.com/jstenar/ipython.git ultratb-pycolorize-unicode
PR 1732: git pull https://github.com/fperez/ipython.git cellmagics
PR 1471: git pull https://github.com/minrk/ipython.git connection
PR 1674: git pull https://github.com/mdboom/ipython.git notebook-carriage-return
*************************************************************************************
```

And last, 
```
git mpr --merge-all
```

That is pretty self explainatory
explicit IDENTITIES imply a potentially undesirable side-effect called 'durable sockets', which should be avoided.
Rolls back two-stage connection, putting complete connection info
into the connection files.  This makes it easier to use hand-crafted
ssh tunnels, as all ports are read from the file, rather than
from the reply to registration/connection requests.

It is no longer possible to connect to the Controller without a connection file.

Adding the serialization methods to the connection file also makes it
harder for custom serialization to result in a mismatch in configuration
between the various objects.
cleans up EngineConnector and registration messages further, only storing single UUID for each engine (all the stored UUIDs happened to be the same already).

Message spec docs updated to reflect changes to message formats.
@minrk
Copy link
Member Author

minrk commented Jul 18, 2012

Rebased. I don't know why GitHub thinks it cannot be automatically merged, because it certainly can. Presumably this will resolve itself before long.

@fperez
Copy link
Member

fperez commented Jul 20, 2012

We should probably discuss this one at the sprint...

@minrk
Copy link
Member Author

minrk commented Jul 20, 2012

Test results for commit 2888383 merged into master
Platform: darwin

  • python2.6: OK (libraries not available: cython matplotlib oct2py pygments pymongo qt rpy2 tornado wx wx.aui)
  • python2.7: OK (libraries not available: oct2py wx wx.aui)
  • python3.2: OK (libraries not available: cython matplotlib oct2py pymongo qt rpy2 wx wx.aui)

Not available for testing: python3.1

@fperez
Copy link
Member

fperez commented Jul 21, 2012

This is large and complex but the code and functionality looks very clean; as discussed during sprint this is best merged early in this cycle. Running testpr on linux just to be safe, will merge shortly if report is OK.

@fperez
Copy link
Member

fperez commented Jul 21, 2012

Test results for commit 2888383 merged into master
Platform: linux2

  • python2.7: OK
  • python3.2: OK (libraries not available: cython matplotlib oct2py pygments pymongo rpy2 wx wx.aui)

Not available for testing: python2.6, python3.1

fperez added a commit that referenced this pull request Jul 21, 2012
simplify IPython.parallel connections and enable Controller Resume

Rolls back two-stage connection info, putting more information into the connection files. This makes it easier to use hand-crafted ssh tunnels, as all ports are read from the file, rather than from the reply to registration/connection requests.

It is no longer possible to connect to the Controller without a connection file.

Adding the serialization method to the connection file also makes it harder for custom serialization to result in a mismatch in configuration between the various objects.
@fperez fperez merged commit bff463b into ipython:master Jul 21, 2012
@minrk
Copy link
Member Author

minrk commented Jul 21, 2012

I have a feeling this will conflict with Jason's, and I just asked him to rebase already and don't want to do it twice. So I will get that one in once it is rebased, then make sure this still applies, and merge.

@minrk
Copy link
Member Author

minrk commented Jul 21, 2012

oops, nevermind. I'll take care of the metadata rebase if it needs one.

@minrk
Copy link
Member Author

minrk commented Jul 21, 2012

metadata doesn't need a rebase, so all is well.

@fperez
Copy link
Member

fperez commented Jul 21, 2012

ok, glad to hear that.

@fperez
Copy link
Member

fperez commented Jul 21, 2012

ok, glad to hear that.

@minrk minrk deleted the connection branch March 31, 2014 23:36
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Add git-mpr, to merge PR(s) from github just by number(s).

Inspired by git-mrb and test_pr, I thougth it would be usefull to be able to merge PR only by giving their number. Hence `git merge-pull-request`or `git-mpr`.

usage: 
```bash
$ git mpr -h                                                                          ~/ipython/tools
usage: git-mpr [-h] [-l | -a | -m [pr-number [pr-number ...]]]

Merge (one|many) github pull request by their number. If pull request can't be
merge as is, cancel merge, and continue to the next if any.

optional arguments:
  -h, --help            show this help message and exit
  -l, --list            list PR, their number and their mergeability
  -a, --merge-all       try to merge as many PR as possible, one by one
  -m [pr-number [pr-number ...]], --merge [pr-number [pr-number ...]]
                        The pull request numbers
```

examples :
```bash
$ git mpr --list  
* ipython#1758 [√]:  test_pr, fallback on http if git protocol fail, and SSL errors...
* ipython#1755 [√]:  test for pygments before running qt tests
* ipython#1715 [√]:  Fix for ipython#1688, traceback-unicode issue
[...]
* ipython#1493 [√]:  Selenium web tests proof-of-concept
* ipython#1471 [ ]:  simplify IPython.parallel connections and enable Controller Resume
* ipython#1343 [ ]:  Add prototype shutdown button to Notebook dashboard
* ipython#1285 [√]:  Implementation of grepping the input history using several patterns
* ipython#1215 [√]:  updated %quickref to show short-hand for %sc and %sx
```
PR number, mergeability and title
Quite slow, as it does 1 api call by PR, since api does not give mergeability anymore if you ask for the list of all PRs at once.

merge one or more PR (skip the ones with conflict and present a nice list to copy and past to do it by hand) 

```bash
$ git mpr --merge [pr-number [pr-number ...]]]
[...]
*************************************************************************************
the following branch have not been merged automatically, considere doing it by hand :
PR 1630: git pull https://github.com/minrk/ipython.git mergekernel
PR 1343: git pull https://github.com/takluyver/ipython.git notebook-shutdown
PR 1715: git pull https://github.com/jstenar/ipython.git ultratb-pycolorize-unicode
PR 1732: git pull https://github.com/fperez/ipython.git cellmagics
PR 1471: git pull https://github.com/minrk/ipython.git connection
PR 1674: git pull https://github.com/mdboom/ipython.git notebook-carriage-return
*************************************************************************************
```

And last, 
```
git mpr --merge-all
```

That is pretty self explainatory
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
simplify IPython.parallel connections and enable Controller Resume

Rolls back two-stage connection info, putting more information into the connection files. This makes it easier to use hand-crafted ssh tunnels, as all ports are read from the file, rather than from the reply to registration/connection requests.

It is no longer possible to connect to the Controller without a connection file.

Adding the serialization method to the connection file also makes it harder for custom serialization to result in a mismatch in configuration between the various objects.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants