ipcluster plugin could support cluster-id? #229

Closed
adgaudio opened this Issue · 2 comments

@adgaudio

Hello!

I've recently started using StarCluster with the IPython plugin (from the develop branch) and found that I want to restart the ipcluster on the same EC2 nodes without losing tasks that are currently running. I modified StarCluster's IPython plugin to support this (linked below) and would love to know whether it is worth integrating into the current StarCluster plugin.

Problem: The current implementation of the IPython plugin, ipcluster.py, does a hard kill of ipengines, which means we potentially lose long-running tasks.

Proposed Solution: Start ipclusters with the "--cluster-id <...>" parameter. When ipcluster.IPClusterStop or ipcluster.IPClusterRestartEngines is called, give the user the option to select which cluster-ids to keep alive and which to kill. All proposed changes are backwards compatible.
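As a rough illustration of the idea (not the code from the linked plugin), the sketch below builds the `ipcluster` command lines with an optional cluster-id. The function names `ipcluster_start_args` and `ipcluster_stop_args` are hypothetical, and the flag set assumes an IPython release whose `ipcluster` accepts `--cluster-id`, `--n`, `--profile`, and `--daemonize`. Omitting the cluster-id yields the same command as before, which is what keeps the change backwards compatible.

```python
def ipcluster_start_args(n_engines, cluster_id=None, profile="default"):
    """Build argv for starting a daemonized ipcluster (hypothetical helper)."""
    args = [
        "ipcluster", "start",
        "--n=%d" % n_engines,
        "--profile=%s" % profile,
        "--daemonize",
    ]
    # Only add the flag when a cluster-id is given, so existing
    # configurations keep producing the old command line.
    if cluster_id:
        args.append("--cluster-id=%s" % cluster_id)
    return args


def ipcluster_stop_args(cluster_id=None, profile="default"):
    """Build argv for stopping one ipcluster instance (hypothetical helper)."""
    args = ["ipcluster", "stop", "--profile=%s" % profile]
    if cluster_id:
        args.append("--cluster-id=%s" % cluster_id)
    return args


def stop_commands(running_ids, keep=()):
    """Return stop commands for every running cluster-id not listed in `keep`."""
    return [ipcluster_stop_args(cid) for cid in running_ids if cid not in keep]
```

For example, `stop_commands(["job-a", "job-b"], keep=["job-a"])` produces a single stop command targeting `job-b`, leaving the engines that belong to `job-a` untouched.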

A working implementation is linked below. If this sounds worth integrating into StarCluster, I would love some feedback on these open design questions:

  • How should the most recent cluster-id be identified?
    • The code I linked to hardcodes it; this is a to-do item.
  • Should the sshmaster store JSON files for all running controllers? (I currently store the JSON file only for the most recent --cluster-id invocation.)
    • If yes, then I could find the most recent cluster-id by comparing the modification times of the JSON files.
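
The modification-time idea in the last bullet could look roughly like the sketch below. It assumes that each controller writes a connection file named `ipcontroller-<cluster-id>-client.json` into one directory; both that naming scheme and the helper name `most_recent_cluster_id` are assumptions made for illustration, not something taken from the linked plugin.

```python
import glob
import os
import re

# Assumed naming scheme: ipcontroller-<cluster-id>-client.json
_CLIENT_JSON = re.compile(r"^ipcontroller-(?P<cluster_id>.+)-client\.json$")


def most_recent_cluster_id(security_dir):
    """Return the cluster-id whose client JSON file was modified last.

    Returns None when no file carrying a cluster-id is found.
    """
    newest_id, newest_mtime = None, float("-inf")
    pattern = os.path.join(security_dir, "ipcontroller-*-client.json")
    for path in glob.glob(pattern):
        match = _CLIENT_JSON.match(os.path.basename(path))
        if match is None:
            continue
        mtime = os.path.getmtime(path)
        if mtime > newest_mtime:
            newest_id, newest_mtime = match.group("cluster_id"), mtime
    return newest_id
```

A file without a cluster-id (plain `ipcontroller-client.json`) is deliberately not matched, so a cluster started the old way would need to be handled separately.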

https://github.com/adgaudio/StarCluster/blob/develop/starcluster/plugins/ipcluster.py

Thank you!

@adgaudio

Hello again,

I've written a backwards-compatible implementation of the solution proposed above, which uses --cluster-id to manage concurrent ipcluster instances. The PR is here:

#233

I'd very much appreciate your feedback, as this could be quite useful for others.

Thank you,
Alex

@adgaudio

Closing this, since the discussion continues in the PR linked above.

@adgaudio adgaudio closed this