Issue on clusters where compute nodes have persistent data #482

Open
rezuma opened this issue Dec 23, 2014 · 0 comments


rezuma commented Dec 23, 2014

I have been using StarCluster with SciDB. In SciDB each compute node needs fast access to the data, and each compute node holds a portion of the full dataset.

In this type of cluster, the current version of StarCluster has two major problems:

  1. When a cluster is stopped with starcluster stop [name of the cluster], all volumes on the compute nodes of the cluster get unmounted, and they don't get remounted when you run starcluster start -x [name of the cluster].
  2. A bigger problem is that StarCluster terminates any node that can't be reached within 15 minutes of restarting the cluster. This is a huge issue for us because it means we have to re-ingest the data in the whole cluster. The worst part is that nodes running Ubuntu 12.04 sometimes fail to start the network device and have to be rebooted again. Unfortunately we don't have that option, since StarCluster terminates the nodes even in these cases where all it takes is a reboot.

I have modified cluster.py to change these two behaviors.
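
To illustrate the behavior wanted for the second problem, here is a minimal sketch of a "reboot before giving up" policy. It is not the actual change to cluster.py and it does not use any StarCluster API: wait_or_reboot, is_reachable, and reboot are hypothetical names, and the caller is assumed to supply the real reachability check and reboot call. The 900-second default mirrors the 15 minute timeout described above; the retry count and poll interval are arbitrary.

```python
import time
from typing import Callable


def wait_or_reboot(
    is_reachable: Callable[[], bool],   # hypothetical: e.g. an SSH or ping check
    reboot: Callable[[], None],         # hypothetical: asks the cloud API to reboot the node
    timeout_s: float = 900.0,           # 15 minutes, as in the report above
    max_reboots: int = 2,               # arbitrary illustrative value
    poll_s: float = 10.0,               # arbitrary illustrative value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once the node is reachable.

    On each timeout the node is rebooted (up to max_reboots times) instead of
    being terminated. Returns False if the node never comes up, leaving the
    decision about what to do next to the caller.
    """
    for attempt in range(max_reboots + 1):
        deadline = clock() + timeout_s
        while clock() < deadline:
            if is_reachable():
                return True
            sleep(poll_s)
        # Timed out: reboot unless the reboot budget is already spent.
        if attempt < max_reboots:
            reboot()
    return False
```

The point of returning False rather than terminating is that a node holding part of the dataset stays intact, so an operator can still inspect or reboot it by hand.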
