Issue on clusters where compute nodes have persistent data #482

rezuma · 2014-12-23T20:58:21Z

I have been using StarCluster with SciDB. In SciDB each compute node needs fast access to the data, and each compute node holds a portion of the full dataset.

In this type of cluster the current version of StarCluster has two major problems:

1. When a cluster is stopped with starcluster stop [name of the cluster], all volumes on the compute nodes of the cluster get unmounted, and they don't get remounted when you run starcluster start -x [name of the cluster].
2. A bigger problem is that StarCluster terminates nodes that cannot be reached within 15 minutes after restarting the cluster. This is a huge issue for us because it means we have to re-ingest the data across the whole cluster. Worse, nodes running Ubuntu 12.04 sometimes fail to bring up the network device and have to be rebooted again. Unfortunately, we don't have that option, since StarCluster terminates the nodes even in these cases, where all it takes is a reboot.

I have modified cluster.py to change these two behaviors.
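For readers who want a picture of the second change, here is a minimal sketch of a reboot-before-terminate policy. It is not the actual patch to cluster.py and does not use StarCluster's API: `recover_nodes`, `is_reachable`, `reboot`, and `max_reboots` are hypothetical names chosen for illustration. Only the 15 minute (900 second) timeout comes from the description above.

```python
import time


def recover_nodes(nodes, is_reachable, reboot, timeout=900, max_reboots=2,
                  poll_interval=30, clock=time.monotonic, sleep=time.sleep):
    """Wait for nodes to come up, rebooting stragglers instead of terminating.

    Hypothetical helper: `is_reachable(node)` and `reboot(node)` are callables
    supplied by the caller. Returns (reachable, unreachable). Nothing is
    terminated here; nodes still down after `max_reboots` are only reported.
    """
    pending = {node: 0 for node in nodes}  # node -> reboots issued so far
    reachable, unreachable = [], []
    deadline = clock() + timeout

    while pending:
        # Collect every node that has come up since the last poll.
        for node in list(pending):
            if is_reachable(node):
                reachable.append(node)
                del pending[node]

        # Timeout expired: reboot stragglers, or give up on them (without
        # terminating) once the reboot budget is spent.
        if pending and clock() >= deadline:
            for node in list(pending):
                if pending[node] < max_reboots:
                    reboot(node)
                    pending[node] += 1
                else:
                    unreachable.append(node)
                    del pending[node]
            deadline = clock() + timeout

        if pending:
            sleep(poll_interval)

    return reachable, unreachable
```

The point of the design is that the data on a node's volumes is never put at risk automatically: a node that stays unreachable is handed back to the caller, who can decide whether to keep investigating or terminate it by hand.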
