Adding additional worker nodes causes job failure, collections don't error out #88

Closed
MoHeydarian opened this issue Oct 29, 2018 · 2 comments

Comments

@MoHeydarian

When jobs/collections are running and additional worker nodes are added to the CloudMan instance, some of the jobs/collections stop running and return empty jobs/collections that are in the green state. Not all collections exhibit this behavior; some are stopped and in the error state.

The two issues:

  • Adding workers interrupts ongoing jobs
  • Interrupted jobs are not always marked as being in the error state; the resulting collections contain empty data items

Behavior observed using the GVL 4.4.0 RC2 (Galaxy 18.05)

@MoHeydarian
Author

Update: removing idle worker nodes while jobs are running leads to the same issue as above.

@almahmoud
Member

almahmoud commented Nov 6, 2018

Should be fixed with the new CloudMan image, merged in #90 and deployed to the bucket by Enis.

@afgane afgane closed this as completed Nov 8, 2018
3 participants