Distributed analytics #10

Closed
tesseract2048 opened this issue Apr 13, 2016 · 2 comments

Comments

@tesseract2048

In practice, a single node may not be able to analyze all completed applications.
Is it possible to distribute AnalyticJob across a cluster of nodes?

@akshayrai
Contributor

Dr. Elephant has a pool of executor threads that fetch and analyze applications in parallel. As long as you have enough resources, you should be able to increase the number of executor threads. I don't think a cluster of analyzer nodes is required for this. Does this answer your question?
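
For readers unfamiliar with the pattern, here is a minimal sketch of a fixed-size pool of executor threads that fetch and analyze applications in parallel. It is not Dr. Elephant's actual code: `fetch_application`, `analyze`, `run_analysis`, and the `executor_threads` parameter are hypothetical names chosen for illustration, and the fetch step is simulated with a short sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_application(app_id):
    """Hypothetical stand-in for fetching one application's data from a history server."""
    time.sleep(0.01)  # simulate I/O latency; a real fetch would be a network call
    return {"id": app_id, "name_length": len(app_id)}


def analyze(app):
    """Hypothetical stand-in for analyzing a fetched application."""
    status = "ok" if app["name_length"] > 0 else "empty"
    return (app["id"], status)


def fetch_and_analyze(app_id):
    # Each executor thread performs both steps for one application.
    return analyze(fetch_application(app_id))


def run_analysis(app_ids, executor_threads=3):
    # A larger pool lets more fetches overlap, as long as the node and the
    # upstream service have the resources to serve them.
    with ThreadPoolExecutor(max_workers=executor_threads) as pool:
        # pool.map returns results in the same order as app_ids.
        return list(pool.map(fetch_and_analyze, app_ids))


if __name__ == "__main__":
    for app_id, status in run_analysis(["application_1", "application_2"]):
        print(app_id, status)
```

Because the work in this sketch is dominated by waiting on I/O, adding threads only helps until the service being fetched from becomes the limiting factor.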

@tesseract2048
Author

Yes. I did some further investigation, and it turns out that JobHistory is the bottleneck.
Optimizing job history and increasing the number of dr-elephant executors worked.
Thanks, @akshayrai.

abhishekdas99 added a commit to abhishekdas99/dr-elephant that referenced this issue Aug 21, 2017
ashangit pushed a commit to ashangit/dr-elephant that referenced this issue May 31, 2018
…block_size_when_using_viewfs

Add compatibility with viewfs