[hailctl] Spark monitor #7087

Merged
merged 4 commits into from Sep 19, 2019

@tpoterba
Collaborator

commented Sep 18, 2019

No description provided.

@johnc1231

Contributor

commented Sep 18, 2019

I checked out your branch, ran make install-hailctl, started a cluster, connected to a notebook, and ran hl.utils.range_table(1_000_000, 10000)._force_count(). I did not see any monitor UI show up.

@johnc1231

Contributor

commented Sep 18, 2019

Running pprint(dict(os.environ.items())), though, yielded:

{'CLICOLOR': '1',
 'GIT_PAGER': 'cat',
 'HOME': '/root',
 'INVOCATION_ID': '0faec80a970f4cf29ce69112519fe641',
 'JOURNAL_STREAM': '8:38888',
 'JPY_PARENT_PID': '5858',
 'LANG': 'en_US.UTF-8',
 'LOGNAME': 'root',
 'MPLBACKEND': 'module://ipykernel.pylab.backend_inline',
 'PAGER': 'cat',
 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
 'SHELL': '/bin/sh',
 'SPARKMONITOR_KERNEL_PORT': '38853',
 'TERM': 'xterm-color',
 'USER': 'root'}

which does not include the environment variable you added that says to use the new thing, though it is clearly present in init_notebook.py.
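
A quicker way to check from inside the notebook kernel whether a given variable made it through is to filter the environment instead of printing all of it. This is only a minimal sketch: the SPARKMONITOR prefix comes from the output above, while HYPOTHETICAL_FLAG is a placeholder, since the thread does not name the variable that was added.

```python
import os
from pprint import pprint


def env_with_prefix(prefix, environ=None):
    """Return the environment variables whose names start with `prefix`."""
    environ = os.environ if environ is None else environ
    return {k: v for k, v in environ.items() if k.startswith(prefix)}


def missing_vars(expected, environ=None):
    """Return the names in `expected` that are not set in the environment."""
    environ = os.environ if environ is None else environ
    return [name for name in expected if name not in environ]


if __name__ == "__main__":
    # Show only the variables related to the monitor extension.
    pprint(env_with_prefix("SPARKMONITOR"))
    # HYPOTHETICAL_FLAG is a placeholder name, not the real variable.
    print(missing_vars(["SPARKMONITOR_KERNEL_PORT", "HYPOTHETICAL_FLAG"]))
```

Run in a notebook cell, this lists any expected variable that the kernel process did not inherit, which separates "the init script never set it" from "it was set but not passed to the kernel".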

@tpoterba

Collaborator Author

commented Sep 18, 2019

huh. I just tested again and it worked!

@johnc1231

Contributor

commented Sep 18, 2019

I'll try again tomorrow, maybe I made a mistake.

@tpoterba

Collaborator Author

commented Sep 18, 2019

bad news, though -- after playing with the thing a bit longer and trying to bump the partition number up, it's really not reliable. At all.

@tpoterba

Collaborator Author

commented Sep 18, 2019

I think we should probably go back to jupyter spark.

@danking

Collaborator

commented Sep 18, 2019

:((((((((((((

@tpoterba

Collaborator Author

commented Sep 18, 2019

aha, it gets a push for every task end! Running even a short 5000-partition job is effectively a denial-of-service attack against the extension.

@tpoterba

Collaborator Author

commented Sep 18, 2019

after running one 5000-partition job, the extension log is 300k lines
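
For illustration, one common way to tame per-event traffic like this is to coalesce task-end events and push a summary at most once per interval. The sketch below is hypothetical and is not the change made in the fork: the class name TaskEndBatcher, the send callback, and the message shape are all assumptions.

```python
import time


class TaskEndBatcher:
    """Coalesce task-end events into periodic summary messages.

    Hypothetical sketch: `send` stands in for whatever pushes a message
    to the notebook frontend; it is not a real sparkmonitor API.
    """

    def __init__(self, send, min_interval=1.0, clock=time.monotonic):
        self._send = send
        self._min_interval = min_interval
        self._clock = clock
        self._pending = 0
        self._last_flush = clock()

    def on_task_end(self):
        """Record one finished task; push only if the interval has elapsed."""
        self._pending += 1
        if self._clock() - self._last_flush >= self._min_interval:
            self.flush()

    def flush(self):
        """Send a single summary for all tasks recorded since the last push."""
        if self._pending:
            self._send({"tasks_completed": self._pending})
            self._pending = 0
        self._last_flush = self._clock()


if __name__ == "__main__":
    sent = []
    batcher = TaskEndBatcher(sent.append, min_interval=0.5)
    for _ in range(5000):
        batcher.on_task_end()
    batcher.flush()  # push whatever is left when the job ends
    print(len(sent), sum(m["tasks_completed"] for m in sent))
```

With batching like this, the number of pushes depends on how long the job runs rather than on how many partitions it has.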

@tpoterba

Collaborator Author

commented Sep 19, 2019

OK, give this a try, John. I fixed the Spark UI linking issue, and added some other QoL features.

@tpoterba

Collaborator Author

commented Sep 19, 2019

The changes to the fork are in a PR here: https://github.com/hail-is/sparkmonitor/pull/1/files

This PR points to an artifact built from that PR.

@tpoterba

Collaborator Author

commented Sep 19, 2019

@johnc1231 fixed the problem, ready to go.

Contributor

left a comment

Going to be great!

@danking danking merged commit bda4b11 into hail-is:master Sep 19, 2019
1 check passed
ci-test success
3 participants