Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix init(idempotent=True) / HailContext.getOrCreate #5872

Merged
merged 2 commits into from Apr 20, 2019

Conversation

Projects
None yet
3 participants
@tpoterba
Copy link
Collaborator

commented Apr 11, 2019

No description provided.

@tpoterba tpoterba changed the title Fix init(idenpotent=True) / HailContext.getOrCreate Fix init(idempotent=True) / HailContext.getOrCreate Apr 12, 2019

@tpoterba

This comment has been minimized.

Copy link
Collaborator Author

commented Apr 18, 2019

need to figure out how to properly stop.

@henrydavidge do you know how to totally kill the java process from pyspark?

In [2]: spark.stop()

In [3]: !jps
44688 Launcher
65412 SparkSubmit
65419 Jps
62991
@henrydavidge

This comment has been minimized.

Copy link
Contributor

commented Apr 18, 2019

I don't think there's a pyspark API to do that. You could try calling Runtime.halt through py4j maybe? Or use py4j to get the pid and then kill through a shell command.

dismissing because I need to make some other changes

@tpoterba

This comment has been minimized.

Copy link
Collaborator Author

commented Apr 18, 2019

if pyspark doesn't have a way to do that, then I'd bet that there's some Python state that's not cleaned up if you try to restart it from the same Python runtime after killing the JVM.

I can fix Hail to properly clean up, I think.

@tpoterba

This comment has been minimized.

Copy link
Collaborator Author

commented Apr 18, 2019

@danking ready for another review. This PR has expanded because we needed to properly store/clear session-specific stuff in a bunch of places.

@@ -4486,6 +4486,7 @@ def liftover(x, dest_reference_genome, min_match=0.95, include_strand=False):
rtype = tstruct(result=tinterval(tlocus(dest_reference_genome)), is_negative_strand=tbool)

if not rg.has_liftover(dest_reference_genome.name):
print(rg._liftovers)

This comment has been minimized.

Copy link
@danking

danking Apr 18, 2019

Collaborator

rogue print

@@ -1285,7 +1285,27 @@ def _compute_type(self, env, agg_env):

_function_registry = {}
_seeded_function_registry = {}

This comment has been minimized.

Copy link
@danking

danking Apr 18, 2019

Collaborator

these should probably all be collections.defaultdict([]), there's a few places in here where we have extra code to handle the key not being present but if we consistently treated "not present" as empty list, things would be simpler, i.e. the remove_function is just:

f = (param_types, ret_type)
_function_registry[name] = [b for b in _function_registry[name] if b != f]

Also removes an if check in _lookup_function_return_type.


not required for the PR, but for future reference, I think this is an exemplary use case for defaultdict

@danking

This comment has been minimized.

Copy link
Collaborator

commented Apr 18, 2019

It's definitely a bit unclear to met that we got everything and I'm not convinced the test checks everything. I don't think I'd feel much better unless we were randomly restarting the context mid tests.

@tpoterba

This comment has been minimized.

Copy link
Collaborator Author

commented Apr 18, 2019

It's definitely a bit unclear to met that we got everything and I'm not convinced the test checks everything. I don't think I'd feel much better unless we were randomly restarting the context mid tests.

We actually do that -- this test_context runs in the same python process as the other tests. I also think we don't really guarantee you can stop and restart, but this at least makes it work with our tests.

addressed?

@tpoterba tpoterba force-pushed the tpoterba:fix-idempotent branch from 734812b to 736c1e7 Apr 19, 2019

@danking danking merged commit 69c2ec9 into hail-is:master Apr 20, 2019

1 check passed

hail-ci-0-1 successful build
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.