Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[qob] Add regions parameter #12581

Merged
merged 13 commits into from
Mar 10, 2023
Merged

[qob] Add regions parameter #12581

merged 13 commits into from
Mar 10, 2023

Conversation

jigold
Copy link
Contributor

@jigold jigold commented Jan 6, 2023

CHANGELOG: Added the regions parameter to hl.init() for specifying which regions Query on Batch jobs should run in.

@@ -157,6 +157,8 @@ async def serialize(self, writer: afs.WritableStream):


class ServiceBackend(Backend):
input_version = 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this versioning here? Seems arbitrary, is it ever checked? Are there instructions on how or when it should be updated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. It was old cruft from when I thought I had to make the spec backwards compatible.

Copy link
Contributor

@daniel-goldstein daniel-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry one more thing I noticed.

import hail as hl

from ..helpers import skip_unless_service_backend
from hail.backend.service_backend import ServiceBackend

CLOUD = os.environ['HAIL_CLOUD']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this impede running other query tests locally? Not sure if it does but it would be annoying to have to specify HAIL_CLOUD to run tests that have nothing to do with it. I'd prefer in that case to put the os.environ call in the method that actually needs it and remove the env variable from test steps that aren't testing the service backend.

build.yaml Show resolved Hide resolved
import hail as hl

from ..helpers import skip_unless_service_backend
from hail.backend.service_backend import ServiceBackend

CLOUD = os.environ.get('HAIL_CLOUD')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to treat None and Azure the same. Can we keep this as os.environ['HAIL_CLOUD'] but just put this line inside the relevant test?

if regions_str is not None:
regions = regions_str.split(',')
else:
regions = bc.supported_regions()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regions parameter below is Optional[List[str]], but from this else case it looks like we don't exercise the None option. I don't have a strong opinion about whether to do one or the other, but I think we should either make regions not Optional, or keep this None instead of using supported_regions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try and keep the same interface as hailtop.batch as much as possible for consistency reasons. This would mean we'd need to expose the ServiceBackend.supported_regions and the equivalent of ServiceBackend.ANY_REGION. I think they should be hl.supported_regions() that is implemented as a static method on the service backend when applicable and hl.ANY_REGION is just an object. Once we figure this out, then we can figure out what makes sense for the types further downstream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try and keep the same interface as hailtop.batch as much as possible for consistency reasons.

Totally agree with this, but I see the constructor of the ServiceBackend object and its implementation as private, so I don't see why we can't currently write the type as it's being used -- if it's never None let's use that knowledge to cut down on None checks that are currently dead code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the user-facing API should be as close to batch as possible, but the ServiceBackend for query is not a user-facing API. Users only use hl.init which select the backend to use.

@@ -329,6 +340,9 @@ async def _rpc(self,
await write_str(infile, orjson.dumps(reference_config).decode('utf-8'))
await write_str(infile, str(self.worker_cores))
await write_str(infile, str(self.worker_memory))
await write_int(infile, len(self.regions))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks if self.regions is None, so would need to change if we allow this to be optional

Some(regionsArrayBuffer.toArray)
} else {
None
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would the service_backend ever send an empty list?

Copy link
Contributor

@daniel-goldstein daniel-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left further comments in the thread.

Copy link
Contributor

@daniel-goldstein daniel-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes so it's off my CI page. Looks fine there's just merge conflicts now.

@jigold
Copy link
Contributor Author

jigold commented Mar 9, 2023

I know you've got a lot on your plate, but it would be awesome if we could get this in tomorrow or early next week. Of all my remaining PRs, I think this one is highest priority and should be close to being mergeable.

Copy link
Contributor

@daniel-goldstein daniel-goldstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reminder. This looks good

@danking danking merged commit 45c8993 into hail-is:main Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants