Skip to content

HLG: get_all_external_keys()#6774

Merged
jrbourbeau merged 2 commits intodask:masterfrom
madsbk:hlg_use_ext_keys
Oct 30, 2020
Merged

HLG: get_all_external_keys()#6774
jrbourbeau merged 2 commits intodask:masterfrom
madsbk:hlg_use_ext_keys

Conversation

@madsbk
Copy link
Copy Markdown
Contributor

@madsbk madsbk commented Oct 28, 2020

This PR introduces HighLevelGraph.get_all_external_keys(), which makes it possible to cull high level graphs that contains ShuffleLayer without materializing them.

  • Tests added / passed
  • Passes black dask / flake8 dask

cc. @rjzamora

This makes it possible to cull ShuffleLayer without materializing
Copy link
Copy Markdown
Member

@rjzamora rjzamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach here makes sense to me @madsbk . I was particularly focued on the shuffle changes as I reviewed, but the use of external dependencies for culling seems like a good idea to me - Thanks for working on this!

"""
deps = defaultdict(set)
parts_out = self._keys_to_parts(keys)
parts_out = parts_out or self._keys_to_parts(keys)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops - My bad.

self.meta_input = meta_input
self.parts_out = parts_out or range(npartitions)

def get_output_keys(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it should be in this PR, but should we be able to implement something similar for other Layer types (like Blockwise)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, implementing layer specific get_output_keys() will be essential to avoid materialization.

@madsbk
Copy link
Copy Markdown
Contributor Author

madsbk commented Oct 30, 2020

@jrbourbeau I think this is ready to be merged

Copy link
Copy Markdown
Member

@jrbourbeau jrbourbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @madsbk!

@jrbourbeau jrbourbeau merged commit f1bdccb into dask:master Oct 30, 2020
@madsbk madsbk deleted the hlg_use_ext_keys branch October 30, 2020 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants