Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring and renaming to KeyZipper to IterKeyZipper and MapZipper to MapKeyZipper #50

Closed
wants to merge 5 commits into from

Conversation

NivekT
Copy link
Contributor

@NivekT NivekT commented Oct 7, 2021

Stack from ghstack:

Since MapZipper has been added, the name KeyZipper is confusing and should be changed to IterKeyZipper instead. We are also changing MapZipper to MapKeyZipper to ensure the names stay matching.

Note that this renaming is BC breaking for users.

Differential Revision: D31487393

NivekT added a commit that referenced this pull request Oct 7, 2021
ghstack-source-id: 103429968438a8f9cc64321d85fe71284838ee78
Pull Request resolved: #50
@NivekT
Copy link
Contributor Author

NivekT commented Oct 7, 2021

The linting error will go away when this PR lands.

@NivekT
Copy link
Contributor Author

NivekT commented Oct 7, 2021

@NivekT has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ejguan
Copy link
Contributor

ejguan commented Oct 8, 2021

Let me discuss with domains about how to maintain BC breaking change before we land it.

Since `MapZipper` has been added, the name `KeyZipper` is confusing and should be changed to `IterZipper` instead.

Differential Revision: [D31487393](https://our.internmc.facebook.com/intern/diff/D31487393)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request Oct 8, 2021
ghstack-source-id: b38cd6e5ea96df7b5bec62c4d3096a1a6f64fe35
Pull Request resolved: #50
@NivekT
Copy link
Contributor Author

NivekT commented Oct 8, 2021

@NivekT has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@NivekT
Copy link
Contributor Author

NivekT commented Oct 11, 2021

Let me discuss with domains about how to maintain BC breaking change before we land it.

Let me know if there is any follow up related to this.

@ejguan
Copy link
Contributor

ejguan commented Oct 11, 2021

Sorry for the delay. The POC from Vision is on PTO this week. Have to wait until he is back.

@NivekT
Copy link
Contributor Author

NivekT commented Oct 11, 2021

Okay no problem.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 13, 2021
NivekT added a commit that referenced this pull request Oct 21, 2021
Fixes #44 by adding merge function as an optional input for `KeyZipper`. It allows user to specify how the items yielded from the DataPipes will be combined before they are yielded.

This PR changes the default behavior when `keep_key=True` and that may be BC breaking.
Previously, when `keep_key=True`, the output will be `(key, item1, item2)`, now it will default to `(key, (item1, item2)`. It is possible to leave the default behavior unchanged.

Note that `KeyZipper` may be renamed to `IterZipper` in #50 to provide users with better clarity.

[ghstack-poisoned]
NivekT added a commit that referenced this pull request Oct 21, 2021
Fixes #44 by adding merge function as an optional input for `KeyZipper`. It allows user to specify how the items yielded from the DataPipes will be combined before they are yielded.

This PR changes the default behavior when `keep_key=True` and that may be BC breaking.
Previously, when `keep_key=True`, the output will be `(key, item1, item2)`, now it will default to `(key, (item1, item2)`. It is possible to leave the default behavior unchanged.

Note that `KeyZipper` may be renamed to `IterZipper` in #50 to provide users with better clarity.

Differential Revision: [D31820203](https://our.internmc.facebook.com/intern/diff/D31820203)

[ghstack-poisoned]
facebook-github-bot pushed a commit that referenced this pull request Oct 22, 2021
Summary:
Pull Request resolved: #76

Fixes #44 by adding merge function as an optional input for `KeyZipper`. It allows user to specify how the items yielded from the DataPipes will be combined before they are yielded.

This PR changes the default behavior when `keep_key=True` and that may be BC breaking.
Previously, when `keep_key=True`, the output will be `(key, item1, item2)`, now it will default to `(key, (item1, item2)`. It is possible to leave the default behavior unchanged.

Note that `KeyZipper` may be renamed to `IterZipper` in #50 to provide users with better clarity.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D31820203

Pulled By: NivekT

fbshipit-source-id: 670941e03595f0b0013c1162cfe1c22ec3ce6064
Since `MapZipper` has been added, the name `KeyZipper` is confusing and should be changed to `IterZipper` instead.

Note that this renaming is BC breaking for users.

Differential Revision: [D31487393](https://our.internmc.facebook.com/intern/diff/D31487393)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request Oct 28, 2021
ghstack-source-id: 9743b148616a8a5814af2a28d03f0b6253cee758
Pull Request resolved: #50
@NivekT
Copy link
Contributor Author

NivekT commented Oct 28, 2021

@ejguan any update on this thread? We can also add a deprecation warning and keep the old name working in the meantime.

@NivekT
Copy link
Contributor Author

NivekT commented Nov 5, 2021

@ejguan Since TorchVision is fine with this change, can we land it (maybe next week so that they can change things downstream as well)? Or do we need to follow up with TorchText too?

Copy link
Contributor

@ejguan ejguan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add Philip to see if there is objection

@ejguan ejguan requested a review from pmeier November 5, 2021 16:18
Copy link
Contributor

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't that be quite confusing when we still have the Zipper datapipe?

Given that it is also an IterDataPipe, you can't know which does what from the names: IterZipper vs Zipper. Maybe name it IterKeyZipper?

@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

Won't that be quite confusing when we still have the Zipper datapipe?

Given that it is also an IterDataPipe, you can't know which does what from the names: IterZipper vs Zipper. Maybe name it IterKeyZipper?

I think IterKeyZipper is better. @ejguan do you agree?

Since `MapZipper` has been added, the name `KeyZipper` is confusing and should be changed to `IterZipper` instead.

Note that this renaming is BC breaking for users.

Differential Revision: [D31487393](https://our.internmc.facebook.com/intern/diff/D31487393)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request Nov 8, 2021
ghstack-source-id: fdd2b8aa0a73cd1138e658af42973c99fa6ef3f9
Pull Request resolved: #50
@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

@NivekT has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

@pmeier I am importing this PR and will land it later today with IterKeyZipper as the name. Let me know if there is any issue. Thanks!

@NivekT NivekT changed the title Refactoring and renaming to KeyZipper to IterZipper Refactoring and renaming to KeyZipper to IterKeyZipper Nov 8, 2021
@pmeier
Copy link
Contributor

pmeier commented Nov 8, 2021

Will you also rename Zipper to IterZipper?

@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

Will you also rename Zipper to IterZipper?

No, there will be no change to Zipper at all.

@pmeier
Copy link
Contributor

pmeier commented Nov 8, 2021

Why not? If this PR is landed we would have

  • Zipper
  • MapZipper
  • IterKeyZipper

IIUC, Zipper also just works with IterDataPipe's, no? If that is the case, we should probably also namespace it.

@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

Why not? If this PR is landed we would have

  • Zipper
  • MapZipper
  • IterKeyZipper

IIUC, Zipper also just works with IterDataPipe's, no? If that is the case, we should probably also namespace it.

Because MapZipper is applied onto an IterDataPipe (as are the other two). The Map in MapZipper indicates that it zips a MapDataPipe onto an IterDataPipe

def __init__(
self,
source_iterdatapipe: IterDataPipe,
map_datapipe: MapDataPipe,
key_fn: Callable,
merge_fn: Optional[Callable] = None,
):
if not isinstance(map_datapipe, MapDataPipe):
raise TypeError(f"map_datapipe must be a MapDataPipe, but its type is {type(map_datapipe)} instead.")
self.source_iterdatapipe: IterDataPipe = source_iterdatapipe
self.map_datapipe: MapDataPipe = map_datapipe
self.key_fn: Callable = key_fn
self.merge_fn: Optional[Callable] = merge_fn
self.length: int = -1

I am open to the possibility that MapZipper should be changed to MapKeyZipper?

@ejguan maybe you want to weight in on this as well

@pmeier
Copy link
Contributor

pmeier commented Nov 8, 2021

Ohh, I'm sorry. I think I misunderstood the issue. IMO,

  • Zipper
  • IterKeyZipper
  • MapKeyZipper

would be best then.

@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

Ohh, I'm sorry. I think I misunderstood the issue. IMO,

  • Zipper
  • IterKeyZipper
  • MapKeyZipper

would be best then.

Great, I think that is more reasonable as well. I will rename MapZipper to MapKeyZipper. Though I expect most users to call it using zip_by_map so that should be fine regardless.

Since `MapZipper` has been added, the name `KeyZipper` is confusing and should be changed to `IterKeyZipper` instead.

Note that this renaming is BC breaking for users.

Differential Revision: [D31487393](https://our.internmc.facebook.com/intern/diff/D31487393)

[ghstack-poisoned]
NivekT added a commit that referenced this pull request Nov 8, 2021
…to MapKeyZipper

ghstack-source-id: 15a64f5716549fd0c9fd5b64f8ad4b728e80babe
Pull Request resolved: #50
@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

@NivekT has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@NivekT NivekT changed the title Refactoring and renaming to KeyZipper to IterKeyZipper Refactoring and renaming to KeyZipper to IterKeyZipper and MapZipper to MapKeyZipper Nov 8, 2021
@NivekT
Copy link
Contributor Author

NivekT commented Nov 8, 2021

@pmeier I will likely land this tomorrow morning (US EST). I will let you know so you can land the other PR as well. Thanks!

@ejguan
Copy link
Contributor

ejguan commented Nov 8, 2021

  • Zipper
  • IterKeyZipper
  • MapKeyZipper

This is great.

@facebook-github-bot
Copy link
Contributor

@NivekT merged this pull request in 3f2fee6.

@facebook-github-bot facebook-github-bot deleted the gh/NivekT/9/head branch November 13, 2021 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants