Frame and FrameBatch improvements #283
    if self.data.ndim == 4:
        return Frame(
            data=data,
            pts_seconds=float(pts_seconds.item()),
            duration_seconds=float(duration_seconds.item()),
        )
    else:
        return FrameBatch(
            data=data,
            pts_seconds=pts_seconds,
            duration_seconds=duration_seconds,
        )
Tensor has an .item() method that returns the underlying value as a plain Python scalar.
Should we have something like that here? i.e. always return a FrameBatch but return a Frame if .item() is called?
> always return a FrameBatch but return a Frame if .item() is called?
I feel like this is what we're already doing, but perhaps I'm misunderstanding?
BTW, this quirk is only needed for mypy (sigh). Originally the code was simpler:
    cls = Frame if self.data.ndim == 4 else FrameBatch
    return cls(
        self.data[key],
        self.pts_seconds[key],
        self.duration_seconds[key],
    )
and everything was fine: the Frame would get proper float values because of what we do in its __post_init__. But mypy was complaining, so I had to go for this in 4661237 (#283).
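For readers unfamiliar with the hook mentioned above, here is a minimal sketch of how a dataclass `__post_init__` can coerce one-element tensors into Python floats. `FrameSketch` is a hypothetical stand-in written for illustration, not the actual Frame class; only the field names are taken from the snippet above.

```python
from dataclasses import dataclass

import torch


@dataclass
class FrameSketch:
    # Hypothetical stand-in for Frame; field names follow the snippet above.
    data: torch.Tensor
    pts_seconds: float
    duration_seconds: float

    def __post_init__(self):
        # Called right after the generated __init__. float() accepts any
        # one-element tensor and returns a Python float.
        self.pts_seconds = float(self.pts_seconds)
        self.duration_seconds = float(self.duration_seconds)


# At runtime the tensors are accepted and converted to floats.
frame = FrameSketch(
    data=torch.zeros(3, 4, 4),
    pts_seconds=torch.tensor(1.5),
    duration_seconds=torch.tensor(0.25),
)
```

A static type checker only sees the `float` annotations, so code like this can work at runtime and still be rejected at type-check time.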
> I feel like this is what we're already doing, but perhaps I'm misunderstanding?
Don't we return a Frame for the special case of dimensions=4?
What I am saying is that we should return a FrameBatch (of size 1) even in that case, so that we are consistent with Tensor.
I'll leave it to you
I don't have a super strong preference on this - let me open an issue so we can discuss during one of the meetings
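To make the alternative discussed in this thread concrete, here is a rough sketch. The first part shows real Tensor behavior: integer indexing still returns a Tensor, and `.item()` extracts the Python number. The second part is a hypothetical "always return a batch" design. `FrameSketch`, `BatchSketch` and its `item()` method are made up for illustration and are not part of this PR.

```python
from dataclasses import dataclass

import torch

# Tensor behavior: integer indexing keeps the result a Tensor (0-dim here),
# and .item() converts a one-element tensor to a plain Python number.
t = torch.tensor([1.5, 2.5])
assert isinstance(t[0], torch.Tensor) and t[0].ndim == 0
assert t[0].item() == 1.5


@dataclass
class FrameSketch:
    # Hypothetical single-frame container.
    data: torch.Tensor  # (C, H, W)
    pts_seconds: float


@dataclass
class BatchSketch:
    # Hypothetical batch container: data is (N, C, H, W), pts_seconds is (N,).
    data: torch.Tensor
    pts_seconds: torch.Tensor

    def __getitem__(self, key: int) -> "BatchSketch":
        # Re-add the leading dimension so that an integer index still
        # yields a batch, here of size 1.
        return BatchSketch(
            data=self.data[key].unsqueeze(0),
            pts_seconds=self.pts_seconds[key].unsqueeze(0),
        )

    def item(self) -> FrameSketch:
        # Only meaningful for a batch of exactly one frame, in the spirit
        # of Tensor.item().
        if self.data.shape[0] != 1:
            raise ValueError("item() requires a batch of exactly one frame")
        return FrameSketch(self.data[0], self.pts_seconds.item())


batch = BatchSketch(
    data=torch.zeros(2, 3, 4, 4),
    pts_seconds=torch.tensor([0.0, 0.5]),
)
single = batch[1]      # still a BatchSketch, of size 1
frame = single.item()  # explicit conversion to a single frame
```

The trade-off is that callers always get one return type from indexing, at the cost of an extra `.item()` call whenever a single frame is wanted.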
This PR:
- [...] Frame or a FrameBatch. We enforce Frame data to be 3D, and FrameBatch data to be >= 4D. We also ensure consistency of leading dimensions between data, pts_seconds and duration_seconds.
- [...] FrameBatch. This supports pytorch fancy indexing naturally and intuitively. Note that indexing a 4D FrameBatch returns a Frame.
- [...] FrameBatch. This removes the "tuple unpacking" behavior that we had for FrameBatch, but luckily this is not something we have been using at all. The unpacking behavior of Frame is preserved.

These changes are mostly necessary in order for us to change the output of the samples from List[FrameBatch(4D)] to FrameBatch (5D), as done in #284.
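The behavior described above can be sketched roughly as follows. This is a simplified re-implementation for illustration, not the code from this PR. It assumes that the last three dimensions of `data` are the frame dimensions, that iterating over a batch yields its elements (a guess at what replaces tuple unpacking), and that the return type is chosen from the dimensionality of the indexed result.

```python
from dataclasses import dataclass
from typing import Iterator, Union

import torch


@dataclass
class Frame:
    # Sketch only: a single frame with 3D data, e.g. (C, H, W).
    data: torch.Tensor
    pts_seconds: float
    duration_seconds: float

    def __post_init__(self):
        if self.data.ndim != 3:
            raise ValueError(f"Frame data must be 3D, got {self.data.ndim}D")
        self.pts_seconds = float(self.pts_seconds)
        self.duration_seconds = float(self.duration_seconds)

    def __iter__(self):
        # Keeps tuple unpacking: data, pts, duration = frame
        yield self.data
        yield self.pts_seconds
        yield self.duration_seconds


@dataclass
class FrameBatch:
    # Sketch only: data is >= 4D, e.g. (N, C, H, W) or (M, N, C, H, W).
    data: torch.Tensor
    pts_seconds: torch.Tensor
    duration_seconds: torch.Tensor

    def __post_init__(self):
        if self.data.ndim < 4:
            raise ValueError(f"FrameBatch data must be >= 4D, got {self.data.ndim}D")
        # Assumption: everything before the last three dims is "leading".
        leading = self.data.shape[:-3]
        if self.pts_seconds.shape != leading or self.duration_seconds.shape != leading:
            raise ValueError("pts_seconds and duration_seconds must match data's leading dims")

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, key) -> Union[Frame, "FrameBatch"]:
        data = self.data[key]
        pts_seconds = self.pts_seconds[key]
        duration_seconds = self.duration_seconds[key]
        if data.ndim == 3:
            # The index removed the last leading dimension: a single frame.
            return Frame(data, float(pts_seconds.item()), float(duration_seconds.item()))
        return FrameBatch(data, pts_seconds, duration_seconds)

    def __iter__(self) -> Iterator[Union[Frame, "FrameBatch"]]:
        for i in range(len(self)):
            yield self[i]


batch = FrameBatch(
    data=torch.zeros(5, 3, 4, 4),
    pts_seconds=torch.arange(5, dtype=torch.float64),
    duration_seconds=torch.full((5,), 0.25, dtype=torch.float64),
)
frame = batch[2]              # integer index on a 4D batch gives a Frame
sub = batch[[0, 3]]           # fancy indexing gives a smaller FrameBatch
data, pts, duration = frame   # Frame still unpacks like a tuple
frames = list(batch)          # iterating a 4D batch gives Frames
```

With a 5D batch, the same indexing and iteration would produce 4D FrameBatch objects instead of Frames, which is the shape of output the last paragraph refers to.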