Tensor Dataset support #553
25adc3b to 1c6f9db
def data(self, batch_size):
    arr = self.df[0]
From the current tests, there is always only one item in self.df; I'm not sure whether that is by design.
We could keep doubling the dataset and then log the size of the pandas UDF input.
You can use https://github.com/eto-ai/rikai/blob/v0.1.4/experimental/tfhub/tests/test_tfhub.py to run the test.
This does not sound right. Can we check the batch size, etc.?
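A minimal sketch of the check suggested in this thread: keep doubling a small dataset, split it into batches, and log each batch size. This is not rikai or Spark code. A plain Python list stands in for the dataset and for the pandas UDF input, and `double_dataset` and `iter_batches` are hypothetical helpers introduced only for illustration.

```python
import logging
from typing import Iterator, List, Sequence

logger = logging.getLogger("batch_size_check")


def double_dataset(rows: Sequence[int], times: int) -> List[int]:
    """Return a copy of `rows` concatenated with itself `times` times over."""
    out = list(rows)
    for _ in range(times):
        out = out + out  # each pass doubles the number of rows
    return out


def iter_batches(rows: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `batch_size` rows, logging each size."""
    for start in range(0, len(rows), batch_size):
        batch = list(rows[start:start + batch_size])
        logger.info("batch starting at row %d has %d rows", start, len(batch))
        yield batch


# Hypothetical data: 3 rows doubled 3 times gives 3 * 2**3 = 24 rows.
rows = double_dataset([1, 2, 3], times=3)
sizes = [len(batch) for batch in iter_batches(rows, batch_size=10)]
```

If the real input only ever arrives as a single item, logging its length in the same way while the dataset grows would show whether that holds by design or only because the test data is small.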
13341c6 to ee0210f
@@ -0,0 +1,67 @@
# Copyright 2021 Rikai Authors
2022?
data = data.batch(batch_size)
data = PandasDataset(
    df, model.transform(), unpickle=is_udf, use_pil=True
).data(batch_size)
.data => .batch?
__all__ = ["PandasDataset"]

from rikai.types import Image
Where is Image used?
This is more complex than I thought; I will create another branch to avoid noise during debugging.
This patch solves #539 and #40