-
Notifications
You must be signed in to change notification settings - Fork 601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AL-1579] Auto htype #1370
[AL-1579] Auto htype #1370
Conversation
@@ -811,3 +811,17 @@ def test_empty_extend(memory_ds): | |||
ds.create_tensor("y") | |||
ds.y.extend(np.zeros((len(ds), 3))) | |||
assert len(ds) == 0 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test case for when set_htype
is called and self.htype
is not None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_htype is internal , analogous to set_dtype, that case shouldn't be hit from user space.
@@ -811,3 +811,17 @@ def test_empty_extend(memory_ds): | |||
ds.create_tensor("y") | |||
ds.y.extend(np.zeros((len(ds), 3))) | |||
assert len(ds) == 0 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test case for when set_htype
is called and self.length
is > 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_htype is internal , analogous to set_dtype, that case shouldn't be hit from user space.
@@ -88,6 +80,38 @@ def set_dtype(self, dtype: np.dtype): | |||
|
|||
self.dtype = dtype.name | |||
|
|||
def set_htype(self, htype: str, **kwargs): | |||
"""Should only be called once.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More documentation please.
- If self.htype is present, it must be None.
- If self.length is present, it must be zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstr is similar to that of set_dtype. This is an internal method and the conditions are explicit in the checks.
"""Should only be called once.""" | ||
ffw_tensor_meta(self) | ||
|
||
if getattr(self, "htype", None) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test case when htype is not yet present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is hit by default when create_tensor is called.
f"Tensor meta already has a htype ({self.htype}). Incoming: {htype}." | ||
) | ||
|
||
if getattr(self, "length", 0) > 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test case when length is not yet present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is hit by default when create_tensor is called.
hub/util/casting.py
Outdated
@@ -37,6 +37,21 @@ def get_dtype(val: Union[np.ndarray, Sequence, Sample]) -> np.dtype: | |||
raise TypeError(f"Cannot infer numpy dtype for {val}") | |||
|
|||
|
|||
def get_htype(val: Union[np.ndarray, Sequence, Sample]) -> str: | |||
if isinstance(val, np.ndarray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docstring.
@@ -37,6 +37,21 @@ def get_dtype(val: Union[np.ndarray, Sequence, Sample]) -> np.dtype: | |||
raise TypeError(f"Cannot infer numpy dtype for {val}") | |||
|
|||
|
|||
def get_htype(val: Union[np.ndarray, Sequence, Sample]) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
samples
instead of val
is easier to parse for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method is analogous to get_dtype.
if isinstance(val, np.ndarray): | ||
return "generic" | ||
types = set((map(type, val))) | ||
if dict in types: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't all types have to be dict? Or, what's the case when not all are dicts (please add test case)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test added.
Codecov Report
@@ Coverage Diff @@
## main #1370 +/- ##
==========================================
- Coverage 92.27% 92.22% -0.05%
==========================================
Files 174 174
Lines 13850 13906 +56
==========================================
+ Hits 12780 12825 +45
- Misses 1070 1081 +11
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
🚀 🚀 Pull Request
Checklist:
coverage-rate
upChanges
Detect json and text htypes automatically. https://activeloop.atlassian.net/browse/AL-1579