nested sequence feature won't encode example if the first item of the outside sequence is an empty list #3306

Closed
function2-llx opened this issue Nov 20, 2021 · 3 comments · Fixed by #3402
Labels: bug

@function2-llx

Describe the bug

As the title says, a nested Sequence feature does not encode an example if the first item of the outer sequence is an empty list.

Steps to reproduce the bug

from datasets import Features, Sequence, ClassLabel
features = Features({
    'x': Sequence(Sequence(ClassLabel(names=['a', 'b']))),
})
print(features.encode_batch({
    'x': [
        [['a'], ['b']],
        [[], ['b']],
    ]
}))

Expected results

Prints {'x': [[[0], [1]], [[], [1]]]}

Actual results

Prints {'x': [[[0], [1]], [[], ['b']]]}

Environment info

  • datasets version: 1.15.1
  • Platform: Linux-5.13.0-21-generic-x86_64-with-glibc2.34
  • Python version: 3.9.7
  • PyArrow version: 6.0.0

Additional information

I think the issue stems from here.
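
For illustration only, here is a minimal pure-Python sketch of one way this symptom can arise. It is not taken from the datasets source, and the link target above is not recoverable here, so every name in it (LABELS, encode_leaf, encode_shortcut, encode_all) is hypothetical. It assumes the encoder uses a "probe the first item" shortcut: if encoding the first item of a list changes nothing, the whole list is returned as-is. An empty inner list encodes to itself, so the shortcut skips the rest of the example.

```python
from typing import Any, Dict

# Hypothetical label mapping, standing in for ClassLabel(names=['a', 'b']).
LABELS: Dict[str, int] = {"a": 0, "b": 1}


def encode_leaf(value: str) -> int:
    """Map a label name to its integer id."""
    return LABELS[value]


def encode_shortcut(obj: Any) -> Any:
    """Encode nested lists, probing only the first item of each list.

    Assumed behaviour (not confirmed against the library): if encoding the
    first item leaves it unchanged, the whole list is returned unencoded.
    """
    if isinstance(obj, list):
        if not obj:
            return obj
        # An empty first item encodes to itself, so this test wrongly passes.
        if encode_shortcut(obj[0]) == obj[0]:
            return obj
        return [encode_shortcut(o) for o in obj]
    return encode_leaf(obj)


def encode_all(obj: Any) -> Any:
    """Encode every item unconditionally, without the first-item probe."""
    if isinstance(obj, list):
        return [encode_all(o) for o in obj]
    return encode_leaf(obj)


batch = [
    [["a"], ["b"]],
    [[], ["b"]],
]

shortcut_result = [encode_shortcut(example) for example in batch]
full_result = [encode_all(example) for example in batch]

print(shortcut_result)
print(full_result)
```

Under these assumptions, the shortcut version leaves 'b' unencoded in the second example, matching the actual results reported above, while the unconditional version encodes it. Whether the real fix drops the probe or probes the first non-empty item instead is not something this sketch can settle.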

function2-llx added the bug label on Nov 20, 2021
@function2-llx
Author

knock knock

@mariosasko
Collaborator

Hi, thanks for reporting! I've linked a PR that should fix the issue.

@function2-llx
Author

I've checked the PR and it looks great, thanks a lot!
