Field order issue in loading json #2548

Closed
luyug opened this issue Jun 24, 2021 · 1 comment
Labels
bug Something isn't working

luyug commented Jun 24, 2021

Describe the bug

The load_dataset function expects columns in alphabetical order when loading JSON files.

A similar bug was previously reported for CSV in #623 and fixed in #684.

Steps to reproduce the bug

For a JSON file j.json containing:

{"c":321, "a": 1, "b": 2}

Running the following code:

import datasets
from datasets import Value
f = datasets.Features({'a': Value('int32'), 'b': Value('int32'), 'c': Value('int32')})
json_data = datasets.load_dataset('json', data_files='j.json', features=f)

Expected results

A successful load.

Actual results

File "pyarrow/table.pxi", line 1409, in pyarrow.lib.Table.cast
ValueError: Target schema's field names are not matching the table's field names: ['c', 'a', 'b'], ['a', 'b', 'c']
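
The two lists in the error message hold the same names and differ only in order. The sketch below uses plain Python lists copied from the message to show why an order-sensitive comparison rejects them. It assumes the check behaves like ordinary list equality, which the message suggests but which is not confirmed here; the variable names are illustrative only.

```python
# Field names copied from the error message above.
table_names = ["c", "a", "b"]    # order found in j.json
target_names = ["a", "b", "c"]   # order of the requested features

# List equality is order-sensitive, so this comparison fails ...
same_order = table_names == target_names
# ... even though both lists contain exactly the same names.
same_names = sorted(table_names) == sorted(target_names)

print(same_order, same_names)
```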

Environment info

  • datasets version: 1.8.0
  • Platform: Linux-3.10.0-957.1.3.el7.x86_64-x86_64-with-glibc2.10
  • Python version: 3.8.8
  • PyArrow version: 3.0.0

luyug added the bug label on Jun 24, 2021
luyug changed the title from "Field Order Issue in loading json" to "Field order issue in loading json" on Jun 24, 2021

albertvillanova (Member) commented Jun 24, 2021

Hi @luyug, thanks for reporting.

The good news is that we fixed this issue only 9 days ago: #2507.

The patch is already in the master branch of our repository and will be included in the next datasets release, version 1.9.0.

Feel free to reopen the issue if the problem persists.
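
For anyone who has to stay on 1.8.0 in the meantime, one possible workaround is to rewrite the file so that each record lists its keys in the same order as the features. The sketch below uses only the standard library. `reorder_json_lines` is a hypothetical helper, not part of datasets, and it assumes the file holds one JSON object per line, as in j.json above. Whether this is enough to avoid the error on 1.8.0 has not been verified here.

```python
import json


def reorder_json_lines(src_path, dst_path, field_order):
    """Copy a JSON Lines file, writing each record's keys in field_order.

    Hypothetical helper for illustration; not part of the datasets library.
    Keys that are not listed in field_order are dropped.
    """
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            # Raises KeyError if a record lacks one of the expected fields.
            ordered = {name: record[name] for name in field_order}
            dst.write(json.dumps(ordered) + "\n")
```

A call such as `reorder_json_lines("j.json", "j_ordered.json", ["a", "b", "c"])` would write the reordered copy, which could then be passed as `data_files`. The output file name is arbitrary.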
