Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failed when reading JSON with nullable fields #7145

Closed
2 tasks done
matt-allan opened this issue Apr 19, 2023 · 1 comment
Closed
2 tasks done

Assertion failed when reading JSON with nullable fields #7145

matt-allan opened this issue Apr 19, 2023 · 1 comment

Comments

@matt-allan
Copy link
Contributor

matt-allan commented Apr 19, 2023

What happens?

I ran into the following error while querying a few JSON files with the json extension from Node:

node: ../src/duckdb/extension/json/json_functions/read_json.cpp:251: void duckdb::ReadJSONFunction(duckdb::ClientContext&, duckdb::TableFunctionInput&, duckdb::DataChunk&): Assertion `result_vectors.size() == gstate.bind_data.names.size()' failed.

The JSON I am querying is essentially a linked list that looks like this:

// data_1.json
{
  "id": 1,
   "next": null
}
// data_2.json
{
  "id": 2,
  "next": 1
}

I am using a recursive CTE to query the files which looks like this:

CREATE OR REPLACE VIEW segments AS WITH RECURSIVE
raw_segments AS (
  SELECT *
  FROM read_json(
    '/tmp/duckdb-reprofOPqUc/*.json',
    columns = {
      id: 'INTEGER',
      next: 'INTEGER'
  })
),
sorted_segments AS (
  SELECT
    0 AS seq,
    raw_segments.*,
    FROM raw_segments
  WHERE raw_segments.next IS NULL
  UNION ALL
  (
    SELECT
      sorted_segments.seq + 1 AS seq,
      raw_segments.*,
      FROM raw_segments
      INNER JOIN sorted_segments ON (raw_segments.next = sorted_segments.id)
  )
)
SELECT *
FROM sorted_segments
ORDER BY seq;

Running the same query from the DuckDB CLI works correctly on 0.7.1, while using 0.7.1 from Node does not (still need to try comparing with exact versions though).

$ cat repro.sql|duckdb
┌───────┬───────┬───────┐
│  seq  │  id   │ next  │
│ int32 │ int32 │ int32 │
├───────┼───────┼───────┤
│     0 │     1 │       │
│     1 │     2 │     1 │
│     2 │     3 │     2 │
└───────┴───────┴───────┘

It also works from Node.JS if I make next always an integer and never null. If I first insert the JSON into a table, then query that it prevents the issue as well.

To Reproduce

I've attached a file with a Node.JS script that reproduces the issue. There's also a SQL script and a folder of test data for comparison.

duckdb-issue.zip

OS:

Arch Linux x86_64

DuckDB Version:

0.7.2-dev2233 / c81600e

DuckDB Client:

Node

Full Name:

Matt Allan

Affiliation:

Destructure LLC (consulting)

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
@lnkuiper
Copy link
Contributor

Closed via #7478

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants