Fix Socrata previews returning null #674

Merged: mildbyte merged 1 commit into master from bugfix/fix-socrata-data-source on Apr 27, 2022
@mildbyte

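It looks like calling `row_to_json` on the whole row (`p.*`) directly against
the foreign table makes Multicorn think we're requesting no columns from the
FDW (note the empty `Columns:` line in the plan). This could be related to our
Multicorn changes, but it doesn't happen with the LQFDW: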

```sql
explain SELECT row_to_json(p.*) FROM sg_tmp_67cecbc993a0738c2f815e80169dafd3.some_table p LIMIT 10

Limit  (cost=20.00..20.02 rows=10 width=32)
  ->  Foreign Scan on some_table p  (cost=20.00..237.38 rows=94954 width=32)
        Multicorn: Socrata query to data.cityofnewyork.us
        Multicorn: Socrata dataset ID: 8wbx-tsch
        Multicorn: Query:
        Multicorn: Columns:
        Multicorn: Order: :id
```

As a result, this query returns JSON objects in which every field is NULL.
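To check for this symptom programmatically, one option is to look at the `Multicorn: Columns:` line in the `EXPLAIN` output. The following is only a sketch: the helper name `multicorn_columns` is hypothetical, and it assumes the plan text has the same layout as the output shown in this PR.

```python
from typing import List, Optional

PREFIX = "Multicorn: Columns:"


def multicorn_columns(explain_output: str) -> Optional[List[str]]:
    """Return the columns Multicorn reports in an EXPLAIN plan.

    Returns [] when the "Columns:" line is present but empty (the symptom
    described above) and None when the plan has no such line at all.
    """
    for line in explain_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(PREFIX):
            raw = stripped[len(PREFIX):].strip()
            if not raw:
                return []
            # Column names are backtick-quoted and comma-separated.
            return [col.strip().strip("`") for col in raw.split(",")]
    return None
```

An empty list for a query that is expected to return data suggests the FDW was asked for no columns.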

This PR rewrites the preview query to fetch the first 10 rows in a materialized
CTE and then apply `row_to_json` to the result, which seems to fix the issue.
For comparison, a plain `SELECT t.*` requests every column, and so does the CTE
version:

```sql
explain SELECT t.* FROM sg_tmp_67cecbc993a0738c2f815e80169dafd3.some_table t LIMIT 10

Limit  (cost=20.00..33620.00 rows=10 width=3360)
  ->  Foreign Scan on some_table t  (cost=20.00..319045440.00 rows=94954 width=3360)
        Multicorn: Socrata query to data.cityofnewyork.us
        Multicorn: Socrata dataset ID: 8wbx-tsch
        Multicorn: Query:
        Multicorn: Columns: `last_date_updated`,`license_type`,`last_time_updated`,`vehicle_license_number`,`wheelchair_accessible`,`vehicle_year`,`permit_license_number`,`order_date`,`certification_date`,`base_address`,`base_name`,`reason`,`hack_up_date`,`expiration_date`,`vehicle_vin_number`,`base_number`,`base_telephone_number`,`dmv_license_plate_number`,`website`,`active`,`base_type`,`veh`,`name`
        Multicorn: Order: :id

explain with p as materialized
    (SELECT * FROM sg_tmp_67cecbc993a0738c2f815e80169dafd3.some_table LIMIT 10)
select row_to_json(p.*) from p

CTE Scan on p  (cost=33620.00..33620.22 rows=10 width=32)
  CTE p
    ->  Limit  (cost=20.00..33620.00 rows=10 width=3360)
          ->  Foreign Scan on some_table  (cost=20.00..319045440.00 rows=94954 width=3360)
                Multicorn: Socrata query to data.cityofnewyork.us
                Multicorn: Socrata dataset ID: 8wbx-tsch
                Multicorn: Query:
                Multicorn: Columns: `last_date_updated`,`license_type`,`last_time_updated`,`vehicle_license_number`,`wheelchair_accessible`,`vehicle_year`,`permit_license_number`,`order_date`,`certification_date`,`base_address`,`base_name`,`reason`,`hack_up_date`,`expiration_date`,`vehicle_vin_number`,`base_number`,`base_telephone_number`,`dmv_license_plate_number`,`website`,`active`,`base_type`,`veh`,`name`
                Multicorn: Order: :id
```
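A minimal sketch of how such a preview query could be assembled in Python is shown here. The function names `quote_ident` and `build_preview_query` are hypothetical and are not taken from the actual change in this PR; the sketch also assumes PostgreSQL 12 or later, since `AS MATERIALIZED` is not available in earlier versions.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_preview_query(schema: str, table: str, limit: int = 10) -> str:
    """Build a preview query that converts rows to JSON outside the foreign scan.

    The inner SELECT * runs inside a materialized CTE, so the FDW is asked for
    every column; row_to_json is then applied to the CTE result rather than
    to the foreign table itself.
    """
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return (
        f"WITH p AS MATERIALIZED (SELECT * FROM {target} LIMIT {int(limit)}) "
        "SELECT row_to_json(p.*) FROM p"
    )
```

In real code, a driver-level composition helper such as `psycopg2.sql.Identifier` would be a safer choice for quoting than a hand-written function.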
@mildbyte mildbyte merged commit 40520f7 into master Apr 27, 2022
@mildbyte mildbyte deleted the bugfix/fix-socrata-data-source branch September 13, 2022 10:06