Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request: Handling for objects in json_extract_values function #350

Closed
sgoley opened this issue Feb 24, 2023 · 2 comments
Closed

Request: Handling for objects in json_extract_values function #350

sgoley opened this issue Feb 24, 2023 · 2 comments

Comments

@sgoley
Copy link

sgoley commented Feb 24, 2023

Issue:
bqutils.fn.json_extract_values returns unusable values [object Object] when children are objects / sub collections.

Example:
image

Sample query:

-- (select body from`bigquery-public-data.geo_whos_on_first.geojson` limit 100 )

with geojson as 
(select """{"id": 1259392143, "type": "Feature", "properties": {"edtf:cessation": "uuuu", "edtf:inception": "uuuu", "geom:area": 0.0, "geom:area_square_m": 0.0, "geom:bbox": "116.97547,32.41412,116.97547,32.41412", "geom:latitude": 32.41412, "geom:longitude": 116.97547, "gn:admin1_code": "1.0", "gn:asciiname": "Liuyingcun", "gn:country_code": "CN", "gn:dem": 25, "gn:feature_class": "P", "gn:feature_code": "PPL", "gn:geonameid": 10417093, "gn:latitude": 32.41412, "gn:longitude": 116.97547, "gn:modification_date": "2015-08-08", "gn:name": "Liuyingcun", "gn:population": 0, "gn:timezone": "Asia/Shanghai", "iso:country": "CN", "mz:hierarchy_label": 1, "mz:is_current": -1, "name:zho_x_preferred": ["\u5218\u90e2\u6751"], "src:geom": "geonames", "wof:belongsto": [85669739, 102191569, 85632695, 136253041, 890516233], "wof:breaches": [], "wof:concordances": {"gn:id": 10417093}, "wof:country": "CN", "wof:geomhash": "cabdc55714fc98313d2246b1c4f7fb0b", "wof:hierarchy": [{"continent_id": 102191569, "country_id": 85632695, "county_id": 890516233, "empire_id": 136253041, "locality_id": 1259392143, "region_id": 85669739}], "wof:id": 1259392143, "wof:lastmodified": 1537613321, "wof:name": "Liuyingcun", "wof:parent_id": 890516233, "wof:placetype": "locality", "wof:repo": "whosonfirst-data", "wof:superseded_by": [], "wof:supersedes": [], "wof:tags": []}, "bbox": [116.97547, 32.41412, 116.97547, 32.41412], "geometry": {"coordinates": [116.97547, 32.41412], "type": "Point"}}""" as body)

SELECT bqutil.fn.json_extract_keys(body) as jkeys, bqutil.fn.json_extract_values(body) as jvalues
FROM geojson

I used a single row here because the public dataset is 14GB and unpartitioned.

Desired Output:
bqutils.fn.json_extract_values returns a usable value (even just a json string) of the contents of those objects.

Thanks!

@sgoley sgoley changed the title Request: Handling for objects in json_extract_keys function Request: Handling for objects in json_extract_values function Feb 24, 2023
@sgoley
Copy link
Author

sgoley commented Feb 24, 2023

Alternative might be to build something like this as a whole separate UDF:

-- extract all key value pairs as an array from a json dict
-- input: json string with a dictionary
-- returns:  list of struct <key, value>
CREATE TEMP  FUNCTION EXTRACT_KV_PAIRS(json_str STRING)
RETURNS ARRAY<STRUCT<key STRING, value STRING>>
LANGUAGE js AS """
  try{ 
    const json_dict = JSON.parse(json_str); 
    const all_kv = Object.entries(json_dict).map(
        (r)=>Object.fromEntries([["key", r[0]],["value",  
                                   JSON.stringify(r[1])]]));
    return all_kv;
  } catch(e) { return [{"key": "error","value": e}];}
""";

Source: https://medium.com/google-cloud/extracting-json-key-value-pairs-in-bigquery-1bb9d0ec0b6d

@afleisc
Copy link
Collaborator

afleisc commented May 13, 2024

@plaflamme Just contributed the json_extract_key_value_pairs function in #408 which should help with this use case, please let us know if there are any issues

@afleisc afleisc closed this as completed May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants