Feature request: infer field names in json_tuple #55632

@jxu

Description

https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.json_tuple.html

Currently, if you want the new columns to be named after the extracted fields, you have to specify the field names twice: once to extract the fields, and again to alias the results, which otherwise get the default names "c0", "c1", etc.

json_tuple(df.jstring, 'f1', 'f2').alias("f1", "f2")

You can put the field names into a tuple and unpack it with * in both places, but it would be a nice convenience if the output columns took on the field names automatically. It would also be useful if the schema were inferred with JSON data types, though maybe that is already handled by schema_of_json.
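For reference, here is a rough sketch of the tuple-unpacking workaround, plus the schema_of_json route mentioned above. The names `FIELDS`, `json_tuple_named`, and `demo` are hypothetical (not part of PySpark), and the sample JSON is made up for illustration. Imports are kept inside the functions so the helper can be defined without starting Spark; `demo()` needs a working local Spark installation.

```python
FIELDS = ("f1", "f2")  # hypothetical field names, matching the example above


def json_tuple_named(col, *fields):
    """Extract `fields` from a JSON string column and alias each output
    column to its field name, so the names are written only once."""
    from pyspark.sql.functions import json_tuple

    # alias() accepts several names when the expression yields several columns
    return json_tuple(col, *fields).alias(*fields)


def demo():
    """Show both approaches on a one-row DataFrame (requires Spark)."""
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, lit, schema_of_json

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    sample = '{"f1": "value1", "f2": 2}'  # made-up sample document
    df = spark.createDataFrame([("1", sample)], ("key", "jstring"))

    # Workaround: columns come out named f1 and f2 instead of c0 and c1
    df.select(df.key, json_tuple_named(df.jstring, *FIELDS)).show()

    # Alternative: infer a typed schema from a sample literal, then parse with it
    schema = schema_of_json(lit(sample))
    parsed = df.select(df.key, from_json(df.jstring, schema).alias("parsed"))
    parsed.select("key", "parsed.*").show()
```

The wrapper only saves the repetition; as far as I can tell json_tuple still returns every field as a string. The from_json variant yields typed columns, but the schema is inferred from a single sample literal rather than from the data in the column, so it is only as good as that sample.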
