https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.json_tuple.html
Currently, if you want the new columns to match the field names, you have to specify those names twice: once to extract the fields and once to alias the resulting columns away from the defaults "c0", "c1", etc.
json_tuple(df.jstring, 'f1', 'f2').alias("f1", "f2")
You can put the field names into a tuple and unpack it with *, but it would be a nice convenience if the column names were taken from the fields automatically. It would also be nice if the schema were inferred with proper JSON data types; perhaps schema_of_json already handles that.
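To make the proposal concrete, here is a plain-Python sketch of the desired behaviour using only the stdlib json module (no Spark required). The function name json_tuple_named is hypothetical; it mimics what json_tuple could do if it named each output after the field it extracts.

```python
import json

def json_tuple_named(jstring, *fields):
    """Extract the requested top-level fields from a JSON string,
    keying each result by the field name itself -- the auto-naming
    behaviour proposed above (hypothetical helper, not a Spark API)."""
    parsed = json.loads(jstring)
    return {f: parsed.get(f) for f in fields}

row = json_tuple_named('{"f1": 1, "f2": "a", "f3": true}', "f1", "f2")
# row == {"f1": 1, "f2": "a"}
```

In Spark itself, the same idea could presumably be a thin wrapper that forwards the field list to both the extraction and the alias call, e.g. `json_tuple(col, *fields).alias(*fields)`, so the names are only written once.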