[SPARK-36779][PYTHON] Fix when list of data type tuples has len = 1 #34019

@dgd-contributor commented Sep 16, 2021

What changes were proposed in this pull request?

Fix `DataFrame` type-hint parsing when the list of (name, type) data type tuples has exactly one element.

Why are the changes needed?

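Type-hint parsing fails when the data type list holds a single (name, type) tuple. A one-element list of plain types is accepted, but a one-element list of named types raises TypeError: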

>>> ps.DataFrame[("a", int), [int]]
typing.Tuple[pyspark.pandas.typedef.typehints.IndexNameType, int]

>>> ps.DataFrame[("a", int), [("b", int)]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dgd/spark/python/pyspark/pandas/frame.py", line 11998, in __class_getitem__
    return create_tuple_for_frame_type(params)
  File "/Users/dgd/spark/python/pyspark/pandas/typedef/typehints.py", line 685, in create_tuple_for_frame_type
    return Tuple[extract_types(params)]
  File "/Users/dgd/spark/python/pyspark/pandas/typedef/typehints.py", line 755, in extract_types
    return (index_type,) + extract_types(data_types)
  File "/Users/dgd/spark/python/pyspark/pandas/typedef/typehints.py", line 770, in extract_types
    raise TypeError(
TypeError: Type hints should be specified as one of:
  - DataFrame[type, type, ...]
  - DataFrame[name: type, name: type, ...]
  - DataFrame[dtypes instance]
  - DataFrame[zip(names, types)]
  - DataFrame[index_type, [type, ...]]
  - DataFrame[(index_name, index_type), [(name, type), ...]]
  - DataFrame[dtype instance, dtypes instance]
  - DataFrame[(index_name, index_type), zip(names, types)]
However, got [('b', <class 'int'>)].

Does this PR introduce any user-facing change?

Yes. After this change, the single-tuple form is accepted:

>>> ps.DataFrame[("a", int), [("b", int)]]
typing.Tuple[pyspark.pandas.typedef.typehints.IndexNameType, pyspark.pandas.typedef.typehints.NameType]
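The edge case can be illustrated without Spark. The sketch below is not the code in typehints.py; `normalize_data_types` is a hypothetical helper, written only to show one way of treating a data type list uniformly whether it holds one entry or many, and whether the entries are bare types or (name, type) tuples.

```python
from typing import Any, List, Optional, Tuple


def normalize_data_types(data_types: List[Any]) -> List[Tuple[Optional[Any], Any]]:
    """Hypothetical helper: turn a data type list into (name, type) pairs.

    A bare type such as ``int`` becomes ``(None, int)``; a ``(name, type)``
    tuple is kept as-is. The list length plays no role in the decision, so a
    one-element list takes the same path as a longer one.
    """
    pairs = []
    for item in data_types:
        if isinstance(item, tuple):
            if len(item) != 2:
                raise TypeError(
                    "Expected a (name, type) tuple, got %r." % (item,)
                )
            name, tpe = item
        else:
            # Unnamed column: only the type was given.
            name, tpe = None, item
        pairs.append((name, tpe))
    return pairs


single_named = normalize_data_types([("b", int)])
single_plain = normalize_data_types([int])
several_named = normalize_data_types([("b", int), ("c", float)])
```

Deciding per element rather than from the shape of the whole list is the design choice here: the single-tuple list that failed in the report above is then handled like any other.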

How was this patch tested?

Existing tests.

@dgd-contributor changed the title from "[SPARK-36779] Fix when list of data type tuples has len = 1" to "[SPARK-36779][PYTHON] Fix when list of data type tuples has len = 1" on Sep 16, 2021
@AmplabJenkins

Can one of the admins verify this patch?

@dgd-contributor
Author

@HyukjinKwon could you take a look?

@HyukjinKwon
Member

Oh, thanks!

@HyukjinKwon
Member

Merged to master.
