This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Review Column annotations #2

Closed
zero323 opened this issue Feb 10, 2017 · 4 comments

Comments

@zero323
Owner

zero323 commented Feb 10, 2017

Currently there are a number of problems with pyspark.sql.column annotations. Some are related to Mypy behavior:

others, like bitwise*, to the vague upstream semantics (should we allow Any if the only literal type acceptable at runtime is int?).

@harpaj
Contributor

harpaj commented Jun 3, 2019

@zero323,
The following valid snippet currently results in an error:

df = df.withColumn("score", sum((F.col("sim"), F.col("weight"))))
(note that sum is the stdlib sum, not the one from pyspark functions)

The error is Argument 2 to "withColumn" of "DataFrame" has incompatible type "Union[Column, int]"; expected "Column"

We actually do have a proper annotation for __radd__ on Column, but it carries a type: ignore. This sounds connected to the python/mypy#2129 you mention above, but that one has been fixed. Do you think the type: ignore can be safely removed now?

@zero323
Owner Author

zero323 commented Jul 1, 2019

@harpaj Indeed, it looks like it should be safe to remove type: ignore now.

However, I don't think that's really the source of the problem here. Even with the ignore in place, this for example is valid:

expr: Column = 1 + col("b")

It looks like the problem is more that we don't have dependent types here, and mypy cannot infer that sum(NonEmptyIterable[Column]) is Column. Instead it considers both cases:

  • the iterable is empty and we default to int,
  • the iterable is non-empty and we get Column.

If you want sum to type check, you should rather start with a literal:

sum([F.col("sim"), F.col("weight")], F.lit(0))
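
A minimal sketch of the two cases, assuming a hypothetical stand-in class `Expr` in place of the real `pyspark.sql.Column` (which needs a running Spark session to construct). Only the `__add__` / `__radd__` overloads matter for the argument, and the `isinstance` check is there just to narrow the one-argument result for a type checker:

```python
from __future__ import annotations

from typing import Union


class Expr:
    """Hypothetical stand-in for pyspark.sql.Column (not the real class)."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __add__(self, other: Union[Expr, int]) -> Expr:
        return Expr(f"({self.text} + {_text(other)})")

    def __radd__(self, other: Union[Expr, int]) -> Expr:
        # Called for `0 + Expr(...)`, which is how one-argument sum() starts.
        return Expr(f"({_text(other)} + {self.text})")


def _text(value: Union[Expr, int]) -> str:
    """Render either an Expr or a plain int as text."""
    return value.text if isinstance(value, Expr) else str(value)


cols = [Expr("sim"), Expr("weight")]

# One-argument form: the start value is the int 0, so a type checker has to
# allow for the empty-iterable case and cannot promise an Expr.
implicit = sum(cols)
assert isinstance(implicit, Expr)  # manual narrowing

# Two-argument form: the start value is an Expr, so the result is an Expr
# regardless of how many items the iterable holds.
explicit = sum(cols, Expr("0"))
empty = sum([], Expr("0"))

print(implicit.text)
print(explicit.text)
print(empty.text)
```

At runtime both forms build the same expression for a non-empty list; the difference is only in what can be proven statically, which is why the explicit start value avoids the Union in the reported error.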

@zero323
Owner Author

zero323 commented Aug 29, 2019

Partially addressed by #194

@github-actions

This issue hasn't seen any activity in a while.
