Skip to content
This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Allow non-string load/save parameters #276

Closed
wants to merge 2 commits into from

Conversation

mark-oppenheim
Copy link

Resolves #273

Additional parameters to DataFrameReader.load() and DataFrameWriter.save()/.saveTable() are passed to the file-type specific reader or writer types. These parameters can be of any type.

Additional parameters to DataFrameReader.load() and DataFrameWriter.save() are passed to the file-type specific reader or writer types.  These parameters can be of any type.
@zero323
Copy link
Owner

zero323 commented Dec 31, 2019

First of all thanks for your work.

Additionally to my comment to the issue (#275 (comment)) I am not convinced here. Some broad type (Any or protocol proposed in the linked comment) might work in many cases.

This however is quite brittle as it depend on both details of implementation (__str__ or __rep__) and Scala counterpart interpretation of these results. Now, why does it matter? One of personal goals here is to reduce number of runtime failures, especially ones that happen when call chain hits JVM. That's why I tend to be more restrictive than the actual universe.

Additionally I try to avoid Any when possible to its rather ugly nature (because of its wildcard nature it tends to escalate in unpredictable ways), though in this case it shouldn't really matter.

@@ -54,7 +54,7 @@ class DataFrameWriter(OptionUtils):
def sortBy(self, col: TupleOrListOfString) -> DataFrameWriter: ...
def save(self, path: Optional[str] = ..., format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: Any) -> None: ...
def insertInto(self, tableName: str, overwrite: Optional[bool] = ...) -> None: ...
def saveAsTable(self, name: str, format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: str) -> None: ...
def saveAsTable(self, name: str, format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: Any) -> None: ...
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I still haven't made my mind if I am fine with making this annotation less strict. But just for the sake of argument - Optional[Union[bool, int, float str]] (extended #275 (comment)) is something that is up for discussion as:

  • There is reasonable justification for it - there are already option methods with Bool, Double, Long, String.
  • These are rather unlikely to be extended and / or modified by the final user, so string representation is predictable.
  • Smaller set of types is usually preferred, as it helps catching some naive mistakes.
  • In case of unlikely event that this project is merged with upstream and inlined, we really want to avoid Any (stubs don't validate against the annotated source, so it doesn't matter much at the moment).

@mark-oppenheim
Copy link
Author

@zero323 I'll fix this up as suggested as soon as I get a chance.

@zero323
Copy link
Owner

zero323 commented Jan 30, 2020

Do you still plan to work on this @mark-oppenheim. If not, I might take a look this weekend, if I find a spare moment.

@zero323 zero323 mentioned this pull request Feb 3, 2020
@zero323
Copy link
Owner

zero323 commented Feb 3, 2020

Superseded by #363

@zero323 zero323 closed this Feb 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Comparisons with date values unsupported
2 participants