This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Allow non-string load/save parameters #276

Closed
wants to merge 2 commits into from
6 changes: 3 additions & 3 deletions third_party/3/pyspark/sql/readwriter.pyi
@@ -20,7 +20,7 @@ class DataFrameReader(OptionUtils):
def schema(self, schema: Union[StructType, str]) -> DataFrameReader: ...
def option(self, key: str, value: Union[bool, float, int, str]) -> DataFrameReader: ...
def options(self, **options: str) -> DataFrameReader: ...
- def load(self, path: Optional[PathOrPaths] = ..., format: Optional[str] = ..., schema: Optional[StructType] = ..., **options: str) -> DataFrame: ...
+ def load(self, path: Optional[PathOrPaths] = ..., format: Optional[str] = ..., schema: Optional[StructType] = ..., **options: Any) -> DataFrame: ...
def json(self, path: Union[str, List[str], RDD[str]], schema: Optional[StructType] = ..., primitivesAsString: Optional[Union[bool, str]] = ..., prefersDecimal: Optional[Union[bool, str]] = ..., allowComments: Optional[Union[bool, str]] = ..., allowUnquotedFieldNames: Optional[Union[bool, str]] = ..., allowSingleQuotes: Optional[Union[bool, str]] = ..., allowNumericLeadingZero: Optional[Union[bool, str]] = ..., allowBackslashEscapingAnyCharacter: Optional[Union[bool, str]] = ..., mode: Optional[str] = ..., columnNameOfCorruptRecord: Optional[str] = ..., dateFormat: Optional[str] = ..., timestampFormat: Optional[str] = ..., multiLine: Optional[Union[bool, str]] = ..., allowUnquotedControlChars: Optional[Union[bool, str]] = ..., lineSep: Optional[str] = ..., samplingRatio: Optional[Union[float, str]] = ..., dropFieldIfAllNull: Optional[Union[bool, str]] = ..., encoding: Optional[str] = ..., locale: Optional[str] = ...) -> DataFrame: ...
def table(self, tableName: str) -> DataFrame: ...
def parquet(self, *paths: str) -> DataFrame: ...
@@ -52,9 +52,9 @@ class DataFrameWriter(OptionUtils):
def sortBy(self, col: str, *cols: str) -> DataFrameWriter: ...
@overload
def sortBy(self, col: TupleOrListOfString) -> DataFrameWriter: ...
- def save(self, path: Optional[str] = ..., format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: str) -> None: ...
+ def save(self, path: Optional[str] = ..., format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: Any) -> None: ...
def insertInto(self, tableName: str, overwrite: Optional[bool] = ...) -> None: ...
- def saveAsTable(self, name: str, format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: str) -> None: ...
+ def saveAsTable(self, name: str, format: Optional[str] = ..., mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., **options: Any) -> None: ...
Owner

In general, I still haven't made up my mind whether I am fine with making this annotation less strict. But just for the sake of argument, Optional[Union[bool, int, float, str]] (extending #275 (comment)) is something that is up for discussion, because:

  • There is reasonable justification for it: there are already option methods that accept Bool, Double, Long, and String.
  • These types are unlikely to be extended or modified by the end user, so their string representation is predictable.
  • A smaller set of types is usually preferable, as it helps catch some naive mistakes.
  • In the unlikely event that this project is merged upstream and inlined, we really want to avoid Any (stubs aren't validated against the annotated source, so it doesn't matter much at the moment).
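
To make the trade-off in the list above concrete, here is a minimal sketch of the narrower annotation. The alias `OptionValue` and the helper `render_options` are hypothetical names introduced only for illustration; they are not part of PySpark or of these stubs, and the string conversion shown is an assumption about what a "predictable" representation could look like.

```python
from typing import Dict, Optional, Union

# Hypothetical alias for the narrower option type discussed above.
OptionValue = Optional[Union[bool, int, float, str]]


def render_options(**options: OptionValue) -> Dict[str, str]:
    """Convert primitive option values to strings.

    Illustrative helper only; not a PySpark API.
    """
    rendered: Dict[str, str] = {}
    for key, value in options.items():
        if value is None:
            # Assumption: unset options are simply skipped.
            continue
        if isinstance(value, bool):
            # Check bool before the generic case: bool is a subclass of int,
            # and str(True) would give "True" rather than "true".
            rendered[key] = str(value).lower()
        else:
            rendered[key] = str(value)
    return rendered
```

With this signature, a call such as `render_options(header=True, samplingRatio=0.5, sep=";")` type-checks, while `render_options(columns=["a", "b"])` should be flagged by a type checker. With `**options: Any`, both would pass silently.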

def json(self, path: str, mode: Optional[str] = ..., compression: Optional[str] = ..., dateFormat: Optional[str] = ..., timestampFormat: Optional[str] = ..., lineSep: Optional[str] = ..., encoding: Optional[str] = ...) -> None: ...
def parquet(self, path: str, mode: Optional[str] = ..., partitionBy: Optional[List[str]] = ..., compression: Optional[str] = ...) -> None: ...
def text(self, path: str, compression: Optional[str] = ..., lineSep: Optional[str] = ...) -> None: ...