Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix docstrings for DataFrame, and DataFrameReader #41

Merged
merged 4 commits into from Aug 10, 2021

Conversation

sfc-gh-okostakis
Copy link
Collaborator

This PR aims to fix ALL details related to docstrings (for the purpose of autodocs) for the files dataframe.py and dataframe_reader.py

Copy link
Collaborator

@sfc-gh-jdu sfc-gh-jdu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good start! Do we have a plan to fix docstrings for other files?

One thing I'm a little bit concerned about is that the long docstring is not formatted (separated to multiple lines). Should we also enforced our auto-formatter to control the length of a docstring in a line? (we can add this flag psf/black#1802)

src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe.py Show resolved Hide resolved
src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
Example 4
Create a new DataFrame by applying transformations to other existing DataFrames::

df_merged_data = df_catalog.join(df_prices, df_catalog["itemId"] == df_prices["ID"])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is getting a column from a dataframe using []-style accessors worth documenting?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Scala it was documented in the user-guide.
@sfc-gh-mabrennan FYI

src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe_reader.py Outdated Show resolved Hide resolved
src/snowflake/snowpark/dataframe_reader.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@sfc-gh-mabrennan sfc-gh-mabrennan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I made a few small suggestions.

In addition, if you want to load only a subset of files from the stage, you can use the
`pattern <https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html#loading-using-pattern-matching>`_
option to specify a regular expression that matches the files that you want to load.
In addition, if you want to load only a subset of files from the stage, you can
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on line 308 above this, the sentence says " Loading the first two columns of a colon-delimited CSV file ..." but the example code seems to use a semicolon as a delimiter and not a colon. (This error also exists in the Scaladoc)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched the delimiter to a colon in the example code (I copied the examples from there).

@sfc-gh-okostakis
Copy link
Collaborator Author

Very good start! Do we have a plan to fix docstrings for other files?

Each one of us should gradually fix the files for which they are the "code owners". Counting on @sfc-gh-mabrennan to step in and carry us the extra mile by assisting with examples and such.

One thing I'm a little bit concerned about is that the long docstring is not formatted (separated to multiple lines). Should we also enforced our auto-formatter to control the length of a docstring in a line? (we can add this flag psf/black#1802)

In some cases creating a new line is not working well with sphinx. I would like to avoid having the auto-formatter messing up the strings and getting silly autodocs. If, however, our auto-formatter can be 100% compatible with sphinx's format, then we should utilize it.

@sfc-gh-okostakis sfc-gh-okostakis merged commit 89a7634 into master Aug 10, 2021
@sfc-gh-okostakis sfc-gh-okostakis deleted the okostakis-fix-docstrings-dataframe branch August 10, 2021 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants