[DOCS] Mention behavior of string returning functions #1039

Merged
merged 5 commits into from
Sep 28, 2023

furqaankhan
Contributor

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

  • No, this is a documentation update. The PR name follows the format [DOCS] my subject.

What changes were proposed in this PR?

  • The `show()` function escapes special characters, which garbles the output of functions that return strings containing special characters.
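
To illustrate the problem, here is a minimal sketch in plain Python (no Spark needed). The helper `escape_for_table` and the sample string `matrix_text` are hypothetical; the helper only mimics the kind of escaping a tabular display applies and is not Spark's actual implementation.

```python
# Illustrative only: mimics the escaping a tabular display applies to cell values.
# This is NOT Spark's implementation; the names below are made up for this sketch.
ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_for_table(cell: str) -> str:
    """Replace control characters with visible escape sequences."""
    return "".join(ESCAPES.get(ch, ch) for ch in cell)


# Hypothetical multi-line string, similar in spirit to a formatted matrix.
matrix_text = "| 1  2 |\n| 3  4 |\n"

# Escaped form: everything stays on one line with literal \n markers.
print(escape_for_table(matrix_text))

# Raw form: the newlines are rendered, so the rows line up as intended.
print(matrix_text)
```

Printing the raw string is what restores the intended layout, which is why the documentation suggests collecting the value and printing it instead of relying on `show()`.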

How was this patch tested?

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation.

@furqaankhan
Contributor Author

Please let me know if there are more functions that return strings with special characters.

If you are using `show()` to display the output, it will show special characters as escape sequences. To get the expected behavior use the following code:

```sql
print(df.selectExpr("RS_GeoReference(rast)").sample(0.5).collect().mkString(""))
```
Member

This is not a SQL example; it should be language-specific. Please provide examples for both Scala/Java and Python. Please also state that sample() is used here to reduce the amount of data collected, and that other functions such as filter() work as well.

Contributor Author

I don't think the filter function makes much sense here, as there is nothing to filter: the function returns one row, which is a string containing special characters used for formatting.

Member

@furqaankhan sample and filter are operations that can reduce the number of resulting rows. Without such operations, a direct collect() on a DataFrame with millions of rows can crash the driver program.

If you are using `show()` to display the output, it will show special characters as escape sequences. To get the expected behavior use the following code:

```sql
print(df.selectExpr("RS_AsMatrix(rast)").sample(0.5).collect().mkString(""))
```
Member

This is not a SQL example; it should be language-specific. Please provide examples for both Scala/Java and Python. Please also state that sample() is used here to reduce the amount of data collected, and that other functions such as filter() work as well.
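
A Python counterpart could look roughly like the sketch below. It assumes `df` is a PySpark DataFrame in a Sedona-enabled session with a raster column named `rast`. The helper name `collect_text` and its parameters are hypothetical and not part of Sedona or PySpark; only `filter()`, `selectExpr()`, `sample()`, and `collect()` are DataFrame methods.

```python
def collect_text(df, expr, fraction=None, condition=None):
    """Collect the string results of `expr` and join them without escaping.

    `df` is assumed to be a PySpark DataFrame. `condition` (a SQL filter
    string) and `fraction` (a sampling fraction) are optional ways to reduce
    the number of rows before they are pulled to the driver.
    """
    selected = df
    if condition is not None:
        # Filter first, so the condition can still reference any column.
        selected = selected.filter(condition)
    selected = selected.selectExpr(expr)
    if fraction is not None:
        # Sampling keeps only a fraction of the rows before collecting.
        selected = selected.sample(fraction=fraction)
    rows = selected.collect()
    # Each collected row has a single column; skip nulls and join the strings.
    return "".join(str(row[0]) for row in rows if row[0] is not None)


# Usage, assuming an existing DataFrame `df` with a raster column `rast`:
# print(collect_text(df, "RS_AsMatrix(rast)", fraction=0.5))
```

Because `print()` receives the raw string, newlines and tabs are rendered rather than shown as escape sequences. Either `fraction` or `condition` can be used to keep the collected data small.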

@jiayuasu added the docs label Sep 28, 2023
@jiayuasu merged commit 16c7972 into apache:master Sep 28, 2023
37 of 40 checks passed