New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-16992][PYSPARK] use map comprehension in doc #14863
Conversation
Test build #64553 has finished for PR 14863 at commit
|
I don't know if performance is important here. I'd rather either batch this together with other changes that make this change consistently or drop this one. |
I agree. I would prefer if Spark examples also "promotes" the good practice of Python, ie, replacing 'map' and 'filter' by list or map comprehension ('reduce' has no equivalent on comprehension), even though 'map'/'filter' syntax might be closer to their equivalent on the TDDs, they are not the same. I am not sure if there is a consensus over this point on the "data science" community, but most of the Pythonists now happily promotes comprehension over map/filter. Most of the time it is faster, especially when there is a conversion to list after the map. Long story short: if Spark community agree, I can look for these 'map'/'filter' in the examples and replace them with comprehension. |
OK well I'd leave it to people here with more taste to agree about what's canonical but I take your word for it. I'm mostly interested in consistency if anythign. |
This is actually wrong, 'map()' returns a 'list' and not a dict |
OK you're saying the existing example doesn't work? |
7a2621e
to
079665b
Compare
No my proposal was wrong. I have updated it |
@@ -29,7 +29,7 @@ | |||
.getOrCreate() | |||
|
|||
# $example on$ | |||
data = [(0, 18.0,), (1, 19.0,), (2, 8.0,), (3, 5.0,), (4, 2.2,)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these extra comma are useless
Test build #64779 has finished for PR 14863 at commit
|
Signed-off-by: Gaetan Semet <gaetan@xeberon.net>
079665b
to
f674f75
Compare
Test build #64940 has finished for PR 14863 at commit
|
OK, this one's trivial in any event so I'm OK to merge this much. |
Code is equivalent, but map comprehency is most of the time faster than a map. Author: Gaetan Semet <gaetan@xeberon.net> Closes apache#14863 from Stibbons/map_comprehension.
Code is equivalent, but map comprehency is most of the time faster than a map.