Skip to content

Conversation

@lazyman500
Copy link
Contributor

add examples for PySpark

2.add module example for PySpark
@lazyman500 lazyman500 changed the title [SPARK-5155] [PySpark] [SPARK-5616] [PySpark] Feb 6, 2015
@srowen
Copy link
Member

srowen commented Feb 7, 2015

The title of the PR should include the title of the JIRA. There is a typo throughout this PR: it's "broadcast" and not "boardcast". What are the other new files besides the main example file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's clear what this is an example of. Yes it uses broadcast variables, but the als.py example already does too. Why does it need to be done several times and how does this show the difference versus non-broadcast variables? that plus comments might make this more useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had added some comment to explain why we use broadcast variables . and print the performance report.

Using broadcast: Iteration 0 cost time 0.829586982727
Using broadcast: Iteration 1 cost time 0.0809919834137
Using broadcast: Iteration 2 cost time 0.0794229507446
Don't use broadcast: Iteration 0 cost time 2.80766296387
Don't use broadcast: Iteration 1 cost time 2.83087706566
Don't use broadcast: Iteration 2 cost time 3.16146707535

@srowen
Copy link
Member

srowen commented Feb 8, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Feb 8, 2015

Test build #27037 has finished for PR 4417 at commit f7f7f22.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lazyman500 lazyman500 changed the title [SPARK-5616] [PySpark] [SPARK-5616] [PySpark] Add examples for PySpark API Feb 8, 2015
@SparkQA
Copy link

SparkQA commented Feb 8, 2015

Test build #27043 has finished for PR 4417 at commit 1cf3e59.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 8, 2015

Test build #27044 has finished for PR 4417 at commit d2ec368.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@lazyman500
Copy link
Contributor Author

Do I need to fix the python style ? I have use lint-python to check my python style.
There are only some E501 errors like that:
[examples/src/main/python/broadcast.py:46:80: E501 line too long (105 > 79 characters)]
But I saw it at other scripts too. If I fix it I will make my comment ugly.
[examples/src/main/python/streaming/recoverable_network_wordcount.py:47:80: E501 line too long (85 > 79 characters)]

@srowen
Copy link
Member

srowen commented Feb 8, 2015

@lazyman500 yes, this needs to pass python style checks. You can use ./dev/lint-python to check your code. But I think it's also important for the Python maintainers to confirm whether this is an additive example.

@SparkQA
Copy link

SparkQA commented Feb 8, 2015

Test build #27045 has finished for PR 4417 at commit 28b8a55.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We change the default Broadcast factory to TorrentBroadcastFactory, is there any reason we use Http as default here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davies Thanks for your review
I imitate the scala example (https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/BroadcastTest.scala)
I guess that author want to tell us how to change boardCast Type :)
Do I need change the default Broadcast factory to TorrentBroadcastFactory ?

@davies
Copy link
Contributor

davies commented Jun 2, 2015

It's good to have these examples, thanks for working on it. I had took a round on it, left few comments.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@rxin
Copy link
Contributor

rxin commented Dec 31, 2015

I'm going to close this pull request. If this is still relevant and you are interested in pushing it forward, please open a new pull request. Thanks!

@asfgit asfgit closed this in 7b4452b Dec 31, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants