[SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts#16405

Closed
HyukjinKwon wants to merge 2 commits into apache:master from HyukjinKwon:minor-pep8

Member

@HyukjinKwon HyukjinKwon commented Dec 26, 2016

What changes were proposed in this pull request?

This PR proposes to check pep8 against all other Python scripts and fix the errors as below:

./dev/create-release/generate-contributors.py
./dev/create-release/releaseutils.py
./dev/create-release/translate-contributors.py
./dev/lint-python
./python/docs/epytext.py
./examples/src/main/python/mllib/decision_tree_classification_example.py
./examples/src/main/python/mllib/decision_tree_regression_example.py
./examples/src/main/python/mllib/gradient_boosting_classification_example.py
./examples/src/main/python/mllib/gradient_boosting_regression_example.py
./examples/src/main/python/mllib/linear_regression_with_sgd_example.py
./examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
./examples/src/main/python/mllib/naive_bayes_example.py
./examples/src/main/python/mllib/random_forest_classification_example.py
./examples/src/main/python/mllib/random_forest_regression_example.py
./examples/src/main/python/mllib/svm_with_sgd_example.py
./examples/src/main/python/streaming/network_wordjoinsentiments.py
./sql/hive/src/test/resources/data/scripts/cat.py
./sql/hive/src/test/resources/data/scripts/cat_error.py
./sql/hive/src/test/resources/data/scripts/doubleescapedtab.py
./sql/hive/src/test/resources/data/scripts/dumpdata_script.py
./sql/hive/src/test/resources/data/scripts/escapedcarriagereturn.py
./sql/hive/src/test/resources/data/scripts/escapednewline.py
./sql/hive/src/test/resources/data/scripts/escapedtab.py
./sql/hive/src/test/resources/data/scripts/input20_script.py
./sql/hive/src/test/resources/data/scripts/newline.py

How was this patch tested?

  • ./python/docs/epytext.py

    cd ./python/docs && make html
  • pep8 check (Python 2.7 / Python 3.3.6)

    ./dev/lint-python
    
  • ./dev/merge_spark_pr.py (Python 2.7 only / Python 3.3.6 not working)

    python -m doctest -v ./dev/merge_spark_pr.py
  • ./dev/create-release/releaseutils.py ./dev/create-release/generate-contributors.py ./dev/create-release/translate-contributors.py (Python 2.7 only / Python 3.3.6 not working)

    python generate-contributors.py
    python translate-contributors.py
  • Examples (Python 2.7 / Python 3.3.6)

    ./bin/spark-submit examples/src/main/python/mllib/decision_tree_classification_example.py
    ./bin/spark-submit examples/src/main/python/mllib/decision_tree_regression_example.py
    ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_classification_example.py
    ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_regression_example.p
    ./bin/spark-submit examples/src/main/python/mllib/random_forest_classification_example.py
    ./bin/spark-submit examples/src/main/python/mllib/random_forest_regression_example.py
  • Examples (Python 2.7 only / Python 3.3.6 not working)

    ./bin/spark-submit examples/src/main/python/mllib/linear_regression_with_sgd_example.py
    ./bin/spark-submit examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
    ./bin/spark-submit examples/src/main/python/mllib/naive_bayes_example.py
    ./bin/spark-submit examples/src/main/python/mllib/svm_with_sgd_example.py
    
  • sql/hive/src/test/resources/data/scripts/*.py (Python 2.7 / Python 3.3.6 within suggested changes)

    Manually tested only changed ones.

  • ./dev/github_jira_sync.py (Python 2.7 only / Python 3.3.6 not working)

    Manually tested this after disabling actually adding comments and links.

And also via Jenkins tests.

SparkQA commented Dec 26, 2016

Test build #70601 has finished for PR 16405 at commit 8af1edb.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

HyukjinKwon commented Dec 26, 2016

Hm, this passed locally for me. I will fix them.

@HyukjinKwon
Member Author

Hi @srowen and @holden, this is a minor PR to check pep8 against ./dev/merge_spark_pr.py. Could you please check whether it makes sense?

SparkQA commented Dec 26, 2016

Test build #70602 has finished for PR 16405 at commit 4c3e496.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dev/lint-python Outdated
Member

@dongjoon-hyun dongjoon-hyun Dec 26, 2016

Hi, @HyukjinKwon.
Is this the last PEP8 fix? If so, could you remove the above TODO?
If not, what about fixing all PEP8 errors in this PR?

Contributor

I'm not sure if this is the last one (I'm on mobile, yay Hawaii), but if it is, we should change PATHS_TO_CHECK to a wildcard include under ./dev/*.py

Member Author

@HyukjinKwon HyukjinKwon Dec 26, 2016

Sure, makes sense. Let me try to do this for all. Thank you both.

Contributor

holdenk commented Dec 26, 2016

I'll try and take a more detailed look later on tonight.

@HyukjinKwon HyukjinKwon changed the title from "[SPARK-19002][BUILD] Check pep8 against merge_spark_pr.py script" to "[SPARK-19002][BUILD] Check pep8 against dev/*.py scripts" on Dec 27, 2016

SparkQA commented Dec 27, 2016

Test build #70616 has finished for PR 16405 at commit f756e09.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

Ah, this seems to be complaining in Python 3.

SparkQA commented Dec 27, 2016

Test build #70618 has finished for PR 16405 at commit 6230678.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 27, 2016

Test build #70620 has finished for PR 16405 at commit 330b4fe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

SparkQA commented Dec 27, 2016

Test build #70622 has finished for PR 16405 at commit 330b4fe.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

retest this please

SparkQA commented Dec 27, 2016

Test build #70624 has finished for PR 16405 at commit 330b4fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

It seems reasonable, though while we're here, how about checking all other *.py scripts in the project that aren't PySpark files, such as the release scripts?

@HyukjinKwon
Member Author

Sure, let me double check.

@HyukjinKwon HyukjinKwon changed the title from "[SPARK-19002][BUILD] Check pep8 against dev/*.py scripts" to "[WIP][SPARK-19002][BUILD] Check pep8 against all Python scripts" on Dec 28, 2016
@HyukjinKwon HyukjinKwon changed the title from "[WIP][SPARK-19002][BUILD] Check pep8 against all Python scripts" to "[SPARK-19002][BUILD] Check pep8 against all Python scripts" on Dec 28, 2016

SparkQA commented Dec 28, 2016

Test build #70666 has finished for PR 16405 at commit e4aa0af.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

HyukjinKwon commented Dec 28, 2016

It seems some existing scripts and examples, such as random_rdd_generation.py, do not work with Python 3.3.6 either, although they compile fine, so the pep8 check can pass now. I fixed only the errors from pep8 here.

@HyukjinKwon
Member Author

BTW, has anyone tried Python 3.6.0 with PySpark? I could not even run ./bin/pyspark; it apparently fails with an error.

Member

@srowen srowen left a comment

Yes, I don't know if these scripts have been tested with Python 3.x at all. You're right that this is a separate issue. I assume that these changes fix style only, meaning they should keep working with 2.x, but also might be a little closer to working with 3.x.

@HyukjinKwon
Member Author

Ah, thank you for approving @srowen.

@HyukjinKwon HyukjinKwon changed the title from "[SPARK-19002][BUILD] Check pep8 against all Python scripts" to "[SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts" on Dec 28, 2016

SparkQA commented Dec 28, 2016

Test build #70667 has finished for PR 16405 at commit aa6f427.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

I just manually ran ./dev/create-release/translate-contributors.py which had a conflict for sure.

SparkQA commented Dec 30, 2016

Test build #70734 has finished for PR 16405 at commit 5ffa382.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member Author

HyukjinKwon commented Dec 30, 2016

retest this please

SparkQA commented Dec 30, 2016

Test build #70737 has started for PR 16405 at commit 5ffa382.

@HyukjinKwon
Member Author

retest this please

SparkQA commented Dec 30, 2016

Test build #70738 has finished for PR 16405 at commit 5ffa382.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dev/lint-python Outdated
PATHS_TO_CHECK="$PATHS_TO_CHECK ./dev/run-tests.py ./python/*.py ./dev/run-tests-jenkins.py"
PATHS_TO_CHECK="$PATHS_TO_CHECK ./dev/pip-sanity-check.py"
# Exclude auto-geneated configuration file.
PATHS_TO_CHECK="$( find "$SPARK_ROOT_DIR" -name "*.py" -not -path "*python/docs/conf.py" )"
Contributor

I'm slightly concerned we might eventually have this be too long to pass in the shell (on Linux in bash ARG_MAX is pretty high but that's not the case everywhere, although we would probably have to double the number of Python files before this started being an issue in Cygwin).

Member Author

Yea, I think this is a valid point. Let me check the length and the length limitation first for sure.

Member Author

The limit seems to be 32K on Cygwin by default. The actual length without any prefix is about 11K for now. Let me turn these into relative paths as the safe choice; that should keep us within the limit in general.
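As a rough way to reproduce the size comparison discussed above, the sketch below collects `*.py` files under a root directory, skips `python/docs/conf.py` the way the `find` command in `dev/lint-python` does, and compares the length of the space-joined list for relative and absolute paths. The function names and the use of `os.walk` are illustrative assumptions, not part of the actual script, and the character count is only an approximation of what counts against the shell's argument limit.

```python
import os


def find_python_files(root):
    """Return sorted './'-prefixed paths of *.py files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            rel = "./" + rel.replace(os.sep, "/")
            # Mirrors: find . -name "*.py" -not -path "*python/docs/conf.py"
            if rel.endswith("python/docs/conf.py"):
                continue
            found.append(rel)
    return sorted(found)


def argument_length(paths):
    """Length in characters of the space-separated path list."""
    return len(" ".join(paths))


def compare_lengths(root):
    """Return (relative_length, absolute_length) for the files under root."""
    relative = find_python_files(root)
    # Hypothetical absolute form: the same files prefixed with the full root path.
    absolute = [os.path.join(os.path.abspath(root), p[2:]) for p in relative]
    return argument_length(relative), argument_length(absolute)
```

Calling `compare_lengths(".")` from the repository root would give the two totals to compare against the platform limit.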

predictions = model.predict(testData.map(lambda x: x.features))
labelsAndPredictions = testData.map(lambda lp: lp.label).zip(predictions)
testMSE = labelsAndPredictions.map(lambda (v, p): (v - p) * (v - p)).sum() /\
testMSE = labelsAndPredictions.map(lambda lp: (lp[0] - lp[1]) * (lp[0] - lp[1])).sum() /\
Contributor

Why did we get rid of the lambda (v, p) & similar elsewhere?

Member Author

That seems to cause errors in Python 3 when a tuple is unpacked in a lambda's parameters. http://www.python.org/dev/peps/pep-3113 seems to be the related change.

Contributor

Ah ok, makes sense - I was looking at changes directly from pep8 but if we need it to be compiled with python3 to test py3 pep8 that makes sense (of course a follow up issue for proper py3 support is the best place to address the issues not blocking pep8 testing).
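To illustrate the rewrite discussed in this thread: Python 3 removed tuple parameter unpacking (PEP 3113), so `lambda (v, p): ...` no longer parses there, while indexing into a single argument works on both major versions. The sketch below is a minimal stand-in that uses a plain list of made-up (label, prediction) pairs instead of the RDDs from the example; the helper names are hypothetical.

```python
def squared_error(lp):
    """Squared difference for one (label, prediction) pair."""
    # Portable replacement for the Python 2 only: lambda (v, p): (v - p) * (v - p)
    return (lp[0] - lp[1]) * (lp[0] - lp[1])


def mean_squared_error(pairs):
    """Mean of the squared errors over an iterable of (label, prediction) pairs."""
    pairs = list(pairs)
    return sum(squared_error(lp) for lp in pairs) / float(len(pairs))


def is_valid_expression(source):
    """Return True if source parses as an expression on the running interpreter."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# Made-up sample data standing in for labelsAndPredictions.
sample = [(1.0, 0.5), (2.0, 2.5), (3.0, 3.0)]
test_mse = mean_squared_error(sample)
```

On a Python 3 interpreter, `is_valid_expression("lambda (v, p): (v - p) * (v - p)")` is false while the indexed form is accepted, which is why the examples had to change before the pep8 check could run under Python 3.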

PATHS_TO_CHECK="$PATHS_TO_CHECK ./dev/run-tests.py ./python/*.py ./dev/run-tests-jenkins.py"
PATHS_TO_CHECK="$PATHS_TO_CHECK ./dev/pip-sanity-check.py"
# Exclude auto-geneated configuration file.
PATHS_TO_CHECK="$( cd "$SPARK_ROOT_DIR" && find . -name "*.py" -not -path "*python/docs/conf.py" )"
Member Author

@HyukjinKwon HyukjinKwon Dec 31, 2016

To be sure, I tested this by running it three times, as below:

./lint-python
./dev/lint-python
./spark/dev/lint-python

So the paths are now relative, and the list currently totals about 11K, as below:

./dev/create-release/generate-contributors.py ./dev/create-release/releaseutils.py ./dev/create-release/translate-contributors.py ./dev/github_jira_sync.py ./dev/merge_spark_pr.py ./dev/pep8-1.7.0.py ./dev/pip-sanity-check.py ./dev/run-tests-jenkins.py ./dev/run-tests.py ./dev/sparktestsupport/__init__.py ./dev/sparktestsupport/modules.py ./dev/sparktestsupport/shellutils.py ./dev/sparktestsupport/toposort.py ./examples/src/main/python/als.py ./examples/src/main/python/avro_inputformat.py ./examples/src/main/python/kmeans.py ./examples/src/main/python/logistic_regression.py ./examples/src/main/python/ml/aft_survival_regression.py ./examples/src/main/python/ml/als_example.py ./examples/src/main/python/ml/binarizer_example.py ./examples/src/main/python/ml/bisecting_k_means_example.py ./examples/src/main/python/ml/bucketizer_example.py ./examples/src/main/python/ml/chisq_selector_example.py ./examples/src/main/python/ml/count_vectorizer_example.py ./examples/src/main/python/ml/cross_validator.py ./examples/src/main/python/ml/dataframe_example.py ./examples/src/main/python/ml/dct_example.py ./examples/src/main/python/ml/decision_tree_classification_example.py ./examples/src/main/python/ml/decision_tree_regression_example.py ./examples/src/main/python/ml/elementwise_product_example.py ./examples/src/main/python/ml/estimator_transformer_param_example.py ./examples/src/main/python/ml/gaussian_mixture_example.py ./examples/src/main/python/ml/generalized_linear_regression_example.py ./examples/src/main/python/ml/gradient_boosted_tree_classifier_example.py ./examples/src/main/python/ml/gradient_boosted_tree_regressor_example.py ./examples/src/main/python/ml/index_to_string_example.py ./examples/src/main/python/ml/isotonic_regression_example.py ./examples/src/main/python/ml/kmeans_example.py ./examples/src/main/python/ml/lda_example.py ./examples/src/main/python/ml/linear_regression_with_elastic_net.py ./examples/src/main/python/ml/logistic_regression_summary_example.py 
./examples/src/main/python/ml/logistic_regression_with_elastic_net.py ./examples/src/main/python/ml/max_abs_scaler_example.py ./examples/src/main/python/ml/min_max_scaler_example.py ./examples/src/main/python/ml/multiclass_logistic_regression_with_elastic_net.py ./examples/src/main/python/ml/multilayer_perceptron_classification.py ./examples/src/main/python/ml/n_gram_example.py ./examples/src/main/python/ml/naive_bayes_example.py ./examples/src/main/python/ml/normalizer_example.py ./examples/src/main/python/ml/one_vs_rest_example.py ./examples/src/main/python/ml/onehot_encoder_example.py ./examples/src/main/python/ml/pca_example.py ./examples/src/main/python/ml/pipeline_example.py ./examples/src/main/python/ml/polynomial_expansion_example.py ./examples/src/main/python/ml/quantile_discretizer_example.py ./examples/src/main/python/ml/random_forest_classifier_example.py ./examples/src/main/python/ml/random_forest_regressor_example.py ./examples/src/main/python/ml/rformula_example.py ./examples/src/main/python/ml/sql_transformer.py ./examples/src/main/python/ml/standard_scaler_example.py ./examples/src/main/python/ml/stopwords_remover_example.py ./examples/src/main/python/ml/string_indexer_example.py ./examples/src/main/python/ml/tf_idf_example.py ./examples/src/main/python/ml/tokenizer_example.py ./examples/src/main/python/ml/train_validation_split.py ./examples/src/main/python/ml/vector_assembler_example.py ./examples/src/main/python/ml/vector_indexer_example.py ./examples/src/main/python/ml/vector_slicer_example.py ./examples/src/main/python/ml/word2vec_example.py ./examples/src/main/python/mllib/binary_classification_metrics_example.py ./examples/src/main/python/mllib/bisecting_k_means_example.py ./examples/src/main/python/mllib/correlations.py ./examples/src/main/python/mllib/correlations_example.py ./examples/src/main/python/mllib/decision_tree_classification_example.py ./examples/src/main/python/mllib/decision_tree_regression_example.py 
./examples/src/main/python/mllib/elementwise_product_example.py ./examples/src/main/python/mllib/fpgrowth_example.py ./examples/src/main/python/mllib/gaussian_mixture_example.py ./examples/src/main/python/mllib/gaussian_mixture_model.py ./examples/src/main/python/mllib/gradient_boosting_classification_example.py ./examples/src/main/python/mllib/gradient_boosting_regression_example.py ./examples/src/main/python/mllib/hypothesis_testing_example.py ./examples/src/main/python/mllib/hypothesis_testing_kolmogorov_smirnov_test_example.py ./examples/src/main/python/mllib/isotonic_regression_example.py ./examples/src/main/python/mllib/k_means_example.py ./examples/src/main/python/mllib/kernel_density_estimation_example.py ./examples/src/main/python/mllib/kmeans.py ./examples/src/main/python/mllib/latent_dirichlet_allocation_example.py ./examples/src/main/python/mllib/linear_regression_with_sgd_example.py ./examples/src/main/python/mllib/logistic_regression.py ./examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py ./examples/src/main/python/mllib/multi_class_metrics_example.py ./examples/src/main/python/mllib/multi_label_metrics_example.py ./examples/src/main/python/mllib/naive_bayes_example.py ./examples/src/main/python/mllib/normalizer_example.py ./examples/src/main/python/mllib/power_iteration_clustering_example.py ./examples/src/main/python/mllib/random_forest_classification_example.py ./examples/src/main/python/mllib/random_forest_regression_example.py ./examples/src/main/python/mllib/random_rdd_generation.py ./examples/src/main/python/mllib/ranking_metrics_example.py ./examples/src/main/python/mllib/recommendation_example.py ./examples/src/main/python/mllib/regression_metrics_example.py ./examples/src/main/python/mllib/sampled_rdds.py ./examples/src/main/python/mllib/standard_scaler_example.py ./examples/src/main/python/mllib/stratified_sampling_example.py ./examples/src/main/python/mllib/streaming_k_means_example.py 
./examples/src/main/python/mllib/streaming_linear_regression_example.py ./examples/src/main/python/mllib/summary_statistics_example.py ./examples/src/main/python/mllib/svm_with_sgd_example.py ./examples/src/main/python/mllib/tf_idf_example.py ./examples/src/main/python/mllib/word2vec.py ./examples/src/main/python/mllib/word2vec_example.py ./examples/src/main/python/pagerank.py ./examples/src/main/python/parquet_inputformat.py ./examples/src/main/python/pi.py ./examples/src/main/python/sort.py ./examples/src/main/python/sql/basic.py ./examples/src/main/python/sql/datasource.py ./examples/src/main/python/sql/hive.py ./examples/src/main/python/sql/streaming/structured_kafka_wordcount.py ./examples/src/main/python/sql/streaming/structured_network_wordcount.py ./examples/src/main/python/sql/streaming/structured_network_wordcount_windowed.py ./examples/src/main/python/status_api_demo.py ./examples/src/main/python/streaming/direct_kafka_wordcount.py ./examples/src/main/python/streaming/flume_wordcount.py ./examples/src/main/python/streaming/hdfs_wordcount.py ./examples/src/main/python/streaming/kafka_wordcount.py ./examples/src/main/python/streaming/network_wordcount.py ./examples/src/main/python/streaming/network_wordjoinsentiments.py ./examples/src/main/python/streaming/queue_stream.py ./examples/src/main/python/streaming/recoverable_network_wordcount.py ./examples/src/main/python/streaming/sql_network_wordcount.py ./examples/src/main/python/streaming/stateful_network_wordcount.py ./examples/src/main/python/transitive_closure.py ./examples/src/main/python/wordcount.py ./external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py ./python/docs/epytext.py ./python/pyspark/__init__.py ./python/pyspark/accumulators.py ./python/pyspark/broadcast.py ./python/pyspark/cloudpickle.py ./python/pyspark/conf.py ./python/pyspark/context.py ./python/pyspark/daemon.py ./python/pyspark/files.py ./python/pyspark/find_spark_home.py ./python/pyspark/heapq3.py 
./python/pyspark/java_gateway.py ./python/pyspark/join.py ./python/pyspark/ml/__init__.py ./python/pyspark/ml/base.py ./python/pyspark/ml/classification.py ./python/pyspark/ml/clustering.py ./python/pyspark/ml/common.py ./python/pyspark/ml/evaluation.py ./python/pyspark/ml/feature.py ./python/pyspark/ml/linalg/__init__.py ./python/pyspark/ml/param/__init__.py ./python/pyspark/ml/param/_shared_params_code_gen.py ./python/pyspark/ml/param/shared.py ./python/pyspark/ml/pipeline.py ./python/pyspark/ml/recommendation.py ./python/pyspark/ml/regression.py ./python/pyspark/ml/tests.py ./python/pyspark/ml/tuning.py ./python/pyspark/ml/util.py ./python/pyspark/ml/wrapper.py ./python/pyspark/mllib/__init__.py ./python/pyspark/mllib/classification.py ./python/pyspark/mllib/clustering.py ./python/pyspark/mllib/common.py ./python/pyspark/mllib/evaluation.py ./python/pyspark/mllib/feature.py ./python/pyspark/mllib/fpm.py ./python/pyspark/mllib/linalg/__init__.py ./python/pyspark/mllib/linalg/distributed.py ./python/pyspark/mllib/random.py ./python/pyspark/mllib/recommendation.py ./python/pyspark/mllib/regression.py ./python/pyspark/mllib/stat/__init__.py ./python/pyspark/mllib/stat/_statistics.py ./python/pyspark/mllib/stat/distribution.py ./python/pyspark/mllib/stat/KernelDensity.py ./python/pyspark/mllib/stat/test.py ./python/pyspark/mllib/tests.py ./python/pyspark/mllib/tree.py ./python/pyspark/mllib/util.py ./python/pyspark/profiler.py ./python/pyspark/rdd.py ./python/pyspark/rddsampler.py ./python/pyspark/resultiterable.py ./python/pyspark/serializers.py ./python/pyspark/shell.py ./python/pyspark/shuffle.py ./python/pyspark/sql/__init__.py ./python/pyspark/sql/catalog.py ./python/pyspark/sql/column.py ./python/pyspark/sql/conf.py ./python/pyspark/sql/context.py ./python/pyspark/sql/dataframe.py ./python/pyspark/sql/functions.py ./python/pyspark/sql/group.py ./python/pyspark/sql/readwriter.py ./python/pyspark/sql/session.py ./python/pyspark/sql/streaming.py 
./python/pyspark/sql/tests.py ./python/pyspark/sql/types.py ./python/pyspark/sql/utils.py ./python/pyspark/sql/window.py ./python/pyspark/statcounter.py ./python/pyspark/status.py ./python/pyspark/storagelevel.py ./python/pyspark/streaming/__init__.py ./python/pyspark/streaming/context.py ./python/pyspark/streaming/dstream.py ./python/pyspark/streaming/flume.py ./python/pyspark/streaming/kafka.py ./python/pyspark/streaming/kinesis.py ./python/pyspark/streaming/listener.py ./python/pyspark/streaming/tests.py ./python/pyspark/streaming/util.py ./python/pyspark/taskcontext.py ./python/pyspark/tests.py ./python/pyspark/traceback_utils.py ./python/pyspark/version.py ./python/pyspark/worker.py ./python/run-tests.py ./python/setup.py ./python/test_support/SimpleHTTPServer.py ./python/test_support/userlibrary.py ./sql/hive/src/test/resources/data/scripts/cat.py ./sql/hive/src/test/resources/data/scripts/cat_error.py ./sql/hive/src/test/resources/data/scripts/doubleescapedtab.py ./sql/hive/src/test/resources/data/scripts/dumpdata_script.py ./sql/hive/src/test/resources/data/scripts/escapedcarriagereturn.py ./sql/hive/src/test/resources/data/scripts/escapednewline.py ./sql/hive/src/test/resources/data/scripts/escapedtab.py ./sql/hive/src/test/resources/data/scripts/input20_script.py ./sql/hive/src/test/resources/data/scripts/newline.py ./sql/hive/src/test/resources/data/scripts/test_transform.py

SparkQA commented Dec 31, 2016

Test build #70766 has finished for PR 16405 at commit 3da7aec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

srowen commented Jan 2, 2017

Merged to master

@asfgit asfgit closed this in 46b2126 Jan 2, 2017
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Jan 4, 2017
## What changes were proposed in this pull request?

This PR proposes to check pep8 against all other Python scripts and fix the errors as below:

```bash
./dev/create-release/generate-contributors.py
./dev/create-release/releaseutils.py
./dev/create-release/translate-contributors.py
./dev/lint-python
./python/docs/epytext.py
./examples/src/main/python/mllib/decision_tree_classification_example.py
./examples/src/main/python/mllib/decision_tree_regression_example.py
./examples/src/main/python/mllib/gradient_boosting_classification_example.py
./examples/src/main/python/mllib/gradient_boosting_regression_example.py
./examples/src/main/python/mllib/linear_regression_with_sgd_example.py
./examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
./examples/src/main/python/mllib/naive_bayes_example.py
./examples/src/main/python/mllib/random_forest_classification_example.py
./examples/src/main/python/mllib/random_forest_regression_example.py
./examples/src/main/python/mllib/svm_with_sgd_example.py
./examples/src/main/python/streaming/network_wordjoinsentiments.py
./sql/hive/src/test/resources/data/scripts/cat.py
./sql/hive/src/test/resources/data/scripts/cat_error.py
./sql/hive/src/test/resources/data/scripts/doubleescapedtab.py
./sql/hive/src/test/resources/data/scripts/dumpdata_script.py
./sql/hive/src/test/resources/data/scripts/escapedcarriagereturn.py
./sql/hive/src/test/resources/data/scripts/escapednewline.py
./sql/hive/src/test/resources/data/scripts/escapedtab.py
./sql/hive/src/test/resources/data/scripts/input20_script.py
./sql/hive/src/test/resources/data/scripts/newline.py
```

## How was this patch tested?

- `./python/docs/epytext.py`

  ```bash
  cd ./python/docs && make html
  ```

- pep8 check (Python 2.7 / Python 3.3.6)

  ```
  ./dev/lint-python
  ```

- `./dev/merge_spark_pr.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python -m doctest -v ./dev/merge_spark_pr.py
  ```

- `./dev/create-release/releaseutils.py` `./dev/create-release/generate-contributors.py` `./dev/create-release/translate-contributors.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python generate-contributors.py
  python translate-contributors.py
  ```

- Examples (Python 2.7 / Python 3.3.6)

  ```bash
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_regression_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_regression_example.p
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_regression_example.py
  ```

- Examples (Python 2.7 only / Python 3.3.6 not working)
  ```
  ./bin/spark-submit examples/src/main/python/mllib/linear_regression_with_sgd_example.py
  ./bin/spark-submit examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
  ./bin/spark-submit examples/src/main/python/mllib/naive_bayes_example.py
  ./bin/spark-submit examples/src/main/python/mllib/svm_with_sgd_example.py
  ```

- `sql/hive/src/test/resources/data/scripts/*.py` (Python 2.7 / Python 3.3.6 within suggested changes)

  Manually tested only changed ones.

- `./dev/github_jira_sync.py` (Python 2.7 only / Python 3.3.6 not working)

  Manually tested this after disabling actually adding comments and links.

And also via Jenkins tests.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes apache#16405 from HyukjinKwon/minor-pep8.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
@HyukjinKwon HyukjinKwon deleted the minor-pep8 branch January 2, 2018 03:43