[SPARK-26033][PYTHON][TESTS] Break large ml/tests.py file into smaller files

## What changes were proposed in this pull request?

This PR breaks the large ml/tests.py file, which contains all Python ML unit tests, into several smaller test files that are easier to read and maintain.

The tests are broken down as follows:
```
pyspark
├── __init__.py
...
├── ml
│   ├── __init__.py
...
│   ├── tests
│   │   ├── __init__.py
│   │   ├── test_algorithms.py
│   │   ├── test_base.py
│   │   ├── test_evaluation.py
│   │   ├── test_feature.py
│   │   ├── test_image.py
│   │   ├── test_linalg.py
│   │   ├── test_param.py
│   │   ├── test_persistence.py
│   │   ├── test_pipeline.py
│   │   ├── test_stat.py
│   │   ├── test_training_summary.py
│   │   ├── test_tuning.py
│   │   └── test_wrapper.py
...
├── testing
...
│   ├── mlutils.py
...
```

## How was this patch tested?

Ran the tests manually, module by module, to confirm that the test count was unchanged, then ran `python/run-tests --modules=pyspark-ml` to verify that all tests pass with Python 2.7 and Python 3.6.
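The test-count check described above can be approximated with the standard `unittest` loader. The following is only a minimal sketch, not the procedure actually used for this patch: the two test classes and the `count_tests` helper are hypothetical stand-ins for the real PySpark ML test cases.

```python
import unittest


class ParamTests(unittest.TestCase):
    """Hypothetical stand-in for tests that would live in one split file."""

    def test_default(self):
        self.assertEqual(1 + 1, 2)

    def test_override(self):
        self.assertTrue("a" < "b")


class PipelineTests(unittest.TestCase):
    """Hypothetical stand-in for tests that would live in another split file."""

    def test_fit(self):
        self.assertIn(3, [1, 2, 3])


def count_tests(test_case_classes):
    """Return the total number of test methods across the given TestCase classes."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite(
        loader.loadTestsFromTestCase(cls) for cls in test_case_classes
    )
    return suite.countTestCases()


# Count once with everything together (before the split) and once per group
# (after the split); the two totals should match if no test was lost.
before = count_tests([ParamTests, PipelineTests])
after = count_tests([ParamTests]) + count_tests([PipelineTests])
```

Comparing `before` and `after` gives a quick signal that no test was dropped while moving code between files; it does not show that the tests still pass, which is what the `python/run-tests` invocation covers.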

Closes #23063 from BryanCutler/python-test-breakup-ml-SPARK-26033.

Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
BryanCutler authored and HyukjinKwon committed Nov 18, 2018
1 parent e00cac9 commit 034ae30
Showing 17 changed files with 3,331 additions and 2,763 deletions.
16 changes: 15 additions & 1 deletion dev/sparktestsupport/modules.py
@@ -452,6 +452,7 @@ def __hash__(self):
         "python/pyspark/ml/"
     ],
     python_test_goals=[
+        # doctests
         "pyspark.ml.classification",
         "pyspark.ml.clustering",
         "pyspark.ml.evaluation",
@@ -463,7 +464,20 @@ def __hash__(self):
         "pyspark.ml.regression",
         "pyspark.ml.stat",
         "pyspark.ml.tuning",
-        "pyspark.ml.tests",
+        # unittests
+        "pyspark.ml.tests.test_algorithms",
+        "pyspark.ml.tests.test_base",
+        "pyspark.ml.tests.test_evaluation",
+        "pyspark.ml.tests.test_feature",
+        "pyspark.ml.tests.test_image",
+        "pyspark.ml.tests.test_linalg",
+        "pyspark.ml.tests.test_param",
+        "pyspark.ml.tests.test_persistence",
+        "pyspark.ml.tests.test_pipeline",
+        "pyspark.ml.tests.test_stat",
+        "pyspark.ml.tests.test_training_summary",
+        "pyspark.ml.tests.test_tuning",
+        "pyspark.ml.tests.test_wrapper",
     ],
     blacklisted_python_implementations=[
         "PyPy"  # Skip these tests under PyPy since they require numpy and it isn't available there
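Because each new test file has to be registered by hand in `python_test_goals`, a small consistency check can help catch a file that was added but never listed. The sketch below is illustrative only and is not part of this commit: `goals_from_files` is a hypothetical helper, and the two lists are copied from the file tree in the PR description and from the diff to `modules.py`.

```python
# File names copied from the tree in the PR description.
TEST_FILES = [
    "__init__.py",
    "test_algorithms.py",
    "test_base.py",
    "test_evaluation.py",
    "test_feature.py",
    "test_image.py",
    "test_linalg.py",
    "test_param.py",
    "test_persistence.py",
    "test_pipeline.py",
    "test_stat.py",
    "test_training_summary.py",
    "test_tuning.py",
    "test_wrapper.py",
]

# Unit test goals copied from the modules.py diff.
REGISTERED_GOALS = [
    "pyspark.ml.tests.test_algorithms",
    "pyspark.ml.tests.test_base",
    "pyspark.ml.tests.test_evaluation",
    "pyspark.ml.tests.test_feature",
    "pyspark.ml.tests.test_image",
    "pyspark.ml.tests.test_linalg",
    "pyspark.ml.tests.test_param",
    "pyspark.ml.tests.test_persistence",
    "pyspark.ml.tests.test_pipeline",
    "pyspark.ml.tests.test_stat",
    "pyspark.ml.tests.test_training_summary",
    "pyspark.ml.tests.test_tuning",
    "pyspark.ml.tests.test_wrapper",
]


def goals_from_files(file_names, package="pyspark.ml.tests"):
    """Map test_*.py file names to dotted module names, skipping other files."""
    return sorted(
        "{}.{}".format(package, name[: -len(".py")])
        for name in file_names
        if name.startswith("test_") and name.endswith(".py")
    )


expected_goals = goals_from_files(TEST_FILES)

# Files present on disk but not registered, and goals registered without a file.
unregistered = set(expected_goals) - set(REGISTERED_GOALS)
orphaned = set(REGISTERED_GOALS) - set(expected_goals)
```

Both `unregistered` and `orphaned` are empty for the lists above. In a real checkout the file list would presumably come from listing the `python/pyspark/ml/tests` directory rather than from a hard-coded list.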
