[SPARK-3128][MLLIB] Use streaming test suite for StreamingLR #2037
- Test predictOnValues for accuracy on a test stream
- Made mllib depend on tests from streaming
- Rewrote all streamingLR tests to use the setupStreams & runStreams functions
Jenkins, test this please.
@@ -242,7 +242,7 @@ trait TestSuiteBase extends FunSuite with BeforeAndAfter with Logging {
     logInfo("numBatches = " + numBatches + ", numExpectedOutput = " + numExpectedOutput)

     // Get the output buffer
-    val outputStream = ssc.graph.getOutputStreams.head.asInstanceOf[TestOutputStreamWithPartitions[V]]
+    val outputStream = ssc.graph.getOutputStreams.filter(_.isInstanceOf[TestOutputStreamWithPartitions[_]]).head.asInstanceOf[TestOutputStreamWithPartitions[V]]
This line is over 100 characters.
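For readers following the change above: the new line keeps only the outputs of the test type before taking the first one, presumably because a context under test can register other output streams as well. A minimal sketch of the same selection, written so that it also stays under 100 characters per line, is given here. `Output`, `ForEachOutput`, and `TestOutput` are hypothetical stand-ins, not Spark's classes, and `collectFirst` is swapped in for the `filter(...).head.asInstanceOf[...]` chain from the diff.

```scala
object OutputStreamSelection {
  // Hypothetical stand-ins for registered output streams; not Spark's real classes.
  sealed trait Output
  final case class ForEachOutput(id: Int) extends Output
  final case class TestOutput[T](id: Int) extends Output

  // Return the first registered output of the test type, or None if there is none.
  // collectFirst matches and narrows the type in one step, so no cast is needed.
  def firstTestOutput(outputs: Seq[Output]): Option[TestOutput[_]] =
    outputs.collectFirst { case t: TestOutput[_] => t }
}
```

Returning an `Option` makes the "no test output registered" case explicit instead of failing on `head`; whether that is preferable in a test helper is a judgment call.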
QA tests have started for PR 2037 at commit
// compute the mean absolute error and check that it's always less than 0.1
val errors = output.map(batch => batch.map(
  p => math.abs(p._1 - p._2)).reduce(_+_) / nPoints.toDouble)
.reduce(_+_) -> sum. Then it would fit on one line:
val errors = output.map(batch => batch.map(p => math.abs(p._1 - p._2)).sum / nPoints)
sum returns Double, so we don't need to call toDouble explicitly.
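As a standalone illustration of the suggested one-liner, here is a small sketch that computes the per-batch mean absolute error on plain collections. It assumes each batch is a sequence of pairs whose two elements are the values being compared (for example a label and a prediction) and that every batch holds `nPoints` pairs; the object and method names are hypothetical.

```scala
object MeanAbsoluteError {
  // For each batch, sum the absolute differences of the pairs and divide by nPoints.
  // sum over Seq[Double] yields a Double, so dividing by an Int needs no toDouble.
  def perBatch(output: Seq[Seq[(Double, Double)]], nPoints: Int): Seq[Double] =
    output.map(batch => batch.map(p => math.abs(p._1 - p._2)).sum / nPoints)
}
```

With this helper, a check in the style of the test under review would read `MeanAbsoluteError.perBatch(output, nPoints).forall(_ <= 0.1)`.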
QA tests have finished for PR 2037 at commit
Jenkins, test this please.
QA tests have started for PR 2037 at commit
QA tests have started for PR 2037 at commit
// compute the mean absolute error and check that it's always less than 0.1
val errors = output.map(batch => batch.map(p => math.abs(p._1 - p._2)).sum / nPoints)
assert(errors.forall(x => x <= 0.1))

nit: extra line
LGTM, if tests pass. Small nits, which can be ignored if there is nothing else to be changed.
QA tests have finished for PR 2037 at commit
QA tests have finished for PR 2037 at commit
Refactored tests for streaming linear regression to use existing streaming test utilities. Summary of changes:

- Made ``mllib`` depend on tests from ``streaming``
- Rewrote accuracy and convergence tests to use ``setupStreams`` and ``runStreams``
- Added new test for the accuracy of predictions generated by ``predictOnValue``

These tests should run faster, be easier to extend/maintain, and provide a reference for new tests.

mengxr tdas

Author: freeman <the.freeman.lab@gmail.com>

Closes #2037 from freeman-lab/streamingLR-predict-tests and squashes the following commits:

e851ca7 [freeman] Fixed long lines
50eb0bf [freeman] Refactored tests to use streaming test tools
32c43c2 [freeman] Added test for prediction

(cherry picked from commit 31f0b07)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
Thanks a lot @freeman-lab
@tdas It seems that this creates a folder called