[SPARK-14985][ML] Update LinearRegression, LogisticRegression summary internals to handle model copy #12823

Closed
wants to merge 3 commits into from

Conversation

BenFradet
Contributor

What changes were proposed in this pull request?

The summaries now hold an internal copy of the model.

How was this patch tested?

Reran the affected test suites.

@BenFradet
Contributor Author

BenFradet commented May 1, 2016

@jkbradley

Regarding linear regression, should the tValues, pValues, coefficientStandardErrors and the different metrics be moved to the training summary?

Same thing for logistic regression, should the different metrics be moved to the training summary?

Also, I noticed some discrepancy regarding the visibility of some fields in linear regression compared to generalized linear regression:

  • numInstances is public in lr and private to the regression package in glr
  • degreesOfFreedom is private in lr and public in glr

Is this intentional?

@SparkQA

SparkQA commented May 1, 2016

Test build #57482 has finished for PR 12823 at commit 030442c.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BenFradet BenFradet closed this Jun 29, 2016
@jkbradley
Member

@BenFradet I'm sorry for dropping the ball on this one. Did you close this due to inactivity? If you're willing, it would be nice to do this cleanup.

To answer your questions:

Regarding linear regression, should the tValues, pValues, coefficientStandardErrors and the different metrics be moved to the training summary?
Same thing for logistic regression, should the different metrics be moved to the training summary?

These can go in the summary, not just the training summary, since they can be calculated from the model. The training summary just has values which are specific to the training process.
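To make the split concrete, here is a minimal sketch in plain Python rather than Spark's Scala. All class and field names (LinearModel, Summary, TrainingSummary, objective_history) are hypothetical stand-ins, not Spark's actual API. It assumes the summary keeps its own copy of the model, as the PR description says, and that only values tied to the fitting run live in the training summary.

```python
import copy
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LinearModel:
    """Hypothetical stand-in for a fitted model: y = w . x + b."""
    coefficients: List[float]
    intercept: float = 0.0

    def predict(self, features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(self.coefficients, features)) + self.intercept


class Summary:
    """Metrics that need only a model and a labelled dataset."""

    def __init__(self, model: LinearModel, data: Sequence[Tuple[Sequence[float], float]]):
        # Keep a private copy so later changes to the caller's model
        # cannot silently change the metrics reported here.
        self._model = copy.deepcopy(model)
        self._data = list(data)

    @property
    def num_instances(self) -> int:
        return len(self._data)

    @property
    def mean_squared_error(self) -> float:
        residuals = [label - self._model.predict(x) for x, label in self._data]
        return sum(r * r for r in residuals) / len(residuals)


class TrainingSummary(Summary):
    """Adds values that exist only because a training run happened."""

    def __init__(self, model, data, objective_history: Sequence[float]):
        super().__init__(model, data)
        self.objective_history = list(objective_history)


model = LinearModel(coefficients=[2.0], intercept=1.0)
data = [([1.0], 3.0), ([2.0], 6.0)]
summary = TrainingSummary(model, data, objective_history=[4.0, 1.0, 0.5])

# Mutating the original model afterwards does not affect the summary.
model.coefficients[0] = 100.0
```

In this sketch, mean_squared_error and num_instances are available on any Summary because they can be recomputed from the model and data, while objective_history only makes sense on the TrainingSummary.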

Also, I noticed some discrepancy regarding the visibility of some fields in linear regression compared to generalized linear regression:

  • numInstances is public in lr and private to the regression package in glr
  • degreesOfFreedom is private in lr and public in glr

Making numInstances and degreesOfFreedom public sounds good to me.

Thanks!

@BenFradet BenFradet reopened this Mar 23, 2017
@BenFradet
Contributor Author

@jkbradley I did close this due to inactivity; I'm reopening it now that I have a bit of time.

@SparkQA

SparkQA commented Mar 23, 2017

Test build #75107 has finished for PR 12823 at commit 030442c.

  • This patch fails MiMa tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 23, 2017

Test build #75110 has finished for PR 12823 at commit a996b6d.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 24, 2017

Test build #75169 has finished for PR 12823 at commit d0a04bb.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 26, 2017

Test build #75242 has finished for PR 12823 at commit 74e1fa0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BenFradet
Contributor Author

The failing test org.apache.spark.storage.BlockManagerProactiveReplicationSuite.proactive block replication - 3 replicas - 2 block manager deletions doesn't seem to be related to this change, so I'm relaunching the tests.

@BenFradet
Contributor Author

Jenkins, retest this please

@SparkQA

SparkQA commented Mar 26, 2017

Test build #75245 has finished for PR 12823 at commit 74e1fa0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@BenFradet
Contributor Author

ping @jkbradley if you could take a look, that'd be great.

If you have the time, there is also the related #17431.

@MLnick
Contributor

MLnick commented May 17, 2017

@BenFradet I think this needs to be updated now - and should also take into account #15435 and #17586

@BenFradet
Contributor Author

@MLnick ok, I'll have a look

@BenFradet
Contributor Author

@MLnick should we wait for those PRs to be merged before moving forward with this?

@BenFradet BenFradet closed this Aug 21, 2018
@SparkQA

SparkQA commented Aug 21, 2018

Test build #95021 has finished for PR 12823 at commit 74e1fa0.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.
