Skip to content

Conversation

@vkorukanti
Copy link
Member

@vkorukanti vkorukanti commented Jun 15, 2021

What changes were proposed in this pull request?

Remove the usage of the enumerating subclasses of StateStoreCustomMetric dependency.

To achieve it, add couple of utility methods to StateStoreCustomMetric

  • withNewDesc(desc : String) to StateStoreCustomMetric for cloning the instance with a new desc (currently used in SymmetricHashJoinStateManager)
  • createSQLMetric(sparkContext: sparkContext): SQLMetric for creating a corresponding SQLMetric to show the metric in UI and accumulate at the query level (currently used in statefulOperator. stateStoreCustomMetrics)

Why are the changes needed?

Code in SymmetricHashJoinStateManager and StateStoreWriter rely on the subclass implementations of StateStoreCustomMetric.

If a new subclass of StateStoreCustomMetric is added, it requires code changes to SymmetricHashJoinStateManager and StateStoreWriter, and we may miss the update if there is no existing test coverage.

To prevent these issues add a couple of utility methods to StateStoreCustomMetric as mentioned above.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing UT and a new UT

@HyukjinKwon
Copy link
Member

cc @HeartSaVioR @xuanyuanking FYI

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't quite get why it worths the effort, you have add code either here or in the case class. no difference to me.

or maybe you can save the effort by provide some default implementation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main point is to avoid making changes here (in SymmetricHashJoinManager) whenever you add a new case class. You just add all the code required in the case class when a new one is added. It is easy to miss adding the code here and with this patch we can avoid exposing the subclasses of StateStoreCustomMetric here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
this is easier to maintain. exhaustive enumeration of subclasses is brittle to addition of subclasses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be even better if with some scala magic the default implementation could be provided. but its not obvious to me how, because this is calling the copy constructor which is not available in any trait.

Copy link
Contributor

@HeartSaVioR HeartSaVioR Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are also considering extension point from outside (like 3rd party state store provider) to add a new implementation of StateStoreCustomMetric, we should definitely not suppose the new implementation to be a case class having desc as field. (Even they have it, we can't leverage it unless doing reflection.) I think this is the right way to do.

@tdas
Copy link
Contributor

tdas commented Jun 16, 2021

LGTM once again. thank you for catching the other place where we were doing the same mistake.

@HeartSaVioR
Copy link
Contributor

Indeed. I was about to comment on the missing spot but it was fixed so fast! Thanks for fixing the overall issue!

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 pending build.

@HeartSaVioR
Copy link
Contributor

add to whitelist

@HeartSaVioR
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Test build #139830 has finished for PR 32914 at commit 5dddc66.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44358/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44358/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44359/

@HeartSaVioR
Copy link
Contributor

https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139830

StackOverflowError happened while compiling... I'll retrigger again to see whether it's intermittent or not.

@HeartSaVioR
Copy link
Contributor

retest this, please

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44359/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44361/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44361/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Test build #139833 has finished for PR 32914 at commit 5dddc66.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Test build #139829 has finished for PR 32914 at commit 1031593.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44367/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44367/

@SparkQA
Copy link

SparkQA commented Jun 16, 2021

Test build #139839 has finished for PR 32914 at commit 7211c4b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vkorukanti vkorukanti changed the title [SPARK-35763][SS] Add a new copy method to StateStoreCustomMetric [SPARK-35763][SS] Remove the usage of StateStoreCustomMetric subclass enumeration dependency Jun 16, 2021
@vkorukanti vkorukanti changed the title [SPARK-35763][SS] Remove the usage of StateStoreCustomMetric subclass enumeration dependency [SPARK-35763][SS] Remove the StateStoreCustomMetric subclass enumeration dependency Jun 16, 2021
@HeartSaVioR
Copy link
Contributor

Thanks! Merging to master!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants