
Spark multiversion support #1325

Open: wants to merge 85 commits into master

Conversation

sumwale (Contributor) commented Jun 10, 2019

Changes proposed in this pull request

  • support for multiple Spark versions from the same code base
  • a SparkInternals interface to abstract out the internal APIs used by the SnappyData/AQP layers that
    changed between Spark 2.1 and 2.4, with implementations for 2.1.0/2.1.1/2.3.2 (a sketch follows this list)
  • updated build to allow a "spark.connector.version" property for building the smart connector against a
    non-default Spark version
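
A minimal sketch of the shape such an abstraction could take, for context: the trait name and the idea of one implementation per Spark version come from this PR, while the specific members and placeholder types below are illustrative assumptions.

```scala
// Sketch only: placeholder types standing in for the real SnappyData classes.
class SnappySession
abstract class SnappySessionState

// One trait collecting the internal Spark APIs that changed across releases;
// each supported Spark version ships a concrete implementation of it.
trait SparkInternals {
  // Exact Spark version this implementation targets, e.g. "2.3.2" (assumed member).
  def version: String

  // Session state construction is one version-dependent hook discussed
  // later in this conversation.
  def newSnappySessionState(session: SnappySession): SnappySessionState
}
```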

Patch testing

precheckin

ReleaseNotes.txt changes

Documentation for multiple Spark version support in smart connector mode

Other PRs

TIBCOSoftware/snappy-store#478
https://github.com/SnappyDataInc/snappy-aqp/pull/187

Sumedh Wale added 17 commits October 22, 2018 15:41
remaining build failures = 34 with Spark 2.3.x (was > 200 originally)
- the product will always use the compatible Spark version but the connector can use a different one
- added a couple of sub-projects (core-product and aqp-product) that will always use the compatible version
  while the connector build can use a different one
- the cluster will depend on the normal build if the connector version is the same, else it will use core-product
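
A minimal sketch of how the matching implementation might be resolved at runtime; the match on org.apache.spark.SPARK_VERSION is an assumed mechanism rather than what this PR necessarily does, and the per-version class names follow the files mentioned later in this conversation:

```scala
import org.apache.spark.SPARK_VERSION

// Resolve the version-specific implementation once, based on the Spark
// runtime actually on the classpath (assuming the per-version classes
// Spark210Internals/Spark232Internals from this PR exist).
object SparkInternals {
  lazy val instance: SparkInternals = SPARK_VERSION match {
    case "2.1.0" | "2.1.1" => new Spark210Internals
    case v if v.startsWith("2.3.") => new Spark232Internals
    case v => throw new IllegalStateException(s"unsupported Spark version: $v")
  }
}
```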
ashishkshukla commented Jun 19, 2019

@sumwale - I was looking into this code base and trying to make a build for Spark 2.3.2.
I noticed that we have defined a def newSnappySessionState(snappySession: SnappySession): SnappySessionState in the SparkInternals trait which creates an instance of SnappySessionState for the given Spark version.
It seems this needs to be implemented for every Spark version we support, as it is called directly from SnappySession.sessionState.
We have its implementation in Spark210Internals.scala for Spark 2.1.1 and Spark 2.1.0, but its implementation is missing for Spark 2.3.2. Do we need to work on the implementation, or are we handling it in a different way? I cannot see its implementation in Spark232Internals.scala.
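
To make the question concrete, a sketch of what the missing 2.3.2 implementation might look like; only the method signature comes from the trait described above, and SnappySessionState23 is a hypothetical name:

```scala
// Hypothetical sketch: a 2.3-specific session state subclass would presumably
// be needed, since SessionState internals changed between Spark 2.1 and 2.3.
class SnappySessionState23(session: SnappySession) extends SnappySessionState

class Spark232Internals extends SparkInternals {
  override def version: String = "2.3.2"

  override def newSnappySessionState(session: SnappySession): SnappySessionState =
    new SnappySessionState23(session)
}
```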

made CREATE FUNCTION consistent with Spark
- add search and explicit cleanup of broadcast exchanges at the end of query execution
  (else they would only be cleared when GC collects the reference); a sketch follows below
- corrected GUI plan timings and cleaned up the END message to deliver it reliably
  and not leave dangling SQL tasks running forever in some cases
- other test changes for Spark 2.4.5 to fix failures
- fix code generation issue seen in TPCH Q20
- correct SD's SQL listener to link any jobs during the planning and execution phases of a query correctly
  (reintroduced SparkListenerSQLPlanExecutionStart/End and handle SparkListenerSQLExecutionStart to search
   for any existing execution from SparkListenerSQLPlanExecutionStart, then mark it as active instead of creating a new one)
also fix a few dunit test failures in ColumnBatchAndExternalTableDUnitTest
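
A minimal sketch of the explicit broadcast cleanup described in the commits above, using stock Spark 2.3 APIs; it assumes the caller lives under an org.apache.spark.sql package (so the private[sql] relationFuture is accessible) and that cleanup runs only after the query has finished:

```scala
import scala.concurrent.Await
import scala.concurrent.duration.Duration

import org.apache.spark.sql.execution.SparkPlan
import org.apache.spark.sql.execution.exchange.BroadcastExchangeExec

// Walk the executed plan and destroy the broadcast variable behind every
// broadcast exchange, instead of waiting for GC to collect the reference.
def cleanupBroadcasts(executedPlan: SparkPlan): Unit = {
  executedPlan.foreach {
    case b: BroadcastExchangeExec =>
      // relationFuture completes once the broadcast has been built
      val broadcast = Await.result(b.relationFuture, Duration.Inf)
      broadcast.destroy()
    case _ =>
  }
}
```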