Upgrade Jobserver to 2.4.4 Spark #1283

Merged
merged 8 commits into spark-jobserver:master from 244_upgrade on Mar 18, 2020

@bsikander bsikander commented Feb 25, 2020

Pull Request checklist

  • The commit message(s) follow the contribution guidelines ?
  • Tests for the changes have been added (for bug fixes / features) ?
  • Docs have been added / updated (for bug fixes / features) ?

Current behavior : (link existing issues here : https://help.github.com/articles/basic-writing-and-formatting-syntax/#referencing-issues-and-pull-requests)
Only Spark versions up to 2.3.2 are supported.

New behavior :
Support for Spark 2.4.4 is added.

The PR contains 4 commits and each commit has a commit message with more details.

BREAKING CHANGES
Disabling Hive is not yet implemented for Python-based contexts. I will push this change soon.

Other information:



* Hive doesn't clean up some of the data after tables are dropped.
Because the Jobserver project currently has 2 tests that use Hive, add
an additional LOCATION parameter to table creation and use different
paths for the metastore.
* Also add a dedicated warehouse configuration to hive-site.xml,
to keep the data distinguishable in the future.
* Explicitly enable the REST API (spark.master.rest.enabled) for the master,
because the default has been changed (to false) in 2.4.4.

Change-Id: Id8604d0156f60970c494dd373bfaaafdbcb4d63f
Change-Id: Ic1c66785ea8c8645789c86dae83154e80458b7ec
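
A minimal sketch of the LOCATION idea from the commit message above, assuming each Hive test gets its own scratch directory. The helper `create_table_sql`, the table name, and the column list are hypothetical and are not taken from the Jobserver test code; only the use of an explicit `LOCATION` clause comes from the commit.

```python
import tempfile
from pathlib import Path


def create_table_sql(table: str, columns: str, base_dir: Path) -> str:
    """Build a CREATE TABLE statement whose data lives under base_dir/table."""
    location = (base_dir / table).as_posix()
    return f"CREATE TABLE `{table}` ({columns}) LOCATION '{location}'"


# Each test gets its own scratch directory, so data that Hive leaves behind
# after DROP TABLE in one test cannot collide with tables of another test.
test_a_dir = Path(tempfile.mkdtemp(prefix="hive-test-a-"))
test_b_dir = Path(tempfile.mkdtemp(prefix="hive-test-b-"))

sql_a = create_table_sql("people", "name STRING, age INT", test_a_dir)
sql_b = create_table_sql("people", "name STRING, age INT", test_b_dir)
```

Both statements create a table with the same name, but the data paths differ, which is the property the tests rely on.
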
C* connector 2.4 indirectly depends on
commons-configuration, which Hadoop 2.7 brings
onto the classpath. This dependency has
been changed in Hadoop 3.x, so C* connector
2.4 is broken. Until it is fixed, Jobserver
puts the dependency on the classpath itself.

https://datastax-oss.atlassian.net/browse/SPARKC-566

Change-Id: I532ab22d2bb97dc5fd118c7178f67207b06bf885
Change-Id: Ic5128c5f306250c289f8c59d22f53d31bf87674e
@bsikander bsikander changed the title 244 upgrade Upgrade Jobserver to 2.4.4 Spark Feb 25, 2020
@bsikander

#1253 (comment)

After the upgrade to 2.4.4, Python tests and contexts
started to throw warnings like
"You are trying to pass an insecure Py4j gateway
to Spark. This presents a security risk."

This change addresses the problem by
passing a token to the Python subprocess. The subprocess
uses the token for communication, and the py4j gateway
only accepts the connection if the token is valid.

Change-Id: I61e82b2996fd830315db1dc72af549578fc9a7a4
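
A rough sketch of what the Python side of such a token handshake could look like. The environment variable names and both helper functions are hypothetical: the commit message does not say how Jobserver hands the token to the subprocess. The sketch only assumes a py4j version whose `GatewayParameters` accepts `auth_token` (0.10.7 or later) and a JVM gateway that was started with the same token.

```python
import os


# Hypothetical variable names; the PR does not say how the token is passed.
PORT_VAR = "JOBSERVER_GATEWAY_PORT"
TOKEN_VAR = "JOBSERVER_GATEWAY_TOKEN"


def read_gateway_settings(env=os.environ):
    """Return (port, token) handed over by the parent JVM process."""
    token = env.get(TOKEN_VAR, "")
    if not token:
        # Refuse to fall back to an unauthenticated connection.
        raise RuntimeError(f"{TOKEN_VAR} is not set")
    return int(env[PORT_VAR]), token


def connect(port, token):
    """Open a Py4J connection that authenticates with the shared token."""
    # Imported lazily so read_gateway_settings works without py4j installed.
    from py4j.java_gateway import GatewayParameters, JavaGateway

    params = GatewayParameters(port=port, auth_token=token, auto_convert=True)
    return JavaGateway(gateway_parameters=params)
```

A subprocess would call `connect(*read_gateway_settings())` once at startup. Failing fast on a missing token keeps the subprocess from silently opening the kind of insecure gateway the warning is about.
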
@bsikander bsikander force-pushed the 244_upgrade branch 2 times, most recently from 96990c1 to c96c4f8 Compare March 2, 2020 16:27
noorul commented Mar 15, 2020

Spark 2.4.5 released

@bsikander

@noorul great news, but I would prefer to get 2.4.4 in first, since it is a big change. Going from 2.4.4 to 2.4.5 should be really easy.

Btw, this change is currently blocked by a Hive test failure that I am trying to fix. It works for me locally, but it fails on Travis for some reason.

The tests that check whether Hive is disabled
were failing because the context from the previous test case
was not shut down properly and still had Hive enabled.
This fix cleans up the context properly and makes sure
that the context is stopped.

Change-Id: If6f9cb26fcc6f8f2af3243057825ef75585378d8
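
The failure mode described in the commit message above can be illustrated with a small stand-in. `FakeContext` and `managed_context` are hypothetical and are not Spark or Jobserver classes; the sketch only assumes that an active context is reused as-is until it is stopped.

```python
from contextlib import contextmanager


class FakeContext:
    """Stand-in for a Spark context; not a real Spark or Jobserver class."""

    _active = None

    def __init__(self, hive_enabled):
        self.hive_enabled = hive_enabled

    @classmethod
    def get_or_create(cls, hive_enabled):
        # An active context is reused as-is, even if the caller asked for
        # different settings. This is what leaked Hive into the next test.
        if cls._active is None:
            cls._active = cls(hive_enabled)
        return cls._active

    def stop(self):
        type(self)._active = None


@contextmanager
def managed_context(hive_enabled):
    ctx = FakeContext.get_or_create(hive_enabled)
    try:
        yield ctx
    finally:
        ctx.stop()  # runs even if the test body raises


with managed_context(hive_enabled=True) as ctx:
    first_had_hive = ctx.hive_enabled

with managed_context(hive_enabled=False) as ctx:
    second_had_hive = ctx.hive_enabled
```

Without the `stop()` call in the `finally` block, the second block would receive the first context and report Hive as enabled.
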
The open-source Jobserver build uses "pycodestyle" to
keep the Python files PEP8 compliant. subprocess.py
was not compliant, and because of that the open-source
build failed.

Change-Id: I93f9718ed30e122441d6e775045fff0711342f08
Change-Id: I692a8ff1387aee99ea4db7863d4676f6dd8fa5c9

@noorul noorul left a comment


LGTM

@noorul noorul merged commit a448f2d into spark-jobserver:master Mar 18, 2020