Upgrade Jobserver to 2.4.4 Spark #1283

Merged
8 commits merged on Mar 18, 2020

Commits on Feb 25, 2020

  1. feat(jobserver): Upgrade spark to 2.4.4

    * Hive doesn't clean up some of the data after tables are dropped.
    The Jobserver project currently has 2 tests that use Hive, so add
    an additional LOCATION parameter to table creation and use different
    paths for the metastore.
    * Also add a dedicated warehouse configuration to hive-site.xml,
    to keep the data distinguishable in the future.
    * Explicitly enable the REST API (spark.master.rest.enabled) for the
    master, because the default has been changed (to false) in 2.4.4.
    
    Change-Id: Id8604d0156f60970c494dd373bfaaafdbcb4d63f
    bsikander committed Feb 25, 2020
    1244df7
  2. feat(jobserver): Disable hive support from SparkSession

    Change-Id: Ic1c66785ea8c8645789c86dae83154e80458b7ec
    bsikander committed Feb 25, 2020
    f44aa50
  3. feat(jobserver): Add missing C* dependency

    C* connector 2.4 indirectly depends on
    commons-configuration, which Hadoop 2.7 brings
    onto the classpath. This dependency has
    been changed in Hadoop 3.x, so C* connector
    2.4 is broken. Until it is fixed, jobserver
    puts the dependency on the classpath itself.
    
    https://datastax-oss.atlassian.net/browse/SPARKC-566
    
    Change-Id: I532ab22d2bb97dc5fd118c7178f67207b06bf885
    bsikander committed Feb 25, 2020
    d88b956
  4. fix(ci): Update CI scripts to 2.4.4

    Change-Id: Ic5128c5f306250c289f8c59d22f53d31bf87674e
    bsikander committed Feb 25, 2020
    da9dfd4
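
The Hive isolation described in commit 1244df7 can be sketched roughly as follows. This is only an illustration, not the code from the PR: the helper names `create_table_ddl` and `hive_test_conf`, the directory layout, and the choice of a Derby metastore URL are all assumptions. The idea is that each test gets its own table LOCATION, warehouse directory, and metastore path, so leftovers from a dropped table cannot leak into another test.

```python
from pathlib import Path


def create_table_ddl(table: str, columns: str, base_dir: Path) -> str:
    """Build a CREATE TABLE statement with an explicit, per-table LOCATION.

    Hypothetical helper: the table data lands under base_dir/<table>.
    """
    location = (base_dir / table).as_posix()
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({columns}) "
        f"LOCATION '{location}'"
    )


def hive_test_conf(base_dir: Path) -> dict:
    """Return per-test settings pointing Hive at directories under base_dir.

    Assumption: an embedded Derby metastore is used, so the metastore
    path is part of the JDBC connection URL.
    """
    metastore = (base_dir / "metastore_db").as_posix()
    return {
        "spark.sql.warehouse.dir": (base_dir / "warehouse").as_posix(),
        "javax.jdo.option.ConnectionURL": (
            f"jdbc:derby:;databaseName={metastore};create=true"
        ),
    }
```

A test would call both helpers with a fresh temporary directory, for example one created with `tempfile.mkdtemp()`, and pass the resulting statement to its SQL entry point.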

Commits on Mar 2, 2020

  1. jobserver(python): Enable secure communication

    After the upgrade to 2.4.4, python tests and context
    started to throw warnings like
    "You are trying to pass an insecure Py4j gateway
    to Spark. This presents a security risk."
    
    This change addresses the above problem by
    passing a token to the python subprocess. The subprocess
    uses the token for communication, and the py4j gateway
    only accepts it if the token is valid.
    
    Change-Id: I61e82b2996fd830315db1dc72af549578fc9a7a4
    bsikander committed Mar 2, 2020
    31fd78c
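
The token handoff described in commit 31fd78c might look something like the sketch below. It uses only the standard library and does not reproduce the jobserver code: the environment variable name `EXAMPLE_GATEWAY_TOKEN` and all function names are hypothetical, and passing the token through the environment is an assumption about the transport. The parent generates a random token, hands it to the subprocess, and a presented token is accepted only if it matches.

```python
import hmac
import os
import secrets
import subprocess
import sys

# Hypothetical variable name; the PR may pass the token differently.
TOKEN_ENV_VAR = "EXAMPLE_GATEWAY_TOKEN"


def new_token() -> str:
    """Generate a random hex token for one gateway session."""
    return secrets.token_hex(32)


def child_env(token: str, base_env=None) -> dict:
    """Copy the environment and add the token for the subprocess."""
    env = dict(os.environ if base_env is None else base_env)
    env[TOKEN_ENV_VAR] = token
    return env


def token_is_valid(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid timing leaks."""
    return hmac.compare_digest(presented.encode(), expected.encode())


def launch_child(token: str) -> str:
    """Spawn a Python subprocess that echoes back the token it received."""
    code = f"import os; print(os.environ[{TOKEN_ENV_VAR!r}])"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env(token),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

In a real py4j setup the subprocess would not echo the token but pass it to the gateway client; on the Python side py4j exposes this as the `auth_token` argument of `GatewayParameters`, and the gateway rejects connections whose token does not match.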

Commits on Mar 17, 2020

  1. fix(tests): Stop context cleanly

    The tests that check whether hive is disabled
    were failing because the context from the previous test case
    was not shut down properly and had hive enabled.
    This fix cleans up the context properly and makes sure
    that the context is stopped.
    
    Change-Id: If6f9cb26fcc6f8f2af3243057825ef75585378d8
    Sikander authored and bsikander committed Mar 17, 2020
    d53f507
  2. refactor(python): Make subprocess.py PEP8 compliant

    Jobserver in opensource uses "pycodestyle" to
    keep the python files PEP8 compliant. subprocess.py
    was not compliant, and because of that the opensource
    build failed.
    
    Change-Id: I93f9718ed30e122441d6e775045fff0711342f08
    bsikander committed Mar 17, 2020
    875407a
  3. feat(python): Disable hive support from Python Spark Session

    Change-Id: I692a8ff1387aee99ea4db7863d4676f6dd8fa5c9
    bsikander committed Mar 17, 2020
    66f6150
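
The failure mode fixed in commit d53f507, a context from one test case leaking into the next, is commonly avoided by tying the stop call to a `finally` block. The sketch below illustrates that pattern only: `FakeContext` is a stand-in invented for this example, not a Spark or jobserver class, and the actual tests in the PR may be structured differently.

```python
from contextlib import contextmanager


class FakeContext:
    """Stand-in for a Spark context; records whether stop() was called."""

    def __init__(self, hive_enabled: bool):
        self.hive_enabled = hive_enabled
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


@contextmanager
def managed_context(hive_enabled: bool):
    """Yield a context and stop it even if the test body raises."""
    ctx = FakeContext(hive_enabled)
    try:
        yield ctx
    finally:
        ctx.stop()
```

A test that needs hive disabled would then open its own `with managed_context(hive_enabled=False) as ctx:` block, so it never sees a hive-enabled context left over from an earlier test.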