Branch: master
Commits on Oct 15, 2019
  1. [SPARK-10614][CORE] Add monotonic time to Clock interface

    vanzin authored and dongjoon-hyun committed Oct 15, 2019
    This change adds a new method to the Clock interface that returns
    the time from a monotonic time source, so that code that needs that
    feature can also mock the Clock in tests.
    The original getTimeMillis and waitTillTime methods are unchanged, since
    streaming code that uses the Clock interface seems to rely on wall clock
    semantics, not monotonic clock semantics. So, in a way, this doesn't directly
    address the problem raised in the bug (that waitTillTime can be affected
    by drift), but the places being modified to use the new API don't
    really rely on that behavior anyway.
    The dynamic allocation code was modified to use the new time source,
    since they shouldn't be basing their decisions on wall clock time.
    For a longer discussion on how monotonic clocks work on Linux/x64, the
    following blog post (and links within) sheds a lot of light on the safety of
    using monotonic clocks.
    Tested with unit test and also running apps with dynamic allocation on.
    Closes #26058 from vanzin/SPARK-10614.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
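    To illustrate the distinction between the two time sources, here is a minimal sketch (the class and method names are hypothetical, not Spark's actual Clock interface):

```java
// Hypothetical sketch of a clock exposing both time sources; not
// Spark's actual interface. Wall-clock time can jump due to NTP
// adjustments, while the monotonic source only ever moves forward.
public class MonotonicClock {
    /** Wall-clock time in ms since the epoch; subject to drift and adjustment. */
    public static long getTimeMillis() {
        return System.currentTimeMillis();
    }

    /** Monotonic time in ns; only differences between two readings are meaningful. */
    public static long nanoTime() {
        return System.nanoTime();
    }
}
```

    Code measuring elapsed time should compare two nanoTime() readings rather than wall-clock values, since the latter can move backwards.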
Commits on Oct 7, 2019
  1. [SPARK-27468][CORE] Track correct storage level of RDDs and partitions

    vanzin authored and squito committed Oct 7, 2019
    Previously, the RDD level would change depending on the status reported
    by executors for the block they were storing, and individual blocks would
    reflect that. That is wrong because different blocks may be stored differently
    in different executors.
    So now the RDD tracks the user-provided storage level, while the individual
    partitions reflect the current storage level of that particular block,
    including the current number of replicas.
    Closes #25779 from vanzin/SPARK-27468.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Imran Rashid <>
Commits on Sep 18, 2019
  1. [SPARK-29082][CORE] Skip delegation token generation if no credentials are available

    vanzin committed Sep 18, 2019
    This situation can happen when an external system (e.g. Oozie) generates
    delegation tokens for a Spark application. The Spark driver will then run
    against secured services, have proper credentials (the tokens), but no
    kerberos credentials. So trying to do things that require a kerberos
    credential fails.
    Instead, if no kerberos credentials are detected, just skip the whole
    delegation token code.
    Tested with an application that simulates Oozie; fails before the fix,
    passes with the fix. Also with other DT-related tests to make sure other
    functionality keeps working.
    Closes #25805 from vanzin/SPARK-29082.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
  2. [SPARK-29105][CORE] Keep driver log file size up to date in HDFS

    vanzin committed Sep 18, 2019
    HDFS doesn't update the file size reported by the NameNode if you just keep
    writing to the file; this makes the SHS believe the file is inactive,
    and so it may delete it after the configured max age for log files.
    This change uses hsync to keep the log file as up to date as possible
    when using HDFS. It also disables erasure coding by default for these
    logs, since hsync (& friends) does not work with EC.
    Tested with a SHS configured to aggressively clean up logs; verified
    a spark-shell session kept updating the log, which was not deleted by
    the SHS.
    Closes #25819 from vanzin/SPARK-29105.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Aug 19, 2019
  1. [SPARK-28634][YARN] Ignore kerberos login config in client mode AM

    vanzin committed Aug 19, 2019
    This change makes the client mode AM ignore any login configuration,
    which is now always handled by the driver. The previous code tried
    to achieve that by modifying the configuration visible to the AM, but
    that missed the case where old configuration names were being used.
    Tested in real cluster with reproduction provided in the bug.
    Closes #25467 from vanzin/SPARK-28634.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Aug 14, 2019
  1. [SPARK-28487][K8S] More responsive dynamic allocation with K8S

    vanzin committed Aug 14, 2019
    This change implements a few changes to the k8s pod allocator so
    that it behaves a little better when dynamic allocation is on.
    (i) Allow the application to ramp up immediately when there's a
    change in the target number of executors. Without this change,
    scaling would only trigger when a change happened in the state of
    the cluster, e.g. an executor going down, or when the periodical
    snapshot was taken (default every 30s).
    (ii) Get rid of pending pod requests, both acknowledged (i.e. Spark
    knows that a pod is pending resource allocation) and unacknowledged
    (i.e. Spark has requested the pod but the API server hasn't created it
    yet), when they're not needed anymore. This avoids starting those
    executors to just remove them after the idle timeout, wasting resources
    in the meantime.
    (iii) Re-work some of the code to avoid unnecessary logging. While not
    bad without dynamic allocation, the existing logging was very chatty
    when dynamic allocation was on. With the changes, all the useful
    information is still there, but only when interesting changes happen.
    (iv) Gracefully shut down executors when they become idle. Just deleting
    the pod causes a lot of ugly logs to show up, so it's better to ask pods
    to exit nicely. That also allows Spark to respect the "don't delete
    pods" option when dynamic allocation is on.
    Tested on a small k8s cluster running different TPC-DS workloads.
    Closes #25236 from vanzin/SPARK-28487.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Aug 1, 2019
  1. [SPARK-28584][CORE] Fix thread safety issue in blacklist timer, tests

    vanzin authored and dongjoon-hyun committed Aug 1, 2019
    There's a small, probably very hard to hit thread-safety issue in the blacklist
    abort timers in the task scheduler, where they access a non-thread-safe map
    without proper synchronization.
    In the tests, the code was also calling methods on the TaskSetManager without
    holding the proper locks, which could cause threads to call non-thread-safe
    TSM methods concurrently.
    Closes #25317 from vanzin/SPARK-28584.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
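    The kind of fix involved can be sketched in isolation (illustrative only, not Spark's actual classes): sharing a plain HashMap between timer threads and the scheduler thread is unsafe, while a ConcurrentHashMap makes the concurrent accesses well-defined.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch, not Spark's actual code: abort timers and the
// scheduler thread both touch this map, so it must be thread-safe.
public class AbortTimers {
    private final Map<String, Long> expiryByTaskSet = new ConcurrentHashMap<>();

    public void schedule(String taskSet, long expiryMs) {
        expiryByTaskSet.put(taskSet, expiryMs);  // called from the scheduler
    }

    public void cancel(String taskSet) {
        expiryByTaskSet.remove(taskSet);         // called from timer threads
    }

    public int pending() {
        return expiryByTaskSet.size();
    }
}
```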
  2. [SPARK-24352][CORE][TESTS] De-flake StandaloneDynamicAllocationSuite blacklist test

    vanzin authored and dongjoon-hyun committed Aug 1, 2019
    The issue is that the test tried to stop an existing scheduler and replace it with
    a new one set up for the test. That can cause issues because both were sharing the
    same RpcEnv underneath, and unregistering RpcEndpoints is actually asynchronous
    (see comment in Dispatcher.unregisterRpcEndpoint). So that could lead to races where
    the new scheduler tried to register before the old one was fully unregistered.
    The updated test avoids the issue by using a separate RpcEnv / scheduler instance
    altogether, and also avoids a misleading NPE in the test logs.
    Closes #25318 from vanzin/SPARK-24352.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
Commits on Jul 27, 2019
  1. [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellation…

    vanzin authored and dongjoon-hyun committed Jul 27, 2019
    This test tries to detect correct behavior in racy code, where the event
    thread is racing with the executor thread that's trying to kill the running
    task.
    If the event that signals the stage end arrives first, any delay in the
    delivery of the message to kill the task causes the code to rapidly process
    elements, and may cause the test's assertion to fail. Adding a 10ms delay in
    LocalSchedulerBackend before the task kill makes the test run through
    ~1000 elements. A longer delay can easily cause the 10000 elements to
    be processed.
    Instead, by adding a small delay (10ms) in the test code that processes
    elements, the kill event is much more likely to arrive before processing
    ends; that leaves a window of up to 100s for the event
    to be delivered to the executor. And because each element only sleeps for
    10ms, the test is not really slowed down at all.
    Closes #25270 from vanzin/SPARK-28535.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
Commits on Jul 22, 2019
  1. [SPARK-28455][CORE] Avoid overflow when calculating executor timeout …

    vanzin authored and dongjoon-hyun committed Jul 22, 2019
    The overflow would cause the timeout time to be negative, so executors would be
    timed out immediately (instead of never).
    I also tweaked a couple of log messages that could get pretty long when
    lots of executors were active.
    Added unit test (which failed without the fix).
    Closes #25208 from vanzin/SPARK-28455.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
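    The overflow pattern is easy to reproduce in isolation. A sketch of the hazard and a saturating fix (hypothetical names, not Spark's actual code; assumes a non-negative timeout):

```java
// Sketch of the hazard: "now + timeout" wraps to a negative value when
// the operands are large, which makes the expiry look like it is
// already in the past. Clamping to Long.MAX_VALUE avoids the wrap.
public class SafeTimeout {
    /** Naive version: can overflow and go negative. */
    public static long naiveExpiry(long nowNs, long timeoutNs) {
        return nowNs + timeoutNs;
    }

    /** Overflow-safe version: saturates at Long.MAX_VALUE instead of wrapping. */
    public static long safeExpiry(long nowNs, long timeoutNs) {
        long sum = nowNs + timeoutNs;
        return sum < nowNs ? Long.MAX_VALUE : sum;  // wrapped around => clamp
    }
}
```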
Commits on Jul 16, 2019
  1. [SPARK-27963][CORE] Allow dynamic allocation without a shuffle service.

    vanzin committed Jul 16, 2019
    This change adds a new option that enables dynamic allocation without
    the need for a shuffle service. This mode works by tracking which stages
    generate shuffle files, and keeping executors that generate data for those
    shuffles alive while the jobs that use them are active.
    A separate timeout is also added for shuffle data; so that executors that
    hold shuffle data can use a separate timeout before being removed because
    of being idle. This allows the shuffle data to be kept around in case it
    is needed by some new job, or allow users to be more aggressive in timing
    out executors that don't have shuffle data in active use.
    The code also hooks up to the context cleaner so that shuffles that are
    garbage collected are detected, and the respective executors are not held
    on to unnecessarily.
    Testing done with added unit tests, and also with TPC-DS workloads on
    YARN without a shuffle service.
    Closes #24817 from vanzin/SPARK-27963.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
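    The two-timeout idea described above can be sketched as follows (the constants and names here are made up for illustration, not Spark's actual configuration keys or defaults):

```java
// Illustrative sketch: an idle executor holding shuffle data gets a
// separate, typically longer, timeout than one without, so shuffle
// output survives in case a new job needs it.
public class IdleTimeouts {
    static final long IDLE_TIMEOUT_MS = 60_000L;           // executor with no shuffle data
    static final long SHUFFLE_IDLE_TIMEOUT_MS = 300_000L;  // executor holding shuffle data

    /** Choose which idle timeout applies to an executor. */
    public static long timeoutFor(boolean holdsShuffleData) {
        return holdsShuffleData ? SHUFFLE_IDLE_TIMEOUT_MS : IDLE_TIMEOUT_MS;
    }
}
```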
Commits on Jul 15, 2019
  1. [SPARK-28150][CORE][FOLLOW-UP] Don't try to log in when impersonating.

    vanzin committed Jul 15, 2019
    When fetching delegation tokens for a proxy user, don't try to log in,
    since it will fail.
    Closes #25141 from vanzin/SPARK-28150.2.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Jul 13, 2019
  1. [SPARK-28371][SQL] Make Parquet "StartsWith" filter null-safe

    vanzin authored and dongjoon-hyun committed Jul 13, 2019
    Parquet may call the filter with a null value to check whether nulls are
    accepted. While it seems Spark avoids that path in Parquet with 1.10, in
    1.11 that causes Spark unit tests to fail.
    Tested with Parquet 1.11 (and new unit test).
    Closes #25140 from vanzin/SPARK-28371.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
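    The essence of the fix can be sketched outside Parquet (a hypothetical helper, not the actual filter class):

```java
// Minimal sketch of a null-safe prefix check: when Parquet probes the
// filter with null to ask whether nulls can match, the predicate must
// not dereference the value; a null row value simply never matches.
public class StartsWithFilter {
    public static boolean keep(String value, String prefix) {
        return value != null && value.startsWith(prefix);
    }
}
```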
Commits on Jun 27, 2019
  1. [SPARK-28187][BUILD] Add support for hadoop-cloud to the PR builder.

    vanzin committed Jun 27, 2019
    Closes #24987 from vanzin/SPARK-28187.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
  2. [SPARK-28150][CORE] Log in user before getting delegation tokens.

    vanzin committed Jun 27, 2019
    This ensures that tokens are always created with an empty UGI, which
    allows multiple contexts to be (sequentially) started from the same JVM.
    Tested with code attached to the bug, and also usual kerberos tests.
    Closes #24955 from vanzin/SPARK-28150.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Jun 5, 2019
  1. [SPARK-20286][CORE] Improve logic for timing out executors in dynamic allocation.

    vanzin authored and squito committed Jun 5, 2019
    This change refactors the portions of the ExecutorAllocationManager class that
    track executor state into a new class, to achieve a few goals:
    - make the code easier to understand
    - better separate concerns (task backlog vs. executor state)
    - less synchronization between event and allocation threads
    - less coupling between the allocation code and executor state tracking
    The executor tracking code was moved to a new class (ExecutorMonitor) that
    encapsulates all the logic of tracking what happens to executors and when
    they can be timed out. The logic to actually remove the executors remains
    in the EAM, since it still requires information that is not tracked by the
    new executor monitor code.
    Of particular interest in the executor monitor itself is a change in
    how cached blocks are tracked; instead of polling the block manager, the
    monitor now uses events to track which executors have cached blocks, and
    is able to detect also unpersist events and adjust the time when the executor
    should be removed accordingly. (That's the bug mentioned in the PR title.)
    Because of the refactoring, a few tests in the old EAM test suite were removed,
    since they're now covered by the newly added test suite. The EAM suite was
    also changed a little bit to not instantiate a SparkContext every time. This
    allowed some cleanup, and the tests also run faster.
    Tested with new and updated unit tests, and with multiple TPC-DS workloads
    running with dynamic allocation on; also some manual tests for the caching
    behavior.
    Closes #24704 from vanzin/SPARK-20286.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Imran Rashid <>
Commits on May 29, 2019
  1. [SPARK-27868][CORE] Better default value and documentation for socket server backlog.

    vanzin authored and dongjoon-hyun committed May 29, 2019
    First, there is currently no public documentation for this setting. So it's hard
    to even know that it could be a problem if your application starts failing with
    weird shuffle errors.
    Second, the javadoc attached to the code was incorrect; the default value just uses
    the default value from the JRE, which is 50, instead of having an unbounded queue
    as the comment implies.
    So use a default that is a "rounded" version of the JRE default, and provide
    documentation explaining that this value may need to be adjusted. Also added
    a log message that was very helpful in debugging an issue caused by this
    setting.
    Closes #24732 from vanzin/SPARK-27868.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
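    The JRE behavior is easy to observe with a plain ServerSocket (a standalone sketch, unrelated to Spark's own network code): when no backlog is given, the JRE uses its implementation default of 50 rather than an unbounded queue, while the three-argument constructor makes the limit explicit.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Standalone sketch: bind to an ephemeral port with an explicit
// backlog, so the pending-connection limit is visible and tunable
// rather than silently defaulting to the JRE's 50.
public class BacklogDemo {
    /** Opens and closes a listening socket; returns the bound port, or -1 on error. */
    public static int openAndClose(int backlog) {
        try (ServerSocket server = new ServerSocket(0, backlog)) {
            return server.getLocalPort();  // port 0 => pick any free port
        } catch (IOException e) {
            return -1;
        }
    }
}
```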
Commits on May 15, 2019
  1. [SPARK-27678][UI] Allow user impersonation in the UI.

    vanzin committed May 15, 2019
    This feature allows proxy servers to identify the actual request user
    using a request parameter, and performs access control checks against
    that user instead of the authenticated user. Impersonation is only
    allowed if the authenticated user is configured as an admin.
    The request parameter used ("doAs") matches the one currently used by
    Knox, but it should be easy to change / customize if different proxy
    servers use a different way of identifying the original user.
    Tested with updated unit tests and also with a live server behind Knox.
    Closes #24582 from vanzin/SPARK-27678.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
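    A sketch of the access-control decision (the class and method names here are made up; only the "doAs" parameter name comes from the description above):

```java
import java.util.Set;

// Illustrative sketch, not Spark's actual code: honor the "doAs"
// request parameter only when the authenticated user is an admin;
// otherwise access checks apply to the authenticated user itself.
public class UiImpersonation {
    public static String effectiveUser(String authenticated, String doAs, Set<String> admins) {
        if (doAs != null && admins.contains(authenticated)) {
            return doAs;  // an admin may impersonate the requested user
        }
        return authenticated;
    }
}
```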
Commits on Mar 25, 2019
  1. [SPARK-27219][CORE] Treat timeouts as fatal in SASL fallback path.

    vanzin authored and dongjoon-hyun committed Mar 25, 2019
    When a timeout happens we don't know what's the state of the remote end,
    so there is no point in doing anything else since it will most probably
    fail anyway.
    The change also demotes the log message printed when falling back to
    SASL, since a warning is too noisy for when the fallback is really
    needed (e.g. an old shuffle service, or a shuffle service with the new auth
    disabled).
    Closes #24160 from vanzin/SPARK-27219.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
Commits on Mar 20, 2019
  1. [SPARK-27094][YARN] Work around RackResolver swallowing thread interr…

    vanzin committed Mar 20, 2019
    To avoid the case where the YARN libraries would swallow the exception and
    prevent YarnAllocator from shutting down, call the offending code in a
    separate thread, so that the parent thread can respond appropriately to
    the shut down.
    As a safeguard, also explicitly stop the executor launch thread pool when
    shutting down the application, to prevent new executors from coming up
    after the application started its shutdown.
    Tested with unit tests + some internal tests on real cluster.
    Closes #24017 from vanzin/SPARK-27094.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Mar 11, 2019
  1. [SPARK-27004][CORE] Remove stale HTTP auth code.

    vanzin committed Mar 11, 2019
    This code is from the era when Spark used an HTTP server to distribute
    dependencies, which is long gone. Nowadays it only causes problems when
    someone is using dependencies from an HTTP server with Spark auth on.
    Closes #24033 from vanzin/SPARK-27004.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Mar 2, 2019
  1. [SPARK-26048][BUILD] Enable flume profile when creating 2.x releases.

    vanzin authored and dongjoon-hyun committed Mar 2, 2019
    Closes #23931 from vanzin/SPARK-26048.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
Commits on Mar 1, 2019
  1. [K8S][MINOR] Log minikube version when running integration tests.

    vanzin committed Mar 1, 2019
    Closes #23893 from vanzin/minikube-version.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
  2. [SPARK-26420][K8S] Generate more unique IDs when creating k8s resource names.

    vanzin authored and dongjoon-hyun committed Mar 1, 2019
    Using the current time as an ID is more prone to clashes than people generally
    realize, so try to make things a bit more unique without necessarily using a
    UUID, which would eat too much space in the names otherwise.
    The implemented approach uses some bits from the current time, plus some random
    bits, which should be more resistant to clashes.
    Closes #23805 from vanzin/SPARK-26420.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Dongjoon Hyun <>
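    The "some time bits plus some random bits" idea can be sketched as follows (the exact bit layout Spark uses may differ; this one is an assumption for illustration):

```java
import java.security.SecureRandom;

// Sketch of time bits combined with random bits: a timestamp alone
// collides when two apps start in the same instant, so mix in random
// bits while keeping the result much shorter than a 36-char UUID.
public class ResourceIds {
    private static final SecureRandom RNG = new SecureRandom();

    public static String newId() {
        long timeBits = System.currentTimeMillis() & 0x7FFFFFFFL;  // low 31 bits of time
        long randBits = RNG.nextInt() & 0xFFFFFFFFL;               // 32 random bits
        return Long.toString((timeBits << 32) | randBits, 36);     // compact base-36 form
    }
}
```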
Commits on Feb 27, 2019
  1. [SPARK-24736][K8S] Let spark-submit handle dependency resolution.

    vanzin committed Feb 27, 2019
    Before this change, there was some code in the k8s backend to deal
    with how to resolve dependencies and make them available to the
    Spark application. It turns out that none of that code is necessary,
    since spark-submit already handles all that for applications started
    in client mode - like the k8s driver that is run inside a Spark-created
    pod.
    For that reason, specifically for pyspark, there's no need for the
    k8s backend to deal with PYTHONPATH; or, in general, to change the URIs
    provided by the user at all. spark-submit takes care of that.
    For testing, I created a pyspark script that depends on another module
    that is shipped with --py-files. Then I used:
    - --py-files http://.../ http://.../
    - --py-files http://.../ http://.../
    - --py-files local:/.../ local:/.../
    - --py-files local:/.../ local:/.../
    Without this change, all of the above commands fail. With the change, the
    driver is able to see the dependencies in all the above cases; but executors
    don't see the dependencies in the last two. That's a bug in shared Spark code
    that deals with local: dependencies in pyspark (SPARK-26934).
    I also tested a Scala app using the main jar from an http server.
    Closes #23793 from vanzin/SPARK-24736.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Feb 25, 2019
  1. [SPARK-26788][YARN] Remove SchedulerExtensionService.

    vanzin authored and srowen committed Feb 25, 2019
    Since the yarn module is actually private to Spark, this interface was never
    actually "public". Since it has no use inside of Spark, let's avoid adding
    a yarn-specific extension that isn't public, and point any potential users
    at more general solutions (like using a SparkListener).
    Closes #23839 from vanzin/SPARK-26788.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Sean Owen <>
Commits on Feb 19, 2019
  1. [SPARK-24894][K8S] Make sure valid host names are created for executors.

    vanzin committed Feb 19, 2019
    Since the host name is derived from the app name, which can contain arbitrary
    characters, it needs to be sanitized so that only valid characters are allowed.
    On top of that, take extra care that truncation doesn't leave characters that
    are valid except at the start of a host name.
    Closes #23781 from vanzin/SPARK-24894.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
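    A sketch of such sanitization, following the usual DNS label rules (RFC 1123); Spark's actual implementation may differ in its details:

```java
import java.util.Locale;

// Illustrative sketch: reduce an arbitrary app name to a valid DNS
// label - lowercase alphanumerics and dashes, at most 63 characters,
// starting and ending with an alphanumeric character.
public class HostNames {
    public static String sanitize(String appName) {
        String s = appName.toLowerCase(Locale.ROOT)
                          .replaceAll("[^a-z0-9-]", "-");  // invalid chars -> dash
        if (s.length() > 63) {
            s = s.substring(0, 63);                        // DNS label length limit
        }
        s = s.replaceAll("^[^a-z0-9]+", "")                // must start alphanumeric
             .replaceAll("[^a-z0-9]+$", "");               // must end alphanumeric
        return s.isEmpty() ? "x" : s;                      // fallback dummy label
    }
}
```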
Commits on Feb 14, 2019
  1. [SPARK-26650][CORE] Demote noisy HBase-related log message.

    vanzin authored and HyukjinKwon committed Feb 14, 2019
    Make it a debug message so that it doesn't show up in the vast
    majority of cases, where HBase classes are not available.
    Closes #23776 from vanzin/SPARK-26650.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Hyukjin Kwon <>
Commits on Feb 6, 2019
  1. [SPARK-26733][K8S] Cleanup

    vanzin committed Feb 6, 2019
    Merge both case statements, and remove unused variables that
    are not set by the Scala code anymore.
    Author: Marcelo Vanzin <>
    Closes #23655 from vanzin/SPARK-26733.
Commits on Jan 30, 2019
  1. [SPARK-25689][CORE] Follow up: don't get delegation tokens when kerberos not available.

    vanzin committed Jan 30, 2019
    This avoids trying to get delegation tokens when a TGT is not available, e.g.
    when running in yarn-cluster mode without a keytab. That would result in an
    error since that is not allowed.
    Tested with some (internal) integration tests that started failing with the
    patch for SPARK-25689.
    Closes #23689 from vanzin/SPARK-25689.followup.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
  2. [SPARK-26732][CORE][TEST] Wait for listener bus to process events in …

    vanzin authored and maropu committed Jan 30, 2019
    Otherwise the RDD data may be out of date by the time the test tries to check it.
    Tested with an artificial delay inserted in AppStatusListener.
    Closes #23654 from vanzin/SPARK-26732.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Takeshi Yamamuro <>
Commits on Jan 28, 2019
  1. [SPARK-26595][CORE] Allow credential renewal based on kerberos ticket cache.

    vanzin committed Jan 28, 2019
    This change adds a new mode for credential renewal that does not require
    a keytab; it uses the local ticket cache instead, so it works while the
    user keeps the cache valid.
    This can be useful for, e.g., people running long spark-shell sessions where
    their kerberos login is kept up-to-date.
    The main change to enable this behavior is in HadoopDelegationTokenManager,
    with a small change in the HDFS token provider. The other changes are to avoid
    creating duplicate tokens when submitting the application to YARN; they allow
    the tokens from the scheduler to be sent to the YARN AM, reducing the round trips
    to HDFS.
    For that, the scheduler initialization code was changed a little bit so that
    the tokens are available when the YARN client is initialized. That basically
    takes care of a long-standing TODO that was in the code to clean up configuration
    propagation to the driver's RPC endpoint (in CoarseGrainedSchedulerBackend).
    Tested with an app designed to stress this functionality, with both keytab and
    cache-based logins. Some basic kerberos tests on k8s also.
    Closes #23525 from vanzin/SPARK-26595.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Jan 15, 2019
  1. [SPARK-25857][CORE] Add developer documentation regarding delegation …

    vanzin committed Jan 15, 2019
    Closes #23348 from vanzin/SPARK-25857.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Marcelo Vanzin <>
Commits on Jan 8, 2019
  1. [SPARK-24522][UI] Create filter to apply HTTP security checks consist…

    vanzin authored and squito committed Jan 8, 2019
    Currently there is code scattered in a bunch of places to do different
    things related to HTTP security, such as access control, setting
    security-related headers, and filtering out bad content. This makes it
    really easy to miss these things when writing new UI code.
    This change creates a new filter that does all of those things, and
    makes sure that all servlet handlers that are attached to the UI get
    the new filter and any user-defined filters consistently. The extent
    of the actual features should be the same as before.
    The new filter is added at the end of the filter chain, because authentication
    is done by custom filters and thus needs to happen first. This means that
    custom filters see unfiltered HTTP requests - which is actually the current
    behavior anyway.
    As a side-effect of some of the code refactoring, handlers added after
    the initial set also get wrapped with a GzipHandler, which didn't happen
    before.
    Tested with added unit tests and in a history server with SPNEGO auth
    enabled.
    Closes #23302 from vanzin/SPARK-24522.
    Authored-by: Marcelo Vanzin <>
    Signed-off-by: Imran Rashid <>