[HOTFIX] SPARK-1637: There are some Streaming examples added after the PR #571 was last updated. #673

Closed
wants to merge 1 commit

Conversation

techaddict
Contributor

This resulted in Compilation Errors.
cc @mateiz, the project is not compiling currently.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@techaddict techaddict changed the title [SPARK-1637][HOTFIX] There are some Streaming examples added after the PR #571 was last updated. [HOTFIX][SPARK-1637] There are some Streaming examples added after the PR #571 was last updated. May 7, 2014
@techaddict techaddict changed the title [HOTFIX][SPARK-1637] There are some Streaming examples added after the PR #571 was last updated. [HOTFIX] SPARK-1637: There are some Streaming examples added after the PR #571 was last updated. May 7, 2014
@tdas
Contributor

tdas commented May 7, 2014

@techaddict LOL, I just made the exact same PR. #674
Closing mine :)

…e PR apache#571 was last updated.

This resulted in Compilation Errors.
@@ -15,7 +15,7 @@
  * limitations under the License.
  */

-package org.apache.spark.streaming.examples;
+package org.apache.spark.examples.streaming;

 import com.google.common.collect.Lists;

Contributor


Can you change the package name in the instructions of both these files? See https://github.com/apache/spark/pull/674/files
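A hedged sketch of the kind of change being asked for here (the class name below is made up, not one of the files in this PR): both the package declaration and the run instructions in each new streaming example's header comment had to move to the new package.

```java
// Hedged sketch (class name is illustrative, not a file changed in this PR):
// the package line and the run instructions both reference the new package.
package org.apache.spark.examples.streaming;

/**
 * Usage (updated for the renamed package):
 *   ./bin/run-example org.apache.spark.examples.streaming.JavaStreamingSketch
 */
public final class JavaStreamingSketch {
    public static void main(String[] args) {
        // Placeholder body: the real examples build a JavaStreamingContext here.
        System.out.println("now under org.apache.spark.examples.streaming");
    }
}
```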

Contributor Author


Done 👍

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14751/

@tdas
Contributor

tdas commented May 7, 2014

Ignore these ^ build results; they are from the previous build, which I cancelled.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14753/

@pwendell
Contributor

pwendell commented May 7, 2014

This test was stuck, so I killed it; let's try again. Jenkins retest this please.

[info] - repeatedly failing task that crashes JVM (7 seconds, 56 milliseconds)
[info] - caching (3 seconds, 226 milliseconds)
[info] - caching on disk (3 seconds, 313 milliseconds)
[info] - caching in memory, replicated (3 seconds, 253 milliseconds)
[info] - caching in memory, serialized, replicated (3 seconds, 368 milliseconds)
[info] - caching on disk, replicated (3 seconds, 454 milliseconds)
[info] - caching in memory and disk, replicated (3 seconds, 332 milliseconds)

@pwendell
Contributor

pwendell commented May 7, 2014

Jenkins, test this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14755/

@pwendell
Contributor

pwendell commented May 7, 2014

Okay cool - merging this since it's a hotfix.

@pwendell
Contributor

pwendell commented May 7, 2014

This has been merged, but we're waiting for GitHub to sync up.

@asfgit asfgit closed this in fdae095 May 7, 2014
asfgit pushed a commit that referenced this pull request May 7, 2014
…e PR #571 was last updated.

This resulted in Compilation Errors.
cc @mateiz project not compiling currently.

Author: Sandeep <sandeep@techaddict.me>

Closes #673 from techaddict/SPARK-1637-HOTFIX and squashes the following commits:

b512f4f [Sandeep] [SPARK-1637][HOTFIX] There are some Streaming examples added after the PR #571 was last updated. This resulted in Compilation Errors.
(cherry picked from commit fdae095)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
…e PR apache#571 was last updated.

This resulted in Compilation Errors.
cc @mateiz project not compiling currently.

Author: Sandeep <sandeep@techaddict.me>

Closes apache#673 from techaddict/SPARK-1637-HOTFIX and squashes the following commits:

b512f4f [Sandeep] [SPARK-1637][HOTFIX] There are some Streaming examples added after the PR apache#571 was last updated. This resulted in Compilation Errors.
bulldozer-bot bot referenced this pull request in palantir/spark Nov 7, 2019
###### _excavator_ is a bot for automating changes across repositories.

Changes produced by the roomba/latest-baseline check.

# Release Notes
## 0.50.0
[feature] Warn against .parallel() calls on Java streams (#537) 
[fix] Correct prioritisation of versions.props to match nebula logic (#533)


## 0.51.0
- [feature] New 'com.palantir.baseline-reproducibility' plugin (#539) 
- [improvement] `./gradlew idea` deletes redundant ipr files (#550) 
- [fix] ValidateConstantMessage error-prone check is more accurate. (#546)


## 0.51.1
- [fix] Fix cleanup of old idea project files (#559) 
- [fix] Remove stale references to no longer existing puppycrawl checkstyle DTDs (#556)

## 0.52.0
- [improvement] errorprone 2.3.2 -> 2.3.3 (#561) 
- [feature] new plugin: `com.palantir.baseline-exact-dependencies` helps declare necessary and sufficient dependencies (#548)
- [improvement] Split out circle style plugin into generic junit reports plugin #564

## 0.53.0
- [improvement] Disallow javafx imports with checkstyle #569
- [fix] Avoid lambda to allow build caching of checkstyle results #576

## 0.54.0
- [feature] New `com.palantir.baseline-release-compatibility` plugin (#582)

## 0.55.0
[break] Enable running of unique class check on multiple configurations (#583)

## 0.55.1
[fix] checkImplicitDependencies shouldn't count ignored artifacts (#601) 

## 0.55.2
[fix] BaselineReleaseCompatibility up-to-date checking of compile tasks (#605)

## 0.56.0
[feature] Add an errorprone rule GradleCacheableTaskAction that prevents passing a lambda to Task.doFirst or Task.doLast when implementing gradle plugins (#608)

## 0.57.0
* [feature] Error prone rule to replace `Iterables.partition(List, int)` with `Lists.partition(List, int)` (#622)
* [feature] Error prone rule to prefer `Lists` or `Collections2` `transform` over `Iterables.transform` (#623)

## 0.58.0
[improvement] make CheckClassUniquenessTask cacheable (#637)
[fix] Add Javac Settings to uncheck "Use compiler from module target JDK when possible" (#629)
[fix] class uniqueness rule must have a config (#638) 

## 0.59.0
[improvement] Spotless to remove blank lines at start of methods (#641) 

## 0.60.0
* [improvement] New PreferBuiltInConcurrentKeySet suggestion (#649) 
* [improvement] Start publishing plugin to the [Gradle plugin portal](https://plugins.gradle.org/plugin/com.palantir.baseline) (#613)

## 0.61.0
- [improvement] Sensible defaults for test tasks (HeapDumpOnOutOfMemory) (#652)

## 0.62.0
* [improvement] Ensure Optional#orElse argument is not method invocation (#655)


## 0.62.1
[fix] Revert "[improvement] Ensure Optional#orElse argument is not method invocation" (#659)

## 0.63.0
[improvement] Support auto-applying error-prone suggested fixes (#660) 

## 0.64.0
* [improvement] Refaster rule compilation (#661)

## 0.64.1
- [improvement] JUnit 5 boilerplate #666

## 0.65.0
[improvement] Error-prone check to help prevent logging AuthHeader and BearerToken (#654)
[fix] fix potential NPE when configuring testing (#669) 
[fix] Fix refaster compilation to support version recommendations (#667)

## 0.66.0
[improvement] Ignore DesignForExtension for ParameterizedTest (#673) 

## 0.66.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | The PreventTokenLogging error-prone check will now correctly handle null use in SLF4J and Safe/Unsafe Arg functions. | palantir/gradle-baseline#674 |


## 1.0.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Add refaster rule to migrate away from optional.orElse(supplier.get()) | palantir/gradle-baseline#679 |
| Fix | Projects can now compile using Java12, because the one errorprone check that breaks (Finally) is now disabled when you use this toolchain. It remains enabled when compiling against earlier JDKs. | palantir/gradle-baseline#681 |


## 1.1.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Ensure that format tasks execute after compilation | palantir/gradle-baseline#688 |


## 1.1.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Auto-fix OptionalOrElseMethodInvocation using `-PerrorProneApply`. | palantir/gradle-baseline#690 |


## 1.2.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Spotless check for disallowing dangling parenthesis. | palantir/gradle-baseline#687 |


## 1.3.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Don't cache test tasks in the build cache by default.<br>It's possible to restore caching by adding `com.palantir.baseline.restore-test-cache = true` to your `gradle.properties`. | palantir/gradle-baseline#694 |


## 1.4.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | No longer cache javaCompile tasks when applying errorprone or refaster checks. | palantir/gradle-baseline#696 |
| Feature | Test helper for refaster checks. | palantir/gradle-baseline#697 |


## 1.5.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Determine whether to use junitPlatform on a per source set basis | palantir/gradle-baseline#701 |
| Feature | OptionalOrElseMethodInvocation now checks for constructor invocations. | palantir/gradle-baseline#702 |


## 1.6.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | The severity of PreferSafeLoggableExceptions and PreferSafeLoggingPreconditions is now WARNING. | palantir/gradle-baseline#704 |
| Fix | OptionalOrElseMethodInvocation now allows method references in orElse. | palantir/gradle-baseline#709 |


## 1.6.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Do not overwrite user provided test configure when using junit5 | palantir/gradle-baseline#712 |


## 1.7.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Baseline can now re-format all your Java files using the Eclipse formatter. This is currently an opt-in preview, try it out by running `./gradlew format -Pcom.palantir.baseline-format.eclipse`. | palantir/gradle-baseline#707 |
| Improvement | Add errorprone check to ensure junit5 tests are not used with junit4 Rule/ClassRule | palantir/gradle-baseline#714 |


## 1.8.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Checkstyle now tolerates empty lambda bodies (e.g. `() -> {}`) | palantir/gradle-baseline#715 |


## 1.8.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Correctly set dependency between spotlessApply and baselineUpdateConfig to prevent a race | palantir/gradle-baseline#724 |


## 1.8.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Correctly handle `EnableRuleMigrationSupport` in `JUnit5RuleUsage` errorprone-rule | palantir/gradle-baseline#725 |


## 1.9.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Wrap long parameterized types where necessary | palantir/gradle-baseline#716 |
| Improvement | Allow suppression of the TODO checkstyle check by giving it an ID. Clarify its comment to allow // TODO(username): ... | palantir/gradle-baseline#727 |
| Improvement | IntelliJ GitHub issue navigation | palantir/gradle-baseline#729 |
| Improvement | print out suggestion for module dependencies inclusion in useful format | palantir/gradle-baseline#733 |
| Fix | The `checkImplicitDependencies` task will no longer suggest a fix of the current project. | palantir/gradle-baseline#736, palantir/gradle-baseline#567 |
| Improvement | Implement DangerousCompletableFutureUsage errorprone check | palantir/gradle-baseline#740 |


## 1.10.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Refaster to use `execute` over `submit` when the result is ignored | palantir/gradle-baseline#741 |


## 1.10.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Enable applying refaster rules for repos with -Xlint:deprecation | palantir/gradle-baseline#742 |


## 1.11.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Apply `InputStreamSlowMultibyteRead` error prone check at ERROR severity | palantir/gradle-baseline#749 |


## 1.12.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | The `baseline-idea` plugin now generates configuration more closely aligned with Gradle defaults. | palantir/gradle-baseline#718 |
| Improvement | Apply the suggested fixes for `UnusedMethod` and `UnusedVariable`. | palantir/gradle-baseline#751 |
| Improvement | Refaster `stream.sorted().findFirst()` into `stream.min(Comparator.naturalOrder())` | palantir/gradle-baseline#752 |
| Improvement | Error prone validation that Stream.sort is invoked on comparable streams | palantir/gradle-baseline#753 |
| Improvement | `DangerousStringInternUsage`: Disallow String.intern() invocations | palantir/gradle-baseline#754 |


## 1.12.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Do not apply the suggested fixes for `UnusedMethod` and `UnusedVariable` which automatically remove code with side effects. | palantir/gradle-baseline#757 |


## 1.13.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Remove errorprone `LogSafePreconditionsConstantMessage` | palantir/gradle-baseline#755 |
| Improvement | Disable errorprone `Slf4jLogsafeArgs` in test code | palantir/gradle-baseline#756 |
| Improvement | error-prone now detects `Duration#getNanos` mistakes and bans URL in equals methods | palantir/gradle-baseline#758 |


## 1.14.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Implement `OptionalOrElseThrowThrows` to prevent throwing from orElseThrow | palantir/gradle-baseline#759 |


## 1.15.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | LogSafePreconditionsMessageFormat disallows slf4j-style format characters | palantir/gradle-baseline#761 |
| Improvement | Error Prone LambdaMethodReference check | palantir/gradle-baseline#763 |


## 1.16.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | baseline-circleci no longer integrates with the (deprecated) FindBugs plugin, as a pre-requisite for Gradle 6.0 compatibility. | palantir/gradle-baseline#766 |


## 1.17.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | The `TypeParameterUnusedInFormals` errorprone check is disabled when compiling on Java 13, to workaround an error-prone bug. | palantir/gradle-baseline#767 |
| Improvement | Publish scm information within POM | palantir/gradle-baseline#769 |


## 1.17.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | LambdaMethodReference avoids suggestions for non-static methods | palantir/gradle-baseline#771 |


## 1.17.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Remove pom only dependencies from analysis in checkUnusedDependencies | palantir/gradle-baseline#773 |


## 1.18.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | When computing unused dependencies, compileOnly and annotationProcessor<br>dependencies are ignored due to false positives as these dependencies<br>will not appear as dependencies in the generated byte-code, but are in<br>fact necessary dependencies to compile a given module. | palantir/gradle-baseline#783 |


## 1.19.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Disable `PreconditionsConstantMessage` on gradle plugins | palantir/gradle-baseline#790 |


## 2.0.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Break | Add gradle 6.0-20190904072820+0000 compatibility. This raises the minimum required version of Gradle for plugins from this repo to 5.0. | palantir/gradle-baseline#791 |


## 2.1.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Feature | Automatically configure the [Intellij Eclipse format plugin](https://plugins.jetbrains.com/plugin/6546-eclipse-code-formatter) to use the eclipse formatter | palantir/gradle-baseline#794 |


## 2.1.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Stop applying error prone patches for checks that have been turned off. | palantir/gradle-baseline#793 |


## 2.2.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | baseline-circleci now validates that the rootProject.name isn't the CircleCI default (`project`), as it can interfere with publishing. | palantir/gradle-baseline#775 |
| Improvement | Remove JGit dependency | palantir/gradle-baseline#798 |


## 2.2.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Don't add whitespace to blank lines inside comments. Fixes apache#799 | palantir/gradle-baseline#800 |
| Fix | Eclipse formatter now aligns multicatch so that it passes checkstyle. | palantir/gradle-baseline#807 |


## 2.2.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | ClassUniquenessPlugin now checks the `runtimeClasspath` configuration by default. | palantir/gradle-baseline#810 |


## 2.3.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | SafeLoggingExceptionMessageFormat disallows `{}` in safelog exception messages | palantir/gradle-baseline#815 |


## 2.4.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | A new `StrictUnusedVariable` check will catch any unused arguments (e.g. AuthHeaders) to public methods. If you need to suppress this, rename your variable to have an underscore prefix (e.g. `s/foo/_foo/`) or just run `./gradlew classes -PerrorProneApply` to auto-fix | palantir/gradle-baseline#819 |
| Improvement | Message format checks use instanceof rather than catching | palantir/gradle-baseline#821 |


## 2.4.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Avoid false positives caused by `module-info.class` when checking class uniqueness | palantir/gradle-baseline#823 |


## 2.4.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Checkstyle tasks only check their own source set and only actual java sources. They don't look in your `src/*/resources` directory anymore. | palantir/gradle-baseline#830 |


## 2.4.3
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Add link to StrictUnusedVariable that directs users to baseline repo. | palantir/gradle-baseline#829 |
| Fix | Long try-with-resources statements are now aligned such that the first assignment stays on the first line. | palantir/gradle-baseline#835 |


## 2.5.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Error Prone StringBuilderConstantParameters. StringBuilder with a constant number of parameters should be replaced by simple concatenation. The Java compiler (jdk8) replaces concatenation of a constant number of arguments with a StringBuilder, while jdk 9+ take advantage of JEP 280 (https://openjdk.java.net/jeps/280) to efficiently pre-size the result for better performance than a StringBuilder. | palantir/gradle-baseline#832 |
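An illustrative before/after for the rewrite this check suggests (method and variable names below are made up):

```java
// Illustrative only: a StringBuilder chain with a constant number of appends
// becomes plain concatenation, which javac (and JEP 280 on JDK 9+) already
// handle efficiently.
class StringBuilderConstantParametersExample {
    static String before(String user, int attempts) {
        return new StringBuilder()
                .append("user=").append(user)
                .append(" attempts=").append(attempts)
                .toString();
    }

    static String after(String user, int attempts) {
        return "user=" + user + " attempts=" + attempts;
    }
}
```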


## 2.6.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Excavator PRs that apply other refaster rules (e.g. Witchcraft ones) will not also apply baseline refaster rules. | palantir/gradle-baseline#827 |
| Improvement | Added a new ErrorProne check `PreferAssertj` to assist migration to AssertJ from legacy test frameworks. It may be necessary to add a dependency on `org.assertj:assertj-core` in modules which do not already depend on AssertJ. If there's a technical reason that AssertJ cannot be used, `PreferAssertj` may be explicitly disabled to prevent future upgrades from attempting to re-run the migration. | palantir/gradle-baseline#841 |


## 2.7.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | `StrictUnusedVariable` now ignores variables prefixed with `_` and the suggested fix will rename all unused parameters in public methods instead of removing them | palantir/gradle-baseline#833 |
| Improvement | ErrorProne will now detect dangerous usage of `@RunWith(Suite.class)` that references JUnit5 classes, as this can cause tests to silently not run! | palantir/gradle-baseline#843 |


## 2.8.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | PreferAssertj provides better replacements fixes | palantir/gradle-baseline#850 |
| Improvement | Do not run error prone on any code in the build directory | palantir/gradle-baseline#853 |


## 2.8.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Fix hamcrest arrayContainingInAnyOrder conversion | palantir/gradle-baseline#859 |


## 2.9.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | StrictUnusedVariable can only be suppressed with `_` prefix | palantir/gradle-baseline#854 |
| Improvement | StrictUnusedVariable is now an error by default | palantir/gradle-baseline#855 |
| Fix | The PreferAssertj refactoring will only be applied if you have explicitly opted in (e.g. using `baselineErrorProne { patchChecks += 'PreferAssertj' }`) | palantir/gradle-baseline#861 |


## 2.9.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Error prone will correctly ignore all source files in the build directory and in any generated source directory | palantir/gradle-baseline#864 |
| Fix | Ensure that `StrictUnusedVariable` correctly converts previously suppressed variables `unused` to `_` | palantir/gradle-baseline#865 |


## 2.9.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | When removing unused variables, `StrictUnusedVariable` will preserve side effects | palantir/gradle-baseline#870 |


## 2.10.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | A new `checkJUnitDependencies` task detects misconfigured JUnit dependencies which could result in some tests silently not running. | palantir/gradle-baseline#837 |
| Improvement | Some AssertJ assertions can now be automatically replaced with more idiomatic ones using refaster. | palantir/gradle-baseline#851 |
| Fix | PreferAssertj check avoids ambiguity in assertThat invocations | palantir/gradle-baseline#874 |
| Improvement | Improve performance of error prone PreferAssertj check | palantir/gradle-baseline#875 |
| Improvement | StringBuilderConstantParameters suggested fix doesn't remove comments | palantir/gradle-baseline#877 |


## 2.10.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Allow junit4 dependencies to exist without junit4 tests | palantir/gradle-baseline#880 |


## 2.11.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | PreferAssertj supports migration of zero-delta floating point array asserts | palantir/gradle-baseline#883 |


## 2.11.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | checkJunitDependencies only checks Java source | palantir/gradle-baseline#885 |


## 2.11.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | AssertJ Refaster fixes use static `assertThat` imports | palantir/gradle-baseline#887 |


## 2.12.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Disable `UnusedVariable` error prone rule by default | palantir/gradle-baseline#888 |


## 2.13.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Refaster for AssertJ isZero/isNotZero/isOne and collections | palantir/gradle-baseline#881 |
| Improvement | AssertJ refaster migrations support string descriptions | palantir/gradle-baseline#891 |
| Fix | Certain error-prone checks are disabled in test code, and the presence of JUnit5's `@TestTemplate` annotation is now used to detect whether a class is test code. | palantir/gradle-baseline#892 |
| Fix | BaselineFormat task exclude generated code on Windows | palantir/gradle-baseline#896 |


## 2.14.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Refaster rules for AssertJ tests | palantir/gradle-baseline#898 |
| Improvement | refaster replacement for assertj hasSize(foo.size) -> hasSameSizeAs | palantir/gradle-baseline#900 |
| Fix | Keep spotless plugin from eagerly configuring all tasks | diffplug/spotless#444 |
| Fix | Continue when RefasterRuleBuilderScanner throws | palantir/gradle-baseline#904 |
| Improvement | Refaster now works on repos using Gradle 6.0 | palantir/gradle-baseline#804, palantir/gradle-baseline#906 |


## 2.15.0
_No documented user facing changes_

## 2.16.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Rewrite ImmutableCollection#addAll to add for arrays | palantir/gradle-baseline#743 |
| Improvement | Add refaster rule to simplify empty optional asserts | palantir/gradle-baseline#911 |
| Improvement | Baseline now allows static imports of AssertJ and Mockito methods. | palantir/gradle-baseline#915 |
| Improvement | Remove refaster AssertjIsOne rule. | palantir/gradle-baseline#917 |
| Improvement | Add assertj refaster rules for map size asserts | palantir/gradle-baseline#919 |
| Improvement | Added a Refaster rule to change `isEqualTo` checks into `hasValue` checks |  |


## 2.17.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Implement AssertjCollectionHasSameSizeAsArray | palantir/gradle-baseline#922 |
| Improvement | Implement assertj map refactors for containsKey and containsEntry | palantir/gradle-baseline#925 |
| Improvement | Refaster assertj migrations support descriptions with format args | palantir/gradle-baseline#926 |
| Improvement | Refaster out String.format from describedAs | palantir/gradle-baseline#927 |


## 2.18.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Refaster rules to simplify negated boolean expressions and extract null checks. | palantir/gradle-baseline#935 |
| Improvement | Refaster rules for checks that maps do not contain a specific key | palantir/gradle-baseline#935 |
| Improvement | Refaster rule 'CollectionStreamForEach' | palantir/gradle-baseline#942 |
| Improvement | ExecutorSubmitRunnableFutureIgnored as error prone ERROR | palantir/gradle-baseline#943 |


## 2.19.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | checkJUnitDependencies detects a possible misconfiguration with spock and JUnit5 which could lead to tests silently not running. | palantir/gradle-baseline#951 |


## 2.20.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Use Mockito verifyNoInteractions over deprecated verifyZeroInteractions | palantir/gradle-baseline#924 |
| Improvement | Errorprone rules for usage of Guava static factory methods | palantir/gradle-baseline#941 |
| Improvement | Fix error-prone `UnnecessaryParentheses` by default | palantir/gradle-baseline#952 |
| Improvement | Implement Error Prone `ThrowError` to discourage throwing Errors in production code<br>Errors are often handled poorly by libraries resulting in unexpected<br>behavior and resource leaks. It's not obvious that 'catch (Exception e)'<br>does not catch Error.<br>This check  is intended to be advisory - it's fine to<br>`@SuppressWarnings("ThrowError")` in certain cases, but is usually not<br>recommended unless you are writing a testing library that throws<br>AssertionError. | palantir/gradle-baseline#957 |
| Improvement | Improve TestCheckUtils.isTestCode test detection | palantir/gradle-baseline#958 |
| Improvement | Implement Error Prone `Slf4jLevelCheck` to validate that slf4j level checks agree with contained logging. | palantir/gradle-baseline#960 |


## 2.20.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Suppress error-prone PreferCollectionConstructors on jdk13 | palantir/gradle-baseline#968 |


## 2.21.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Feature | Users can opt-in to format their files using our fork of google-java-format (palantir-java-format) | palantir/gradle-baseline#936 |


## 2.22.0
_Automated release, no documented user facing changes_

## 2.23.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Implement error prone ReverseDnsLookup for unexpected reverse dns lookups<br><br>Calling address.getHostName may result in a DNS lookup which is a network request,<br>making the invocation significantly more expensive than expected depending on the<br>environment.<br>This check  is intended to be advisory - it's fine to<br>@SuppressWarnings("ReverseDnsLookup") in certain cases, but is usually not<br>recommended. | palantir/gradle-baseline#970 |
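An illustrative fix for the pattern this check flags (the helper below is made up, not part of baseline): `InetSocketAddress.getHostString()` returns the configured host without the reverse DNS lookup that `getHostName()` may perform.

```java
// Illustrative only: avoid the implicit reverse DNS lookup described above.
import java.net.InetSocketAddress;

final class ReverseDnsLookupExample {
    static String hostOf(InetSocketAddress address) {
        return address.getHostString(); // no network round trip
    }
}
```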


## 2.24.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | The deprecated `verifyZeroInteractions` now gets rewritten to `verifyNoMoreInteractions`, which has the same behaviour. | palantir/gradle-baseline#975 |
| Improvement | ReadReturnValueIgnored: Check that read operation results are not ignored | palantir/gradle-baseline#978 |
| Improvement | Stop migrating source sets to safe-logging, unless they already have the requisite library (`com.palantir.safe-logging:preconditions`). | palantir/gradle-baseline#981 |
| Improvement | For users who opted into palantir-java-format, we now reflow strings and reorder imports. | palantir/gradle-baseline#982 |


## 2.25.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | checkstyle Indentation rule is disabled when palantir-java-format is enabled | palantir/gradle-baseline#987 |
| Improvement | Load palantir-java-format dynamically from the same configuration set up by `com.palantir-java-format` which is also used to determine the version used by IntelliJ. | palantir/gradle-baseline#989 |


## 2.26.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Run `./gradlew formatDiff` to reformat the relevant sections of any uncommitted changed Java files (relies on `git diff -U0 HEAD` under the hood) | palantir/gradle-baseline#988 |


## 2.27.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Slf4jLogsafeArgs fixes safe-log wrapped throwables | palantir/gradle-baseline#1001 |
| Improvement | `DangerousParallelStreamUsage` checks for `Collection.parallelStream()` and `StreamSupport` utility methods with parallel=true. | palantir/gradle-baseline#1005 |
| Improvement | DangerousThrowableMessageSafeArg disallows Throwables in SafeArg values.<br>Throwables must be logged without an Arg wrapper as the last parameter, otherwise unsafe data may be leaked from the unsafe message or the unsafe message of a cause. | palantir/gradle-baseline#997 |
| Improvement | Implement a suggested fix for CatchBlockLogException | palantir/gradle-baseline#998 |


## 2.28.0
| Type | Description | Link |
| ---- | ----------- | ---- |
| Improvement | Implement `FinalClass` error prone check, replacing the checkstyle implementation | palantir/gradle-baseline#1008 |
| Improvement | Error prone validation to avoid redundant modifiers | palantir/gradle-baseline#1010 |


## 2.28.1
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Fix `RedundantModifier` interpretation of implicit modifiers | palantir/gradle-baseline#1014 |


## 2.28.2
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Fix RedundantModifier failures types nested in interfaces | palantir/gradle-baseline#1017 |


## 2.28.3
| Type | Description | Link |
| ---- | ----------- | ---- |
| Fix | Fix error-prone matching literal null as a subtype.<br>The most common issue this fixes is failures on `SafeArg.of("name", null)`<br>assuming that the null literal value parameter may be a throwable. | palantir/gradle-baseline#1020 |



To enable or disable this check, please contact the maintainers of Excavator.
holdenk added a commit to holdenk/spark that referenced this pull request Oct 27, 2020
[SPARK-21040][CORE] Speculate tasks which are running on decommission executors

This PR adds functionality to consider the running tasks on decommission executors based on some config.
In spark-on-cloud, we sometimes already know that an executor won't be alive for more than a fixed amount of time. For example, with AWS Spot nodes, once we get the notification we know that the node will be gone in 120 seconds.
So if the running tasks on the decommissioning executors may run beyond currentTime+120 seconds, they are candidates for speculation.

Currently, when an executor is decommissioned, we stop scheduling new tasks on it, but the already running tasks keep running. Depending on the cloud, we might know beforehand that an executor won't be alive for more than a preconfigured time; different cloud providers give different timeouts before they take away the nodes. For example, in the case of AWS spot nodes, an executor won't be alive for more than 120 seconds. We can use this information in cloud environments to make better decisions about speculating the already running tasks on decommissioned executors.

Yes. This PR adds a new config "spark.executor.decommission.killInterval" which users can explicitly set based on the cloud environment where they are running.
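A rough sketch of how the new setting could be applied; only the config key comes from the description above, the surrounding setup is illustrative and not code from this PR.

```java
// Sketch only: the killInterval key comes from the PR description; everything
// else here is illustrative.
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public final class DecommissionSpeculationSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("decommission-speculation-sketch")
                .setMaster("local[2]") // local master just to keep the sketch self-contained
                .set("spark.speculation", "true")
                // How long a decommissioning node (e.g. an AWS spot node) is expected
                // to survive; running tasks expected to outlive this window become
                // candidates for speculation.
                .set("spark.executor.decommission.killInterval", "120s");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            System.out.println(sc.parallelize(Arrays.asList(1, 2, 3)).count());
        }
    }
}
```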

Added UT.

Closes apache#28619 from prakharjain09/SPARK-21040-speculate-decommission-exec-tasks.

Authored-by: Prakhar Jain <prakharjain09@gmail.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown

This pull request adds the ability to migrate shuffle files during Spark's decommissioning. The design document associated with this change is at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE .

To allow this change the `MapOutputTracker` has been extended to allow the location of shuffle files to be updated with `updateMapOutput`. When a shuffle block is put, a block update message will be sent which triggers the `updateMapOutput`.

Instead of rejecting remote puts of shuffle blocks, `BlockManager` delegates the storage of shuffle blocks to its shuffle manager's resolver (if supported). A new, experimental, trait is added for shuffle resolvers to indicate they handle remote putting of blocks.

The existing block migration code is moved out into a separate file, and a producer/consumer model is introduced for migrating shuffle files from the host as quickly as possible while not overwhelming other executors.

Recomputing shuffle blocks can be expensive, so we should take advantage of the decommissioning time to migrate these blocks.

This PR introduces two new config parameters, `spark.storage.decommission.shuffleBlocks.enabled` & `spark.storage.decommission.rddBlocks.enabled`, that control which blocks should be migrated during storage decommissioning.
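A minimal sketch of enabling the two new flags; only the config keys come from the description above, the rest is illustrative.

```java
// Sketch only: the two keys come from the PR description.
import org.apache.spark.SparkConf;

public final class StorageDecommissionConfSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("storage-decommission-sketch")
                // Migrate shuffle blocks off executors that are being decommissioned.
                .set("spark.storage.decommission.shuffleBlocks.enabled", "true")
                // Also migrate cached RDD blocks.
                .set("spark.storage.decommission.rddBlocks.enabled", "true");
        System.out.println(conf.toDebugString());
    }
}
```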

New unit test & expansion of the Spark on K8s decom test to assert that decommissioning with shuffle block migration means that the results are not recomputed even when the original executor is terminated.

This PR is a cleaned-up version of the previous WIP PR I made apache#28331 (thanks to attilapiros for his very helpful reviewing on it :)).

Closes apache#28708 from holdenk/SPARK-20629-copy-shuffle-data-when-nodes-are-being-shutdown-cleaned-up.

Lead-authored-by: Holden Karau <hkarau@apple.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Co-authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Co-authored-by: Attila Zsolt Piros <attilazsoltpiros@apiros-mbp16.lan>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-24266][K8S] Restart the watcher when we receive a version changed from k8s

Restart the watcher when it fails with an HTTP_GONE code from the Kubernetes API, which means a resource version has changed.

For more relevant information see here: fabric8io/kubernetes-client#1075

No

Running spark-submit to a k8s cluster.

Not sure how to make an automated test for this. If someone can help me out that would be great.

Closes apache#28423 from stijndehaes/bugfix/k8s-submit-resource-version-change.

Authored-by: Stijn De Haes <stijndehaes@gmail.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-32217] Plumb whether a worker would also be decommissioned along with executor

This PR is a giant plumbing PR that plumbs an `ExecutorDecommissionInfo` along
with the DecommissionExecutor message.

The primary motivation is to know whether a decommissioned executor
would also be losing shuffle files -- and thus it is important to know
whether the host would also be decommissioned.

In the absence of this PR, the existing code assumes that decommissioning an executor does not lose the whole host with it, and thus does not clear the shuffle state if external shuffle service is enabled. While this may hold in some cases (like K8s decommissioning an executor pod, or YARN container preemption), it does not hold in others like when the cluster is managed by a Standalone Scheduler (Master). This is similar to the existing `workerLost` field in the `ExecutorProcessLost` message.

In the future, this `ExecutorDecommissionInfo` can be embellished for
knowing how long the executor has to live for scenarios like Cloud spot
kills (or Yarn preemption) and the like.
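A purely illustrative sketch (not Spark's actual class) of the shape of the information the commit message says is plumbed along with the DecommissionExecutor message:

```java
// Illustrative only -- not Spark's actual class: a reason string plus a flag
// saying whether the whole worker host (and therefore its shuffle files) is
// going away with the executor.
final class ExecutorDecommissionInfoSketch {
    private final String message;
    private final boolean workerHostDecommissioned;

    ExecutorDecommissionInfoSketch(String message, boolean workerHostDecommissioned) {
        this.message = message;
        this.workerHostDecommissioned = workerHostDecommissioned;
    }

    String message() {
        return message;
    }

    /** True when shuffle state for the host must also be cleared. */
    boolean isWorkerHostDecommissioned() {
        return workerHostDecommissioned;
    }
}
```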

No

Tweaked an existing unit test in `AppClientSuite`

Closes apache#29032 from agrawaldevesh/plumb_decom_info.

Authored-by: Devesh Agrawal <devesh.agrawal@gmail.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-32199][SPARK-32198] Reduce job failures during decommissioning

This PR reduces the prospect of a job loss during decommissioning. It
fixes two holes in the current decommissioning framework:

- (a) Loss of decommissioned executors is not treated as a job failure:
We know that the decommissioned executor would be dying soon, so its death is
clearly not caused by the application.

- (b) Shuffle files on the decommissioned host are cleared when the
first fetch failure is detected from a decommissioned host: this is a
bit tricky in terms of when to clear the shuffle state. Ideally you
want to clear it the millisecond before the shuffle service on the node
dies (or the executor dies when there is no external shuffle service) --
too soon could lead to some wastage, and too late would lead to
fetch failures.

  The approach here is to do this clearing when the very first fetch
failure is observed on the decommissioned block manager, without waiting for
other blocks to also signal a failure.

Without these fixes, decommissioning a lot of executors at a time leads to job failures.

The task scheduler tracks the executors that were decommissioned along with their
`ExecutorDecommissionInfo`. This information is used (a) when handling an `ExecutorProcessLost` error, and (b) by the `DAGScheduler` when handling a fetch failure.

No

Added a new unit test `DecommissionWorkerSuite` to test the new behavior by exercising the Master-Worker decommissioning. I chose to add a new test since the setup logic was quite different from the existing `WorkerDecommissionSuite`. I am open to changing the name of the newly added test suite :-)

- Should I add a feature flag to guard these two behaviors ? They seem safe to me that they should only get triggered by decommissioning, but you never know :-).

Closes apache#29014 from agrawaldevesh/decom_harden.

Authored-by: Devesh Agrawal <devesh.agrawal@gmail.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-32417] Fix flakyness of BlockManagerDecommissionIntegrationSuite

This change tries to fix the flakiness of BlockManagerDecommissionIntegrationSuite.

Make the block manager decommissioning test be less flaky

An interesting failure happens when migrateDuring = true (and persist or shuffle is true):
- We schedule the job with tasks on executors 0, 1, 2.
- We wait 300 ms and decommission executor 0.
- If the task is not yet done on executor 0, it will now fail because
   the block manager won't be able to save the block. This condition is
   easy to trigger on a loaded machine where the github checks run.
- The task will retry on a different executor (1 or 2) and its shuffle
   blocks will land there.
- No actual block migration happens here because the decommissioned
   executor technically failed before it could even produce a block.

To remove the above race, this change replaces the fixed 300 ms wait with waiting for an actual task to succeed. When a task has succeeded, we know its blocks have been written for sure, and thus its executor would certainly be forced to migrate those blocks when it is decommissioned.
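A hedged sketch of that technique (not the suite's actual code; names are illustrative): block on a latch that a `SparkListener` counts down when the first task succeeds, instead of sleeping a fixed 300 ms.

```java
// Hedged sketch, not the suite's actual code: wait for the first successfully
// finished task instead of sleeping a fixed amount of time.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerTaskEnd;

final class WaitForFirstTaskSketch {
    static void awaitFirstSuccessfulTask(JavaSparkContext jsc) throws InterruptedException {
        CountDownLatch firstSuccess = new CountDownLatch(1);
        jsc.sc().addSparkListener(new SparkListener() {
            @Override
            public void onTaskEnd(SparkListenerTaskEnd taskEnd) {
                // A successfully finished task has written its blocks, so its
                // executor is a meaningful decommission target.
                if (taskEnd.taskInfo().successful()) {
                    firstSuccess.countDown();
                }
            }
        });
        if (!firstSuccess.await(60, TimeUnit.SECONDS)) {
            throw new IllegalStateException("no task succeeded within the timeout");
        }
    }
}
```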

The change always decommissions an executor on which a real task finished successfully instead of picking the first executor, because the system may choose to schedule nothing on the first executor and instead run the two tasks on one executor.

I have had bad luck with BlockManagerDecommissionIntegrationSuite and it has failed several times on my PRs. So fixing it.

No, unit test only change.

Github checks. Ran this test 100 times, 10 at a time in parallel in a script.

Closes apache#29226 from agrawaldevesh/block-manager-decom-flaky.

Authored-by: Devesh Agrawal <devesh.agrawal@gmail.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

[SPARK-31197][CORE] Shutdown executor once we are done decommissioning

Exit the executor when it has been asked to decommission and there is nothing left for it to do.

This is a rebase of apache#28817

If we want to use decommissioning in Spark's own scale down we should terminate the executor once finished.
Furthermore, in graceful shutdown it makes sense to release resources we no longer need if we've been asked to shut down by the cluster manager instead of always holding the resources as long as possible.

The decommissioned executors will exit at the end of decommissioning. This is sort of a user facing change, however decommissioning hasn't been in any releases yet.

I changed the unit test to not send the executor exit message and still wait on the executor exited message.

Closes apache#29211 from holdenk/SPARK-31197-exit-execs-redone.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Holden Karau <hkarau@apple.com>

Connect decommissioning to dynamic scaling

Because the mock always says there is an RDD we may replicate more than once, and now that there are independent threads

Make Spark's dynamic allocation use decommissioning

Track the decommissioning executors in the core dynamic scheduler so we don't scale down too low, update the streaming ExecutorAllocationManager to also delegate to decommission

Fix up executor add for resource profile

Fix our exiting and cleanup thread for better debugging next time. Cleanup the locks we use in decommissioning and clarify some more bits.

Verify executors decommissioned, then killed by the external cluster manager, are re-launched

Verify some additional calls are not occurring in the executor allocation manager suite.

Don't close the watcher until the end of the test

Use decommissionExecutors and set adjustTargetNumExecutors to false so that we can match the pattern for killExecutor/killExecutors

bump numparts up to 6

Revert "bump numparts up to 6"

This reverts commit daf96dd.

Small comment & visibility cleanup

CR feedback/cleanup

Cleanup the merge

Fix up the merge

CR feedback, move adjustExecutors to a common utility function

Exclude some non-public APIs

Remove junk

More CR feedback

Fix adjustExecutors backport

This test fails for me locally and from what I recall it's because we use a different method of resolving the bind address than upstream so disabling the test

This test fails for me locally and from what I recall it's because we use a different method of resolving the bind address than upstream so disabling the test

Cleanup and drop watcher changes from the backport
sunchao pushed a commit to sunchao/spark that referenced this pull request Jun 2, 2023
PRs Merged
1. [Internal] Add AppleAwsClientFactory for Mascot (apache#577)
2. Hive: Log new metadata location in commit (apache#4681)
3. change timeout to 120 for now (apache#661)
4. Internal: Add hive_catalog parameter to SparkCatalog (apache#670)
5. Internal: Pull catalog setting to CachedClientPool (apache#673)
6. Core: Defer reading Avro metadata until ManifestFile is read (apache#5206)
7. API: Fix ID assignment in schema merging (apache#5395)
8. AWS: S3OutputStream - failure to close should persist on subsequent close calls (apache#5311)
9. API: Allow schema updates to find fields with case-insensitivity (apache#5440)
10. Spark 3.3: Spark mergeSchema to respect Spark Case Sensitivity Configuration (apache#5441)
RolatZhang pushed a commit to RolatZhang/spark that referenced this pull request Dec 8, 2023