Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29677][DStreams] amazon-kinesis-client 1.12.0 #26333

Closed
wants to merge 1 commit into from

Conversation

etspaceman
Copy link
Contributor

What changes were proposed in this pull request?

Upgrading the amazon-kinesis-client dependency to 1.12.0.

Why are the changes needed?

The current amazon-kinesis-client version is 1.8.10. This version depends on the use of describeStream, which has a hard limit on an AWS account (10 reqs / second). Versions 1.9.0 and up leverage listShards, which has no such limit. For large customers, this can be a major problem.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests

@dongjoon-hyun
Copy link
Member

ok to test

@dongjoon-hyun
Copy link
Member

Thank you for your first contribution, @etspaceman .

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #112990 has finished for PR 26333 at commit a859ce4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@etspaceman
Copy link
Contributor Author

The unit test failures seem unrelated to the change. I see that the kinesis package has all passing tests.

@dongjoon-hyun
Copy link
Member

Retest this please.

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #113035 has finished for PR 26333 at commit a859ce4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@etspaceman
Copy link
Contributor Author

Seems to have had a separate unrelated error...

@srowen
Copy link
Member

srowen commented Nov 1, 2019

Looks OK if tests pass (yes those failures are unrelated).
BTW there are a lot of deprecation warnings from the current code; anyone interested in this module is welcome to fix them up.
Do we need to update the kinesis producer too? it's just used in tests though.

@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #4911 has finished for PR 26333 at commit a859ce4.

  • This patch fails build dependency tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@etspaceman
Copy link
Contributor Author

The KPL doesn't currently have a release that uses listShards I'm afraid. Though I've heard that's coming soon.

Still, if it's just for tests, I'm not too worried about that.

Just rebased on the current master.

@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #113091 has finished for PR 26333 at commit ddf4931.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 1, 2019

Test build #4912 has finished for PR 26333 at commit a859ce4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@etspaceman
Copy link
Contributor Author

Anything else required before this is merged?

@srowen
Copy link
Member

srowen commented Nov 2, 2019

I usually leave it open for a day or two if it's not urgent, esp at the weekend, just to see if anyone has further comments. It's probably fine.

@etspaceman
Copy link
Contributor Author

This one is quite urgent for our team at Disney Streaming Services. We are going through a massive Kinesis scale-up and the DescribeStream limits have been a major problem. Some of our apps use this module, so it is critical for us to have this merged.

We are Databricks customers and this was brought up in a conversation w/ our support staff w/ them last week.

If possible, we'd also really appreciate a hotfix release of the spark-streaming-kinesis-asl module.

@srowen
Copy link
Member

srowen commented Nov 2, 2019

Understood, and I expect it's fine to merge this to 2.4 as well. While I'm also at Databricks, we all wear our ASF project 'hat' here. Databricks support is separate, and you'd have to ask for updates to that distribution (including any maintenance releases) there. I don't think the OSS project here would cut new releases just for this -- relatively speaking it's not a critical issue for most users. But, can be in the next maintenance release.

@srowen srowen closed this in be022d9 Nov 2, 2019
srowen pushed a commit that referenced this pull request Nov 2, 2019
### What changes were proposed in this pull request?
Upgrading the amazon-kinesis-client dependency to 1.12.0.

### Why are the changes needed?
The current amazon-kinesis-client version is 1.8.10. This version depends on the use of `describeStream`, which has a hard limit on an AWS account (10 reqs / second). Versions 1.9.0 and up leverage `listShards`, which has no such limit. For large customers, this can be a major problem.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing tests

Closes #26333 from etspaceman/kclUpgrade.

Authored-by: Eric Meisel <eric.steven.meisel@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
(cherry picked from commit be022d9)
Signed-off-by: Sean Owen <sean.owen@databricks.com>
@srowen
Copy link
Member

srowen commented Nov 2, 2019

Merged to master/2.4

@gatorsmile
Copy link
Member

Upgrading from amazon-kinesis-client version from 1.8.10 to 1.12.0 is risky in the maintenance release of Spark 2.4. We always avoid such a dependence upgrade in the maintenance releases.

I would suggest to revert this from Spark 2.4 release. cc @srowen @dongjoon-hyun @zsxwing @yhuai

@srowen
Copy link
Member

srowen commented Nov 16, 2019

I don't really have a strong opinion on it. It appears to be a clean dependency update for a secondary module. @etspaceman describes a reasonably compelling reason to update it, and I don't know of a problem it causes - do we otherwise have a concern about it?

dongjoon-hyun added a commit that referenced this pull request Feb 28, 2020
### What changes were proposed in this pull request?

This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

### Why are the changes needed?

[SPARK-29677](#26333) upgrades AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0.

Since AWS Kinesis Client 1.12.0 is using AWS SDK 1.11.665, `aws-java-sdk-sts` should be consistent with Kinesis client dependency.
- https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins.

Closes #27720 from dongjoon-hyun/SPARK-30968.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun added a commit that referenced this pull request Feb 28, 2020
### What changes were proposed in this pull request?

This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

### Why are the changes needed?

[SPARK-29677](#26333) upgrades AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0.

Since AWS Kinesis Client 1.12.0 is using AWS SDK 1.11.665, `aws-java-sdk-sts` should be consistent with Kinesis client dependency.
- https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins.

Closes #27720 from dongjoon-hyun/SPARK-30968.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 3995728)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
dongjoon-hyun added a commit that referenced this pull request Feb 28, 2020
### What changes were proposed in this pull request?

This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

### Why are the changes needed?

[SPARK-29677](#26333) upgrades AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0.

Since AWS Kinesis Client 1.12.0 is using AWS SDK 1.11.665, `aws-java-sdk-sts` should be consistent with Kinesis client dependency.
- https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins.

Closes #27720 from dongjoon-hyun/SPARK-30968.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
(cherry picked from commit 3995728)
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
### What changes were proposed in this pull request?

This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

### Why are the changes needed?

[SPARK-29677](apache#26333) upgrades AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0.

Since AWS Kinesis Client 1.12.0 is using AWS SDK 1.11.665, `aws-java-sdk-sts` should be consistent with Kinesis client dependency.
- https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins.

Closes apache#27720 from dongjoon-hyun/SPARK-30968.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants