RFC: Removing assemblies from Spark. #2

Closed

@vanzin vanzin commented Sep 23, 2015

No description provided.

dependencies be copied to a known directory under the backend’s build directory - not much different
from how things work today.

That leaves the YARN shuffle service. This is the only module where I see an assembly really adding

We could still generate a small assembly for this shuffle service. In fact, I think it has its own assembly separate from the larger one, since it intentionally has a tiny number of dependencies.

Owner Author

Yes, that is what I meant.

@tgravescs

I think my biggest concern here is backwards compatibility. I understand this doesn't change the API per se, but lots of people could have scripts (launch and deploy) and configs that would need changing. As mentioned, people could be pointing to the assembly jar in a pom, etc.

@vanzin
Owner Author

vanzin commented Sep 25, 2015

lots of people could have scripts (launch and deploy) and configs that would need changing

Yeah, backwards compatibility is a big worry. I think the best way forward is to make the code changes to accommodate both, and leave "build with assembly" as an option. Then, over time, we can slowly migrate away from using the assembly.

@vanzin
Owner Author

vanzin commented Oct 16, 2015

For those following, I incorporated some of the feedback from here and filed SPARK-11157 to track actual work going forward. Thanks for all the comments!

@vanzin vanzin closed this Oct 16, 2015
@vanzin vanzin deleted the rfc-assembly branch October 16, 2015 20:28
vanzin pushed a commit that referenced this pull request Mar 25, 2016
## What changes were proposed in this pull request?

This reopens apache#11836, which was merged but promptly reverted because it introduced flaky Hive tests.

## How was this patch tested?

See `CatalogTestCases`, `SessionCatalogSuite` and `HiveContextSuite`.

Author: Andrew Or <andrew@databricks.com>

Closes apache#11938 from andrewor14/session-catalog-again.
vanzin pushed a commit that referenced this pull request Jan 22, 2018
## What changes were proposed in this pull request?

There were two related fixes regarding `from_json`, `get_json_object` and `json_tuple` ([Fix #1](apache@c8803c0),
 [Fix #2](apache@86174ea)), but they weren't comprehensive it seems. I wanted to extend those fixes to all the parsers, and add tests for each case.

## How was this patch tested?

Regression tests

Author: Burak Yavuz <brkyvz@gmail.com>

Closes apache#20302 from brkyvz/json-invfix.
vanzin pushed a commit that referenced this pull request Mar 6, 2018
## What changes were proposed in this pull request?

There were two related fixes regarding `from_json`, `get_json_object` and `json_tuple` ([Fix #1](apache@c8803c0),
 [Fix #2](apache@86174ea)), but they weren't comprehensive it seems. I wanted to extend those fixes to all the parsers, and add tests for each case.

## How was this patch tested?

Regression tests

Author: Burak Yavuz <brkyvz@gmail.com>

Closes apache#20302 from brkyvz/json-invfix.

(cherry picked from commit e01919e)
Signed-off-by: hyukjinkwon <gurwls223@gmail.com>
vanzin pushed a commit that referenced this pull request Nov 5, 2019
### What changes were proposed in this pull request?
`org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite` has been failing lately. The logs show only the following error, without any details:
```
Caused by: sbt.ForkMain$ForkError: sun.security.krb5.KrbException: Server not found in Kerberos database (7) - Server not found in Kerberos database
```
Since the issue is intermittent and could not be reproduced, we should add more debug information and wait for it to recur with the extended logs.

### Why are the changes needed?
The failing test doesn't give enough debug information.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
I ran the test manually and checked that additional debug messages such as these show up:
```
>>> KrbApReq: APOptions are 00000000 00000000 00000000 00000000
>>> EType: sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType
Looking for keys for: kafka/localhostEXAMPLE.COM
Added key: 17version: 0
Added key: 23version: 0
Added key: 16version: 0
Found unsupported keytype (3) for kafka/localhostEXAMPLE.COM
>>> EType: sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType
Using builtin default etypes for permitted_enctypes
default etypes for permitted_enctypes: 17 16 23.
>>> EType: sun.security.krb5.internal.crypto.Aes128CtsHmacSha1EType
MemoryCache: add 1571936500/174770/16C565221B70AAB2BEFE31A83D13A2F4/client/localhostEXAMPLE.COM to client/localhostEXAMPLE.COM|kafka/localhostEXAMPLE.COM
MemoryCache: Existing AuthList:
#3: 1571936493/200803/8CD70D280B0862C5DA1FF901ECAD39FE/client/localhostEXAMPLE.COM
#2: 1571936499/985009/BAD33290D079DD4E3579A8686EC326B7/client/localhostEXAMPLE.COM
#1: 1571936499/995208/B76B9D78A9BE283AC78340157107FD40/client/localhostEXAMPLE.COM
```

Closes apache#26252 from gaborgsomogyi/SPARK-29580.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>