[CARBONDATA-3300] Fixed ClassNotFoundException when using UDF in spark-shell #3132
Conversation
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2597/
LGTM, please make sure any other place we have used
@ravipesala ok
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10856/
Force-pushed from 9bca36f to b25d8f2
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2827/
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2598/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2828/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10857/
@@ -261,7 +262,8 @@ private void writeChunkInfoForOlderVersions(DataOutput output) throws IOException

   private DataChunk deserializeDataChunk(byte[] bytes) throws IOException {
     ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
-    ObjectInputStream inputStream = new ObjectInputStream(stream);
+    ObjectInputStream inputStream =
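The second added line of this hunk is cut off on the page. A minimal sketch of what the patched method plausibly looks like, assuming the ClassLoaderObjectInputStream from Apache Commons IO and the thread context class loader (the exact loader argument is not visible in the hunk; DataChunk is the CarbonData type from the surrounding code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;

import org.apache.commons.io.input.ClassLoaderObjectInputStream;

  // Sketch of the patched method, not the verbatim merged code.
  private DataChunk deserializeDataChunk(byte[] bytes) throws IOException {
    ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
    // Resolve classes against the thread context class loader
    // (TranslatingClassLoader in spark-shell) rather than the loader
    // returned by sun.misc.VM.latestUserDefinedLoader().
    ObjectInputStream inputStream =
        new ClassLoaderObjectInputStream(
            Thread.currentThread().getContextClassLoader(), stream);
    try {
      return (DataChunk) inputStream.readObject();
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    }
  }
```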
It seems this method does not call "inputStream.close();" in a finally block; can you add that protection in this PR?
It does not open a file that must be closed; the stream is backed by the byte[], so closing it in a finally block is not needed.
@@ -1536,7 +1537,8 @@ public static ValueEncoderMeta deserializeEncoderMetaV2(byte[] encoderMeta) {
     ValueEncoderMeta meta = null;
     try {
       aos = new ByteArrayInputStream(encoderMeta);
-      objStream = new ObjectInputStream(aos);
+      objStream =
"CarbonUtil.closeStreams(objStream);" cann't be called when not IOException
No other exception handling is needed because the stream throws only ClassNotFoundException/IOException.
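For reference, a hedged sketch of the whole method with the reviewer's suggestion applied, i.e. the stream closed in a finally block so that CarbonUtil.closeStreams runs on the success path as well as the exception path. ValueEncoderMeta, CarbonUtil.closeStreams, and ClassLoaderObjectInputStream are taken from the surrounding discussion; the exact patched body is not visible on this page and the exception handling here is simplified:

```java
public static ValueEncoderMeta deserializeEncoderMetaV2(byte[] encoderMeta) {
  ByteArrayInputStream aos = null;
  ObjectInputStream objStream = null;
  ValueEncoderMeta meta = null;
  try {
    aos = new ByteArrayInputStream(encoderMeta);
    // Same fix as in the first hunk: resolve classes via the
    // thread context class loader.
    objStream = new ClassLoaderObjectInputStream(
        Thread.currentThread().getContextClassLoader(), aos);
    meta = (ValueEncoderMeta) objStream.readObject();
  } catch (ClassNotFoundException | IOException e) {
    throw new RuntimeException(e);
  } finally {
    // finally guarantees the close runs whether or not readObject threw.
    CarbonUtil.closeStreams(objStream);
  }
  return meta;
}
```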
retest this please
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2658/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10918/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2888/
@ravipesala Please review and merge
retest this please
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2680/
Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2908/
Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10939/
LGTM
[CARBONDATA-3300] Fixed ClassNotFoundException when using UDF in spark-shell

Analysis: When spark-shell is run, a Scala interpreter session is started, which is the main thread for that shell. This session uses TranslatingClassLoader, so the UDF that is defined (in the stack trace) is loaded into TranslatingClassLoader. When deserialization happens, an ObjectInputStream is created and the application tries to read the object; ObjectInputStream uses a native method call (sun.misc.VM.latestUserDefinedLoader()) to determine the ClassLoader that will be used to load the class. This native method returns URLClassLoader, which is the parent of the TranslatingClassLoader where the class was loaded. Because of this, ClassNotFoundException is thrown.

Class Loader Hierarchy: ExtClassLoader (head) -> AppClassLoader -> URLClassLoader -> TranslatingClassLoader

This looks like a bug in the Java ObjectInputStream implementation, as suggested by the following post: https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader

Operation   | Thread   | Thread ClassLoader | ClassLoader
Register    | Main     | Translating        | Translating
Serialize   | Main     | Translating        | Translating
Deserialize | Thread-1 | Translating        | URLClassLoader

Solution: Use ClassLoaderObjectInputStream to specify the class loader that should be used to load the class.

This closes #3132
Analysis:
When spark-shell is run, a Scala interpreter session is started, which is the main thread for that shell. This session uses TranslatingClassLoader, so the UDF that is defined ($anonfun$1 in the stack trace) is loaded into TranslatingClassLoader.
When deserialization happens, an ObjectInputStream is created and the application tries to read the object; ObjectInputStream uses a native method call (sun.misc.VM.latestUserDefinedLoader()) to determine the ClassLoader that will be used to load the class. This native method returns URLClassLoader, which is the parent of the TranslatingClassLoader where the class was loaded.
Because of this, ClassNotFoundException is thrown.
Class Loader Hierarchy
ExtClassLoader(head) -> AppClassLoader -> URLClassLoader -> TranslatingClassLoader
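This hierarchy can be checked empirically by walking the parent chain of the current thread's context class loader; a small standalone snippet (the loader class names in the output depend on the JVM and on whether the loop is pasted, in Scala form, into spark-shell):

```java
public class LoaderChain {
  public static void main(String[] args) {
    // Walk the chain child-first. Inside spark-shell the first entry is
    // the REPL's TranslatingClassLoader, and a URLClassLoader appears
    // among its parents.
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    while (cl != null) {
      System.out.println(cl.getClass().getName());
      cl = cl.getParent();
    }
  }
}
```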
This looks like a bug in the Java ObjectInputStream implementation, as suggested by the following post:
https://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader
Solution:
Use ClassLoaderObjectInputStream to specify the class loader that should be used to load the class.
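Assuming the Apache Commons IO ClassLoaderObjectInputStream (the class name matches), the mechanism behind the fix is an override of ObjectInputStream.resolveClass, so that class lookup goes through an explicitly supplied loader instead of the one chosen by latestUserDefinedLoader(). A simplified, hypothetical sketch of the idea:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Simplified illustration; the real Commons IO class has more features.
public class ExplicitLoaderObjectInputStream extends ObjectInputStream {
  private final ClassLoader classLoader;

  public ExplicitLoaderObjectInputStream(ClassLoader classLoader, InputStream in)
      throws IOException {
    super(in);
    this.classLoader = classLoader;
  }

  @Override
  protected Class<?> resolveClass(ObjectStreamClass desc)
      throws IOException, ClassNotFoundException {
    try {
      // Try the supplied loader first (e.g. the spark-shell
      // TranslatingClassLoader that holds the UDF class) ...
      return Class.forName(desc.getName(), false, classLoader);
    } catch (ClassNotFoundException e) {
      // ... then fall back to the default JDK resolution.
      return super.resolveClass(desc);
    }
  }
}
```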
Be sure to complete all items in the following checklist to help us incorporate
your contribution quickly and easily:
Any interfaces changed?
Any backward compatibility impacted?
Document update required?
Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.