
[HUDI-5151] Fix bug with broken flink data skipping caused by ClassNotFoundException of InLineFileSystem#7124

Merged
codope merged 2 commits into apache:master from trushev:classloader
Nov 29, 2022

@trushev trushev commented Nov 3, 2022

Change Logs

Please see the ticket (HUDI-5151) for more details.

This problem was previously addressed by #5194, but that patch does not fix the issue for Flink.

We should use InLineFileSystem.class.getClassLoader() instead of Thread.currentThread().getContextClassLoader(), because the method lookupRecords(keys, fullKey) is called from a ForkJoinPool.commonPool-worker thread, whose context class loader may be one that cannot see InLineFileSystem.
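To illustrate the reasoning above, here is a minimal, self-contained sketch. It is not the code from this patch: ClassLoaderDemo, AppOnlyClass, canLoad and lookupOnForeignThread are hypothetical names, AppOnlyClass stands in for InLineFileSystem, and a plain thread with a deliberately restricted context class loader stands in for the pool worker thread. It only shows why a by-name lookup through the thread's context class loader can fail while the lookup through the class's own defining loader succeeds.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class ClassLoaderDemo {

    /** Hypothetical stand-in for a class that lives only in the application jar. */
    static final class AppOnlyClass {}

    /** Resolves a class by name through the given loader, as a by-name config lookup would. */
    static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /**
     * Runs both lookups on a thread whose context class loader cannot see
     * application classes. Index 0: context class loader; index 1: defining loader.
     */
    static boolean[] lookupOnForeignThread() {
        String name = AppOnlyClass.class.getName();
        boolean[] result = new boolean[2];
        Thread worker = new Thread(() -> {
            // Lookup through whatever loader the current thread happens to carry.
            result[0] = canLoad(name, Thread.currentThread().getContextClassLoader());
            // Lookup through the loader that defined the class itself.
            result[1] = canLoad(name, AppOnlyClass.class.getClassLoader());
        });
        // Assumption for the demo: a loader with no URLs and no parent,
        // so it can resolve JDK bootstrap classes only.
        worker.setContextClassLoader(new URLClassLoader(new URL[0], null));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[] r = lookupOnForeignThread();
        System.out.println("via context class loader:  " + r[0]);
        System.out.println("via defining class loader: " + r[1]);
    }
}
```

The defining loader of a class can always resolve that class by name, regardless of which thread performs the lookup, which is why it is the safer choice when the calling thread is not under the caller's control.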

Impact

Fixes Flink data skipping, which previously failed with the following error:

7799 [main] WARN  org.apache.hudi.source.FileIndex [] - Read column stats for data skipping error
org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Error occurs when executing map
	...
Caused by: java.lang.ClassNotFoundException: Class org.apache.hudi.common.fs.inline.InLineFileSystem not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2329) ~[hadoop-common-2.10.1.jar:?]
	...
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175) ~[?:1.8.0_345]

Risk level: medium

Verified by Flink's unit tests (UT) and integration tests (IT).

Documentation Update

No need

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

trushev commented Nov 3, 2022

@nsivabalan Could you please take a look at this fix?

@trushev trushev changed the title from "[HUDI-5151] Flink data skipping doesn't work with ClassNotFoundException of InLineFileSystem" to "[HUDI-5151] Fix bug with broken flink data skipping caused by ClassNotFoundException of InLineFileSystem" Nov 3, 2022

hudi-bot commented Nov 5, 2022

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@nsivabalan nsivabalan added the priority:critical (Production degraded; pipelines stalled) and metadata labels Nov 7, 2022
@danny0405 danny0405 left a comment

+1, nice catch ~

trushev commented Nov 14, 2022

Please merge this fix.

@codope codope merged commit 88db1ca into apache:master Nov 29, 2022
satishkotha pushed a commit that referenced this pull request Dec 13, 2022
alexeykudinkin pushed a commit to onehouseinc/hudi that referenced this pull request Dec 14, 2022
fengjian428 pushed a commit to fengjian428/hudi that referenced this pull request Apr 5, 2023