Skip to content

[HUDI-4950] Fix read log lead to oom not be catched issue#6827

Closed
TJX2014 wants to merge 1 commit intoapache:masterfrom
TJX2014:HUDI-4950
Closed

[HUDI-4950] Fix read log lead to oom not be catched issue#6827
TJX2014 wants to merge 1 commit intoapache:masterfrom
TJX2014:HUDI-4950

Conversation

@TJX2014
Copy link
Contributor

@TJX2014 TJX2014 commented Sep 29, 2022

Change Logs

Make org.apache.hudi.util.StreamerUtil#getLatestTableSchema can catch oom exception caused by read log/base file to infer schema

Impact

Use HoodieHiveCatalog to infer schema can be consistent with HoodieCatalog, which can catch oom.

Risk level: none | low | medium | high
none

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

Comment on lines -560 to 563
return getTableAvroSchema(metaClient, false);
} catch (Exception e) {
LOG.warn("Error while resolving the latest table schema", e);
} catch (Throwable throwable) {
LOG.warn("Error while resolving the latest table schema.", throwable);
// ignored
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not a good practice to ignore Throwable. why don't you want to fail loud here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to consistent with org.apache.hudi.table.catalog.HoodieCatalog#getLatestTableSchema which catch Throwable rather than Exception, why only ignore Exception, there are many situation that not the subclass of Exception for example oom.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TJX2014 then getLatestTableSchema should be fixed too. We don't catch Throwable because Error should not be caught. Quote from javadoc

An Error is a subclass of Throwable that indicates serious problems that a reasonable application should not try to catch

So what is the strong reason to catch and ignore errors like OOM? You'd need to fail loud in that case.

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@TJX2014 TJX2014 requested a review from xushiyan September 30, 2022 07:49
@nsivabalan nsivabalan added priority:critical Production degraded; pipelines stalled reader-core labels Oct 19, 2022
@xushiyan
Copy link
Member

as mentioned in this comment https://github.com/apache/hudi/pull/6827/files#r985062923 we should make some improvements for places where catch Throwable. I'm closing this now. If further discussion needed, we can continue it here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority:critical Production degraded; pipelines stalled

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants