branch-4.0: [fix](job) fix NPE in routine load Kafka meta request #63180 #63510

Open
github-actions[bot] wants to merge 1 commit into branch-4.0 from auto-pick-63180-branch-4.0
@github-actions
Contributor

Cherry-picked from #63180

### What problem does this PR solve?

Problem Summary:

In a single-BE deployment, Kafka routine load fetches topic metadata
through the only BE. If that BE cannot connect to Kafka, the metadata
request fails and the BE is skipped for the rest of the current retry
loop. The FE may then have no normal candidate backend left, so it falls
back to the backend ids in the routine load blacklist.

The blacklist can contain stale backend ids that no longer exist in
`SystemInfoService`. In that case, `KafkaUtil` may get a null `Backend`
and throw a NullPointerException when calling `be.getHost()`. This hides
the real Kafka metadata error, such as broker connection failure.
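
The masking effect described above can be illustrated with a minimal, self-contained sketch. It does not use the real Doris classes: `MaskedErrorSketch`, `REGISTRY`, `addressOf`, `reportedError`, and the error string are hypothetical stand-ins, and the map lookup only approximates a backend lookup that returns null for an unknown id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; not the actual Doris FE code.
final class MaskedErrorSketch {
    static final class Backend {
        final String host;
        final int port;

        Backend(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Plays the role of the backend registry: unknown ids map to null.
    static final Map<Long, Backend> REGISTRY = new HashMap<>();

    static String addressOf(long backendId) {
        Backend be = REGISTRY.get(backendId); // null for a stale id
        return be.host + ":" + be.port;       // throws NPE when be is null
    }

    // Returns the error text a caller would end up seeing.
    static String reportedError(long staleId) {
        // Assumed message standing in for the real Kafka metadata failure.
        String kafkaError = "failed to get Kafka metadata: broker unreachable";
        try {
            addressOf(staleId);
            return kafkaError;
        } catch (NullPointerException e) {
            // The secondary NPE replaces the root cause in what is reported.
            return "NullPointerException";
        }
    }
}
```

In this sketch, any id missing from `REGISTRY` makes `reportedError` return the NPE name rather than the Kafka failure message, which mirrors how the real error gets hidden.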

This PR filters stale backend ids when reading the routine load
blacklist and adds a final null check before creating the BE address.
The original Kafka metadata error is preserved instead of being replaced
by the secondary NPE.
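
As a rough sketch of the two guards described above (not the code in this PR), the following shows one way to drop stale ids from a blacklist and to fail with the original error when no backend can be resolved. The names `StaleBackendFilterSketch`, `liveBlacklistedIds`, and `resolveAddress`, the plain `Map` used as a registry, and the choice of `IllegalStateException` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real change lives in Doris FE classes not shown here.
final class StaleBackendFilterSketch {
    static final class Backend {
        final String host;
        final int port;

        Backend(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    // Guard 1: keep only blacklist ids that the registry still knows about.
    static List<Long> liveBlacklistedIds(List<Long> blacklist,
                                         Map<Long, Backend> registry) {
        List<Long> live = new ArrayList<>();
        for (Long id : blacklist) {
            if (registry.containsKey(id)) {
                live.add(id);
            }
        }
        return live;
    }

    // Guard 2: a final null check before building the address, so the
    // original error is surfaced instead of a NullPointerException.
    static String resolveAddress(Long id, Map<Long, Backend> registry,
                                 String originalError) {
        Backend be = (id == null) ? null : registry.get(id);
        if (be == null) {
            throw new IllegalStateException(originalError);
        }
        return be.host + ":" + be.port;
    }
}
```

Either guard alone would avoid the NPE in this sketch. Keeping both covers the case where a backend disappears between the filtering step and the address lookup.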

A regression case is added with an invalid `kafka_broker_list` to verify
that routine load reports the expected Kafka metadata error path.