Skip to content

[server] Close StoreIngestionTask when the last partition of a storage engine is dropped by Helix#20

Merged
sixpluszero merged 13 commits intolinkedin:masterfrom
sixpluszero:fixStorageEngineBug
Oct 1, 2022
Merged

[server] Close StoreIngestionTask when the last partition of a storage engine is dropped by Helix#20
sixpluszero merged 13 commits intolinkedin:masterfrom
sixpluszero:fixStorageEngineBug

Conversation

@sixpluszero
Copy link
Copy Markdown
Contributor

@sixpluszero sixpluszero commented Sep 29, 2022

Close StoreIngestionTask when the last partition of a storage engine is dropped by Helix

This pull request fixes a bug in StoreIngestionTask(SIT) logic: When last partition of a storage engine is dropped but StoreIngestionTask is not close, new consumption request towards the same topic will use the removed empty storage engine which is still associated with the current SIT. However, StorageService will create a separated storage engine and create RocksDB partition in the new storage engine. When drainer is processing and persisting its first message, it will use the old storage engine and found no partition instance in there and thus lead to PersistenceFailureException

Now, it will close SIT when Helix is issuing the OFFLINE->DROPPED message towards the last partition of the storage engine.
When there is a new OFFLINE->STANDBY helix request coming in for the same topic, the KafkaStoreIngestionTask will create open new storage engine and create new SIT and associate to it.

How was this PR tested?

Internal CI Test

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

Copy link
Copy Markdown
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot Jialin! Left some minor comments, please take a look.

FelixGV
FelixGV previously approved these changes Sep 29, 2022
Copy link
Copy Markdown
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jialin, LGTM.

Copy link
Copy Markdown
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Jialin, I left a few more comments. Please take a look. We can sync up if needed.

Comment thread clients/da-vinci-client/src/main/java/com/linkedin/davinci/VersionBackend.java Outdated
Copy link
Copy Markdown
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot Jialin! LGTM!

@sixpluszero sixpluszero merged commit c49536d into linkedin:master Oct 1, 2022
@sixpluszero sixpluszero deleted the fixStorageEngineBug branch October 1, 2022 00:52
eolivelli pushed a commit to eolivelli/venice that referenced this pull request May 26, 2023
ZacAttack pushed a commit that referenced this pull request Jun 1, 2023
…ion with Kafka (#452)

* [push-job] Fix VenicePushJob - loading Kafka configuration

* [tests] Fix tests failing due to SASL implementation

* [controller][server] Ensure that Kafka SASL properties are preserved (#10)

* [samza][pulsar-sink] Pass SASL properties to the Producer

* [server][pulsar-sink] More fixes about SASL authentication (#11)

- Fix Venice Server loading Kafka SASL parameters
- Remove JAAS configuration from PulsarSink tests, Kafka is not
configured with SASL
- Add a generic `writerConfig` to tune the PulsarSink

* [tests] Add Kafka SASL integration tests end-to-end (#20)

* Fix build

* Fix build

* Increase code coverage

* fix test
@sixpluszero sixpluszero changed the title [server] Close StoreIngestionTask when last partition is dropped by H… [server] Close StoreIngestionTask when the last partition of a storage engine is dropped by Helix Jun 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants