Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: merge the partition implementation of s3stream into new trunk branch of automq kafka #879

Merged
merged 285 commits into from
Mar 5, 2024

Conversation

daniel-y
Copy link
Contributor

@daniel-y daniel-y commented Mar 5, 2024

More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.

Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

TheR1sing3un and others added 30 commits August 29, 2023 11:05
1. add commit wal object test

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. still commit valid stream in wal object

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. replay WalObjectRecord to advance range's end offset

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
feat(s3): advance range's end offset by WalObjectRecord
1. support new version stream open/close operations

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. pass checkstyle

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
feat(s3): support new version stream open/close operations
1. support get streams' offset range
2. remove useless
CompactObject
related request/response
3. refactor WalObjectRequest to
contain
compacted objects and stream objects
4. increase all stream
related request/response apiKey

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
feat(s3): support get streams' offset range
1. complete wal object commit process in controller

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
feat(s3): complete wal object commit process in controller
1. replay new added state in StreamImage

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
feat(s3): replay new added state in StreamImage
1. boostrap with controller-metadata-manager

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
* fix: fix checkstyle and add ci flow for checkstyle and spotbugs (#106)

ci(s3): add checkstyle and spotbugs in ci

1. add checkstyle and spotbugs in ci
2. fix to pass checkstyle

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* fix: fix fetching problem brought by thread pool separating; fix style problems; add more thread for fetching and appending thread pool (#107)

* fix: fix fetching problem brought by thread pool separating; fix style problems; add more thread for fetching and appending thread pool

Signed-off-by: Curtis Wan <wcy9988@163.com>

* remove 'SLOW_FETCH_TIMEOUT_MILLIS - 1' case in quickFetch test and change 2 -> 'SLOW_FETCH_TIMEOUT_MILLIS / 2' in slowFetch test

Signed-off-by: Curtis Wan <wcy9988@163.com>

* refactor: add more comments to make logic more clear

Signed-off-by: Curtis Wan <wcy9988@163.com>

---------

Signed-off-by: Curtis Wan <wcy9988@163.com>

* refactor: close #105; add more threads for partition opening or closing (#109)

Signed-off-by: Curtis Wan <wcy9988@163.com>

* feat(es): client factory SPI (#112)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

* fix: close #108; make sure topics are all cleaned at the end of the test (#115)

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: close #110; shutdown additional thread pools right now when broker is in shutdown (#111)

* fix: close #110; shutdown additional thread pools right now when broker is in shutdown

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: move thread pools into AlwaysSuccessClient

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: add more logs; fix partition leading or following

Signed-off-by: Curtis Wan <wcy9988@163.com>

---------

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: close #117; fix return too early; add stack for log closing (#118)

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: close #114; use position rather than offset as nextOffset for indexes; close metastream when log is closing (#116)

* wip: add more logs

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: add ExceptionUtil; use position rather than offset as nextOffset for indexes

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: fix style check

Signed-off-by: Curtis Wan <wcy9988@163.com>

---------

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: close #119; skip deleting segments when metaStream is closed (#120)

Signed-off-by: Curtis Wan <wcy9988@163.com>

* feat: close #123; Handle no-retryable exceptions thrown by Elastic Stream SDK (#124)

* feat: close #123; Handle no-retryable exceptions thrown by Elastic Stream SDK. The corresponding partitions will be offline; Add context info to pass unit tests for
cases that fetching just follows appending.

Signed-off-by: Curtis Wan <wcy9988@163.com>

* feat: enhancement codes refered in discussions

Signed-off-by: Curtis Wan <wcy9988@163.com>

* fix: style problem

Signed-off-by: Curtis Wan <wcy9988@163.com>

* test: wait for more time for quick/slow fetch tests

Signed-off-by: Curtis Wan <wcy9988@163.com>

* test: start fetching after appending finished

Signed-off-by: Curtis Wan <wcy9988@163.com>

---------

Signed-off-by: Curtis Wan <wcy9988@163.com>

* test: add slowFetchDelay (#135)

Signed-off-by: Curtis Wan <wcy9988@163.com>

* feat(core): Add implementation of AutoBalancer components and unit tests  (#134)

* feat(core): Add customized MetricsReporter and partition-level metrics

- add partition level metrics for BytesInPerSec and BytesOutPerSec
- implement CruiseControlMetrics to monitor and report interested Yammer
metrics

Closes #78

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* feat(core): AutoBalancerMetricsReporter optimization

- pre-aggregate for broker and partition level metrics
- fill with empty value for probably missing metrics

#78

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* feat(core): Implement load retriever for auto balancer

#77

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* feat(core): Implement AutoBalancerManager and AnomalyDetector

#77

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* fix(core): Fix inconsistent in-flight request check

Closes #121

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* feat(core): Create metrics reporter topic on controller

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* fix(core): fix execute interval of ExecutionManager

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

* fix: add esUnit tag to unit tests of autobalancer

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

---------

Signed-off-by: sc.nieh <s.c.ney516@gmail.com>

---------

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
Signed-off-by: Curtis Wan <wcy9988@163.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: sc.nieh <s.c.ney516@gmail.com>
Co-authored-by: TheR1sing3un <87409330+TheR1sing3un@users.noreply.github.com>
Co-authored-by: Robin Han <hanxvdovehx@gmail.com>
Co-authored-by: Shichao Nie <s.c.ney516@gmail.com>
1. support controller-kv client
2. change some logs' level from info to
trace

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. replace object-id with order-id for WAL

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. return first object id in preparedObjectResponse

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
…ager` (#64)

* refactor(s3): remove inflight wal objects

1. remove inflight wal objects
2. delete redundant classes

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* feat(s3): support blocking `getObjects` in `StreamMetadataManager`

1. support blocking `getObjects` in `StreamMetadataManager`
2. refactor
`getObjects`
3. change the unit of `s3.cache.size` from `MB`
to `B`

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* feat(s3): more suitable log level

1. more suitable log level

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* fix(s3): add more concurrent protection

1. add more concurrent protection

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

---------

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. handle stream objects commit in controller

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. handle invalid commit request

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
* feat: rename GetStreamsOffset to GetOpeningStreams

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

* feat: get broker opening streams

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

---------

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
* feat(s3): unify S3 object metadata

1. unify S3 object metadata

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* fix(s3): fix after rebasing

1. fix after rebasing

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

* feat(s3): log stream-objects' info when commit wal object

1. log stream-objects' info when commit wal object

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>

---------

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
1. support retry to-controller request

Signed-off-by: TheR1sing3un <ther1sing3un@163.com>
* feat: get broker opening streams

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

* feat: recover from crash (WAL buggy version)

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>

---------

Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
* fix(s3): fixed bugs on uploading WAL object during compaction

- eliminate race condition on uploading WAL object by using sequential
writing
- prevent blocking on multipart upload when part size is less
than MIN_PART_SIZE

Signed-off-by: Shichao Nie <niesc@automq.com>

* fix(s3): used pooled buffer in DataBlockWriter; add newFixedThreadPool to Threads

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
SCNieh and others added 21 commits February 26, 2024 18:12
…er (#843)

Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
* feat(metrics): metrics on fetch limiters

Signed-off-by: Ning Yu <ningyu@automq.com>

* feat(metrics): metrics on fetch executors' queue size

Signed-off-by: Ning Yu <ningyu@automq.com>

* style: fix check style

Signed-off-by: Ning Yu <ningyu@automq.com>

---------

Signed-off-by: Ning Yu <ningyu@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
- deduplication for consumer lag calculation on partition reassignment
- add get/put object avg latency panels

Signed-off-by: Shichao Nie <niesc@automq.com>
* fix(telemetry): add missing percentile metrics

Signed-off-by: Shichao Nie <niesc@automq.com>

* fix(telemetry): increase metric expiration time for collector

Signed-off-by: Shichao Nie <niesc@automq.com>

---------

Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Robin Han <hanxvdovehx@gmail.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
…eporter interval (#875)

Signed-off-by: Shichao Nie <niesc@automq.com>
Signed-off-by: Shichao Nie <niesc@automq.com>
@CLAassistant
Copy link

CLAassistant commented Mar 5, 2024

CLA assistant check
All committers have signed the CLA.

@superhx superhx merged commit bcfd816 into merge_trunk Mar 5, 2024
1 check passed
@superhx superhx deleted the s3stream-to-trunk branch March 5, 2024 04:51
@daniel-y daniel-y restored the s3stream-to-trunk branch March 14, 2024 08:50
daniel-y pushed a commit that referenced this pull request Mar 14, 2024
…t exist (#879)

Signed-off-by: SSpirits <admin@lv5.moe>
ShadowySpirits added a commit that referenced this pull request Mar 14, 2024
…t exist (#879)

Signed-off-by: SSpirits <admin@lv5.moe>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants