
DAOS-18607 object: misc fixes to improve server side congestion #18032

Merged: gnailzenh merged 1 commit into master from Nasf-Fan/DAOS-18607_3 on Apr 28, 2026

Conversation

@Nasf-Fan (Contributor) commented Apr 17, 2026

Include the following improvements:

  1. Increase the RPC retry delay to reduce the server load caused by resent I/O requests.

  2. Send DTX RPCs from the current ULT instead of via helper::dss_chore, to avoid being blocked
    when the helper xstream is too busy.
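
To illustrate the first item, here is a minimal sketch of a capped, doubling retry delay. It is not taken from the patch: the function name `retry_delay_us` and both constants are hypothetical, and the actual delay values changed by this PR are not shown in the conversation. The point is only that a longer delay between resends means fewer resent requests reach a congested server per unit of time.

```c
#include <stdint.h>

/* Hypothetical values for illustration; the real ones are not shown in this PR. */
#define RETRY_BASE_US 1000u    /* delay before the first retry: 1 ms */
#define RETRY_MAX_US  1024000u /* upper bound on any single delay */

/*
 * Return the delay (in microseconds) to wait before retry number 'attempt'
 * (0-based). The delay doubles on each attempt and is capped at RETRY_MAX_US.
 */
static uint32_t
retry_delay_us(unsigned int attempt)
{
	uint64_t delay = RETRY_BASE_US;

	/* Stop doubling once the cap is reached, so large attempts cannot overflow. */
	while (attempt-- > 0 && delay < RETRY_MAX_US)
		delay *= 2;

	return delay > RETRY_MAX_US ? RETRY_MAX_US : (uint32_t)delay;
}
```

A caller would sleep or reschedule for `retry_delay_us(n)` microseconds before resending the n-th time; raising `RETRY_BASE_US` is the knob that corresponds most directly to "increase the retry delay".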

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@github-actions

Errors are Unable to load ticket data
https://daosio.atlassian.net/browse/DAOS-18607

@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18032/1/testReport/

@Nasf-Fan
Contributor Author

test_dfuse_daos_build_wb failed because of DAOS-18813, which is unrelated to this patch.

Comment thread src/dtx/dtx_rpc.c
* Direct use current ULT instead of dss_chore to send DTX RPC to avoid being
* blocked by some slow dss_chore users. DAOS-18607.
*/
if (0 && dss_has_enough_helper()) {
Contributor

Not sure if it is possible, but could we expose a new helper such as dss_core_is_busy() and do something like the following, rather than disabling dss_core entirely?

if (dss_has_enough_helper() && !dss_core_is_busy())

Contributor Author

> Not sure if it is possible, but could we expose a new helper such as dss_core_is_busy() and do something like the following, rather than disabling dss_core entirely?
>
> if (dss_has_enough_helper() && !dss_core_is_busy())

In theory, that is possible. But the difficulty is how to define and measure the busy status.
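
As one possible answer to the "how to define and measure" question, the sketch below treats "busy" as a smoothed queue depth exceeding a threshold. Everything in it is an assumption made for illustration: `struct helper_load`, `helper_load_sample()`, `helper_is_busy()` and the threshold are hypothetical names and values, not DAOS APIs, and the patch itself does not implement anything like this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: average pending chores above which the helper counts as busy. */
#define HELPER_BUSY_DEPTH 64u

/* Hypothetical per-helper load tracker (not a DAOS structure). */
struct helper_load {
	uint32_t hl_ewma_x8; /* smoothed queue depth, fixed point, scaled by 8 */
};

/*
 * Fold one queue-depth sample into an exponentially weighted moving average:
 * ewma = ewma * 7/8 + depth * 1/8, stored scaled by 8 to stay in integers.
 */
static void
helper_load_sample(struct helper_load *hl, uint32_t queue_depth)
{
	hl->hl_ewma_x8 = hl->hl_ewma_x8 - hl->hl_ewma_x8 / 8 + queue_depth;
}

/* Report busy when the smoothed depth exceeds the threshold. */
static bool
helper_is_busy(const struct helper_load *hl)
{
	return hl->hl_ewma_x8 / 8 > HELPER_BUSY_DEPTH;
}
```

Smoothing means the average rises gradually under sustained load rather than tracking every sample, but a single sample far above the threshold can still flip the result on its own. Choosing the threshold and the sampling point remains the hard part that the reply above refers to.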

@gnailzenh gnailzenh marked this pull request as ready for review April 20, 2026 10:22
@gnailzenh gnailzenh requested review from a team as code owners April 20, 2026 10:22
@Nasf-Fan Nasf-Fan requested review from gnailzenh and mchaarawi April 20, 2026 16:22
@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-18607_3 branch 3 times, most recently from d5d0ba5 to 3e0806f Compare April 23, 2026 16:06
Include the following improvements:

1. Increase RPC retry latency to reduce server load that is caused by
   resent IO requests.

2. Use current ULT to send DTX RPC instead of via dss_chore to avoid
   being blocked by some slow dss_chore users.

Signed-off-by: Fan Yong <fan.yong@hpe.com>
@Nasf-Fan Nasf-Fan force-pushed the Nasf-Fan/DAOS-18607_3 branch from 3e0806f to 0337f5e Compare April 24, 2026 02:09
@daosbuild3
Collaborator

Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18032/6/testReport/

@Nasf-Fan
Contributor Author

  • continuous-integration/jenkins/pr-head

test_dfuse_daos_build_wb still failed because of DAOS-18813, which is unrelated to this patch.

@gnailzenh gnailzenh merged commit 0466eba into master Apr 28, 2026
37 of 39 checks passed
@gnailzenh gnailzenh deleted the Nasf-Fan/DAOS-18607_3 branch April 28, 2026 05:02
4 participants