feat(exec): Add --no-session flag for improved performance #26727

Merged
openshift-merge-bot[bot] merged 1 commit into containers:main from ryanmccann1024:feature/26588-exec-no-session
Nov 20, 2025

@ryanmccann1024
Contributor

@ryanmccann1024 ryanmccann1024 commented Jul 31, 2025

Fixes: #26588

For use cases like HPC, where podman exec is called in rapid succession, the standard exec process can become a bottleneck due to container locking and database I/O for session tracking.

This commit introduces a new --no-session flag to podman exec. When used, this flag invokes a new, lightweight backend implementation (ExecNoSession) that:

  • Skips container locking, reducing lock contention.
  • Bypasses the creation, tracking, and removal of exec sessions in the database.
  • Executes the command directly and retrieves the exit code without persisting session state.
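To illustrate the difference described in the list above, here is a minimal toy model in Go. It is not libpod code: the `container` type, its mutex, the in-memory `sessions` map (standing in for the database), and the `run` helper are all hypothetical names invented for this sketch. It only shows why a path that skips locking and bookkeeping has less to contend on.

```go
package main

import (
	"fmt"
	"sync"
)

// container is a toy stand-in for a container object; it is NOT libpod's type.
type container struct {
	mu       sync.Mutex        // models the per-container lock
	sessions map[string]string // models exec sessions persisted in the database
	nextID   int               // counter used to build session IDs
}

// run stands in for handing the command to the OCI runtime. Here it only
// maps a command name to an exit code.
func run(cmd string) int {
	if cmd == "true" {
		return 0
	}
	return 1
}

// execWithSession takes the lock to record a session, runs the command,
// then takes the lock again to remove the session record.
func (c *container) execWithSession(cmd string) int {
	c.mu.Lock()
	c.nextID++
	id := fmt.Sprintf("session-%d", c.nextID)
	c.sessions[id] = cmd
	c.mu.Unlock()

	code := run(cmd)

	c.mu.Lock()
	delete(c.sessions, id)
	c.mu.Unlock()
	return code
}

// execNoSession runs the command directly: no lock and no bookkeeping.
func (c *container) execNoSession(cmd string) int {
	return run(cmd)
}

func main() {
	c := &container{sessions: map[string]string{}}

	// Many concurrent sessionless execs never touch the lock or the map.
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.execNoSession("true")
		}()
	}
	wg.Wait()

	fmt.Println(c.execWithSession("true"), c.execNoSession("false"), len(c.sessions))
}
```

In this model the sessionless path gives up exactly what the session record provides: nothing is left behind to inspect or clean up after the command exits.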

Does this PR introduce a user-facing change?

Added a new `--no-session` flag to `podman exec` to provide a performance-optimized execution path that bypasses container locking and database session tracking. This is ideal for high-concurrency environments like HPC where exec session tracking is not required.

@openshift-ci
Contributor

openshift-ci bot commented Jul 31, 2025

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

@openshift-ci openshift-ci bot added the do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None label Jul 31, 2025
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from c03d7e3 to 5263c85 Compare July 31, 2025 20:21
@packit-as-a-service

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch 3 times, most recently from d63e18d to f7d110b Compare August 1, 2025 01:41
@ryanmccann1024
Contributor Author

Hello @mheon,

I'm not really sure why some of the pipelines fail; I'm stuck.

@mheon
Member

mheon commented Aug 1, 2025

For the integration tests, you need a SkipIfRemote in your tests. The system test failures are both flakes.

@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from f7d110b to e637650 Compare August 1, 2025 13:39
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from e637650 to 1853aea Compare August 1, 2025 18:43
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch 2 times, most recently from f5cfc13 to cd2b89f Compare August 6, 2025 15:53
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from cd2b89f to 82dcb63 Compare August 6, 2025 22:32
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from 82dcb63 to 1a91259 Compare August 7, 2025 15:18
Member

@Honny1 Honny1 left a comment

Good job, LGTM

The failed Healthcheck seems to be a flake. I tested it locally and the Healthcheck passed.

@ryanmccann1024 ryanmccann1024 requested a review from mheon August 12, 2025 15:52
@mheon
Member

mheon commented Aug 13, 2025

LGTM

@ryanmccann1024
Contributor Author

I think there's still a pending review @mheon

Or I'm not sure if I should do something else before that.

@mheon
Member

mheon commented Aug 15, 2025

@containers/podman-maintainers PTAL

return define.TranslateExecErrorToExitCode(ec, err), err
}

func (ic *ContainerEngine) ContainerExecNoSession(ctx context.Context, nameOrID string, options entities.ExecOptions, streams define.AttachStreams) (int, error) {
Member

Is there a reason to define a new function on the interface like this? It could easily be added in ContainerExec().

In particular, this function simply drops several required things that ContainerExec() does.

First, it never calls getContainers(), which means --latest will be broken.

Second, it misses the if options.Tty branch that adds the TERM env, so TERM behaves differently in session and no-session mode, which seems very unexpected.

Lastly, it also doesn't do the tty resize logic that is in ExecAttachCtr(), so I think the terminal is not set into the right state.

Member

I'd prefer to keep this separate - I'm expecting further changes down the line to diverge from normal Exec()

Member

Different how? We can split in libpod, but doing this here on the cli/ContainerEngine just seems to bypass basic code that we must always run, as I pointed out above.

We can always do if execNoSession inside ContainerExec() once we have looked up the container and configured the basic exec config.

What further changes are expected here where this function makes sense?

Member

Complete removal of the Conmon backend in favor of calling OCI runtime exec directly.

Member

Sure, but that can still happen within ContainerExec(). Right now we duplicate common lookup logic, which I find quite bad due to the bugs mentioned; at the end of ContainerExec() a simple if options.NoSession would be easier IMO.
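The structure suggested in this thread (do the shared lookup and TTY setup once, then branch on the flag at the end) could look roughly like the sketch below. It is a self-contained toy, not Podman's code: `execOptions`, `execPlan`, `lookupContainer`, and `planExec` are hypothetical names, and filling a missing TERM from the host value is an assumed policy chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// execOptions is a hypothetical, trimmed-down options struct.
type execOptions struct {
	Latest    bool
	Tty       bool
	NoSession bool
	Envs      map[string]string
}

// execPlan records what the shared preparation step decided.
type execPlan struct {
	Container   string
	Envs        map[string]string
	Sessionless bool
}

// lookupContainer stands in for the shared lookup step, including --latest.
func lookupContainer(nameOrID string, latest bool) (string, error) {
	if latest {
		return "latest-container", nil
	}
	if nameOrID == "" {
		return "", errors.New("no container name or ID given")
	}
	return nameOrID, nil
}

// planExec does the common preparation once; the only difference between
// the two modes is the Sessionless flag on the returned plan.
func planExec(nameOrID string, opts execOptions, hostTerm string) (execPlan, error) {
	ctr, err := lookupContainer(nameOrID, opts.Latest)
	if err != nil {
		return execPlan{}, err
	}

	// Copy the caller's environment so the input map is never mutated.
	envs := make(map[string]string, len(opts.Envs)+1)
	for k, v := range opts.Envs {
		envs[k] = v
	}
	// Assumed policy: with a TTY, fill in TERM when the caller did not set it.
	if opts.Tty {
		if _, ok := envs["TERM"]; !ok {
			envs["TERM"] = hostTerm
		}
	}

	return execPlan{Container: ctr, Envs: envs, Sessionless: opts.NoSession}, nil
}

func main() {
	for _, noSession := range []bool{false, true} {
		opts := execOptions{Latest: true, Tty: true, NoSession: noSession}
		plan, err := planExec("", opts, "xterm")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%+v\n", plan)
	}
}
```

Because both modes pass through the same preparation, `--latest` resolution and TERM handling cannot drift apart between them, which is the maintenance concern raised above.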

Contributor Author

Was there something I should do related to this?

I'm a little unsure, will resolve it in the meantime!

@giuseppe
Member

Do we have any numbers on what the improvement is?

Can you run it under hyperfine and check what the difference is compared with a regular exec?

@giuseppe
Member

> Do we have any numbers on what the improvement is?
>
> Can you run it under hyperfine and check what the difference is compared with a regular exec?

I did the test on my machine and the results are very good:

➜ hyperfine 'bin/podman exec foo true' 'bin/podman exec --no-session foo true'
Benchmark 1: bin/podman exec foo true
  Time (mean ± σ):      80.2 ms ±   2.9 ms    [User: 22.6 ms, System: 14.6 ms]
  Range (min … max):    74.9 ms …  86.5 ms    34 runs

Benchmark 2: bin/podman exec --no-session foo true
  Time (mean ± σ):      29.9 ms ±   6.8 ms    [User: 20.1 ms, System: 11.9 ms]
  Range (min … max):    22.3 ms …  54.1 ms    97 runs

Summary
  bin/podman exec --no-session foo true ran
    2.68 ± 0.62 times faster than bin/podman exec foo true

@Luap99
Member

Luap99 commented Sep 9, 2025

> Got it!
>
> I think I'm all set to keep working on this although I'm not fully clear on the last suggestion made by @Luap99 (I'm a newbie).

There is this in makeExecConfig(), which sets an exit command on conmon that we don't want.

	// TODO: Add some ability to toggle syslog
	exitCommandArgs, err := specgenutil.CreateExitCommandArgs(storageConfig, runtimeConfig, logrus.IsLevelEnabled(logrus.DebugLevel), false, false, true)
	if err != nil {
		return nil, fmt.Errorf("constructing exit command for exec session: %w", err)
	}
	execConfig.ExitCommand = exitCommandArgs

So in theory all you need to do is make sure the exit command is nil for your exec config; then it should work, I think.
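A minimal sketch of that idea follows: build the exec config as usual, but leave the exit command unset when no session is tracked. The `execConfig` struct and the `noSession` parameter here are hypothetical stand-ins for illustration, and the exit command arguments are placeholders rather than real output of CreateExitCommandArgs.

```go
package main

import "fmt"

// execConfig is a hypothetical stand-in for the exec configuration; only
// the fields needed for this illustration are modeled.
type execConfig struct {
	Command     []string
	ExitCommand []string
}

// makeExecConfig attaches an exit command only when a session is tracked.
// With noSession set, ExitCommand stays nil.
func makeExecConfig(cmd []string, noSession bool) *execConfig {
	cfg := &execConfig{Command: cmd}
	if !noSession {
		// Placeholder values; the real arguments are built elsewhere.
		cfg.ExitCommand = []string{"<exit-command>", "<args>"}
	}
	return cfg
}

func main() {
	withSession := makeExecConfig([]string{"true"}, false)
	sessionless := makeExecConfig([]string{"true"}, true)
	fmt.Println(withSession.ExitCommand != nil, sessionless.ExitCommand == nil)
}
```

Keeping the check inside the config builder means callers do not need to remember to clear the field afterwards.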

@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from 1a91259 to bb3eaf0 Compare September 10, 2025 22:38
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 10, 2025
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from bb3eaf0 to bb967f2 Compare September 10, 2025 22:43
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 10, 2025
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from bb967f2 to 0d8b404 Compare September 10, 2025 22:45
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from 0d8b404 to d15725c Compare September 11, 2025 14:06
Expect(execResult).Should(ExitWithError(127, "OCI runtime attempted to invoke a command that was not found"))

execSession := podmanTest.Podman([]string{"exec", "--no-session", ctrName, "sleep", "30"})
killSession := podmanTest.Podman([]string{"exec", ctrName, "sh", "-c", "kill -9 $(pgrep sleep)"})
Member

I think the problem here is that this is running serially. Maybe put the execSession bit in a goroutine so they run concurrently? You'd need a sleep to sequence them, though, to give the first exec time to start (our CI is really slow).

@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch 2 times, most recently from d718ce4 to 2cb0bce Compare October 5, 2025 17:05
@ryanmccann1024
Contributor Author

Any suggestions on how to fix the failure on that pipeline due to the merge?

I think there's also a linting error from upstream?

@Honny1
Member

Honny1 commented Oct 6, 2025

Linting error fix: #27234

@Honny1
Member

Honny1 commented Oct 16, 2025

@ryanmccann1024 Could you please rebase on main?

@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from 2cb0bce to 90057c1 Compare November 10, 2025 21:52
@openshift-ci
Contributor

openshift-ci bot commented Nov 10, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Honny1, ryanmccann1024

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2025
@ryanmccann1024
Contributor Author

Hello,

Should I do anything else for this PR?

Member

@Honny1 Honny1 left a comment

Code LGTM. Please just rebase onto the latest main branch, as there were changes to the linter. This will verify that the new linter rules are satisfied.

cc @Luap99

Fixes: containers#26588

For use cases like HPC, where `podman exec` is called in rapid succession, the standard exec process can become a bottleneck due to container locking and database I/O for session tracking.

This commit introduces a new `--no-session` flag to `podman exec`. When used, this flag invokes a new, lightweight backend implementation that:

- Skips container locking, reducing lock contention
- Bypasses the creation, tracking, and removal of exec sessions in the database
- Executes the command directly and retrieves the exit code without persisting session state
- Maintains consistency with regular exec for container lookup, TTY handling, and environment setup
- Shares implementation with health check execution to avoid code duplication

The implementation addresses all performance bottlenecks while preserving compatibility with existing exec functionality including --latest flag support and proper exit code handling.

Changes include:
- Add --no-session flag to cmd/podman/containers/exec.go
- Implement lightweight execution path in libpod/container_exec.go
- Ensure consistent container validation and environment setup
- Add comprehensive exit code testing including signal handling (exit 137)
- Optimize configuration to skip unnecessary exit command setup

Signed-off-by: Ryan McCann <ryan_mccann@student.uml.edu>
Signed-off-by: ryanmccann1024 <ryan_mccann@student.uml.edu>
@ryanmccann1024 ryanmccann1024 force-pushed the feature/26588-exec-no-session branch from 90057c1 to 61cbc0c Compare November 19, 2025 17:45
@mheon
Member

mheon commented Nov 19, 2025

Restarted 3 flakes

@mheon
Member

mheon commented Nov 19, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 19, 2025
@mheon mheon removed the do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None label Nov 19, 2025
@mheon
Member

mheon commented Nov 19, 2025

I don't know why it's not picking up the release note. Removed the label.

Member

@Luap99 Luap99 left a comment

Personally, I am still not happy with the duplication between ContainerExecNoSession() and ContainerExec(). People will forget to update them in sync if new options get added, and it seems really unnecessary to me to define more of these "engine" interface functions when exactly one call differs.

But since others want to merge this, I won't object further.

@openshift-merge-bot openshift-merge-bot bot merged commit 7cd9b81 into containers:main Nov 20, 2025
79 checks passed

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.

Development

Successfully merging this pull request may close these issues.

Sessionless Exec

6 participants