Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cephadm: ISCSI: Allow unlimited number of threads #42214

Closed

Conversation

sebastian-philipp
Copy link
Contributor

Otherwise we're not able to create max luns per target.

Signed-off-by: Sebastian Wagner sewagner@redhat.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@sebastian-philipp sebastian-philipp requested a review from a team as a code owner July 7, 2021 11:43
@@ -2498,6 +2498,8 @@ def get_container(ctx: CephadmContext,
# So the container can modprobe iscsi_target_mod and have write perms
# to configfs we need to make this a privileged container.
privileged = True
# --pids-limit 0 to allow more than 2048 threads
container_args.append('--pids-limit=0')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are consciously ignoring TasksMax=infinity from systemd/ceph*@.service.in templates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately the cephadm binary has not yet access to the templates there. That depends on #41855 . See

ceph/src/cephadm/cephadm

Lines 3094 to 3146 in f0b79b3

def get_unit_file(ctx, fsid):
# type: (CephadmContext, str) -> str
extra_args = ''
if isinstance(ctx.container_engine, Podman):
extra_args = ('ExecStartPre=-/bin/rm -f %t/%n-pid %t/%n-cid\n'
'ExecStopPost=-/bin/rm -f %t/%n-pid %t/%n-cid\n'
'Type=forking\n'
'PIDFile=%t/%n-pid\n')
if ctx.container_engine.version >= CGROUPS_SPLIT_PODMAN_VERSION:
extra_args += 'Delegate=yes\n'
docker = isinstance(ctx.container_engine, Docker)
u = """# generated by cephadm
[Unit]
Description=Ceph %i for {fsid}
# According to:
# http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
# these can be removed once ceph-mon will dynamically change network
# configuration.
After=network-online.target local-fs.target time-sync.target{docker_after}
Wants=network-online.target local-fs.target time-sync.target
{docker_requires}
PartOf=ceph-{fsid}.target
Before=ceph-{fsid}.target
[Service]
LimitNOFILE=1048576
LimitNPROC=1048576
EnvironmentFile=-/etc/environment
ExecStart=/bin/bash {data_dir}/{fsid}/%i/unit.run
ExecStop=-{container_path} stop ceph-{fsid}-%i
ExecStopPost=-/bin/bash {data_dir}/{fsid}/%i/unit.poststop
KillMode=none
Restart=on-failure
RestartSec=10s
TimeoutStartSec=120
TimeoutStopSec=120
StartLimitInterval=30min
StartLimitBurst=5
{extra_args}
[Install]
WantedBy=ceph-{fsid}.target
""".format(container_path=ctx.container_engine.path,
fsid=fsid,
data_dir=ctx.data_dir,
extra_args=extra_args,
# if docker, we depend on docker.service
docker_after=' docker.service' if docker else '',
docker_requires='Requires=docker.service\n' if docker else '')
return u

for the current template.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since all those templates override TasksMax, perhaps TasksMax=infinity could be added to this template in the interim?

Copy link
Contributor

@idryomov idryomov Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering because --pids-limit seems to be treated differently between podman and docker:

$ podman run --help | grep pids-limit
      --pids-limit int                           Tune container pids limit (set 0 for unlimited, -1 for server defaults) (default 2048)

$ docker run --help | grep pids-limit
      --pids-limit int                 Tune container pids limit (set -1 for unlimited)

And also because there are other settings in those templates that we are missing in cephadm deployments. For example LimitNOFILE=1048576?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LimitNOFILE etc is there already:

ceph/src/cephadm/cephadm

Lines 3122 to 3123 in f0b79b3

LimitNOFILE=1048576
LimitNPROC=1048576

So this must come from somewhere else. But you're right, we should probably see, if we can adopt other settings as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the shared section. Gihub quotes can be deceiving...

So why TasksMax=infinity isn't there in the shared section? Is there a particular reason it was omitted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed. Does this work for you? Yes, we need to get the service files in sync definitely.

Copy link
Contributor

@dsavineau dsavineau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't fix the tcmu-runner container TaskMax value as the container isn't managed by systemd.

@sebastian-philipp
Copy link
Contributor Author

This doesn't fix the tcmu-runner container TaskMax value as the container isn't managed by systemd.

By any chance, do you know what needs to be done to fix this?

@idryomov
Copy link
Contributor

idryomov commented Jul 8, 2021

I suspect my initial idea of passing --pids-limit would work since that would go directly to podman/libpod and not through systemd but that is definitely not a proper fix.

@sebastian-philipp
Copy link
Contributor Author

I think we have to do both:

  • TasksMax in order to be independent of any systemd defaults for all other containers
  • --pids-limit for the tcmu-runner

@dsavineau
Copy link
Contributor

I think we have to do both:

  • TasksMax in order to be independent of any systemd defaults for all other containers
  • --pids-limit for the tcmu-runner

It looks like using TasksMax=infinity in the systemd unit doesn't have an impact on the ceph processes running the container.

Only using --pids-limit is the right solution

Limits (like e.g. TaskMax) of the tcmu-container are not managed by systemd, as the
container is executed in the background. Thus we have to set
again here.

Signed-off-by: Sebastian Wagner <sewagner@redhat.com>
@sebastian-philipp
Copy link
Contributor Author

@dsavineau + @idryomov : in pacific, with the exception of the tcmu-runner, the cgroups are indeed shared between the systemd unit and the containers. Haven't yet landed in downstream, but in upstream this fix should be enough.

@sebastian-philipp sebastian-philipp added needs-qa wip-swagner-testing My Teuthology tests and removed wip-swagner-testing My Teuthology tests labels Jul 14, 2021
Copy link
Contributor

@idryomov idryomov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @lxbsz has indicated that we want unlimited threads for both iscsi containers (i.e. non tcmu-runner container too), so this workaround appears incomplete to me.

@adk3798
Copy link
Contributor

adk3798 commented Jul 15, 2021

it seems like more changes might be wanted here but at least the current version of this is passing

http://pulpito.front.sepia.ceph.com/adking-2021-07-14_17:47:06-rados:cephadm-wip-adk-testing-2021-07-14-1117-distro-basic-smithi/

@lxbsz
Copy link
Member

lxbsz commented Jul 16, 2021

I think @lxbsz has indicated that we want unlimited threads for both iscsi containers (i.e. non tcmu-runner container too), so this workaround appears incomplete to me.

Yeah, we need to fix this in both containers, or we will hit the same issue sooner or later. In that BZ I hit the threads limitation issue in the tcmu-runner container and Gopi hit it in the ceph-iscsi container instead.

@github-actions
Copy link

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@stale
Copy link

stale bot commented Jan 9, 2022

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@idryomov
Copy link
Contributor

This PR is replaced by #44579.

@idryomov idryomov closed this Jan 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants