Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/cephadm: validating service_id for MDS #44928

Merged
merged 1 commit into from Feb 23, 2022

Conversation

rkachach
Copy link
Contributor

@rkachach rkachach commented Feb 7, 2022

Fixes: https://tracker.ceph.com/issues/54184

Signed-off-by: Redouane Kachach rkachach@redhat.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@@ -1074,6 +1074,9 @@ def _apply_mds(self,
placement=PlacementSpec.from_string(placement),
unmanaged=unmanaged,
preview_only=dry_run)

spec.validate() # force any validation exceptions to be caught correctly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validate() method checks the spec object and validate its data to detect any error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it actually ends up reaching the validate anyway even without the validate call. I commented this line out and got

[ceph: root@vm-00 /]# ceph orch apply mds 5 
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1701, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 433, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 1080, in _apply_mds
    return self._apply_misc([spec], dry_run, format, no_overwrite)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 1046, in _apply_misc
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 228, in raise_if_exception
    raise e
Exception: MDS service id cannot start with a numeric digit

which, in the cephadm mgr module actually went through

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
    return OrchResult(f(*args, **kwargs))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2504, in apply
    results.append(self._apply(spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2377, in _apply
    return self._apply_service_spec(cast(ServiceSpec, spec))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2482, in _apply_service_spec
    allow_colo=self.cephadm_services[spec.service_type].allow_colo(),
  File "/usr/share/ceph/mgr/cephadm/schedule.py", line 174, in validate
    self.spec.validate()
  File "/lib/python3.6/site-packages/ceph/deployment/service_spec.py", line 684, in validate
    raise SpecValidationError('MDS service id cannot start with a numeric digit')
ceph.deployment.hostspec.SpecValidationError: MDS service id cannot start with a numeric digit

so I think we don't actually need this call

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are already doing the same (validating the spec ASAP) in for the SNMPGatewaySpec:

        spec = SNMPGatewaySpec(
            snmp_version=snmp_version,
            port=port,
            credentials=credentials,
            snmp_destination=destination,
            engine_id=engine_id,
            auth_protocol=auth_protocol,
            privacy_protocol=privacy_protocol,
            placement=PlacementSpec.from_string(placement),
            unmanaged=unmanaged,
            preview_only=dry_run
        )

       spec.validate()  # force any validation exceptions to be caught correctly

In my opinion data (in general) have to be validated ASAP. In fact in the above trace we make several calls, in several classes to end up calling the validation (to detect that the spec is invalid). Something that we could have done earlier. In think both for code readability (and for analyzing errors in the future if any) it's better to do the validation ASAP so in this case I'd prefer to keep this explicit call to .validate() method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in standup, we're okay with having this check called directly, but maybe we should actually add this to other apply commands as well to have the functionality be uniform

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added validate() checks for the other services

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see also: #44995

src/python-common/ceph/deployment/service_spec.py Outdated Show resolved Hide resolved
@@ -1115,6 +1114,8 @@ def _apply_rgw(self,
preview_only=dry_run
)

spec.validate() # force any validation exceptions to be caught correctly
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor but do you mean "early" or "immediately" rather than "correctly" here?

Copy link
Contributor Author

@rkachach rkachach Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this case is immediatly since the exception will be raised right away

service_id: Optional[str] = None,
placement: Optional[PlacementSpec] = None,
unmanaged: bool = False,
preview_only: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgfritch do you think there are any other parameters we could maybe need for mds or is this enough

@rkachach
Copy link
Contributor Author

jenkins retest this please

src/pybind/mgr/mds_autoscaler/module.py Outdated Show resolved Hide resolved
Fixes:  https://tracker.ceph.com/issues/54184

Signed-off-by: Redouane Kachach <rkachach@redhat.com>
@adk3798 adk3798 merged commit 5e8c4dc into ceph:master Feb 23, 2022
12 checks passed
rkachach added a commit to rkachach/ceph that referenced this pull request Mar 2, 2022
(cherry picked from commit db765bd)
rkachach added a commit to rkachach/ceph that referenced this pull request Mar 2, 2022
(cherry picked from commit db765bd)
rkachach added a commit to rkachach/ceph that referenced this pull request Mar 2, 2022
rkachach added a commit to rkachach/ceph that referenced this pull request Mar 2, 2022
line 2
Backport of ceph#44928
rkachach added a commit to rkachach/ceph that referenced this pull request Mar 2, 2022
@rkachach rkachach deleted the fix_issue_54184 branch March 10, 2022 11:04
@adk3798 adk3798 mentioned this pull request Apr 27, 2022
14 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants