New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mgr/cephadm: validating service_id for MDS #44928
Conversation
| @@ -1074,6 +1074,9 @@ def _apply_mds(self, | |||
| placement=PlacementSpec.from_string(placement), | |||
| unmanaged=unmanaged, | |||
| preview_only=dry_run) | |||
|
|
|||
| spec.validate() # force any validation exceptions to be caught correctly | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validate() method checks the spec object and validate its data to detect any error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it actually ends up reaching the validate anyway even without the validate call. I commented this line out and got
[ceph: root@vm-00 /]# ceph orch apply mds 5
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1701, in _handle_command
return self.handle_command(inbuf, cmd)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
File "/usr/share/ceph/mgr/mgr_module.py", line 433, in call
return self.func(mgr, **kwargs)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs) # noqa: E731
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
return func(*args, **kwargs)
File "/usr/share/ceph/mgr/orchestrator/module.py", line 1080, in _apply_mds
return self._apply_misc([spec], dry_run, format, no_overwrite)
File "/usr/share/ceph/mgr/orchestrator/module.py", line 1046, in _apply_misc
raise_if_exception(completion)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 228, in raise_if_exception
raise e
Exception: MDS service id cannot start with a numeric digit
which, in the cephadm mgr module actually went through
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
return OrchResult(f(*args, **kwargs))
File "/usr/share/ceph/mgr/cephadm/module.py", line 2504, in apply
results.append(self._apply(spec))
File "/usr/share/ceph/mgr/cephadm/module.py", line 2377, in _apply
return self._apply_service_spec(cast(ServiceSpec, spec))
File "/usr/share/ceph/mgr/cephadm/module.py", line 2482, in _apply_service_spec
allow_colo=self.cephadm_services[spec.service_type].allow_colo(),
File "/usr/share/ceph/mgr/cephadm/schedule.py", line 174, in validate
self.spec.validate()
File "/lib/python3.6/site-packages/ceph/deployment/service_spec.py", line 684, in validate
raise SpecValidationError('MDS service id cannot start with a numeric digit')
ceph.deployment.hostspec.SpecValidationError: MDS service id cannot start with a numeric digit
so I think we don't actually need this call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are already doing the same (validating the spec ASAP) in for the SNMPGatewaySpec:
spec = SNMPGatewaySpec(
snmp_version=snmp_version,
port=port,
credentials=credentials,
snmp_destination=destination,
engine_id=engine_id,
auth_protocol=auth_protocol,
privacy_protocol=privacy_protocol,
placement=PlacementSpec.from_string(placement),
unmanaged=unmanaged,
preview_only=dry_run
)
spec.validate() # force any validation exceptions to be caught correctly
In my opinion data (in general) have to be validated ASAP. In fact in the above trace we make several calls, in several classes to end up calling the validation (to detect that the spec is invalid). Something that we could have done earlier. In think both for code readability (and for analyzing errors in the future if any) it's better to do the validation ASAP so in this case I'd prefer to keep this explicit call to .validate() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed in standup, we're okay with having this check called directly, but maybe we should actually add this to other apply commands as well to have the functionality be uniform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added validate() checks for the other services
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see also: #44995
2e44173
to
767a134
Compare
| @@ -1115,6 +1114,8 @@ def _apply_rgw(self, | |||
| preview_only=dry_run | |||
| ) | |||
|
|
|||
| spec.validate() # force any validation exceptions to be caught correctly | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is minor but do you mean "early" or "immediately" rather than "correctly" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case is immediatly since the exception will be raised right away
| service_id: Optional[str] = None, | ||
| placement: Optional[PlacementSpec] = None, | ||
| unmanaged: bool = False, | ||
| preview_only: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgfritch do you think there are any other parameters we could maybe need for mds or is this enough
|
jenkins retest this please |
d52a9b0
to
d9c9dd1
Compare
d9c9dd1
to
3e108f6
Compare
Fixes: https://tracker.ceph.com/issues/54184 Signed-off-by: Redouane Kachach <rkachach@redhat.com>
3e108f6
to
db765bd
Compare
(cherry picked from commit db765bd)
(cherry picked from commit db765bd)
line 2 Backport of ceph#44928
Fixes: https://tracker.ceph.com/issues/54184
Signed-off-by: Redouane Kachach rkachach@redhat.com
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume tox