Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wip 13829 #7390

Merged
merged 2 commits into from Mar 2, 2016
Merged

Wip 13829 #7390

merged 2 commits into from Mar 2, 2016

Conversation

badone
Copy link
Contributor

@badone badone commented Jan 28, 2016

A recent commit disabled negative values but they are required for variables
such as filestore_merge_threshold.

Add test case to qa/workunits/cephtool/test.sh to test this in future and remove redundant tests for negative values.

Fixes: #13829

A recent commit disabled negative values but they are required for variables
such as filestore_merge_threshold.

Fixes: ceph#13829

Signed-off-by: Kefu Chai <kchai@redhat.com>
Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
qa/workunits/cephtool/test.sh: add test for setting negative int options
src/test/daemon_config.cc: remove tests for failed negative values

Fixes: ceph#13829

Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
@tchaikov
Copy link
Contributor

tchaikov commented Feb 2, 2016

hi @jecluis could you help review this change?

@jecluis
Copy link
Member

jecluis commented Feb 2, 2016

The changes to strict_si_cast look good, and TIL about std::numeric_limits, which is by itself cool.

The test however concerns me a bit. We were testing assuming this function would return error on a negative value; the patch now removes that assumption without making sure the users of the function are protected against negative values. I fear this may cause regressions (e.g., are we sure 'num_clients' will return error if someone tries to set it to '-1' ?)

@badone
Copy link
Contributor Author

badone commented Feb 2, 2016

The question is where should the responsibility for validating input lie? Do we have any guidelines regarding this? How did we manage this in 0.94.1 before the change that began rejecting all negative values and led to 13829? If guidelines have not been set about who/where we should validate inputs then we should probably start that discussion right?

@tchaikov
Copy link
Contributor

tchaikov commented Feb 2, 2016

The test however concerns me a bit. We were testing assuming this function would return error on a negative value; the patch now removes that assumption without making sure the users of the function are protected against negative values.

@jecluis we used to accept a negative num_clients before my fix of http://tracker.ceph.com/issues/11484. and i thought it was a bug to accept a negative number for this setting. but it turns out we do use a negative OPT_INT under some circumstances (http://tracker.ceph.com/issues/13829).

so, i think a better way to address the negative num_clients issue, is to apply this patch

--- a/src/common/config_opts.h
+++ b/src/common/config_opts.h
@@ -19,7 +19,7 @@ OPTION(public_addr, OPT_ADDR, entity_addr_t())
 OPTION(cluster_addr, OPT_ADDR, entity_addr_t())
 OPTION(public_network, OPT_STR, "")
 OPTION(cluster_network, OPT_STR, "")
-OPTION(num_client, OPT_INT, 1)
+OPTION(num_client, OPT_U32, 1)
 OPTION(monmap, OPT_STR, "")
 OPTION(mon_host, OPT_STR, "")
 OPTION(lockdep, OPT_BOOL, false)

and then revert the partial revert in @badone 's commit of 994ac29 . what do you think?

@jecluis
Copy link
Member

jecluis commented Feb 2, 2016

@tchaikov now that you mentioned just changing the config option type, I thought I'd better figure out what are the callers of strict_si_cast(), as I don't want to be talking for the sake of being annoying.

Funny thing I found though, is that changing from OPT_INT to OPT_U32 would be very much the same thing. See https://github.com/ceph/ceph/blob/master/src/common/config.cc#L959

Again, as I said on IRC, I don't mind changing the semantics of the function, but we must make sure the callers are safe with this change. Removing a test without making sure it's not needed, and that the safe guards guaranteed by the previous semantics are in place, just doesn't sit well with me.

@tchaikov
Copy link
Contributor

tchaikov commented Feb 2, 2016

the tests removed in this pr were introduced in #4494, but the "safe guard" guaranteed by them were not needed actually, on the contrary, they are guarding a bug and and came with a regression (http://tracker.ceph.com/issues/13829). because a negative OPT_INT is legitimate for us.

@tchaikov
Copy link
Contributor

tchaikov commented Feb 5, 2016

@jecluis

we are changing

  • strict_si_cast<int>()
  • strict_si_cast<long long>()
  • strict_si_cast<uint64_t>()

among which, the semantic of strict_si_cast<int>(), strict_si_cast<long long>() is changed in this pull request, so they are now able to convert a string presenting a negative number to the corresponding number instead of complaining that

strict_sistrtoll: value should not be negative

that's why the test for the conf->set_val("num_client", "-1") was removed. as it is guarding a bug where an OPT_INT option can not accept a negative number. the caller of the strict_si_cast<int>() and strict_si_cast<long long>() were able to handle negative before d62f80d , so i believe that they should be still fine with the semantic of this pull request.

so the net result of d62f80d plus this change is: strict_si_cast() checks the integer overflow, and also be able to return negative values as it was.

@jecluis
Copy link
Member

jecluis commented Mar 1, 2016

Sorry for forgetting about this far too many times to count.

I don't have a problem with the fix for the bug, or the changes made, per se. The changes seem sane and they do make sense.

I still do have a huge problem with dropping the test for num_clients. As I see it, there were two bugs, and this pull request only fixes one of them. The other one is allowing a negative value to be assigned to 'num_client'.

You can easily argue that is out of the scope of this PR, and I will accept that. But the truth is that you are removing a test that made sure this would not happen, regardless of how bogus it was.

Also, this also brings back some serious concerns about allowing negative values to be passed to options that expect a UINT (as 'num_client' should have in the first place). Yes, we were testing the interface, and the function that handled it was broken -- any large enough unsigned value would always be assumed as negative.

But dropping the test without a proper solution concerns me.

Feel free to merge this whenever you are ready. I'll be creating a ticket to tackle the signal issue, as best as possible, in order to ensue the correct interface is exposed to the user in a way that makes sense and is predictable.

@jecluis
Copy link
Member

jecluis commented Mar 1, 2016

Okay, for future reference, created http://tracker.ceph.com/issues/14934

tchaikov added a commit that referenced this pull request Mar 2, 2016
common: Allow OPT_INT settings with negative values

Reviewed-by: Joao Eduardo Luis <joao@suse.de>
@tchaikov tchaikov merged commit 389ecbb into ceph:jewel Mar 2, 2016
@tchaikov
Copy link
Contributor

tchaikov commented Mar 2, 2016

merged

@jecluis thank you. we will follow it up on http://tracker.ceph.com/issues/14934 to make sure that we have a proper interface for the non-negative settings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants