Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib and tool: install large bit mask type when more than 32 protocols #9534

Closed
wants to merge 1 commit into from

Conversation

monnerat
Copy link
Contributor

@monnerat monnerat commented Sep 18, 2022

In the library, complete support for more than 32 protocols:

  • Describe a conditional sequence to define a new protocol bit mask.
  • Introduce a conditional curl_prot_t definition based on high protocols availability.
  • Fix a cast in a curl_prot_t assignment.
  • Remove an outdated comment.

Large prototype bit mask type is also introduced in the cli tool. As there is no way to determine whether the library supports high protocols at compile time, this is unconditional.

Follow-up to cd5ca80.

@bagder
Copy link
Member

bagder commented Sep 18, 2022

Doesn't this need some precaution or checks for the case where curl_off_t is not larger than unsigned int ? I imagine we could possibly make libcurl work with >32 protocols even if curl_off_t is 32 bits...

@monnerat
Copy link
Contributor Author

monnerat commented Sep 18, 2022

Doesn't this need some precaution or checks for the case where curl_off_t is not larger than unsigned int ?

I already had some thinking about it and ended up with the following points:

  • As it is now in this PR, we'll surely have compile errors on 32-bit curl_off_t platforms if a high protocol is enabled.
  • If we want to avoid it, what to do in the conditional branch for 32-bit curl_off_t and high protocol enabled? I don't see something useful but a #error, which is also a compile error.
  • As cited in the previous days' discussions, we are not even sure such 32-bit only platforms currently exist and support curl!

I came to the conclusion that adding several lines in configure and in header files just to generate another error message is not worth it.
Any suggestion?

I imagine we could possibly make libcurl work with >32 protocols even if curl_off_t is 32 bits...

For that, we should have an fd_set-like curl_prot_t and forget about simple C bitwise operators on it, I'm afraid.
If this feature is retained, we better differentiate single protocol and protocol set types as I did in the cli tool, because a non-scalar curl_prot_t field is hardly initialisable statically.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

forget about simple C bitwise operators on it, I'm afraid

I don't see how we are married to the bitwise operators in the tool code. We could replace that without the same limitations. It doesn't even have to be lightning fast because we dont' do those checks very often.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

Also, I think you're mixing two completely separate changes in a single PR here. The tool change and library one seem unrelated. Right?

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

I don't see how we are married to the bitwise operators in the tool code.

I was rather and primary speaking about the library. Your mailing list answer https://curl.se/mail/lib-2022-09/0022.html seems to tell you don't want a non-scalar curl_prot_t in the library..

Also, I think you're mixing two completely separate changes in a single PR here. The tool change and library one seem unrelated. Right?

Yes and no! They are unrelated in terms of code independence. They however are both needed to support >32 protocols.
Explanation: to break the 32-bit barrier and make the sieve PR work again, I had to apply changes both in lib and tool.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

I was rather and primary speaking about the library

See, we're mixing things here now because the change is in both areas and they are very different territories.

In the libcurl code I think we can stick to only do the new protocols if we are >32 bits - until someone actually shows up that has a system without long long. I suspect they are virtually extinct these days.

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

In the libcurl code I think we can stick to only do the new protocols if we are >32 bits - until someone actually shows up that has a system without long long. I suspect they are virtually extinct these days.

Yes, I think so too Thus there is no need for a cumbersome test on curl_off_t size in urldata.h. Right ? Those hypothetical platforms will get a compile error anyway.

I imagine we could possibly make libcurl work with >32 protocols even if curl_off_t is 32 bits...

IMO, both quotes above are contradictory.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

IMO, both quotes above are contradictory.

No they are not. They are two separate ones:

  1. I don't think we currently need to support new protocols for "small" curl_off_t
  2. In a future we might make libcurl work with >32 protocols even if curl_off_t is 32 bits

@bagder
Copy link
Member

bagder commented Sep 19, 2022

We retain the option (2) because there is now nothing in the API that specifies any particular limit on 32 (nor 64) since we went CURLOPT_PROTOCOLS_STR.

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

  1. I don't think we currently need to support new protocols for "small" curl_off_t

Right, because we don't have > 32 protocols.

2, In a future we might make libcurl work with >32 protocols even if curl_off_t is 32 bits

That day, a non-scalar curl_prot_t will be needed.

There are then 2 options:

  1. We implement this feature before the next protocol and the latter will then be immediately available on all platforms.
  2. We do not care about "small" platforms at the next protocol commit and wait for someone to moan about a compile error.

Changes in this PR fit option 2.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

We do not care about "small" platforms at the next protocol commit and wait for someone to moan about a compile error.

The current tool code still makes the tool do assumptions about how many the protocols are and that they will fit in a single scalar variable. It does not know this and there is no guarantee in any API that they will.

Even if we expand that limit (for most systems) with going to curl_off_t.


#ifndef PROTO_TYPE_SMALL
typedef curl_off_t curl_prot_t;
#else
Copy link
Member

@bagder bagder Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds code and a comment for the future only, and I am generally against that and I am now too. Because most commonly, once that future arrives, some things have changed anyway causing things to get done differently.

Copy link
Contributor Author

@monnerat monnerat Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you then prepare the way for a developer to submit us a PR for a new protocol ?

Copy link
Member

@bagder bagder Sep 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjust that commit to what is needed for the new protocol. I presume that would then remove the comment that says something about the future with actual code for the work in progress.

Copy link
Contributor Author

@monnerat monnerat Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

Even if we expand that limit (for most systems) with going to curl_off_t.

(Speaking about the tool) Yes, but this is yet at least a superset of what is supported in the lib.
The universal alternative should be fully dynamic, probably dealing with string arrays only.
Another important point is the protocol order (for tests purpose): we could move to the alpha order to respect what is returned by curl_version_info() rather than the current CURLPROTO_* order

@bagder
Copy link
Member

bagder commented Sep 19, 2022

The universal alternative should be fully dynamic, probably dealing with string arrays only.

Not only universal, also the not making assumptions about the number of protocols version. I don't think there's a hurry to fix this, but I think it would be prudent and correct to do it.

Another important point is the protocol order (for tests purpose): we could move to the alpha order to respect what is returned by curl_version_info() rather than the current CURLPROTO_* order

I'm not really following, but yes the protocol supported should be shown alpha-sorted for -V.

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

Not only universal, also the not making assumptions about the number of protocols version. I don't think there's a hurry to fix this, but I think it would be prudent and correct to do it.

It doesn't really make assumptions: there are of course some limitations (like static array size or dynbuf max length), but protocol numbers are private to the tool and only limited by the proto_t capacity (that means at least 2^32-1). The proto_set_t type limits the number of usable protocols only, not their absolute range in the lib. In particular, if we have disabled low lib-numbered protocols, they "make room" for a high numbered one in the proto_set_t type.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

In src/tool_libinfo.h this code

#define PROTO_BIT(p) ((p) < PROTO_MAX? (proto_set_t) 1 << (p): \

... is an assumption that there is less than 33 or 65 protocols, depending on size of proto_set_t. Am I wrong?

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

... is an assumption that there is less than 33 or 65 protocols, depending on size of proto_set_t. Am I wrong?

That's only an internal limitiation due to the proto_set_t type implementation. As I said higher this can be changed but I do not see the need for the tool to have much more capacity than the library. The change in this PR sets it to >= library capacity.

I'm not really following, but yes the protocol supported should be shown alpha-sorted for -V.

-V is already OK. the importance of order for the tests is to have a defined order of protocols listed for matches in the <verify> sections (see test 1401).

@bagder
Copy link
Member

bagder commented Sep 19, 2022

I do not see the need for the tool to have much more capacity than the library.

The tool simply does not know how many protocols the library supports (without dynamically counting the number curl_version_info returns). The tool implementation of today should be able to run with the library we ship in ten years. Maybe that library supports a different amount of protocols. Therefore, any assumption that the number fits as number of bits within a scalar variable is a gamble.

@monnerat
Copy link
Contributor Author

monnerat commented Sep 19, 2022

The tool implementation of today should be able to run with the library we ship in ten years.

do you intend to support >64 protocols in 10 years ?

Therefore, any assumption that the number fits as number of bits within a scalar variable is a gamble.

Yes, but not more than in the current library.

Anyway I agree with you: it should be more scalable. An all-dynamic solution would probably be a luxury, but we should support a large number of protocols via a non-scalar proto_set_t or any other method. A limit would however exist if not all-dynamic.

Just a personal question: why are you so preoccupated by the future of the tool and opposed to advanced prepration of the lib at the same time ? This appears to me as antagonist, although distinct.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

do you intend to support >64 protocols in 10 years ?

No, I use that to illustrate how we should make the tool only use the knowledge it has, not assume things.

Yes, but not more than in the current library.

The library doesn't assume it knows.

@bagder
Copy link
Member

bagder commented Sep 19, 2022

why are you so preoccupated by the future

I think we should use the API and not guess what goes beyond it. Future, history and current.

@bagder
Copy link
Member

bagder commented Sep 22, 2022

@monnerat, after #9546 was merged, what to do want to do with this PR?

@monnerat
Copy link
Contributor Author

monnerat commented Sep 22, 2022

what to do want to do with this PR?

  • Resolve conflict first: the tool change is obsolete and not needed anymore.
  • Still keep the PR, because it fixes a potential bug in lib/url.c.
  • Find some way to move the definition of curl_prot_t outside the WS conditionals that satisfies your wish of not designing future things. This to ease new protocols design. Protocols are future but their design is not.

Move the curl_prot_t to its own conditional block. Introduce symbol
PROTO_TYPE_SMALL to control it.

Fix a cast in a curl_prot_t assignment.
Remove an outdated comment.

Follow-up to cd5ca80.
#define PROTO_TYPE_SMALL

#ifndef PROTO_TYPE_SMALL
typedef curl_off_t curl_prot_t;
Copy link
Member

@bagder bagder Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this check be easier and make more sense if reversed?

Like with #ifdef PROTY_TYPE_BIG ...

So that we have a define we set when we cross the 32 boundary and need to use a larger type? Especially since that new protocol will require a 64 bit data type, so going beyond 32 needs to be conditional (at least until we require a 64 bit type).

Copy link
Contributor Author

@monnerat monnerat Sep 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. I started by implementing it as PROTO_TYPE_LARGE, but I noted this is easier to undefine a symbol than to redefine it and I reversed the logic.

  • #undef can be applied on an undefined symbol: it's just ignored.
  • #define might redefine an existing symbol only if its value does not change. But this is implementation-dependent, thus we better consider it may not.

When several new protocols (!) will be introduced, they will all need to act on this, so this will be easier to simply #undef the symbol than to redefine it for each protocol.

I know this is future, but we have to prepare for it :-)

Copy link
Member

@bagder bagder Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fair enough. We can always change it later if we think of improvements down the line.

bagder
bagder approved these changes Sep 23, 2022
@bagder
Copy link
Member

bagder commented Sep 23, 2022

Thanks!

@bagder bagder closed this in 91e06e6 Sep 23, 2022
@monnerat
Copy link
Contributor Author

monnerat commented Sep 23, 2022

Thanks for pulling!

@monnerat monnerat deleted the proto64 branch Sep 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants