Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add not_null<T> from the Guidelines Support Library #24423

Open
dongcarl opened this issue Feb 22, 2022 · 10 comments
Open

Add not_null<T> from the Guidelines Support Library #24423

dongcarl opened this issue Feb 22, 2022 · 10 comments

Comments

@dongcarl
Copy link
Contributor

dongcarl commented Feb 22, 2022

There are many places in the codebase where we use a pointer type (*, shared_ptr, unique_ptr, etc.) because of historical reasons (CBlockIndex*) or for polymorphism. In many such cases, we may want to avoid needing to handle the pointer being null, or make it clear in the function contract that we require a non-null pointer.

not_null<T> is the solution proposed by the C++ Core Guidelines (relevant section), and restricts pointers and smart pointers to hold non-null values.

Since I've learned about this, I've repeatedly ran into scenarios where I wish I could use it so as to make call semantics clearer and avoid misuse / unnecessary null handling.


There is a header-only implementation of not_null<T> here, however, there are two issues we need to solve:

  1. The header-only implementation linked above requires the Expects and Ensures macros defined here, do we also import these or can we tweak our existing assertion macros to work?
  2. There is also a strict_not_null<T>, which requires explicit construction. Do we need that or just not_null<T>?
@fanquake
Copy link
Member

@theuni @ryanofsky @ajtowns you might have opinions / thoughts?

@ryanofsky
Copy link
Contributor

ryanofsky commented Feb 23, 2022

I guess I'd be curious to see real examples of where it is useful. It seems to me like most of the places where you could use not_null<T*> you could use T& instead and references seem nicer to me aesthetically, and I like how they limit what you can do with the variable (just access the data not do mess with pointer or ownership). The only cases where not_null seems really useful is when when you need the pointer variable to be able change which object it is pointing to. This could be better in some data structures than raw pointers or reference_wrappers (https://stackoverflow.com/questions/33306553/gslnot-nullt-vs-stdreference-wrappert-vs-t), but I don't think these cases are very common in our codebase.

I think in general replacing T& with not_null<T*> would be bad, while using not_null<T*> in certain data structures could be good. A more neutral case could be replacing things like:

const CChain& active = Assert(m_node.chainman)->ActiveChain();

with

const CChain& active = m_node.chainman->ActiveChain();

You get same runtime assert in either case here, but in first case author is clearly stating intention for chainman to be not null here at this line of the code, while in second case with not_null author is just saying the pointer shouldn't be null any place it is dereferenced, which isn't saying much. I could see people preferring the second version though.

@maflcko
Copy link
Member

maflcko commented Feb 23, 2022

I guess I'd be curious to see real examples of where it is useful

To turn runtime errors into compile errors #19273 ?

@ryanofsky
Copy link
Contributor

To turn runtime errors into compile errors #19273 ?

Is #19273 the right issue? I think point of not_null has more to do with safety around accessing data, not allocating objects.

To be clear why I was asking for examples, I don't have any problem not_null, but I do think references are better than not_nulls in most cases and am responding to "In many such cases, we may want to avoid needing to handle the pointer being null, or make it clear in the function contract that we require a non-null pointer" in the issue description.

@maflcko
Copy link
Member

maflcko commented Feb 23, 2022

So you are saying there should be a UniqueReference (similar to std::unique_ptr when it comes to memory management and similar to std::reference_wrapper when it comes to accessing data)?

@ryanofsky
Copy link
Contributor

So you are saying there should be a UniqueReference (similar to std::unique_ptr when it comes to memory management and similar to std::reference_wrapper when it comes to accessing data)?

I think the point of references is that they are transparent aliases to data somewhere that you should just access and not try to do memory management with. So if you are writing a function that just needs to access data
f(Data& data) is better than f(not_null<Data*> data) or f(not_null<unique_ptr<Data>> data). I could imagine not_null<unique_ptr> and not_null<shared_ptr> stuff being useful in some places, just not a lot of places, so I'm asking what the motivating examples are to see if they are really calling for not_null.

@dongcarl
Copy link
Contributor Author

I can see how references and reference_wrapper (which I didn't know was zero-cost until today) can solve most of the cases. I do think that not_null<unique_ptr> and not_null<shared_ptr> are useful when we're passing around (shared) ownership and perhaps we can focus on those use-cases (e.g. RegisterSharedValidationInterface)

@theuni
Copy link
Member

theuni commented Feb 23, 2022

This seems pretty niche, but I like the idea of being extra expressive when possible. I agree with @ryanofsky that references should be generally preferred though.

I think @dongcarl pointed to a concrete example of where passing a reference or std::reference_wrapper won't work as an access model: when inheritance is needed. So, I can imagine not_null<BaseClass*> potentially being a useful pattern there.

I'm curious though, is this better than using the nonnull attribute when needed? In terms of expressiveness, it seems to me that it might be more useful to peg the property to a specific parameter and locale than a storage duration. It wouldn't be correct-by-construction that way, but we could still get warnings/errors.

So I guess my question is: is it more useful to enforce as part of the function signature, or storage duration?

@ryanofsky
Copy link
Contributor

ryanofsky commented Feb 24, 2022

I think @dongcarl pointed to a concrete example of where passing a reference or std::reference_wrapper won't work as an access model: when inheritance is needed. So, I can imagine not_null<BaseClass*> potentially being a useful pattern there.

I'm curious about this, but I don't see what it is referring to. It seems like inheritance should be orthogonal to nullability and which pointer or reference mechanism is used.

I'm curious though, is this better than using the nonnull attribute when needed? In terms of expressiveness, it seems to me that it might be more useful to peg the property to a specific parameter and locale than a storage duration. It wouldn't be correct-by-construction that way, but we could still get warnings/errors.

So I guess my question is: is it more useful to enforce as part of the function signature, or storage duration?

It seems like ideal thing would be for the not_null type to automatically apply the nonnull compiler hint. Question of which is "more useful" is probably hard to answer without knowing specifics of a situation, and what level of analysis and optimization compiler will do, but I think it should be safe to say both are useful, and have some overlapping benefits, and some non-overlapping benefits.

On larger issue I definitely think not_null is useful in some cases. Just not very many cases I'm aware of. And in most cases where it is possible to use references, I think it is better to use references.

@dongcarl
Copy link
Contributor Author

dongcarl commented Mar 2, 2022

In any case, I think having not_null for not_null<unqiue_ptr<T>> is quite useful in itself as we incrementally move towards having clearer ownership semantics.

ryanofsky added a commit to ryanofsky/bitcoin that referenced this issue Feb 21, 2024
The util::ResultPtr class is a wrapper for util::Result providing syntax sugar
to make it less awkward to use with returned pointers. Instead of needing to be
deferenced twice with **result or (*result)->member, it only needs to be
dereferenced once with *result or result->member. Also when it is used in a
boolean context, instead of evaluating to true when there is no error and the
pointer is null, it evaluates to false so it is straightforward to determine
whether the result points to something.

This PR is only partial a solution to bitcoin#26004 because while it makes it easier
to check for null pointer values, it does not make it impossible to return a
null pointer value inside a successful result. It would be possible to enforce
that successful results always contain non-null pointers by using a not_null
type (bitcoin#24423) in combination with
ResultPtr, though.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this issue May 1, 2024
The util::ResultPtr class is a wrapper for util::Result providing syntax sugar
to make it less awkward to use with returned pointers. Instead of needing to be
deferenced twice with **result or (*result)->member, it only needs to be
dereferenced once with *result or result->member. Also when it is used in a
boolean context, instead of evaluating to true when there is no error and the
pointer is null, it evaluates to false so it is straightforward to determine
whether the result points to something.

This PR is only partial a solution to bitcoin#26004 because while it makes it easier
to check for null pointer values, it does not make it impossible to return a
null pointer value inside a successful result. It would be possible to enforce
that successful results always contain non-null pointers by using a not_null
type (bitcoin#24423) in combination with
ResultPtr, though.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this issue May 1, 2024
The util::ResultPtr class is a wrapper for util::Result providing syntax sugar
to make it less awkward to use with returned pointers. Instead of needing to be
deferenced twice with **result or (*result)->member, it only needs to be
dereferenced once with *result or result->member. Also when it is used in a
boolean context, instead of evaluating to true when there is no error and the
pointer is null, it evaluates to false so it is straightforward to determine
whether the result points to something.

This PR is only partial a solution to bitcoin#26004 because while it makes it easier
to check for null pointer values, it does not make it impossible to return a
null pointer value inside a successful result. It would be possible to enforce
that successful results always contain non-null pointers by using a not_null
type (bitcoin#24423) in combination with
ResultPtr, though.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this issue Jun 17, 2024
The util::ResultPtr class is a wrapper for util::Result providing syntax sugar
to make it less awkward to use with returned pointers. Instead of needing to be
deferenced twice with **result or (*result)->member, it only needs to be
dereferenced once with *result or result->member. Also when it is used in a
boolean context, instead of evaluating to true when there is no error and the
pointer is null, it evaluates to false so it is straightforward to determine
whether the result points to something.

This PR is only partial a solution to bitcoin#26004 because while it makes it easier
to check for null pointer values, it does not make it impossible to return a
null pointer value inside a successful result. It would be possible to enforce
that successful results always contain non-null pointers by using a not_null
type (bitcoin#24423) in combination with
ResultPtr, though.
ryanofsky added a commit to ryanofsky/bitcoin that referenced this issue Jun 17, 2024
The util::ResultPtr class is a wrapper for util::Result providing syntax sugar
to make it less awkward to use with returned pointers. Instead of needing to be
deferenced twice with **result or (*result)->member, it only needs to be
dereferenced once with *result or result->member. Also when it is used in a
boolean context, instead of evaluating to true when there is no error and the
pointer is null, it evaluates to false so it is straightforward to determine
whether the result points to something.

This PR is only partial a solution to bitcoin#26004 because while it makes it easier
to check for null pointer values, it does not make it impossible to return a
null pointer value inside a successful result. It would be possible to enforce
that successful results always contain non-null pointers by using a not_null
type (bitcoin#24423) in combination with
ResultPtr, though.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants
@theuni @fanquake @dongcarl @maflcko @ryanofsky and others