-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[stdlib] Make InlineArray
call its elements' destructor
#2965
base: nightly
Are you sure you want to change the base?
[stdlib] Make InlineArray
call its elements' destructor
#2965
Conversation
InlineArray
not calling destructor and remove unsafe_uninitialized
InlineArray
not calling its elements' destructor
InlineArray
not calling its elements' destructorInlineArray
call its elements' destructor
@ConnorGray it's ready for review, please take a look when you have the time :) |
good job @gabrieldemarmiesse , fn __moveinit__(inout self, owned other: Self):
self._size = other._size
other.size = 0 # taking move (see ._del)
self._array = other._array^ |
@rd4com I think it can be added in another PR, this one is already (too?) big. Furthermore, the |
!sync |
size.value, | ||
`, `, | ||
UnsafeMaybeUninitialized[Self.ElementType], | ||
`>`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand the intention here, but I think this may not actually be required.
I think the state we want to aim for is:
-
InlineArray.__init__(unsafe_uninitialized)
is replaced with:@staticmethod fn unsafe_uninitialized() -> InlineArray[UnsafeMaybeUninitialized[T]]: ...
-
All the remaining
InlineArray.__init__(...)
constructors should (and currently do) always fully initialize theInlineArray
.(This is okay to do without making the
pop.array
storage type useUnsafeMaybeUninitialized
because the compiler knows that inside an__init__
methodself
and the fields ofself
won't be fully initialized anyway.
With this approach, all the methods off InlineArray
will be able to assume one of the following two things is true:
- The current
InlineArray
is fully initialized - The current
InlineArray
may not be fully initialized, but its element types are definitely wrapped inUnsafeMaybeUninitialized[..]
, so that's okay.
We'd then also want to add this constructor for the programmer to "assert" that an uninitialized InlineArray
is now fully initialized.
struct InlineArray[T]:
fn __init__(
inout self,
owned unsafe_assume_initialized: InlineArray[UnsafeMaybeUninitialized[T]]
):
# transmute the argument `InlineArray` into one that is now considered fully initialized
...
@gabrieldemarmiesse What do you think of this?
I think it would eliminate the need for some of the changes that had to be made in the current iteration of this PR throughout the InlineArray
implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using UnsafeMaybeUninitialized
inside InlineArray
is indeed not necessary, but it was intentional to modify the struct like this. The idea was to move away from pointers and use only UnsafeMaybeUninitialized
which has a restricted use compared to pointers (notably, no pointer arithmetic). We still want to carefully control the lifecycle, and UnsafeMaybeUninitialized
allows it. With pointers the notion of lifecycle and memory address are mixed up in the same struct, using UnsafeMaybeUninitialized
allows to manage both independently.
I can change it back and use plain pointers again, if that's really what you want, though I believe the code would be less readable. Let me know what you want there.
For the other changes you requested fn unsafe_uninitialized() -> InlineArray[UnsafeMaybeUninitialized[T]]:
and the new constructor, can it be done in another pull request? The CI is currently passing, so I believe those methods are more in the realm of "nice to have" rather than "necessary to fix the bug".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a pointer-based implementation is still my personal preference, since it limits the number of types involved. But I see the merits of your argument too, and I don't want to hold up this excellent work on a small difference in perspective like that 🙂
So I'm going to resync this and merge it, and let this shake out—we can always revisit and cleanup later if there's a readability or maintainability problem in practice.
And I agree, we can / should add @staticmethod fn unsafe_uninitialized()
in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind reverting one day to pointers if we see that this is causing some issues down the road :)
@@ -20,6 +20,7 @@ from collections import InlineList | |||
""" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love how clean the changes are in this file with the introduction of UnsafeMaybeUninitialized
🙂
Thank you for putting together such an excellent explanation for this work @gabrieldemarmiesse. This was a treat to review 🙂 My main suggestion boils down to basically I think we can keep the implementation of Thank you again for pushing on this work Gabriel, this will make a huge difference in how precisely we can model the invariants of our collection types. |
Signed-off-by: gabrieldemarmiesse <gabrieldemarmiesse@gmail.com>
7ed580b
to
ad2a5d8
Compare
!sync |
✅🟣 This contribution has been merged 🟣✅ Your pull request has been merged to the internal upstream Mojo sources. It will be reflected here in the Mojo repository on the nightly branch during the next Mojo nightly release, typically within the next 24-48 hours. We use Copybara to merge external contributions, click here to learn more. |
Landed in 990c776! Thank you for your contribution 🎉 |
…(#42041) [External] [stdlib] Make `InlineArray` call its elements' destructor Fix this issue: * #2869 A little explanation of what is going on here, this should make the review easier. ### Modifications to `InlineArray` `InlineArray` made use of the `@value` decorator, which generates, in addition to a constructor, the following methods: `__moveinit__`, `__copyinit__` and `__del__`. While it's easy to understand what is generated when working with attributes that are values, it's less clear what is happenning when working with plain MLIR types like we do here. The next step is to manually define those to make sure they do what we want. **We will always assume that the `InlineArray` is completely initialized**. Thus when copying, moving or deleting, we must copy, move and delete each element in the `InlineArray` too. To make the implementation easier, instead of using an `UnsafePointer` and its methods to control the lifecycle of the values (`move_from_pointee` & friends), we fill the array with `UnsafeMaybeUninitialized`. This struct has methods that allows us to control the lifecycle without necessarily work with pointers. ### What do we do with `InlineArray[...](unsafe_uninitialized=True)`? This method is a massive footgun. This is because throughout the methods of `InlineArray`, we always assume that the values are initialized. So doing any moving, copying or deleting with uninitialized memory is UB. This is not much of an issue for very trivial types where moving and copying are just copying bits and the destructor does nothing. This can still cause UB if the user uses some uninitialized memory but it's fine if the caller is careful. For example, in the stdlib, we use `unsafe_uninitialized` to store some `UInt8` here and there and it's fine. As such, the constructor was kept, but an extensive docstring was added to explain in which situation it is okay to use it. ### How `InlineArray` is now used by `InlineList` `InlineList` wants to carefully manage the lifecycle of each element in the `InlineArray` while still making the `InlineArray` believe that it owns the lifecycle of its elements. In those kind of situation, `UnsafeMaybeUninitialized` is the right struct to use. So we use an `InlineArray[UnsafeMaybeUninitialized[ElementType]]` inside `InlineList`. I know it can be confusing since `InlineArray` itself also uses `UnsafeMaybeUninitialized` but this shouldn't be visible from outside `InlineArray` so it should be fine. Basically it disables all lifecycle methods, and `__del__` is a no-op. This allows `InlineList` to manually control the lifecycle of each elements which has an index below the size (above the size of the `InlineList`, it's truly uninitialized memory). Co-authored-by: Gabriel de Marmiesse <gabriel.demarmiesse@datadoghq.com> Closes #2965 MODULAR_ORIG_COMMIT_REV_ID: 5d98bc8f2a9bfc35ee0a004b8ee6a36f4a475452
Re-opening to try to merge this internally again. |
!sync |
I saw that it got reverted for performance reasons. If you can pinpoint the reason or have tools to help pinpoint the issue, I'd be happy to know, just to satisfy my curiosity |
Fix this issue:
InlineArray
does not invoke the destructors of its elements #2869A little explanation of what is going on here, this should make the review easier.
Modifications to
InlineArray
InlineArray
made use of the@value
decorator, which generates, in addition to a constructor, the following methods:__moveinit__
,__copyinit__
and__del__
. While it's easy to understand what is generated when working with attributes that are values, it's less clear what is happenning when working with plain MLIR types like we do here.The next step is to manually define those to make sure they do what we want. We will always assume that the
InlineArray
is completely initialized. Thus when copying, moving or deleting, we must copy, move and delete each element in theInlineArray
too.To make the implementation easier, instead of using an
UnsafePointer
and its methods to control the lifecycle of the values (move_from_pointee
& friends), we fill the array withUnsafeMaybeUninitialized
. This struct has methods that allows us to control the lifecycle without necessarily work with pointers.What do we do with
InlineArray[...](unsafe_uninitialized=True)
?This method is a massive footgun. This is because throughout the methods of
InlineArray
, we always assume that the values are initialized. So doing any moving, copying or deleting with uninitialized memory is UB.This is not much of an issue for very trivial types where moving and copying are just copying bits and the destructor does nothing. This can still cause UB if the user uses some uninitialized memory but it's fine if the caller is careful. For example, in the stdlib, we use
unsafe_uninitialized
to store someUInt8
here and there and it's fine.As such, the constructor was kept, but an extensive docstring was added to explain in which situation it is okay to use it.
How
InlineArray
is now used byInlineList
InlineList
wants to carefully manage the lifecycle of each element in theInlineArray
while still making theInlineArray
believe that it owns the lifecycle of its elements. In those kind of situation,UnsafeMaybeUninitialized
is the right struct to use.So we use an
InlineArray[UnsafeMaybeUninitialized[ElementType]]
insideInlineList
. I know it can be confusing sinceInlineArray
itself also usesUnsafeMaybeUninitialized
but this shouldn't be visible from outsideInlineArray
so it should be fine. Basically it disables all lifecycle methods, and__del__
is a no-op.This allows
InlineList
to manually control the lifecycle of each elements which has an index below the size (above the size of theInlineList
, it's truly uninitialized memory).Some guideline for the code review:
I advise to read the files in this order:
stdlib/test/test_utils/types.mojo
andstdlib/test/collections/test_inline_list.mojo
have been refactored to make the tests ofInlineArray
use theValueDestructorRecorder
. This could have been a separate PR but I was too lazy to split this PR even further.stdlib/test/utils/test_tuple.mojo
has a new test that ensure that all the elements in theInlineArray
have their destructors called.stdlib/src/utils/static_tuple.mojo
has all the changes that makesInlineArray
useUnsafeMaybeUninitialized
internally. While we could have technically use pointers, it makes writing code easier when controlling lifecycle.InlineList
instdlib/src/collections/inline_list.mojo
. We now useInlineArray[UnsafeMaybeUninitialized[ElementType]]
which allows theInlineList
to control exactly the lifecycle. I repeat here: yes in can be confusing because overall, we use twoUnsafeMaybeUninitialized
on top of each other when usingInlineArray[UnsafeMaybeUninitialized[ElementType]]
, but the code is still readable, because one "belongs" toInlineArray
and one "belongs" toInlineList.