-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pointerof an instance variable in an initialiser does not return the final pointer #1685
Comments
This also happens with class Foo
def initialize
self2 = self
puts "pointerof(self) = #{pointerof(self2)}"
end
end
foo = Foo.new
puts "pointerof(foo) = #{pointerof(foo)}" prints:
…14c8 ≠ …1748. |
The first might be a bug, but it would be interesting to know your use case. Since a struct is passed by value, I wouldn't rely on pointers to it, because they can quickly become invalid. The second case is not a bug: self and self2 are two different variables, their memory address is different. |
My actual code looks something like struct Test
def initialize()
@test = StaticArray(UInt32, 4)
external_fn_taking_pointer(@test.to_unsafe)
end
end
struct Test2
def initialize()
@test = Test.new
end
end
test2 = Test2.new which I would expect to produce a consistent memory address (as the array is a member variable of another member variable) For comparison, the equivalent C++ works as expected: #include <cstdio>
#include <cstdint>
struct Test
{
uint32_t test[4];
Test()
{
printf("%x\n", test);
}
};
struct Test2
{
Test test;
};
int main()
{
Test2 test;
printf("%x\n", test.test.test);
return 0;
} results in
|
Sounds a bit strange that structures of any kind should be "inited in one place" and then copied - if that happens for every type instantiation it would be a performance drag. |
As of Error in test.cr:12: instantiating 'Test:Class#new()'
test = Test.new
^~~
in test.cr:3: Can't infer the type of instance variable '@test' of Test
The type of a instance variable, if not declared explicitly with
`@test : Type`, is inferred from assignments to it across
the whole program.
The assignments must look like this:
1. `@test = 1` (or other literals), inferred to the literal's type
2. `@test = Type.new`, type is inferred to be Type
3. `@test = Type.method`, where `method` has a return type
annotation, type is inferred from it
4. `@test = arg`, with 'arg' being a method argument with a
type restriction 'Type', type is inferred to be Type
5. `@test = arg`, with 'arg' being a method argument with a
default value, type is inferred using rules 1, 2 and 3 from it
6. `@test = uninitialized Type`, type is inferred to be Type
7. `@test = LibSome.func`, and `LibSome` is a `lib`, type
is inferred from that fun.
8. `LibSome.func(out @test)`, and `LibSome` is a `lib`, type
is inferred from that fun argument.
Other assignments have no effect on its type.
Can't infer the type of instance variable '@test' of Test
@test = StaticArray(UInt32, 4)
^~~~~ |
@sdogruyol Issue is still valid, see code below: struct Test
def initialize
@test = StaticArray(UInt32, 4).new(10_u32)
p pointerof(@test)
end
def array_ptr
p pointerof(@test)
end
end
test = Test.new
test.array_ptr |
I would change this from |
agreed, this isn't entirely unexpected given struct semantics |
I don't understand why copy is required at returning from struct A
@foo : Int32
@bar : Pointer(Int32)
def initialize(@foo)
@bar = pointerof(@foo)
end
end
a = uninitialized A
a.initialize(100) Which extra process is there before or after calling I'm worrying that a struct is (implicitly) copied.I would like to have an option to disallow any copy operation to a defined struct, like C++ can do this with putting the specific constructor declaration into I (and some people) expect that struct is fixed location and fixed duration of lifetime. This semantic is useful to generate efficient code and sometimes required to call underlying C library, without loosing the experience of the user of wrapped library. For example, considering that the author of C library suggests to their function void user_func(...)
{
struct X data;
Y *ptr;
ptr = foo(&data, ...);
/* play with `ptr`, but actually read or write to `data` */
bar(ptr, ...);
baz(ptr, ...);
} Although the content of To implement this in Crystal we need to write: lib LibFoo
struct X
# omit
end
alias Y = Void
fun foo(X*, ...) : Y*
fun bar(Y*, ...) : Void
fun baz(Y*, ...) : Void
end
def user_func(...)
data = uninitialized LibFoo::X
ptr = LibFoo.foo(pointerof(data), ...)
# play with `ptr`
LibFoo.bar(ptr, ...)
LibFoo.baz(ptr, ...)
# Is it guaranteed to `data` exists until here?
end Developers who use Crystal consider this is unsafe and unwant to do. So I have to put this safe without loosing flexibility on calling About preferable usage of
|
Operation | Safety |
---|---|
Passing to an argument of another method | Safe if the calling method does not do any of followings |
Copying to another local variable in the block | Safe |
Passing to nested block | Unsafe unless executed inside this "scope" |
Copying to a captured variable | Can be Unsafe |
Returning y from block |
Can be Unsafe |
Passing to raise |
Unsafe (because one can implement this with longjmp ) |
... | ... |
Moreover, we have to nest to use multiple objects at once. This is inconvenient.
def user_func(...)
bind_to_Y(...) do |y|
bind_to_Y(...) do |z|
# play with `y` and `z`, `y` gets copied (right?).
y.bar(...)
z.baz(...)
y.baz(...)
end
end
end
Putting data
to an instance variable of struct to manage in pair
Because data
and ptr
is paired together, I may put it to a single struct:
struct Y
@data : LibFoo::X
@ptr : Pointer(LibFoo::Y)
@refdata : ...
def initialize(@refdata)
@data = uninitialized LibFoo::X
@ptr = LibFoo.foo(pointerof(@data), @refdata)
end
def bar(...)
LibFoo.bar(@ptr, ...)
end
def baz(...)
LibFoo.baz(@ptr, ...)
end
end
# with this, a user can:
def user_func(...)
y = Y.new(...)
# play with `y`.
y.bar(...)
y.baz(...)
end
This is not work as expected because it is copied when return from #new
.
This is the best semantic of providing the experience of LibFoo
, however, it is still unsafe:
Operation | Safety |
---|---|
Passing to an argument of another method | Safe but the passed value still points the original variable |
Copying to another local variable in the block | Safe but it still points the original variable |
Passing to nested block | Unsafe unless executed inside this "scope", also it still points the original variable |
Copying to a captured variable | Can be Unsafe, also it still points the original variable |
Returning y from method |
Unsafe |
Passing to raise |
Unsafe (because one can implement this with longjmp ) |
... | ... |
Workaround of above one
Since above does not work, we can make it to do delayed initialization:
struct Y
@data : LibFoo::X
@ptr : Pointer(LibFoo::Y)
@refdata : ...
def initialize(@refdata)
@data = uninitialized LibFoo::X
@ptr = Pointer(LibFoo::Y).null
end
def bar(...)
LibFoo.bar(self, ...)
end
def baz(...)
LibFoo.baz(self, ...)
end
def to_unsafe
if @ptr.null?
@ptr = LibFoo.foo(pointerof(@data), @refdata)
end
@ptr
end
end
# with this, a user can:
def user_func(...)
y = Y.new(...)
# play with `y`.
y.bar(...)
y.baz(...)
end
In this case, LibFoo.foo
will be called multiple times, if it has been copied before the calling one of these method.
y = Y.new(...)
z = y
z.bar(...) # LibFoo.foo will be called for `z`
y.baz(...) # LibFoo.foo will be called for `y`
Any copying before calling its methods is safe, but copying after calling any of its methods is possibly unsafe in same condition described above. This is too complicated.
Conclusion
We need a temporary object to bridge the user of binding library and underlying C library, but we don't want to fill the sea of heap memory with the garbage of temporary object, which can be used many many times. However, currently we can not handle a pointer from C library with a struct gracefully. Because it is freely and implicitly copied one to another. Disallowing copy can let the user of binding library to use the struct only in safe way, without loosing efficiency.
If you want a real world use case, here it is.
I have ideas of the methods to disallow copying:
- Having protected or private
#dup
- Same to do with
Reference
. (see below) Value#dup
becomes an "special method" which makes a copya = b
whereb
is aValue
should be interpreted asa = b.dup
a = A.new
whereA
is aStruct
class can be interpreted asa = A.new.dup
- Same to do with
- Having protected or private
#initialize(another : self)
- Same as C++.
a = b
whereb
is aValue
should be interpreted asa = b.class.new(b)
a = A.new
whereA
is aStruct
class never involves copy.
- Special annotation to a struct
- The user of an annotated
struct
can never "monkey patch" to reenable copy operation of the struct.
- The user of an annotated
N.B.
To protect Reference
from copying is easy. Because it requires #dup
to make a copy, so just put #dup
to protected
or private
:
class A
protected def dup
self
end
end
a = A.new
b = a.dup
Showing last frame. Use --error-trace for full trace.
error in line 8
Error: protected method 'dup' called for A
@asterite suggested to disallow to use pointerof
to instance variable of struct
in #9700. The language reference says using pointerof
is (always) unsafe. If so, just disallowing it does not make sense, I think. Any invalid usage of pointerof
should not be responsible to Crystal's developers, however you should provide the detailed specification such as "when returned pointer goes invalid". This should be guaranteed by language in some way.
(ex.)
Source of pointer | Guaranteed Lifetime |
---|---|
Any Reference type |
Until no variable references it |
Any instance variables of Value type in Reference (class ) |
Until no variable references the container of it |
Any instance variables of Value type in #initialize of Value (struct ) |
Until return from the method |
Any instance variables of Value type in any other method of Value (struct ) |
Until return from the caller's scope |
Any local variable of Value type |
Until return from the current scope |
Any captured variable of Value type |
Until return from the block |
Any constant variable of Value type |
Forever |
(Non Interpolated) String literal | Forever |
(This table is not true. This depends how the language guarantees (i.e., decided by Crystal's developer))
It works like that because of how @waj was working on improving this situation but it's probably something hard to change |
results in
From what I can see from the LL dump, this is a result of the struct being constructed in one memory location, and then relocated elsewhere. My current workaround for this is calling a function after the initialiser, which works as expected.
Fixing this may require changes in how variables are constructed, and thus may not be feasible. I suggest that a warning be emitted for the use of pointerof on an instance variable in a constructor; this would expose any latent issues.
The text was updated successfully, but these errors were encountered: