-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Vector.from_polyglot_array to make Vectors backed by polyglot arrays #3628
Conversation
50b2521
to
a5a3a2c
Compare
a5a3a2c
to
ec0c08f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is along the lines needed.
If we move the to_array
to be a method the new storage type this allows us to make Slice a no-op :).
The bit missing is to review the current Vector.Vector
calls and map them into the new polyglot functions.
distribution/lib/Standard/Base/0.0.0-dev/src/Polyglot/Proxy_Polyglot_Array.enso
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see Array
in Enso to be done properly. If this "solution" somehow helps libraries, feel free to go on, but I find this wrong from Engine perspective. Especially knowing that we don't handle nested arrays [[1,2], [3,4,5]]
.
distribution/lib/Standard/Base/0.0.0-dev/src/Polyglot/Proxy_Polyglot_Array.enso
Show resolved
Hide resolved
distribution/lib/Standard/Base/0.0.0-dev/src/Polyglot/Proxy_Polyglot_Array.enso
Outdated
Show resolved
Hide resolved
e593166
to
b145485
Compare
@jdunkerley I think I addressed all your concerns. This also revealed some bugs that I fixed on the way. Also, while working on this ticket, I discovered that polyglot null values were leaking into Enso rather than being translated to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am completely OK with the null
& Nothing
changes.
@@ -64,6 +64,11 @@ long doInteger(Object integer, @CachedLibrary(limit = "5") InteropLibrary number | |||
} | |||
} | |||
|
|||
@Specialization(guards = "interop.isNull(value)") | |||
Object doNull(Object value, @CachedLibrary(limit = "5") InteropLibrary interop) { | |||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we using Java null
to represent Enso Nothing
? Truffle languages in general try to avoid usage of null
as it requires additional checks in the partial evaluation mode and the whole interpreter code doesn't optimize so nicely. Following the traditional approach, we should have Nothing.NULL
truffle object. CCing @kustosz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, null
is not ideal but I had to use something that Epb and Enso know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decided to get rid of this "to null" and "from null" conversion and added a separate conversion node for the results of foreign calls.
.../java/org/enso/interpreter/node/expression/builtin/interop/generic/ReadArrayElementNode.java
Outdated
Show resolved
Hide resolved
Object elem = coercion.execute(library.readArrayElement(array, index)); | ||
if (elem == null) { | ||
Builtins builtins = Context.get(this).getBuiltins(); | ||
return builtins.nothing().newInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope newInstances
returns a singleton!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see cachedInstance
field
@@ -49,7 +49,7 @@ spec = Test.group "Vectors" <| | |||
built_from_js = Vector.from_polyglot_array generate_js_array | |||
built_from_js . should_equal [1, 2, 3, 4, 5] | |||
built_from_py = Vector.from_polyglot_array generate_py_array | |||
built_from_py . should_equal [1, 2, 3, 4, 5] | |||
built_from_py . should_equal [1, 2, 3, 4, Nothing] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I believe this is desirable. Foreign null
s are equal to Enso Nothing
. Btw. this is not what Graal.js does. If you:
var value = foreignCall();
if (value === null) print('===');
if (value == null) print("==");
then only the second condition may be true
! In Graal.js foreign null
s are not ===
with JavaScript null
, but are only ==
with it. I've been burned by this feature so many times - I suggest in Enso Nothing
is equal with any foreign Nothing/null
.
foreign python generate_py_array = """ | ||
return [1, 2, 3, 4, None] | ||
|
||
foreign python generate_nested_py_array = """ | ||
return [[1, 2, 3], [4, 5]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd also like to see a test where these two situations are joined - a nested array containing a null - so that we can correctly verify that not only surface level nulls are converted but also ones deep into the nesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's unlikely to work atm - see the pending test I added. It also demonstrates that it wouldn't work even before this PR.
In order to support it, we need a bigger rewrite of polyglot arrays which is not part of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I missed that it is pending. That is concerning - is the fix for that planned for the near future?
Well, I'd suggest adding this mixed test as pending too - to ensure that, once we do the rewrite, stuff is handled correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is concerning
Yes, I also find this concerning.
to_array self = | ||
len = self.length | ||
a = Array.new len | ||
0.up_to len . each ix-> a.set_at ix (self.at ix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems very inefficient.
Any reason why you cannot replace it with Array.copy self.arr 0 a 0 len
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I'm a bit worried about the fact that we would always copy here. Cannot we somehow wrap the polyglot array while avoiding copying? At least for cases where we can guarantee immutability as we were discussing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is pretty much as inefficient as it was before, really. I would defer optimizations until we actually decide to rewrite Array, and we seem to be coming to that point very soon :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you comparing to? Before we didn't have a way to convert a polyglot array to Enso, to_array
would just return it - which was a zero-cost operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to (now gone) from_array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok, right.
I was thinking of the Vector.Vector polyglot_array
approach which we used most of the time.
at : Number -> Any | ||
at self index = | ||
Polyglot.read_array_element self.arr index | ||
|
||
to_array : Array Any | ||
to_array self = | ||
len = self.length | ||
a = Array.new len | ||
0.up_to len . each ix-> a.set_at ix (self.at ix) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also one observation regarding that - we do the copy only upon a to_array
call - so this gives us the right type, but this does not solve the mutability issue - if the underlying polyglot array is mutable, the calls to wrapped_array.at
will not be referentially transparent as the underlying array may change in-between them.
I'm not saying that is an issue, because I assume that this PR is not aiming to solve the mutability issue, but I just wanted to ensure that we are on the same page there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeap, we are on the same page.
0711f55
to
6c956cf
Compare
.../java/org/enso/interpreter/node/expression/builtin/interop/generic/ReadArrayElementNode.java
Outdated
Show resolved
Hide resolved
6c956cf
to
0fd1217
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is an overall consensus this PR shall be integrated before work on Interop for Array & Vector starts, then let's do it.
My minority opinion continues to suggest to focus on Interop for Array & Vector which solves these array interop issues properly. But I don't want to stay in the way.
Engine approval is needed to get this PR off the table. Approving.
.../java/org/enso/interpreter/node/expression/builtin/interop/generic/ReadArrayElementNode.java
Show resolved
Hide resolved
Yeap, improved Array/Vector is next in line |
23d1194
to
7e97b19
Compare
Use Proxy_Polyglot_Array as a proxy for polyglot arrays, thus unifying the way the underlying array is accessed in Vector. Used the opportunity to cleanup builtin lookup, which now actually respects what is defined in the body of @Builtin_Method annotation.
Vector now has a `storage` field, which will delegate to Proxy_Polyglot_Array. Identified what appears to be all invalid `Vector.Vector` constructs when the underlying storage was a polyglot array.
Fixed a bug in our handling of polyglot values by 1) doing coercion of primitive values on all languages 2) explicitly checking for null values 3) whenever foreign calls return null we translate that to Nothing
This way we avoid explicit null checking which is admittedly not the nicest way of dealing with Nothing conversion.
One last request to avoid copying Arrays when the underlying storage is already an Array.
5e34d39
to
b22edb1
Compare
arr = self.storage.to_array | ||
case arr of | ||
Array -> | ||
arr | ||
_ -> | ||
len = self.storage.length | ||
a = Array.new len | ||
Array.copy arr 0 a 0 len | ||
a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test this code?
My experience tells me that polyglot arrays also used to match the Array
case in pattern matching, so I'm not really sure if it will do what you want - it will most likely always enter the Array
branch. Any array-object that is compatible with Array.copy
will also match the Array
branch, so the second branch is essentially dead code, unless I'm missing something.
See the following code:
from Standard.Base import all
polyglot java import java.util.List
main =
listarr = List.of 1 2 3
IO.println listarr
m = Meta.meta listarr
IO.println m
IO.println m.get_language
IO.println (Meta.get_simple_type_name listarr)
IO.println (Meta.get_qualified_type_name listarr)
case listarr of
Array -> IO.println "Matches Array!"
_ -> IO.println "nope"
it actually prints
[1, 2, 3]
(Polyglot [1, 2, 3])
Java
java.util.ImmutableCollections$ListN
null
Matches Array!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think the to_array
method should have some doc comment, if at least ## PRIVATE
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry for commenting after merge, but I did not notice these earlier)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Description
Use Proxy_Polyglot_Array as a proxy for polyglot arrays, thus unifying
the way the underlying array is accessed in Vector.
Used the opportunity to cleanup builtin lookup, which now actually
respects what is defined in the body of @Builtin_Method annotation.
Also discovered that polyglot null values (in JS, Python and R) were leaking to Enso.
Fixed that by doing explicit translation to
Nothing
.https://www.pivotaltracker.com/story/show/181123986
Checklist
Please include the following checklist in your PR:
Scala,
Java,
and
Rust
style guides.
./run ide dist
and./run ide watch
.