Skip to content
This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

Issue with Java Strings as type arguments #488

Closed
CeylonMigrationBot opened this issue Apr 11, 2012 · 15 comments
Closed

Issue with Java Strings as type arguments #488

CeylonMigrationBot opened this issue Apr 11, 2012 · 15 comments

Comments

@CeylonMigrationBot
Copy link

[@FroMage] When using some Java code that returns a java.util.List<java.lang.String> we get a weird issue that String is not assignable to String because we assume that when used as a type parameter, String should be mapped to ceylon.language.String, which is just not always the case. We need to fix that.

[Migrated from ceylon/ceylon-compiler#488]
[Closed at 2012-06-04 14:30:13]

@CeylonMigrationBot
Copy link
Author

[@FroMage] Same issue with j.l.Object used as type param

@CeylonMigrationBot
Copy link
Author

[@FroMage] Though, for j.l.Object we can fix the bug easily, I don't see how to fix it for j.l.String: a List<j.l.String> is just not a List<c.l.String> so we can't pass one for the other. Even raw-casting doesn't get us anywhere since we then lose the real underlying type, and will expect members to be something they are not.

Can anyone remind me why we erase to j.l.String but not for parameterised types? Seems to me we should just always erase to it, though I admit that will require lots of boxing.

@CeylonMigrationBot
Copy link
Author

[@gavinking] j.l.String might not satisfy some constraints on the type parameter.

@CeylonMigrationBot
Copy link
Author

[@FroMage] Ah true. But the opposite is true… How can we fix this sort of problem?

@CeylonMigrationBot
Copy link
Author

[@FroMage] I've simply no idea how to fix this, short of giving up and stopping to optimise c.l.String as j.l.String.

I mean, I see why we're sometimes erasing to j.l.String, but that causes lots of boxing for no good reason, because we're not using j.l.String methods at all, so anything we do with the string, we need to box/unbox.

Tako says that we could produce code that calls helper methods rather than always box/unbox java strings, so we'd keep only java strings, but that doesn't help us if we're still using ceylon strings as type parameters, since that will always require boxing, and doesn't work with the pretended equivalence with java strings (different type parameter constraints)

For primitives this problem doesn't exist, but we decided to not treat j.l.Long and friends as equivalent to the primitives for precisely the reason that it doesn't work. Yet we're doing it to j.l.String when IMO we shouldn't even be using it in c.l.String and rather use int[].

Tako says that we're erasing to java strings because the JVM optimises them and we'd need to convert ceylon strings to java strings anyways for most APIs, which will not be true anymore once we have Pure Ceylon libs.

I think we should just stop treating j.l.String as a c.l.String and make them different types, and not erase one to the other.

@CeylonMigrationBot
Copy link
Author

[@quintesse] After discussing this some more with @FroMage we seem to have some serious problems with Java interop here, right?
Let's say we have a type Foo<T> and that both Java and Ceylon use it.
We could probably extend the boxing code to make it understand a Foo<String> coming from Java as actually using j.l.String and therefore add special (un)boxing code, but we could never do the same the other way around while trying to pass a Foo<String> to Java.
(And of course we haven't even thought about Java code trying to interop with Ceylon an actually using c.l.String somewhere)

@CeylonMigrationBot
Copy link
Author

[@quintesse] The only thing we've been able to think of thus far is to stop treating j.l.String as being equivalent to c.l.String, of course this would mean that Java interop would be far less nice, having to do javaObject.name.string for example.
There are several way forward if we would do this split, but first I think it's best if you, @gavinking , think about this a bit and tell us if you agree with our assessment or not.

@CeylonMigrationBot
Copy link
Author

[@gavinking] So, after some playing to see what the interop code actually does, it appears that there is no actual hole in the type system, since the following code all compiles and runs:

import java.lang { Str=String }
import java.util { List, ArrayList }

List<Str> slist = ArrayList<Str>();
slist.add(Str("hello"));
slist.add(Str("goodbye"));
for (i in 0..slist.size()-1) {
    Str s = slist.get(i);
    print(s);
}

The compiler doesn't treat java.lang.String as ceylon.language.String, or vice-versa. Which is correct. A List<Str> is simply not a List<String>. Even creating a subclass of ArrayList<Str> works out correctly.

So it seems that:

  1. The actual bug here, such as it is, is just that the model loader tries to map Foo<java.lang.String> to Foo<ceylon.language.String>. I can't really see how that's ever going to be correct even with all the bending and twisting in the world, so I think we should simply stop doing that.
  2. This opens the question of whether it's still OK to map plain java.lang.String to ceylon.language.String when it appears directly in a signature (i.e. not as a type argument). I don't see why not, but perhaps I'm missing something.
  3. Finally, we arrive at what I suppose is the real issue here: is it completely unacceptable from a usability perspective that we sometimes have some Java APIs giving us back plain java.lang.String instances? Well, it's certainly going to get in the way when it happens, but I don't have a strong intuition on how common this is. How often do we really encounter java.lang.String as a type argument in a Java API?

Now, lets say we discover that Array<java.lang.String> is popping up all over the place in Java APIs. Then one possible solution is to box it. Introduce a Ceylon type named Strings in the language module and box Java java.lang.String[] to that type. But how to generalize this idea? Well, what's really interesting about this solution is that for the cases which I think well probably arise, there are some really obvious types to box them to:

  • Java java.lang.String[] to Ceylon Array<String>
  • Java java.util.List<java.lang.String> to Ceylon List<String>
  • Java java.util.Set<java.lang.Set> to Ceylon Set<String>

Now, of course, the nice thing about this is it also neatly resolves the problem of not being able to iterate java.util.List<java.lang.String> in a for loop. So it seems like we should probably do this kind of boxing all the time, and merely specialize it when the type argument is java.lang.String. So indeed this issue ties in with something which has been in the back of my mind for a long time, which is to map Java collection types to Ceylon collection types.

Yes, I'm ignoring the mutability issue for now, and there might be some other problems here that I'm missing.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

For primitives this problem doesn't exist, but we decided to not treat j.l.Long and friends as equivalent to the primitives for precisely the reason that it doesn't work.

I'm still not convinced that this is the right thing, and we might need to revisit it. Certainly we can't have model loader do to Long what it is currently doing to String, but if we stopped treating type args in this broken way, I don't see why a method that simply returns java.lang.Long can't be transparently mapped to Ceylon Integer.

Yet we're doing it to j.l.String when IMO we shouldn't even be using it in c.l.String and rather use int[].

There's certainly a good argument for that. I'm conflicted.

Tako says that we're erasing to java strings because the JVM optimises them

Well, we don't know the details of this optimization or even if it really exists.

@CeylonMigrationBot
Copy link
Author

[@gavinking]

I think we should just stop treating j.l.String as a c.l.String and make them different types, and not erase one to the other.

...

The only thing we've been able to think of thus far is to stop treating j.l.String as being equivalent to c.l.String, of course this would mean that Java interop would be far less nice, having to do javaObject.name.string for example.

I mean, it's an option, and it would not be the end of the world, but hell, all those calls to Str(s) and s.string are going to add up pretty quick, it seems to me.

@CeylonMigrationBot
Copy link
Author

[@FroMage]

The actual bug here, such as it is, is just that the model loader tries to map Foo<java.lang.String> to Foo<ceylon.language.String>

No it doesn't, it (confusingly) reports that: Foo<String> is not assignable to Foo<String>. I thought I could make that work, but it just can't.

The problem I see with trying to autobox ceylon and java strings is that it doesn't always work:

import java.lang { Str=String }
import java.util { List, ArrayList }

List<Str> slist = ArrayList<Str>();
slist.add(Str("hello"));
slist.add(Str("goodbye"));
for (i in 0..slist.size()-1) {
    Str s = slist.get(i);
    print(s);
}

It's not at all obvious to the users why the following autoboxing doesn't work:

import java.lang { Str=String }
import java.util { List, ArrayList }

List<Str> slist = ArrayList<Str>();
slist.add("hello"); // autoboxing
for (i in 0..slist.size()-1) {
    String s = slist.get(i); // autoboxing
    print(s);
}

Yet it doesn't.

When we do autoboxing of java strings to/from ceylon strings we give the impression that the two are interchangeable, which is indeed the case for Java's autoboxing (except for null), but then users will notice that there are some situations where autoboxing doesn't work and they will point at it and say: "look, this is inconsistent".

@CeylonMigrationBot
Copy link
Author

[@gavinking]

The actual bug here, such as it is, is just that the model loader tries to map Foo<java.lang.String> to Foo<ceylon.language.String>

No it doesn't, it (confusingly) reports that: Foo<String> is not assignable to Foo<String>.

OK, I was just going by the initial description of the issue, which seems to say that:

we assume that when used as a type parameter, String should be mapped to ceylon.language.String, which is just not always the case. We need to fix that.

If that's not the case (i.e. we in fact don't always assume that), then there is no "actual bug" here at all, just the observations that:

  1. sometimes the typechecker should use qualified names in its error messages (easy fix), and
  2. the question of whether its unacceptably confusing that the we usually, but not always, expose Java strings in Java APIs as Ceylon strings.

Again, I simply don't know the answer to 2. I'm inclined to think it might not be a really big issue.

@CeylonMigrationBot
Copy link
Author

[@FroMage] You did fix one of the error messages, but there's another confusing one for this case:

    List<JString> jstringList = java.jstringList();
    jstringList.add("foo");

(Where java.jstringList() returns a java.util.List<java.lang.String>)

test-src/com/redhat/ceylon/compiler/java/test/interop/Types.ceylon:323: error: argument must be assignable to parameter 
arg0 of add: String is not assignable to String?
    jstringList.add("foo");
                    ^

@CeylonMigrationBot
Copy link
Author

[@gavinking] > You did fix one of the error messages, but there's another confusing one for this case:

Right, there are a number of cases like this. But I've no idea how to detect them in general. I don't want to always use fully-qualified names in error messages.

@CeylonMigrationBot
Copy link
Author

[@FroMage] 558111626f0ca3e5b7767c7d95ba0d2c5c0f868e and 775c582cd6fdded3656b3cda1dd0203b7fc7f4d8 were for this issue. Now fixed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants