Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Fix threadsafety documentation for SequenceBuilder, SequenceAppender and StringBuilder classes #382

Closed
wants to merge 1 commit into from

8 participants

@loicrouchon
Collaborator

Documentation of classes SequenceBuilder, SequenceAppender and StringBuilder mentions that those classes are threadsafe, but it's not the case.

@loicrouchon loicrouchon Fix threadsafety documentation
for SequenceBuilder, SequenceAppender and StringBuilder classes
8c34b25
@tombentley
Collaborator

@ceylon, @loicrouchon has updated the documentation on SequenceBuilder and StringBuilder to reflect the fact that the implementations are currently not thread-safe. I seem to recall there was a difference in opinion about what we really wanted to do here, and I'm not sure it actually got resolved. I recall @gavinking wanted thread safety, and @FroMage argued against it. We really do need a consensus before applying this patch.

@gavinking
Owner

Grrr. So this was a change made to SequenceBuilder and SequenceAppender that I never agreed with, and in fact was only dimly aware had actually been made. I think we should roll it back. Just look at the proposed documentation change: it talks about multithreading, when there is no notion of threads in the language module, and about synchronization, something which is impossible to achieve using language module APIs.

We should make these classes threadsafe again, which was how they were originally conceived. On my machine that results in almost no performance impact.

@gavinking
Owner

I seem to recall there was a difference in opinion about what we really wanted to do here, and I'm not sure it actually got resolved.

Well apparently it got resolved by @quintesse removing the synchronized keywords but not actually updating the documentation to reflect the change, so that everyone not looking at the Java implementations thought they were still threadsafe.

@tombentley
Collaborator

AFAICS, perf is the only reason not to make it thread-safe. If we made it thread-safe we'd basically be relying on the JVM to do "lock elision" to make it performant. I think that's a relatively recent optimization in hotspot (Java 7, so not really a problem for us), but it probably isn't supported in other JVMs (e.g. dalvik).

@gavinking
Owner

I think we should deprecate these three classes and eventually remove them completely. They're not really used by the Ceylon sources in the language module, though they are used in a few places, unnecessarily, I imagine, in the Java sources.

  • What SequenceBuilder does can be done equally well by ceylon.collection::ArrayList or ceylon.collection::LinkedList.
  • What StringBuilder does can be done using ArrayList<Character>/LinkedList<Character> and String.join() but if we find that this doesn't perform acceptably, we could add a specialized CharacterList class somewhere, I suppose.
@quintesse
Owner

If there is supposedly no performance impact then we could just as well make every method we generate synchronized, right? I mean, I think it's weird to only give some special classes the benefit of thread-safety and most of the others not (none of the user defined classes for example). That way at least execution would be predictable.
(PS: Not that I'm actually advocating this, I'm sure there are very measurable performance issues in certain cases)

@tombentley
Collaborator

I expect they're also used implicitly in the code generated for some expressions. Replacements being in ceylon.collection isn't an option for those uses, so we'd need the replacements to be elsewhere in the language module (i.e. com.redhat.ceylon...).

@quintesse
Owner

What SequenceBuilder does can be done equally well by ...

@gavinking , still think we don't need some kind of consing in the base language(module)? It is such a basic action that needing to import the collection module for that seems overkill to me.

@gavinking
Owner

So this (silly) test code shows a small performance difference between ArrayList and SequenceBuilder on my machine:

shared void run1() {
    value nanoseconds = system.nanoseconds;
    value arrayList = ArrayList<Integer>();
    for (i in 0..10M) {
        arrayList.add(i);
    }
    value sequence = arrayList.sequence;
    print((system.nanoseconds-nanoseconds)/1M);
}

shared void run2() {
    value nanoseconds = system.nanoseconds;
    value sequenceBuilder = SequenceBuilder<Integer>();
    for (i in 0..10M) {
        sequenceBuilder.append(i);
    }
    value sequence = sequenceBuilder.sequence;
    print((system.nanoseconds-nanoseconds)/1M);
}

SequenceBuilder is ever so slightly faster, 3600 instead of 4800. Doesn't sound like enough of a difference to justify its existence....

@gavinking
Owner

If there is supposedly no performance impact then we could just as well make every method we generate synchronized, right?

No of course not. Every multithreaded program with shared state would deadlock immediately.

@gavinking
Owner

I expect they're also used implicitly in the code generated for some expressions.

Really?! Are you sure about that?

Anyway, even if they are, that only justifies the existence of their Java implementations, not the Ceylon schemas.

@gavinking
Owner

still think we don't need some kind of consing in the base language(module)?

Eh? There is a language construct for cons:

 function cons<E>(E[] es, E e) => [e, *es];

Along with an actual Cons class, named Tuple!

@gavinking
Owner

@quintesse Note that there is also List.withLeading() and List.withTrailing(). Now, right now, they make use of SequenceBuilder, so of course we would need to arrive at an alternative implementation. It looks to me like they can be easily implemented using Array or ArraySequence.

@quintesse
Owner

There is a language construct for cons

Didn't know about this one to be honest.

Still it's one of those things where I would have liked to see some syntax sugar, something like:

value newxs = x + xs;

but as long as the following results in efficient enough code that would be okay too:

value newxs = [x, *xs];

The examples with withLeading() and withTrailing() work of course but they are so verbose. We have such a nice syntax for creating Tuples that it seems a pity we don't have equally nice syntax for consing/joining.

Btw, unrelated to this issue, but we still have the problem that we can't really make long sequences this way without getting stack overflows.

@quintesse
Owner

is ever so slightly faster

That's a 25% performance gain! I don't consider that "slight"

@loicrouchon
Collaborator

@gavinking , still think we don't need some kind of consing in the base language(module)? It is such a basic action that needing to import the collection module for that seems overkill to me.

I completely agree with this, it's a so basic construct, so I think we should keep those classes.
FTR I was about to commit a fix to ceylon.build.tasks.ceylon to remove the dependency to ceylon.collection by using SequenceAppender, but I will wait for this discussion to end before doing so push my commit :)

Just look at the proposed documentation change: it talks about multithreading, when there is no notion of threads in the language module, and about synchronization, something which is impossible to achieve using language module APIs.

@gavinking well, if there's no notion of thread in the language module, it makes more sense not to support thread-safety than to support it.
But I agree that it might be confusing to mention it only for those classes, so maybe we should just not say whether they are or not thread-safe

@FroMage FroMage added this to the 1.1 milestone
@FroMage
Owner

Guys what is the conclusion here? Are they thread-safe? Do they need to be? My opinion is that nothing can really be considered thread-safe until we define threads.

@quintesse
Owner

To be honest to me it seems strange that builders are thread-safe. Java's StringBuffer is but it was soon superseded by StringBuilder which isn't. And I agree with @FroMage that until the time we have defined how to do threading in Ceylon we should not worry about this (which means make no statements about thread-safety at all, except by saying there ain't any)

@tombentley
Collaborator

until we define threads

Not wanting to put words in Gavin's mouth, but I can't see that ever happening, at least as far as the spec is concerned.

I wonder if a workable compromise would be to subclass those builders in the java interop module making the subclasses synchronized.

@quintesse
Owner

Well to me the issue remains: why do this for these classes? What makes them so special that we'll make them synchronized but nothing else? And why do it this way when surely it will never be the way Ceylon handles threading (in whatever way).

@tombentley
Collaborator

What makes them so special that we'll make them synchronized but nothing else?

I think @quintesse because those are the only classes in the language module with mutable state. And basically because the JVM supports threading we need some policy for the language module about how they should behave when accessed concurrently. It just happens that that policy (whatever it is) will only affect these classes.

I'm not sure how helpful getting rid of SequenceBuilder, SequenceAppender and StringBuilder would really be. I'm sure we could remove them (or make them non-shared), but I bet we'll end up sprouting similar things in ceylon.collection or somewhere else. The fact is they can be pretty useful. StringBuilders I tend to use quite frequently when constructing complex strings where I need to branch. I must admit that I've never to my knowledge used SequenceAppender though.

@quintesse
Owner

It's not the mutability that makes them special, anyone can make mutable classes, they would be special only because they are part of the language module and far far worse in my opinion nobody would be able to write classes that behave in the same way! Only those hand-written classes in the language module would have this magic thread-safe property.
So IMO either we expose a way to make classes synchronized (or otherwise thread-safe) or we leave them with the same basic problems that would have any other user-written class in Ceylon.
The JVM has threading yes but we can't just have some classes that work and some that don't, that seems like a non-solution to me.

@tombentley
Collaborator

You#re right: the fact that people can't write their own versions that are synchronized is an example of a more general interop problem. I think we should add a synchronized annotation to ceylon.interop.java which would let people write synchronized things on the JVM.

@gavinking
Owner

Actually the only really good reason why these are in the language module rather than the SDK is that they are implemented natively. It looks to me like SequenceBuilder could definitely be reimplemented in Ceylon, and I bet the same is true for StringBuilder. So I don't see why you guys couldn't write a ceylon.builder module in Ceylon and remove these classes from ceylon.language.

(There's a couple of places in Iterable where SequenceBuilder is used, but we probably don't absolutely need it there, and even if we do, that doesn't mean it has to be shared. Note that since this discussion happened, we stopped using it in List.)

I think it would be good to make this change because:

  1. it slightly unbloats the language module, and
  2. it removes the only mutable classes (except Array) from the language module.

But if we're going to do it, we have to do it now.

@gavinking
Owner

You're right: the fact that people can't write their own versions that are synchronized is an example of a more general interop problem. I think we should add a synchronized annotation to ceylon.interop.java which would let people write synchronized things on the JVM.

Disagree. In general synchronized is evil and shouldn't be used directly. There are much better constructs in java.util.concurrent that are safer and can be used from Ceylon. (ReentrantLock, for example.)

@quintesse
Owner

@tombentley although that would certainly help a bit it isn't real synchonization, almost nobody who really thinks about concurrency uses method-level synchonization, it's almost always the wrong granularity. (Which is one thing I forgot to mention above). We'd need the ability, like in Java, to do it at code-block level and to choose our lock object. Really, I think we should leave concurrency issues for the future when we are really prepared to deal with them. Up till then people can just fall back on using Java code if they really need JVM synchonization.

@quintesse
Owner

safer and can be used from Ceylon

Indeed

@tombentley
Collaborator

I'm concerned about StringBuilder. We're going to end up passing an Iterable<Character> to the constructor of ceylon.language::String and that internally creates a java.lang.StringBuilder, appends the characters to it, then calls toString(), and that copies the array again. Needing three char[] seems suboptimal to me. We may be able to improve on that if we had a package-private ArrayIterable which we'd return from Array.take() and which we could check for in the constructor of ceylon.language::String.

@FroMage
Owner

I don't care if we remove SequenceBuilder since that's essentially what mutable collections are for, but StringBuilder is something everyone uses. On the other hand, so is MutableSet and that's not in the language module, so we already require the SDK for anything more than hello worlds.

The thing is that it may make sense for StringBuilder to remain in the language module if it is used for the implementation of string interpolation. I don't recall.

@tombentley
Collaborator

I think we use a java.lang.StringBuffer() for string templates.

@FroMage
Owner

We have such faith in our code :(

@tombentley
Collaborator

It's not a lack of faith: I foresaw this day when @gavinking wanted to take StringBuilder out of the langugage.

@FroMage
Owner

Haha ;)

@quintesse
Owner

On the other hand, so is MutableSet and that's not in the language module, so we already require the SDK for anything more than hello worlds.

Still think those should be in the language module :)

Or... alternatively that a non-minimal Ceylon install includes the most essential sdk modules.

Anyway, this is not the place to discuss that ;)

@gavinking
Owner

It looks to me like SequenceBuilder could definitely be reimplemented in Ceylon, and I bet the same is true for StringBuilder.

I'm concerned about StringBuilder. We're going to end up passing an Iterable<Character> to the constructor of ceylon.language::String and that internally creates a java.lang.StringBuilder, appends the characters to it, then calls toString(), and that copies the array again.

You're right. In fact, I'm likely wrong in my "bet". There might not be an efficient way to implement StringBuilder in pure Ceylon, not without some kind of special support in the language module.

But what about if we had a single kind of ArrayBuilder object, implemented in pure Ceylon, using an underlying Array, and the constructors of String and ArraySequence knew what to do with it? I mean, the internal implementations of StringBuilder and SequenceBuilder do basically the same thing so there's no really good reason for having two classes, frankly.

So you would write something like:

 value builder = ArrayBuilder<Character>();
 builder.append("hello");
 builder.add(' ');
 builder.append("world");
 String string = String(builder);

Hell, we could even add an alias:

 shared class StringBuilder() => ArrayBuilder<Character>();
@tombentley
Collaborator

This is basically the idea I'm exploring at the moment. We'd need to expose (at the JVM level) a special ArrayIterable which we could check for in the String and ArrayList constructor. That would provide access to the underlying array. To work with builders we actually need ArrayIterable.take() to also return an ArrayIterable (because the char[] or whatever is usually bigger than the number of characters that are in it), but it looks like it would work.

We'd also need to consider how this pans out on JS.

The only question with your idea of a single ArrayBuilder is that it's API might be a little clunky for use when building a String. The appendAll(List<Element> list) method would have to use a boxed Ceylon String, which would lead to more boxing that having a dedicated StringBuilder and a non-generic appendAll(String) method.

@tombentley tombentley referenced this pull request from a commit
@tombentley tombentley Issue #382 Add AbstractArrayIterable which abstracts iteration over a…
…n array with parameterizable start, length and step. Make the existing java.lang::ObjectArray use this where it was using it's own iterable before. Add ArrayIterable as a subclass of AbstractArrayIterable and make Array.by(), Array.take() and Array.skip() use it.
ea2188c
@tombentley tombentley referenced this pull request from a commit
@tombentley tombentley Issue #382: Implement StringBuilder in Ceylon using an Array<Character>.
In the constructor for String check if the characters are an
`ArrayIterable<Character>` with unit step: if so we can just use the
Java constructor.
d46481d
@tombentley
Collaborator

I've added a Ceylon implementation of StringBuilder (the JS runtime is not using it yet, but the JVM one is). Next thing is to try to remove SequenceBuilder and SequenceAppender. Then we can look at moving StringBuilder to the SDK, if people are up for that?

@gavinking
Owner

I've added a Ceylon implementation of StringBuilder

Cool!

@chochos
Collaborator

Hrm. This is going to mess up my backend badly. I use StringBuilder for the interpolated string templates. Well I'll have to change that to something else.

@tombentley
Collaborator

SequenceBuilder is used in the transformation of spread member and spread invoke on the JVM (I don't know about JS, @chochos?). I don't think it makes sense to refactor the transformation because, essentially, something like SequenceBuilder is exactly what's required. We still don't need to expose it from the language module though.

@gavinking
Owner

something like SequenceBuilder is exactly what's required

Sure, but id it's only for internal use, it can be much more simplistic and lightweight.

@chochos
Collaborator

Spread invoke in js uses a special callable. No SequenceBuilder involved

@tombentley
Collaborator

Once upon a time @gavinking suggested that Sequence be an enumerated type (#198) if that's still on the cards then what do we propose to do about all the things (mainly in the SDK) which currently use a sequence builder to return a Foo[]? While the natural thing to do would be to use a MutableList, a List is not a Sequence. If we're going to do #198 then we should make those things return List<Foo> now to avoid future breakage. If not they can stay as Foo[], but we'll at least need a SequenceBuilder-like thing available to the SDK.

@gavinking
Owner

@tombentley I don't understand why that change would be necessary. It is trivial to get a sequence with the same elements as a List.

@tombentley
Collaborator
@tombentley
Collaborator

I've pushed to "sans-SequenceBuilder" branches of ceylon.language, ceylon-compiler and ceylon-sdk. A handful of the compiler tests still fail, which I'll continue to fix up, but we'll also need @chochos to make the changes for JS before we can merge.

@tombentley
Collaborator

compiler tests now pass, except LanguageSuite, which fails right at the end (and not when run directly via ant in ceylon.language) with:

Exception in thread "main" java.lang.RuntimeException: com.redhat.ceylon.compiler.java.runtime.Main$ClassPath$ModuleNotFoundException: Module com.redhat.ceylon.typechecker/1.1.0 not found
at com.redhat.ceylon.compiler.java.runtime.Main.registerInMetamodel(Main.java:449)
at com.redhat.ceylon.compiler.java.runtime.Main.setupMetamodel(Main.java:425)
at com.redhat.ceylon.compiler.java.runtime.Main.runModule(Main.java:377)
at com.redhat.ceylon.compiler.java.runtime.Main.main(Main.java:483)
Caused by: com.redhat.ceylon.compiler.java.runtime.Main$ClassPath$ModuleNotFoundException: Module com.redhat.ceylon.typechecker/1.1.0 not found
at com.redhat.ceylon.compiler.java.runtime.Main$ClassPath.loadModule(Main.java:249)
at com.redhat.ceylon.compiler.java.runtime.Main.registerInMetamodel(Main.java:445)
... 3 more

Anyone have any clues -- it looks like the module is not where it's expected to be

@FroMage
Owner

This test uses the modules from the ceylon-dist repo. I guess it's missing?

@tombentley
Collaborator

I've done both an ant siblings package and an ant publish-all. Maybe I need a ant @quintesse only knows

@quintesse
Owner

Well you can see if it's there, right? And it's always just ant clean publish-all, never anything else ;)

@tombentley
Collaborator

I have a dist/repo/com/redhat/ceylon/typechecker/1.1.0/com.redhat.ceylon.typechecker-1.1.0.jar in my ceylon-dist.

@tombentley
Collaborator

@chochos, when you're over your jet lag...

@chochos
Collaborator

So what do I need to do exactly? Remove all references to SequenceBuilder and that's it? StringBuilder stays?

@chochos
Collaborator

I notice a new file sequential.ceylon - did you guys by any chance remember that some of us have case-preserving but case-insensitive filesystems?

@tombentley
Collaborator
@chochos
Collaborator

I'll fix it then, just put both Sequential and sequential in the same file.

@lucaswerkmeister

To be clear, what files are you talking about? Sequential.ceylon in src and sequential.js in runtime-js? But these are in different folders – when do they conflict?

@chochos
Collaborator

No, Sequential.ceylon and sequential.ceylon both in src/ceylon/language - if you only see one it's because you also have a case-insensitive filesystem.

@chochos
Collaborator

There's no sequential.js in runtime-js BTW, it's sequences.js

@lucaswerkmeister

Then Github and I must be missing some commits… Here’s sequential.js in runtime-js on master. It’s present at the time of writing. And I don’t see a sequential.js in src/ceylon/language, neither on Github nor on my local copy (Linux, case-sensitive).

@lucaswerkmeister

Or are you on sans-sequenceBuilder?

@lucaswerkmeister

Ah, there it is. (But there’s still a sequential.js in runtime-js.)

@chochos
Collaborator

It's under runtime-js/replaced that's dead code I just kept around in case we need it again.

@lucaswerkmeister

dead code I just kept around in case we need it again.

Isn’t that why we use version control? ;-)

@chochos
Collaborator

There seems to be a circular reference issue with the Iterable.sequence attribute. Maybe you don't have it in JVM but in JS it's causing stack overflows: Iterable.sequence calls sequential which in turn calls Iterable.sequence... I think I can break it by refining Array.take but it can cause stack overflows in other implementations of Iterable which don't refine take

@FroMage
Owner

IIRC @tombentley fixed a similar issue recently for the JVM, no?

@tombentley
Collaborator

Er, the one thing that the sequential method does not do is call Iterable.sequence.

@tombentley
Collaborator

Oh, but you'll need to override your Array.sequence to not inherit Iterables -- I guess you've not done that.

@FroMage
Owner

“does not” or “should not”?

@FroMage
Owner

Doesn't everyone need to do this then?

@tombentley
Collaborator

Only everyone who's implementing Array. So just us and @chochos.

@chochos
Collaborator

Iterable.sequence calls sequential, which in turn calls Iterable.take, which in turn creates an internal iterable object on which sequence is called, so sequential is called, etc.

I already refined Array.sequence but that's not the problem; to fix this, I had to implement Array.sequence

@tombentley
Collaborator

@chochos Array.take, Array.by, Array.skip all need to be refined.

@FroMage
Owner

What @chochos seems to be saying is that everyone implementing Iterable should refine those or get infinite loops.

@tombentley
Collaborator

@FroMage: No sequential only calls take() on an Array. It's only Array that needs to override those things.

@chochos
Collaborator

I think I've mostly got it but some tests still fail because now String.sequence uses sequential (because it's not refined) and since String is already a Character[], now String.sequence returns itself (it used to return a real sequence of its characters). I still can't figure out what you guys did in JVM so that this doesn't happen there.

@chochos
Collaborator

ah, you didn't do anything - type erasure is why this doesn't happen in JVM.

@chochos chochos referenced this pull request from a commit
@chochos chochos work on #382 77aca17
@chochos chochos referenced this pull request from a commit
@chochos chochos remove SequenceBuilder to fix #382 f8287ba
@chochos
Collaborator

you can now merge that branch. I still have an issue that just came up when running sequential along with indexed but it's unrelated to this.

@chochos chochos closed this in f8287ba
@tombentley
Collaborator
  • The language module no longer has a SequenceBuilder
  • the StringBuilder is native Ceylon (and could therefore be moved to the SDK).
  • StringBuilder is the only thing with mutable state in the language module, but since it's native it doesn't make much sense to talk about it being thread-safe or not.
@chochos
Collaborator

Sorry didn't mean to close this yet

@chochos chochos reopened this
@tombentley
Collaborator

It was me pushing your commit with the magic words in the commit message.

@chochos
Collaborator

So, has this branch been merged into master already?

@chochos
Collaborator

Seems the ceylon StringBuilder doesn't work in JS, probably because chars and strings get mixed up, but I haven't found out exactly where.

@chochos
Collaborator

Shit, I hadn't noticed you had merged and I pushed some stuff to the branch after that. When I find the problem with StringBuilder (which could be a problem with my String actually) should I push directly to master or to this branch and then merge again?

@tombentley
Collaborator

I merged it, yes. Because it seemed that no one was really against it. And you said "you can now merge that branch" so I assumed we were ready to merge.

@tombentley
Collaborator

I'd push onto master as this point

@chochos
Collaborator

OK

@chochos
Collaborator

I guess I'll open a new issue then

@chochos chochos closed this
@chochos chochos referenced this pull request from a commit
@chochos chochos remove use of SequenceBuilder for #382 adb6935
@chochos chochos referenced this pull request from a commit
@chochos chochos compile StringBuilder #382 8ee4b8b
@chochos chochos referenced this pull request from a commit
@chochos chochos remove StringBuilder #382 f4a29a8
@akberc

+1 for

Or... alternatively that a non-minimal Ceylon install includes the most essential sdk modules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Dec 30, 2013
  1. @loicrouchon

    Fix threadsafety documentation

    loicrouchon authored
    for SequenceBuilder, SequenceAppender and StringBuilder classes
This page is out of date. Refresh to see the latest.
View
16 src/ceylon/language/SequenceBuilder.ceylon
@@ -1,7 +1,11 @@
"Since sequences are immutable, this class is used for
constructing a new sequence by incrementally appending
- elements to the empty sequence. This class is mutable
- but threadsafe."
+ elements to the empty sequence.
+
+ This class is mutable and not threadsafe.
+ If multiples threads use the same `SequenceBuilder`
+ instance, an external synchronization mechanism
+ has to be put in place."
see (`class SequenceAppender`,
`function concatenate`,
`class Singleton`)
@@ -35,8 +39,12 @@ shared native class SequenceBuilder<Element>() {
"This class is used for constructing a new nonempty
sequence by incrementally appending elements to an
existing nonempty sequence. The existing sequence is
- not modified, since `Sequence`s are immutable. This
- class is mutable but threadsafe."
+ not modified, since `Sequence`s are immutable.
+
+ This class is mutable and not threadsafe.
+ If multiples threads use the same `SequenceAppender`
+ instance, an external synchronization mechanism
+ has to be put in place."
see (`class SequenceBuilder`)
shared native class SequenceAppender<Element>([Element+] elements)
extends SequenceBuilder<Element>() {
View
8 src/ceylon/language/StringBuilder.ceylon
@@ -1,7 +1,11 @@
"Since strings are immutable, this class is used for
constructing a string by incrementally appending
- characters to the empty string. This class is mutable
- but threadsafe."
+ characters to the empty string.
+
+ This class is mutable and not threadsafe.
+ If multiples threads use the same `StringBuilder`
+ instance, an external synchronization mechanism
+ has to be put in place."
shared native class StringBuilder() {
"The resulting string. If no characters have been
Something went wrong with that request. Please try again.