Make an RFC for immutability and freshness in the standard library #22

LiberalArtist · 2019-07-17T02:18:57Z

Racket 2 presents a good opportunity to revisit which standard library functions return mutable values and which are guaranteed to return freshly-allocated values.

Spoiler: I think more things should be immutable and freshness should rarely be guaranteed.

My particular pet issue is strings. I don't think I have ever mutated a string except to demonstrate bugs caused by the mutability of strings. I strongly suspect most other Racketeers also reason about strings as though they were immutable. Yet almost every standard library function returns mutable strings: for example, (immutable? (string-append "a" "")) and even (immutable? (string-append "" "")) both return #false. I think that should change. My primary motivation is the ease of reasoning that immutable values provide, but the change might also expose additional optimization opportunities.
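
The behavior described above can be checked with a short sketch. The names `appended` and `frozen` are only illustrative; `string->immutable-string` is the existing conversion in `racket/base`.

```racket
#lang racket/base

;; The result of string-append is a mutable string, even though
;; both arguments here are immutable literals.
(define appended (string-append "a" ""))
(immutable? appended)   ; #f

;; Today an immutable result has to be requested explicitly.
(define frozen (string->immutable-string appended))
(immutable? frozen)     ; #t
```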

The question applies to other datatypes, as well. I suspect bytes may be mutated much more often than character strings, but I would still like better support for working with immutable bytes. For example, path->bytes could return immutable bytes, or else a path->immutable-bytes could be added. Maybe bytes-append should return immutable bytes, or there should be an immutable-bytes-append, or the mutability of the result should depend on the mutability of the arguments (though I personally think that would be confusing, especially in a dynamically-typed context). Basically the same trade-offs exist for vectors.

jackfirth · 2019-07-17T04:52:30Z

Strongly agree. I think we should do what we did for lists and split all mutable versions of the existing types into entirely separate modules. For example, if you want to use mutable strings, you should have to use a mutable-string? type and mutable-string-* functions.

sorawee · 2019-07-17T05:58:19Z

> I think we should do what we did for lists and split all mutable versions of the existing types into entirely separate modules

I'm not sure I like this idea, because it really clutters modules with identifiers. Consider hash maps as an example. Right now we have |{weak, mutable, immutable}| x |{eq, equal, eqv}| = 9 variants, which is absurd IMO. For constructors, I think there should be only one, with the variant specified by a keyword argument.

But things do get more complicated when it comes to a function that produces a new object, like string-append. For hash maps, there's not much of a problem[*] because there is a normal variant and a !-variant of each operation, where the !-variant always mutates the object in place. string-append is weird because string-append! doesn't make sense (unless we are talking about C's strcat)...

Also, we have a proposal that things should be more generic (#19). This proposal, on the other hand, suggests that things should be split even more. These two goals seem to conflict.

[*]: actually there's one: I always wish for/hash to have an option to return a mutable hash.

rocketnia · 2019-07-17T11:35:58Z

I think the places mutable strings and mutable bytes matter most are where people have taken it upon themselves to optimize their code, at which point generic operations are less desirable because they're not as optimized as the specialized operations are. Conflict between the goals is natural in this case, and I'm optimistic that everything can have a place.

There are sort of three layers: Libraries that define the most efficient operations for a data structure (some of which probably should do whatever they need for maximum efficiency, but the wilder ones should probably have longer names), libraries that define operations that almost surely will never be efficient or recommended (so are shunned from the main library) but are still too convenient to completely abandon, and generic operations. The generic operations have to be exposed by the first (efficient) library because that's where the data type is exposed, but they'll often be inefficient guilty pleasures like the operations in the second library. Even worse, sometimes they won't even be particularly convenient or meaningful, just there because they fill a gap in the Expression Problem tableau. The sheer comprehensiveness of the tableau can be very convenient even if not all the entries are star performers.

dbenoit17 · 2019-07-17T13:08:20Z

I think genericness may not be mutually exclusive with expressiveness. Consider Rust, which has the mut keyword as a modifier property of types. In such a schema, mutable-string is equivalent to mut String, where String is the base type and the mutability property is a modifier. By making mutability a special property of types, Rust can also perform checks at compile time and emit warnings when mutability is unnecessary in a given circumstance.

In cases where one must check for a mutable string type, mutable-string? is really a combination of the contracts (and/c string? mutable?). Identifiers like mutable-string? are in a sense just built-in sugar over the contract combinator. If identifier cluttering is a concern, perhaps a solution is to use contract-call forms which could be more generic than built-in identifiers.

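(require racket/contract)

;; Apply the conjunction of the given flat contracts to a value.
;; mutable? is assumed to exist here; it is not defined in this snippet.
(define-syntax-rule (and/c? [contract ...] value)
  ((and/c contract ...) value))

;; true-expr and false-expr are placeholders for the two branches.
(if (and/c? [string? mutable?] "immutable string")
    true-expr
    false-expr)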

dbenoit17 · 2019-07-17T13:11:30Z

@rocketnia I think the concern over performance of generics is well-founded, and is something we should discuss in the generics RFC. Performance impact may or may not be possible to avoid, depending on which dispatching operations can be done at compile time.

gus-massa · 2019-07-17T15:22:07Z

> My particular pet issue is strings. I don't think I have ever mutated a string except to demonstrate bugs caused by the mutability of strings.

I agree.

I like the idea of making most (or all) functions return an immutable value, which need not be fresh. I guess this covers 99% of my uses of strings and bytes, though vectors are more likely to be mutated. It is also optimizer friendly, but I guess it would be useful to add some cases that reduce expressions like (string-copy (string-copy x)).

I'd like to drop the immutable? function because it has a weird idea of what is immutable. Let's split it into string-immutable?, ... (or string-mutable?, ...).

Nevertheless, I'd like to keep both an immutable version of "" and a mutable version of "". I'm never sure if this is a good idea, but I like it.

rmculpepper · 2019-07-17T16:44:57Z

I like the idea of making more things immutable by default and guaranteeing freshness in fewer places.

I don't like the idea of splitting existing primitive operations like string-ref into string-ref and mutable-string-ref (namespace pollution). For constructors, though, it might make more sense to have different names.

I would propose keeping immutable? and adding generic mutable and immutable procedures that convert (shallowly!) between immutable and mutable kinds of data types. For example, to join two strings into a mutable string: (mutable (string-append str1 str2)). If most operations return immutable values, then uses of immutable should be rare---probably mostly at the end of code that uses mutation to construct a value. If mutation is rare, then mutable should be fairly rare. And presumably the compiler can watch for such combinations and eliminate copies in some cases.

Similarly, fresh could return a value equal (and of the same mutability) but not eq to its argument.
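
A minimal sketch of how the proposed `mutable` and `immutable` procedures might look for strings, bytes, and vectors. Both procedures are hypothetical (they are the names suggested above, not existing library functions); the copy and conversion functions they call are existing Racket functions. Other data types, and `fresh`, are left out.

```racket
#lang racket/base
(require racket/vector) ; for vector-copy

;; Hypothetical: return a fresh, shallow, mutable copy of v.
(define (mutable v)
  (cond
    [(string? v) (string-copy v)]
    [(bytes? v) (bytes-copy v)]
    [(vector? v) (vector-copy v)]
    [else (raise-argument-error 'mutable "(or/c string? bytes? vector?)" v)]))

;; Hypothetical: return an immutable value with the same contents as v.
;; The underlying conversions may return v itself when it is already
;; immutable, so no freshness is promised.
(define (immutable v)
  (cond
    [(string? v) (string->immutable-string v)]
    [(bytes? v) (bytes->immutable-bytes v)]
    [(vector? v) (vector->immutable-vector v)]
    [else (raise-argument-error 'immutable "(or/c string? bytes? vector?)" v)]))

;; Build a value with mutation, then freeze it at the end.
(define buf (mutable "abcd"))
(string-set! buf 0 #\x)
(define result (immutable buf))
```

In this sketch the conversion always copies when going to a mutable value; a compiler that could prove the argument is unshared might skip that copy, as suggested above.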

spdegabrielle · 2019-07-18T14:10:26Z

#37 closed because covered by #22

> Drop more guarantees about new object allocations: for example, allow (append lst '()) to return lst, and the same for other list, string, bytes, vector functions (drop 0 items from the end of a list, substring and regexp-replace that don't change a string, etc).
> @elibarzilay 3 Oct 2012
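
As a rough illustration of what dropping the freshness guarantee would permit, here is a sketch of a two-argument append that returns its first argument unchanged when there is nothing to add. The name `append/no-fresh` is hypothetical and only stands in for the relaxed behavior.

```racket
#lang racket/base

;; Hypothetical: like a two-argument append, but allowed to return
;; lst itself when tail is empty instead of allocating a copy.
(define (append/no-fresh lst tail)
  (if (null? tail)
      lst
      (append lst tail)))

(define lst (list 1 2 3))
(define same (append/no-fresh lst '()))
(eq? same lst) ; #t, because no new list was allocated
```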

rocketnia · 2019-07-19T02:21:35Z

As long as this is a topic about generic relationships between mutable and immutable types, I want to point out #28 as an example of how a single immutable type (immutable vectors) can naturally correspond to more than one distinct mutable type (fixed-length vectors and resizable vectors). So while there may be a straightforward generic operation that converts mutable types to immutable ones, any operation that converts the other way will likely make at least a few non-obvious choices.

LiberalArtist · 2019-11-09T00:20:35Z

> I don't think I have ever mutated a string except to demonstrate bugs caused by the mutability of strings.

Here's one of those "bugs caused by the mutability of strings:" racket/racket#2179

dbenoit17 mentioned this issue Jul 17, 2019

Make an RFC for a generic collection interface #19

Open

This was referenced Jul 18, 2019

Move content from the Racket2 wiki entry to this repository #26

Closed

Wishlist of backwards incompatible things for a future Racket2. #33

Closed

spdegabrielle mentioned this issue Jul 18, 2019

Drop more guarantees about new object allocations #37

Closed

gus-massa mentioned this issue Aug 1, 2019

add set->hash racket/racket#2766

Open

gus-massa mentioned this issue Dec 14, 2020

Add sequence conversion functions for built-in collection types racket/racket#3525

Draft
