
string type layout #14

@CeleritasCelery


There is a conflict between the Rust world and the elisp world. Rust expects types to have explicit compile-time alias checking, and elisp says you can alias anything you want. We need a solution that makes both of these worlds happy.

Elisp strings are mutable. This might not be such a big deal, except that not all characters (code points) are the same size, so mutating a string may require reallocating it. So we need some way to mutate aliased strings in Rust.
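To make the size problem concrete, here is a minimal sketch that assumes the buffer is UTF-8, as in Rust's `String`. The helper `set_char` is hypothetical and not part of the project; it replaces one character and reports whether the byte length changed, which is the case where an in-place write is not enough.

```rust
/// Hypothetical helper: replace the character at character index `idx`.
/// Returns true when the byte length of the string changed, i.e. when the
/// new character does not fit exactly in the old character's slot.
fn set_char(s: &mut String, idx: usize, ch: char) -> bool {
    let old_len = s.len();
    // Locate the byte offset and value of the idx-th character.
    let (start, old) = s.char_indices().nth(idx).expect("index out of range");
    let mut buf = [0u8; 4];
    let encoded: &str = ch.encode_utf8(&mut buf);
    // Splice the new encoding over the old one; later bytes may shift and
    // the buffer may have to grow.
    s.replace_range(start..start + old.len_utf8(), encoded);
    s.len() != old_len
}
```

Replacing `f` (1 byte) with `å` (2 bytes) changes the length, while replacing one ASCII character with another does not.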

Take the following code sample:

    let str1 = "foo".into_obj();
    let str2 = str1; // str2 aliases the same underlying string as str1
    mutate(str1.untag(), str2.untag());

    fn mutate<'a>(str1: &'a LispString, str2: &LispString) -> &'a str {
        let slice: &str = str1.get(); // take an immutable reference through str1
        str2.set_at(0, 'å'); // mutate the string through str2, requiring a reallocation. This frees the buffer slice points into
        slice // return the now-invalidated slice
    }

We need to find some way to handle this situation.

1 - current solution: RefCell

The easiest way to handle this from an implementation point of view is using RefCell. This is how things are currently set up. However, this comes with some big downsides. First, we add overhead to all string access, including immutable access. Second, and probably most important, we open up the opportunity for runtime panics. Mutating a string should never be an error (unless it is const).
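A minimal sketch of the RefCell approach, to show where the runtime failure comes from. The `LispString` here is a simplified stand-in, not the project's actual type, and `try_set_at` is a hypothetical name; it uses `try_borrow_mut` so the failure shows up as an `Err` rather than the panic that `borrow_mut` would produce.

```rust
use std::cell::{Ref, RefCell};

/// Simplified stand-in for the interpreter's string type (assumption).
struct LispString {
    data: RefCell<String>,
}

impl LispString {
    fn new(s: &str) -> Self {
        LispString { data: RefCell::new(s.to_owned()) }
    }

    /// Even read-only access goes through the borrow flag and returns a guard.
    fn get(&self) -> Ref<'_, str> {
        Ref::map(self.data.borrow(), String::as_str)
    }

    /// Fails at runtime if any guard from `get` is still alive.
    fn try_set_at(&self, idx: usize, ch: char) -> Result<(), &'static str> {
        let mut s = self
            .data
            .try_borrow_mut()
            .map_err(|_| "string is currently borrowed")?;
        let (start, old) = s.char_indices().nth(idx).ok_or("index out of range")?;
        let mut buf = [0u8; 4];
        s.replace_range(start..start + old.len_utf8(), ch.encode_utf8(&mut buf));
        Ok(())
    }
}
```

While a guard returned by `get` is alive, `try_set_at` returns an error; once the guard is dropped the same call succeeds. That is exactly the "mutation can fail" behavior described above.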

2 - copy on write

Since the problem is that all references to the string get invalidated on mutation, we could just make a copy instead. Any time you mutate a string, the old string buffer stays valid until the next garbage collection, and the string's "current" buffer is updated to point to the new copy.

This has the advantage of being simple implementation-wise, but it makes mutation expensive. Probably the only reason you would use mutation from elisp is performance, and with copying that benefit is gone. This might be okay, because string mutation is a relatively rare operation in elisp.
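A rough sketch of the copy-on-write idea. One loud assumption: `Rc<str>` stands in for a garbage-collected buffer here, so an old buffer lives as long as someone holds a handle to it rather than until the next collection. The `LispString` type and its methods are illustrative only.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Illustrative stand-in; `current` points at the latest buffer.
struct LispString {
    current: RefCell<Rc<str>>,
}

impl LispString {
    fn new(s: &str) -> Self {
        LispString { current: RefCell::new(Rc::from(s)) }
    }

    /// Hand out a handle to the current buffer. Later mutations never touch
    /// this buffer, so the handle cannot be invalidated.
    fn get(&self) -> Rc<str> {
        let handle: Rc<str> = self.current.borrow().clone();
        handle
    }

    /// Build a fresh copy with the character replaced, then swap the pointer.
    /// Every mutation costs a full copy of the string.
    fn set_at(&self, idx: usize, ch: char) {
        let old = self.get();
        let (start, old_ch) = old.char_indices().nth(idx).expect("index out of range");
        let mut new = String::with_capacity(old.len() + ch.len_utf8());
        new.push_str(&old[..start]);
        new.push(ch);
        new.push_str(&old[start + old_ch.len_utf8()..]);
        *self.current.borrow_mut() = Rc::from(new);
    }
}
```

A reader that called `get` before the mutation keeps seeing `"foo"`, while a fresh `get` sees the updated string. The internal `RefCell` borrows are never held across a call boundary, so they cannot fail the way option 1 can.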

3 - unsafe

There are only a few functions that actually mutate strings from elisp:

  • aset
  • store-substring
  • clear-string

Maybe it would be worth it to just mark mutation as unsafe, and require the caller to ensure no aliasing happens? There would not be that many unsafe blocks to inspect. This would be fine so long as the mutation subrs are only called from elisp, but if another Rust function calls them, all bets are off.
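A sketch of what the unsafe option could look like, assuming an `UnsafeCell`-backed buffer. The type and the exact wording of the safety contract are hypothetical; the point is only that reads stay free of runtime checks and the aliasing obligation moves to the few callers of `set_at`.

```rust
use std::cell::UnsafeCell;

/// Illustrative stand-in, not the project's actual type.
struct LispString {
    data: UnsafeCell<String>,
}

impl LispString {
    fn new(s: &str) -> Self {
        LispString { data: UnsafeCell::new(s.to_owned()) }
    }

    /// Zero-overhead read. Sound only if every caller of `set_at` upholds
    /// its contract.
    fn get(&self) -> &str {
        // SAFETY: callers of `set_at` promise no mutation overlaps this borrow.
        let s: &String = unsafe { &*self.data.get() };
        s.as_str()
    }

    /// # Safety
    /// No reference returned by `get` on this string (through any alias)
    /// may be alive while this runs, since the buffer may be reallocated.
    unsafe fn set_at(&self, idx: usize, ch: char) {
        // SAFETY: exclusivity is guaranteed by the caller (see above).
        let s: &mut String = unsafe { &mut *self.data.get() };
        let (start, old) = s.char_indices().nth(idx).expect("index out of range");
        let mut buf = [0u8; 4];
        s.replace_range(start..start + old.len_utf8(), ch.encode_utf8(&mut buf));
    }
}
```

A subr implementation would call `unsafe { s.set_at(0, 'å') }` only at a point where it holds no slices into any string. Nothing in the type system enforces that for other Rust callers, which is the risk noted above.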


Labels: design needed (items where more design help is needed)
