Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix double allocation in push_str if the argument would promote from an Inline SmartString. #5

Merged
merged 1 commit into from
Jul 6, 2020

Conversation

deej-io
Copy link
Contributor

@deej-io deej-io commented Jul 6, 2020

Hi!

I found your project from https://fasterthanli.me/articles/small-strings-in-rust and thought it was awesome that there was a project for SSO!

I looked at the source code and thought I noticed a potential double allocation in the push_str method. I created a small project (copying the tracing allocator from the above blog post) to test my assumption and I think I'm right.

It allocaties once to convert the Inlined SmartString into a String and again when it pushes the new &str onto the end.

I've created a small change that reserves the memory upfront so there is only a single allocation.

At the bottom of this post is the source code and Cargo.toml of the project that shows the double allocation.

This is the output I get when running cargo run --release with smartstring 0.2.2:

push_str
{"Alloc":{"addr":94476657969728,"size":5}}
{"Alloc":{"addr":94476657970000,"size":51}}
{"Freed":{"addr":94476657969728,"size":5}}
{"Freed":{"addr":94476657970000,"size":51}}
push
{"Alloc":{"addr":94476657969728,"size":23}}
{"Alloc":{"addr":94476657970000,"size":46}}
{"Freed":{"addr":94476657969728,"size":23}}
{"Freed":{"addr":94476657970000,"size":46}}

And here is the output after applying this patch:

push_str
{"Alloc":{"addr":94149837544272,"size":51}}
{"Freed":{"addr":94149837544272,"size":51}}
push
{"Alloc":{"addr":94149837544000,"size":24}}
{"Alloc":{"addr":94149837544272,"size":48}}
{"Freed":{"addr":94149837544000,"size":24}}
{"Freed":{"addr":94149837544272,"size":48}}

Note that push does not seem to benefit from this patch as it seems Strings seems to like to allocate 24 bytes, however it is no worse, so I made the change there for consistency.

Never-the-less, in the push_str there is now only a single allocation.

I appreciate this is a micro-optimisation, but as minimal allocations seems to be a goal for the project, I thought it was worthwhile.

Demo Project

Cargo.toml:

[package]
name = "smallstrings-allocation"
version = "0.1.0"
authors = ["Daniel J. Rollins <daniel@djrollins.com>"]
edition = "2018"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
serde = { version = "1.0.114", features = ["derive"] }
serde_json = "1.0.56"
libc = "0.2.71"
smartstring = "0.2.2"
#smartstring = { path = "../smartstring" }

src/main.rs:

use std::sync::atomic::{AtomicBool, Ordering};
use serde::{Deserialize, Serialize};
use std::alloc::{GlobalAlloc, System};
use std::io::Cursor;
use smartstring::{SmartString, LazyCompact};

type SString = SmartString<LazyCompact>;

#[global_allocator]
pub static ALLOCATOR: Tracing = Tracing::new();


#[derive(Clone, Copy, Serialize, Deserialize)]
pub enum Event {
    Alloc { addr: usize, size: usize },
    Freed { addr: usize, size: usize },
}

pub struct Tracing {
    pub inner: System,
    pub active: AtomicBool,
}

impl Tracing {
    fn write_ev(&self, ev: Event) {
        let mut buf = [0u8; 1024];
        let mut cursor = Cursor::new(&mut buf[..]);
        serde_json::to_writer(&mut cursor, &ev).unwrap();
        let end = cursor.position() as usize;
        self.write(&buf[..end]);
        self.write(b"\n");
    }

    fn write(&self, s: &[u8]) {
        unsafe {
            libc::write(libc::STDERR_FILENO, s.as_ptr() as _, s.len() as _);
        }
    }

    pub const fn new() -> Self {
        Self {
            inner: System,
            active: AtomicBool::new(false),
        }
    }

    pub fn set_active(&self, active: bool) {
        self.active.store(active, Ordering::SeqCst);
    }
}

unsafe impl GlobalAlloc for Tracing {
    unsafe fn alloc(&self, layout: std::alloc::Layout) -> *mut u8 {
        let res = self.inner.alloc(layout);
        if self.active.load(Ordering::SeqCst) {
            self.write_ev(Event::Alloc {
                addr: res as _,
                size: layout.size(),
            });
        }
        res
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: std::alloc::Layout) {
        if self.active.load(Ordering::SeqCst) {
            self.write_ev(Event::Freed {
                addr: ptr as _,
                size: layout.size(),
            });
        }
        self.inner.dealloc(ptr, layout)
    }
}


fn main() {
    let name = "push_str\n";
    unsafe {
        libc::write(libc::STDERR_FILENO, name.as_ptr() as _, name.len() as _);
    }
    ALLOCATOR.set_active(true);
    {
        let mut _s = SString::from("hello");
        _s.push_str("some longer string to get it to allocate again");
    }
    ALLOCATOR.set_active(false);

    let name = "push\n";
    unsafe {
        libc::write(libc::STDERR_FILENO, name.as_ptr() as _, name.len() as _);
    }
    ALLOCATOR.set_active(true);
    {
        let mut _s = SString::from("aaaaaaaaaaaaaaaaaaaaaaa");
        _s.push('a');
        _s.push('a');
    }
    ALLOCATOR.set_active(false);
}

Currently, when pushing onto an Inline SmartString, there will be a
double allocation (once in the `to_string` of the inlined string and
again when pushing the new string onto the end.

This patch reserves enough memory for the new string to fit so that
there is only one allocation.
@bodil
Copy link
Owner

bodil commented Jul 6, 2020

Heh, this entire project is a micro-optimisation, to be fair. Thanks for the fix. 😁

@bodil bodil merged commit 6d4a5cf into bodil:master Jul 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants