Optimization issue with dummy moves in loop. #29566

Closed
leeopop opened this Issue Nov 4, 2015 · 2 comments

leeopop commented Nov 4, 2015

The micro-benchmarks were run with the following code, built with cargo build --release.

const DEFAULT_PACKET_BUFFER_SIZE : usize = 2048;
pub const PACKET_BUFFER_SIZE : usize = DEFAULT_PACKET_BUFFER_SIZE;
pub const MAX_FIELD_NAME : usize = 128;
use std::collections::hash_map::HashMap;

pub type Byte = u8;
pub type Buffer = [Byte; PACKET_BUFFER_SIZE];
pub type Field<'a> = &'a mut [Byte];
pub type ConstField<'a> = &'a [Byte];
pub type PacketHandler<'a> = fn (Packet) -> Packet;

pub const ZERO_BUFFER : Buffer = [0u8; PACKET_BUFFER_SIZE];

#[inline(always)]
pub fn move_field(src : ConstField, dst : Field) -> usize {
    let moved_bytes = std::cmp::min(src.len(), dst.len());
    //TODO: target.clone_from_slice(val); or copy_memory(val, target);
    for index in 0..moved_bytes {
        dst[index] = src[index];
    }
    moved_bytes
}

#[inline(always)]
pub fn compare_field(a : ConstField, b : ConstField) -> std::cmp::Ordering {
    let a_len = a.len();
    let b_len = b.len();
    if a_len == 0 || b_len == 0 {
        a_len.cmp(&b_len)
    }
    else {
        match a[0].cmp(&b[0]) {
            std::cmp::Ordering::Less => std::cmp::Ordering::Less,
            std::cmp::Ordering::Greater => std::cmp::Ordering::Greater,
            _ => compare_field(&a[1..], &b[1..]),
        }
    }
}

pub struct PacketContext<'a> {
    u64_val : HashMap<&'static str, u64>,
    u32_val : HashMap<&'static str, u32>,
    u16_val : HashMap<&'static str, u16>,
    u8_val : HashMap<&'static str, u8>,
    bool_val : HashMap<&'static str, bool>,
    string_val : HashMap<&'static str, &'static str>,
    field_val : HashMap<&'static str, Field<'a>>,
}

impl<'a> PacketContext<'a> {
    #[inline(always)]
    pub fn new() -> PacketContext<'a> {
        PacketContext {
            u64_val : HashMap::new(),
            u32_val : HashMap::new(),
            u16_val : HashMap::new(),
            u8_val : HashMap::new(),
            bool_val : HashMap::new(),
            string_val : HashMap::new(),
            field_val : HashMap::new(),
        }
    }
}

pub type PacketContextRef<'a> = Box<PacketContext<'a>>;

pub struct Packet<'a> {
    content : Field<'a>,
    pub context : PacketContextRef<'a>,
}

impl<'a> Packet<'a> {
    #[inline(always)]
    pub fn new(field : Field<'a>) -> Packet<'a> {
        let packet_context_box = Box::new(PacketContext::new());
        Packet {
            content : field,
            context : packet_context_box,
        }
    }
}


#[inline(always)]
pub fn dummy_move(packet : Packet) -> Packet {
    Packet {
        content : packet.content,
        context : packet.context,
    }
}

const N : usize = 1000000000;

pub fn bench_dummy_move() {
    let mut temp_struct = ZERO_BUFFER;
    let mut packet = Packet::new(&mut temp_struct);
    // Unsuffixed literal: with no other constraint, the counter type falls back to i32.
    for _ in 0..1000000000 {
        packet = dummy_move(packet);
    }
}

pub fn bench_dummy_move2() {
    let mut temp_struct = ZERO_BUFFER;
    let mut packet = Packet::new(&mut temp_struct);
    // N is a usize constant, so the counter is a usize.
    for _ in 0..N {
        packet = dummy_move(packet);
    }
}

pub fn bench_dummy_move3() {
    let mut temp_struct = ZERO_BUFFER;
    let mut packet = Packet::new(&mut temp_struct);
    // Explicit usize suffix on the literal, so the counter is a usize.
    for _ in 0..1000000000usize {
        packet = dummy_move(packet);
    }
}

extern crate time;
use time::PreciseTime;
fn main() {
    let start = PreciseTime::now();
    bench_dummy_move();
    let end = PreciseTime::now();

    println!("Rust dummy move: {} us for {} times.", 
        start.to(end).num_microseconds().expect("Umm"), N);

    let start = PreciseTime::now();
    bench_dummy_move2();
    let end = PreciseTime::now();

    println!("Rust dummy move2: {} us for {} times.", 
        start.to(end).num_microseconds().expect("Umm"), N);

    let start = PreciseTime::now();
    bench_dummy_move3();
    let end = PreciseTime::now();

    println!("Rust dummy move3: {} us for {} times.", 
        start.to(end).num_microseconds().expect("Umm"), N);
}

With the Rust 1.3 stable release for Windows x64,

cargo 0.4.0-nightly (553b363 2015-08-03) (built 2015-08-03)
rustc 1.3.0 (9a92aaf19 2015-09-15)

the result is

PS C:\workspace\eclipse_workspace\rst_test\target\release> .\hello_world.exe
Rust dummy move: 1045 us for 1000000000 times.
Rust dummy move2: 1 us for 1000000000 times.
Rust dummy move3: 1 us for 1000000000 times.
PS C:\workspace\eclipse_workspace\rst_test\target\release>

With the Rust 1.4 stable release for Windows x64 (gnu),

cargo 0.5.0-nightly (833b947 2015-09-13)
rustc 1.4.0 (8ab8581f6 2015-10-27)

the result is

PS C:\workspace\eclipse_workspace\rst_test\target\release> .\hello_world.exe
Rust dummy move: 1232181 us for 1000000000 times.
Rust dummy move2: 1 us for 1000000000 times.
Rust dummy move3: 1 us for 1000000000 times.
PS C:\workspace\eclipse_workspace\rst_test\target\release>
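
A reading of 1 us for a billion iterations is consistent with the loop having been removed entirely by the optimizer, although no generated assembly is shown here to confirm that. As a rough sketch of how the timing could be cross-checked on a current toolchain, the snippet below uses std::time::Instant and std::hint::black_box from the standard library, both of which were stabilized after the Rust versions tested above. The names time_it and counted_loop are hypothetical helpers, not part of the original benchmark, and counted_loop is only a stand-in for the real loop body.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
/// (Hypothetical helper, not part of the original benchmark.)
pub fn time_it<T, F: FnOnce() -> T>(f: F) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Hypothetical stand-in for the benchmark body: each counter value is passed
/// through `black_box`, which asks the compiler to treat it as opaque, so the
/// loop is unlikely to be deleted or folded into a constant.
pub fn counted_loop(iterations: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..iterations {
        acc = acc.wrapping_add(black_box(i));
    }
    acc
}
```

Calling time_it(|| counted_loop(1_000_000_000)) and printing the returned Duration gives a figure that can be compared against the numbers above. If the loops in the report were wrapped the same way, all three variants would be expected to do real work, which makes differences between them easier to interpret.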

I wonder why the optimization is not applied when the loop counter is an i32 (the unsuffixed literal in bench_dummy_move),
and why the result is worse in Rust 1.4 than in 1.3.
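
To make the type difference between the three loops concrete, here is a small sketch that reports the size of the loop counter in each case. The function names are made up for illustration. It only shows that the unsuffixed literal falls back to i32 while the other two ranges use usize; it does not explain why the optimizer treated them differently.

```rust
use std::mem::size_of_val;

const N: usize = 1_000_000_000;

/// Counter size when the bound is an unsuffixed literal, as in bench_dummy_move.
/// Nothing else constrains the type, so the literal falls back to i32 (4 bytes).
pub fn unsuffixed_counter_size() -> usize {
    let range = 0..1_000_000_000;
    size_of_val(&range.start)
}

/// Counter size when the bound is the usize constant N, as in bench_dummy_move2.
pub fn const_counter_size() -> usize {
    let range = 0..N;
    size_of_val(&range.start)
}

/// Counter size when the literal carries a usize suffix, as in bench_dummy_move3.
pub fn suffixed_counter_size() -> usize {
    let range = 0..1_000_000_000usize;
    size_of_val(&range.start)
}
```

On a 64-bit target the first function returns 4 and the other two return 8, so the slow loop is the only one iterating over a 32-bit signed counter.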

leeopop commented Nov 4, 2015

  • 1000000000 is less than i32's maximum value (2147483647), so the literal fits in an i32.

brson commented Apr 11, 2017

Closing. The optimizer is always changing and presumably has drifted a long way. Sorry for the lack of response.

brson closed this on Apr 11, 2017
