Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update improve memory efficiency of rope diffing #1137

Merged
merged 12 commits into from
Mar 30, 2019
Merged

Conversation

HoNile
Copy link
Contributor

@HoNile HoNile commented Mar 17, 2019

Summary

I implement the option proposed by Raph with a new RopeSlice struct to improve memory efficiency of rope diffing.

I keep an eye on performance and it's may be not what you want as you can see bellow.

Benchmark before my commit:
running 10 tests
test hash_diff ... bench: 1,266 ns/iter (+/- 463)
test hash_diff_big ... bench: 183,348 ns/iter (+/- 30,961)
test hash_diff_med ... bench: 29,436 ns/iter (+/- 8,489)
test ne_idx_avx ... bench: 5,297 ns/iter (+/- 834)
test ne_idx_detect ... bench: 5,258 ns/iter (+/- 682)
test ne_idx_rev_sse ... bench: 13,240 ns/iter (+/- 2,105)
test ne_idx_rev_sw ... bench: 68,948 ns/iter (+/- 6,669)
test ne_idx_sse ... bench: 9,636 ns/iter (+/- 1,413)
test ne_idx_sw ... bench: 0 ns/iter (+/- 0)
test scanner ... bench: 11,012 ns/iter (+/- 4,030)

Benchmark after my commit:
running 10 tests
test hash_diff ... bench: 1,127 ns/iter (+/- 332)
test hash_diff_big ... bench: 6,765,850 ns/iter (+/- 317,297)
test hash_diff_med ... bench: 192,641 ns/iter (+/- 44,295)
test ne_idx_avx ... bench: 5,111 ns/iter (+/- 468)
test ne_idx_detect ... bench: 4,988 ns/iter (+/- 477)
test ne_idx_rev_sse ... bench: 12,778 ns/iter (+/- 1,155)
test ne_idx_rev_sw ... bench: 68,279 ns/iter (+/- 6,189)
test ne_idx_sse ... bench: 6,288 ns/iter (+/- 3,110)
test ne_idx_sw ... bench: 0 ns/iter (+/- 0)
test scanner ... bench: 11,032 ns/iter (+/- 3,316)

The performance drop come from removing whitespace from my slice (diff.rs line 91) which make me create a new RopeSlice (or make my previous slice mutable).

I didn't manage to remove the overhead but I'm totally new to rust so I hope you have idea to solve it or I can try the second option proposed to improve the memory efficiency of rope diffing.

Related Issues

closes #946

Review Checklist

  • I have responded to reviews and made changes where appropriate.
  • I have tested the code with cargo test --all.
  • I have updated comments / documentation related to the changes I made.
  • I have rebased my PR branch onto xi-editor/master.

@cmyr
Copy link
Member

cmyr commented Mar 19, 2019

Thanks for this! will take a look soon and see if we can figure out how to address the performance questions.

@HoNile
Copy link
Contributor Author

HoNile commented Mar 19, 2019

Humm ... I'm still trying to improve this but look like I commit too early not sure it's worth it for you to look at this. Probably way more than performance problem.

@HoNile
Copy link
Contributor Author

HoNile commented Mar 19, 2019

Now look better I hope I didn't miss something else.

running 10 tests
test hash_diff ... bench: 1,197 ns/iter (+/- 149)
test hash_diff_big ... bench: 313,091 ns/iter (+/- 47,905)
test hash_diff_med ... bench: 56,170 ns/iter (+/- 6,490)
test ne_idx_avx ... bench: 6,162 ns/iter (+/- 706)
test ne_idx_detect ... bench: 5,764 ns/iter (+/- 1,112)
test ne_idx_rev_sse ... bench: 16,766 ns/iter (+/- 3,167)
test ne_idx_rev_sw ... bench: 68,587 ns/iter (+/- 7,985)
test ne_idx_sse ... bench: 10,005 ns/iter (+/- 1,663)
test ne_idx_sw ... bench: 0 ns/iter (+/- 0)
test scanner ... bench: 12,574 ns/iter (+/- 2,174)

@HoNile
Copy link
Contributor Author

HoNile commented Mar 21, 2019

I think I could improve syntax with while let or if let at some point and non_ws_offset was better before my last change, but I don't want to change anything before review.

Copy link
Member

@cmyr cmyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've taken a closer look at this and then back at the code again, and I think I have a better solution:

We can just change the signature of make_line_hashes to return HashMap<Cow<str>, usize>, and then use rope.iter_lines_raw(..) instead of our custom LineScanner type, which we can delete.

This should give us the improvement we want, while hopefully actually reducing code size. :)

@HoNile
Copy link
Contributor Author

HoNile commented Mar 24, 2019

I'm trying to do this but it look hard to return a HashMap<Cow<str>, usize> without allocation or without lifetime issue, as I can't insert line from rope.lines_raw(..) directly into my HashMap (need to insert something based on line[non_ws..]).

@cmyr
Copy link
Member

cmyr commented Mar 25, 2019

It should be possible with something like this:

fn make_line_hashes<'a>(base: &'a Rope, min_size: usize) -> HashMap<Cow<'a, str>, usize> {
    let mut offset = 0;
    let mut line_hashes = HashMap::with_capacity(base.len() / 60);
    for line in base.lines_raw(..) {
        let non_ws = non_ws_offset(&line);
        if line.len() - non_ws >= min_size {
            let cow = match line {
                Cow::Owned(ref s) => Cow::Owned(s[non_ws..].to_string()),
                Cow::Borrowed(s) => Cow::Borrowed(&s[non_ws..]),
            };
            line_hashes.insert(cow, offset + non_ws);
        }
        offset += line.len();
    }
    line_hashes
}

@HoNile
Copy link
Contributor Author

HoNile commented Mar 26, 2019

Okay sorry I didn't relize that we can split COW in two categories during this work, my own version did Cow::from(s[non_ws..].to_owned()) for all lines.
I update make_line_hashes with your version and just add comment and a parameter to not process all the rope, based on preliminary scan like for target.

rust/rope/src/diff.rs Outdated Show resolved Hide resolved
Copy link
Member

@cmyr cmyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for this, looks like a nice little win.

@cmyr cmyr merged commit 71930c4 into xi-editor:master Mar 30, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants