Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Buffer operation with unsafe operations (was: Simple improvements to speed of Buffer module) #6148

Closed
vicuna opened this issue Aug 29, 2013 · 11 comments

Comments

@vicuna
Copy link

commented Aug 29, 2013

Original bug ID: 6148
Reporter: @johnwhitington
Assigned to: @damiendoligez
Status: assigned (set by @xavierleroy on 2017-02-19T17:22:57Z)
Resolution: reopened
Priority: normal
Severity: feature
Target version: 4.07.0+dev/beta2/rc1/rc2
Fixed in version: 4.02.0+dev
Category: standard library
Has duplicate: #3560
Monitored by: @hcarty

Bug description

The (tiny) attached patch to trunk uses unsafe functions in uncontroversial parts of the Buffer module. In all instances, the accesses are already checked by the logic of the Buffer module prior to the unsafe function being called.

I have avoided unsafe functions in complicated or infrequently called parts of Buffer, like the resize function.

Sample speedups in native code:
add_char heavy code: 12%
add_string heavy code: 19%
add_substring heavy code: 25%

Here's the add_string example:

let b = Buffer.create 30

let string_of_lexemes lexemes =
Buffer.clear b;
List.iter (fun l -> Buffer.add_string b l; Buffer.add_char b ' ') lexemes;
Buffer.contents b

let _ =
for x = 1 to 50_000_000 do
ignore (string_of_lexemes ["a"; "bcd"; "efg"; "hij"; "klm"])
done

The speedups in bytecode are similar.

File attachments

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 30, 2013

Comment author: @ygrek

The patch seems to be reverse, and could you please use unified diff?
I.e. diff -urNw orig/ modified/

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 30, 2013

Comment author: @johnwhitington

See attached buffer-unsafe-2.diff

@vicuna

This comment has been minimized.

Copy link
Author

commented Aug 30, 2013

Comment author: @alainfrisch

Thanks! I've integrated your patch to the trunk (commit 14048).

As a side note, it's not 100% straightforward that the calls to blit in add_substring and add_string are safe. I'm thinking about possible integer overflow when computing new_position (leading to a negative value for new_position which does not trigger resizing). I think it is fine because of the limit on the maximum string length, but one needs to be careful (I've heard about a variant of the OCaml runtime developed by OCamlPro which gets rid of the limit on string lenths).

@vicuna

This comment has been minimized.

Copy link
Author

commented Dec 7, 2016

Comment author: @alainfrisch

Reopening, since the use of unsafe versions has be reverted in

http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/version/4.02/stdlib/buffer.ml?r1=14852&r2=15041

Damien: do you remember the reason for going back to the slower version?

@vicuna

This comment has been minimized.

Copy link
Author

commented Feb 19, 2017

Comment author: @xavierleroy

@doligez could you please take a decision on what to do with this code?

@vicuna

This comment has been minimized.

Copy link
Author

commented Oct 10, 2017

Comment author: @alainfrisch

Ping Damien.

@vicuna vicuna added the stdlib label Mar 14, 2019
@vicuna vicuna added this to the 4.07.0 milestone Mar 14, 2019
@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Mar 17, 2019

@damiendoligez please take a decision concerning this patch (which was applied, then you reverted it).

@vicuna vicuna referenced this issue Feb 20, 2017
@vicuna vicuna added the feature-wish label Mar 20, 2019
@damiendoligez

This comment has been minimized.

Copy link
Member

commented Apr 8, 2019

The first three lines of the patch are superseded by 5b8df63.
I'll open a PR for the last two (add_string and add_substring).

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Apr 8, 2019

There are new calls to Bytes.get (for to_seq, to_seqi) which could be worth turning into unsafe_get.

@damiendoligez

This comment has been minimized.

Copy link
Member

commented Apr 8, 2019

I'll add them.

gasche added a commit that referenced this issue Apr 8, 2019
optimize some buffer operations (implements #6148)
@damiendoligez

This comment has been minimized.

Copy link
Member

commented Apr 9, 2019

This is implemented in #8596, so I'm closing this issue, although it raised other concerns (see #8597).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.