Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String split zero copy #46

Closed
wants to merge 1 commit into from

Conversation

malthe
Copy link
Contributor

@malthe malthe commented Oct 18, 2016

@jemc
Copy link
Member

jemc commented Oct 18, 2016

Can you make explicit the method signature of the current String.split method as it compares to the new one?

@SeanTAllen
Copy link
Member

I need convincing. I can see having another split method that does this. We have/had that at Sendence. However changing split to only work on vals is not a good thing. Its perfectly reasonable for me to want to be able to split a ref. This change would disallow that. Rather than changing the existing method, I believe what you want is a new method.

@malthe
Copy link
Contributor Author

malthe commented Oct 26, 2016

I suppose with subtyping it would be quite easy to provide two implementations of String.split – one for val and one for ref.

@SeanTAllen
Copy link
Member

What subtyping are you referring to @malthe?

@malthe
Copy link
Contributor Author

malthe commented Oct 26, 2016

I was thinking that with #62 it might be possible to "overload" String.split with two implementations.

If that's something we want?

An alternative is of course to go with different names but that seems a bit wrong to me – like Hungarian notation almost.

I think ergonomics should be a concern here as well as obviously performance.

@jemc
Copy link
Member

jemc commented Oct 26, 2016

I wonder how the performance of (String box).clone -> (String val).split compares to the current case of (String box).split. It's possible that the clone.split would actually be more performant, since it involves one large buffer allocation instead of many small buffer allocations. If that ends up being true, I'd think that there is not much justification to keep (String box).split around.

It would be interesting to see some benchmarks.

@SeanTAllen
Copy link
Member

@jemc that would be an interesting benchmark. did you mean:

(String box).split in the following?

I'd think that there is not much justification to keep (String box).clone around.

@jemc
Copy link
Member

jemc commented Oct 26, 2016

@SeanTAllen yes, thanks for the correction - I've corrected the comment.

@malthe
Copy link
Contributor Author

malthe commented Oct 27, 2016

The benchmark results are:

split               20    82373935 ns/op
split-zc            30    44659338 ns/op

For what it's worth, ponylang/ponyc#1335 actually runs about 1% faster than "split".

Program code:

use "ponybench"

actor Main
  let length: USize = 1024 * 1024

  new create(env: Env) =>
    let s = recover ref String.create(length) end

    while s.size() < length do
      s.append("abcde ")
    end

    let s': String val = s.clone()
    let bench = PonyBench(env)

    bench[None]("split", recover val lambda()(s') => s'.split() end end)
    bench[None]("split-zc", recover val lambda()(s') => s'.clone().split_zc() end end)

@jemc
Copy link
Member

jemc commented Oct 30, 2016

@malthe thanks for doing this benchmark.

I cloned your fork and tried a few variations on the benchmark, and couldn't come up with a case where the split() wasn't slower than clone().split_zc().

With that in mind, I think there is no reason to keep a fun split because fun val split is more optimal, even when you have to clone first to get a val.

@SeanTAllen
Copy link
Member

Closing as stale.

@SeanTAllen SeanTAllen closed this Apr 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants