Split at every character if using empty string as delimiter #1335

malthe · 2016-10-18T15:31:32Z

No description provided.

jemc · 2016-10-18T17:16:19Z

packages/builtin/string.pony

-            end
-
-            result.push(cur = recover String end)
+          let found = match chars


Note that this will incur a newly added match cost of "unwrapping the type union" at every iteration of this loop. Has this change been benchmarked for how it affects the base case (non-empty delim)?

I think a more performant solution would be to decide once outside the loop whether the delim is empty or non-empty, then use a different implementation of the loop based on that decision, rather than forcing the same calculation of the same decision at every iteration of the loop.
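The suggestion above can be sketched roughly as follows. This is an illustration only, not the code from this pull request or from the standard library: `SplitSketch` and its two private helpers are hypothetical names, it uses current Pony syntax (`?` on partial calls), and it assumes "every character" means every Unicode code point rather than every byte.

```pony
primitive SplitSketch
  """
  Hypothetical helper: decides once whether the delimiter is empty,
  then runs a loop specialised for that case.
  """
  fun apply(s: String box, delim: String box): Array[String] =>
    // The empty/non-empty decision is made here, outside any loop.
    if delim.size() == 0 then
      _split_every_char(s)
    else
      _split_on_delim(s, delim)
    end

  fun _split_every_char(s: String box): Array[String] =>
    // Edge case: one single-code-point string per rune.
    let result = Array[String]
    for rune in s.runes() do
      result.push(recover val String.from_utf32(rune) end)
    end
    result

  fun _split_on_delim(s: String box, delim: String box): Array[String] =>
    // Common case: no per-iteration check for an empty delimiter.
    let result = Array[String]
    var start: ISize = 0
    try
      while true do
        // find raises an error when there is no further match,
        // which ends the loop.
        let idx = s.find(delim, start)?
        result.push(s.substring(start, idx))
        start = idx + delim.size().isize()
      end
    end
    // Whatever follows the last delimiter is the final piece.
    result.push(s.substring(start))
    result

actor Main
  new create(env: Env) =>
    for piece in SplitSketch("a,b,c", ",").values() do
      env.out.print(piece)
    end
```

With this shape, the non-empty-delimiter loop is the same as it would be without the feature, so any cost of supporting the empty delimiter is paid once per call rather than once per iteration.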

malthe · 2016-10-18

I would assume that the cost of the match is negligible compared to the rest of what goes on in this method.

Isn't the match cost in this case simply a branch instruction?

SeanTAllen · 2016-10-18

A single branch instruction is not necessarily cheap. We are writing an application where that branch might get called hundreds of thousands of times a second. I haven't tested, but under heavy load jemc's variant sounds like it will perform much better.

malthe · 2016-10-18

I can fix that.

Would we need an RFC to change the signature of the split method to work on String val instead?

jemc · 2016-10-18

@malthe - I think it would need an RFC, probably.

SeanTAllen · 2016-10-19T19:54:26Z

Sylvan and I talked during sync. If this has no impact on the normal case, we are good with it, but if it causes a performance degradation for the normal use case, we are not in favor of it.

malthe · 2016-10-19T20:09:39Z

I have submitted an RFC and implementation for a version of this that also changes String.split to work on immutable strings only. That version has no impact on the normal case (the two cases are split up in the implementation).

SeanTAllen · 2016-10-20T04:18:13Z

There's a release underway. Please rebase against master and verify that your CHANGELOG entry appears in the "unreleased" section post rebasing.

malthe · 2016-10-21T19:11:25Z

@SeanTAllen – do some of the tests fail because of this pull request or because of some general test instability?

SeanTAllen · 2016-10-21T19:49:05Z

We are having issues with some of the network tests. It's probably unrelated to your change.

jemc · 2016-10-21T20:14:07Z

This time the one failure appears to be Just Travis' Fault™:

An error occurred while generating the build script.

SeanTAllen · 2016-10-21T20:33:36Z

That's been happening a decent amount today. Probably a result of DDoS on DYN.

SeanTAllen · 2016-10-22T12:12:02Z

CHANGELOG changed due to 0.7.0 release, this probably needs to be rebased against master.

malthe · 2016-10-26T08:13:12Z

This should be ready now.

SeanTAllen · 2016-10-26T11:30:53Z

@malthe I don't see any changes to address the performance concern. This introduces a performance regression in the common case to deal with an edge case. I'm not in favor of this in that case.

Given that I'm not sure splitting at every character is the right thing to do with an empty string AND it is going to impact performance for "the usual cases", I am not in favor of this change at this time.

malthe · 2016-10-26T11:34:59Z

I'll close this one and promote ponylang/rfcs#46 instead (which currently includes this change).

jemc reviewed Oct 18, 2016

malthe force-pushed the string-split-empty branch from d90e7a8 to a1c2134 Compare October 19, 2016 07:40

SeanTAllen added the needs discussion during sync label Oct 19, 2016

SeanTAllen removed the needs discussion during sync label Oct 19, 2016

malthe force-pushed the string-split-empty branch from a1c2134 to fc8bd8a Compare October 19, 2016 20:10

malthe force-pushed the string-split-empty branch 3 times, most recently from 0512954 to 98f4ef3 Compare October 21, 2016 13:08

malthe force-pushed the string-split-empty branch from 98f4ef3 to 1421ac5 Compare October 22, 2016 19:02

Split at every character if using empty string as delimiter

ff3ca66

malthe force-pushed the string-split-empty branch from 1421ac5 to ff3ca66 Compare October 23, 2016 16:22

malthe closed this Oct 26, 2016

malthe mentioned this pull request Oct 27, 2016

String split zero copy ponylang/rfcs#46

Closed
