-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: System.String.Repeat (instance method) #102454
Comments
Why not just write a trivial extension method here? |
As you can imagine, LINQ + I'll take a microbench between LINQ and single time allocation implementation if I have time.
If many people had done, wouldn't it be so bad to prepare it by the runtime? This isn't so high priority because those other than newbies and game developers will be able to content themselves with LINQ, as you claim. You can throw this into Any TIme at worst. |
I forgot REPL users will have to copy and paste or recite the boilerplate |
Java's trackers:
Java has been able to do it using Stream before IntStream.range(0,10).mapToObj(i -> "👍").collect(Collectors.joining()) |
Tagging subscribers to this area: @dotnet/area-system-runtime |
You can use more performant APIs that exist nowadays, like public static string Repeat(this string str, int count)
{
return string.Create(str.Length * count, str, (span, value) =>
{
for (int i = 0; i < count; i++)
{
value.AsSpan().CopyTo(span);
span = span.Slice(value.Length);
}
});
} |
@KeterSCP We can double the size of I expected the use case in PowerShell, but its |
It turned out to be more complex than I expected. public static class E
{
public static string Repeat(this string str, int count) =>
count switch
{
< 0 => throw new ArgumentOutOfRangeException(nameof(count)),
0 => "",
1 => str,
_
=> string.Create(
checked(str.Length * count),
(str, count),
static (outSpan, state) =>
{
var inSpan = state.Item1.AsSpan();
inSpan.CopyTo(outSpan);
var len = inSpan.Length;
var firstSpan = outSpan[..len];
var copyFromSpan = firstSpan;
var copyToSpan = outSpan[len..];
for (
var bit =
1
<< (
30
- System.Numerics.BitOperations.LeadingZeroCount(
(uint)state.Item2
)
);
bit != 0;
bit >>= 1
)
{
copyFromSpan.CopyTo(copyToSpan);
var oldLength = copyFromSpan.Length;
copyToSpan = copyToSpan[oldLength..];
copyFromSpan = outSpan[..(oldLength << 1)];
if ((bit & state.Item2) != 0)
{
firstSpan.CopyTo(copyToSpan);
copyToSpan = copyToSpan[len..];
}
}
}
)
};
} IL size in lambda: 210 I erased public static class E
{
public static string Repeat(this string str, int count)
{
return count switch
{
< 0 => throw new ArgumentOutOfRangeException(nameof(count)),
0 => "",
1 => str,
_
=> string.Create(
checked(str.Length * count),
str,
static (outSpan, input) =>
{
var inSpan = input.AsSpan();
for (; outSpan.Length != 0; outSpan = outSpan.Slice(inSpan.Length))
{
inSpan.CopyTo(outSpan);
}
}
)
};
}
} IL size in lambda: 48→43 We have to recite such a implementation (the LINQ one is also sufficiently complex for beginners) or search an NuGet package (is there one?) in every project (solution) for just a method that has been implemented in many other (newer) languages. |
FWIW, where
|
@mikernet Oh, thank you for the benchmark. However I still need to find out cases disadvantageous against the more complex one and perform my bench with them and Mann-Whitney regardless of the result. I took into cases where the input string is composed of a single character: // WTFPLv2; feel free to combine this code into your product without asking permission or any conditions
public static class E
{
public static string Repeat(this string str, int count) =>
(count, str.Length) switch
{
(< 0, _) => throw new ArgumentOutOfRangeException(nameof(count)),
(0, _) or (_, 0) => "",
(1, _) => str,
(_, 1) => new string(str[0], count),
_
=> string.Create(
checked(str.Length * count),
(str, count),
static (outSpan, state) =>
{
var inSpan = state.Item1.AsSpan();
inSpan.CopyTo(outSpan);
var len = inSpan.Length;
var firstSpan = outSpan[..len];
var copyFromSpan = firstSpan;
var copyToSpan = outSpan[len..];
for (
var bit =
1
<< (
30
- System.Numerics.BitOperations.LeadingZeroCount(
(uint)state.Item2
)
);
bit != 0;
bit >>= 1
)
{
copyFromSpan.CopyTo(copyToSpan);
var oldLength = copyFromSpan.Length;
copyToSpan = copyToSpan[oldLength..];
copyFromSpan = outSpan[..(oldLength << 1)];
if ((bit & state.Item2) != 0)
{
firstSpan.CopyTo(copyToSpan);
copyToSpan = copyToSpan[len..];
}
}
}
)
};
} I looked into the implementations of the PowerShell's implementation: https://github.com/PowerShell/PowerShell/blob/74d8bdba443895ea84da1ea6c0d26dac05f08d7b/src/System.Management.Automation/engine/runtime/Operations/StringOps.cs#L53-L62 IronPython's implementation: https://github.com/IronLanguages/ironpython3/blob/master/Src/IronPython/Runtime/Operations/StringOps.cs#L1395-L1399 Unfortunately, they both call |
Yup. |
@mikernet I'm glad to hear that. Thank you. |
Benchmarks on the above 3 implementations, within the regulations of not modifying the runtime.
Iteration counts are the form of
My faster one ( The naive LINQ implementation (see "Alternative Designs") can be as much as more than 30x slower than the fastest implementation within the regulations of not modifying the runtime!
Repo: https://github.com/tats-u/StringRepeatBench I tried to improve the following code, but I couldn't. (couldn't be faster statistically or on average) if ((bit & state.Item2) != 0)
{
firstSpan.CopyTo(copyToSpan);
copyToSpan = copyToSpan[len..];
} |
Background and motivation
At least Java/JS/Python/Ruby/Perl/PHP/Rust/Go/Kotlin/Swift can generate a string formed by repeating a certain string a specified number of times by only one method, function, or operator.
This emoji takes 2 characters in C#.
For example, recheck, a ReDoS checker, uses
.repeat(n)
to describe attack strings for regex.However, C# doesn't have such a convenient method unlike the above languages.
API Proposal
We can implement it using
FastAllocateString
,CopyStringContent
, andSystem.Numerics.BitOperations.LeadingZeroCount
.API Usage
Alternative Designs
The former uses the LINQ, which is slower and more difficult to be found out.
The latter is only limited to a single BMP char.
Risks
Runtime/SDK size increase
The text was updated successfully, but these errors were encountered: