Skip to content

Optimization pass over String and UTF8Span's allASCII helper #82540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

stephentyrone
Copy link
Contributor

This ranges between parity (for very small strings) and 5x faster (for 32-63B strings) in benchmarking on M1 MBP. For largeish strings it delivers a roughly 2x speedup; further increase in blocksize nets a small win in microbenchmarks that I do not expect would translate to real world usage due to codesize impact and the fact that most strings are smallish.

There's some opportunity for further work here; in particular, if people start building Swift for a baseline of AVX2 or AVX512, we should have paths for that (and we should also implement them if/when we get better multiversioning dispatch machinery in the language). Span adoption would be interesting. It's likely we should have a dedicated "small core" implementation that uses only aligned accesses. Still, this is a significant improvement as-is, and we should land it.

allASCII

This ranges between parity (for very small strings) and 5x faster (for 32-63B strings) in benchmarking on M1 MBP. For largeish strings it delivers a roughly 2x speedup; further increase in blocksize nets a small win in microbenchmarks that I do not expect would translate to real world usage due to codesize impact and the fact that most strings are smallish.

There's some opportunity for further work here; in particular, if people start building Swift for a baseline of AVX2 or AVX512, we should have paths for that (and we should also implement them if/when we get better multiversioning dispatch machinery in the language). Span adoption would be interesting. It's likely we should have a dedicated "small core" implementation that uses only aligned accesses. Still, this is a significant improvement as-is, and we should land it.
@stephentyrone stephentyrone requested a review from a team as a code owner June 26, 2025 18:21
@stephentyrone
Copy link
Contributor Author

@swift-ci test

@stephentyrone
Copy link
Contributor Author

@swift-ci benchmark

Copy link
Contributor

@Catfish-Man Catfish-Man left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it. Discussed some minor caveats offline but nothing that should block landing this.

@benrimmington
Copy link
Contributor

Do you also need an #if SWIFT_STDLIB_ENABLE_VECTOR_TYPES (see StringUTF16View.swift for example).

@stephentyrone
Copy link
Contributor Author

stephentyrone commented Jun 27, 2025

We shouldn't, because the vector types should be unconditionally enabled, but as a practical matter we might.

@stephentyrone
Copy link
Contributor Author

@swift-ci test macOS

@stephentyrone
Copy link
Contributor Author

@swift-ci smoke test

Copy link
Contributor

@glessard glessard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

// } && allASCII(blockAt: base + last)
//
// but LLVM leaves one unnecessary conditional operation in the loop
// when we do that, so we write them out as while loops instead for now.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I'm repeating your own words to yourself, but we might want to link to an llvm issue/enhancement here.

@stephentyrone
Copy link
Contributor Author

@swift-ci benchmark

@Catfish-Man
Copy link
Contributor

@swift-ci please Apple Silicon benchmark

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants