UnicodeScalar operators. #2439

Merged
merged 1 commit into from
Feb 1, 2024

johnno1962

@johnno1962 johnno1962 commented Jan 23, 2024

Hi Apple,

Some ideas being explored on this thread in Swift evolution trying to validate the use of a protocol extension for avoiding having to use UInt8(ascii:) all the time for low level code. The TL;DR is that this change tidies up the new Lexer code in Cursor.swift considerably but I've not been able to quantify any performance regression for a Release build. For details on how it was benchmarked see the remainder of the character-literals branch.

Cheers

@ahoppen
Collaborator

ahoppen commented Jan 24, 2024

Just to make sure I understand this correctly (I haven’t looked at the code yet):

@johnno1962
Author

johnno1962 commented Jan 24, 2024

Thanks for this info @ahoppen,

At this stage I'm looking to validate the concept with the perfect test project swift-syntax. Your benchmarking is more exacting than mine and I'm seeing this as a baseline with the original code

% swift build -c release --package-path SwiftParserCLI
% ./SwiftParserCLI/.build/release/swift-parser-cli performance-test --directory ../swift/test --iterations 10
Time: 742.353892326355ms
Instructions: 10026665060.6

And this for the code I suggested:

Time: 996.0793972015381ms
Instructions: 12108098163.4

Which I guess is not good news for this approach. There is a compromise using this extension instead of the operators:

extension UInt8: ExpressibleByUnicodeScalarLiteral {
    /// Make UInt8 expressible by "c" (probably not worth it)
    @_transparent
    public init(unicodeScalarLiteral value: UnicodeScalar) {
        self.init(value.value)
    }
}

Which yields something in the middle.

Time: 875.1207947731018ms
Instructions: 11375745753.9

I imagine it would be difficult to justify a one-time aesthetic code tidy-up which in any way degraded performance.

@johnno1962
Author

johnno1962 commented Jan 24, 2024

An update, by changing @_transparent to @inline(__always) in my extensions I get the following results:

Time: 745.8992958068848ms
Instructions: 10250689930.1

For the "ExpressibleBy" extension alternative I mentioned:

Time: 739.4865036010742ms
Instructions: 10204277326.7

i.e. no slow down which is much more encouraging!

@johnno1962
Author

johnno1962 commented Jan 25, 2024

Hi @ahoppen, I've been verifying a few more things, for example, that build time is not affected by the new code. Also, as I noted in the evolution post, for a Debug build, performance is about 30% down though this is relative to Debug builds being about 10x slower anyway so perhaps this would be less noticeable.

Debug, original code:
Time: 58800.06802082062ms
Instructions: 787493993112.0

Debug, using this PR:
Time: 86279.06596660614ms
Instructions: 1140467798820.0

If it's of interest to you, I'd like to present this PR for merging into swift-syntax now. Although my eventual aim is to make the extensions that facilitate direct comparisons between integers and strings (UnicodeScalars) available in the standard library, it would be useful if they were thoroughly exercised first in another project, and this would also help make the case when I eventually pitch to the stdlib. Waiting until they were available in stdlib would only introduce a delay of a number of years before the new coding style could be adopted. I've checked that the PR builds with a toolchain that also includes the new operators in its stdlib.

Over to you (unless you have any other changes you'd like me to make.)

@ahoppen
Collaborator

ahoppen commented Jan 26, 2024

My preference would be to not take this PR. It makes the code harder to read because it deviates from how UInt8 comparisons are done in any other Swift codebase. Apart from that preference, I also think that any kind of compile-time or build-time regression is not acceptable.

@johnno1962
Author

Thanks @ahoppen, I quite understand a position: "Why would I make a change to the code making it more difficult to understand while at the same time making it run slower while I'm debugging"!

I've pushed a final commit using a simpler ExpressibleByUnicodeScalarLiteral extension only (which introduces a warning with development versions of the compiler). This brings Release and Debug build and run-time performance, before and after the PR, to be almost identical. If you don't feel the refactor is clearer, however, I guess there isn't much I can do about that.

Debug before:
% time swift build -c debug --package-path SwiftParserCLI
Build complete! (21.25s)
swift build -c debug --package-path SwiftParserCLI 57.91s user 6.08s system 294% cpu 21.750 total
Build complete! (19.49s)
swift build -c debug --package-path SwiftParserCLI 59.07s user 5.73s system 326% cpu 19.867 total
% ./SwiftParserCLI/.build/debug/swift-parser-cli performance-test --directory ../swift/test --iterations 1
Time: 59025.82097053528ms
Instructions: 787613934024.0
Time: 60317.728996276855ms
Instructions: 787577146883.0
Time: 59705.91497421265ms
Instructions: 787433747372.0

Debug after PR:
% time swift build -c debug --package-path SwiftParserCLI
Build complete! (21.92s)
swift build -c debug --package-path SwiftParserCLI 59.02s user 7.42s system 295% cpu 22.466 total
Build complete! (21.61s)
swift build -c debug --package-path SwiftParserCLI 60.11s user 6.79s system 304% cpu 21.998 total
% ./SwiftParserCLI/.build/debug/swift-parser-cli performance-test --directory ../swift/test --iterations 1
Time: 58846.640944480896ms
Instructions: 791415934358.0
Time: 58897.018909454346ms
Instructions: 792114724719.0
Time: 58796.53799533844ms
Instructions: 792059332230.0

Release before:
% time swift build -c release --package-path SwiftParserCLI
Build complete! (171.27s)
swift build -c release --package-path SwiftParserCLI 249.60s user 24.43s system 159% cpu 2:51.81 total
Build complete! (163.93s)
swift build -c release --package-path SwiftParserCLI 248.68s user 19.04s system 162% cpu 2:44.37 total
% ./SwiftParserCLI/.build/release/swift-parser-cli performance-test --directory ../swift/test --iterations 10
Time: 741.7734026908875ms
Instructions: 10027644813.8
Time: 741.2122011184692ms
Instructions: 10030413083.4
Time: 739.5735025405884ms
Instructions: 10027683834.2

Release after PR:
% time swift build -c release --package-path SwiftParserCLI
Build complete! (160.12s)
swift build -c release --package-path SwiftParserCLI 248.15s user 15.97s system 164% cpu 2:40.65 total
Build complete! (159.21s)
swift build -c release --package-path SwiftParserCLI 247.30s user 16.22s system 165% cpu 2:39.65 total
% ./SwiftParserCLI/.build/release/swift-parser-cli performance-test --directory ../swift/test --iterations 10
Time: 745.3409910202026ms
Instructions: 10204463495.4
Time: 742.9486036300659ms
Instructions: 10198957963.7
Time: 742.7903056144714ms
Instructions: 10200009868.6

Unfortunately, using an ExpressibleBy extension could never be in the standard library as it is a bit of a loose cannon and would allow nonsense expressions such as "a" * "a" to be valid in an integer context. So, I guess the space is well and truly explored. Thanks for your time, the performance-test benchmark has been a great help.
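
To make the "loose cannon" concern concrete, here is a minimal sketch, assuming the extension shown earlier is declared at file scope with `@inline(__always)`. The variable names are illustrative only, and newer compilers may warn about the retroactive conformance.

```swift
// Sketch only: conform UInt8 to ExpressibleByUnicodeScalarLiteral, as discussed above.
extension UInt8: ExpressibleByUnicodeScalarLiteral {
    @inline(__always)
    public init(unicodeScalarLiteral value: Unicode.Scalar) {
        // Traps at run time if the scalar value does not fit in a byte.
        self.init(value.value)
    }
}

// Intended use: compare a byte against a character-like literal.
let byte: UInt8 = 0x61
let isLowercaseA = byte == "a"

// Unintended consequence: literals can now take part in integer arithmetic.
// Tab (9) times newline (10) type-checks and evaluates to 90.
let nonsense: UInt8 = "\t" * "\n"

print(isLowercaseA, nonsense)
```

The multiplication compiles because both literals are inferred as `UInt8`, which is the kind of expression the conformance cannot rule out.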

@ahoppen
Collaborator

ahoppen commented Jan 26, 2024

Oh, this looks a lot better now. And I just checked that

func testMyStuff(x: UInt8) {
    switch x {
    case "A", "B", "C",
      "D", "E", "F",
      "G", "H", "I",
      "J", "K", "L",
      "M", "N", "O",
      "P", "Q", "R",
      "S", "T", "U",
      "V", "W", "X",
      "Y", "Z",
      "a", "b", "c",
      "d", "e", "f",
      "g", "h", "i",
      "j", "k", "l",
      "m", "n", "o",
      "p", "q", "r",
      "s", "t", "u",
      "v", "w", "x",
      "y", "z",
      "_":
      print("x")
    default:
      break
    }
}

compiles down to the same IR as when using UInt8(ascii:), so there shouldn’t be any runtime performance regression.
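
For comparison, the baseline spelling that the literal form is measured against looks roughly like this. It is a shortened sketch with a hypothetical function name, not code from swift-syntax.

```swift
// Baseline style: every byte constant is spelled with UInt8(ascii:).
func isUnderscoreOrAB(_ x: UInt8) -> Bool {
    switch x {
    case UInt8(ascii: "A"), UInt8(ascii: "B"), UInt8(ascii: "_"):
        return true
    default:
        return false
    }
}

print(isUnderscoreOrAB(0x5F))  // 0x5F is ASCII "_"
print(isUnderscoreOrAB(0x30))  // 0x30 is ASCII "0"
```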

@johnno1962
Author

Yes, things are shaping up better now. Were you looking at the last commit where I was able to tune the operator approach? Looking at the code more closely it was the comparison operations on an optional that were the problem. If you stick to concrete types rather than protocols the speed regression disappears. The last commit may even be a percentage point or two faster than the baseline!
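
One possible reading of "concrete types rather than protocols" is an overload declared directly on `UInt8?` instead of a generic or protocol-based one. The sketch below assumes that reading. The operator body and the `peek` helper are hypothetical, not the PR's code.

```swift
// Hypothetical sketch: an equality overload on the concrete type UInt8?.
extension Optional where Wrapped == UInt8 {
    @_transparent
    static func == (byte: Self, scalar: Unicode.Scalar) -> Bool {
        // Falls through to the standard Optional<UInt8> equality; nil never matches.
        return byte == UInt8(ascii: scalar)
    }
}

// Hypothetical stand-in for a lexer "peek" that returns nil at end of input.
func peek(_ bytes: [UInt8], at index: Int) -> UInt8? {
    return bytes.indices.contains(index) ? bytes[index] : nil
}

let input = Array("a+b".utf8)
print(peek(input, at: 1) == "+")  // byte present and equal
print(peek(input, at: 3) == "+")  // past the end, so nil
```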

@ahoppen
Collaborator

ahoppen commented Jan 27, 2024

Yes, I looked at the last version of the PR that only adds the operator overloads.

I think we’re good to take this but I would prefer a couple minor changes:

  • I would only define the ==, != and ~= operators. <, > seem to only be used once and <=, >= and - seem to not be used at all. To avoid adding more symbols to operator lookup when not necessary I would prefer to remove them
  • Could you reformat the switch statements? Each case can now hold a lot more than three characters while staying within the 160-column limit.
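
A reduced operator set of this shape might look like the sketch below. The `==` signature and the `@_transparent` annotation match the snippet quoted in the review further down. The function bodies, the other two operators' signatures, and the `isHexPrefixLetter` helper are assumptions for illustration.

```swift
extension UInt8 {
    /// Equality between a byte and an ASCII scalar literal.
    @_transparent
    static func == (i: Self, s: Unicode.Scalar) -> Bool {
        // UInt8(ascii:) traps for non-ASCII scalars.
        return i == UInt8(ascii: s)
    }

    /// Inequality between a byte and an ASCII scalar literal.
    @_transparent
    static func != (i: Self, s: Unicode.Scalar) -> Bool {
        return i != UInt8(ascii: s)
    }

    /// Pattern match, so scalar literals can appear in `case` labels.
    @_transparent
    static func ~= (s: Unicode.Scalar, i: Self) -> Bool {
        return i == UInt8(ascii: s)
    }
}

// Hypothetical helper showing the literals in a switch over a byte.
func isHexPrefixLetter(_ byte: UInt8) -> Bool {
    switch byte {
    case "x", "X":
        return true
    default:
        return false
    }
}

let byte: UInt8 = 0x78  // ASCII "x"
print(byte == "x", byte != "y", isHexPrefixLetter(byte))
```

Unlike an `ExpressibleByUnicodeScalarLiteral` conformance, these overloads only enable comparison and pattern matching, so arithmetic on literals still fails to type-check.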

@johnno1962
Author

Great! I've made the changes you asked for, let me know if there is anything else. I've been able to produce a toolchain with these operators and using @_alwaysEmitIntoClient they seem to backport fine:

Release performance-test:
./SwiftParserCLI/.build/release/swift-parser-cli performance-test --directory ../swift/test --iterations 10
Time: 762.2112989425659ms
Instructions: 10422817218.5
Time: 761.7305994033813ms
Instructions: 10423392805.5
Time: 765.9548997879028ms
Instructions: 10415461468.1

Debug performance-test:
% ./SwiftParserCLI/.build/debug/swift-parser-cli performance-test --directory ../swift/test --iterations 1
Time: 56992.75600910187ms
Instructions: 781180707632.0
Time: 56739.298939704895ms
Instructions: 781062066976.0
Time: 56700.37305355072ms
Instructions: 781040261203.0

A good result, slightly slower for a release build using the toolchain but perhaps that can be fine tuned later. The version you're using should be faster. Thanks for your patience!

@ahoppen
Collaborator

ahoppen commented Jan 27, 2024

Oh, I only now spotted that there is a performance regression in release builds. I didn’t count the zeros correctly in 10027644813.8 and 10204463495.4 from your last measurement.

To me, a requirement for this PR is that it compiles to the exact same code as before and doesn’t have any performance regression. We have jumped through bigger hoops to get a 2% performance improvement and it would be a shame to lose them for something that is fairly minor and local like this.

@johnno1962
Author

johnno1962 commented Jan 27, 2024

Fair enough, you may be good to go now. I don't know about the instruction count but we're down into the 730's.

% ./SwiftParserCLI/.build/release/swift-parser-cli performance-test --directory ../swift/test --iterations 10
Time: 737.7013087272644ms
Instructions: 10257822390.4
Time: 738.1468057632446ms
Instructions: 10261760931.5
Time: 740.6180024147034ms
Instructions: 10257334543.6
Time: 739.7215962409973ms
Instructions: 10258337438.3
Time: 739.1029000282288ms
Instructions: 10258197234.8
Time: 740.1492953300476ms
Instructions: 10254890075.9

The toolchain regression seems to be related to @_alwaysEmitIntoClient and I can look at that later.

@ahoppen
Collaborator

ahoppen commented Jan 27, 2024

I don’t understand how @_alwaysEmitIntoClient would have any effect but 🤷🏽

Regarding instruction counts: I found that they are a very stable way of measuring performance and are usually what I use because there’s so much less noise. And especially because we are aiming to produce the same binary after the change, there’s no noise created by potential delays when waiting for memory (which might not increment the instruction count), so I think we should evaluate performance based on instruction count.

@johnno1962
Author

OK, leave it with me but if you spot anything obvious in the operator code like my last change let me know.

@johnno1962 johnno1962 force-pushed the pr-back branch 2 times, most recently from ffc5dc1 to 74161fd Compare January 27, 2024 15:27
@johnno1962
Author

johnno1962 commented Jan 29, 2024

Good Morning @ahoppen,

I took a long look at these rogue instruction counts over the weekend. I approached this by copying the main branch version of Sources/SwiftParser/Lexer/Cursor.swift to one side then fetching my branch for this PR and copying the copy of Cursor.swift back into the repo. I then slowly worked through discarding the diffs one by one, progressively reinstating the proposed changes. What I found was the instruction count slowly built up till you reach a certain point with two or three hunks remaining to revert, where it suddenly jumped up to the values you're seeing with even the smallest change. It seems like there is something non-linear (a fixed size optimisation window or page size or something - choose your explanation) controlling the instruction count. All the while I had the impression the run-time elapsed time was decreasing ever so slightly. So, it seems it is possible for code to be executing more instructions and yet execute more quickly in real time, which has to be the measurement to keep an eye on even if it is more variable.

So, all I can do is gather statistics and document that this conclusion is valid. Looking first at build times (using the reps.* scripts I checked into my character-literals branch) I get the following results:

Release build time before:
Time: 165.696s Δ3.528 2.13%
Time: 164.148s Δ2.171 1.32%
Release build time after:
Time: 164.308s Δ2.959 1.80%
Time: 163.419s Δ4.524 2.77%

The Δ figure is the standard deviation across multiple runs and the % figure the deviation divided by the mean which is a normalised measure of variability. So, given how variable build times are there is no evidence of any significant difference.
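
As a rough sketch of how such a summary line can be computed: the `summarize` function below is hypothetical and is not the reps.* scripts. It assumes a population standard deviation, which the thread does not specify.

```swift
/// Mean, standard deviation (Δ), and the deviation as a percentage of the mean.
func summarize(_ samples: [Double]) -> (mean: Double, deviation: Double, percent: Double) {
    precondition(!samples.isEmpty, "need at least one sample")
    let count = Double(samples.count)
    let mean = samples.reduce(0, +) / count
    // Assumption: population standard deviation (divide by n, not n - 1).
    let squaredDistances = samples.map { ($0 - mean) * ($0 - mean) }
    let deviation = (squaredDistances.reduce(0, +) / count).squareRoot()
    return (mean, deviation, deviation / mean * 100)
}

// Three release timings (ms) quoted earlier in this thread, used only as sample input.
let timings = [741.7734026908875, 741.2122011184692, 739.5735025405884]
let summary = summarize(timings)
print("Time: \(summary.mean)ms Δ\(summary.deviation) \(summary.percent)%")
```

Dividing the deviation by the mean makes the variability of a 160 s build and a 740 ms run directly comparable.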

Turning to the run-time performance results I'm seeing the following:

Release runtime performance before:
Time: 742.507ms Δ4.979 0.67%
Instructions: 10028272804.338 Δ2477806.458 0.02%
Time: 741.837ms Δ3.593 0.48%
Instructions: 10028178965.391 Δ2867472.287 0.03%

Release runtime performance after:
Time: 738.265ms Δ3.064 0.41%
Instructions: 10259635879.697 Δ2625566.019 0.03%
Time: 737.977ms Δ3.025 0.41%
Instructions: 10258652211.847 Δ2532919.354 0.02%

Debug runtime performance before:
Time: 58729.210ms Δ67.024 0.11%
Instructions: 787565050168.200 Δ183473904.498 0.02%

Debug runtime performance after:
Time: 57993.601ms Δ85.047 0.15%
Instructions: 793125877908.400 Δ256072306.405 0.03%

You can see how much more variable the time measurements are and yet if you run enough repetitions (100 in this case) there does seem to be a detectable improvement in real life performance despite a ~2% increase in instruction counts.

With respect to @_alwaysEmitIntoClient it wasn't making a difference in the end as the 20ms slow down I was seeing seems to be inherent in preparing a toolchain from the Swift sources and comparing it to that of an actual Xcode release.

TBH this PR is in better shape than I anticipated. I'd expected to be having to argue for the refactor in the face of build or run time speed regressions (however small) but I've been unable to detect any evidence of either (if anything quite the opposite). I'd like to think this would be enough evidence for a high level of assurance merging it wouldn't be a mistake.

@ahoppen
Collaborator

ahoppen commented Jan 29, 2024

That is interesting but I would trust the instruction counts here more than the time, because:

  • Execution time is easily influenced by external factors like general CPU temperature or whether more processes are running in the background. In my experience it’s quite easy to get systematic errors here
  • Based on the PR, I don’t see how it could speed up execution time. In the best case it should compile down to the same assembly as the original code.
  • My statistical knowledge is a little rusty but the difference in execution time is within the standard deviation while the difference in instruction count is well outside the standard deviation. I believe this indicates that the difference in execution time might be due to statistical fluctuations while the difference in instruction count is statistically significant.
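
The last point can be checked against the first "before" and "after" release figures quoted above. This is a rough comparison rather than a formal significance test, and `deviationsApart` is a hypothetical helper.

```swift
/// How many standard deviations separate two means.
func deviationsApart(before: Double, after: Double, deviation: Double) -> Double {
    return abs(after - before) / deviation
}

// Figures copied from the release "before" and "after" summary lines in this thread.
let timeGap = deviationsApart(before: 742.507, after: 738.265, deviation: 4.979)
let instructionGap = deviationsApart(
    before: 10028272804.338, after: 10259635879.697, deviation: 2477806.458)

print(timeGap)         // below 1: inside one standard deviation
print(instructionGap)  // in the nineties: far outside it
```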

I think what should be investigated here (and I think would also be important if this became a language feature) is why it compiles down to different binary code and what can be done to make sure it’s a transparent change as far as compilation is concerned.

@johnno1962
Author

More numbers for today after reverting a couple of commits. The most important change was reverting to an annotation of @_transparent instead of @inline(__always) for the operators. I also reverted the only "optimisation" to Cursor.swift I had made, which checked for nil values separately from switch statements. You should find the instruction counts are in line with your expectations now while wall clock performance has remained slightly improved over the baseline.

Release runtime performance before PR:
Time: 742.507ms Δ4.979 0.67%
Instructions: 10028272804.338 Δ2477806.458 0.02%
Time: 741.837ms Δ3.593 0.48%
Instructions: 10028178965.391 Δ2867472.287 0.03%
Time: 742.815ms Δ4.806 0.65%
Instructions: 10027972755.874 Δ2491993.943 0.02%
Time: 741.211ms Δ3.782 0.51%
Instructions: 10027619991.396 Δ2432832.541 0.02%

Release runtime performance after PR:
Time: 739.074ms Δ4.193 0.57%
Instructions: 10021806841.520 Δ2597281.399 0.03%
Time: 737.766ms Δ1.611 0.22%
Instructions: 10020808717.379 Δ2679588.421 0.03%
Time: 738.697ms Δ3.448 0.47%
Instructions: 10021543757.248 Δ2826853.241 0.03%
Time: 738.266ms Δ1.921 0.26%
Instructions: 10021309207.203 Δ2554471.708 0.03%

Debug runtime performance before PR:
Time: 58729.210ms Δ67.024 0.11%
Instructions: 787565050168.200 Δ183473904.498 0.02%
Time: 59045.758ms Δ896.555 1.52%
Instructions: 787474677956.600 Δ144497624.230 0.02%

Debug runtime performance after PR:
Time: 57762.088ms Δ115.130 0.20%
Instructions: 793731368393.700 Δ201101277.253 0.03%
Time: 57961.811ms Δ228.974 0.40%
Instructions: 793822097838.000 Δ253472432.254 0.03%

Collaborator

@ahoppen ahoppen left a comment

Oh, nice that the instruction counts are the same now 🎉 One minor comment, otherwise looks good to me.

And could you squash your commits? Just makes for a nicer git history https://github.com/apple/swift-syntax/blob/main/CONTRIBUTING.md#authoring-commits

Comment on lines 252 to 254
/// Basic equality operators
@_transparent
static func == (i: Self, s: Unicode.Scalar) -> Bool {
Collaborator

The doc comment applies only to the == function, so Basic equality operators doesn’t really make sense.

@johnno1962
Author

johnno1962 commented Jan 30, 2024

Doc comment edited and duly squashed.

@ahoppen
Collaborator

ahoppen commented Jan 31, 2024

@swift-ci Please test

@ahoppen
Collaborator

ahoppen commented Jan 31, 2024

@swift-ci Please test Windows

@johnno1962
Author

@ahoppen, I have a commit with the indentation fixed. Do you want me to --amend it onto this PR?

@johnno1962
Author

I've force pushed the indentation fix if someone wants to @swift-ci Please test again

@ahoppen
Collaborator

ahoppen commented Jan 31, 2024

@swift-ci Please test

And just in case you were wondering, only contributors with commit access can trigger CI https://github.com/apple/swift-syntax/blob/main/CONTRIBUTING.md#review-and-ci-testing

@ahoppen
Collaborator

ahoppen commented Feb 1, 2024

@swift-ci Please test

@ahoppen
Collaborator

ahoppen commented Feb 1, 2024

@swift-ci Please test

@ahoppen
Collaborator

ahoppen commented Feb 1, 2024

@swift-ci Please test Windows

@ahoppen ahoppen merged commit 114a6a1 into apple:main Feb 1, 2024
3 checks passed
@johnno1962
Author

Excellent, thanks @ahoppen. I love it when a plan comes together.
