Skip to content

[stdlib] Handle String's 32-bit inline representation for Span #82442

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 27, 2025

Conversation

glessard
Copy link
Contributor

@glessard glessard commented Jun 24, 2025

String's inline representation is currently non-contiguous on 32-bit platforms. Furthermore, that representation is baked in ABI for watchOS's 32-bit variant. This PR implements mitigations.

A. For 32-bit platforms without ABI stability, we reduce the size of the inline representation to 8 UTF-8 code units (from 10).

B. For 32-bit watchOS, we add an optional property var _span: Span<UTF8.CodeUnit>? that returns nil when the representation of the String is non-contiguous, that is when it is 9 or 10 UTF-8 code units.

All platforms will get this underscored property, so that code that supports 32-bit watchOS doesn't have to be specific to it.

@glessard glessard requested a review from a team as a code owner June 24, 2025 02:04
@glessard
Copy link
Contributor Author

@swift-ci please test

Copy link
Member

@lorentey lorentey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! (But also: ugh 😣)

return 10
#elseif _pointerBitWidth(_32) || _pointerBitWidth(_16)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have platforms with 16-bit pointers? 😮

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're not impossible!

@glessard
Copy link
Contributor Author

@swift-ci please test

1 similar comment
@glessard
Copy link
Contributor Author

@swift-ci please test

@glessard
Copy link
Contributor Author

@swift-ci please test linux platform

@atrick
Copy link
Contributor

atrick commented Jun 26, 2025

This patch works for me... The getter's 'self' is non-copyable (no-implicit-copy). I'm not actually sure how to avoid that.
At any rate, non-copyable values can't be captured, and boolean operators capture their operands...

diff --git a/stdlib/public/core/Substring.swift b/stdlib/public/core/Substring.swift
index ae55c6fd9a0..b9dc2299827 100644
--- a/stdlib/public/core/Substring.swift
+++ b/stdlib/public/core/Substring.swift
@@ -871,10 +871,13 @@ extension Substring.UTF8View {
   public var _span: Span<UTF8.CodeUnit>? {
     @lifetime(borrow self)
     borrowing get {
-      if _wholeGuts.isSmall &&
-         _wholeGuts.count > _SmallString.contiguousCapacity() {
-        // substring is spannable only when the whole string is spannable.
-        return nil
+      let isSmall = _wholeGuts.isSmall
+      if isSmall {
+        let count = _wholeGuts.count
+        if count > _SmallString.contiguousCapacity() {
+          // substring is spannable only when the whole string is spannable.
+          return nil
+        }
       }
       return _underlyingSpan()
     }
diff --git a/stdlib/public/core/UTF8Span.swift b/stdlib/public/core/UTF8Span.swift
index a7f5aaeb5d3..99a8effb0bb 100644
--- a/stdlib/public/core/UTF8Span.swift
+++ b/stdlib/public/core/UTF8Span.swift
@@ -426,10 +426,13 @@ extension Substring {
   public var _utf8Span: UTF8Span? {
     @lifetime(borrow self)
     borrowing get {
-      if _wholeGuts.isSmall &&
-          _wholeGuts.count > _SmallString.contiguousCapacity() {
-        // substring is spannable only when the whole string is spannable.
-        return nil
+      let isSmall = _wholeGuts.isSmall
+      if isSmall {
+        let count = _wholeGuts.count
+        if count > _SmallString.contiguousCapacity() {
+          // substring is spannable only when the whole string is spannable.
+          return nil
+        }
       }
       return unsafe UTF8Span(
         unchecked: _underlyingSpan(), isKnownASCII: base._guts.isASCII

@atrick
Copy link
Contributor

atrick commented Jun 26, 2025

Alternatively, remove the ‘borrowing’ keyword from the _span getters:

diff --git a/stdlib/public/core/Substring.swift b/stdlib/public/core/Substring.swift
index ae55c6fd9a0..afae49f9f2f 100644
--- a/stdlib/public/core/Substring.swift
+++ b/stdlib/public/core/Substring.swift
@@ -870,7 +870,7 @@ extension Substring.UTF8View {
   @available(SwiftStdlib 6.2, *)
   public var _span: Span<UTF8.CodeUnit>? {
     @lifetime(borrow self)
-    borrowing get {
+    get {
       if _wholeGuts.isSmall &&
          _wholeGuts.count > _SmallString.contiguousCapacity() {
         // substring is spannable only when the whole string is spannable.
diff --git a/stdlib/public/core/UTF8Span.swift b/stdlib/public/core/UTF8Span.swift
index a7f5aaeb5d3..e567e1c274a 100644
--- a/stdlib/public/core/UTF8Span.swift
+++ b/stdlib/public/core/UTF8Span.swift
@@ -425,7 +425,7 @@ extension Substring {
   @available(SwiftStdlib 6.2, *)
   public var _utf8Span: UTF8Span? {
     @lifetime(borrow self)
-    borrowing get {
+    get {
       if _wholeGuts.isSmall &&
           _wholeGuts.count > _SmallString.contiguousCapacity() {
         // substring is spannable only when the whole string is spannable.

- computed properties have an ABI impact; a function is fine here
@glessard glessard force-pushed the rdar147500261-utf8span-32bit branch from fab3154 to 791dd4c Compare June 26, 2025 05:02
@glessard
Copy link
Contributor Author

I circumvented the autoclosure by using the condition-list syntax, rather than combining expressions with &&. Thanks for the reminder @atrick!

@glessard glessard force-pushed the rdar147500261-utf8span-32bit branch from c19b58d to 56aea71 Compare June 26, 2025 19:04
@glessard glessard force-pushed the rdar147500261-utf8span-32bit branch from 56aea71 to db664fb Compare June 26, 2025 20:19
@glessard
Copy link
Contributor Author

@swift-ci please clean test

@glessard glessard merged commit 96f6f5e into swiftlang:main Jun 27, 2025
5 checks passed
@glessard glessard deleted the rdar147500261-utf8span-32bit branch June 27, 2025 22:18
@glessard glessard restored the rdar147500261-utf8span-32bit branch June 27, 2025 22:18
@glessard glessard deleted the rdar147500261-utf8span-32bit branch June 28, 2025 09:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants