Skip to content

Conversation

@mattt
Copy link
Contributor

@mattt mattt commented Oct 27, 2020

The current implementation of the description property required for ByteString to conform to CustomStringConvertible includes a call to fatalError if the buffer constitutes an invalid UTF-8 byte sequence:

https://github.com/apple/swift-tools-support-core/blob/e618e0f0f82fbfcff1a5a097ba0d019c7d6ddcc5/Sources/TSCBasic/ByteString.swift#L110-L116

According to the header documentation:

A ByteString represents a sequence of bytes.

This struct provides useful operations for working with buffers of
bytes. Conceptually it is just a contiguous array of bytes (UInt8), but it
contains methods and default behavor suitable for common operations done
using bytes strings.

This struct is not intended to be used for significant mutation of byte
strings, we wish to retain the flexibility to micro-optimize the memory
allocation of the storage (for example, by inlining the storage for small
strings or and by eliminating wasted space in growable arrays). For
construction of byte arrays, clients should use the WritableByteStream class
and then convert to a ByteString when complete.

Based on that, it's surprising to me that the type would enforce that the byte string is UTF-8 valid, or that calling description (most often implicitly) could cause a program to trap. I encountered this behavior when generating and verifying SHA256 checksums.

enum Error: Swift.Error {
  case mismatchedChecksums(actual: ByteString, expected: ByteString)
}

In the process of throw-ing that mismatchedChecksums error in a unit test, XCTest attempted to call the synthesized description property, which delegated to ByteString's implementation. My workaround was to change the associated value types to String and use each checksum's hexadecimalRepresentation.

Unfortunately, the context of this change appears to have been lost when TSCBasic et al. were consolidated into this monorepo (fcaa2ce), so I can only guess at the original motivation here. This PR replaces the existing implementation with a call to cString. Tests pass all the same (though FWIW, PkgConfigParserTests.testBrewPrefix is failing locally before and after the commit...), though I don't have particularly strong feelings with cString as the thing to use here. If anyone has a any different ideas, I'd be interested to hear them.

Remove fatalError from implementation of ByteString description
@MaxDesiatov
Copy link
Contributor

@swift-ci please test

Copy link
Contributor

@MaxDesiatov MaxDesiatov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I, for one, welcome this change. That fatalError there always made me feel uncomfortable.

@abertelrud
Copy link
Contributor

abertelrud commented Oct 29, 2020

Thanks for fixing this! I agree, this seems like a good change. I don't know the history here either, but a fatalError seems wrong in any case.

I had noticed the brewPrefix failure too, and have a PR up at #151 to fix it. It happened because state could leak between different unit tests.

I think we also have a new test failure in testWASITarget that's unrelated (it fails locally, at least).

@abertelrud
Copy link
Contributor

One of the changes here will be that if the byte string contains invalid UTF-8, they will be replaced with invalid-character markers. Since description isn't supposed to throw, that seems like the right thing.

The API here seems due for a cleanup at some point. Like many things in ToolsSupportCore, it dates back to when the Swift standard library wasn't so mature yet, and I know that one big problem in Swift 1 was the performance of String, and especially quickly going to and from UTF-8. That's all different now, with String being backed by UTF-8, so I would imagine that a lot of the APIs that use ByteStrings could really use regular strings today.

@MaxDesiatov
Copy link
Contributor

I think we also have a new test failure in testWASITarget that's unrelated (it fails locally, at least).

testWASITarget passes locally for me with both SwiftPM and TSC on the main branch. I think it would fail if TSC checkout is old enough not to contain #153.

@mattt
Copy link
Contributor Author

mattt commented Oct 30, 2020

@abertelrud Please let me know if there's anything left to do for this to get merged. I'm currently hitting this in my work on package registry (specifically testArtifactChecksumChange), and would love to have that test fail instead of trapping the whole test suite.

@abertelrud
Copy link
Contributor

This looks good to me.

@abertelrud abertelrud merged commit c52a61c into swiftlang:main Oct 30, 2020
@mattt mattt deleted the bytestring-nonfatal-description branch October 30, 2020 18:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants