`<print>`: `std::println("{}", ...)` UTF-8 Truncation Bug with `/utf-8` (256-byte Buffer Split) #5894

# Describe the bug

When compiling with MSVC and `/utf-8`, `std::println` corrupts long UTF-8 strings at internal buffer boundaries when a formatting argument is used (`std::println("{}", str)`). The character that straddles a boundary is printed as U+FFFD replacement characters instead of the original text.

# Reproduction Code and Output

```cpp
#include <print>

int main()
{
    std::println("{}", "这是一段超长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长的文本。");
}
```

**Compilation Command:** `cl.exe /std:c++latest /utf-8 repro.cpp`

**Compiler Version:** `Microsoft (R) C/C++ Optimizing Compiler Version 19.50.35718 for x86`

**Expected Behavior:** The UTF-8 string is output completely and correctly.

**Observed Behavior:** `这是一段超长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长长���的文本。`

# Possible Cause

The issue is likely caused by the output mechanism splitting the long string into small, fixed-size chunks (e.g., 256 bytes) before sending them to the console.
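To illustrate why a fixed chunk size is a problem for this input, here is a minimal sketch. It assumes the 256-byte chunk size guessed above; `is_continuation`, `split_lands_mid_character`, and `repeated_cjk` are hypothetical helpers written for this illustration and are not part of the STL. Every CJK character in the repro string is 3 bytes in UTF-8, and 256 is not a multiple of 3, so a cut at byte offset 256 falls inside a character.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// True if `b` is a UTF-8 continuation byte (bit pattern 10xxxxxx).
constexpr bool is_continuation(unsigned char b) noexcept {
    return (b & 0xC0) == 0x80;
}

// True if cutting `text` at byte offset `chunk_size` would separate a lead
// byte from its continuation bytes, i.e. the cut lands inside a character.
constexpr bool split_lands_mid_character(std::string_view text, std::size_t chunk_size) noexcept {
    return chunk_size < text.size()
        && is_continuation(static_cast<unsigned char>(text[chunk_size]));
}

// Builds `count` copies of U+957F, the repeated character in the repro,
// which encodes to the three bytes E9 95 BF.
inline std::string repeated_cjk(std::size_t count) {
    std::string s;
    s.reserve(count * 3);
    for (std::size_t i = 0; i < count; ++i) {
        s += "\xE9\x95\xBF";
    }
    return s;
}
```

With 100 such characters (300 bytes), `split_lands_mid_character(repeated_cjk(100), 256)` is true because 256 = 85 * 3 + 1, while a cut at 255 would fall exactly between two characters.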

The failure is suspected to lie in the specialized handler responsible for committing these chunks to the console: `_Fmt_iterator_flush<_Print_to_unicode_console_it>`.

This handler, which manages the UTF-8 to console conversion, appears to simply pass the raw byte chunk's range (`_First` to `_Last`) to the underlying write function without ensuring the chunk contains complete UTF-8 characters:

```cpp
// https://github.com/microsoft/STL/blob/main/stl/inc/print
template <>
struct _Fmt_iterator_flush<_Print_to_unicode_console_it> {
    static _Print_to_unicode_console_it _Flush(
        const char* const _First, const char* const _Last, _Print_to_unicode_console_it _Output) {
        _STD _Print_noformat_unicode_to_console_nonlocking(_Output._Get_console_handle(), {_First, _Last});
        return _Output;
    }
};
```

If a chunk ends in the middle of a multi-byte UTF-8 character, committing the incomplete sequence at this point may cause the downstream `MultiByteToWideChar` conversion to treat the partial bytes as invalid, resulting in the observed U+FFFD characters. This suggests that a UTF-8 boundary check may be missing from this specialization before the data is written to the console handle.
