Affected version: XSD 4.2.0 (C++/Tree mapping)
Affected files: xsd/cxx/tree/serialization/int.hxx
(and likely the analogous headers for long, short,
byte, unsigned-int, unsigned-long, integer, etc.)
Summary
The serialization operators for built-in integer types use a
std::basic_ostringstream without calling imbue(std::locale::classic()).
As a result, when the global C++ locale uses digit grouping (e.g. de_DE,
where 42000 is formatted as "42.000"), the generated XML contains
group-separator characters in numeric fields. This violates the
XML Schema specification, which requires xs:int, xs:long, xs:integer
etc. to be serialized as a plain digit sequence with no grouping.
Reproduction
Minimal schema:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="value" type="xs:int"/>
</xs:schema>
Minimal program:
#include <iostream>
#include <locale>
#include <sstream>
#include "schema.hxx"

int main() {
    // Switch the global C++ locale to one that groups digits.
    std::locale::global(std::locale("de_DE.UTF-8"));
    value_t v(42000);
    xml_schema::namespace_infomap map;
    std::ostringstream os;
    os.imbue(std::locale::classic()); // does not help
    value_(os, v, map);
    std::cout << os.str() << std::endl;
}
Expected output:
42000
Actual output:
42.000
The imbue() call on the user-supplied stream has no effect because the
serialization functions internally construct their own ostringstream
which inherits the global locale at construction time.
Root cause
In xsd/cxx/tree/serialization/int.hxx, the insertion operators
construct a basic_ostringstream without imbuing the classic locale
before writing the value. Compare with xsd/cxx/tree/serialization/element.hxx,
where the insert() helper does call imbue(std::locale::classic())
correctly. The fix in element.hxx appears to have been intentional;
the integer-type headers seem to have been overlooked.
Suggested fix
Add os.imbue(std::locale::classic()) immediately after constructing
the basic_ostringstream in each of the affected operator<< overloads,
analogous to the existing fix in element.hxx. The same fix should be
applied consistently across all numeric-type serialization headers
(int.hxx, long.hxx, short.hxx, byte.hxx, the unsigned variants, and
integer.hxx / non-negative-integer.hxx etc.).
Workaround context
This is particularly painful for users who ship XSD-generated code as
part of a library or DLL, where modifying the global C++ locale is not
acceptable (it would affect the host application's behaviour). Per-thread
locale tricks (uselocale on POSIX, _configthreadlocale on Windows) do
not reliably propagate to std::basic_ostringstream in all standard
library implementations, so a library-side fix is the only robust
solution.
Environment
- XSD: 4.2.0
- Compiler: MSVC 2022
- OS: Windows 11
Happy to provide a patch if helpful.