This is possibly me being silly.. I'm currently working on something where I have to convert chars to u32s for storage, and then convert back. My code is only correct if ∀ (c1 : char) (c2 : char) . c1.cmp(c2) == (c1 as u32).cmp(c2 as u32). In other words conversion from char to u32 must preserve the ordering relation of values. (there is probably a name given to this notion of relation-preserving mapping, but I don't know what it is.. anyone know?)
Looking at char documentation, it says char represents "Unicode scalar value"s, but other than that it doesn't say what does conversion to u32 do, or whether they're represented as u32 internally, and whether comparison is simple u32 comparison (i.e. comparison of unicode scalar values) or something more complicated. It's also not easy to learn about this by looking at the source, becuase PartialOrd implementation of char relies on >, < etc. implementations (1, 2), which are I think hard-coded somewhere in the compiler.
Simple u32 comparison is probably the only sensible implementation, but that's not entirely clear to me. When I click on the "unicode scalar value" link in the documentation it doesn't say how the values must be compared. Without knowing more about unicode or "unicode scalar value" I'm not sure if e.g. a character can have multiple unicode scalar values but those need to be equal, or something like that.
This is possibly me being silly.. I'm currently working on something where I have to convert
chars tou32s for storage, and then convert back. My code is only correct if∀ (c1 : char) (c2 : char) . c1.cmp(c2) == (c1 as u32).cmp(c2 as u32). In other words conversion fromchartou32must preserve the ordering relation of values. (there is probably a name given to this notion of relation-preserving mapping, but I don't know what it is.. anyone know?)Looking at
chardocumentation, it sayscharrepresents "Unicode scalar value"s, but other than that it doesn't say what does conversion tou32do, or whether they're represented asu32internally, and whether comparison is simpleu32comparison (i.e. comparison of unicode scalar values) or something more complicated. It's also not easy to learn about this by looking at the source, becuasePartialOrdimplementation ofcharrelies on>,<etc. implementations (1, 2), which are I think hard-coded somewhere in the compiler.Simple
u32comparison is probably the only sensible implementation, but that's not entirely clear to me. When I click on the "unicode scalar value" link in the documentation it doesn't say how the values must be compared. Without knowing more about unicode or "unicode scalar value" I'm not sure if e.g. a character can have multiple unicode scalar values but those need to be equal, or something like that.