Define conversions across types #319

andreubotella · 2020-07-20T17:01:07Z

As per whatwg/encoding#215 (comment), we might want to enable Infra types to define conversions from and to other types.

annevk · 2020-07-22T11:01:37Z

So what do we need here?

Byte sequence as list.
String as list. Code unit and code point, presumably? Encoding needs code point, but I suspect in other places we would want to do code units, if anything.

And the reverse?

Also, should we make it implicit so you can write <a for=list>For each</a> <var>byte</var> of <var>bytes</var> or do we want <var>bytes</var> to be explicitly converted to a list first?

Or even further, do we want to say that byte sequences and strings are fundamentally lists? (I guess that doesn't work for strings due to code unit/code point stuff.)
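To make the code unit versus code point distinction concrete, here is a minimal Python sketch of the list views discussed above and their reverse. The function names are hypothetical and not taken from Infra; Python's `str` is treated as a sequence of code points, and UTF-16 code units are obtained through the standard `utf-16-le` codec.

```python
def byte_sequence_to_list(data: bytes) -> list:
    # A byte sequence viewed as a list of integers in the range 0..255.
    return list(data)


def list_to_byte_sequence(items: list) -> bytes:
    # The reverse; bytes() raises ValueError for items outside 0..255.
    return bytes(items)


def string_to_code_points(s: str) -> list:
    # One list item per code point.
    return [ord(c) for c in s]


def string_to_code_units(s: str) -> list:
    # One list item per UTF-16 code unit. "surrogatepass" lets lone
    # surrogates through instead of raising an error.
    raw = s.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def code_units_to_string(units: list) -> str:
    # The reverse of string_to_code_units.
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le", "surrogatepass")


s = "a\U0001F600"
print(string_to_code_points(s))  # two items
print(string_to_code_units(s))   # three items: the emoji is a surrogate pair
```

The two views of the same string have different lengths, which is why a single "string as list" conversion would be ambiguous.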

andreubotella · 2020-09-14T09:34:30Z

I'm looking at the various usages of the Encoding hooks across several standards, and they seem to be called almost every time with a byte sequence (respectively, with a string), with the return value being used as a string (resp. byte sequence). Note that this already relied on implicit conversions before whatwg/encoding#215.

I suppose it might be fine to make a conversion implicit if it's on an algorithm boundary with well-defined types. For example, if "decode" is called with a byte sequence, it's clear that it has to be converted into an I/O queue of bytes. Likewise, if inside the steps for "decode", a string was returned, it'd be clear that it'd have to be converted into an I/O queue of scalar values inside the decode operation. But from outside the decode algorithm, the output type of the conversion is not necessarily clear, and since the range of possible types might be open-ended, the conversion would have to be explicit:

Let string be the result of UTF-8 decoding byteSeq, converted to a string.
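A rough Python sketch of that split, assuming a simplified I/O queue: the input is converted implicitly at the algorithm boundary, while the caller converts the output explicitly. `IOQueue`, `to_io_queue`, `io_queue_to_string` and `utf8_decode` are illustrative names, and Python's built-in UTF-8 codec stands in for the real decoder (it does not handle the BOM the way Encoding does).

```python
from collections import deque

END_OF_QUEUE = object()  # sentinel standing in for end-of-queue


class IOQueue:
    """Simplified I/O queue: a list of items terminated by end-of-queue."""

    def __init__(self, items=()):
        self._items = deque(items)
        self._items.append(END_OF_QUEUE)

    def read(self):
        # end-of-queue is returned but never removed.
        if self._items[0] is END_OF_QUEUE:
            return END_OF_QUEUE
        return self._items.popleft()


def to_io_queue(value):
    # Conversion applied on entry: the expected type is known here.
    return value if isinstance(value, IOQueue) else IOQueue(value)


def io_queue_to_string(queue):
    # Explicit conversion the caller requests on the way out.
    chars = []
    while (item := queue.read()) is not END_OF_QUEUE:
        chars.append(item)
    return "".join(chars)


def utf8_decode(input_value):
    queue = to_io_queue(input_value)  # implicit conversion at the boundary
    data = bytearray()
    while (byte := queue.read()) is not END_OF_QUEUE:
        data.append(byte)
    # Stand-in decoder; returns an I/O queue of characters.
    return IOQueue(data.decode("utf-8", errors="replace"))


# "Let string be the result of UTF-8 decoding byteSeq, converted to a string."
string = io_queue_to_string(utf8_decode(b"h\xc3\xa9"))
```

The caller never mentions an I/O queue on the input side, but has to name the output type it wants.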

annevk · 2020-09-14T09:39:33Z

That, or we define an I/O queue of scalar values that contains end-of-queue as being interchangeable with a scalar value string. That might also address the for each problem although I guess you'd not want end-of-queue to show up there... Or we define a string-returning version of the frequently invoked decoding algorithms.

andreubotella · 2020-09-15T09:36:26Z

I think we might want to define that types which are a wrapper over some other type should by default have conversions to/from that wrapped type, but we might want to define additional conversions and/or override the default ones.

For example, let's say that string was defined as a list of code units (which it probably should be). Then there'd be a conversion string → list of code units and a conversion list of code units → string by default. But we could additionally define a conversion string ↔ list of code points, and we could in turn use that conversion to define code point length, scalar value string, collect a sequence of code points...
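As a sketch of this idea in Python, assuming a hypothetical `InfraString` wrapper over a list of code units: the default conversions simply expose the wrapped list, and the code point conversion is an additional one defined on top by pairing surrogates. None of these names come from Infra.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfraString:
    """Hypothetical string type wrapping a sequence of UTF-16 code units."""
    code_units: tuple

    # Default conversions: to and from the wrapped type.
    @classmethod
    def from_code_units(cls, units):
        return cls(tuple(units))

    def to_code_units(self):
        return list(self.code_units)

    # Additional conversion: string <-> list of code points.
    @classmethod
    def from_code_points(cls, points):
        units = []
        for point in points:
            if point >= 0x10000:
                # Split a supplementary code point into a surrogate pair.
                offset = point - 0x10000
                units.append(0xD800 + (offset >> 10))
                units.append(0xDC00 + (offset & 0x3FF))
            else:
                units.append(point)
        return cls(tuple(units))

    def to_code_points(self):
        units, points, i = self.code_units, [], 0
        while i < len(units):
            unit = units[i]
            is_lead = 0xD800 <= unit <= 0xDBFF
            if is_lead and i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                # Combine a lead and trail surrogate into one code point.
                points.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
                i += 2
            else:
                # Lone surrogates pass through as their own code point.
                points.append(unit)
                i += 1
        return points

    # Built on the additional conversion, as suggested above.
    def code_point_length(self):
        return len(self.to_code_points())
```

With this shape, `code_point_length` needs no logic of its own; it is just the size of the converted list.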

Now, for some types which add additional semantics to their wrapped types, such as set, we could define an explicit algorithmic conversion list → set which maintains the invariants. And we could use that same thing to handle end-of-queue on I/O queues.
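A minimal Python sketch of such invariant-preserving conversions, using plain lists as stand-ins for Infra's ordered set and for an I/O queue. The function names and the `END_OF_QUEUE` sentinel are assumptions made for illustration.

```python
END_OF_QUEUE = object()  # sentinel standing in for end-of-queue


def list_to_ordered_set(items):
    # Invariant: no item appears twice; first occurrences keep their order.
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def list_to_io_queue(items):
    # Invariant: exactly one end-of-queue, and it is the last item.
    queue = [item for item in items if item is not END_OF_QUEUE]
    queue.append(END_OF_QUEUE)
    return queue


def io_queue_to_list(queue):
    # The reverse conversion drops end-of-queue so it never shows up
    # when iterating over the result.
    return [item for item in queue if item is not END_OF_QUEUE]
```

Because the conversion is an algorithm rather than a reinterpretation, the wrapped list can never be observed in a state that breaks the wrapper's invariants.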

domenic · 2021-04-15T17:49:19Z

It's pretty weird that you cannot (or can no longer?) apply UTF-8 decode to a byte sequence, but instead have to apply UTF-8 decode to the result of converting the byte sequence into an I/O queue.

annevk · 2021-04-16T05:18:53Z

Not sure if this came up in the context of writing new specification text, but I think we should continue to write text as if that is possible and eventually fix the plumbing.

This was referenced Oct 2, 2020

More JSON algorithms #338

Merged

Conversion to scalar value string #345

Open

andreubotella mentioned this issue Oct 26, 2020

Questionable conversion in BOM Sniff whatwg/encoding#243

Closed

andreubotella mentioned this issue Nov 3, 2020

Editorial: revamp the way we deal with code points and bytes whatwg/encoding#247

Draft

andreubotella pushed a commit to andreubotella/multipart-form-data that referenced this issue Apr 22, 2021

Remove explicit conversion steps, as per whatwg/infra#319 (comment)

cb2f92d
