Ideas for an audio transport sequence #1

Open
jerch opened this issue Oct 19, 2021 · 25 comments

Comments

@jerch
Owner

jerch commented Oct 19, 2021

Coming from mintty/mintty#1122.

@j4james

j4james commented Oct 19, 2021

Doesn't that mean we would have to get things ISO certified?

To answer your question from the Mintty thread, the format I'm talking about has already been registered - that's why I was suggesting it. I just posted some links to the docs that I found.

@jerch
Owner Author

jerch commented Oct 19, 2021

Thx for the update - yeah we posted somewhat interleaved. Will have a look at that.

@jerch
Owner Author

jerch commented Oct 25, 2021

@j4james Small update on things - the VT525 is still powering up 😸. So far no capacitor has exploded (fingers crossed, I don't want to get into soldering). But I need to get a USB-to-serial plug/cable first to really work with it.

@jerch
Owner Author

jerch commented Oct 27, 2021

Another interim update:

Bought an FTDI USB-serial adapter that's supposed to work with a vt420, according to several resources on the internet. It should work with another MMJ adapter at comm2 or comm3, but I cannot find that adapter anywhere atm (H8571-J, its wiring is described here).
Since the vt525 also has a DB25 male + female port for comm1, I tried to connect with a 9-25 adapter, but no success yet. The wiring scheme indicates that I might have to use a breakout box to get DTR/DSR crossed with CTS/RTS for proper handshake and flow control. Another site says that full null modem crossing (crossing Rx/Tx, DTR/DSR, CTS/RTS) is enough to get the data through, but hardware flow control won't work. The reason seems to be that CTS/RTS was never intended for flow control but got misused for that by modems later on, while DEC did not do that and relied on its DTR/DSR flow control. Overall, information about these things is very thin.

My best hope currently is to get the MMJ way working, prolly soldering the adapter myself. (The equipment is again buried at home somewhere, will not get there again before mid November.)
The breakout box way would be nice to get the DB-25 ports working as well, but carries the risk to damage either end, if I get the schematics wrong just once.

On the sequence side of things I can test at least the local echo mode of the vt525 with very weird results. DECPS plays once a short beep at different frequencies following Pnote, but duration and multiple Pnote do not work. Multiple DECPS sequences also do not work, the device plays DECPS exactly once, then I have to reboot the device (RIS/DECSTR not tested yet). But that might be a side effect of local echo mode not integrating DECPS logic properly by short-cutting the buffers and blocking semantics. (This is only useful to investigate if someone wants to emulate the local echo to full degree, def. not on my list currently.)

@j4james

j4james commented Oct 28, 2021

Thanks for the update. I'm afraid hardware was never my strong point, and it's been years since I've messed with cabling, so I won't be of much help to you. But @hackerb9 has made a bunch of notes on the cabling he is using for his VT340 which might be useful (see here). And if you get stuck, he may be able to advise you.

The breakout box way would be nice to get the DB-25 ports working as well, but carries the risk to damage either end, if I get the schematics wrong just once.

Yeah, I wouldn't want to risk that if I were you.

On the sequence side of things I can test at least the local echo mode of the vt525 with very weird results. DECPS plays once a short beep at different frequencies following Pnote, but duration and multiple Pnote do not work. Multiple DECPS sequences also do not work,

That's a bit disappointing. Testing with local echo was going to be my backup plan if I managed to get hold of a VT525 and couldn't get the serial connection working. It would be really annoying if that's as good as it gets, but hopefully you'll be able to get a proper connection set up eventually. Take your time though - I don't want you to blow anything up! 😉

This is only useful to investigate if someone wants to emulate the local echo to full degree, def. not on my list currently.

Yeah, I'm not particularly concerned about that either.

@jerch
Owner Author

jerch commented Oct 28, 2021

But @hackerb9 has made a bunch of notes on the cabling he is using for his VT340 which might be useful (see here).

Thx, this looks like a helpful resource, esp. regarding the flow control options. I checked the setup screens again and the vt525 should support all flow control options - XON/XOFF (software) and DTR/DSR (vt400+ line) and even RTS/CTS ("modern" serial devices). Since it already supports RTS/CTS, I hope to get away with null modem crossing at the DB-25 port, so soldering the MMJ adapter is now my second best option.

Update on sequence handling (local echo):

  • RIS resets stalled beep output, DECSTR does not. RIS is somewhat strange, it does a different init than the normal bootup - with a white splash screen, then a keyboard error, but things work after pressing a key. The normal bootup does not show the splash screen, just its "SELFTEST OK" until you hit a key.

After resetting to factory defaults it now sometimes allows me to input multiple DECPS, but I have not yet figured out under which circumstances. For some reason my best success was with 32 as duration, while all other values there worked only once (again I hear no duration difference on the beep). But even the 32 did not work reliably, it would also block beeps randomly. Imho the local echo mode is not helpful to get DECPS explained.

@hackerb9

hackerb9 commented Oct 29, 2021

My best hope currently is to get the MMJ way working, prolly soldering the adapter myself. (The equipment is again buried at home somewhere, will not get there again before mid November.)

I do highly recommend PacificCable.com's MMJ adapters. No soldering necessary; just push the pins into the socket where you want them. Pacific Cable's A9F12D MMJ to DB9 adapter

The wiring I came up with definitely functions for data and doesn't blow anything up. I'd be curious to know if it works as I intended on a terminal that can actually perform hardware flow control.

The breakout box way would be nice to get the DB-25 ports working as well, but carries the risk to damage either end, if I get the schematics wrong just once.

Before doing anything risky, does it work when you disable hardware flow control? For example,

stty -F /dev/ttyUSB0  9600  -crtscts  clocal 
echo Hello > /dev/ttyUSB0
cat /dev/ttyUSB0

@jerch
Owner Author

jerch commented Oct 29, 2021

@hackerb9 Can test your snippet once I got the null modem cable in place.

The vt525 supports one more setting from the options - some sort of modem emulation. But I have not figured out yet how that's supposed to work. My initial guess was that it would make the device operate in modem mode, so I could skip the null modem crossing. But I was not able to get it working with that, prolly due to handshake issues (I have no oscilloscope to check 😞, also that modem mode is not described in detail anywhere).

@hackerb9

The vt525 supports one more setting from the options - some sort of modem emulation. But I have not figured out yet how that's supposed to work. My initial guess was that it would make the device operate in modem mode, so I could skip the null modem crossing. But I was not able to get it working with that, prolly due to handshake issues (I have no oscilloscope to check 😞, also that modem mode is not described in detail anywhere).

If you've got GNU/Linux, just run statserial /dev/ttyUSB0 to see the state of RTS/CTS and DSR/DTR. No oscilloscope or logic probe needed for that.

However, I'm guessing "modem emulation" has nothing to do with a null modem. The VT340 text programming manual has a detailed section on "Modem Control Modes" in the Communication chapter (pages 275–280). There are three options: disabled, VT220, and V.25-bis. Any of them will work with a properly wired cable and the right settings on the host side, but the easiest thing to do would be to set it to "disabled" so the terminal doesn't require hardware handshaking before allowing communication.

If it is what I think it is, I do not see any advantage to enabling modem control unless you are actually using the terminal with a modem.

@j4james

j4james commented Oct 29, 2021

In case you haven't seen it, chapter 9 of the VT520/VT525 programmers manual (EK-VT520-RM) also has a bunch of technical information on the communications ports.

@jerch
Owner Author

jerch commented Oct 31, 2021

Oops, thx for the pointers in the manuals, yeah I did not see those chapters (was under the impression they would deal with connectivity in those early setup chapters 😊).

Edit: @hackerb9 Yes, you're right, the modem thingy seems to be that modem control to drive it behind a "modem layer", not a null modem itself.

@jerch
Owner Author

jerch commented Nov 11, 2021

Short update on things - after 3 weeks of waiting I got the nullmodem adapter, still waiting on the MMJ adapter @hackerb9 pointed me to above (prolly stuck in a container somewhere around the world).

With the nullmodem adapter the connection works under these conditions:

  • connected to upper comm1 port
  • modem control deactivated
  • only software flow control (XON/XOFF) is working, no RTS/CTS or DTR/DSR (Was hoping for proper RTS/CTS, but this might be a limitation of the nullmodem adapter not crossing these pins.)
  • serial settings: 8bit word, parity none, 1 stop bit
  • tested speeds - 9600, 19.2K, 38.4K, 115.2K
  • speed setting must exactly match on both ends (or the terminal will output rubbish and/or completely stall)
  • getting a proper login shell by running sudo /sbin/agetty 115200 ttyUSB0 vt525 from ubuntu 18

Still have to mess with all connection details, and prolly will do some writeup of it.

Some first impressions on DECPS:

  • sequence supports only one note, multiple notes given will skip first ones and only play last one (needs more checks)
  • screen output blocks at every single sequence for the given duration
  • volume 0 is off, 1-3 is same level (somewhat mumbling, like triangle/square mixture, but not as sharp as plain sawtooth), 4-7 is louder, but again same level (sounds like some additive sine octave on top of 1-3 sound)

For DECPS I will try to record sound + screen update clips from test cases, so everyone can check for timings, frequencies or wave forms.

@j4james I was able to fix your "happy birthday" snippet by extending the multiple note sequences to single ones. Seems thats the only supported sequence form.

Edit:
Oh and I need to prepare my shell/locale env first, output is full of broken unicode/utf8 chars and spurious OSC title commands. @hackerb9 Maybe you can share some aspects of your env settings for your VT340 connection?
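
As a rough illustration of the "extend multiple note sequences to single ones" fix mentioned above, here is a minimal Python sketch. It assumes the DECPS layout `CSI Pvolume ; Pduration ; Pnote ,~` as used in the `printf` test elsewhere in this thread; the helper names `decps` and `split_notes` are made up for this example and are not taken from any of the snippets discussed here.

```python
CSI = "\x1b["  # 7-bit Control Sequence Introducer


def decps(volume: int, duration: int, *notes: int) -> str:
    """Build one DECPS sequence: CSI Pvolume ; Pduration ; Pnote ... , ~"""
    params = ";".join(str(p) for p in (volume, duration, *notes))
    return f"{CSI}{params},~"


def split_notes(volume: int, duration: int, notes: list[int]) -> list[str]:
    """Expand a multi-note request into one single-note DECPS per note.

    Assumption from the observations above: the VT525 only plays the
    last note of a multi-note sequence, so each note gets its own sequence.
    """
    return [decps(volume, duration, note) for note in notes]


if __name__ == "__main__":
    # Prints the escaped form of three single-note sequences.
    for seq in split_notes(3, 8, [1, 5, 8]):
        print(repr(seq))
```

On a terminal that does support multiple notes per sequence, `decps(3, 8, 1, 5, 8)` would produce the combined form instead.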

@j4james

j4james commented Nov 11, 2021

sequence supports only one note, multiple notes given will skip first ones and only play last one

I suppose this is understandable, considering they only mentioned the multiple note support in the setup, and not in any of the other places that the DECPS sequence was documented.

screen output blocks at every single sequence for the given duration

But this doesn't make any sense then. Why would they talk about the sound buffer storing 16 notes, if it blocks after 1? I could understand blocking after 1 DECPS control containing 16 notes, or 16 single-note DECPS controls, but I can't see how this relates to the documentation at all. I suppose at least it might be easier to implement.

And when you say it blocks at every single sequence, does that mean you won't see any output following the DECPS until the note has finished playing? For example in a test case like this:

printf "Hello \e[4;32;1,~ World\n"

Do you see Hello while the note is playing, and World only once it has finished?

Another thought that occurred to me - is it possible it's sending an XOFF which causes the client side to pause its output, but if you ignored the flow control you could potentially send more output while the note was still playing?

I'm probably grasping at straws, but I have to admit I find this behavior a bit disappointing. Not being able to play anything in the background at all kind of limits the usefulness of this sequence.

@jerch
Owner Author

jerch commented Nov 11, 2021

But this doesn't make any sense then. Why would they talk about the sound buffer storing 16 notes, if it blocks after 1? I could understand blocking after 1 DECPS control containing 16 notes, or 16 single-note DECPS controls, but I can't see how this relates to the documentation at all. I suppose at least it might be easier to implement.

Yes, this does not seem to be in line with the docs, or we read something completely different into them. It is also possible that different flow control modes have different side effects here (note I only got XON/XOFF working atm).

Your example does this:

  • print Hello + SP
  • start note
  • wait duration (prolly 1s)
  • end note
  • print SP + World + CRLF

Another thought that occurred to me - is it possible it's sending an XOFF which causes the client side to pause its output, but if you ignored the flow control you could potentially send more output while the note was still playing?

In general I would not be surprised if software flow control behaves much different than hardware flow control.

I'm probably grasping at straws, but I have to admit I find this behavior a bit disappointing. Not being able to play anything in the background at all kind of limits the usefulness of this sequence.

I hear you. It still would be possible to do that, but would be very limited in "background play" - you basically need to split screen updates in very small chunks perfectly timed at the note changes. Thats really nasty, and also does not allow bigger updates, as big transmissions would create perceivable sound delays. Still hoping that changing the flow control settings will reveal the audio buffer behavior as described in the docs.
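
To make the "small chunks timed at the note changes" idea a bit more concrete, here is a hedged Python sketch. It only builds the output stream; it does not handle timing or flow control. The function name `interleave` and the default volume/duration values are assumptions for illustration, and the DECPS layout is taken from the `printf` test case above.

```python
from itertools import zip_longest


def interleave(notes, chunks, volume=4, duration=8):
    """Alternate small screen-update chunks with single-note DECPS sequences.

    Each chunk is emitted just before the next note, so the terminal draws
    the chunk and then (per the blocking behaviour observed on the VT525)
    stalls for the note's duration before the next chunk is shown.
    """
    out = []
    for note, chunk in zip_longest(notes, chunks):
        if chunk is not None:
            out.append(chunk)
        if note is not None:
            out.append(f"\x1b[{volume};{duration};{note},~")
    return "".join(out)


if __name__ == "__main__":
    stream = interleave([1, 2], ["Hello ", "World", "\r\n"])
    print(repr(stream))
```

The larger a chunk gets, the longer the gap between two notes becomes, which is the limitation described above.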

Will do more tests once I got the connection / env details sorted out.

@jerch
Owner Author

jerch commented Nov 14, 2021

@j4james Took a closer look on your ECMA35 based suggestion for audio data transport and the corresponding audio ISO spec.

Several notes from my side (without final judgement yet, whether that's good or bad):
The encoding knows several translation modes, from no translation (mode 0) to full 7bit shifts. Their purpose is not further motivated, my guess is that they avoid misinterpretation of certain bytes during transmission by devices with "fixed control chars". Full 8bit transmission was a serious issue for a long time, because devices in the line were not capable of full 8 bit, therefore most protocols have these 7bit escape hatches (even sixel stays strictly in 7bit, prolly for the same reason). I think this is not that much of an issue anymore with all the "virtualized transports" these days, at least ssh + the TCP stack should handle that transparently as payload bytes. I am not so sure about the PTY/TTY, which still can be set to 7bit, furthermore it knows an IUTF8 mode, which might lead to wrong data interpretation (needs investigation).
Regarding the translation modes themselves - not sure if we could go with mode 0 (no translation at all), we at least need some symbol to escape the data stream. Furthermore I am not sure how "no translation at all" is itself ECMA35 compatible - I always thought any charset needs to declare at least ESC in C0 at 01/11.

Overall the situation regarding possible failing devices/software with that ECMA35 encoding might not be that bad. So I would not exclude the idea right from the beginning. Whether we can gain bandwidth with it (compared to a fully inlined base64 sequence), mainly depends on the translation mode we need to go with (note that the more complex translations replace certain bytes with 2 7bit bytes, which leads to 1.5x bandwidth for full 7bit translation, where base64 is only at 1.33x).

Correction:
The translation scheme for 7bit shifts is quite data intensive, and more at ~2x bandwidth.


Looking through the modes in more detail, imho these 3 are worth a closer look:

  • mode 0
    • Pro - best bandwidth utilization.
    • Con - Needs some sort of unlikely "sentinel" escape sequence to indicate end of stream. Deactivates all control functions incl. ESC, thus imposes major rework on TE side (needs serious parser rework).
  • mode 1
    • Pro - good bandwidth utilization with just one replacement (US). US could be used as the escape stream indicator.
    • Con - Still deactivates most control functions incl. ESC.
  • mode 5
    • Pro - leaves most C0 as control functions, easy to go with for TEs. Bandwidth still ok with ~1.125x.
    • Con - Replaces most C0 bytes with DLE + shift variant of the C0 char, thus needs more processing on the data, including length shifts at ~ every 8th byte (copying intensive).

Currently I'd favor mode 5, as I think that this is the most straight forward one to implement on TE parser side, while not penalizing bandwidth much. The additional translation needs are fully within the data, thus TEs not implementing this are not affected at all (only needs DOCS start/end recognition).
Mode 0 & 1 are way too invasive for TE parsers and would need serious rework for the C0 unloading (without checking any implementation, but I assume that proper DOCS support with C0 unloading caps is close to zero across all TEs).

If it turns out we have to go with 7 bit only for some reason, I would ditch the ECMA35 approach in favor of a DCS/OSC sequence with base64 payload.
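
The bandwidth figures above (~1.125x for mode 5, 1.33x for base64) can be checked with a small Python sketch. Note the assumptions: `dle_escape` is a stand-in that escapes all 32 C0 bytes as DLE plus a shifted byte, which is only a rough model and not the actual translation table from the audio spec, and the sample data is uniformly distributed over all byte values.

```python
import base64

DLE = 0x10  # Data Link Escape, used here as the escape prefix


def dle_escape(data: bytes) -> bytes:
    """Hypothetical mode-5-like translation: every C0 byte (0x00-0x1F)
    becomes DLE followed by the byte shifted up by 0x40."""
    out = bytearray()
    for b in data:
        if b < 0x20:
            out += bytes((DLE, b + 0x40))
        else:
            out.append(b)
    return bytes(out)


def expansion(encoded: bytes, raw: bytes) -> float:
    """Encoded size relative to the raw size."""
    return len(encoded) / len(raw)


# Every byte value appears equally often, 768 bytes in total.
sample = bytes(range(256)) * 3

dle_ratio = expansion(dle_escape(sample), sample)        # 32 of 256 values doubled
b64_ratio = expansion(base64.b64encode(sample), sample)  # 4 output bytes per 3 input

if __name__ == "__main__":
    print(f"DLE escaping: {dle_ratio:.3f}x, base64: {b64_ratio:.3f}x")
```

Real audio data is not uniformly distributed, so the escaping overhead could end up above or below this estimate, while the base64 overhead is fixed.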

@j4james

j4james commented Nov 14, 2021

Regarding the translation modes itself - not sure if we could go with mode 0 (no translation at all), we at least need some symbol to escape the data stream.

I haven't looked at this stuff very closely, but my impression was that the data was self terminating, i.e. the packet headers have length indicators, and there is some concept of a "last block" which I figured would let you know once you'd reached the end of the stream.

Mode 0 & 1 are way too invasive for TE parsers and would need serious rework for the C0 unloading (without checking any implementation, but I assume that proper DOCS support with C0 unloading caps is close to zero across all TEs).

Hard to say until I've actually tried it, but I wouldn't have thought the C0 handling was the problem. Once the mode is activated I'd just redirect all the data to a separate parser until terminated (I imagine something like Tektronix mode would work the same way).

The tricky part for us is the UTF-8 parsing which is handled at a higher level. The way we currently deal with that (when switching to an 8-bit ISO-2022 encoding) is by resetting the code page to ISO-1252, essentially passing the data through as-is. However, that does require an app to flush the output immediately after switching modes to ensure that you don't have data with different encodings arriving in the same packet.

If that became a problem, it's probably fixable, but hasn't been a priority for us so far. I believe XTerm may have a similar limitation.

If it turns out we have to go with 7 bit only for some reason, I would ditch the ECMA35 approach in favor of a DCS/OSC sequence with base64 payload.

This wouldn't be my first choice, but it's up to you. If you do go this route, though, I'd recommend sixel over base64. It's maybe not that big a deal, but I think base64 introduces an unnecessary level of complexity when you aren't actually constrained by the limitations of 7-bit email.

@j4james

j4james commented Nov 14, 2021

Back on the previous topic of DECPS being able to play in the background, you said:

It still would be possible to do that, but would be very limited in "background play" - you basically need to split screen updates in very small chunks perfectly timed at the note changes.

But I don't see how that would work. Based on the test case you ran for me, my understanding is that nothing else is going to happen while a note is playing. So it doesn't matter how small your screen updates are, everything is going to come to a standstill as soon as you play a note.

If you could play at least one note in the background, while still continuing to update the screen, that would be fine, but that doesn't seem to be the case. Or have I misunderstood what's happening?

sequence supports only one note, multiple notes given will skip first ones and only play last one

Thinking about this some more, I'm inclined to leave in support for multiple notes even if the VT525 didn't actually support that. It is at least part of the official documentation, it's not likely to break backwards compatibility with apps that are limiting themselves to one note at a time, and there are already a number of modern terminals supporting multiple notes.

@jerch
Owner Author

jerch commented Nov 14, 2021

I haven't looked at this stuff very closely, but my impression was that the data was self terminating, i.e. the packet headers have length indicators, and there is some concept of a "last block" which I figured would let you know once you'd reached the end of the stream.

Yes it is defined that way, but self termination does not work with TEs, that dont implement the details? They would not "understand" the termination from the data within?

If that became a problem, it's probably fixable, but hasn't been a priority for us so far. I believe XTerm may have a similar limitation.

Yeah the "transport layer-like" utf-8 handling will certainly cause issues. Isnt xterm fully utf-8 only internally since several years (would always need that luit shim/switch)?
My hope with keeping important control codes intact (well, at least ESC) is basically a higher adoption rate by TEs, because they can keep their main parsers in place, just add DOCS start support with data rerouting (or skipping, if not doing audio), and fall back to normal handling on DOCS end. Most have no Tektronix implemented, thus prolly no higher level incoming data switch/preparser in place.

This wouldn't be my first choice, but it's up to you. If you do go this route, though, I'd recommend sixel over base64. It's maybe not that big a deal, but I think base64 introduces an unnecessary level of complexity when you aren't actually constrained by the limitations of 7-bit email.

Ah well, the alphabet to use is quite the least concern for me there. I think sixel's alphabet would be easier to translate due to its continuous character space?

About DECPS:

If you could play at least one note in the background, while still continuing to update the screen, that would be fine, but that doesn't seem to be the case. Or have I misunderstood what's happening?

Nope, thats exactly what happens. What I tried to illustrate was - you can still get notes coupled to screen updates through, if you cleverly split them. But yes, both sound and screen will have pauses from the other taking its time. No real background playing possible. At least with XON/XOFF flow control (others yet to be tested).

Thinking about this some more, I'm inclined to leave in support for multiple notes even if the VT525 didn't actually support that. It is at least part of the official documentation, it's not likely to break backwards compatibility with apps that are limiting themselves to one note at a time, and there are already a number of modern terminals supporting multiple notes.

Yes, was actually thinking the same. In the end it depends on how closely you want to emulate a certain device.

@j4james

j4james commented Nov 15, 2021

Yes it is defined that way, but self termination does not work with TEs, that dont implement the details? They would not "understand" the termination from the data within?

No, but regardless of the format, I wouldn't want to send a large chunk of audio data to a TE without first confirming that it could actually handle that. Even with a DCS or OSC sequence, there's no guarantee that the terminal won't choke on it and start spewing garbage onto the screen.

With DOCS you can always output a short test sequence first and check for cursor movement to determine if it's supported or not.
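
A minimal sketch of that cursor-movement check in Python: send a cursor position query (`CSI 6 n`), then the short test sequence, then a second query, and compare the two `CSI row ; col R` replies. The terminal I/O itself (raw mode, reading the replies) is left out here, and the function names are made up for this example.

```python
import re

CPR_QUERY = "\x1b[6n"  # asks the terminal to report the cursor position
_CPR_REPLY = re.compile(r"\x1b\[(\d+);(\d+)R")


def parse_cpr(reply: str) -> tuple[int, int]:
    """Extract (row, column) from a cursor position report."""
    match = _CPR_REPLY.search(reply)
    if match is None:
        raise ValueError(f"not a cursor position report: {reply!r}")
    return int(match.group(1)), int(match.group(2))


def probe_was_swallowed(before: str, after: str) -> bool:
    """True if the cursor did not move between the two reports, i.e. the
    terminal consumed the test sequence without printing any of it."""
    return parse_cpr(before) == parse_cpr(after)


if __name__ == "__main__":
    # Example replies as they might arrive from a terminal.
    print(probe_was_swallowed("\x1b[5;1R", "\x1b[5;1R"))
    print(probe_was_swallowed("\x1b[5;1R", "\x1b[5;9R"))
```

An unmoved cursor only shows that nothing was printed. It does not prove that the payload was actually interpreted as audio.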

That said, I don't mean to pressure you to go this route if you don't like the idea. It was just something I thought worth investigating before inventing your own thing.

Isnt xterm fully utf-8 only internally since several years (would always need that luit shim/switch)?

No. You can get 8-bit ISO-2022 support in XTerm either using the DOCS standard return sequence, or by starting up with the LANG environment variable appropriately set.

I think sixel's alphabet would be easier to translate due to its continuous character space?

Yeah, that's what I was getting at. It doesn't need a dictionary lookup, and you also don't need to worry about all that = padding nonsense.

In the end it depends on how closely you want to emulate a certain device.

Certainly not that strict by default, otherwise I'd also be disabling all the modern OSC sequences and XTerm modes that people take for granted nowadays. If we ever got around to supporting different terminal types and conformance levels it might be worth considering, but even then I'm not sure I'd bother for something like this.

@jerch
Owner Author

jerch commented Nov 15, 2021

Even with a DCS or OSC sequence, there's no guarantee that the terminal won't choke on it and start spewing garbage onto the screen.

Yeah the TE choking issue will haunt us in every case without prior testing for support. I hate this situation, I'd love to use more DCS for newer stuff (as I see DEC's DCS variant as more capable than OSC), but support is so lousy across the board, geez. Software is, unlike real devices, easy to fix, still the field does not move much. Idc much if the kernel consoles don't make any ground beyond vt100 emulation, but isn't basic sequence type support somewhat mandatory after being specified for >30ys?

That said, I don't mean to pressure you to go this route if you don't like the idea. It was just something I thought worth investigating before inventing your own thing.

Well, to put it bluntly - in my world the ECMA35 encoding juggling is almost dead (mainly due to utf8 forcing everything into a higher level "transport encoding", and unicode being capable of replacing those encodings from its bigger codepoint space). With pulling a binary payload out of the hat we lose that unicode "symmetry". Furthermore sequence payload types (like OSC/DCS/APC) offer enough functionality with proper encapsulation (given the parser implements at least recognition to skip them). So thats where my sentiments against reviving DOCS specs come from. (Not even sure if reviving is the right term here - do you know any applications that used this audio spec?)
On the other hand, the DOCS system promises an almost 1:1 bandwidth utilization, if we are allowed to stay in 8 bit. 8 bit per se works with standard tooling these days, otherwise utf8 would not work. But now that general utf8 assumption places a new hurdle on that path, which we have to check first. So yes, I give that bandwidth advantage (~ 20-33%) enough credit to think about DOCS, mainly because audio transport will be quite data intensive, where even 20% makes quite a difference.

Mind you - when I saw Annex F I had to laugh, it even mentions JPEG. Did only a quick scan over it - it does not mention sixels anywhere, does it?

No. You can get 8-bit ISO-2022 support in XTerm either using the DOCS standard return sequence, or by starting up with the LANG environment variable appropriately set.

Ah ok - thought, it gets internally mapped to unicode chars. (time to check source, just have read this somewhere in the past)

Yeah, that's what I was getting at. It doesn't need a dictionary lookup, and you also don't need to worry about all that = padding nonsense.

In SIMD code thats prolly only 3 instructions (63-126 range check + subtract 63 + a shuffle extraction). While standard base64 is even in SIMD quite expensive.
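
For reference, a scalar Python sketch of what a sixel-style alphabet for binary payload could look like: 6 bits per character, mapped to the contiguous range 63-126 by adding 63, with no padding character. The bit packing order (most significant bits first) is an assumption made for this example and is not something the sixel graphics format defines for arbitrary data.

```python
def encode(data: bytes) -> str:
    """Pack bytes into 6-bit groups and map each group to chr(63 + value)."""
    out = []
    acc = bits = 0
    for b in data:
        acc = (acc << 8) | b
        bits += 8
        while bits >= 6:
            bits -= 6
            out.append(chr(63 + ((acc >> bits) & 0x3F)))
        acc &= (1 << bits) - 1  # keep only the bits not emitted yet
    if bits:
        # Left-align the remaining bits in a final character, no '=' padding.
        out.append(chr(63 + ((acc << (6 - bits)) & 0x3F)))
    return "".join(out)


def decode(text: str) -> bytes:
    """Reverse of encode: range check, subtract 63, reassemble bytes."""
    out = bytearray()
    acc = bits = 0
    for ch in text:
        value = ord(ch) - 63
        if not 0 <= value < 64:
            raise ValueError(f"character outside 63-126 range: {ch!r}")
        acc = (acc << 6) | value
        bits += 6
        if bits >= 8:
            bits -= 8
            out.append((acc >> bits) & 0xFF)
            acc &= (1 << bits) - 1
    return bytes(out)  # leftover bits (fewer than 8) are fill bits and dropped


if __name__ == "__main__":
    payload = b"audio"
    print(encode(payload), decode(encode(payload)) == payload)
```

The per-character work in `decode` is just the range check and the subtraction, which is the part that avoids the table lookup base64 needs.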

@j4james

j4james commented Nov 16, 2021

but isn't basic sequence type support somewhat mandatory after being specified for >30ys?

Well that ITU audio spec is at least 25 years old, but I doubt you'd argue that's a reason for it to be mandatory. TE's have a right to choose what protocols they want to implement, and if they've chosen to emulate a VT100, then it's hardly surprising that they don't support DCS when the VT100 never did either.

But the problem isn't just lack of DCS support. You've also got to worry about terminals that do support DCS, but put arbitrary limits on it. For example, Kitty will stop processing the sequence after a certain length, and it doesn't just ignore anything over that limit - it dumps the rest of the content out to the screen. At least that was the case for the last version I tested.

Bottom line: don't expect a newly invented protocol to just work without first querying the terminal to see if it's actually supported.

Did only a quick scan over it - it does not mention sixels anywhere, does it?

No. This would have been long after the days of sixel images.

@jerch
Owner Author

jerch commented Nov 16, 2021

Well that ITU audio spec is at least 25 years old, but I doubt you'd argue that's a reason for it to be mandatory.

Ofc not. The difference is that ECMA-48 is something like a fundamental standard, existing since the mid 70s, where the sequence types are specified, while the audio spec is totally optional and much younger.

TE's have a right to choose what protocols they want to implement, and if they've chosen to emulate a VT100, then it's hardly surprising that they don't support DCS when the VT100 never did either.

Thats the sad part - ofc they are free to do as they please, but thats one of the reasons why moving the terminal interface forward is almost a futile subject. With that thinking we are basically stalled in the early 80s forever. Also it is not the full truth, since things like unicode or basic color support also made their hacky way into the slowpokes.
Again, software is easy to fix (while keeping almost fully compatible). At least they did not choose to stall on dumb terminals or the VT52, what a relief. 👿

For example, Kitty will stop processing the sequence after a certain length, and it doesn't just ignore anything over that limit - it dumps the rest of the content out to the screen. At least that was the case for the last version I tested.

Then it operates outside of ECMA-48, as far as I can tell. There is no length limit specified anywhere, so not covered by the specs.

@j4james

j4james commented Nov 16, 2021

The difference is that ECMA-48 is something like a fundamental standard, existing since the mid 70s, where the sequence types are specified

I think you're misinterpreting the purpose of ECMA-48. It was never intended to dictate what controls a device should support - it was about standardizing the escape sequences to use for functionality that you may or may not choose to implement. And I don't think there's anything in the standard that requires a device to hide the content of a DCS sequence.

And it's not like a terminal emulator can "emulate" ECMA-48 anyway - that's just not a thing. They're typically going to be emulating one or more hardware terminals, which might possibly have conformed to ECMA-48. But matching the device's actual behavior is the main requirement.

but that's one of the reasons why moving the terminal interface forward is almost a futile subject.

I disagree. There's nothing stopping you using sequences that some terminals can't handle, or even inventing new sequences if you think that's necessary. It just requires that the sender and receiver first reach an agreement on which sequences are actually supported. DEC terminals achieved this with things like DA reports, conformance levels, and mode queries. I don't see why modern terminals couldn't do the same thing.
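As a rough illustration of that kind of negotiation, here is a minimal Python sketch that parses a primary Device Attributes (DA1) reply and checks whether it advertises sixel graphics. It assumes a VT200-or-later style reply of the form `CSI ? level ; ext ; ... c`, where extension code 4 stands for sixel graphics. The names `parse_da1` and `supports_sixel` are made up for this example, and a real application would also have to send the query (`CSI c`) and read the reply with a timeout.

```python
import re

# DA1 reply: ESC [ ? Ps ; Ps ; ... c
_DA1_REPLY = re.compile(r"\x1b\[\?([0-9;]*)c")

SIXEL_EXTENSION = 4  # extension code used by VT200+ replies for sixel graphics


def parse_da1(reply: str):
    """Return (first_parameter, set_of_remaining_parameters), or None if not a DA1 reply."""
    match = _DA1_REPLY.fullmatch(reply)
    if match is None:
        return None
    params = [int(p) for p in match.group(1).split(";") if p]
    if not params:
        return None
    return params[0], set(params[1:])


def supports_sixel(reply: str) -> bool:
    """True if the reply looks like a level-6x report that lists the sixel extension."""
    parsed = parse_da1(reply)
    if parsed is None:
        return False
    level, extensions = parsed
    # Assumption: only replies whose first parameter is 60 or higher use the
    # extension list; older VT100-style replies (e.g. "1;2") mean something else.
    return level >= 60 and SIXEL_EXTENSION in extensions
```

With this, a sender can fall back to plain text when `supports_sixel` returns `False` instead of emitting a sequence the receiver may not understand.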

Then it operates outside of ECMA-48, as far as I can tell. There is no length limit specified anywhere, so not covered by the specs.

I hate this behavior, so I don't want to seem like I'm defending it, but technically there is nothing in ECMA-48 (as far as I've seen) that says a device couldn't do something like that. That said, if a terminal is claiming to emulate a VT220 (which is what Kitty identifies as), then I would expect it to match the VT220's interpretation of DCS, and that has no such limit AFAIK. So my criticism would be that it's a poor VT220 emulation.

@hackerb9

The difference is that ECMA-48 is a fundamental standard that has existed since the mid 70s and specifies the sequence types

I think you're misinterpreting the purpose of ECMA-48. It was never intended to dictate what controls a device should support - it was about standardizing the escape sequences to use for functionality that you may or may not choose to implement. And I don't think there's anything in the standard that requires a device to hide the content of a DCS sequence.

ECMA-48, as I read it, is a huge grab bag of possible options. Each terminal will implement a different subset and no terminal will implement all of it.

As usual, I may be mistaken, but I see one of the benefits of ECMA-48 (and the subsequent ANSI standard) as having created a way for terminals to ignore the subsets they don't handle, including ones that didn't exist at the time they were manufactured. While I think DEC's labeling of sixels as "ANSI-compliant" is mostly marketing, it did have some meaning: an ANSI compatible terminal that doesn't support sixels can silently ignore them.

Of course, it is completely allowed by the spec to not ignore unknown sequences and instead barf characters all over the screen. But, ECMA-48 at least gives terminals a chance to know what might happen. For example:

In a control sequence with parameters, each parameter sub-string corresponds to one parameter and
represents the value of that parameter. The number of parameters is either fixed or variable, depending
on the control function. If the number of parameters is variable, neither the maximum number of values
nor the order in which the corresponding actions are performed are defined by this Standard.

On its face, it is simply saying that ECMA-48 does not define anything about the maximum number of parameters. But a careful programmer would read that as saying there is no limit and that their code should handle arbitrarily long sequences.

I'm disappointed to hear that Kitty failed that test, but yes, its behavior is within the ECMA-48 standard. (On the other hand, a device which explodes upon receiving DCS would also be within spec.)
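To make the "silently ignore" case concrete, here is a small Python sketch of a streaming filter that drops DCS strings of any length without buffering their payload. It is a simplification under stated assumptions, not how any particular terminal does it: it only handles the 7-bit forms (`ESC P` to start, `ESC \` to terminate), it treats only `ESC \` as the terminator, and the class name `DcsSkipper` is made up for this example.

```python
ESC = 0x1B


class DcsSkipper:
    """Pass data through unchanged, except that DCS ... ST strings are dropped."""

    GROUND, ESC_SEEN, IN_DCS, DCS_ESC = range(4)

    def __init__(self):
        self.state = self.GROUND

    def feed(self, chunk: bytes) -> bytes:
        """Process one chunk; parser state is kept between calls."""
        out = bytearray()
        for b in chunk:
            if self.state == self.GROUND:
                if b == ESC:
                    self.state = self.ESC_SEEN
                else:
                    out.append(b)
            elif self.state == self.ESC_SEEN:
                if b == ord("P"):
                    # ESC P introduces a DCS string: start discarding.
                    self.state = self.IN_DCS
                elif b == ESC:
                    # The earlier ESC did not start a DCS; emit it and keep waiting.
                    out.append(ESC)
                else:
                    # Some other escape sequence: pass it through untouched.
                    out += bytes((ESC, b))
                    self.state = self.GROUND
            elif self.state == self.IN_DCS:
                # Payload bytes are discarded one by one, so no length limit applies.
                if b == ESC:
                    self.state = self.DCS_ESC
            else:  # DCS_ESC
                if b == ord("\\"):
                    self.state = self.GROUND  # ESC \ (ST) ends the string
                elif b != ESC:
                    self.state = self.IN_DCS
        return bytes(out)
```

Because the payload is never stored, memory use stays constant regardless of how long the sequence is, and the state survives a sequence being split across reads.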

@jerch
Owner Author

jerch commented May 5, 2022

@j4james

Since you mentioned it here above (and also in some terminal-wg thread) - you are totally right with your idea about using the sixel alphabet for a 6-bit encoding, instead of normal base64. Did some first decoder tests, it is massively faster than any base64 algo out there (ofc values are only for my machine):

  • scalar: 3 GB/s native, 1.8 GB/s in webassembly
  • SIMD: 17 GB/s native, 6 GB/s in webassembly

compared to base64:

  • scalar: 2.1 GB/s native, 1.4 GB/s in webassembly
  • SIMD: 9 GB/s native, 0.7 GB/s in webassembly

The 17 GB/s is my machine's single-channel memory bus limit (yes, oldish laptop here), so it basically runs at memcopy speed for me. That's not possible with any base64 SIMD trickery (this machine can only run algorithms up to AVX2); there is only one known method that comes close to memcopy speed, and it needs an AVX512 machine (https://arxiv.org/abs/1910.05109). But AVX512 is very power hungry and only available on big CPUs.

So I stand corrected and think that all new sequences that need payload encoding should use base64-sixel (well, that's what I call it for now). Base64 is a very poor choice compared to base64-sixel. So thx for bringing this up, and for not following "industry standards" blindly.
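For readers who have not seen the idea, here is a minimal Python sketch of what such a 6-bit encoding over the sixel alphabet could look like. The sixel data characters are the contiguous range `?` (63) to `~` (126), so each 6-bit value maps to a character by adding 63 and back by subtracting 63, with no lookup table for a split alphabet as in standard base64; that is presumably a large part of the speed difference. The bit packing order (3 bytes into 4 characters, most significant bits first, no padding) and the function names are assumptions made for this example, not necessarily what the decoder benchmarked above does.

```python
SIXEL_OFFSET = 63  # '?' is the first character of the sixel data range '?'..'~'


def encode_sixel64(data: bytes) -> str:
    """Encode bytes as 6-bit values mapped onto the characters '?'..'~'."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Pack up to 3 bytes into 24 bits, most significant byte first (assumption).
        n = int.from_bytes(group.ljust(3, b"\0"), "big")
        chars = [chr(((n >> shift) & 0x3F) + SIXEL_OFFSET) for shift in (18, 12, 6, 0)]
        # A partial final group of k bytes needs only k + 1 characters (no padding).
        out.extend(chars[:len(group) + 1])
    return "".join(out)


def decode_sixel64(text: str) -> bytes:
    """Inverse of encode_sixel64; raises ValueError on malformed input."""
    out = bytearray()
    for i in range(0, len(text), 4):
        group = text[i:i + 4]
        if len(group) == 1:
            raise ValueError("truncated input: a lone trailing character carries no full byte")
        n = 0
        for ch in group:
            value = ord(ch) - SIXEL_OFFSET  # the whole per-character decode step
            if not 0 <= value < 64:
                raise ValueError(f"character {ch!r} is outside the range '?'..'~'")
            n = (n << 6) | value
        n <<= 6 * (4 - len(group))  # left-align a partial final group
        out += n.to_bytes(3, "big")[:len(group) - 1]
    return bytes(out)
```

A round trip such as `decode_sixel64(encode_sixel64(b"audio"))` returns the original bytes, and every output character is printable ASCII outside the C0 control range, so the payload can sit inside a DCS string.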
