proper vertical cursor advance #37

Open
jerch opened this issue Sep 17, 2022 · 39 comments
Labels
bug Something isn't working

Comments

@jerch
Owner

jerch commented Sep 17, 2022

echo -e '\x1bPq$-$-$-$-\x1b\\' should move the text cursor downwards, even if no pixels were modified (given sixel scrolling is on).

@hackerb9 If you find some time, can you confirm that an "empty" sixel sequence (no pixel notion) with just GCR/GLFs would still advance the text cursor downwards on a VT340? That's at least how I read the sixel docs, but kinda none of the terminals except xterm do it atm, so I am not sure if I misinterpret things here.

@jerch jerch added the bug Something isn't working label Sep 17, 2022
@j4james

j4james commented Sep 17, 2022

Prediction: The "image" will take up three rows, and the following prompt will be on the fourth row.

You've got four graphic linefeeds, making five sixel rows, and the default aspect ratio is 2:1. So that gives you an image height of 5 x 6 x 2 = 60 pixels. The VT340 has 20 pixels per row, so that's three rows, with the final text cursor position being on the third row. The echo adds another linefeed taking you to the fourth row.
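The arithmetic above can be sketched in a few lines. This is only an illustration of the prediction, not code from any terminal: the function name `cursor_advance` and its defaults (2:1 aspect, 20 pixel text rows) are assumptions taken from the numbers in this comment.

```python
SIXEL_BAND_HEIGHT = 6  # each sixel character covers six pixel rows


def cursor_advance(sixel_data: str, pixel_aspect: int = 2, cell_height: int = 20) -> dict:
    """Estimate the vertical extent of a sixel payload, with or without pixels."""
    bands = sixel_data.count("-") + 1        # every GLF ('-') starts a new band
    height_px = bands * SIXEL_BAND_HEIGHT * pixel_aspect
    rows = -(-height_px // cell_height)      # ceiling division: text rows covered
    return {"bands": bands, "height_px": height_px, "rows": rows}


# The payload from the issue description: four GLFs, no pixel data at all.
print(cursor_advance("$-$-$-$-"))
# {'bands': 5, 'height_px': 60, 'rows': 3}
```

The point is that nothing in this calculation depends on any pixel having been set.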

@jerch
Owner Author

jerch commented Sep 17, 2022

@j4james Yep, that's what I would expect, but seriously - no TE except xterm does it atm, thus I wonder if that's actually true. (Or if all TEs minus xterm fall into the same trap as I did: zero image --> no dimensions --> no advance, lol.)

@jerch jerch mentioned this issue Sep 17, 2022
2 tasks
@j4james

j4james commented Sep 17, 2022

Well it works in Reflection Desktop and in my Windows console sixel fork, but I couldn't find any others. Even XTerm didn't work for me so I'm not sure why it's working for you. I'm fairly sure I've checked out the latest code.

It's possible my interpretation is wrong, but bear in mind that sixel is meant to be output as it's received, so if you've got a stream of sixel pumping out graphic linefeeds, you should see the viewport scrolling, even if no actual pixel data has been received yet. I know there is a certain amount of buffering, but it's highly unlikely the terminal would buffer linefeeds indefinitely while waiting for pixels to arrive. What would be the point?

@jerch
Owner Author

jerch commented Sep 17, 2022

@j4james Oh well, it is broken in recent xterm (just tested, thx for the heads-up), but works in an older version (still kinda broken there, as it advances too much, not the same as a sixel with pixels would have, lol).

I now just advance the cursor vertically by whatever the same sixel with at least one pixel would have advanced it. Really strange that kinda all xterm-alikes get it wrong, hmm.

@jerch
Owner Author

jerch commented Sep 17, 2022

It's possible my interpretation is wrong, but bear in mind that sixel is meant to be output as it's received, so if you've got a stream of sixel pumping out graphic linefeeds, you should see the viewport scrolling, even if no actual pixel data has been received yet. I know there is a certain amount of buffering, but it's highly unlikely the terminal would buffer linefeeds indefinitely while waiting for pixels to arrive. What would be the point?

Since sixel is coming from a printer I think a GLF should be respected in the terminal (resp. GLFs accounting for x text lines). The problem which prolly most TE impls have is the "empty image returned" trap - no dimensions, so no cursor advance either (just a guess). But that certainly looks wrong to me, if one thinks of sixel as a "pixel-printer-on-top, with vertical stop at last active printed sixel line".

@j4james

j4james commented Sep 17, 2022

The problem which prolly most TE impls have is the "empty image returned" trap - no dimensions, so no cursor advance either

Yeah, but what I'm saying is that's just a symptom of a bigger problem, which is that most TEs are treating sixel as an image format, when it's really a drawing mode. When you think of it as a drawing mode, you don't wait for the ST to find out how big the "image" is. The concept of "image size" doesn't have any meaning.

The raster attributes just specify an area of the display that is cleared in advance, but that doesn't limit where you can draw. You've got a kind of "pen" which you can move around, and which will scroll the viewport when moved offscreen. You can plot pixels at the pen's current location, and when the mode ends, the text cursor is simply positioned wherever the pen was.
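A toy model of the "pen" idea might look like the following. It is a sketch only: the class name `SixelPen` is made up, it ignores colors and the repeat introducer, and it assumes that the text cursor ends up on the row holding the top of the current band (which is how the later comments in this thread read the VT340 tests).

```python
class SixelPen:
    """Toy model: the pen moves on every control character, pixels drawn or not."""

    def __init__(self, cell_height: int = 20, pixel_aspect: int = 2):
        self.x = 0                         # pen column, in sixel columns
        self.y = 0                         # top of the current band, in device pixels
        self.cell_height = cell_height
        self.band_px = 6 * pixel_aspect    # device pixels covered by one band

    def feed(self, data: str) -> None:
        """Consume sixel data as it arrives; there is no 'image' to finish."""
        for ch in data:
            if ch == "$":                  # GCR: back to the left edge
                self.x = 0
            elif ch == "-":                # GLF: left edge of the next band
                self.x = 0
                self.y += self.band_px
            elif "?" <= ch <= "~":         # sixel data character: plot, step right
                self.x += 1

    def text_row_on_exit(self) -> int:
        """Zero-based text row containing the top of the current band (assumed rule)."""
        return self.y // self.cell_height


pen = SixelPen()
pen.feed("$-$-$-$-")
print(pen.y, pen.text_row_on_exit())
```

Because `feed` can be called with partial data, the model has no notion of an image size; the cursor position falls out of wherever the pen happens to be when the mode ends.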

@jerch
Owner Author

jerch commented Sep 17, 2022

So in theory it is possible to stay in sixel mode forever, doing all outputs from sixel notation (just like normal PRINT state), and moving things forward with GLFs. I think really none of the OSS TEs can handle that - well my addon certainly cannot, as I cannot create an infinite canvas to draw on. Here the scrollbuffer idea somewhat contradicts the graphics-on-top sixel idea.

While my sixel decoder supports partial output, as it is basically operating at sixel band granularity (e.g. you can watch it filling the bands with colors, pretty much like printer head movements), my back integration in xterm.js cannot do that atm, as the overhead for more fine-grained graphics linking to text cells is really expensive. I also cannot directly draw & render atm, as this is even more expensive in JS (well, this would change with WebGL, but support is still too flaky across browsers).

Edit: I might be able to change into an "endless" mode by separating final bands (not reachable by the text cursor anymore) from intermediate bands, that need more turnovers. Not sure yet where to set the buffering limit, as I cannot do that for every single band, that got altered by one sixel (in fact the very first impl worked that way, and it was slow as hell). Will see what I can do here...

@j4james

j4james commented Sep 17, 2022

So in theory it is possible to stay in sixel mode forever, doing all outputs from sixel notation (just like normal PRINT state), and moving things forward with GLFs.

Yep. A typical use case would be something like the output from a seismograph. And it doesn't necessarily require an infinitely growing canvas, because if you're drawing within margins, the top of the image will be erased as new output is generated on the bottom.

It could always be implemented as a series of image slices on terminals that don't support this concept, but it's a lot simpler just remaining in sixel mode the whole time. Especially back in days of the original hardware terminals where bandwidth would have been more of a constraint.
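As a rough illustration of why an endless stream does not need an endless canvas, here is a bounded buffer that drops the oldest band whenever a new one arrives at the bottom. The class `ScrollingBands` and the idea of storing whole bands as opaque values are assumptions made for this sketch; real terminals would scroll pixel or cell rows instead.

```python
from collections import deque


class ScrollingBands:
    """Keep only the bands that fit between the margins; older ones scroll away."""

    def __init__(self, visible_bands: int):
        self.bands = deque(maxlen=visible_bands)  # oldest entry is dropped automatically
        self.scrolled = 0                         # how many bands have left the top

    def append_band(self, band) -> None:
        if len(self.bands) == self.bands.maxlen:
            self.scrolled += 1                    # the top band is about to be erased
        self.bands.append(band)


# A seismograph-like stream: ten bands into a region that shows four.
region = ScrollingBands(visible_bands=4)
for i in range(10):
    region.append_band(f"band {i}")
print(list(region.bands), region.scrolled)
```

Memory use stays constant no matter how long the sixel sequence runs, which is the property the seismograph use case relies on.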

I think really none of the OSS TEs can handle that

Yeah, the only TEs I know of that can handle this are the commercial ones: Reflection Desktop, and IBM Personal Communications. My Windows fork does too, but that's essentially vaporware so I don't think that counts. 😉

@hackerb9

hackerb9 commented Sep 18, 2022

Prediction: The "image" will take up three rows, and the following prompt will be on the fourth row.

@j4james is, as usual, correct. I added a test program that people can use to test their terminals: https://github.com/hackerb9/vt340test/blob/main/jerch/gcrglf.sh

[EDIT: Original script presumed the terminal's font size is 20 pixels high, but that has been fixed.]

Output should look something like this:

@jerch
Owner Author

jerch commented Sep 18, 2022

@hackerb9 Thx for this awesome test script. I still have an off-by-one line error, but kinda on both sides:

image

I mainly have an issue in my decoder, which tries very hard to find the last pixel touched by sixels, which also cuts off your '$-' at the end of the 2nd sixel (thus cursor stops in last line overprinting lower part). In the sense of images it kinda makes sense to find the real dimensions this way, but it makes absolutely no sense in the graphics layer metaphor. Gonna change that.
My second issue is linked to this - as the addon does not respect custom aspect ratios my left side is also off (always doing 1:1). Manually fixing both issues I get this:

image

which quite nicely mimics vt340's line progression (even resembles the tiny offset at the arrows).

Last but not least, all of these fail the test:

  • xterm v369 (not newest release though)
  • mlterm, 3.8.4 (default from distro)
  • contour (latest release)
  • WezTerm (quite recent version)

Basically all of them just output a single empty line, no matter how many GCRGLFs were provided.

@christianparpart

Guys. I'm kinda surprised by the interest and amazed by the deep content of this discussion here. I'd in fact like Contour to bump up conformance as much as it either is still relevant (debatable) or fun to implement.

The one thing I really liked is to have the graphical line rendered whenever a GLF has been processed. But frankly, I think I have never read that requirement in the specs.

I was reading that the ANSI (text) cursor is synchronized to the position of the sixel cursor when leaving sixel mode. My personal experience is that xterm doesn't do that; it instead also moves the text cursor to the left margin on the NEXT line rather than simply leaving it where the sixel cursor was last painting. I personally don't even like xterm doing that (in other words: hate), but it's just Sixels and I tried hard not to care.

My main problem I am having and that becomes actually obvious by reading through here is the fact that there's so much stuff in between the spec lines that is unwritten but you have to adhere to in order to call yourself conformant.

If we could just create a .md file gathering all that actually important documentation to easily bring existing Sixel TE implementations on par, that would actually be helpful. At least I'd take that as a base to improve Contour's sixel implementation. :)

/me reading....

@j4james

j4james commented Sep 19, 2022

If we could just create a .md file gathering all that actually important documentation to easily bring existing Sixel TE implementations on par, that would actually be helpful.

That was the intention behind the test cases in hackerb9's vt340test repo. If you can run those tests in your terminal and it matches the screenshots that hackerb9 captured from the VT340, then you can be confident that your implementation is doing a reasonable job of emulating a VT340.

For the tests I contributed, I usually tried to include some documentation explaining what aspects of Sixel I was testing, and what behavior was expected under the different edge cases. I don't expect terminals to necessarily match every single aspect exactly - some things just aren't practical - but the information is there if you want to make use of it.

The only catch (at least for my tests) is that you need a minimum level of VT340 compatibility for the tests to be of much use. That includes having support for aspect ratios, and matching the VT340 font size (although the latter requirement can often be worked around by configuring your terminal with a suitable 10x20 font).

@christianparpart

christianparpart commented Sep 19, 2022

The only catch (at least for my tests) is that you need a minimum level of VT340 compatibility for the tests to be of much use. That includes having support for aspect ratios, and matching the VT340 font size (although the latter requirement can often be worked around by configuring your terminal with a suitable 10x20 font).

So to finally understand that... Given a (virtual) 10x20 pixels grid cell, that could be filled by 3 + 1/3 Sixel rows, and 10 sixel columns (assuming 1:1 ratio). Is this correct?

@j4james

j4james commented Sep 19, 2022

Given a (virtual) 10x20 pixels grid cell, that could be filled by 3 + 1/3 Sixel rows, and 10 sixel columns (assuming 1:1 ratio). Is this correct?

That is correct, yes. When doing simple tests I often use a height of 60 pixels (3 text rows of 20 pixels), because that's exactly 10 sixel rows at 1:1, 5 sixel rows at 2:1, and 2 sixel rows at 5:1.
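The numbers in this exchange are easy to check. The helper below is just a worked calculation (the name `sixel_rows` is made up here): a sixel row is 6 pixels tall before the aspect ratio is applied.

```python
from fractions import Fraction


def sixel_rows(height_px: int, pixel_aspect: int = 1) -> Fraction:
    """Number of sixel rows (bands) needed to cover height_px device pixels."""
    return Fraction(height_px, 6 * pixel_aspect)


# One 20 pixel text row at 1:1 holds three and a third sixel rows.
print(sixel_rows(20))            # 10/3

# 60 pixels (3 text rows of 20 pixels) divide evenly at 1:1, 2:1 and 5:1.
for aspect in (1, 2, 5):
    print(aspect, sixel_rows(60, aspect))
```

At 3:1 the same 60 pixels come out as 10/3 sixel rows, which is presumably why that ratio is not in the list of convenient test heights.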

@hackerb9

If we could just create a .md file gathering all that actually important documentation to easily bring existing Sixel TE implementations on par, that would actually be helpful. At least I'd take that as a base to improve Contour's sixel implementation. :)

That's a very good idea. While it's a start, I do not think my vt340test repository does a good job at summarizing what is required of a modern sixel terminal. It is merely an agglomeration of bug reports in various terminals, finding out what went wrong, and adding new test programs.

While documentation is the ideal end state, I think the next step, now that activity on vt340-test has cooled down, is to accumulate all the different tests into a single suite and make sure that all of the tests are correct and informative.

Anybody who'd like to help out with that is welcome to post ideas at hackerb9/vt340test#24

So to finally understand that... Given a (virtual) 10x20 pixels grid cell, that could be filled by 3 + 1/3 Sixel rows, and 10 sixel columns (assuming 1:1 ratio). Is this correct?

Yes and no. On other sixel devices, and even on the VT340 itself, text was not always in 10x20 character cells, so it should never have been a presumption. However, there didn't use to be a way to query geometry, so programs had to make presumptions. A case could be made that terminals should be backwards compatible with those presumptions.

However, 10x20 is not required or even recommended for sixel conformance. Application developers have ways to find out the size of the character cells — no presumptions necessary! My programs fall back to 10x20 only if a terminal is so old that it cannot report its geometry. I would be disappointed in any sixel terminal that lacked the ability to draw at native resolution.

That's why I fixed the test script for GLF so it doesn't rely on having a 10x20 font. Instead, I line up 2:1 sixels versus 1:1 sixels
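The comment does not say which query it has in mind. One option on xterm-compatible terminals is the window operation `CSI 16 t`, which is answered with `CSI 6 ; height ; width t`. The sketch below only parses such a reply and falls back to 10x20 when there is none; the function name `parse_cell_size` is made up, and actually reading the reply from the tty (raw mode, timeout) is left out.

```python
import re

CELL_SIZE_QUERY = "\x1b[16t"   # xterm window op: report character cell size in pixels
_REPLY = re.compile(r"\x1b\[6;(\d+);(\d+)t")


def parse_cell_size(reply: str, fallback=(10, 20)):
    """Return (width, height) of a character cell in pixels.

    Falls back to the VT340's 10x20 cell when the terminal sent no usable reply.
    """
    match = _REPLY.search(reply)
    if match is None:
        return fallback
    height, width = int(match.group(1)), int(match.group(2))  # reply order: height, width
    return (width, height)


print(parse_cell_size("\x1b[6;18;9t"))   # a terminal with 9x18 cells
print(parse_cell_size(""))               # no reply: assume 10x20
```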

@j4james

j4james commented Sep 20, 2022

However, 10x20 is not required or even recommended for sixel conformance.

No, but it is required to emulate the VT340 and the VT240 in 80 column mode. If anyone were seriously attempting to emulate other sixel devices, or bothering to emulate ops like DECCOLM and DECSNLS correctly, then I'd recommend adjusting accordingly.

I would be disappointed in any sixel terminal that lacked the ability to draw at native resolution.

And I would be disappointed in any sixel terminal that lacked the ability to emulate even one of the actual sixel terminals. That makes it unusable as a terminal emulator. Not drawing at native resolution may be something some might dislike, but it's not going to stop anything from working.

Anyway, I didn't mean to start this argument all over again. My point in mentioning 10x20 above was because it's a requirement for all of my test cases. Making them independent of cell size was not an option when I originally wrote them because they were designed to be used as OS-independent text files that you could simply cat (or whatever your OS equivalent was).

I wouldn't object to anyone trying to convert them to be size-independent (although in some cases that may be impossible), but I personally don't see the point. If a TE doesn't care enough to emulate the VT240/VT340 cell size, then I doubt they're likely to care about the kinds of edge cases I was testing.

@jerch
Owner Author

jerch commented Sep 20, 2022

Well for me the key take-aways from this discussion and hacker's test script is not to perfectly resemble vt340 output (as I dont strive for perfect vt340 emulation), but:

  • treating sixel too "image-like" is utterly flawed leading to several issues (no endless mode possible, no progress on naked GLFs etc.)
  • naked GLFs should advance the text cursor in row direction by the same amount a proper sixel sequence with pixel data would have done, as specced in DEC STD 070, but without assuming a pixel-perfect match in y direction; instead apps are supposed to do a vertical re-positioning after sixels - sorry @j4james, I tend to see this as the most normative write-up from DEC we have for sixels, including its vague phrasing (which is often the case for specs to not get trapped in impl details; in that sense vt340's behavior is just an impl detail of the vt340), and listing vt240 as a deviation means exactly this - it deviates from the rules intended by the write-up. (Mind you - looking through that deviation addendum it mostly reads as a list of old devices not following every aspect of the writeup. Idk the circumstances/intentions the sixel chapter was written under/with, but it kinda feels as if DEC tried to get SIXEL more formally specced, thereby explicitly leaving old devices behind as historical deviations.)

@j4james

j4james commented Sep 20, 2022

I tend to see this as the most normative write-up from DEC we have for sixels

I think perhaps you're misunderstanding the purpose of the STD-070, in the same way that many people misinterpret the purpose of ECMA-48. Those aren't instruction manuals for terminal emulator authors - they were mostly targeting hardware developers, trying to maintain a minimal level of compatibility between a range of devices.

But if you're implementing a terminal emulator, what really matters is the behavior of the device you're emulating. STD-070 can be very helpful in figuring that out, but the actual DEC terminals didn't always match the STD-070 documentation exactly, so you need to look at what that actual device is doing if you want to get things right.

If you're not building a terminal emulator, but instead consider yourself to be designing a new kind of terminal that happens to be ANSI/ECMA compliant (or perhaps STD-070 compliant), then you have a lot more leeway to do things however you want. Although I'd argue you really shouldn't then be replying to a DA query with a response suggesting that you're DEC compatible, when that's not your intention.

Anyway, I'm not arguing against people that want to go that route. I've long since accepted that's the choice of most modern Linux terminals. But my intention with the tests I contributed to vt340test was to help people that actually do want to build a VT340-compatible terminal emulator. If that's not of interest to you, that's fine.

looking through that deviation addendum it mostly reads as a list of old devices not following every aspect of the writeup

That's exactly the case for the VT240. It's not like it's implementing some completely different cursor positioning scheme. It just doesn't support sixel scrolling. The VT340 with DECSDM enabled works in exactly the same way.

@hackerb9

Anyway, I didn't mean to start this argument all over again. My point in mentioning 10x20 above was because it's a requirement for all of my test cases. Making them independent of cell size was not an option when I originally wrote them because they were designed to be used as OS-independent text files that you could simply cat (or whatever your OS equivalent was).

It's a point well taken. When I started the VT340 test project, I had no notion about any of this and my tests made the same assumption until I noticed 132 column mode didn't work. I think for unidirectional sixels (files meant to be just catted to the screen), a presumption of 80x24 characters and 800x480 pixels makes sense.

This leads to @christianparpart's question about proper documentation. It would be good to make a set of optional "diffs" from the VT340 manual and STD-070 to guide modern sixel engine developers. Ideally, there would be few and they would be easily testable. @christianparpart: from a developer's perspective, would you prefer these addenda to be in the same style as the DEC documentation? (Formal, verbose, and with stylistic tics like 8-bit controls shown in ECMA's decimal-encoded hexadecimal.) Or would it be sufficient to have short prose with sample test programs? (The latter would be easier for me).

I believe one of those possible differences from the spec should be the idea of fixing the sixel geometry to be identical to the VT340, no matter the screen resolution. (@j4james, would you be willing to write up documentation and a test and post it to hackerb9/vt340test#24?)

  • treating sixel too "image-like" is utterly flawed

True, though it depends how much you can presume. I wouldn't have thought the PostScript language could make the leap from Turing-complete printer language to static image files, yet people have done so.

The VT340 is able to generate device independent hardcopy in Sixel because it presumes the receiving device will understand "Select Size Unit" -- an escape sequence the VT340 itself ignores. If modern terminals eventually implement SSU, then we'll be one step closer to being able to have sixels in a static file and cat them anywhere.

@jerch
Owner Author

jerch commented Sep 21, 2022

@j4james For me the vertical cursor placement on a vt340 is still not quite clear in relation to the last sixel band position, also scrolling is affected by that.

Thus I try to approach the issue more algorithmically:

  • Let's assume for a sec: A sixel contains only 2 pixels in height.
  • Let's further assume: The terminal has a text row height of 3 sixel pixels.

Now this leads to the following sixel printing conditions:

rn - n-th text row in terminal
sn - n-th sixel band

    r1  s1
    r1  s1
    r1  s2
    r2  s2
    r2  s3
    r2  s3

Now let's further assume the sixel printing stops after s1, s2 or s3. Where does the text cursor end up?
For s1 and s3 things are pretty clear, it maps fully into a corresponding text row, thus the text cursor ends up there. And so happens scrolling if needed.

For s2 things are not so clear anymore. From your vt340 test findings I'd assume, that the text cursor would end up in r1 and not in r2? Also about scrolling - if r1 happens to be at the lower scrolling border, I'd assume no scrolling would occur (basically cutting off one pixel row in s2)? Is that correct or is there another twist to respect? Regarding scrolling - does screen end behave exactly the same as early lower scroll margin? Or would it still print the 2nd pixel row of s2 in the non-scrolling area below? I know we already did several scrolling tests, but cant remember if there is an overflow/overprint edge case from within the last sixel band. (This question could be further reduced to when exactly the text scrolling happens related to sixel printing...)

Edit:
Btw, things are further complicated on TEs, that do fractional cell heights, which I did not account for above. So basically the question can be extended to "Is it the start or the end of first pixel row"? Following the pixel reduction scheme from above I'd assume "where it starts".
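The thought experiment can be written down directly. The sketch below uses the reduced numbers from above (2 pixel bands, 3 pixel text rows) and assumes the "where it starts" rule, i.e. the text cursor goes to the row holding the top pixel line of the last band. Both function names are made up for this example.

```python
def rows_touched(band: int, band_height: int = 2, row_height: int = 3) -> list:
    """1-based text rows that the n-th (1-based) sixel band overlaps."""
    top = (band - 1) * band_height
    bottom = top + band_height - 1
    return list(range(top // row_height + 1, bottom // row_height + 2))


def cursor_row(band: int, band_height: int = 2, row_height: int = 3) -> int:
    """1-based text row of the band's top pixel line (assumed placement rule)."""
    return (band - 1) * band_height // row_height + 1


for n in (1, 2, 3):
    print(f"s{n}: touches rows {rows_touched(n)}, cursor in r{cursor_row(n)}")
```

Only s2 straddles two rows, and under the assumed rule the cursor stays in r1 for it. Whether the VT340 really behaves that way is exactly the open question here.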

@Utkarsh-khambra

Utkarsh-khambra commented Sep 21, 2022

from vt100.net

When sixel display mode is enabled, the sixel active position begins at the upper-left corner of the ANSI text active position. Scrolling occurs when the sixel active position reaches the bottom margin of the graphics page. When sixel mode is exited, the text cursor is set to the current sixel cursor position.
The VT300 sends a sixel next line (-) character following a sixel dump

According to this depending on where in s2 the printing stops we may end up in either r1 or r2.

@jerch
Owner Author

jerch commented Sep 21, 2022

According to this depending on where in s2 the printing stops we may end up in either r1 or r2.

I don't think so - the problem is that sixel bands have a multi-pixel height of their own, so you cannot end up printing only one pixel line, as it always accounts for the full pixel height, except for those edge cases I am trying to find out. Technically the problem is whether the sixel-cursor is implemented with the height of a band, or virtually "implodes" to the top of a band (which the tests suggest).

To not mix up terms I roughly use them as follows:

  • sixel-pixel: smallest addressable coloring entity (one bit in a sixel), 1px in height and width
  • a sixel: byte value containing 6 sixel-pixels in height, one in width (2 sixel-pixels in height in my thought experiment)
  • pixel line: one line of sixel-pixels horizontally, progression to the right from horizontal sixel-cursor advance (happens automatically)
  • sixel band: progression of sixels horizontally (thus consisting of several pixel lines)
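To make the terms concrete: a sixel data character is in the range `?` (0x3F) to `~` (0x7E), and subtracting 63 gives a 6-bit value whose least significant bit is the topmost sixel-pixel. The helper name `sixel_bits` is made up for this sketch.

```python
def sixel_bits(ch: str) -> list:
    """Decode one sixel data character into six sixel-pixels, top to bottom."""
    value = ord(ch) - 63                     # '?' encodes 0, '~' encodes 63
    if not 0 <= value <= 63:
        raise ValueError(f"{ch!r} is not a sixel data character")
    return [(value >> bit) & 1 for bit in range(6)]   # index 0 is the top pixel


print(sixel_bits("?"))   # [0, 0, 0, 0, 0, 0]  nothing set
print(sixel_bits("@"))   # [1, 0, 0, 0, 0, 0]  only the top sixel-pixel
print(sixel_bits("~"))   # [1, 1, 1, 1, 1, 1]  the full column
```

A pixel line is then the same bit index taken across all sixels of a band.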

@j4james

j4james commented Sep 21, 2022

For s2 things are not so clear anymore. From your vt340 test findings I'd assume, that the text cursor would end up in r1 and not in r2?

@jerch That would be my assumption, yes.

Also about scrolling - if r1 happens to be at the lower scrolling border, I'd assume no scrolling would occur (basically cutting off one pixel row in s2)?

No, I don't think so. Scrolling is independent of the text cursor position. If I remember correctly, it typically scrolls as soon as a sixel band would otherwise fall off the bottom of the screen or margin. But there are some edge cases where it scrolls early (see the discussion here: hackerb9/vt340test#11 (comment)).

But the best way to get answers to these questions is to run the tests and see if your implementation passes. For the text cursor position, everything you need to know should be encapsulated in the cursor_position.sh test. For scrolling, look at the scrolling.sh test. If there's something not covered by those tests, then it's almost certainly something I won't know.

I believe one of those possible differences from the spec should be the idea of fixing the sixel geometry to be identical to the VT340, no matter the screen resolution. (@j4james, would you be willing to write up documentation and a test and post it to hackerb9/vt340test#24?)

@hackerb9 To be brutally honest, I think writing more documentation is probably going to be a waste of time. The one thing I've learnt from testing terminal implementations is that most developers either don't read documentation, or are simply not very good at following it. Not that I want to discourage you from writing any, because I've certainly appreciated the information you've shared regarding the VT340, but I'm kind of burnt out on this subject.

@jerch
Owner Author

jerch commented Sep 22, 2022

No, I don't think so. Scrolling is independent of the text cursor position. If I remember correctly, it typically scrolls as soon as a sixel band would otherwise fall off the bottom of the screen or margin. But there are some edge cases where it scrolls early (see the discussion here: hackerb9/vt340test#11 (comment)).

Thx for the pointer - oh well, so sixel scrolling itself does not follow the text cursor back mapping mechs, but has its own logic, now respecting the full sixel band height, and happens before back mapping. Lol, this really messes things further up, as it cannot be handled in a uniform fashion:

  • within screen (no scrolling needed) - final text cursor gets mapped to top of last sixel band position
  • below screen/margin - scroll text lines until last pixel line maps back to margin, then apply back mapping - prolly w'o top-of-last-band mapping being applied (as this would place the text cursor one row higher)
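A rough sketch of the second case, under the assumption that scrolling is driven by the full band height and happens in whole text lines: scroll just enough that the bottom pixel line of the band fits above the margin. The function name `lines_to_scroll` and all parameters are made up here, and the early-scroll edge cases mentioned in hackerb9/vt340test#11 are not modelled.

```python
def lines_to_scroll(band_top_px: int, band_px: int,
                    margin_bottom_px: int, cell_height: int = 20) -> int:
    """Text lines to scroll so that a whole band fits above the bottom margin.

    band_top_px      -- y of the band's first pixel line
    band_px          -- band height in device pixels (6 * aspect ratio)
    margin_bottom_px -- y just below the last usable pixel line
    """
    overflow = band_top_px + band_px - margin_bottom_px
    if overflow <= 0:
        return 0                          # band fits, no scrolling needed
    return -(-overflow // cell_height)    # ceiling division to whole text lines


# 480 pixel tall page, 2:1 bands of 12 pixels:
print(lines_to_scroll(468, 12, 480))   # band ends exactly at the margin
print(lines_to_scroll(480, 12, 480))   # band starts below the margin
```

After scrolling, the band's top moves up by `lines * cell_height` pixels, and the text cursor mapping would be applied to that new position.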

Gonna check if the test scripts cover these edge cases (they are more like definition holes in my current understanding), once I have applied the other pending changes.

@hackerb9 To be brutally honest, I think writing more documentation is probably going to be a waste of time. The one thing I've learnt from testing terminal implementations is that most developers either don't read documentation, or are simply not very good at following it. Not that I want to discourage you from writing any, because I've certainly appreciated the information you've shared regarding the VT340, but I'm kind of burnt out on this subject.

I really would appreciate such an effort. The sixel doc is very scarce with a lot of "reading between the lines", and I think that most developers try to follow things, if they are clearly laid out and can be applied to the inner works of their TEs. The latter ofc is a problem of its own, as treating sixels too image-like shows. Not sure if devs can be convinced to change that (no clue if anyone is really interested in sixel endless mode), as proper graphics-layered mode from sixels while keep working with xterm-style scrollbuffer is quite hard to achieve (gets very resource hungry).
The whole matter gets further complicated by the question of whether to emulate certain device mechanics (e.g. fully vt340 compliant), or whether to do a least denominator impl and get sixels somehow or well enough working (as far as I am aware none of the OSS emulators currently implements aspect/pixel ratios). The latter could be solved by propagating SSU more, which could really level out all those nasty "I resized my TE window - why is the image totally screwed up now?" issues. It would also help to get other image protocols into a uniform coordinate system.

@j4james

j4james commented Sep 22, 2022

The sixel doc is very scarce with a lot of "reading between the lines", and I think that most developers try to follow things

When it comes to edge cases I might agree, and cursor positioning in particular is weird, but the very first thing you'll encounter when reading the sixel documentation is that the sequence takes two parameters, one selecting an aspect ratio, and one controlling transparency, and the vast majority of terminals don't support either. That's just developers choosing to ignore the spec. I don't want to waste days of my life writing up detailed specifications which devs are just going to skim over and go "no, I don't think I want to bother with that".

as far as I am aware none of the OSS emulators currently implements aspect/pixel ratios

Essentially yeah. RLogin at least tries, but not correctly (as far as I can recall, something like 2:1 would make the image half width, instead of double height, so it's technically correct as an aspect ratio, but not what it should be). XTerm looks like it also tries to support aspect ratio - there's code for it - but it's so buggy that it always ends up 1:1. I suspect they may have just given up half way, because I tried putting together a patch to fix the obvious problems, and there was quite a lot of additional rendering work required.

@jerch
Owner Author

jerch commented Sep 22, 2022

but the very first thing you'll encounter when reading the sixel documentation is that the sequence takes two parameters, one selecting an aspect ratio, and one controlling transparency, and the vast majority of terminals don't support either. That's just developers choosing to ignore the spec.

Well I can only speak here for myself: I basically left out the ratio handling for 2 reasons - I wasn't able to find any other TE supporting it back then, thus I had no clear (test) path to implement it. Secondly the docs for it stayed cloudy to me (yes, I really wondered whether 2:1 is exactly this or in reality means 1:0.5, or if there is another inch/dp translation going on on top of it) - and with nothing to test against it is hard to get somewhere. My guess here - everyone else had the same problem, thus we all ended up ignoring it.
About the second param - imho that's a very much underrated param, as it basically allows bandwidth-saving diff updates from sixels. But xterm has perf issues with that transparency mode, so again my guess - no app ever used it because of poor perf, which in turn lowers the expectations for it as a needed feature on the TE side.

Well, I am not trying to excuse things and cannot speak for others, but I still think that most of it is a direct consequence of lousy or hard to grasp docs. (Or of a missing test battery, as good tests can replace lengthy docs.)

@j4james

j4james commented Sep 22, 2022

This is literally page 1 of the VT340 sixel documentation:

The pixel aspect ratio defines the shape of the pixel dots the terminal uses to draw images. For example, a pixel that is twice as high as it is wide has an aspect ratio of 2:1.

Then it goes on to list the supported P1 aspect ratios. Edge cases I can understand, like the rounding rules for aspect ratios defined by the raster attributes, but the basic macro parameter is something everyone should have been able to figure out.
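
For reference, the macro parameter handling and the resulting image height can be sketched in a few lines of Python. The table is my reading of the P1 list in the VT340 manual, and the helper names (`macro_aspect_ratio`, `image_height_px`, `text_rows_covered`) are made up for illustration. The 20 pixel cell height is the VT340 value mentioned earlier in this thread.

```python
import math

# Vertical:horizontal pixel aspect ratio selected by the macro parameter P1,
# as I read the VT340 sixel documentation. An omitted P1 behaves like 0.
P1_ASPECT_RATIO = {
    0: (2, 1), 1: (2, 1),
    2: (5, 1),
    3: (3, 1), 4: (3, 1),
    5: (2, 1), 6: (2, 1),
    7: (1, 1), 8: (1, 1), 9: (1, 1),
}


def macro_aspect_ratio(p1=None):
    """Return (vertical, horizontal) for P1; values outside 0..9 raise KeyError."""
    return P1_ASPECT_RATIO[0 if p1 is None else p1]


def image_height_px(sixel_rows, p1=None):
    """Height in device pixels: every sixel row holds 6 pixels of the given ratio."""
    vertical, _horizontal = macro_aspect_ratio(p1)
    return sixel_rows * 6 * vertical


def text_rows_covered(height_px, cell_height_px=20):
    """Number of text rows touched by an image of the given pixel height."""
    return math.ceil(height_px / cell_height_px)
```

With the sequence from the top of this issue (four graphics linefeeds, so five sixel rows, default 2:1) this gives `image_height_px(5)` = 60 pixels, which covers three text rows.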

Add to that, it's now been more than a year since hackerb9 set up the vt340test repo, and I think we've covered every possible edge case regarding aspect ratios in the tests here:

https://github.com/hackerb9/vt340test/blob/main/j4james/raster_aspect_ratio.sh
https://github.com/hackerb9/vt340test/blob/main/j4james/macro_aspect_ratio.sh

I'm sorry but I just don't buy the argument that there's a lack of information on the subject. For those of you that are genuinely trying, I apologise if I'm coming across as overly critical, but I've seen far too many devs making statements like "who cares about a 30 year old spec". At this point, I've come to assume that almost nobody cares.

@christianparpart

  • I wasn't able to find any other TE supporting it back then,

that might change soon. We at least give it a try (thanks to @Utkarsh-khambra), at least as much as reasonably possible.

Secondly the docs for it stayed cloudy to me

I couldn't agree more. @hackerb9, wrt. documentation efforts, I partly agree with @j4james (sorry for being burnt out). I don't try to get 100%-no-matter-what compat, just as much as is reasonably implementable & fun while doing so. The main (or only) motivation is to have something that is somewhat backwards compatible, and to put the energy into the future protocol instead (just my 2 cents). But I also don't want to let your PR here explode @jerch, sorry for that 🤗

My big motivations are actually being able to look at the actual genuine screenshots and shell scripts for testing from the vt340test repo (thx @ all for that).

@jerch
Owner Author

jerch commented Sep 23, 2022

Wrote a small endless test script here https://github.com/jerch/xterm-addon-image/blob/9a45531788e4624bb6efe842137b6f0aabf92443/fixture/endless.sh, which slowly prints a sine wave forever.

The endless mode isn't supported by any TE I can test here (I only get some output after pressing Ctrl-C). So all my TEs do the image abstraction thingy, not returning anything from sixel mode in between. @hackerb9 could you test this on the vt340? I have mainly 2 questions linked to endless mode:

  • At which frequency would a vt340 update the graphics layer? I'd assume it draws directly into its outgoing graphics buffers, thus only the display update freq would limit it? Or is there some perceivable intermediate buffering, thus it adds multiple points/pixel lines at once?
  • Would this really run forever or will the vt340 break out of sixel mode after some time? This is mainly the question whether there is some kind of upper sixel limit in place, either time or payload based.
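
To make "endless mode" concrete, here is a minimal Python sketch of the idea. It is not the linked `endless.sh`, just an illustration under my own assumptions: `sine_sixels` and `stream` are hypothetical names, the amplitude, step and delay are arbitrary, and no color register is selected. Each fragment returns to the left margin, skips to the wave's column and sets a single pixel, moving to the next band after six pixel lines.

```python
import itertools
import math
import sys
import time


def sine_sixels(amplitude=100, step=0.1):
    """Yield one sixel fragment per pixel line of an endless sine wave."""
    for y in itertools.count():
        x = round(amplitude + amplitude * math.sin(y * step))
        bit = y % 6                        # pixel row inside the current sixel band
        skip = f"!{x}?" if x > 0 else ""   # run of empty sixels up to column x
        newline = "-" if bit == 5 else ""  # band is full: graphics new line
        # "$" is the graphics carriage return, chr(63 + bits) the sixel itself.
        yield "$" + skip + chr(63 + (1 << bit)) + newline


def stream(out=sys.stdout, delay=0.05):
    """Enter sixel mode and keep sending fragments (stop with Ctrl-C)."""
    out.write("\x1bPq")                    # DCS q with default parameters
    for fragment in sine_sixels():
        out.write(fragment)
        out.flush()                        # push every fragment out immediately
        time.sleep(delay)
```

Calling `stream()` starts the output. Since the loop never sends the string terminator, a terminal that only renders at the end of the sequence shows nothing until the sequence is aborted.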

@j4james

j4james commented Sep 23, 2022

Wrote a small endless test script here

Very cool! FWIW, it works for me, and it works in Reflection Desktop. Based on past experience, I would also expect it to work in IBM Personal Communications, but that's a pain to test.

At which frequency would a vt340 update the graphics layer?

I'm curious about this too. From my experiments on the VT240, it used to refresh every 6 bytes or so (it seemed to differ depending on the kind of operation you were performing). Reflection Desktop I think refreshes after every GNL, and I think I'm doing something similar (I don't have easy access to my source at the moment).

Would this really run forever

I would assume so - it's not like it's going to run out of buffer space, because it doesn't have scroll back. And in my tests on the VT240 I didn't see any limit, but I admittedly haven't run it for that long.

@hackerb9

hackerb9 commented Sep 24, 2022

Wrote a small endless test script here https://github.com/jerch/xterm-addon-image/blob/9a45531788e4624bb6efe842137b6f0aabf92443/fixture/endless.sh, which slowly prints a sine wave forever.

Love it!

The endless mode isn't supported by any TE I can test here (I only get some output after pressing Ctrl-C). So all my TEs do the image abstraction thingy, not returning anything from sixel mode in between. @hackerb9 could you test this on the vt340? I have mainly 2 questions linked to endless mode:

Works great, of course.

  • At which frequency would a vt340 update the graphics layer? I'd assume it draws directly into its outgoing graphics buffers, thus only the display update freq would limit it? Or is there some perceivable intermediate buffering, thus it adds multiple points/pixel lines at once?

No perceivable intermediate buffering nor clustering of multiple pixel lines at once. I admit, it was a bit hard to tell once it started scrolling by lines as those, of course, jump by 20 pixels. I'll have to add the escape sequence to turn on smooth scroll and see if that works...

  • Would this really run forever or will the vt340 break out of sixel mode after some time? This is mainly the question whether there is some kind of upper sixel limit in place, either time or payload based.

It ran, essentially, forever. I had it going for over an hour before the laptop the VT340 was logged into had to sleep. The VT340 screensaver did not activate during that time, even after losing the serial connection. I guess while it's processing sixels, the VT340 cannot do anything else.

@christianparpart

Would anyone with that capability be so kind and create a video demonstrating that?

@jerch
Owner Author

jerch commented Sep 24, 2022

Thx for testing. The results are as I expected - w/o perceivable buffering breaks, and indeed running forever. Ofc if buffering happens at a much smaller scale (within a few bytes), it might be easier to test with a much faster growing line from single sixels.

For completeness I also made the sine script somewhat faster: https://github.com/jerch/xterm-addon-image/blob/49def6e4141ef7681054636303155c286c0c5438/fixture/endless.sh (prolly cannot go much faster in shell script due to subprocess spawning).

@j4james

j4james commented Sep 24, 2022

Would anyone with that capability be so kind and create a video demonstrating that?

@christianparpart This is what it looks like on my terminal. I think Reflection Desktop was actually a bit smoother, possibly because my current implementation is using some sort of timing heuristic to decide when to flush the buffer. I can't remember the details anymore.

james@Bragi_.2022-09-24.10-02-32_Trim.mp4

@j4james

j4james commented Sep 24, 2022

There is something else I've been meaning to bring up if some of you are seriously considering adding support for streaming. You should be aware that the streaming requirements can be somewhat in conflict with modern usage for things like video. At least that was the case in my architecture.

For example, you can get shearing in video frames where you've started to display part of one frame while the lower half of the previous frame is still visible below. Even worse is that some libraries output frames with the background select parameter configured to erase the background, so you can get flickering when you see the background temporarily cleared before the next frame is fully output.

Depending on your architecture, this may not be an issue for you, but I just wanted to warn you to look out for it.

@christianparpart

christianparpart commented Sep 24, 2022

2 solution ideas. Either bring the terminal explicitly into VT340 mode for streaming sixels, or enable synchronized rendering to have video output flicker free. Would that work from your point of view?

Another idea is that if raster attributes width and height are provided then it is image-like, otherwise it is not.

EDIT: okay, that is 3 ideas 💡.

@j4james

j4james commented Sep 24, 2022

I've had a chance to look at my code now, and the basic approach I'm taking is to flush the output if it's been more than 0.5s since the start of the image or the last flush (there are also certain things that force a flush, like some palette changes, and obviously the end of the image). This works fine for videos - as long as I can keep up with at least 2 fps - but streaming can be a bit jerky.
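
The timing rule described above could look roughly like this in Python. This is only a sketch of the rule as stated, not the actual implementation: the class name `TimedFlushPolicy` and its methods are hypothetical, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import time


class TimedFlushPolicy:
    """Flush when more than `interval` seconds passed since image start or last flush."""

    def __init__(self, interval=0.5, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_flush = clock()

    def begin_image(self):
        # The timer starts when the sixel sequence starts.
        self.last_flush = self.clock()

    def should_flush(self, force=False):
        # `force` stands for events that always flush, such as certain
        # palette changes or the end of the image.
        now = self.clock()
        if force or now - self.last_flush > self.interval:
            self.last_flush = now
            return True
        return False
```

A decoder would call `should_flush()` after each chunk of sixel data it has processed and repaint whenever it returns `True`. A video frame that arrives completely within the interval is then shown in one piece, while slow streaming output is repainted at most about twice per second, which matches the "a bit jerky" observation.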

If I ever get back to sixel, though, one of the ideas I wanted to try was to use the terminal's write buffer to determine when to flush. For example, if you're playing a video, I expect you'd typically receive a whole frame in a single write, so you can leave off flushing the output when you know there's still data pending. For streaming output I'd expect a lot of short writes, which could then be flushed much more frequently.
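
A rough Python sketch of that write buffer idea, under my own assumptions about the architecture: `PendingAwareRenderer`, `enqueue` and `drain` are hypothetical names, and `decode` and `flush` stand in for whatever the terminal uses to parse sixel data and repaint.

```python
from collections import deque


class PendingAwareRenderer:
    """Decode all pending writes first and repaint only once nothing is left."""

    def __init__(self, decode, flush):
        self.decode = decode
        self.flush = flush
        self.pending = deque()

    def enqueue(self, chunk):
        # Called for every chunk of a write that has been received.
        self.pending.append(chunk)

    def drain(self):
        # Keep decoding while there is still data pending ...
        while self.pending:
            self.decode(self.pending.popleft())
        # ... and show the result only after the input has run dry.
        self.flush()
```

A whole video frame received in one write is repainted once, while a stream of short writes leads to a repaint after each of them.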

That buffer idea wouldn't have been possible at the time I originally wrote the code, but I think it should be feasible now with recent changes to our architecture.

Edit: @christianparpart I realise I didn't actually answer your question. I don't think there's necessarily anything terribly wrong with your suggestions - frankly I'm just happy you're even considering supporting stuff like this - but personally my goal is to try and make everything just work by default (i.e. both modern and legacy sixel applications).

So if I run a streaming sixel application, I ideally want that to work without switching to a VT340 compatibility mode (one reason being I'd want to be able to use streaming in combination with modern extensions like 256 colors). And if I'm viewing a sixel animation with something like img2sixel, I also expect that to work reasonably well without them having to change their existing implementation.

@jerch
Owner Author

jerch commented Sep 29, 2022

Another idea is that if raster attributes width and height are provided then it is image-like, otherwise it is not.

That's how my decoder currently handles it, but I think it is not the appropriate distinction for streaming sixels - an app might want to clear its drawing area beforehand by setting raster extents and still go into endless sixel drawing. From the sixel seq itself there is no way to decide whether a well-formed image packed into sixels is following, or whether sixels will keep coming in.
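
For illustration, the raster attribute check could be sketched like this in Python. This is not the decoder's actual code: `raster_extents` and `looks_like_image` are hypothetical helpers that only look at the data following the `q` of the DCS introducer, and they assume the raster attributes command has the form `"Pan;Pad;Ph;Pv`.

```python
import re

# Raster attributes command at the very start of the sixel data:
# '"' Pan ';' Pad ';' Ph ';' Pv  (Ph = width, Pv = height in pixels).
_RASTER = re.compile(r'"(\d*);(\d*);(\d+);(\d+)')


def raster_extents(sixel_data):
    """Return (width, height) if the data starts with full raster attributes, else None."""
    match = _RASTER.match(sixel_data)
    if match is None:
        return None
    return int(match.group(3)), int(match.group(4))


def looks_like_image(sixel_data):
    # The heuristic under discussion: extents present means "image-like".
    return raster_extents(sixel_data) is not None
```

The weakness is visible right away: a streaming app that sets extents only to clear its drawing area is classified as an image as well.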

So yes, with a more fine-grained screen update from sixels we basically re-introduce tearing, which is easily avoided by the image-like handling. Bummer.

A time-based heuristic might be possible, as a full image is likely to come in very fast (or with very fast band progression, or in big data chunks), and thus will finish at the raster extents soon. Ofc this is quite flakey, as any time-based heuristic might get fooled by transmission latency. Any other ideas, besides a hard TE setting?

Maybe GIP can come to the rescue here? @christianparpart Idk if you remember - I once suggested allowing a sixel seq to be embedded as the data part in GIP. If we'd go that route, we could restrict sixel usage there to level2, always containing raster extents and never exceeding them - thus basically limiting it to images packed in sixels. With that we have our distinction:

  • GIP(sixel) == image --> treat as image, screen update not before end of sixel data
  • plain sixel == prolly endless/streaming --> more fine-grained screen updates

@j4james

j4james commented Sep 29, 2022

Ofc this is quite flakey, as any time-based heuristic might get fooled by transmission latency.

Sure, it's not going to work perfectly all the time, but my thinking was that if your transmission latency is so bad that you can't get a full frame in under 500ms, then shearing of the video is the least of your problems.
