Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kobo Elipsa support (a.k.a, sunxi) #64

Merged
merged 158 commits into from Jul 15, 2021
Merged

Kobo Elipsa support (a.k.a, sunxi) #64

merged 158 commits into from Jul 15, 2021

Conversation

NiLuJe
Copy link
Owner

@NiLuJe NiLuJe commented Jul 11, 2021

Well, that was not fun.


This change is Reviewable

NiLuJe added 30 commits June 27, 2021 00:29
And putting it in a centralized place, because that way lies madness.
So, at least ensure we use saner-ish data types on *our* side...
And, good news, the command handlers are mildly saner.
They're still hobbled by the handler's prologue, though...
(Confirmed by raising the disp print level via debugfs).
Because we can't rely on fb0 anymore :(
Mostly for FFI purposes.
Not completely critical, as we don't write to the fb on that platform.
rotation, and make 0 behave as the canonical Portrait,
as that's what it's effectively always set to in Nickel.
@lgtm-com
Copy link

lgtm-com bot commented Jul 11, 2021

This pull request introduces 2 alerts when merging 24acf62 into 9e6b568 - view on LGTM.com

new alerts:

  • 2 for FIXME comment

@NiLuJe
Copy link
Owner Author

NiLuJe commented Jul 11, 2021

A few significant quirks:

  • The framebuffer itself (as in, the character device) is now almost entirely meaningless. We basically only need it to tell us the screen's dimensions. This makes the whole FBFD_AUTO workflow kind of tricky, because we need a whole other set of things besides a fb fd to make it work... All of that extra stuff is kept internal, but if you're wondering why there's suddenly a few new and unfamiliar fds open, that would be the reason (they're all flagged CLOEXEC, though).

  • Another direct consequence of that fact is that there's no longer a public way to poke at the screen's work buffer, as each process gets one (or more, if you're really into crazy stuff) private buffer. If you just wanted screen grabs, see utils/sunxigrab.sh for a way to do that.

  • This also means that dealing with rotation has become even more hellish than usual, since you don't actually have any way of telling what's the layout of what is currently on the screen. Instead, I had to resort to some dirty hackery to poke directly at the accelerometer to tell us how the device is physically laid out. But that doesn't necessarily match the screen's layout... And, as a cherry on the cake, if you do attempt to display stuff in a different layout, all hell breaks loose, and none of the available workarounds are all that great. (Follow the breadcrumbs around fbink_toggle_sunxi_ntx_pen_mode at the bottom of the public header for more details).

  • That wasn't a surprise, because knowing other devices on sunxi SoCs, I was kind of expecting it, but, there is no hardware inversion support, there is no hardware dithering support (and software dithering selection is currently broken, more details below), and AUTO waveform mode selection is also not hardware accelerated, so, prefer specifying a mode explicitly.

  • The fact that each process works on a private buffer also makes anything that relied on "previous" buffer state meaninless in the CLI tool, since each invocation starts with a brand new black-filled buffer (a possibly non-exhaustive list, I'll update the CLI doc as I go. Note that this is mainly an issue for the CLI tool. API users get their own stable buffer, so, as long you're working on your own content, stuff behaves as usual.

    • --refresh, since it gets its own buffer, will only yield a solid black region.
    • Which in turns makes --norefresh useless, since each draw gets a different buffer, and the final refresh too ;p.
    • --fgless / --bgless / --overlay drawing modes are also unusable, since they all rely on the existing state of the buffer to some extent, and that doesn't really work anymore.
  • For API users, this affects what you can do with the dump & restore API calls. Documentation has been updated to mention those limitations.

  • I'm probably forgetting a few things, because I've lost count of how many times doing stuff wrong caused a kernel OOPS/BUG/hang/softlock/reboot. -_-"

@NiLuJe
Copy link
Owner Author

NiLuJe commented Jul 11, 2021

And now, ping @gtalusan, because I come bearing bug reports ;p.

In no particular order:

Kernel issues:

  • The ION_HEAP_QUERY ioctl is subtly broken, because of a mainline issue. c.f.,

    FBInk/utils/ion_heaps.c

    Lines 111 to 116 in 24acf62

    // NOTE: There's a fun bug in the Elipsa kernel,
    // where ion_query_heaps never actually sets the return value to 0 on success...
    // It only does so for bufferless queries :s.
    // NOTE: It was only fixed for Linux 4.12,
    // which is also the kernel where this all mess became useless because of the new dmabuff API :D
    // c.f., https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/staging/android/ion/ion.c?h=v4.12&id=804ca94a98237e12ad8fabf7cdc566f14b05b7df

    (Granted, we don't need it, and, as the code above shows, we can get it to work, but it came in handy to triple-check a few ION things, and the fix is trivial ;)).

  • There's a rather nasty issue around the EINK_DITHERING_ constants: they all share a bit with EINK_DITHERING_Y1, which happens to be the bit that the if ladder checks for first in eink_image_process_thread @ drivers/video/fbdev/sunxi/disp2/disp/de/eink_buffer_manager.c, which effectively means that every dithering flag actually ends up being detected as this one.
    c.f.,

    FBInk/fbink.c

    Lines 2619 to 2624 in 24acf62

    // NOTE: Results are eerily similar to the Y1 algos... (i.e., we do *much* better).
    // That's probably because the bitmask handling in eink_image_process_thread is fishy as hell...
    // (The constants are slightly broken in that *every* EINK_DITHERING_?? flag
    // includes the same bit as EINK_DITHERING_Y1,
    // and the check starts by checking for... EINK_DITHERING_Y1 >_<").
    // TL;DR: No matter what you request, the kernel always uses EINK_DITHERING_Y1 :(.

  • Gnarlier, there's a similar bitmask issue with the waveform constants, this time: all the constants strictly greater than EINK_GLD16_MODE share at least one bit with other modes. In particular, the GLK/GCK ones share bits with A2, and, at a quick glance, there is code that checks and takes A2-specific code-paths after night-specific codepaths...

The issue is slightly gnarlier than the dithering one in that there is enough space in the first 16 bits (even without moving RECT & AUTO to higher bits) to fit all of these, but a lot of the code masks the wrong amount of bytes manually (& 0xFF) instead of using the GET_UPDATE_MODE macro...

  • In the same vein, and possibly related, GC4L & CLEAR currently crash the driver (first causes a timeout and then silent failures, and the device becomes extremely reluctant to soft reboot).

  • Something slightly more trivial that I just noticed when cruising by: in ring_buffer_manager_init (@ drivers/video/fbdev/sunxi/disp2/disp/de/eink_buffer_manager.c), in the error codepath, a buffer is freed with a wrong size (* 4) @ L3493 while it's allocated without a multiplier @ L3340...


nickel/pickel quirks:

  • pickel appears to be setting the heap_id_mask field in the ION_IOC_ALLOC ioctl to ION_HEAP_MASK_CARVEOUT|ION_HEAP_MASK_DMA. Is that on purpose (there is enough space left in the carveout heap, after all)? Right now DMA gets picked simply because it has higher priority thanks to its higher id (this is where ION_HEAP_QUERY came in handy ;p).
    I haven't straced nickel yet, but it does also end up with an alloc in the DMA heap.

  • pickel passes a pointer instead of a value for the flags arg in the ION_IOC_ALLOC ioctl. Again, haven't checked nickel.

  • Both nickel & pickel pass a pointer instead of a value for the cfa_use arg in the DISP_EINK_UPDATE2 ioctl.

The ion stuff was sniffed via a patched strace, as was the disp stuff, although it's also evident in debug logging if you raise disp's debug level:

FBInk/fbink.c

Lines 2571 to 2575 in 24acf62

// NOTE: In case of issues, enable full verbosity in the DISP driver:
// echo 8 >| /proc/sys/kernel/printk
// echo 9 >| /sys/kernel/debug/dispdbg/dgblvl
// And running klogd so you get everything timestamped in the syslog always helps ;).
// "Small" caveat: it appears to make the driver *that* much buggy and crashy...

@gtalusan
Copy link

Nickel uses its own accelerated dithering instead of the kernel's. You should too.

Don't confuse pointers with uninitialized/doesn't-crash-then-don't-care values.

@gtalusan
Copy link

Also CARVEOUT | DMA is probably definitely a bug on our side. CARVEOUT is the way to go. 💯 on that.

@NiLuJe
Copy link
Owner Author

NiLuJe commented Jul 11, 2021

Nickel uses its own accelerated dithering instead of the kernel's. You should too.

I usually am (well, without the "accelerated" part :D), but the EPDCv2 dithering support available on Mk. 7 was surprisingly decent ;).

Don't confuse pointers with uninitialized/doesn't-crash-then-don't-care values.

I was on the fence, but the fact that they appeared to be close to actual useful pointers in the same ioctl args and deref to "correct" (usually 0 ;p) values was strange ^^.

(Definitely on the "mostly harmless" side of things, though. Although I wouldn't necessarily trust all the cfa_use checks in the kernel to do the right thing...).

@NiLuJe
Copy link
Owner Author

NiLuJe commented Jul 11, 2021

Also CARVEOUT | DMA is probably definitely a bug on our side. CARVEOUT is the way to go. 100 on that.

Switched to the carveout in 934616f, at a very quick glance nothing new horribly blew up, thanks ;).

┌─(ROOT@europa:pts/1)──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────(~)─┐
└─(2.04:19%:21:55:100%:#)── cat /sys/kernel/debug/ion/heaps/carveout                                                                                                                                                                                                                                                                                   ──(Sun, Jul 11)─┘
          client              pid             size
----------------------------------------------------
        disp_ion                1         97714176
    finger_trace             2908          2629632
----------------------------------------------------
orphaned allocations (info is from last known client):
----------------------------------------------------
  total orphaned                0
          total         100343808
   deferred free                0
----------------------------------------------------

@lgtm-com
Copy link

lgtm-com bot commented Jul 11, 2021

This pull request introduces 2 alerts when merging 934616f into 9e6b568 - view on LGTM.com

new alerts:

  • 2 for FIXME comment

@NiLuJe NiLuJe changed the title Kobo Ellipsa support (a.k.a, sunxi) Kobo Elipsa support (a.k.a, sunxi) Jul 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants