GPU-accelerated Display (Vulkan) #93
First step toward rendering multiple objects (using different SDF shaders)
(I want to draw multiple things in a row without having them each clear the screen.)
Remove csubst now that it's part of base C lib
multiple of a shape
Also fix linebreaks in csubst
(next, update images, which is harder)
Now we can standardize on struct representation of C structs in Tcl instead of dict representation... hopefully simpler code too in the end.
WIP: Start on images/sampler2D. Improve char array support in C. Actually put uboFields in Pipeline struct so they don't shimmer out (because we now run the Pipeline data back into C to set up the image right before draw time).
(mostly undoing refactor; also pointer fix in calibrate)
Review time!
99% amazing!
README.md
Outdated
$ CFLAGS="-O2 -march=armv8-a+crc+simd -mtune=cortex-a72" CXXFLAGS="-O2 -march=armv8-a+crc+simd -mtune=cortex-a72" meson -Dglx=disabled -Dplatforms= -Dllvm=disabled -Dvulkan-drivers=broadcom -Dgallium-drivers=v3d,vc4,kmsro -Dbuildtype=release ..

# AMD (radeonsi), including Beelink SER5
$ meson -Dglx=disabled -Dplatforms= -Ddri-drivers='' -Dvulkan-drivers=amd -Dgallium-drivers=radeonsi -Dbuildtype=release ..
`-Ddri-drivers=''` is unnecessary and causes it to fail
1. See [notes](https://folk.computer/notes/vulkan) and [Naveen's
notes](https://gist.github.com/nmichaud/1c08821833449bdd3ac70dcb28486539).
1. `sudo adduser folk video` & `sudo adduser folk input` (?) & log out and log back in (re-ssh)
maybe put the video and input groups before the vulkan stuff? It tripped me up when the folk user couldn't use the video out, causing vkcube to fail
```

Go to http://whatever.local:4273/frame-image/ to see the camera's
current field of view. Reposition your camera to cover your table.
oh this is awesome! Way better than trying to use the projection itself.
calibrate.tcl
Outdated
int captureNum = 0;
uint8_t* delayThenCameraCapture(Tcl_Interp* interp, const char* description) {
usleep(100000);
usleep(500000);
why is this longer?
variable rtypes {
int { expr {{ $robj = Tcl_NewIntObj($rvalue); }}}
int32_t { expr {{ $robj = Tcl_NewIntObj($rvalue); }}}
double { expr {{ $robj = Tcl_NewDoubleObj($rvalue); }}}
float { expr {{ $robj = Tcl_NewDoubleObj($rvalue); }}}
char { expr {{ $robj = Tcl_ObjPrintf("%c", $rvalue); }}}
bool { expr {{ $robj = Tcl_NewIntObj($rvalue); }}}
uint16_t { expr {{ $robj = Tcl_NewIntObj($rvalue); }}}
feels like `uint8_t` is missing?
pi/Gpu.tcl
Outdated
This all is flying over my head, but it seems great!
vendor/fonts/PTSans-Regular.csv
Outdated
Maybe write up how to generate these?
namespace eval ::Display {
variable WIDTH
variable HEIGHT
variable LAYER 0
regexp {mode "(\d+)x(\d+)"} [exec fbset] -> WIDTH HEIGHT
if {$::isLaptop} {
Idk if this is relevant to this code, but the laptop stuff still doesn't work on my end
Seems good to me, but the calibration being broken is weird
It's worked for me -- can you post some of the jpegs in the folk folder that get emitted during calibration? I wonder whether it's a timing issue or a repeatability/optics issue. It's definitely janky though
The sleep (and extra webcam framegrabs) during calibration is because it takes some time for the projected stripes to get to the real world -> be visible from the webcam (and to appear in the received buffer), but they're kind of guesses...
try more v4l2-ctl stuff. Using Gpu directly seems to fix us dropping the white draw call.
Replace our software rendering with GPU rendering using shaders. All Display primitives (text, lines, images, filled shapes) should work as before.
Performance is extremely good on the NUC (60fps without any drops, even when rotating multiple large images and text, compared to 20-30fps before). It's less good on the Pi 4: maybe moderately worse than the software renderer, but not terrible.
You'll need to install Vulkan on your machine to get this to work; see the README and wiki.
Replaces
with
Implements a new 'GPU FFI' which allows you to write vertex and fragment shaders to implement drawing primitives:
The first argument to `Gpu::pipeline` is a list of overall arguments for both shaders. These are accessible from both the vertex and fragment shader. You pass these in at draw time (they're push constants -- sort of like uniforms in GL, but limited in total size to 128 bytes). (The exception is `fn`-type arguments, which aren't actually passed in; they're meant to be names of `Gpu::fn` functions you've already made, which are looked up at call time from the surrounding Tcl environment and inlined into the shader. They just fall out and don't map to arguments in `Gpu::draw`.)

The second argument to `Gpu::pipeline` is the source code of the main function of a vertex shader. It should return a vec2 vertex of a quad in Vulkan triangle-strip vertex order (should be topleft, topright, bottomleft, bottomright, like a Z, not counterclockwise) based on `gl_VertexIndex` (it will be called with vertex index 0, 1, 2, 3).

The returned vertex should be in screen coordinates, not in [0, 1]. You can access the builtin `vec2 _resolution` to get resolution of the screen. (In practice, so far we mostly use the vertex shader for clipping so we don't have to touch the whole display on every draw.)

The third (or fourth, if you make fragment shader fn arguments before it) argument to `Gpu::pipeline` is the source code of the main function of a fragment shader. You can access `vec2 gl_FragCoord` to get current pixel coordinates. It should return a vec4 color. You can access any of the overall arguments (including `_resolution`) from here as well.

See `Gpu.tcl` and `display.folk` for more examples. You can make `sampler2D`-type arguments if you want to pass an image in; you can make `fn` arguments to a fragment shader as third argument.

A pressing TODO is to break the drawing primitives up into separate virtual programs and expose all this better to the user (it's currently hard to use in a user program, because you need to run this code on the Display process).
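
To make the triangle-strip (Z) vertex order and the 128-byte push-constant budget described above concrete, here is a rough CPU-side sketch in C. It is not code from this PR: the `PushConstants` layout, the `Vec2` type, and `quad_vertex` are hypothetical names for illustration, and the real vertex shaders are GLSL snippets passed to `Gpu::pipeline`.

```c
#include <assert.h>

/* Hypothetical argument block for one pipeline (illustrative only). */
typedef struct {
    float origin[2];  /* top-left corner of the quad, in screen pixels */
    float size[2];    /* width and height of the quad, in screen pixels */
    float color[4];
} PushConstants;

/* The description above limits push constants to 128 bytes in total. */
_Static_assert(sizeof(PushConstants) <= 128,
               "push constants must fit in 128 bytes");

typedef struct { float x, y; } Vec2;

/* Map a vertex index 0..3 to a quad corner in triangle-strip order:
 * 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
 * Bit 0 of the index selects the column, bit 1 selects the row.
 * Screen coordinates are assumed to have y growing downward. */
static Vec2 quad_vertex(const PushConstants *pc, int vertex_index) {
    assert(vertex_index >= 0 && vertex_index < 4);
    Vec2 v;
    v.x = pc->origin[0] + (float)(vertex_index & 1) * pc->size[0];
    v.y = pc->origin[1] + (float)(vertex_index >> 1) * pc->size[1];
    return v;
}
```

Indexing the corners by bits keeps the order Z-shaped, so the two triangles of the strip (0, 1, 2 and 1, 2, 3) cover the quad without any extra index buffer.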
Substantially rewrites calibrate.tcl to use the new drawing system. Also slightly friendlier print output.
Removes the live camera preview from pi/Camera.tcl -- we'll recommend the web image for now, unless we reimplement it.
Extends c.tcl to automatically generate Tcl-side struct getter functions like `image_t data`, `image_t width`, etc (ensemble commands under the struct type name, one for each field).

Extends c.tcl to allow some throwing of Tcl exceptions deeper in C (still WIP).
Extends the Folk interprocess heap in main.tcl to allow freeing of heap allocations (it now is built on top of dlmalloc).
On each allocation, the heap now also stores a random 64-bit 'version' for that allocation (which can then be remembered by the caller). You can query the heap with any address and it will try to give you the 'version' if there is an allocation containing that address. This is used to check for staleness of images that have been copied to the GPU (like camera slices). If the version mismatches the one we stored at previous copy-time, then we know we have to recopy that image to the GPU.
This implementation is pretty inefficient and unsafe (it walks a capped-256 array of all allocations) and we may want to replace it with an interval tree or something at some point. We will probably want to introduce more locking, too.
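
For reviewers who want a mental model of the version check, here is a minimal standalone sketch in C. It is an assumption-laden illustration, not the code in main.tcl: the names (`AllocRecord`, `heap_track`, `heap_untrack`, `heap_version_at`, `needs_recopy`) are hypothetical, a xorshift generator stands in for whatever randomness source the real heap uses, and there is no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ALLOCS 256  /* capped table, as described above */

typedef struct {
    uintptr_t base;
    size_t size;
    uint64_t version;  /* 0 marks a free slot */
} AllocRecord;

static AllocRecord allocs[MAX_ALLOCS];
static uint64_t rng_state = 0x9E3779B97F4A7C15ull;

/* xorshift64 (assumption: stand-in for the real randomness source).
 * A nonzero state never becomes zero, so 0 can mean "no version". */
static uint64_t next_version(void) {
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Record an allocation; returns its version, or 0 if the table is full. */
static uint64_t heap_track(void *ptr, size_t size) {
    for (int i = 0; i < MAX_ALLOCS; i++) {
        if (allocs[i].version == 0) {
            allocs[i].base = (uintptr_t)ptr;
            allocs[i].size = size;
            allocs[i].version = next_version();
            return allocs[i].version;
        }
    }
    return 0;
}

/* Forget an allocation when it is freed. */
static void heap_untrack(void *ptr) {
    for (int i = 0; i < MAX_ALLOCS; i++) {
        if (allocs[i].version != 0 && allocs[i].base == (uintptr_t)ptr) {
            allocs[i].version = 0;
            return;
        }
    }
}

/* Linear walk: version of the allocation containing addr, or 0 if none. */
static uint64_t heap_version_at(const void *addr) {
    uintptr_t a = (uintptr_t)addr;
    for (int i = 0; i < MAX_ALLOCS; i++) {
        if (allocs[i].version != 0 && a >= allocs[i].base &&
            a - allocs[i].base < allocs[i].size) {
            return allocs[i].version;
        }
    }
    return 0;
}

/* Staleness check: recopy to the GPU if the version changed since last copy. */
static int needs_recopy(const void *image_data, uint64_t version_at_last_copy) {
    return heap_version_at(image_data) != version_at_last_copy;
}
```

The linear walk is the part an interval tree would replace; the version comparison in `needs_recopy` would stay the same.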
Fixes #22.
Please test this if you can (and help with documentation if you can); I'd like to merge it by Monday if possible and have at least 1-2 people sign off on it. I know @charlesetc has been running it for a few days without problems.