Optimize OpenGL Drawing #90

Open
kaveh808 opened this issue Sep 5, 2022 · 30 comments

Comments

@kaveh808
Owner

kaveh808 commented Sep 5, 2022

Use vertex arrays and the like to speed up the current naive drawing code in opengl.lisp.

@JMC-design
Contributor

It's been over 5 years since the last release of OpenGL, so there probably shouldn't be any reason to target anything but the latest. But since the writing is on the wall, perhaps some thought should be put into how to abstract over both OpenGL and Vulkan, though I'm thinking that might need somebody familiar with Vulkan.
If not, then there's choosing an abstraction to handle modern GL or writing yet another one.

@jolby
Collaborator

jolby commented Sep 6, 2022

Regarding the need to keep our eyes on the next graphics API, as @JMC-design was talking about: Piet-gpu is a good project to follow. They have a 2D/font focus, but they are pushing the envelope on doing as much of the UI compute on the GPU as possible:
https://github.com/linebender/piet-gpu
project vision:
https://github.com/linebender/piet-gpu/blob/main/doc/vision.md
And Raph Levien has some fantastic articles about doing graphics/compute on modern GPUs and GPU APIs:
https://github.com/linebender/piet-gpu/blob/main/doc/blogs.md

@JMC-design
Contributor

Might also want to set a bar for minimum gpu memory, I guess that's something that needs to be tracked, such a weird concept.
I've run across piet when looking for ideas on a rich text sort of api. I'm not sold on specifying ranges, though it is nice that it allows the text to be unmodified. I'm still leaning towards something I can read or write to a stream, so list of objects and lists that change attributes.

@JMC-design
Contributor

yet another opengl abstraction for lisp
https://github.com/jl2/simple-gl

@kaveh808
Owner Author

kaveh808 commented Sep 6, 2022

I am very keen to maximize use of the GPU as well as SIMD and multiple cores. I really want our system to be able to handle production-level datasets with the same (or better) speed as commercial packages.

How we architect this (improved OpenGL interface, Vulkan, compute on GPU) is something we should discuss.

If we do have a Vulkan enthusiast, a first step could be to implement the equivalent of the code in opengl.lisp.

Also, one of my goals is to develop a cross-platform GUI toolkit. Currently we're building it on OpenGL, using the text engine by @awolven and font rasterizer by @JMC-design .

@JMC-design
Contributor

So I've just drawn my first triangle using vertex arrays and here are some of my initial thoughts.
I'm assuming we'd like to fill buffers by just sending a list of points? What I've done for a test is just fill up a cl array, grab the vector-sap, and use that to fill buffers. With points we have to pack them. Do we pack into a cl array, pin and use, or just pack directly into a foreign array, and then free or keep the array around?
Does any packing we do into cl arrays have any effect on packing into simd packs?
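To make the packing question concrete, here is a minimal sketch of flattening a list of points into one contiguous single-float buffer, the kind of memory block whose pointer would be handed to a call such as `glBufferData`. It is written in Python rather than CL purely for illustration; the `pack_points` name and the (x, y, z) tuple layout are assumptions, not anything from opengl.lisp.

```python
from array import array

def pack_points(points):
    """Flatten an iterable of (x, y, z) tuples into one contiguous float32 buffer.

    Hypothetical helper: the name and the tuple layout are illustrative only.
    """
    buf = array("f")  # "f" = C float, 4 bytes per element
    for x, y, z in points:
        buf.extend((x, y, z))
    return buf

# One triangle, stored as interleaved x, y, z components.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
packed = pack_points(points)

# The raw bytes are what a buffer upload would copy to the GPU.
raw = packed.tobytes()
```

The same choice discussed above shows up here: the packed buffer can be kept around and reused, or rebuilt and thrown away on every upload.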

Writing glsl in a string in a lisp buffer is a nightmare of formatting. In the long run it doesn't matter what a person uses to get a string for a shader program, but maybe there should be some default shader dsl, or formatting to make code and examples easier to read?

It seems like it might be nice to encapsulate these buffers into structs that can be passed around easily, but then you have to build a bunch of functions to use those structs, and then years later you have cepl... or something similar. I wonder if anybody has made a comparison of the different layers on top of gl?

I'm not even sure if sbcl system pointers work the same way on windows or osx. So maybe packing directly into foreigns is required? And definitely so if there are any plans to support another implementation.
If anybody is interested this is the code I used to test. https://plaster.tymoon.eu/view/3408#3408 , just replace the surface:update with whatever your window needs to swap buffers.

@awolven
Collaborator

awolven commented Sep 7, 2022 via email

@JMC-design
Contributor

JMC-design commented Sep 7, 2022

I tried, but it reads like c and I don't see any lispy abstraction. The only thing I see is direct writing of individual bytes to foreign memory.
I'm not bright enough to understand other languages.

@kaveh808
Owner Author

kaveh808 commented Sep 7, 2022

These are good questions, and there are a lot of moving parts on how we encode geometry: ease of editing in CL, optimized OpenGL display, for SIMD, for threads.

One possibility I have been mulling over is whether we should keep a low-level C representation which can act like an old school display list for our geometry classes. We would need to sync up the CL point arrays with these C-type vectors after modeling operations, which would be optimized for OpenGL and such.

Or we could have C-level structs for internal geometry, which we access and modify from CL. That might make CL editing a bit slower, but could result in faster rendering.
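The first option (an editable point list plus a packed copy that is re-synced after modeling operations) can be sketched with a dirty flag. This is only an illustration in Python under assumed names (`PointCloud`, `set_point`, `packed`); none of them exist in the project.

```python
from array import array

class PointCloud:
    """Editable points plus a lazily rebuilt packed float32 copy (hypothetical)."""

    def __init__(self, points):
        self._points = [tuple(p) for p in points]
        self._packed = None      # cached contiguous buffer, None when stale
        self.repack_count = 0    # only here to make the caching visible

    def set_point(self, index, point):
        """Edit one point and mark the packed copy as stale."""
        self._points[index] = tuple(point)
        self._packed = None

    def packed(self):
        """Return the packed buffer, rebuilding it only if an edit occurred."""
        if self._packed is None:
            buf = array("f")
            for p in self._points:
                buf.extend(p)
            self._packed = buf
            self.repack_count += 1
        return self._packed

cloud = PointCloud([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
cloud.packed()
cloud.packed()                      # second call reuses the cache
cloud.set_point(0, (2.0, 2.0, 2.0))
updated = cloud.packed()            # edit forces exactly one rebuild
```

Editing stays cheap on the high-level side, and the cost of repacking is paid once per batch of edits rather than once per frame.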

@JMC-design
Contributor

@foretspaisibles
Collaborator

I am very keen to maximize use of the GPU as well as SIMD and multiple cores. I really want our system to be able to handle production-level datasets with the same (or better) speed as commercial packages.

Does it include distributed computing as a goal? :-)

@kaveh808
Owner Author

Down the road, why not? :)

@ghost

ghost commented Sep 10, 2022

Down the road, why not? :)

Because it would be a 30MB SBCL runtime per node? I really wish there was something like MirageOS (which uses OCaml) for Common Lisp or Scheme.

@ghost

ghost commented Sep 10, 2022

good eats

I'm a bit full from their 130 page slide deck on optimization. Looks like OpenGL 4.2+ only, which caused a stomach rumble. Sometimes I wonder, "Why can't we just implement OpenGL in pure Common Lisp and be done with it?"

@JMC-design
Contributor

I think the approach is still interesting. Today I'm going to try and test if it makes any difference packing arrays from different types of points, into cl arrays that are pinned and sent, as well as foreign arrays and sent.
In my brain it doesn't seem like there'd be much difference.
Besides, un/packing structured bits to be sent is on my todo list; I'm calling it pipeline. It's for use with a new CLX and wayland.
The thing with 4.2 is that 4.1 might have the same things, just in extensions. Whether it's like that on Mac I don't know. That or maybe MGL isn't hard to install/use? I have no mac to test that.

@ghost

ghost commented Sep 10, 2022

I think the approach is still interesting.

I agree, especially given the potential performance improvement. (I don't like vinegar on my salad, but wouldn't suggest other people shouldn't enjoy it, if you can tolerate one more food joke.) Thank you for posting the link and doing the testing.

I don't have a (capable enough) Mac to try it out on either, but if you do have success I wonder if it would help for you to post a simplified gist somewhere so someone who does could try it out.

@JMC-design
Contributor

Trying to come up with a good test for display as well. But so far, with just 333,333 points there's no time difference in packing cl arrays from either origin vectors or 3d-vector structs. From vectors uses slightly less cpu, but I probably need more points, since this is all taking ~0.004 seconds. .020 using generic functions.
Submitting cl arrays to gl by pinning them and passing the pointer is, well, just passing a pointer. I guess I should probably throw in some static-vectors stuff.

@JMC-design
Contributor

So here's just some basic testing. If you make smaller arrays then origin's lead widens. Whether it's worth the trade off in not being able to dispatch on...
But the surprising thing is the foreign being slower. If we can depend on just using sbcl to send pointers then I'm not sure what the benefit is.

https://plaster.tymoon.eu/view/3413#3413

@kaveh808
Owner Author

Nice work. Is the cost of sending sbcl pointers and ffi arrays to OpenGL (and GPUs) the same?

On a slight tangent, should we bite the bullet and go with double-float as our default? Or is the performance hit a serious one?

@JMC-design
Contributor

I can't see why it would be different, as they're both just pointers to memory. Unless being in sbcl's mem space somehow affects it. That's why I think an actual drawing test might elucidate further, at least just in terms of packing/repacking something over and over.

I don't know if I've been reading outdated stuff, but what I've seen is that lots of opengl drivers will just convert to single as their internal format. The support for doubles for gp compute is relatively new and requires above 4.1 and in some cases a new card. I've seen figures of half to a third of the performance of singles.
For anything like CAD I'd think a fixed-point format would probably be better.
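A small sketch of what is at stake in the single/double/fixed-point choice, in Python for illustration. The `to_float32` round trip mimics what happens when a double is narrowed to a single; the fixed-point helpers and the `SCALE` of 1000 steps per unit are hypothetical choices, not a proposal for the project's format.

```python
import struct

def to_float32(x):
    """Round-trip a Python float (a double) through a 4-byte single."""
    return struct.unpack("f", struct.pack("f", x))[0]

# A single has a 24-bit significand, so 2**24 + 1 cannot be stored exactly.
big = 16777217.0
narrowed = to_float32(big)   # becomes 16777216.0

# Hypothetical fixed-point scheme: integers counting 1/1000ths of a unit.
SCALE = 1000

def to_fixed(x):
    """Convert a coordinate to an integer number of 1/SCALE steps."""
    return round(x * SCALE)

def from_fixed(n):
    """Convert an integer step count back to a float coordinate."""
    return n / SCALE

# Integer addition is exact, unlike 0.1 + 0.2 in binary floating point.
fixed_sum = to_fixed(0.1) + to_fixed(0.2)
```

Fixed point gives uniform resolution across the whole coordinate range, at the cost of picking that resolution up front.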

@awolven
Collaborator

awolven commented Oct 11, 2022 via email

@awolven
Collaborator

awolven commented Oct 11, 2022 via email

@lukego
Collaborator

lukego commented Oct 11, 2022

Perhaps someone who doesn't necessarily have vulkan experience could volunteer.

I volunteer to make an attempt this month. What do I need to know to start off in the right direction? (Either in absolute terms or based on the tiny start I made in #109 a ways back.)

@theottm

theottm commented Sep 23, 2023

I'm interested in trying to write this. I will try to build on what @JMC-design has proposed and the text-rendering engine @awolven has written.

It would probably make sense to reuse parts of the code of the text-rendering engine. In order to do so I would have a lot of questions, since there are a lot of things I don't understand the purpose of. It seems like a pretty advanced implementation to me, one which takes a lot of nitty-gritty details of OpenGL into consideration, am I right?

Anyway, I'll start by proposing something and hopefully we can improve on it incrementally afterwards with your feedback.

@awolven
Collaborator

awolven commented Sep 24, 2023 via email

@ghost

ghost commented Sep 24, 2023

unless you live in a cold cabin and need your PC to double as a toaster oven.

I used to render movies on my Mac Dual G4 only in the winter in Colorado, b/c it used nearly 1500W, like a hair dryer (which would have been quieter).

in the long run one will want to support retained mode paradigms

Retained mode caching in OpenGL-based scene graphs usually used "display lists". What method exists to do that now?

@theottm

theottm commented Sep 24, 2023

I see. I could also join the effort of porting kons-9 to krma then, if this makes more sense. I'm mostly interested in having a rendering engine I can understand and modify on the fly. If krma can fulfill this role, I'm in.

About the modularity of krma, how would you do things like offscreen rendering, multiple passes? How would you create and load custom pipelines? Having some simple examples would be nice.

@ghost

ghost commented Sep 24, 2023

Kaveh rejected the vulkan branch and continued to make changes to the main branch until the vulkan branch bit rotted.

I have the feeling anything I say here is going to get me in trouble with someone. Adieu.

@theottm

theottm commented Sep 24, 2023

Could krma evolve to become something like CEPL for vulkan? Because that's in the end what I am looking for: a CL interface to a graphics API. Not just the bindings of course, but an interface that makes programming OpenGL or Vulkan in CL more natural.

@ghost

ghost commented Sep 24, 2023

CEPL

+1

Adieu to this topic.


7 participants