Performance Improvements #6

Open
saviorand opened this issue Jan 4, 2024 · 7 comments

Comments

@saviorand
Owner

Parallelization and performance optimizations

@crunchy-vonage

I appreciate you may not have optimised yet, but FYI, I get approximately:

  • 50 req/s with mojo lightbug.🔥
  • 100 req/s compiled

whereas Python Flask does about 1000 req/s on a single core.
Performance profile attached:
[image: perf]
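
The thread does not say which load tool produced these figures. As a rough illustration of what a requests-per-second number means, here is a minimal Python sketch that times sequential GET requests from a single client. The `measure_rps` and `demo` names and the stand-in `_HelloHandler` server are hypothetical, not part of lightbug, and a single sequential client will report lower numbers than a concurrent load generator would.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def measure_rps(host: str, port: int, path: str = "/", duration_s: float = 1.0) -> float:
    """Send sequential GET requests for about duration_s seconds; return requests/second."""
    completed = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        # One connection per request keeps the sketch simple (no keep-alive).
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            conn.getresponse().read()  # drain the body so the request fully completes
        finally:
            conn.close()
        completed += 1
    return completed / (time.perf_counter() - start)


class _HelloHandler(BaseHTTPRequestHandler):
    """Stand-in server for the demo only (an assumption, not lightbug)."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging so it does not skew the timing


def demo() -> float:
    """Start the stand-in server on a free local port and measure it briefly."""
    server = HTTPServer(("127.0.0.1", 0), _HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return measure_rps("127.0.0.1", server.server_address[1], duration_s=0.5)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(f"{demo():.0f} req/s")
```

To point it at a real server instead, call `measure_rps` with that server's host and port.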

@crunchy-vonage

Never mind, it's just something in the Welcome handler.
With my own handler, I get 2700 req/s.
[image: perf]

@saviorand
Owner Author

Woah, nice! Thanks for testing! Yes, the welcome handler serves an html page with an image, which might be slower. Can I ask how you're profiling this? The charts look sick

@crunchy-vonage

The profile was taken with Linux's built-in kernel profiler and its user-mode "perf" tool; I couldn't find a profiler specifically for Mojo yet. This technique does have the advantage of showing all user-mode and kernel-mode activity, i.e. the libc and CPython work.

I suspect there is a lot of memory allocation or copying happening in the welcome handler, but I'm not all that familiar with Mojo and haven't found a technique to profile memory allocation.

I'm also suspicious that the use of Python sockets might be suboptimal, but what do I know?

The flame graph was made following https://www.brendangregg.com/perf.html:

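# Fetch the FlameGraph scripts
git clone https://github.com/brendangregg/FlameGraph
cd FlameGraph
# Sample the running lightbug process at 99 Hz with call graphs for 60 seconds
sudo perf record -F99 -g -p `pgrep lightbug` -- sleep 60
# Fold the recorded stacks into one line per unique stack
sudo perf script | ./stackcollapse-perf.pl > out.perf-folded
# Render the folded stacks as an SVG flame graph and open it
./flamegraph.pl out.perf-folded > perf.svg
google-chrome perf.svg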
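
As a complement to the SVG, the folded file written by `stackcollapse-perf.pl` can be summarised as text. Each line holds a semicolon-separated stack followed by a space and a sample count. The sketch below is a hypothetical helper (`top_leaf_frames` is not part of FlameGraph) that totals samples by the innermost frame, and the sample lines in it are invented for illustration, not taken from the profile in this thread.

```python
from collections import Counter


def top_leaf_frames(folded_lines, n=10):
    """Total sample counts by innermost frame; return the n largest as (frame, count)."""
    totals = Counter()
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # Split on the LAST space: frame names themselves may contain spaces.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip anything that is not "stack count"
        leaf = stack.split(";")[-1]
        totals[leaf] += int(count)
    return totals.most_common(n)


if __name__ == "__main__":
    # Invented sample data in the folded format, for illustration only.
    sample = [
        "lightbug;handler;memcpy 30",
        "lightbug;handler;malloc 10",
        "lightbug;recv 5",
        "lightbug;other;memcpy 5",
    ]
    for frame, count in top_leaf_frames(sample, n=3):
        print(f"{count:6d}  {frame}")
```

For a real run, pass it the open file: `top_leaf_frames(open("out.perf-folded"))`.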

@crunchy-vonage

You might also enjoy perf top.

@crunchy-vonage

Yeah, 1500 req/s with the base64 image removed.
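
The thread does not establish why the embedded image was slow. One possibility, consistent with the earlier guess about allocation and copying, is that a large page body is rebuilt or copied on every request. The Python sketch below only illustrates that general idea under that assumption: `render_per_request`, `render_cached`, and the placeholder `IMAGE_BYTES` are hypothetical and do not reflect how lightbug's welcome handler is actually written.

```python
import base64
import timeit

# Placeholder "image": about 100 KB of arbitrary bytes (an assumption for the demo).
IMAGE_BYTES = bytes(range(256)) * 400


def render_per_request() -> bytes:
    """Re-encode the image and rebuild the whole page on every call."""
    encoded = base64.b64encode(IMAGE_BYTES).decode("ascii")
    page = f'<html><body><img src="data:image/png;base64,{encoded}"></body></html>'
    return page.encode("ascii")


# Build the page once at import time and reuse the same bytes object afterwards.
_CACHED_PAGE = render_per_request()


def render_cached() -> bytes:
    """Return the prebuilt page without any per-call encoding or allocation."""
    return _CACHED_PAGE


if __name__ == "__main__":
    runs = 200
    print("per request:", timeit.timeit(render_per_request, number=runs))
    print("cached:     ", timeit.timeit(render_cached, number=runs))
```

Base64 also enlarges the payload by roughly a third, so an inlined image costs more bytes per response than the raw file would.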

@saviorand
Owner Author

@crunchy-vonage we're actually doing external_calls to C in the Mojo server implementation in the sys folder (this one is enabled by default), not talking to Python! Python is only invoked in the separate Python implementation in the python folder.

@saviorand changed the title from "[EPIC] Performance Improvements" to "Performance Improvements" on Apr 14, 2024