v10 MAP_POPULATE speed-up: possible explanation #1
Comments
Hi @Pluvie, thank you very much. There are faster implementations than mine, but I gave my best in this challenge, had a lot of fun, and also tried to document what I did, both for myself and for others, and that's what matters to me after all. Let me see if I understood: you suggest that the old approach (with The reason I used Anyway, there isn't much documentation on how But thank you very much for the feedback and I'm glad you liked the repo =).
Yes, exactly. As far as I understand, when you call As you said, though, these features are not well documented. Also, in my opinion, programming against the kernel / syscalls is much less fun than programming against the CPU / hardware: it makes the whole experience more obscure. Sometimes I wish we could just skip the OS and tap directly into all the bare-metal power 😄
Yes, I know... only the obscurity of
Definitely... at least on Linux there is still the possibility of looking at the source and getting some idea of what is going on, or at least asking someone more experienced (who will certainly do the same).
It is possible... just... not trivial. Dealing with multi-core and so on is quite complicated on bare-metal x86_64. Maybe one day I'll try to do something like that, just for fun. However, there may not even be that much of a performance gain: the Linux kernel is surprisingly well optimized.
Hi @Theldus!
First of all, let me thank you for the very enjoyable read that you gave me with this repo.
You did an excellent write-up, and it was very nice to see all the changes discussed along with their improvements!
Speaking of that, I have an idea about what is causing the speed-up in v10, with the removal of the MAP_POPULATE flag.
Taken from the man page of mmap:
Since you are mapping a file, the MAP_POPULATE flag caused the entire memory block to be prefaulted. It is true that the memory will eventually be completely faulted in (since the whole file is read); however, the actual read is performed in the threads, while the MAP_POPULATE prefault happens in the main thread only.
So apparently it is the page faults being handled concurrently in the threads that causes the speed-up.
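To make the hypothesis concrete, here is a minimal Linux-only sketch, not the repo's actual code. The names `sum_file`, `sum_chunk` and `MAX_THREADS` are hypothetical and exist only for illustration. It maps a file read-only, optionally with MAP_POPULATE, and lets each worker thread touch its own slice of the mapping. Without the flag, the faults are taken in the workers as they read. With the flag, the prefault happens inside the `mmap` call, before any worker exists.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_THREADS 64 /* hypothetical upper bound for this sketch */

struct chunk {
    const unsigned char *base; /* start of this thread's slice */
    size_t len;                /* slice length in bytes */
    uint64_t sum;              /* result written by the worker */
};

/* Worker: reading every byte forces any not-yet-mapped page of the
 * slice to be faulted in by this thread. */
static void *sum_chunk(void *arg)
{
    struct chunk *c = arg;
    uint64_t s = 0;
    for (size_t i = 0; i < c->len; i++)
        s += c->base[i];
    c->sum = s;
    return NULL;
}

/* Sums all bytes of `path` using `nthreads` workers.
 * populate != 0: prefault in the calling thread via MAP_POPULATE.
 * populate == 0: pages are faulted lazily, inside the workers.
 * Returns 0 on success, -1 on error. */
int sum_file(const char *path, int nthreads, int populate, uint64_t *out)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;

    int flags = MAP_PRIVATE | (populate ? MAP_POPULATE : 0);
    unsigned char *map = mmap(NULL, size, PROT_READ, flags, fd, 0);
    close(fd); /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    pthread_t tid[MAX_THREADS];
    struct chunk ch[MAX_THREADS];
    size_t per = size / (size_t)nthreads;
    int started = 0;

    for (int i = 0; i < nthreads; i++) {
        ch[i].base = map + (size_t)i * per;
        /* the last slice also takes the remainder */
        ch[i].len = (i == nthreads - 1) ? size - (size_t)i * per : per;
        ch[i].sum = 0;
        if (pthread_create(&tid[i], NULL, sum_chunk, &ch[i]) != 0)
            break;
        started++;
    }

    uint64_t total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += ch[i].sum;
    }

    munmap(map, size);
    if (started != nthreads)
        return -1;
    *out = total;
    return 0;
}
```

Both modes return the same sum; only where the faults are taken differs. Timing `sum_file(path, n, 1, &s)` against `sum_file(path, n, 0, &s)` on a large file (for example with `clock_gettime`) would be one way to check whether the difference shows up on a given machine, keeping in mind that the state of the page cache will affect the numbers.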
What do you think about that?
Thanks again, wish you a wonderful day!