-
Notifications
You must be signed in to change notification settings - Fork 772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Alternative Array.Filter implementation #1479
[WIP] Alternative Array.Filter implementation #1479
Conversation
Optimized for memory usage, and average performance
OK; well here is an alternative version, I have started collecting some data on from runs with various parameters, and they can be seen in this spreadsheet. So what magic is included in this version? Well there is a bit of magical IL used to convert a bool into an int (safer than my previous struct magic, but probably would like a CIL guru to confirm if it is kosher or not... passes peverify and runs on all the JITS so I .. umm... assume it is...) Anyway, have a look... |
The alternative version had better performance on original tests that I ren it on, but not across a bevy of tests thrown at it. It also had too a bit of IL magic which is probably avoided, and was quite verbose. So good riddance.
@manofstick Looks like you took out the CIL modification? Didn't work? Not safe? |
I think it was safe, but really it was just taking the complexity level up a bit too high (documentation for instructions like ceq said they pushed 0 for false, 1 for true which was of type int32 - this would seem like you wouldn't even need any instructions, other than type change, but playing around it appeared that true was valid for command like brtrue for any non-negative value - so the cgt.un should have been fine). Anyway, maybe worth it if the results I got for my first trial were replicated across all parameters, but once I removed the fully random data source then the results were just worse that the previous, simpler version. (And @KevinRansom was already making comments about complexity!) Anyway, I'll just polish a couple of edge cases, and then check the test suite to see if it sufficiently covers the added complexity, and then I think I'm done. Hopefully get that done now, but my time can be interrupted at any moment be my two little monsters,,, |
- Small and large array edge case optimizations included - Broke functionality into multiple functions
Well in all it's overkill glory, I'm ready, I think, for this ugly baby to strut it's stuff, or alternatively be smashed to oblivion. Your choice :-) |
I don't know what to do ... you have presented a very real conundrum. The performance gains are very real, the memory savings are equally real ... and yet the code is impressively sized, and I wouldn't really want to have to debug an issue with it. However I expect you can factor out the mask stuff and make choose benefit from this approach too ... I am going to pull it, since speed is a real thing ... and there is no law that says we must write easy code ... Please take a look at choose and see if you can apply the same technique. And a quick gander through the rest of the libraries to see where else we can leverage this goodness. Thanks Kevin |
Ha, yeah I felt like sighing when I saw the end result; not particularly the most beautiful work I have ever done as far as code elegance goes... But then performance related code rarely is I guess; and hopefully it will never need to be changed again! (here's hoping!) Anyway, assuming that my test does what it is supposed to do (maybe worth stepping through) then it should be exhaustively tested... hopefully not famous last words!! Probably can't be extended for "choose" as you want to store the result of mapping, not just true/false. Possibly could be extended for "partition"... |
I think you could reduce memory use for partition, but not speed it up. |
Hmm... not sure if could really improve distinct; still need some sort of hash, and then some populate step... Don't think it would beat:
which save on memory, at the cost of performance... |
@manofstick maybe parallel.partition? |
Optimized for memory usage, and average performance