Rust Import Scanner #221
CodSpeed Instrumentation Performance Report
Merging #221 will degrade performances by 73.02%
b420e06 to cf7672f
This necessitates us turning off multiprocessing, but we'll switch to multithreading in a subsequent commit.
cf7672f to bf68f26
bf68f26 to 0cfe293
It occurred to me we could probably keep multiprocessing for the time being if, rather than passing the file system as an argument to the jobs, we pulled it from the settings object within each job. Might be worth a try so we can merge this.
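A rough sketch of that idea, in case it helps: each job receives only a picklable path and looks the file system up from a module-level settings object. Everything here (`Settings`, `settings`, `configure`, `scan_job`, `scan_in_parallel`, `UnpicklableFileSystem`) is a hypothetical name for illustration, not the project's actual code, and it assumes each worker process can configure its own settings, here via a pool initializer.

```python
import multiprocessing
import pickle
from dataclasses import dataclass
from typing import Dict, List, Optional


class UnpicklableFileSystem:
    """Hypothetical stand-in for a file system object that cannot be pickled."""

    def __init__(self, contents: Dict[str, str]) -> None:
        self._contents = dict(contents)

    def read(self, path: str) -> str:
        return self._contents[path]

    def __reduce__(self):
        # Simulate an object that multiprocessing cannot send to a worker.
        raise pickle.PicklingError("this file system cannot be pickled")


@dataclass
class Settings:
    """Hypothetical settings object holding the configured file system."""

    file_system: Optional[UnpicklableFileSystem] = None


settings = Settings()


def configure() -> None:
    # Runs once per process, so the file system never crosses a process boundary.
    settings.file_system = UnpicklableFileSystem(
        {"a.py": "import os\nx = 1\n", "b.py": "import sys\nimport a\n"}
    )


def scan_job(path: str) -> int:
    # The job argument is just a path; the file system comes from settings.
    file_system = settings.file_system
    if file_system is None:
        raise RuntimeError("settings.file_system has not been configured")
    lines = file_system.read(path).splitlines()
    # Toy stand-in for import scanning: count lines that look like imports.
    return sum(1 for line in lines if line.startswith(("import ", "from ")))


def scan_in_parallel(paths: List[str]) -> List[int]:
    # The initializer configures settings in every worker process.
    with multiprocessing.Pool(processes=2, initializer=configure) as pool:
        return pool.map(scan_job, paths)
```

`scan_in_parallel(["a.py", "b.py"])` would need to be called under an `if __name__ == "__main__":` guard on platforms that spawn worker processes. The point is only that nothing unpicklable is passed as a job argument.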
Replacing in favour of #229, which doesn't turn off the multiprocessing.
Moves the ImportScanner class to Rust.
In order to do this we also need to define Rust-based swappable fake/real file systems to be passed in, so we can still unit test via Python. To reduce the amount of work involved, I've defined a narrower interface for the filesystem that only implements what is needed by ImportScanner. We can broaden it later when we come to do the same with caching / module finding.

I had trouble getting the ImportScanner to pickle/unpickle correctly, which is needed to do multiprocessing, so I've removed multiprocessing. This slows things down considerably (on a large graph, locally, building the graph goes from 5s -> 15s). So really we need to keep going and move the multiprocessing to threads before this is releasable.
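To illustrate the swappable fake/real file system idea from the Python side, here is a minimal sketch. The PR implements these in Rust; the names below (`NarrowFileSystem`, `RealFileSystem`, `FakeFileSystem`, `count_import_lines`) and the two-method interface are assumptions made for illustration, not the interface this PR defines.

```python
from pathlib import Path
from typing import Dict, Protocol


class NarrowFileSystem(Protocol):
    """Hypothetical narrow interface: only what a scanner would need."""

    def exists(self, path: str) -> bool: ...

    def read(self, path: str) -> str: ...


class RealFileSystem:
    """Reads from disk."""

    def exists(self, path: str) -> bool:
        return Path(path).is_file()

    def read(self, path: str) -> str:
        return Path(path).read_text(encoding="utf-8")


class FakeFileSystem:
    """In-memory replacement for unit tests."""

    def __init__(self, contents: Dict[str, str]) -> None:
        self._contents = dict(contents)

    def exists(self, path: str) -> bool:
        return path in self._contents

    def read(self, path: str) -> str:
        try:
            return self._contents[path]
        except KeyError:
            raise FileNotFoundError(path) from None


def count_import_lines(file_system: NarrowFileSystem, path: str) -> int:
    # Toy stand-in for a scanner: it depends only on the narrow interface,
    # so either file system can be passed in.
    return sum(
        1
        for line in file_system.read(path).splitlines()
        if line.startswith(("import ", "from "))
    )
```

A test can then build a `FakeFileSystem({"/pkg/a.py": "import os\n"})` and pass it wherever a `RealFileSystem()` would be used in production.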
There's also distinctly smelly Rust code that loads the Module and DirectImport dataclasses, rather than defining them in Rust - all in the interests of trying to limit how many changes I needed to make to keep the tests passing.

Parallelism next steps
The ImportScanner class currently requires the GIL, which means we still have a bit of work to do before we can move to multithreading. I think the best thing to do is abandon the ports-and-adapters approach for ImportScanner (we only have one of them anyway) and instead make a function that we can unit test from Python, that does them in bulk along the lines of #222. We should keep the unit tests of import scanner but just adapt them to call the bulk function instead - which at first could be in Python, but then we could push that down to Rust. That would allow us to turn the ImportScanner into a pure Rust class that doesn't need the GIL, and do any mapping to Python classes / exceptions in a wrapper function in Python.

Still to do