Speed feels slow #52
Thanks for the feedback. You're right, the speed is not ideal. I doubt it can be as fast as Python without many person-years of work improving compilation speeds in rustc. It should, however, be better than it currently is. I think the compilation speed regressed somewhat with the change that made it work on Mac and Windows. On Linux I used to see timings of ~600ms, but now I'm seeing around 2 seconds (on the same machine). I've got some ideas for changes that might improve the speed, hopefully for all platforms, not just Linux. Out of curiosity, what platform are you on? I've had another report of ~10 second times that was on Windows, so I'm wondering if there's something odd happening there. Unfortunately I don't have Windows to test on.
Oh, actually, one big change I realised since when it used to take ~600ms: optimization was turned on. If you turn it off with `:opt`, compiling gets a bit faster. Code will then obviously run slower, but if you're not doing significant computation, that may not matter.
There's quite a lot of fixed overhead, so it probably doesn't make much difference to the time whether you're doing `let a = 5` or writing 20 lines of code. 1500ms is certainly slow, but not entirely unexpected. I'm hopeful I can get it a little bit faster with the changes I'm wanting to make, but we'll see how we go.

Regarding downloading of crates, `extern crate` and `:dep` do more or less the same thing. In both cases, it's cargo that does the downloading. Cargo caches downloads, so it shouldn't download a crate again if you've already downloaded it. It will, however, have to recompile it after you've restarted the Jupyter kernel. One workaround for this: set the environment variable `EVCXR_TMPDIR` to some fixed directory. If you do this, the same cargo target directory is reused each time you run.
Yes, I can confirm that. A huge leap forward! :) But can you recall a revision or tag which had ~600ms? On the same machine,
I've written line/cell magics for Rust which run at full available speed. They do not store state and run separate cells independently.
Nice work :) On the evcxr front, I've pushed a few commits that might help speed a bit. For one, I've disabled optimization by default. I've also changed the execution model. Currently, for running things like println, I'm getting times of about ~415ms, whereas the current release (0.3.5) is more like ~530ms. So probably about 100ms faster.

Note that `2+2` will be slightly slower because it needs to make two attempts at compiling in order to determine how to display the result. The first attempt tries to call evcxr_display on the result. When that fails, it then uses the Display trait to format as plain text. That said, even `2+2` only takes 500-600ms on my laptop.

It's also worth noting that the first thing run does tend to be a little slower than subsequent lines. All my times are taken after running a few similar, but slightly different, bits of code - e.g. println with different strings, or adding different numbers together. I haven't pushed a new version to crates.io yet. Will wait a few days and see if I decide to make any more changes first.
I've just released 0.4.0, so you could try installing that. It'll probably be slightly easier to debug if nothing else. Previous versions wrote a separate crate for each bit of code, with each crate depending on the previous crates. With 0.4.0, it now writes a single crate and just updates it with each bit of new code.

If you run `perl -pi -e "s/foo=\d+/foo=$RANDOM/" src/lib.rs; time cargo rustc -- -C prefer-dynamic`, the source won't be identical to the last time it compiled, forcing the compiler to do some work. If compilation is slow there, then you could try editing Cargo.toml or adjusting cargo options to see if any make a big difference.

If you have the nightly compiler installed, you can try running `perl -pi -e "s/foo=\d+/foo=$RANDOM/" src/lib.rs; time cargo +nightly rustc -- -C prefer-dynamic -Ztime-passes`. That will tell you how long it's spending in different phases of the compilation. Again, it's probably good to run it a couple of times to see the results with incremental compilation after a small edit. I'll be very interested in what you find.
Here's what I get when I run "cargo rustc -- -C prefer-dynamic" in the last_compile_dir on windows:
I also found that deleting the 'target' dir is sufficient to force the compiler to recompile the project.
Added an empty main function to lib.rs and compiled with 'rustc lib.rs' - fast. Fixed the mut warnings and removed all sections from Cargo.toml except [package] - slow:
0.4.1 fixed the unused mut and dead code warnings (by ignoring them). It's true that deleting the target dir forces a recompile; however, it forces a full recompile, which is not what evcxr does with each bit of code you run. Only the first bit of code evaluated will be a full compile; the others will be incremental. That's why making a small edit to the code is more representative of what actually happens. So compiling the crate takes 4.3s, but rust_magic, which presumably is also compiling a crate, takes 0.6s. Something odd is happening. If you just create a new Rust crate with `cargo new`, then compile that, how long does it take?
Thanks, that's helpful! I've managed to track it down to the call to std::panic::catch_unwind. If I don't emit that call, compilation speeds up a good bit. I'm going to make it so that if :preserve_vars_on_panic is turned off, it doesn't emit the catch_unwind. Given the difference it makes, I'll probably also change preserve_vars_on_panic to default to false. Interestingly, with catch_unwind gone, opt level 0 vs 2 appears to make no difference, so I'm thinking of turning optimisation back on by default. In my local timings, evcxr is still a tiny bit slower. That appears to be due to the code for loading and storing variables. I'm going to change it so that that code is only emitted when you actually have stored variables, so that a simple print or eval doesn't pay the cost.
…anic catching slows down compilation. #52
My main concern with doing that would be if the user makes a mistake with a :dep instruction. They wouldn't get feedback straight away that it was wrong, and worse, once the :dep is added, there's currently no way to remove it, so every compile after that would fail. But it's a good point that having to wait separately for two relatively long compiles is annoying. It probably wouldn't take much to allow multiple :dep commands together with code to be executed. Then you can put it all in one cell.
I've pushed 0.4.4, which allows putting one or more :dep commands at the start of a block of code. It also adds support for using sccache.
In evcxr_jupyter, there was code (my first PR) that allows having several ":command" lines at the beginning of a cell, splitting them into multiple commands sent to eval.
Huh, I wonder if I just duplicated functionality...
You're quite correct. Sorry, I forgot about that. Here's what happened: I went to add support for init.evcxr and thought that CommandContext::execute supported mixed commands and code, but then found that it didn't. I didn't think to actually try it in the Jupyter kernel. When I went to implement it, I just started by writing a test for mixed commands and code in CommandContext::execute, made that test pass, then confirmed that it worked in the Jupyter kernel. So I guess at the moment my splitting code isn't getting run, because the Jupyter kernel pre-splits it.
I finally got around to trying lld for linking with evcxr. I'd tried once before, but ran into problems. This time it worked. It does appear to make a difference, although how much difference depends on things like optimization level, what crates (if any) you're using, etc. Ran with:

RUSTFLAGS="-C link-arg=-fuse-ld=lld" evcxr

With no crates, compilation is already relatively fast, although it did make it marginally faster. More interesting though was when I pulled in the regex crate with optimization disabled and ran:

let re: regex::Regex = regex::Regex::new(r"^\d{4}-\d{2}-\d{2}$").unwrap();

Might be worthwhile adding built-in support to make it easier to enable without needing to set the environment variable.
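For anyone who wants the flag to persist rather than setting the environment variable per session, the same rustflags can live in cargo's configuration file (`~/.cargo/config.toml`). This is a generic cargo mechanism, not something evcxr-specific; the target triple below is just an example for Linux:

```toml
# ~/.cargo/config.toml -- apply lld to every build for this target.
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-arg=-fuse-ld=lld"]
```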
Experience report: I'm currently on a mission to create a version of your mybinder configs that will pre-cache some reasonable-sized dependency. I'm tracking my progress on this branch: master...alsuren:nightly-2020-07, which can be built at https://mybinder.org/v2/gh/alsuren/evcxr/nightly-2020-07. Currently I feel like I'm going backwards, but I will report the numbers anyway. On the host (macos),
If I kill the evcxr instance and rerun the test with the EVCXR_TMPDIR prepopulated, it takes 5 seconds then 2 seconds. This feels like a promising start.
If I do the same in the Docker image, it now takes 46 seconds followed by 50 seconds.
I'm not sure why this is so slow, but my money is on the filesystem being slow (docker is using overlay fs for /). This is still quicker than if I don't have EVCXR_TMPDIR set (3m44s followed by 44s):
Note that my docker image has … I'm wondering if there's any way to set EVCXR_TMPDIR to point at a sensible filesystem and then copy the cache in from the Dockerfile. Does anyone else have experience with evcxr in Docker?
I am curious how much difference … would make.
Evcxr already passes …
Hi, I'm writing an open-source project and want to use Evcxr for both the REPL and Jupyter Notebook. The project has one larger crate; the build time with …
Thank you
You could use …
Rustc requires that the compiler version used to compile dependencies matches the current version. In addition, convincing cargo not to rebuild even with a matching compiler version may be hard. At the very least you have to restore the mtime of all files in the target directory, and possibly of the dependency source files in `…` as well.
You may be able to use sccache with cloud storage, however this requires giving your users write access to said storage, I think. In addition, it requires the absolute path of the target dir and of all dependency sources (…
Compared to a Python notebook, the speed feels slow. Running :timing, each block execution, even for simple code, takes 5 to 10 seconds, and extern crate is really slow.
I was hoping for something like the notebook-driven development you can do in Python, but the speed doesn't seem to allow me to do so.