Add support for SLiM simulations on Windows #149
Hi Martin,

Here is my edit to try to get the slim command working on Windows. The primary changes are to the slim command-line call, with specific changes to the path handling. I hope this pull request works appropriately; let me know if there are any issues.

A couple of notes: I'm on a Windows 11 machine. My slim is held in a conda environment and was downloaded through conda-forge. I call the slim path directly with the "slim_path" option in your slim() R command. I used your setup_env() to get access to the necessary Python modules.

Other than that, these changes were sufficient to allow me to work through the tutorial that you have on the slendr website. I mention this briefly in notes in the code itself, but I did not test anything regarding the SLiM GUI; this was command line only.

Sincerely,
GK
The main way for people to install on Windows nowadays, I think, is the |
Yeah, that's what I understood as well (looks to me that pacman is something like Homebrew on macOS but ¯\(ツ)/¯ I haven't used Windows in 15 years). What I meant is that it looked to me that CLI programs installed via pacman appear to be using some sort of bash ported to Windows rather than the actual Windows "Command Prompt" (I think that's what it used to be called) or the "newer" PowerShell. This is what the colored terminal screenshots in the SLiM manual are from. So this wasn't actually about the installation itself but about running those programs. In any case, it's all good -- this wasn't meant to be a criticism of any kind, nor was it meant to express surprise or anything. Again, I don't use Windows, and have no prior expectations about any of this. :)
Ah, hold on. I can run Also, when I said
I actually meant pacman / msys2 compiled programs in general. But the above reads as I'm talking specifically about SLiM, which wasn't my intention, sorry. |
No worries, just checking that you were using |
Hi @bodkan, Thank you for looking into this! Yeah, I'm happy to try running the new version when it is ready!
@bodkan Hi! I assume you're using the latest tskit as some time ago we increased the possible lengths of ragged columns such as metadata to a 64bit int. It may be that on windows the type may still be 32 bit. What length of metadata column causes this issue? You can tell by with |
@benjeffery Might it be a good idea to add a bit of code in tskit, at startup or some convenient spot like that, that just checks |
Hello @benjeffery, thanks for jumping in.
I release a new slendr version whenever a new version of tskit or msprime or pyslim is released, so currently it includes tskit 0.5.6.
I'll try to get that as soon as I manage to isolate a broken simulation run in a fully reproducible way. I don't have Windows access and debugging this over GitHub actions or inside a super slow virtualized Windows machine has been what I imagine using punchcards in the 70s must've been like... The fact that this doesn't happen (for any given simulation) in every run has made investigating this even more challenging. I'll be in touch as I get more info on this, thank you. |
I haven't had time to dig into this yet, but I'm surprised we don't have a test for it. Hopefully we can replicate in CI.
So, I have to say I’m not yet 100% convinced it’s really a tskit-on-Windows issue rather than a reticulate-on-Windows issue. In fact, so far I'm leaning towards the latter. I did finally manage to create a tiny test case which produces a tree sequence that reproducibly causes the crash every time. This will hopefully make it possible to really dig deep and determine just what it is that’s causing the problem! From a casual glance I managed yesterday evening before I had to run home it’s something… utterly bizarre.
OK, I don't remember dealing with such an obscure weird issue. @rdinnager Would you mind testing one more thing for me? I managed to create a tree sequence which always breaks on my Windows virtual machine but I'd like to make sure it also happens on a native Windows system. Interestingly, this happens regardless of whether the tree sequence is simulated with SLiM or msprime. I attach two tree sequences,
So accessing the metadata breaks but only on every other access?!?! What the hell? This must be some weird, super low-level memory problem related to the Python embedding / caching the @rdinnager Could you please check that you observe the same behavior? (Note that in my VM it happens for both the SLiM and msprime tree sequence). At least then I'll know I'm not chasing ghosts but that the attached tree sequences break also for you in this way. Thank you very much. Again, for completeness, here's the error in full:
Here's a slendr script which generates a broken SLiM and msprime tree sequence. No need to run this, both tree sequences are attached to this message:
Holy crap. It is the seed?!?
This is the weirdest, craziest bug I've ever encountered, ever. How hilarious that all this frustrating debugging and digging boils down to... a huge-ass number being a little bigger than allowed! I still have no idea why @rdinnager, I think you can ignore my request for help in the message above. I'll adjust the range of integers that slendr picks a random seed from and also put a constraint on the range of user-provided random seeds -- for both Then I'll run the unit tests again and see what happens. 🤞 |
Oh! It doesn't look like slendr is generating a random seed in case the user doesn't provide any. If no random seed is explicitly specified, then slendr lets SLiM or msprime pick their own. This is interesting because if SLiM tends to provide a wider range of integers as random seeds than msprime, it would explain why it's been only Given that the error happens someplace slendr has no control over (low-level interface between R/embedded-Python), maybe a compromise (at least intermediate one) is for slendr to actually generate a random seed of its own, restricted to be less than |
Wow, craziness. Nice debugging!
This reverts commit 9d97733.
OK, so all the slendr unit tests are passing across macOS / Linux / Windows with no exceptions. I also verified that all the vignettes (which contain much more ambitious simulation runs) are building on all three systems.

It really was the random seed being bigger than a 32-bit integer causing the intermittent crashes on Windows during parsing of the metadata (in which slendr saves a dictionary with some custom metadata, including the seed used). Some very obscure interaction at the level of the R-Python interface that's impossible for me to explain because it would require digging into that interface at the C level. I'm sure it would be fun to figure out but I can't put more time into this. As far as slendr is concerned, this is fixed by making sure that random seeds are less than 2,147,483,647, which makes Windows happy in the current state of things.

Once this is merged, I will put out a new slendr version to release the Windows support to the wild. I'm really happy to have this in because porting the entire unit test suite to Windows required changes to make the code more robust overall. I'm pretty sure there are still smaller issues here and there but discovering those would require using slendr and its tskit interface on Windows for more serious work. Not something I can manage on a slow-oh-my-god-why-is-it-so-slow virtualized Windows system with a hilariously low screen resolution (I can't even get a proper fullscreen RStudio view! :)). Those bugs can be taken care of with individual GitHub issues.

What a journey this has been. Thank you everyone for helping out!
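For readers who want to see the shape of the seed restriction described above, here is a minimal Python sketch. It is not slendr's actual code (slendr is an R package); the helper name `choose_seed`, the constant `INT32_MAX`, and the lower bound of 1 are assumptions made for illustration. The only detail taken from the thread is that seeds should stay below 2,147,483,647.

```python
import random

# Largest value of a signed 32-bit integer (2,147,483,647).
INT32_MAX = 2**31 - 1


def choose_seed(user_seed=None):
    """Return a seed strictly between 0 and INT32_MAX (hypothetical helper)."""
    if user_seed is None:
        # No seed given: draw one here instead of letting the simulator pick,
        # so the value saved in the metadata always fits in 32 bits.
        return random.SystemRandom().randint(1, INT32_MAX - 1)
    if isinstance(user_seed, bool) or not isinstance(user_seed, int):
        raise TypeError("random seed must be an integer")
    if not 0 < user_seed < INT32_MAX:
        raise ValueError(f"random seed must be between 1 and {INT32_MAX - 1}")
    return user_seed
```

Calling `choose_seed()` draws a seed in the safe range, while `choose_seed(2**40)` raises a `ValueError` rather than letting an oversized value reach the metadata.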
This is inspired by the changes suggested by @GKresearch here.
I managed to set up Windows 11 in a virtual machine on my Mac. It's not a great development experience but it's something!
The goal is to check if we can support the default means of installation described in Section 2.3.1 of the SLiM manual. Conda-installed SLiM should work as well, except that it doesn't support SLiMgui, which makes it not the best default target. That said, Conda users will be able to specify the path to their slim binary directly via the `slim_path=` argument of the `slim()` function.

Speaking about paths -- I still haven't figured out how path management is supposed to work in the Windows shell. In fact, it appears that the SLiM installation on Windows described in the SLiM manual is using a non-native, non-Windows shell? That's why I'm considering making it mandatory on Windows to specify the path to the SLiM binary to be used (be it CLI SLiM or SLiMgui).
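The path policy under consideration here can be sketched roughly as follows. This is an illustrative Python sketch, not slendr's R implementation: the function `resolve_slim` and its `system` parameter are hypothetical, and the binary name `slim` on the `PATH` is an assumption. It shows an explicit path always taking precedence, a mandatory path on Windows, and a `PATH` lookup elsewhere.

```python
import os
import platform
import shutil


def resolve_slim(slim_path=None, system=None):
    """Locate a SLiM binary (hypothetical helper, for illustration only)."""
    system = system or platform.system()
    if slim_path is not None:
        # An explicitly given path always wins, e.g. a conda-installed binary.
        if not os.path.isfile(slim_path):
            raise FileNotFoundError(f"no SLiM binary at {slim_path!r}")
        return os.path.abspath(slim_path)
    if system == "Windows":
        # Assumed policy: do not guess on Windows, require an explicit path.
        raise ValueError("on Windows, the path to the SLiM binary must be given")
    found = shutil.which("slim")
    if found is None:
        raise FileNotFoundError("no 'slim' binary found on PATH")
    return found
```

Passing `system` explicitly is only there so the Windows branch can be exercised on any platform.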
Hey @GKresearch, if you're still interested in helping, it would be great if you could test the new version of slendr with Windows support on your computer, after I manage to put everything together.
Thanks for pushing me to do this and sorry again for the huge delay!