harpy impute error IsADirectoryError: #68
Comments
Thanks for writing. I'd like to diagnose this with you and it seems to be two parts:
I'm not sure why this would be happening. Nor is it an issue in any of Harpy's test suites (example). I could try to add, as a failsafe, a
I'm sure we can get to the bottom of this! |
Hi, thank you very much for the fast reply!
problem 1 - flexdashboard
This arose while running the preflight bam check. I solved it by installing flexdashboard manually within the conda environment, but as you say, this might point towards installation issues.
Installation
I had a couple of issues installing mamba, but my successful installation commands were as follows:
During the first preflight checks I got this error: "Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorit" Which I fixed as advised:
problem 2 - directory error
I attach a tree of the current harpy dir; I'm testing with 6 bam files: harpy.dir.tree.txt
Note that I first ran harpy impute in a subdirectory of harpy called harpy/stitch/, but I have since deleted that and run everything in the base harpy/, since it looked like that's how it prefers to work and that's where SNP/ and bams/ are. Could I clean up snakemake somehow to restart harpy impute from scratch? Thanks again! |
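On the question of restarting from scratch, a minimal sketch is below. It assumes the workflow's previous outputs live in a single folder such as `Impute/` and that Snakemake's bookkeeping sits in `.snakemake/` inside the directory harpy was launched from. The function name `reset_run` is hypothetical and not part of Harpy.

```shell
#!/usr/bin/env bash
# Sketch only: remove one workflow's outputs plus Snakemake's metadata so the
# next run starts clean. Folder names are assumptions, adjust to your layout.
reset_run() {
    local project_dir="$1"   # directory harpy is launched from
    local output_dir="$2"    # e.g. Impute (assumed output folder name)

    # Refuse empty arguments so rm can never be pointed at a bare "/".
    if [ -z "$project_dir" ] || [ -z "$output_dir" ]; then
        echo "usage: reset_run <project_dir> <output_dir>" >&2
        return 1
    fi

    # Remove the previous outputs of this workflow only.
    rm -rf -- "${project_dir:?}/${output_dir:?}"
    # Remove Snakemake's metadata and locks for this working directory.
    rm -rf -- "${project_dir:?}/.snakemake"
}

# Usage (hypothetical paths):
#   reset_run "$PWD" Impute
```

Input folders such as `bams/` and `SNP/` are left untouched. If the only problem is a stale lock, Snakemake's own `--unlock` option is the gentler tool.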
As an additional test, I ran the module harpy phase with the following command:
And it exited with
Found an error in the snakelog in the linkeFragments step:
So I went to the linked log and found that the issue is pysam!! An issue with dependencies again.
I'd be happy to make a clean install of harpy if you could advise on how to do that without risking incompatibilities. Thanks again! |
I'm still thinking about the first issues, but I think the issue with |
@gmkov there are some huge internal API changes in that branch I was working with. I'm trying to merge it into
# get the git repo
> git clone --depth 1 https://github.com/pdimens/harpy.git
> cd harpy
# switch to dev branch
> git checkout dev
# create conda/mamba env
> mamba env create --name harpydev --file resources/harpyenv.yaml
> mamba activate harpydev
# install everything
bash resources/buildlocal.sh |
@gmkov Your mamba installation was not done the recommended way. See the official mamba docs on how to install it correctly via miniforge or micromamba: https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html#fresh-install-recommended Try starting from there and see if it still creates issues. I'm almost ready with the developmental version. |
Hi. Yes, I (and someone else) tried to install mamba that way and kept getting errors (sorry, I didn't keep track of those), hence my attempt with conda-forge::mamba.
I have uninstalled the conda-forge mamba by running:
Then I updated conda, and now have conda 24.3.0 (although I think it was already up to date):
Then I ran this to install harpy with conda: ` (base) [mgm49@login-n-1 vcf]$ conda create -n harpy-conda -c bioconda -c conda-forge harpy ` And the following errors come up, something to do with libmamba:
I can try with harpy dev as explained above. |
Woah, that's a new one. Can you undo the conda strict priority that you implemented before? Snakemake likes to complain about that but it honestly hasn't been an issue so I never actually set it to strict. Also, remove |
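For reference, undoing the strict channel priority can be sketched as below. The `conda config` subcommands are standard conda; the wrapper function `relax_channel_priority` is only a hypothetical convenience.

```shell
#!/usr/bin/env bash
# Sketch: inspect the current conda channel settings, then switch the
# channel priority back to "flexible" (conda's default).
relax_channel_priority() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    # Show what is currently configured before changing anything.
    conda config --show channel_priority
    conda config --show channels
    # Revert strict priority to the default behaviour.
    conda config --set channel_priority flexible
}

# Usage:
#   relax_channel_priority
```

The `--show channels` output also reveals channels that come from config files other than `~/.condarc`.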
ok changed strict priority again, updated all:
so now my condarc looks like this:
Weirdly, the anaconda channel is NOT in ~/.condarc. Now I run the conda install again:
and things installed nicely!
TEST1: impute
Now I re-ran stitch in the new harpy-conda env:
but sadly I still get "IsADirectoryError: [Errno 21] Is a directory:"
TEST2: preflight check
new problems
It has realised there is no mamba and is not happy about it. Could I try to install it in the way described in the error and reinstall? conda install -n base -c conda-forge mamba |
That's progress, of a sort!
nice harpy impute --threads 10 --parameters stitch.params \
-s "--conda-frontend conda" \
--vcf SNP/mpileup/variants.raw.bcf \
bams 2> log.impute & |
Hi, unfortunately the same IsADirectoryError appears, even though I delete the whole Impute/ dir before running harpy, just in case
but when running the preflight with conda-frontend it runs fine, although it complains about channel priorities (which I reverted to flexible) and fails at the end because flexdashboard is not installed
preflight bams error:
any other ideas? thanks |
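Errno 21 is what the operating system reports when a directory is opened as though it were a file. The sketch below does not identify the cause in this thread. It only rules out the simplest case, where one of the paths on the command line is not the kind of object expected. The function name `check_impute_inputs` is hypothetical, and the argument roles are taken from the impute command shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: confirm the VCF/BCF and parameter paths are regular files and the
# BAM location is a directory before launching the workflow.
check_impute_inputs() {
    local vcf="$1" params="$2" bam_dir="$3" status=0

    if [ ! -f "$vcf" ]; then
        echo "not a regular file: $vcf" >&2
        status=1
    fi
    if [ ! -f "$params" ]; then
        echo "not a regular file: $params" >&2
        status=1
    fi
    if [ ! -d "$bam_dir" ]; then
        echo "not a directory: $bam_dir" >&2
        status=1
    fi
    return "$status"
}

# Usage (paths from the thread):
#   check_impute_inputs SNP/mpileup/variants.raw.bcf stitch.params bams
```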
This is so very confusing, honestly. For starters, and this may not be relevant, the Second, I merged my branch into You can also try to install I need to caveat that the |
Moving the preflight bams folder input to the end of the command did not fix the issue with flexdashboard not being installed. But if I install it, it works fine. Sadly I can't seem to get into the dev branch following the steps above
Anything I can change? Thank you
EDIT 1: I removed the --depth 1 and it seems to have worked??
I had to change a few things from the commands you sent to install dev, so I put them all below:
Going to test imputation |
Sorry about that, the |
@gmkov Please let me know if the IsADirectoryError persists. |
Hi, I'm back in action! Sorry, I had to change priorities. I also had to reinstall miniconda3 in a completely new location (and get rid of the old install), as our home directory on the cluster is only 50 GB and with all these environments I was quickly running out of quota. So now we have a fresh start to install harpy. I still don't want to use mamba, sorry; it messed things up last time and I'm scared.
Testing harpy (not dev)
Sadly hitting the IsADirectoryError again :( but everything else ran well, so I'm hoping harpy dev works.
Testing harpy dev
snps is going to take a while and I need to change computer. I will update this comment once the snp module of harpydev has run, and if possible will test impute. Thanks |
|
@gmkov The current version of
You're welcome to try the changes by doing:
> git pull
# make sure you are in the harpytest environment
> bash resources/buildlocal.sh |
FWIW, I never configure the conda --strict thing and just ignore the messages about that.
Also, the workflow/module should index the input BAM files if indices aren't present. |
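To see which inputs would need indexing, something along these lines may help. It is a sketch that assumes indices sit next to the BAM files as either `sample.bam.bai` or `sample.bai`; the helper name `list_unindexed_bams` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print every BAM in a folder that has no index file beside it.
list_unindexed_bams() {
    local dir="$1" bam
    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue   # glob matched nothing
        # Accept either naming convention for the index.
        if [ ! -e "$bam.bai" ] && [ ! -e "${bam%.bam}.bai" ]; then
            printf '%s\n' "$bam"
        fi
    done
}

# Usage (requires samtools on PATH):
#   list_unindexed_bams bams | while IFS= read -r bam; do
#       samtools index "$bam"
#   done
```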
Hello. OK, so I've done the following to carry on testing harpydev.
Note: I first tried the steps below while the channel priority was set to STRICT, and when running preflight it gave errors saying it can't find packages or create environments (e.g.
It is stuck in the "Downloading and installing remote packages." step, checked .harpy_envs and the last file was updated 6 minutes ago, so I think it's a dead end. I'm gonna leave it running on a screen and will update later thank you |
eventually it ran and showed a couple of errors:
and eventually hit the flexdashboard issue again
|
Sometimes on HPCs it takes unusually long to do the internal conda things. I'm not sure why that is; it might have to do with how many files are being changed at once and how the RAID array has to manage that over however many drives. |
The R package situation is annoying to say the least. I'm sorry about that and I'm still not sure why it's lying to you about package absence. I'm not crazy about this solution, but I'll write some |
@gmkov I added section to the R code in |
@gmkov as a heads up, the most recent commit to
|
Hi. Re: R packages. Well, the weird thing is that the fresh harpy bioconda installation is NOT throwing any errors about libraries, whereas harpy dev is :/ Sadly, I pulled the latest changes and harpydev still can't find flexdashboard. Now running imputation with harpydev:
harpydev imputation threw an error about indexes.
And now I realise that my harpy impute command was incorrect: I was adding the positions file as an extra parameter, when this is the MAIN file for stitch, and one that harpy presumably produces internally. harpy wants bcf/vcf input even though stitch only requires a positions file. I assume this is because harpy wants to curate the vcf to an extent (keep biallelic SNPs and sort the input VCF file) but does not check for duplicate positions/sample names. So instead now:
But I'm still getting this issue. What could be going on? I'm going to try other modules with the fresh bioconda harpy install |
I can also see that the harpydev Impute/workflow dir structure looks different to harpy Impute/workflow harpydev:
harpy:
|
There are a few things that might be going on here. Also, is the bcf file small enough that I could download it and do some testing on my end (assuming you have permission to share it)?
bcftools norm --rm-dup all variants.raw.bcf -o variants.raw.nodups.bcf
Make sure you specify |
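Before and after the `bcftools norm --rm-dup all` step, duplicated sites can be listed with a small filter like the sketch below. It assumes `bcftools query` is available to emit one CHROM/POS pair per record; the function name `duplicate_sites` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read tab-separated CHROM and POS lines on stdin and print each
# site that occurs more than once (one line per repeated site).
duplicate_sites() {
    LC_ALL=C sort | uniq -d
}

# Usage (requires bcftools on PATH; file names from the thread):
#   bcftools query -f '%CHROM\t%POS\n' SNP/mpileup/variants.raw.bcf | duplicate_sites
#   bcftools query -f '%CHROM\t%POS\n' variants.raw.nodups.bcf | duplicate_sites
```

An empty result from the second command suggests the deduplicated file no longer has repeated positions.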
@gmkov I think I found why the
git pull
bash resources/buildlocal.sh
nice harpy impute --threads 20 --parameters stitch.params \
--vcf SNP/mpileup/variants.raw.bcf \
--skipreports --snakemake "--conda-frontend conda" \
bams > log.impute & |
@gmkov the |
Good morning from Cambridge.
harpydev impute --skipreports works!
OK, so I pulled the harpydev git last night, reinstalled, and first ran impute without skipping reports after having manually installed flexdashboard (just in case), and a new error came up:
But then I ran impute with only 4 threads, k=3 or 4 in the stitch parameters, and --skipreports, and voila!!! It's running! It's nearly done (it ran on most contigs with all four models I set in the stitch parameters). This is wonderful.
Next steps
|
This is big, we've made progress! I know how to fix the R issue and will let you know when it's ready. That'll be in a few hours because it's only 5a here and I happened to check my phone when I woke up to get some water. I think you can bump up the thread count, though. |
It's so strange that the other R packages are installed (clearly STITCH works), but flexdashboard won't |
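One way to narrow this down is to ask the R that is actually on `PATH` inside a given environment whether it can load the package. This is a sketch: the function name `r_package_available` is hypothetical, and it tests whichever `Rscript` the current shell resolves, which may not be the one the workflow uses.

```shell
#!/usr/bin/env bash
# Sketch: return 0 if the Rscript on PATH can load the named package,
# 1 if it cannot, 2 if no Rscript is found at all.
r_package_available() {
    local pkg="$1"
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 2
    fi
    # requireNamespace() returns FALSE instead of erroring when a package is missing.
    Rscript -e "if (!requireNamespace('$pkg', quietly = TRUE)) quit(save = 'no', status = 1)"
}

# Usage, after activating the environment of interest:
#   command -v Rscript
#   r_package_available flexdashboard && echo "found" || echo "missing"
```

Comparing the `command -v Rscript` output between the environment where the manual install was done and the one the workflow activates would show whether they are the same R installation.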
@gmkov you can try the |
testing with preflight bams. silly question: can i run harpydev simultaneously on the same conda env for different jobs? in different directories. or will snakemake get confused... thanks! |
Harpy passes the |
Ok great, good to know -o can help with concurrent runs (want to divide some steps by populations AND it might help if running analyses by chromosome in the cluster for speed). And running harpy in different project directories shouldn't be an issue either right, because .snakemake is local to the project directory? I think the reason why I just had issues is because I pulled the harpydev version while some harpydev jobs were running - i knew this was potentially dangerous |
bad news. harpydev preflight bams ran perfectly until the end...
if i install it manually it's fine. sorry this is so complicated! |
and I tested preflight fastqs after having manually installed flexdashboard and got this, in case it's helpful (some issue with libxml, which I also had when trying to install highcharter / xml manually):
|
I added |
FWIW, a long-term solution will eventually be to containerize the environments and have the software distribution method be Singularity/Docker. |
arg sorry, same! ive pulled new changes after preflight fastq
|
I swapped the xml dependency as described here |
Hi new error when running preflight bams, slightly different:
when running align bwa, it runs well (produces bams) but also struggles to make reports:
bit weird will try again |
I'm at a loss with this. I've begun trying to figure out how to achieve this using apptainer, but it might take a while because my initial attempts have not been successful. |
Let me know if there is anything I can do to help. Is there any way we could go back to the harpy dev version in which, if I installed flexdashboard manually, it would generate the report? #68 (comment) - at some point yesterday (and before) this worked, and the reports are fantastic.
The good news is that harpy impute now runs when --skipreports is used, which was the topic of this issue. I haven't tested harpy impute with the harpydev version that did compile reports if I installed flexdashboard manually; that might work. If not, I'll try to compile them locally. If I have issues with other modules, I'll open new issues. Do you plan on updating bioconda harpy with harpydev? Thanks |
Thanks for sticking this out. Harpy dev is going to be the next release but it's not ready yet. Since I've already begun it, I'm going to continue working on Singularity env management and try to incorporate that into the next release. It may take a week or few bc I've fallen ill and probably won't work until my health improves. |
@gmkov if you're feeling dangerous, you can try the
conda/mamba install -c conda-forge apptainer |
@gmkov it took a bit to augment the test suite to accommodate the container situation, but the tests are set up now and all modules except |
@gmkov all tests have passed, the |
good evening! i was NOT feeling dangerous or brave this morning, sorry. but this sounds great! so phase should work on the singularity branch? will try tomorrow, and will try preflight bam with reports :) thanks |
So long as you add |
testing harpy dev singularity. created a new environment so that i can carry on using the old dev
now to test preflight bams
unfortunately, hit an error:
It says "Preflight/bam/workflow/input/CAM046072.bam" doesn't exist, but it definitely does! And this same set of files was working before with the normal dev version. Thanks |
You don't need to add a conda frontend as the container approach doesn't use your system's conda. Try it without the |
tried with
|
What the actual heck is going on here; your system never ceases to impress me. I'm kind of out of ideas. The only thing I can think of is whether your In the event you're using symlinks, I've made an adjustment to |
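To check the symlink possibility raised above, a sketch like the following classifies each BAM path as a regular file, a working symlink, or a broken symlink. The helper name `report_bam_links` is hypothetical. A link that resolves on the host can still dangle inside a container if its target is not visible there, which this check cannot detect.

```shell
#!/usr/bin/env bash
# Sketch: label each *.bam entry in a folder as file, symlink, or broken.
report_bam_links() {
    local dir="$1" bam
    for bam in "$dir"/*.bam; do
        if [ -L "$bam" ]; then
            # -e follows the link, so it fails when the target is missing.
            if [ -e "$bam" ]; then
                printf 'symlink\t%s\n' "$bam"
            else
                printf 'broken\t%s\n' "$bam"
            fi
        elif [ -f "$bam" ]; then
            printf 'file\t%s\n' "$bam"
        fi
    done
}

# Usage (folder names from the thread):
#   report_bam_links bams
#   report_bam_links Preflight/bam/workflow/input
```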
@gmkov FWIW, if you're still having issues, there's a new release of harpy that incorporates the containerization of things (or not, you can toggle it with |
@gmkov there's been a lot of internal work and several releases since we last tried to troubleshoot this. I welcome continuing this discussion and troubleshooting if/when you have the bandwidth for it :) |
Describe the bug
Thank you for this great resource. I realise it is still under development, so I'm unsure whether to expect modules to run smoothly.
After some initial teething issues with sample names in my own vcf file created with bcftools mpileup outside harpy, I ran into a cryptic conda issue. So I went a step back, and used the harpy snp mpileup module to obtain a bcf, which finished with an error (i think just unable to compile the bcf html report) but the files looked ok within SNP/mpileup/. So then I tried imputation with harpy impute and using this file as input.
my command is:
my stitch param file is:
This produces the same cryptic conda error as before, which I paste below.
I also ran
harpy preflight bam bams
and the whole snakemake workflow builds a DAG correctly, seems to create/activate conda environments fine, and runs correctly (after I manually ran install.packages("flexdashboard"); the missing package was the error initially). So the preflight checks with the bam files were OK.
Harpy Version
0.9.1
File that triggers the error (if applicable)
No response
Harpy error log
Before submitting