
Error running NeoFuse - cannot find models_class1_pan/models.combined/manifest.csv #25

Closed
mantczakaus opened this issue Jun 22, 2023 · 11 comments
Labels
bug Something isn't working

Comments

@mantczakaus

Hi,
Thank you for this amazing pipeline! I am currently running it on WES and RNA-seq data and I'm having trouble running NeoFuse.
The content of command.log is as follows:

INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred
INFO:    fuse: warning: library too old, some operations may not work
[-------------------------------- [NeoFuse] --------------------------------]

[NeoFuse]  Paired End (PE) Reads detected: commencing processing
[NeoFuse]  Processing files TESLA_3_1.fastq.gz - TESLA_3_2.fastq.gz
[NeoFuse]  STAR Run started at: 16:13:03
[NeoFuse]  Arriba Run started at: 16:13:03
[NeoFuse]  Parsing custom HLA list: 18:08:02
[NeoFuse]  featureCounts Run started at: 18:08:02
[NeoFuse]  Converting Raw Counts to TPM and FPKM: 18:09:38
[NeoFuse]  Searching for MHC I peptides of length 8 9 10 11 : 18:09:39
[NeoFuse]  Searching for MHC II peptides of length 15 16 17 18 19 20 21 22 23 24 25 : 18:09:39
[NeoFuse]  MHCFlurry Run started at: 18:09:39
An error occured while creating the MHCFlurry temp files, check ./patient1/LOGS/patient1_MHCI_final.log for more details

The content of ./patient1/LOGS/patient1_MHCI_final.log is:

Traceback (most recent call last):
  File "/usr/local/bin/source/build_temp.py", line 122, in <module>
    final_out(inFile, outFile)
  File "/usr/local/bin/source/build_temp.py", line 61, in final_out
    with open(assoc_file) as csv_file:
FileNotFoundError: [Errno 2] No such file or directory: './patient1/NeoFuse/tmp/MHC_I/patient1_8_NUP133_ABCB10_1_8.tsv'

I also checked the contents of the patient1_X_MHCFlurry.log files. They all say:

Traceback (most recent call last):
  File "/usr/local/bin//mhcflurry-predict", line 8, in <module>
    sys.exit(run())
  File "/usr/local/lib/python3.6/dist-packages/mhcflurry/predict_command.py", line 207, in run
    affinity_predictor = Class1AffinityPredictor.load(models_dir)
  File "/usr/local/lib/python3.6/dist-packages/mhcflurry/class1_affinity_predictor.py", line 480, in load
    manifest_df = pandas.read_csv(manifest_path, nrows=max_models)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 454, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 948, in __init__
    self._make_engine(self.engine)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 1180, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/usr/local/lib/python3.6/dist-packages/pandas/io/parsers.py", line 2010, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: '/home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/manifest.csv'

To debug this, I downloaded the current NeoFuse container from https://github.com/icbi-lab/NeoFuse, but the run with it resulted in the same errors. Is it possible that the default mhcflurry models changed? I'm not that familiar with mhcflurry or NeoFuse - maybe you could point me in the right direction?

@abyssum
Member

abyssum commented Jun 22, 2023

Hello @mantczakaus,

Thank you for using nextNEOpi.

This is weird behavior... can you pull the image locally with something like:
wget --no-check-certificate https://apps-01.i-med.ac.at/images/singularity/NeoFuse_dev_0d1d4169.sif

then run:
singularity exec NeoFuse_dev_0d1d4169.sif ls /home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/

and paste the results?

@abyssum abyssum added the bug Something isn't working label Jun 22, 2023
@riederd
Member

riederd commented Jun 22, 2023

Moreover, can you also send the contents of the work dir in which the pipeline failed.

@mantczakaus
Author

Thank you @abyssum for such a prompt response! I ran the commands you asked for as an interactive job on my HPC. The result is similar - it cannot see the folder:

/bin/ls: cannot access '/home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/': No such file or directory
I also tried this: singularity exec NeoFuse_dev_0d1d4169.sif ls /home/neofuse and got the same thing: '/bin/ls: cannot access '/home/neofuse': No such file or directory'. Could there be some extra Singularity options that I need to run it with? All the other containers that nextNEOpi used up to NeoFuse worked fine, though.

@mantczakaus
Author

Moreover, can you also send the contents of the work dir in which the pipeline failed.

Thank you @riederd for coming back to me! Here are all the run and log files from that work folder.
work_NeoFuse.zip

@riederd
Member

riederd commented Jun 23, 2023

Can you run the following commands and send the output?

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "ls -la /home/neofuse"

and

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "mount"

Thanks

@mantczakaus
Author

Can you run the following commands and send the output?

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "ls -la /home/neofuse"

and

singularity exec -B /QRISdata/Q5952/data/tesla-phase1/melanoma_1/FASTQ -B /scratch/project_mnt/S0091/mantczak --no-home -H /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/pipelines/nextNEOpi/assets -B /scratch/project_mnt/S0091/mantczak/.nextflow/NXF_TEMP -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources -B /scratch/project_mnt/S0091/mantczak/soft/hlahd.1.7.0 -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/iedb:/opt/iedb -B /scratch/project_mnt/S0091/mantczak/data/nextNEOpi_1.3_resources/databases/mhcflurry_data:/opt/mhcflurry_data /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash  -c  "mount"

Thanks

Hi @riederd

The first command gave the following output: ls: cannot access '/home/neofuse': No such file or directory
The output of the second command is attached:
mount.txt

Thanks!

@riederd
Member

riederd commented Jun 23, 2023

Thanks,

can you try again, but with the option --containall added after --no-home?
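
For example, the first command from above would become something like the following (a sketch only; the -B and -H arguments are abbreviated with <...> placeholders for readability, keep exactly the same binds and home directory you used before):

singularity exec -B <same bind paths as before> --no-home --containall -H <same NXF_TEMP dir as before> <remaining -B binds as before> /scratch/project_mnt/S0091/mantczak/tests/nextneopi_validation/work/singularity/apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif /bin/bash -c "ls -la /home/neofuse"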

@mantczakaus
Author

Thanks,

can you try again, but with the option --containall added after --no-home?

For the first command:
contain_ls.txt
For the second command:
contain_mount.txt
I also ran the following command in the folder with the container downloaded by the pipeline (work/singularity): singularity exec --containall apps-01.i-med.ac.at-images-singularity-NeoFuse_dev_0d1d4169.sif ls -la /home/neofuse/.local/share/mhcflurry/4/2.0.0/models_class1_pan/models.combined/ and it gave me the following list of files:

-rw-r--r-- 1 uqmantcz qris-uq   760514 Jun 10  2020 allele_sequences.csv
-rw-r--r-- 1 uqmantcz qris-uq 59372077 Jun 10  2020 frequency_matrices.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq      102 Jun 10  2020 info.txt
-rw-r--r-- 1 uqmantcz qris-uq  1012279 Jun 10  2020 length_distributions.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq   115260 Jun 10  2020 manifest.csv
-rw-r--r-- 1 uqmantcz qris-uq  4483361 Jun 10  2020 model_selection_data.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq   215596 Jun 10  2020 model_selection_summary.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq 83090609 Jun 10  2020 percent_ranks.csv
-rw-r--r-- 1 uqmantcz qris-uq  4488832 Jun 10  2020 train_data.csv.bz2
-rw-r--r-- 1 uqmantcz qris-uq 11261512 Jun 10  2020 weights_PAN-CLASS1-1-05734e73adff1f25.npz
-rw-r--r-- 1 uqmantcz qris-uq 11261512 Jun 10  2020 weights_PAN-CLASS1-1-0c7c1570118fd907.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-24d9082b2c8d7a60.npz
-rw-r--r-- 1 uqmantcz qris-uq  4582984 Jun 10  2020 weights_PAN-CLASS1-1-3ed9fb2d2dcc9803.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-8475f7a9fb788e27.npz
-rw-r--r-- 1 uqmantcz qris-uq  5821000 Jun 10  2020 weights_PAN-CLASS1-1-9e049de50b72dc23.npz
-rw-r--r-- 1 uqmantcz qris-uq  7396364 Jun 10  2020 weights_PAN-CLASS1-1-9f7dfdd0c2763c42.npz
-rw-r--r-- 1 uqmantcz qris-uq  4845580 Jun 10  2020 weights_PAN-CLASS1-1-b17c8628ffc4b80d.npz
-rw-r--r-- 1 uqmantcz qris-uq  9160264 Jun 10  2020 weights_PAN-CLASS1-1-ce288787fc2f6872.npz
-rw-r--r-- 1 uqmantcz qris-uq  7396364 Jun 10  2020 weights_PAN-CLASS1-1-e33438f875ba4af2.npz

Thank you!

@riederd
Member

riederd commented Jun 23, 2023

Great, so I'd suggest changing

runOptions = "--no-home" + " -H " + params.singularityTmpMount + " -B " + params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount + " -B " + params.databases.IEDB_dir + ":/opt/iedb" + " -B " + params.databases.MHCFLURRY_dir + ":/opt/mhcflurry_data"

to:

runOptions =  "--no-home --containall" + " -H " + params.singularityTmpMount + " -B " +  params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount + " -B " + params.databases.IEDB_dir + ":/opt/iedb" + " -B " + params.databases.MHCFLURRY_dir + ":/opt/mhcflurry_data"

I'm not sure if you would hit an issue elsewhere with this change, but it is worth trying. Let us know if it works; we might change it in the next version then.
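
For context, a minimal sketch of how the amended line might sit inside the singularity scope of the Nextflow config (the enclosing scope and the enabled option shown here are assumptions for illustration; only the runOptions value is the actual suggested change):

singularity {
    enabled = true
    // --containall added next to --no-home so that host binds no longer mask the container's /home/neofuse (as observed above)
    runOptions = "--no-home --containall" + " -H " + params.singularityTmpMount + " -B " + params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount + " -B " + params.databases.IEDB_dir + ":/opt/iedb" + " -B " + params.databases.MHCFLURRY_dir + ":/opt/mhcflurry_data"
}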

@mantczakaus
Author

Thank you! I've just launched the pipeline with the changed config file. I'll let you know how it goes.

@mantczakaus
Author

It worked - thank you!
