Too few assignation of fragments to transcripts in the index #1111

Closed
colin893 opened this issue Nov 12, 2023 · 3 comments
@colin893

Description of the bug

I am using the rnaseq pipeline to analyze some single-cell samples. I successfully completed the analysis for 2 other experiments, but one particular cell throws an error related to the number of assigned fragments: [warning] salmon was only able to assign 3 fragments to transcripts in the index, but the minimum number of required assigned fragments (--minAssignedFrags) was 10. This could be indicative of a mismatch between the reference and sample, or a very bad sample. You can change the --minAssignedFrags parameter to force salmon to quantify with fewer assigned fragments (must have at least 1).

I tried passing the parameter on the command line: --extra_salmon_quant_args "--minAssignedFrags 1"

However, I still get the error and am not sure how to proceed.

Command used and terminal output

Command:

./nextflow run nf-core/rnaseq --input Samples/Exp1/sampleSheet.csv --outdir ../ProcessedData/Exp1/ --fasta ../Ref/genomer103pEXT002.fa --gtf ../Ref/genesr103pEXT002.gtf -profile docker --max_memory '60.GB' --star_index /media/zddm2021/T7/FlashSeq/genome/index/star/ --trimmer trimgalore --rsem_index /media/zddm2021/T7/FlashSeq/genome/rsem/ --salmon_index /media/zddm2021/T7/FlashSeq/genome/index/salmon/ --extra_salmon_quant_args "--minAssignedFrags 1"


Logs:

Nov-11 15:31:54.865 [Task submitter] INFO  nextflow.Session - [3a/48af72] Submitted process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_G12)
Nov-11 15:31:54.866 [Task monitor] DEBUG nextflow.processor.TaskProcessor - Handling unexpected condition for
  task: name=NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_H1); work-dir=/mnt/d1d54bcf-dec1-4d25-ae36-3647835a7fd4/FlashSeq/Scripts/work/94/7f8cea7b810e9a2a7633e2a22059c2
  error [nextflow.exception.ProcessFailedException]: Process `NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_H1)` terminated with an error exit status (1)
Nov-11 15:31:54.886 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_H1)'

Caused by:
  Process `NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_H1)` terminated with an error exit status (1)

Command executed:

  salmon quant \
      --geneMap genesr103pEXT002.gtf \
      --threads 6 \
      --libType=A \
      --index salmon \
      -1 22_11_15_GFP-3_H1.subsampled_R1.fastq.gz -2 22_11_15_GFP-3_H1.subsampled_R2.fastq.gz \
      --skipQuant \
      -o 22_11_15_GFP-3_H1
  
  if [ -f 22_11_15_GFP-3_H1/aux_info/meta_info.json ]; then
      cp 22_11_15_GFP-3_H1/aux_info/meta_info.json "22_11_15_GFP-3_H1_meta_info.json"
  fi
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT":
      salmon: $(echo $(salmon --version) | sed -e "s/salmon //g")
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  [2023-11-11 14:31:15.016] [jointLog] [info] Usage of --validateMappings implies use of minScoreFraction. Since not explicitly specified, it is being set to 0.65
  [2023-11-11 14:31:15.016] [jointLog] [info] Setting consensusSlack to selective-alignment default of 0.35.
  [2023-11-11 14:31:15.016] [jointLog] [info] parsing read library format
  [2023-11-11 14:31:15.016] [jointLog] [info] There is 1 library.
  [2023-11-11 14:31:15.017] [jointLog] [info] Loading pufferfish index
  [2023-11-11 14:31:15.017] [jointLog] [info] Loading dense pufferfish index.
  -----------------------------------------
  | Loading contig table | Time = 25.809 s
  -----------------------------------------
  size = 24956290
  -----------------------------------------
  | Loading contig offsets | Time = 75.55 ms
  -----------------------------------------
  -----------------------------------------
  | Loading reference lengths | Time = 220.7 us
  -----------------------------------------
  -----------------------------------------
  | Loading mphf table | Time = 439.05 ms
  -----------------------------------------
  size = 1831963234
  Number of ones: 24956289
  Number of ones per inventory item: 512
  Inventory entries filled: 48743
  -----------------------------------------
  | Loading contig boundaries | Time = 6.7553 s
  -----------------------------------------
  size = 1831963234
  -----------------------------------------
  | Loading sequence | Time = 396.97 ms
  -----------------------------------------
  size = 1083274564
  -----------------------------------------
  | Loading positions | Time = 3.5343 s
  -----------------------------------------
  size = 1488550916
  -----------------------------------------
  | Loading reference sequence | Time = 316.42 ms
  -----------------------------------------
  -----------------------------------------
  | Loading reference accumulative lengths | Time = 389.27 us
  -----------------------------------------
  
  
  
  
  [2023-11-11 14:31:52.345] [jointLog] [info] done
  [2023-11-11 14:31:52.426] [jointLog] [info] Index contained 55267 targets
  [2023-11-11 14:31:52.447] [jointLog] [info] Number of decoys : 994
  [2023-11-11 14:31:52.447] [jointLog] [info] First decoy index : 54264 
  [2023-11-11 14:31:52.739] [jointLog] [warning] salmon was only able to assign 3 fragments to transcripts in the index, but the minimum number of required assigned fragments (--minAssignedFrags) was 10. This could be indicative of a mismatch between the reference and sample, or a very bad sample.  You can change the --minAssignedFrags parameter to force salmon to quantify with fewer assigned fragments (must have at least 1).

Work dir:
  /mnt/d1d54bcf-dec1-4d25-ae36-3647835a7fd4/FlashSeq/Scripts/work/94/7f8cea7b810e9a2a7633e2a22059c2

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
Nov-11 15:31:54.894 [Task monitor] INFO  nextflow.Session - Execution cancelled -- Finishing pending tasks before exit
Nov-11 15:31:54.897 [main] DEBUG nextflow.Session - Session await > all processes finished
Nov-11 15:31:54.908 [Actor Thread 19] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Nov-11 15:31:54.909 [Actor Thread 8] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Nov-11 15:31:54.908 [Actor Thread 12] DEBUG nextflow.file.SortFileCollector - FileCollector temp dir not removed: null
Nov-11 15:31:54.918 [Actor Thread 25] DEBUG nextflow.sort.BigSort - Sort completed -- entries: 7; slices: 1; internal sort time: 0.008 s; external sort time: 0.002 s; total time: 0.01 s
Nov-11 15:31:54.978 [Actor Thread 25] DEBUG nextflow.file.FileCollector - Saved collect-files list to: /mnt/d1d54bcf-dec1-4d25-ae36-3647835a7fd4/FlashSeq/Scripts/work/collect-file/ab0427e85927608f98b84fd6188e9c70
Nov-11 15:31:54.981 [Actor Thread 25] DEBUG nextflow.file.FileCollector - Deleting file collector temp dir: /tmp/nxf-6395931895396976487
Nov-11 15:32:39.914 [Task monitor] DEBUG n.processor.TaskPollingMonitor - Task completed > TaskHandler[id: 176; name: NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT (22_11_15_GFP-3_G12); status: COMPLETED; exit: 0; error: -; workDir: /mnt/d1d54bcf-dec1-4d25-ae36-3647835a7fd4/FlashSeq/Scripts/work/3a/48af7284c6bbba72f3ce1b37a8b1d8]
Nov-11 15:32:39.921 [Task monitor] DEBUG n.processor.TaskPollingMonitor - <<< barrier arrives (monitor: local) - terminating tasks monitor poll loop
Nov-11 15:32:39.921 [main] DEBUG nextflow.Session - Session await > all barriers passed
Nov-11 15:32:39.928 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'PublishDir' shutdown completed (hard=false)
Nov-11 15:32:39.932 [main] INFO  nextflow.Nextflow - -[nf-core/rnaseq] Pipeline completed with errors-
Nov-11 15:32:39.938 [main] DEBUG n.trace.WorkflowStatsObserver - Workflow completed > WorkflowStats[succeededCount=175; failedCount=1; ignoredCount=0; cachedCount=0; pendingCount=167; submittedCount=0; runningCount=0; retriesCount=0; abortedCount=0; succeedDuration=6h 6m 12s; failedDuration=4m 5s; cachedDuration=0ms;loadCpus=0; loadMemory=0; peakRunning=10; peakCpus=12; peakMemory=60 GB; ]
Nov-11 15:32:39.938 [main] DEBUG nextflow.trace.TraceFileObserver - Workflow completed -- saving trace file
Nov-11 15:32:39.941 [main] DEBUG nextflow.trace.ReportObserver - Workflow completed -- rendering execution report
Nov-11 15:32:41.077 [main] DEBUG nextflow.trace.TimelineObserver - Workflow completed -- rendering execution timeline
Nov-11 15:32:41.339 [main] DEBUG nextflow.cache.CacheDB - Closing CacheDB done
Nov-11 15:32:41.381 [main] DEBUG nextflow.util.ThreadPoolManager - Thread pool 'FileTransfer' shutdown completed (hard=false)
Nov-11 15:32:41.383 [main] DEBUG nextflow.script.ScriptRunner - > Execution complete -- Goodbye

Relevant files

No response

System information

Nextflow version 23.10.0
Docker
Executed on local PC
Linux Ubuntu
nf-core/rnaseq v3.12.0-g3bec233

@colin893 added the bug label (Something isn't working) Nov 12, 2023
@mahesh-panchal
Member

mahesh-panchal commented Dec 8, 2023

Someone I know just encountered this issue too. Digging around the code, it seems that the FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT process doesn't use params.extra_salmon_quant_args and is hard-coded to pass only --skipQuant (in the file conf/modules.config), so the value given to --extra_salmon_quant_args never reaches this step. To work around this yourself, you can supply a custom config via the -c option of nextflow run, with the following contents.

salmon_quant.config:

process {
    withName: '.*:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT' {
        ext.args   = '--skipQuant --minAssignedFrags 1'
    }
}

and then

nextflow run nf-core/rnaseq -c salmon_quant.config ...
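
To check that the override was actually picked up, one option is to look at the rendered task script, which Nextflow writes to `.command.sh` in each task's work directory. The following is only a minimal sketch: the helper name `has_salmon_flag` is hypothetical, and `$WORK_DIR` stands for the work dir reported for the SALMON_QUANT task in your own log.

```shell
# Hypothetical helper (not part of nf-core/rnaseq or Nextflow):
# succeeds if the task script in work dir $1 mentions the flag $2.
has_salmon_flag() {
    workdir="$1"
    flag="$2"
    # -F matches the flag as a literal string; `--` stops grep from
    # interpreting a flag such as --minAssignedFrags as its own option.
    grep -qF -- "$flag" "$workdir/.command.sh"
}

# Usage, assuming WORK_DIR is set to the task's work dir:
# has_salmon_flag "$WORK_DIR" --minAssignedFrags && echo "override applied"
```

If the flag is missing from `.command.sh`, the `withName` selector probably did not match the process name shown in the log.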

@drpatelh added this to the 3.13.3 milestone Jan 3, 2024
@pinin4fjords
Member

I think we may also need to make some changes to help with this issue, since I came across the same thing while developing the riboseq workflow. Specifically:

  • You may have high adapter content, and the workflow currently trims AFTER strandedness inference. We will reorder the steps either here or in a new factored-out subworkflow.
  • You may have reads too short for the kmer size used (we have allowed specification of kmer size in this recent PR).
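
A rough way to test the second possibility is to count how many reads are shorter than the index k-mer size. The sketch below assumes a gzipped FASTQ with exactly four lines per record; the function name `count_short_reads` is hypothetical, and k must be whatever the index was built with (salmon's default is 31).

```shell
# Hypothetical helper: prints "<reads shorter than k> <total reads>"
# for a gzipped FASTQ file ($1) and a k-mer size ($2).
# Assumes 4 lines per record, so the sequence is line 2 of each group.
count_short_reads() {
    fastq_gz="$1"
    k="$2"
    gzip -cd "$fastq_gz" | awk -v k="$k" '
        NR % 4 == 2 {
            total++
            if (length($0) < k) n_short++
        }
        END { printf "%d %d\n", n_short, total }
    '
}

# Usage, e.g. on one of the subsampled files from the failing task:
# count_short_reads 22_11_15_GFP-3_H1.subsampled_R1.fastq.gz 31
```

If most reads fall below k, they cannot be matched against the index at all, which would be consistent with only a handful of fragments being assigned.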

@pinin4fjords
Member

Fixed (I believe) in #1144 and #1154
