Quantification Fails #2

Open
abhideeplife opened this issue May 27, 2022 · 3 comments

@abhideeplife

Hi,

Thank you for the tool. I am using it for isoform quantification. As a test, I am running it on one 10x v2 sample.

The command I am using:

scasa --fastq ERX3806131/4861STDY7462259_R1.fastq.gz,ERX3806131/4861STDY7462259_R2.fastq.gz --ref $refPath --nthreads 8 --out Scasa_out

The processing messages I get:

##############################################################

SCASA V1.0.0

SINGLE CELL TRANSCRIPT QUANTIFICATION TOOL

Version Date: 2021-04-07

FOR ANY ISSUES, CONTACT: LU.PAN@KI.SE

https://github.com/eudoraleer/scasa/

##############################################################

mkdir: cannot create directory ‘Scasa_out/SCASA_My_Project_20220527085718/’: File exists

Preparing for alignment..
Indexing reference..
Directory Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/ already exists. Writing into existing directory..
Version Info: ### PLEASE UPGRADE SALMON ###

A newer version of salmon with important bug fixes and improvements is available.

The newest version, available at https://github.com/COMBINE-lab/salmon/releases
contains new features, improvements, and bug fixes; please upgrade at your
earliest convenience.

Sign up for the salmon mailing list to hear about new versions, features and updates at:
https://oceangenomics.com/subscribe
###
[2022-05-27 08:57:18.759] [jLog] [warning] The salmon index is being built without any decoy sequences. It is recommended that decoy sequence (either computed auxiliary decoy sequence or the genome of the organism) be provided during indexing. Further details can be found at https://salmon.readthedocs.io/en/latest/salmon.html#preparing-transcriptome-indices-mapping-based-mode.
[2022-05-27 08:57:18.759] [jLog] [info] building index
out : Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/
[2022-05-27 08:57:18.759] [puff::index::jointLog] [info] Running fixFasta

[Step 1 of 4] : counting k-mers

[2022-05-27 08:57:26.609] [puff::index::jointLog] [warning] Removed 237 transcripts that were sequence duplicates of indexed transcripts.
[2022-05-27 08:57:26.609] [puff::index::jointLog] [warning] If you wish to retain duplicate transcripts, please use the --keepDuplicates flag
[2022-05-27 08:57:26.610] [puff::index::jointLog] [info] Replaced 5 non-ATCG nucleotides
[2022-05-27 08:57:26.610] [puff::index::jointLog] [info] Clipped poly-A tails from 12501 transcripts
wrote 70629 cleaned references
[2022-05-27 08:57:27.256] [puff::index::jointLog] [info] Filter size not provided; estimating from number of distinct k-mers
[2022-05-27 08:57:30.321] [puff::index::jointLog] [info] ntHll estimated 84081876 distinct k-mers, setting filter size to 2^31
Threads = 2
Vertex length = 31
Hash functions = 5
Filter size = 2147483648
Capacity = 2
Files:
Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX/ref_k31_fixed.fa

Round 0, 0:2147483648
Pass Filling Filtering
1 25 69
2 4 0
True junctions count = 266516
False junctions count = 404633
Hash table size = 671149
Candidate marks count = 4093104

Reallocating bifurcations time: 0
True marks count: 2954071
Edges construction time: 4

Distinct junctions = 266516

allowedIn: 12
Max Junction ID: 308126
seen.size():2465017 kmerInfo.size():308127
approximateContigTotalLength: 65012593
counters for complex kmers:
(prec>1 & succ>1)=25336 | (succ>1 & isStart)=59 | (prec>1 & isEnd)=73 | (isStart & isEnd)=10
contig count: 417773 element count: 96576272 complex nodes: 25478

# of ones in rank vector: 417772

[2022-05-27 08:59:28.244] [puff::index::jointLog] [info] Starting the Pufferfish indexing by reading the GFA binary file.
[2022-05-27 08:59:28.244] [puff::index::jointLog] [info] Setting the index/BinaryGfa directory Scasa_out/SCASA_My_Project_20220527085718/0PRESETS//REF_INDEX
size = 96576272

| Loading contigs | Time = 9.0059 ms

size = 96576272

| Loading contig boundaries | Time = 5.0433 ms

Number of ones: 417772
Number of ones per inventory item: 512
Inventory entries filled: 816
417772
[2022-05-27 08:59:28.456] [puff::index::jointLog] [info] Done wrapping the rank vector with a rank9sel structure.
[2022-05-27 08:59:28.460] [puff::index::jointLog] [info] contig count for validation: 417772
[2022-05-27 08:59:28.648] [puff::index::jointLog] [info] Total # of Contigs : 417772
[2022-05-27 08:59:28.648] [puff::index::jointLog] [info] Total # of numerical Contigs : 417772
[2022-05-27 08:59:28.676] [puff::index::jointLog] [info] Total # of contig vec entries: 3035777
[2022-05-27 08:59:28.676] [puff::index::jointLog] [info] bits per offset entry 22
[2022-05-27 08:59:28.787] [puff::index::jointLog] [info] Done constructing the contig vector. 417773
[2022-05-27 08:59:28.924] [puff::index::jointLog] [info] # segments = 417772
[2022-05-27 08:59:28.924] [puff::index::jointLog] [info] total length = 96576272
[2022-05-27 08:59:28.957] [puff::index::jointLog] [info] Reading the reference files ...
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] positional integer width = 27
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] seqSize = 96576272
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] rankSize = 96576272
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] edgeVecSize = 0
[2022-05-27 08:59:29.688] [puff::index::jointLog] [info] num keys = 84043112
for info, total work write each : 2.331 total work inram from level 3 : 4.322 total work raw : 25.000
[Building BooPHF] 100 % elapsed: 0 min 10 sec remaining: 0 min 0 sec
Bitarray 440364608 bits (100.00 %) (array + ranks )
final hash 0 bits (0.00 %) (nb in final hash 0)
[2022-05-27 08:59:39.507] [puff::index::jointLog] [info] mphf size = 52.4956 MB
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk size = 48288136
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk 0 = [0, 48288136)
[2022-05-27 08:59:39.580] [puff::index::jointLog] [info] chunk 1 = [48288136, 96576242)
[2022-05-27 08:59:52.325] [puff::index::jointLog] [info] finished populating pos vector
[2022-05-27 08:59:52.325] [puff::index::jointLog] [info] writing index components
[2022-05-27 08:59:52.728] [puff::index::jointLog] [info] finished writing dense pufferfish index
[2022-05-27 08:59:52.766] [jLog] [info] done building index
Finnished indexing reference..
Begins pseudo-alignment..
nohup: redirecting stderr to stdout

The error below appears as soon as the quantification step starts:

Congratulations! Pseudo-alignment has completed in 1590 seconds!
Scasa quantification has started..
Begin Scasa quantification for sample 4861STDY7462259..
Loading required package: iterators
Loading required package: parallel
Error in { : task 1 failed - "NA/NaN argument"
Calls: %dopar% ->
Execution halted
Loading required package: iterators
Loading required package: parallel
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Calls: load -> readChar
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file '/home/jupyter/Scasa_out/SCASA_My_Project_20220527085718/2QUANT/4861STDY7462259_quant/Sample_eqClass.RData', probable reason 'No such file or directory'
Execution halted
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
Calls: load -> readChar
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file 'Scasa_out/SCASA_My_Project_20220527085718/2QUANT//4861STDY7462259_quant//scasa_isoform_expression.RData', probable reason 'No such file or directory'
Execution halted
Congratulations! Scasa single cell RNA-Seq transcript quantification has completed in 30 seconds!
All done!

I have installed all the R packages and I am not sure why the quantification is not being performed.

Could you please help?

Thank you
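
The log above ends with "Congratulations!" and "All done!" even though three R steps stopped with "Execution halted", so the closing message alone does not show that quantification succeeded. Below is a minimal sketch for listing the R failure lines in a saved copy of the console output. It assumes the output was captured to a file; the function name `scan_scasa_log` and any log file name are hypothetical and not part of scasa.

```shell
# Sketch: report R failure lines in a saved scasa console log.
# Assumption: the console output was captured to a file by the user;
# scasa itself is not known to write such a file under this name.

scan_scasa_log() {
    local log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read log: $log_file" >&2
        return 2
    fi
    # Print (with line numbers) the lines R emits on a fatal error,
    # matching the patterns seen in the log above.
    if grep -n -E '^(Error in |Execution halted)' "$log_file"; then
        return 1   # at least one failure line was found
    fi
    return 0       # no failure lines found
}
```

For example, `scan_scasa_log my_run.log || echo "quantification did not finish cleanly"` would flag a run like the one above, where `my_run.log` stands for wherever the output was saved.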

@eudoraleer
Owner

> Error in { : task 1 failed - "NA/NaN argument"

Hihi,

It looks like something is wrong with either the R library or the data itself. Do you have the post-alignment file?

Best,
Lu
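
The two `load` failures in the original log name the files that the later steps expected: `Sample_eqClass.RData` and `scasa_isoform_expression.RData` under `2QUANT/<sample>_quant/`. Below is a minimal sketch for checking whether those two files exist and are non-empty, which may help narrow down where the run stopped. The function name `check_scasa_outputs` is hypothetical, and the name of the post-alignment file itself is not stated in this thread, so it is not checked here.

```shell
# Sketch: check for the two quantification outputs named in the error log.
# Assumption: the directory layout is <project_dir>/2QUANT/<sample>_quant/,
# as it appears in the error messages; nothing else about scasa's layout is assumed.

check_scasa_outputs() {
    local project_dir="$1" sample="$2"
    local quant_dir="$project_dir/2QUANT/${sample}_quant"
    local missing=0 f
    for f in Sample_eqClass.RData scasa_isoform_expression.RData; do
        # -s is true only if the file exists and has a size greater than zero.
        if [ -s "$quant_dir/$f" ]; then
            echo "ok       $quant_dir/$f"
        else
            echo "missing  $quant_dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

With the paths from this issue, the call would be `check_scasa_outputs Scasa_out/SCASA_My_Project_20220527085718 4861STDY7462259`.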

@ThepeachYolado

> Error in { : task 1 failed - "NA/NaN argument"
>
> Hihi,
>
> It looks like something is wrong with either the R library or the data itself. Do you have the post-alignment file?
>
> Best, Lu

Hi,
I had the same problem.
Could you please help?
Thanks!

@nghiavtr
Collaborator

@yangwh1998

Please consider running scasa with Docker to avoid problems with installing the dependencies.

Instructions for running scasa with Docker are provided here:
https://github.com/eudoraleer/scasa/blob/main/README.md#using-docker-to-run-scasa

Best,
Nghia
