This repository has been archived by the owner on May 3, 2024. It is now read-only.

Error in sqanti_qc2.py #10

Closed
ljw90607 opened this issue Mar 4, 2019 · 13 comments



ljw90607 commented Mar 4, 2019

Dear @Magdoll

Hello, I ran the updated version of SQANTI2 with the command below:

python /appl/sqanti2_2/SQANTI2/sqanti_qc2.py -t 30 -c illumina/PM-AU-0002-N-A1SJ.out.tab 20180817_colon_2N_Nanoflit_q7_pychopper_2.fasta /data/ONT_RNA/reference/Homo_sapiens.GRCh38.93.gtf /data/ONT_RNA/reference/hg38.fa

and the script stopped with the following error:

Error in `$<-.data.frame`(`*tmp*`, SJ_type, value = "__SJ") : 
  replacement has 1 row, data has 0
Calls: $<- -> $<-.data.frame
Execution halted
Traceback (most recent call last):
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1515, in <module>
    main()
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1511, in main
    run(args)
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1346, in run
    if subprocess.check_call(cmd, shell=True)!=0:
  File "/usr/local/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)

The script ran fine without the -c (Illumina SJ file) option, but with that option I get the error above.

Thank you very much for your help!

Jungwoo

Owner

Magdoll commented Mar 4, 2019

Hi @ljw90607 ,

Can you check that the .classification.txt and .junctions.txt files are already written? This error is odd, as it seems to come from the R script portion of the pipeline, which should not be affected by the additional -c junction information.

If the .classification.txt and .junctions.txt files are already written, please try running:

python <path_to_SQANTI2>/utilities/SQANTI_report2.R  xxx.classifications.txt xxx.junctions.txt

and see if you can reproduce the error.

--Liz
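An R message of the form "replacement has 1 row, data has 0" is what R reports when a column is assigned on a data frame that has zero rows, so one thing worth checking is whether either table contains only a header. The following is a minimal sketch of that check, not part of SQANTI2; the helper names are hypothetical and the file suffixes are assumed from the names used in this thread.

```python
import os


def count_data_rows(path):
    """Return the number of non-empty lines after the header, or None if the file is missing."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        next(handle, None)  # skip the header line, if there is one
        return sum(1 for line in handle if line.strip())


def summarize_outputs(prefix):
    """Count data rows in the two tables the report step reads.

    The suffixes are assumptions taken from this thread; adjust them to
    match the files actually present in your output directory.
    """
    summary = {}
    for suffix in (".classification.txt", ".junctions.txt"):
        summary[suffix] = count_data_rows(prefix + suffix)
    return summary
```

A value of 0 for either file means it holds only a header, and None means the file was not written at all.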

Author

ljw90607 commented Mar 4, 2019

Thank you @Magdoll ,

The .classification.txt file is written, but a tmp file is also present.
The .junctions.txt file is written with only the header (no data rows), and its tmp file is also present.

I ran the command you provided and got the following error:

File "/appl/sqanti2/SQANTI2/utilities/SQANTI_report2.R", line 11
class.file <- args[1]
^
SyntaxError: invalid syntax

This error did not occur when I did not provide the short-read junction file as an input.

Thank you so much for your help.

Jungwoo
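The SyntaxError above is raised by the Python interpreter: `class.file <- args[1]` is R code, so it cannot be parsed when the .R file is passed to `python`. R scripts are normally run with `Rscript`. The sketch below shows one way to do that from Python. It assumes `Rscript` is on the PATH and that the report script takes the classification and junctions files in the order given in this thread; whether the script expects further arguments is not confirmed here. The function names are hypothetical.

```python
import subprocess


def build_report_command(report_script, classification, junctions, rscript="Rscript"):
    """Build the argument list for running an R script with the R interpreter."""
    return [rscript, report_script, classification, junctions]


def run_report(report_script, classification, junctions):
    """Run the report script; check_call raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_report_command(report_script, classification, junctions))
```

Passing a list of arguments instead of a single shell string avoids quoting problems with paths that contain spaces.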

Author

ljw90607 commented Mar 5, 2019

Dear @Magdoll ,

After a few more test runs of sqanti_qc2.py, I found that the junctions.txt file is not being produced.
Also, after the error, the classification.txt file remained as a classification.txt_tmp file.

What could have gone wrong?

Thank you very much for your help.

Jungwoo

Owner

Magdoll commented Mar 5, 2019

Hi @ljw90607 ,

I'm not sure why you are getting this error. The .junctions.txt files should be written.

To help with debugging, please first update to the latest SQANTI2 from GitHub and re-run. Note the error; if it is the same, I will ask you to send me the input files for debugging.

--Liz

Author

ljw90607 commented Mar 8, 2019

Dear @Magdoll

After reinstalling the tool and trying a few runs, I was able to run sqanti_qc2.py.
But when I use its output (classification.txt and junctions.txt), the filtered.lite.fasta file is empty (0 file size).
I used short-read junction files to produce the classification.txt and junctions.txt output.
I am not sure why no output is being generated.
If you could help me with it, I would really appreciate it!

(I am getting about 50,000 reads in filtered.lite.reasons.txt)

Jungwoo

Owner

Magdoll commented Mar 9, 2019

Hi @ljw90607 ,

Please see how many entries you have in .classification.txt and what the reasons listed in filtered.lite.reasons.txt are. I wonder if all of them were filtered out for some reason.

--Liz

@ljw90607
Author

Dear @Magdoll

Thank you for your comment.

Running with the default options,
I get 1,378,410 rows of read information in the .classification.txt file, and 54,095 reads are listed in the filtered.lite.reasons.txt file as filtered due to low coverage, intrapriming, or RT switching.

I also tried with -c 10, and the log said 1122226 reads would be kept (and 250,000 reads filtered out), but still no output is being written.

Jungwoo

Owner

Magdoll commented Mar 10, 2019

Hi @ljw90607 ,
I think I know why you are not getting the output fasta. You probably gave it a fasta whose sequence IDs do not match the ones in the classification file.

Please use the ".renamed.fasta" from the sqanti_qc.py output for the filtering step.

--Liz
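A quick way to test this hypothesis is to compare the IDs in the FASTA file against the IDs in the classification table. The sketch below is an illustration only: it assumes the classification file is tab-delimited with the isoform ID in the first column, which is not confirmed in this thread, and all function names are hypothetical.

```python
def fasta_ids(path):
    """Collect record IDs (the first whitespace-delimited token after '>')."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    ids.add(fields[0])
    return ids


def table_ids(path, column=0, sep="\t"):
    """Collect the values of one column of a delimited table, skipping the header."""
    ids = set()
    with open(path) as handle:
        next(handle, None)  # skip the header line
        for line in handle:
            line = line.rstrip("\n")
            if line:
                ids.add(line.split(sep)[column])
    return ids


def id_overlap(fasta_path, table_path):
    """Count IDs shared by both files and IDs found in only one of them."""
    f, t = fasta_ids(fasta_path), table_ids(table_path)
    return {"shared": len(f & t), "fasta_only": len(f - t), "table_only": len(t - f)}
```

If "shared" is 0, no record in the FASTA file can be matched to a row in the table, which would be consistent with an empty filtered FASTA.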

Author

ljw90607 commented Mar 11, 2019

Dear @Magdoll

Thank you so much for your comment.
I have just run sqanti_filter.py using the renamed.fasta and still did not get the output.

What I noticed is that the isoform IDs in .classification.txt differ from those in the fasta.
For example, in the .renamed.fasta, one of the IDs is "55c76dab-0331-4dfa-9b79-a3cd1392b783|-", but in the .classification.txt the same ID is shown as "55c76dab-0331-4dfa-9b79-a3cd1392b783%7C-" (the "|" appears as "%7C").
Could this be why I am not getting the output fasta file?
How can I match the IDs between .classification.txt and the fasta file?

Thank you so much again for your wonderful help.

Jungwoo
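For reference, `%7C` is the percent-encoded form of the `|` character (hexadecimal 7C), so the two IDs differ only in encoding. Where in the pipeline the encoding was introduced is not established in this thread. A minimal sketch of decoding such an ID with the Python standard library follows; `decode_id` is a hypothetical helper name.

```python
try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def decode_id(encoded_id):
    """Turn percent-encoded characters such as %7C back into their literal form."""
    return unquote(encoded_id)
```

Applied to the ID quoted above, `decode_id("55c76dab-0331-4dfa-9b79-a3cd1392b783%7C-")` returns `"55c76dab-0331-4dfa-9b79-a3cd1392b783|-"`, which matches the FASTA header.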

Owner

Magdoll commented Mar 19, 2019

Hi @ljw90607 ,

I think there are some unusual symbols in your original FASTA file. Can you please clean up the original input fasta so that it uses only IDs in the format PB.X.Y, or at least something like it?

--Liz
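One possible way to clean the IDs is to rewrite each FASTA header so that it contains only letters, digits, '.', '_' and '-'. The sketch below is an assumption about what "clean" could mean here, not a format required by SQANTI2, and the function names are hypothetical. It keeps a mapping back to the original headers and stops if two records would end up with the same ID.

```python
import re

# Any character that is not a letter, digit, '.', '_' or '-' is treated as unsafe.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_id(seq_id):
    """Replace every unsafe character in an ID with an underscore."""
    return _UNSAFE.sub("_", seq_id)


def sanitize_fasta(in_path, out_path):
    """Copy a FASTA file, reducing each header to a sanitized ID.

    Returns a dict mapping each new ID to the original header text.
    Raises ValueError if a header is empty or two records collide.
    """
    mapping = {}
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                header = line[1:].rstrip("\n")
                fields = header.split()
                new_id = sanitize_id(fields[0]) if fields else ""
                if not new_id or new_id in mapping:
                    raise ValueError("empty or duplicate ID: %r" % header)
                mapping[new_id] = header
                dst.write(">" + new_id + "\n")
            else:
                dst.write(line)  # sequence lines are copied unchanged
    return mapping
```

With this rule, the ID "55c76dab-0331-4dfa-9b79-a3cd1392b783|-" becomes "55c76dab-0331-4dfa-9b79-a3cd1392b783_-".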

Author

ljw90607 commented Apr 3, 2019

Dear @Magdoll

Thank you for your wonderful help.

I am now trying another sqanti_qc run with a different input.
What could be the issue if the sqanti_qc run ends with "error: data input empty"?
(no renamed.corrected.faa is produced)
I used a filtered fastq file as input, but I cannot figure out why this is occurring.
The error message is shown below.

Skipping ffffa8cd-c32e-4598-a39b-16c09a16d1a6 because unmapped.
Skipping ffffb2d6-93db-4206-85ad-8c5e9f4b46c1 because unmapped.
Skipping ffffecf7-8599-4a65-b775-372827dbf509 because unmapped.
Skipping fffff947-b4bd-4181-b580-e6f7f425b657 because unmapped.
Skipping fffffd1a-bfec-4ef5-968e-e0fbd3ab3f2a because unmapped.
error: data input empty
Traceback (most recent call last):
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1518, in <module>
    main()
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1514, in main
    run(args)
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 1165, in run
    orfDict = correctionPlusORFpred(args, genome_dict)
  File "/appl/sqanti2_2/SQANTI2/sqanti_qc2.py", line 444, in correctionPlusORFpred
    if subprocess.check_call(cmd, shell=True, cwd=gmst_dir)!=0:
  File "/usr/local/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'perl /appl/sqanti2_2/SQANTI2/utilities/gmst/gmst.pl -faa --strand direct --fnn --output /data/ONT_RNA/Data/PM-AU-0002-T-A1/PM-AU-0002-T-A1_notag.fastq_sqanti_output/GMST/GMST_tmp /data/ONT_RNA/Data/PM-AU-0002-T-A1/PM-AU-0002-T-A1_notag.fastq_sqanti_output/PM-AU-0002-T-A1_notag.renamed_corrected.fasta' returned non-zero exit status 1

Thank you so much for your help.

Jungwoo

Owner

Magdoll commented Apr 8, 2019

Hi @ljw90607 ,

It is not possible to know the exact cause without looking at the data, but I suspect that all of your reads may be unmapped, in which case there is no data to run SQANTI on.

--Liz
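This suspicion can be checked without sharing the data by counting the "because unmapped" lines in the log and the records in the corrected FASTA that is passed to the ORF prediction step. The sketch below assumes the log line format shown in this thread ("Skipping <id> because unmapped."); the function names are hypothetical.

```python
def count_unmapped(log_lines):
    """Count log lines of the form 'Skipping <id> because unmapped.'"""
    return sum(
        1
        for line in log_lines
        if line.startswith("Skipping ") and line.rstrip().endswith("because unmapped.")
    )


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

If the corrected FASTA has zero records, or the number of unmapped reads equals the number of input reads, an empty-input error from the downstream step would be the expected outcome.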

Owner

Magdoll commented Jun 2, 2019

Closing unless further notice.

Magdoll closed this as completed Jun 2, 2019