The genome file annotation_chr.genome has no valid entries #2

Open
Kingatsu opened this issue Oct 28, 2022 · 10 comments

Comments

@Kingatsu

Hello,
CRAFT is a powerful tool for me.
However, I got the following error messages when I tried to use the FASTA file from GENCODE, which I used to detect my circRNAs.
Error: The genome file annotation_chr.genome has no valid entries. Exiting.
Error: The genome file annotation_chr.genome has no valid entries. Exiting.
mv: cannot stat '/data/input/backsplice_gene_name.txt': No such file or directory
I changed the chromosome names in list_backsplice.txt and backsplice_gene_name.txt from 1 to chr1 to match my FASTA file,
but the error persists.

I don't know whether I MUST use the genome file from Ensembl for CRAFT.
Could you give me some help?

Thanks in advance.

Regards,
Kingatsu
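
The message reports that annotation_chr.genome has no valid entries, and the post mentions a 1 versus chr1 naming difference. As a minimal sketch for checking both, assuming the genome file uses the common two-column layout (chromosome name, tab, length) and that circRNA identifiers look like `chr:start-end`, something like the following Python could be used. The function names and the example data are hypothetical and are not part of CRAFT.

```python
import re


def parse_genome_lines(lines):
    """Return {chromosome: length} for lines shaped like 'name<TAB>length'."""
    sizes = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Keep only lines with a name and an integer length in the first two columns.
        if len(fields) >= 2 and fields[0] and fields[1].isdigit():
            sizes[fields[0]] = int(fields[1])
    return sizes


def unknown_chromosomes(circ_ids, sizes):
    """Return chromosome names used in 'chr:start-end' IDs but absent from sizes."""
    missing = set()
    for circ_id in circ_ids:
        match = re.match(r"^(.+):(\d+)-(\d+)$", circ_id)
        if match and match.group(1) not in sizes:
            missing.add(match.group(1))
    return sorted(missing)


# Hypothetical example data: Ensembl-style names in the genome file,
# UCSC/GENCODE-style names in one of the circRNA identifiers.
genome = parse_genome_lines(["1\t248956422\n", "2\t242193529\n"])
print(unknown_chromosomes(["chr1:100-500", "2:300-900"], genome))
```

An empty dictionary from `parse_genome_lines` would correspond to a file with no usable entries, and a non-empty list from `unknown_chromosomes` would point to a naming mismatch between the genome and the circRNA coordinates.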

@annadalmolin
Owner

annadalmolin commented Oct 28, 2022 via email

@Kingatsu
Author

Hello Anna,
Thanks for the prompt reply.

Yes, I used the Ensembl genome file and it seems to work!
However, it took more than 12 hours just to finish the miRNA binding site prediction analysis for ~20 circRNAs.
I am not sure whether that is normal.

Thanks again for the help!

Regards,
Kingatsu

@annadalmolin
Owner

annadalmolin commented Oct 29, 2022 via email

@Kingatsu
Author

Hi Anna,

Here are my machine's CPU details:

cat /proc/cpuinfo| grep "cpu cores"| uniq
cpu cores       : 10
cat /proc/cpuinfo| grep "processor"| wc -l
20

And my param.txt is:

MRO
hsa
80      -15
all     all
hg38
1       1       30      FALSE   both
l=160000
l=160000

I think it doesn't check the overlap with AGO2 binding sites because I didn't put AGO2_binding_sites.bed into /input.
Also, yesterday I tested the data with the same parameters, except that I changed "MRO" to "RO",
and I got the output this morning:

...
Quitting from lines 94-165 (functional_predictions_single_circRNA.Rmd)
Error in read.table(file_beRBP, header = T, sep = "\t") :
  no lines available in input
Calls: <Anonymous> ... withCallingHandlers -> withVisible -> eval -> eval -> read.table
In addition: Warning messages:
1: In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
2: package(s) not installed when version(s) same as current; use `force = TRUE` to
  re-install: 'org.Hs.eg.db'

Execution halted
mv: cannot stat 'functional_predictions_single_circRNA.html': No such file or directory
...
Quitting from lines 90-153 (functional_predictions_all_circRNAs.Rmd)
Error in read.table(file_beRBP, header = T, sep = "\t") :
  no lines available in input
Calls: <Anonymous> ... withCallingHandlers -> withVisible -> eval -> eval -> read.table
In addition: Warning message:
package(s) not installed when version(s) same as current; use `force = TRUE` to
  re-install: 'grid'

Execution halted

It seems the RBP prediction failed, and here is the beRBP.log:

working on M001_0.6 Sat Oct 29 03:43:26 UTC 2022
analysis_RBP featureMat_start=Sat Oct 29 03:43:26 UTC 2022
1 rank...Sat Oct 29 03:43:29 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cp: cannot stat 'siteSegment_1.csrv': No such file or directory
2 rank...Sat Oct 29 03:43:58 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_2.csrv: No such file or directory
paste: csrvI.txt: No such file or directory
3 rank...Sat Oct 29 03:44:19 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_3.csrv: No such file or directory
4 rank...Sat Oct 29 03:44:47 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_4.csrv: No such file or directory
5 rank...Sat Oct 29 03:45:16 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_5.csrv: No such file or directory
6 rank...Sat Oct 29 03:45:44 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_6.csrv: No such file or directory
7 rank...Sat Oct 29 03:46:05 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_7.csrv: No such file or directory
8 rank...Sat Oct 29 03:46:28 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_8.csrv: No such file or directory
9 rank...Sat Oct 29 03:46:54 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_9.csrv: No such file or directory
10 rank...Sat Oct 29 03:47:14 UTC 2022
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: Error pre-fetching sequence data 
cut: siteSegment_10.csrv: No such file or directory
analysis_RBP featureMat_finished=Sat Oct 29 03:47:40 UTC 2022
M001_0.6 all
randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
Error in file(file, "rt") : cannot open the connection
Calls: generalPred -> read.delim -> read.table -> file
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'analysis_RBP.ftMat.txt': No such file or directory
Execution halted

Does this mean that no RBP binding sites were predicted for the circRNA sequences, or did something go wrong before the RBP prediction step?

Thanks in advance.

Regards,
Kingatsu
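
In the beRBP.log above, each rank shows the BLAST database error before the missing siteSegment file messages, which suggests (without proving it) that the missing files follow from the BLAST failure. A minimal Python sketch for summarizing such a log is shown below. It assumes rank headers of the form `N rank...` as they appear above; the function name is hypothetical.

```python
import re
from collections import Counter


def summarize_log(lines):
    """Count error-like lines and record the first one seen for each rank."""
    counts = Counter()
    first_error = {}
    rank = None
    for line in lines:
        line = line.strip()
        header = re.match(r"^(\d+) rank\.\.\.", line)
        if header:
            rank = int(header.group(1))
            continue
        if "error" in line.lower() or "No such file or directory" in line:
            # Replace digits with N so repeated messages group together.
            counts[re.sub(r"\d+", "N", line)] += 1
            if rank is not None:
                first_error.setdefault(rank, line)
    return counts, first_error


# Excerpt copied from the log above.
sample = [
    "1 rank...Sat Oct 29 03:43:29 UTC 2022",
    "Warning: [blastn] Examining 5 or more matches is recommended",
    "BLAST Database error: Error pre-fetching sequence data ",
    "cp: cannot stat 'siteSegment_1.csrv': No such file or directory",
    "2 rank...Sat Oct 29 03:43:58 UTC 2022",
    "Warning: [blastn] Examining 5 or more matches is recommended",
    "BLAST Database error: Error pre-fetching sequence data ",
    "cut: siteSegment_2.csrv: No such file or directory",
]
counts, first_error = summarize_log(sample)
for message, count in counts.most_common():
    print(count, message)
```

Looking at the first error per rank, rather than the last one in the file, makes it easier to tell the original failure from its follow-on messages.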

@annadalmolin
Owner

annadalmolin commented Nov 2, 2022 via email

@Kingatsu
Author

Kingatsu commented Nov 3, 2022

Hi Anna,

  • Did you place the UCSC genome and indexes in the input/ directory?
    Yes, here is the list of my input files:
backsplice_gene_name.txt  hg38.02.idx  hg38.nhr  hg38.not  hg38.nto                     Homo_sapiens.GRCh38.dna.primary_assembly.fa
hg38.00.idx               hg38.fa      hg38.nin  hg38.nsq  hg38.shd                     Homo_sapiens.GRCh38.dna.primary_assembly.fa.fai
hg38.01.idx               hg38.ndb     hg38.njs  hg38.ntf  Homo_sapiens.GRCh38.104.gtf
  • And hg38 is the corresponding genome version?
    Yes, I used CirComPara2 to detect the circRNAs with GRCh38.p12.genome.fa (from GENCODE).

  • Are the parameters in the params.txt file tab-separated?
    Yes, I am sure of that.

I am now re-running CRAFT with the "MO" parameter.
I will send you the error details if any errors occur.

Regards,
Kingatsu
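
The tab-separation question above can also be checked mechanically. As a minimal sketch, assuming that fields in the parameter file are separated by tabs and that no field itself contains a space, the following Python flags any line containing a space character. The function name and the sample lines are hypothetical.

```python
def lines_with_spaces(lines):
    """Return (line_number, text) for lines that contain a space character.

    Assumption: fields are separated by tabs and no field contains a space,
    so any space hints at a separator typed or pasted as spaces.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if " " in text:
            flagged.append((number, text))
    return flagged


# Hypothetical excerpt: the first line uses a tab, the second uses spaces.
sample = ["80\t-15\n", "all     all\n", "hg38\n"]
print(lines_with_spaces(sample))
```

To run it on a real file, the lines can be read with `open(path).readlines()` and passed to the function; an empty result means no spaces were found.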

@Kingatsu
Author

Kingatsu commented Nov 3, 2022

Hi Anna,
Here is the error log:

Error (circ1) : No ORFs found for the specified parameters
Error (circ1) : No ORFs found for the specified parameters
Error (circ1) : No ORFs found for the specified parameters
Error (circ1) : No ORFs found for the specified parameters
Error: Invalid record in file result_30.bed. Record is
circ2     101     10
ERROR: Received illegal bin number 262143 from getBin call.
ERROR: Unable to add record to tree.
ORF prediction analysis completed.

Here circ1 stands for an identifier in 'chr_start-end' form and circ2 for one in 'chr:start-end' form; I have masked the real identifiers, sorry.
The run then moved on to the R pipeline, where many warnings and errors occurred.
For example:

- Maybe a connection error: 
Warning: unable to access index for repository https://bioconductor.org/packages/3.14/bioc/src/contrib:
  cannot open URL 'https://bioconductor.org/packages/3.14/bioc/src/contrib/PACKAGES'
Warning: unable to access index for repository https://bioconductor.org/packages/3.14/data/annotation/src/contrib:
  cannot open URL 'https://bioconductor.org/packages/3.14/data/annotation/src/contrib/PACKAGES'
Warning: unable to access index for repository https://bioconductor.org/packages/3.14/data/experiment/src/contrib:
  cannot open URL 'https://bioconductor.org/packages/3.14/data/experiment/src/contrib/PACKAGES'
Warning: unable to access index for repository https://bioconductor.org/packages/3.14/workflows/src/contrib:
  cannot open URL 'https://bioconductor.org/packages/3.14/workflows/src/contrib/PACKAGES'
Warning: unable to access index for repository https://bioconductor.org/packages/3.14/books/src/contrib:
  cannot open URL 'https://bioconductor.org/packages/3.14/books/src/contrib/PACKAGES'
Warning: unable to access index for repository https://cloud.r-project.org/src/contrib:
  cannot open URL 'https://cloud.r-project.org/src/contrib/PACKAGES'
PS: I can actually download the PACKAGES file from this URL with 'wget'.

- Error in function
Quitting from lines 650-700 (functional_predictions_single_circRNA.Rmd)
Error in function (type, msg, asError = TRUE)  :
  Failed to connect to multimir.org port 80: Connection refused
Calls: <Anonymous> ... submit_request -> <Anonymous> -> .postForm -> <Anonymous> -> fun
In addition: Warning messages:
1: In system("timedatectl", intern = TRUE) :
  running command 'timedatectl' had status 1
2: package(s) not installed when version(s) same as current; use `force = TRUE` to
  re-install: 'org.Hs.eg.db'
3: position_stack requires non-overlapping x intervals
4: ggrepel: 287 unlabeled data points (too many overlaps). Consider increasing max.overlaps
5: position_stack requires non-overlapping x intervals
6: ggrepel: 292 unlabeled data points (too many overlaps). Consider increasing max.overlaps
7: ggrepel: 287 unlabeled data points (too many overlaps). Consider increasing max.overlaps

Those are the errors I have seen. It seems that some circRNA analyses failed earlier (for example, no ORF was found), which then triggered a cascade of errors in R.

Thanks in advance.

Regards,
Kingatsu
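
The record reported for result_30.bed above has a start (101) that is greater than its end (10), which is not a valid BED interval. A minimal Python sketch for screening a BED file for such records follows. It assumes tab-separated lines with the start in column 2 and the end in column 3; the function name is hypothetical.

```python
def bed_record_problems(line):
    """Return a list of problems found in one BED line (empty if none found)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return ["fewer than 3 tab-separated fields"]
    try:
        start, end = int(fields[1]), int(fields[2])
    except ValueError:
        return ["start or end is not an integer"]
    problems = []
    if start < 0 or end < 0:
        problems.append("negative coordinate")
    if start > end:
        problems.append("start is greater than end")
    return problems


# The record from the error message above, with tabs assumed as separators.
print(bed_record_problems("circ2\t101\t10"))
```

Running this over every line of the file before the interval step would show how many records are affected and for which circRNAs.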

@annadalmolin
Owner

annadalmolin commented Nov 10, 2022 via email

@annadalmolin
Owner

annadalmolin commented Nov 18, 2022 via email

@Kingatsu
Author

Hi Anna,
Sorry for the late reply.

Well, I can't run CRAFT with RBP detection yet, so I guess the problem may lie in the circRNA sequences.
I am running the circRNA detection tools again to find some other circRNAs, and will then run CRAFT again.
I will let you know if it works!

Regards,
Kingatsu
