[nf-core/circrna] error: Your FASTQ files do not have the appropriate extension #42
Hey there, unfortunately the workflow has been designed to work with paired-end FASTQ files.
Hi Barry, thanks so much for letting me know.
I'm looking at the experiment metadata for SRR6343608. It states the data is paired-end. Might I suggest using nf-core/fetchngs to download the files? Good luck!
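For downloading SRA runs, nf-core/fetchngs takes a plain list of accessions. A minimal sketch — the accession, output directory, and profile here are illustrative, so check the fetchngs docs for the exact flags your version requires:

```shell
# List the accessions to fetch, one per line (SRR6343628 is a run
# discussed later in this thread; substitute your own).
printf 'SRR6343628\n' > ids.csv

# On a machine with Nextflow and internet access you would then run
# something like (profile and --outdir are illustrative):
#   nextflow run nf-core/fetchngs --input ids.csv --outdir results -profile docker
cat ids.csv
# prints: SRR6343628
```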
Hey Barry, thanks for your kind help! Thanks again! Kind regards,
Hi Barry, thanks for letting me know about the useful pipeline nf-core/fetchngs! It works well! You are right, it is a paired-end dataset, SRR6343628. I don't know why my previous method didn't work.
Last week I tried the same dataset, PRJNA420975, with the nf-core/fetchngs pipeline, but I was a little confused. Perhaps this data (SRX3441728) is large and the pipeline splits it into two parts. So when I want to merge them, should I merge them as shown below? Thanks again for your time and work! Kind regards,
Hey Birong, yep, that didn't work because you need to include the Your merge strategy looks correct to me. Judging by the file sizes, they might have split SRX3441728 over two lanes to increase sequencing depth, in which case merging makes sense. However, just to be safe, run Also, if you merge the files, check to make sure that the Best,
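Merging run files split across lanes is usually a plain concatenation of the gzipped chunks (concatenated gzip streams are themselves a valid gzip stream), followed by a read-count sanity check. A minimal sketch with toy files — the part/merged filenames are placeholders, not the actual SRX3441728 run files:

```shell
# Two toy gzipped FASTQ chunks standing in for the two downloaded runs.
printf '@r1\nACGT\n+\nIIII\n' | gzip > part1_R1.fastq.gz
printf '@r2\nTTTT\n+\nIIII\n' | gzip > part2_R1.fastq.gz

# Concatenate the chunks; repeat the same command, in the same order,
# for the R2 files so read pairing stays consistent.
cat part1_R1.fastq.gz part2_R1.fastq.gz > merged_R1.fastq.gz

# Sanity check: each FASTQ record is 4 lines, so 2 reads -> 8 lines,
# which should equal the sum of the two parts.
zcat merged_R1.fastq.gz | wc -l
# prints: 8
```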
Hi Barry, sorry, it is me again! Thanks for your reply! They are all working now — I've successfully downloaded several datasets! But I have a new problem with the circrna pipeline. I am using the supercomputer Hawk, and paired-end data.
Error executing process > 'STAR_1PASS (SRR6343628)' Let me know if you need any further information. Thanks so much for your time and patience! Best regards,
Hi Birong, don't worry about it - happy to help. This error means that the process STAR_1PASS requested 16 CPUs, but only 8 CPUs are available on the queue you sent the job to on Hawk. You will need to change the configuration file settings. Try the following:
Here is what I mean by point 3:
You could ask your system administrator about the maximum CPU and memory capacity of Hawk so you can configure this file in such a way that it never asks for more resources than are available.
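For point 3, the cap usually lives in a small custom config passed to the run with `-c`. A sketch, assuming the nf-core-style `max_cpus` parameter and the `STAR_1PASS` process name from the error above; the memory and time values are placeholders to replace with Hawk's real limits:

```groovy
// custom.config — pass to the pipeline with: nextflow run ... -c custom.config

params {
    // Pipeline-wide ceilings (nf-core convention; confirm your
    // pipeline version honours these parameters).
    max_cpus   = 8
    max_memory = '64.GB'   // placeholder: use your queue's actual limit
    max_time   = '24.h'    // placeholder
}

process {
    // Directly cap the process that requested 16 CPUs.
    withName: 'STAR_1PASS' {
        cpus = 8
    }
}
```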
@BirongZhang Going to re-open this issue because it has a lot of good troubleshooting questions in it - if that's ok?
Hi Barry, thanks so much for your kind help! I will try what you said and ask the Hawk team about the maximum CPU count. I will let you know what happens. Thanks again. Best,
Hi Barry, I am back.
Have you ever seen this before? Let me know if you need more details, thanks. Best,
Hey Birong, it looks like you do not have an internet connection on the cluster. Try pinging Google from the cluster; the result should look like this:

```
barry@YT-1300:/data$ ping www.google.com
PING www.google.com(di-in-f106.1e100.net (2a00:1450:400b:c01::6a)) 56 data bytes
64 bytes from di-in-f106.1e100.net (2a00:1450:400b:c01::6a): icmp_seq=1 ttl=110 time=55.6 ms
64 bytes from di-in-f106.1e100.net (2a00:1450:400b:c01::6a): icmp_seq=2 ttl=110 time=132 ms
64 bytes from di-in-f106.1e100.net (2a00:1450:400b:c01::6a): icmp_seq=3 ttl=110 time=43.7 ms
^C
--- www.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 43.667/77.093/132.042/39.157 ms
```
If you can locate the reference genome files you need (GRCh37 FASTA and GTF files — from previous runs on your laptop, maybe?) and upload them to the cluster manually, you will not need to connect to the AWS iGenomes bucket to pull reference files automatically. Then I can look into running the pipeline 'offline' for you - I've never done it, but I can try to learn.
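If the references are staged manually, most nf-core pipelines can be pointed at them directly. A sketch, assuming the common `--fasta`/`--gtf` flag convention (confirm against the circrna docs) and placeholder paths:

```shell
# Placeholder paths; substitute wherever you uploaded the files on Hawk.
FASTA=refs/GRCh37.fa
GTF=refs/GRCh37.gtf

# Stand-in files so this snippet is self-contained; on the cluster these
# would be the real FASTA/GTF you copied up.
mkdir -p refs && touch "$FASTA" "$GTF"

# Check both exist before launching, otherwise the pipeline will try to
# reach the AWS iGenomes bucket and fail behind the firewall.
[ -e "$FASTA" ] && [ -e "$GTF" ] && echo "references staged"
# prints: references staged

# Then launch with the local references (flags assumed, not verified here):
#   nextflow run nf-core/circrna --input samples.csv \
#       --fasta "$FASTA" --gtf "$GTF" -profile singularity
```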
Hi Barry, I am back again! Yes, you are right. The supercomputer team also told me that sometimes I am not allowed to download external data because of the firewall. This also reminds me that sometimes I cannot even use I really appreciate your "offline" help, but I don't think I should consume any more of your time and energy on my particular case. You have done enough for me, and I have really learned a lot from our conversation. No worries - while trying your pipeline I have generated some STAR junction files, and next I will try the circular RNA tools one by one. Nice to meet you online! Thanks so much for your kind help all the time! Best,
Hi Barry, I am back again! Here are my scripts:
Here is my STAR output:
Here are my scripts:
Is there anything wrong with my scripts or the input? Thanks! Best regards,
Hey Birong, So, one or two things that might help (though it is hard to tell from the output):
Here is an example of one I have on my computer:
(14 columns). Good luck, Barry
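A quick sanity check on the shape of a chimeric junction file is to count the columns on every line; a well-formed file should give a single value. The one-line file below is a fabricated stand-in for illustration, not real STAR output:

```shell
# Fabricate a one-line stand-in for Chimeric.out.junction with 14
# tab-separated fields (donor/acceptor chr, pos, strand, junction type,
# repeat lengths, read name, and alignment fields).
printf 'chr1\t1000\t+\tchr1\t2000\t+\t1\t0\t0\tread1\t900\t50M\t1950\t50M\n' > Chimeric.out.junction

# Every line should report the same column count: 14.
awk -F'\t' '{print NF}' Chimeric.out.junction | sort -u
# prints: 14
```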
The way I designed
In the workflow, for sample
The There is no Barry
Hi Barry, thanks for your time! It is very clear; I will try it and let you know what happens. Best,
Hi all,
Thanks so much for generating this useful pipeline!
I wanted to find circRNAs in a different way, and I found your work. But when I use it, I encounter the following problems:
Here is my code:
My fastq.gz data:
I also have a question here: is this pipeline only for fastq.gz data, or can I use uncompressed fastq data?
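If the pipeline's input check insists on a `.fastq.gz` extension (as the error in the issue title suggests), compressing plain FASTQ files first is a simple workaround; filenames here are illustrative:

```shell
# Toy uncompressed FASTQ file standing in for real data.
printf '@r1\nACGT\n+\nIIII\n' > sample_R1.fastq

# gzip replaces the file in place with sample_R1.fastq.gz,
# which an extension check on *.fastq.gz should accept.
gzip sample_R1.fastq

ls sample_R1.fastq.gz
# prints: sample_R1.fastq.gz
```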
My error:
Could you please take a look at this? Any advice would be appreciated. Thanks!
Kind regards,
Birong