not working without parameter selection #41
I don't think your screenshot is attached. Can you add it again? Let me make sure I have this right: R1 has no barcode and is 39 bp long, while R2 carries the barcode (16 bp) and is otherwise identical to R1? |
I see. This fails because Calib's default parameter sets have only been tested for read lengths between 60 and 250 bp, so you will have to select the parameters yourself. I suggest starting with |
Can you check the cluster file to see how many clusters Calib generated?
…On Fri., Jun. 18, 2021, 7:41 a.m. kmoosi, ***@***.***> wrote:
[image: calib error2]
<https://user-images.githubusercontent.com/84722946/122577843-1a084f00-d008-11eb-9c4f-ddc9ada6cf9a.PNG>
Thank you for the quick answer. It's working now - I got my cluster file
and tried to do the calib_cons command (screenshot) but all I've got are
empty files. Did I choose the wrong input fastq since I've chosen the same
as in the calib command?
|
The consensus stage only processes clusters of at least size 2 and at
most size 1000 (both limits are configurable parameters). That is probably
why you don't have any output.
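To check whether any clusters fall inside that size range, a rough sketch like the following could tally cluster sizes. It assumes the cluster file is tab-separated with the cluster ID in the first column (please verify against your own file); the function names and the made-up sample lines are only illustrative, and the 2 and 1000 limits are the ones mentioned above.

```python
from collections import Counter

def cluster_sizes(lines):
    """Count reads per cluster.

    Assumption: each line describes one read and the first
    tab-separated column is the cluster ID.
    """
    sizes = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        cluster_id = line.split("\t", 1)[0]
        sizes[cluster_id] += 1
    return sizes

def eligible_clusters(sizes, min_size=2, max_size=1000):
    """Keep only clusters whose size lies within [min_size, max_size]."""
    return {cid: n for cid, n in sizes.items() if min_size <= n <= max_size}

# Made-up example: three clusters with 1, 2 and 1 reads.
sample = ["0\tr0", "1\tr1", "1\tr2", "2\tr3"]
sizes = cluster_sizes(sample)
print(len(sizes), "clusters,", len(eligible_clusters(sizes)), "eligible for consensus")
```

On a real file you could pass the open file handle instead of the sample list, e.g. `with open("R.cluster") as fh: sizes = cluster_sizes(fh)`. If almost every cluster has size 1, empty consensus output would be expected.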
…On Fri., Jun. 18, 2021, 7:51 a.m. kmoosi, ***@***.***> wrote:
[image: calib3]
<https://user-images.githubusercontent.com/84722946/122579242-92bbdb00-d009-11eb-89f5-9a6a0b7c4862.PNG>
Is it the first number in the first column? Then it will be 94.
But I have to say that for this first test I only used a file containing
about the first 101 sequences of my whole NGS data. So maybe the input
size and/or variety is too small?
|
Ok, I've tried it with my complete data set and now it seems to be working. Thank you very much! And the msa file just lists the consensus generation in detail with all the belonging aligned reads right? |
The number is just a new read name. The semicolon-separated list of numbers
gives the IDs of the reads making up this consensus read cluster.
Yeah, the MSA files are just the multiple sequence alignment files used for
computing the consensus sequences.
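As an illustration of that header layout, here is a small sketch that splits a consensus FASTQ header into the new read name and the member read IDs. The exact layout (name, then whitespace, then the semicolon-separated IDs) is an assumption based on the description above, and `parse_consensus_header` is a hypothetical helper, not part of Calib.

```python
def parse_consensus_header(header):
    """Split a consensus FASTQ header into (read_name, member_read_ids).

    Assumed layout (check against your own output):
    '@<name><whitespace><id>;<id>;...'
    """
    fields = header.strip().lstrip("@").split(None, 1)
    if not fields:
        raise ValueError("empty FASTQ header")
    name = fields[0]
    members = []
    if len(fields) > 1:
        # Ignore the empty piece left by a trailing semicolon.
        members = [int(x) for x in fields[1].split(";") if x.strip()]
    return name, members

print(parse_consensus_header("@7\t12;40;95;"))  # ('7', [12, 40, 95])
```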
…On Sun, Jun 20, 2021 at 12:58 AM kmoosi ***@***.***> wrote:
Ok, I've tried it with my complete data set and now it seems to be
working. Thank you very much!
I have only one more question for my understanding. I get a fastq and an
msa file as output. The fastq is a list of my consensus reads, right? But
what's the meaning of the first line of an entry/first two lines of the
first entry, especially the number after the @?
Is the list after this number in the ID line the entries which belong to
the consensus?
[image: calib4]
<https://user-images.githubusercontent.com/84722946/122666508-8dbe6f00-d162-11eb-8df8-d949b8acdaae.PNG>
And the msa file just lists the consensus generation in detail with all
the belonging aligned reads right?
--
Baraa Orabi
PhD Student
Vancouver Prostate Centre
|
OK, thanks for explaining. And as far as I understand, the second file (R2, consisting of the reads without UMIs) is processed in the same manner as R1, although there are no UMIs in it? |
Yep, exactly (sorry for late reply, was on vacation) |
Hi,
I don't have paired-end reads, but as described in previous issues I copied my input FASTQ, removed the UMIs (16 N long), and used the copy as the second input file, as shown in the screenshot. I got an error message (no error or minimizer parameters passed. Selecting parameters based on barcode and inferred read length
Inferred read length 55 from sample of 10000 reads). I then tried the example command, adjusting only my input file names and the barcode length, and my output (cluster) file was generated. But I'm not sure this is the right parameter selection for my sequences: they are very short, only 55 bases including the 16-base UMI.
I then tried using the generated cluster file with calib_cons. There was no error message, but the output files were empty. So my question is: does the example command below use the same input files as the first calib command for clustering, or a different FASTQ file?
To run Calib error correction, run:
calib_cons -c <cluster_file> -q <space_separated_FASTQ_list> -o <space_separated_output_prefix_list>
For example:
calib_cons -c R.cluster -q R1.fastq R2.fastq -o R1. R2.
Thanks in advance and sorry for the probably dumb questions for experts, but I'm new in this topic (:
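For readers trying the same workaround, the step of copying the FASTQ and removing the UMI might look roughly like the sketch below. It assumes standard 4-line FASTQ records and that the 16-base UMI sits at the start of each read (the post does not say where the UMI is located); `strip_umi` is a hypothetical helper, not a Calib tool.

```python
def strip_umi(fastq_lines, umi_len=16):
    """Yield FASTQ lines with the first umi_len bases and qualities removed.

    Assumptions: 4 lines per record, UMI at the start of the read.
    """
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        # Within each record, line index 1 is the sequence and 3 the qualities.
        if i % 4 in (1, 3):
            line = line[umi_len:]
        yield line

# Made-up record: a 16-base UMI followed by a 4-base read.
record = ["@r1", "ACGT" * 4 + "TTTT", "+", "I" * 20]
for out_line in strip_umi(record):
    print(out_line)
```

Trimming the quality line together with the sequence keeps both the same length, which FASTQ parsers generally require.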