fastq-multx when barcode is in header between # and / symbols? #30

Closed
GoogleCodeExporter opened this issue Apr 29, 2015 · 7 comments

@GoogleCodeExporter

What steps will reproduce the problem?
1. I have FASTQ files in which the barcode is written in the header line, 
between a # symbol (on the left) and a / symbol (on the right).
2. To complicate things further, we used 6-base barcodes, but the machine was 
set to sequence 8-base barcodes. We know for a fact that our barcode is the 
first 6 bases after the # symbol. The last two bases should always be "AT" in 
our case.
3. I used the -B option with a file listing the 6-base barcodes.
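
To make the header layout described in the steps above concrete, here is a minimal Python sketch that pulls the first 6 bases found between # and / out of a header line. It is not part of fastq-multx; the function name, the regular expression, and the example header are all made up for illustration, and the exact header layout is an assumption based on the description.

```python
import re

# Assumed header layout: "@<read id>#<8-base index>/<mate number>"
HEADER_BARCODE = re.compile(r"#([ACGTN]+)/")

def barcode_from_header(header, length=6):
    """Return the first `length` bases between '#' and '/', or None if absent."""
    match = HEADER_BARCODE.search(header)
    if match is None or len(match.group(1)) < length:
        return None
    # Only the leading bases are the real barcode; the trailing ones are ignored.
    return match.group(1)[:length]

# Hypothetical header, not taken from the reporter's data
example = "@EXAMPLE:1:1101:1000:2000#ACGTTGAT/1"
print(barcode_from_header(example))
```

The `length` parameter is there so the same helper could be used for the full 8-base index if the trailing "AT" ever needs checking.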

What is the expected output? What do you see instead?
I observed 88% "unmatched" reads (150,174,690 out of 170,571,259 in total). The 
log reported "End used: end", meaning it used the 3' end of the read. Is there a 
way to make it look in the header instead?
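
As a rough illustration of the behaviour being asked for (not how fastq-multx works internally), the sketch below tallies reads per sample using the barcode embedded in each header. It assumes plain 4-line FASTQ records and a dictionary mapping 6-base barcodes to sample names; the function name, the sample name, and the records are hypothetical.

```python
from collections import Counter

def count_by_header_barcode(fastq_lines, barcodes, length=6):
    """Tally reads per sample using the barcode embedded in each header line."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:  # only the first line of each 4-line record is a header
            continue
        start = line.find("#")
        end = line.find("/", start + 1)
        # Take the first `length` bases between '#' and '/', if both are present.
        code = line[start + 1:end][:length] if 0 <= start < end else ""
        counts[barcodes.get(code, "unmatched")] += 1
    return counts

# Hypothetical records and barcode table, for illustration only
records = [
    "@r1#ACGTTGAT/1", "ACGT", "+", "IIII",
    "@r2#TTTTTTAT/1", "ACGT", "+", "IIII",
]
print(count_by_header_barcode(records, {"ACGTTG": "sampleA"}))
```

Exact matching is used here for simplicity; a real demultiplexer would normally tolerate mismatches.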

What version of the product are you using? On what operating system?
Working on Ubuntu Linux, ea-utils.1.1.2-537.tar.gz

Please provide any additional information below.


Original issue reported on code.google.com by kevinru...@gmail.com on 20 Mar 2014 at 12:42

@GoogleCodeExporter

Can you attach/upload a couple of subsets of the data (zcat file | head -100) so I 
can see what you're talking about? I can probably add support for "header 
embedded" codes pretty quickly, especially since this is a fairly common case.

Original comment by earone...@gmail.com on 2 May 2014 at 4:58

@GoogleCodeExporter

Here you go!

These are the top 25 paired reads extracted from raw data files.

Again, note that the real barcode is only the first 6 nucleotides after the # 
symbol, not the full 8 bases sequenced.

Thanks a lot for looking into that!
Kevin

Original comment by kevinru...@gmail.com on 2 May 2014 at 5:09

Attachments:

@GoogleCodeExporter

Issue 31 has been merged into this issue.

Original comment by earone...@gmail.com on 9 Jul 2014 at 2:28

@GoogleCodeExporter

Finally done; available in release 762 or greater.

Original comment by earone...@gmail.com on 15 Aug 2014 at 3:05

@GoogleCodeExporter

Original comment by earone...@gmail.com on 15 Aug 2014 at 3:05

  • Changed state: Fixed

@GoogleCodeExporter

Hi, I'm facing the same problem: my barcodes are in the header line. What is 
not clear is how I can get and install the latest release (>762), since the one 
I can download is release 686.
Thanks for your help!

Maurizio

Original comment by maurizio...@gmail.com on 20 Aug 2014 at 10:06

@GoogleCodeExporter

You need to check out the source, as described here:

https://code.google.com/p/ea-utils/source/checkout

NOTE: We're probably not going to stamp a new release until the test 
architecture is all worked out.

Original comment by earone...@gmail.com on 20 Aug 2014 at 5:08
