pub
The script automatically picks sequencing barcodes.
"pub.R" uses the pam algorithm (from the cluster R package) to cluster barcodes and select the ones that are most dissimilar from the others.
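As a rough illustration of the clustering step, the sketch below runs `pam()` on a Hamming distance matrix and takes the cluster medoids as the picked barcodes. The distance measure, the `hamming` helper, and the use of medoids as the selection rule are assumptions made for this example; the actual logic in "pub.R" may differ.

```r
library(cluster)  # provides pam()

# Example barcodes taken from the sample input file shown in this README
barcodes <- c(
  "P1-A1" = "TTACCGAC",
  "P2-A2" = "AGTGACCT",
  "P3-A3" = "TCGGATTC",
  "P4-A4" = "CAAGGTAC"
)

# Hypothetical helper: number of positions at which two
# equal-length sequences differ (Hamming distance)
hamming <- function(a, b) {
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Build the pairwise distance matrix
n <- length(barcodes)
d <- matrix(0, n, n, dimnames = list(names(barcodes), names(barcodes)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- hamming(barcodes[[i]], barcodes[[j]])
  }
}

# Partition into k clusters; id.med holds the indices of the medoids
k <- 2
fit <- pam(as.dist(d), k = k, diss = TRUE)
picked <- barcodes[fit$id.med]
print(picked)
```

Here `k` plays the role of the requested number of samples: one representative barcode is taken from each cluster.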
After picking barcodes, it checks color balance, i.e. it checks that no sequencing cycle has every barcode "shining" under the same laser/LED. See this page for details about color balance.
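The color balance check can be sketched as follows. This example assumes a four-channel instrument where A and C are read in one channel and G and T in the other, and it requires both channels to appear in every cycle. The `channel_of` mapping and the `is_color_balanced` helper are illustrative names, not taken from "pub.R", and the script's own rule may be different.

```r
# Assumed base-to-channel grouping (four-channel chemistry)
channel_of <- c(A = "red", C = "red", G = "green", T = "green")

# Hypothetical helper: TRUE if every cycle (position) contains
# at least one base from each channel. Assumes equal-length sequences.
is_color_balanced <- function(seqs) {
  # One row per barcode, one column per sequencing cycle
  bases <- do.call(rbind, strsplit(toupper(seqs), ""))
  all(apply(bases, 2, function(cycle) {
    all(c("red", "green") %in% channel_of[cycle])
  }))
}

balanced   <- is_color_balanced(c("ACGT", "GTAC"))  # each cycle mixes channels
unbalanced <- is_color_balanced(c("ACGT", "CAGT"))  # cycle 1 is A and C only
print(c(balanced = balanced, unbalanced = unbalanced))
```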
The script is written in R, so you need an R interpreter to use it.
Rscript pub.R <csv_file_with_barcodes> <number_of_samples>
Run Rscript pub.R -h to see the help message in the console.
Pick 28 barcodes from the file my_favorite_barcodes.csv.
Rscript pub.R my_favorite_barcodes.csv 28
"pub.R" accepts either "single" or "double" format of input files.
Single format: a two-column CSV file. A header is recommended but not mandatory.
First column: barcode name. Second column: barcode sequence.
Example:
I7_Index_ID,index
P1-A1,TTACCGAC
P2-A2,AGTGACCT
P3-A3,TCGGATTC
P4-A4,CAAGGTAC
Double format: a four-column CSV file. A header is recommended but not mandatory.
First column: i7 barcode name. Second column: i7 barcode sequence. Third column: i5 barcode name. Fourth column: i5 barcode sequence.
Example:
I7_Index_ID,index,I5_Index_ID,index2
N701,TAAGGCGA,S502,CTCTCTAT
N702,CGTACTAG,S502,CTCTCTAT
N703,AGGCAGAA,S502,CTCTCTAT
N704,TCCTGAGC,S502,CTCTCTAT
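One simple way to tell the two formats apart is by column count. The sketch below is only an illustration: `detect_format` is a hypothetical helper, and "pub.R" may distinguish the formats differently.

```r
# Hypothetical helper: classify a barcode table by its number of columns
detect_format <- function(df) {
  switch(as.character(ncol(df)),
    "2" = "single",
    "4" = "double",
    stop("expected 2 or 4 columns, got ", ncol(df))
  )
}

# Inline copies of the sample inputs from this README
single_df <- read.csv(
  text = "I7_Index_ID,index\nP1-A1,TTACCGAC\nP2-A2,AGTGACCT"
)
double_df <- read.csv(
  text = "I7_Index_ID,index,I5_Index_ID,index2\nN701,TAAGGCGA,S502,CTCTCTAT"
)

print(detect_format(single_df))
print(detect_format(double_df))
```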
You can "mute" (or comment out, if you wish) lines in your barcode file with #. Muted lines will be ignored by "pub.R".
Example (barcode P2-A2 will be ignored by "pub.R"):
I7_Index_ID,index
P1-A1,TTACCGAC
# P2-A2,AGTGACCT
P3-A3,TCGGATTC
P4-A4,CAAGGTAC
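In base R, this kind of muting can be reproduced with the `comment.char` argument of `read.csv()`. The snippet below is a minimal sketch of that approach, not necessarily how "pub.R" parses its input.

```r
# Inline copy of the example file above, with one muted line
csv_text <- "I7_Index_ID,index
P1-A1,TTACCGAC
# P2-A2,AGTGACCT
P3-A3,TCGGATTC
P4-A4,CAAGGTAC"

# comment.char = "#" makes read.csv() drop everything after "#",
# so a line that starts with "#" is skipped entirely
barcodes <- read.csv(
  text = csv_text,
  comment.char = "#",
  stringsAsFactors = FALSE
)

print(barcodes$I7_Index_ID)
```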