Creating the CSV file
The CSV file is required for the Quality Control step. It contains the samples' barcodes and adapter sequences. UCEasy uses it to create a configuration file for Illumiprocessor, the adapter trimming program, so be sure to also check "Creating a Configuration File" in Illumiprocessor's documentation.
Here is an example CSV file for an Illumina TruSeq v3 single-indexed adapter library:
Sample Name | i7 Barcode | i5 Barcode | i7 Adapter | i5 Adapter |
---|---|---|---|---|
U4172-A1 | CGATGT | | AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG | AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT |
U4176-A2 | TGACCA | | | |
U4155-A3 | ACAGTG | | | |
U4315-A4 | GCCAAT | | | |
Note that the samples only have an i7 barcode, and that the i7 adapter has a * in the middle of its sequence: the i7 barcode will be inserted at this "gap" in the adapter. There is no i5 barcode and no * in the i5 adapter, which means this is a single-indexed library. If this is the case for your samples, don't forget to pass the --single-indexed flag (or --si for short) to the quality-control command so that UCEasy can create the appropriate configuration file for trimming.
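The role of the * placeholder can be illustrated with a minimal Python sketch. The helper name insert_barcode is hypothetical and is not part of UCEasy or Illumiprocessor. It only performs the literal substitution described above, and makes no claim about how the trimming program handles the sequences internally.

```python
def insert_barcode(adapter: str, barcode: str) -> str:
    """Return the adapter with the barcode placed at the '*' gap.

    Hypothetical helper for illustration only.
    """
    if adapter.count("*") != 1:
        raise ValueError("expected exactly one '*' placeholder in the adapter")
    return adapter.replace("*", barcode)


# i7 adapter and first i7 barcode taken from the example table above.
I7_ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG"
full_adapter = insert_barcode(I7_ADAPTER, "CGATGT")
print(full_adapter)
```

An adapter without a *, such as the i5 adapter of a single-indexed library, has no gap to fill, which is why the sketch rejects it.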
For this CSV table, the illumiprocessor.conf file will look like this:
[adapters]
i7:AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
sample0_barcode_i7:CGATGT
sample1_barcode_i7:TGACCA
sample2_barcode_i7:ACAGTG
sample3_barcode_i7:GCCAAT
[tag map]
U4172-A1:sample0_barcode_i7
U4176-A2:sample1_barcode_i7
U4155-A3:sample2_barcode_i7
U4315-A4:sample3_barcode_i7
[names]
U4172-A1:U4172-A1
U4176-A2:U4176-A2
U4155-A3:U4155-A3
U4315-A4:U4315-A4
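To make the mapping from table rows to configuration sections explicit, here is a rough Python sketch that reproduces the layout of the listing above. It is not UCEasy's actual implementation: the function build_conf, its parameters, and the representation of samples as (name, i7 barcode, i5 barcode) tuples are assumptions made for illustration. Only the section names and the sampleN_barcode_i7 naming scheme are taken from the listing.

```python
def build_conf(samples, i7_adapter, i5_adapter, single_indexed=True):
    """Build illumiprocessor.conf-style text from sample rows.

    Hypothetical helper; `samples` is a list of
    (sample name, i7 barcode, i5 barcode) tuples.
    """
    tags, tag_map, names = [], [], []
    for n, (name, i7, i5) in enumerate(samples):
        keys = [f"sample{n}_barcode_i7"]
        tags.append(f"{keys[0]}:{i7}")
        if not single_indexed:
            # Double-indexed libraries also get an i5 tag per sample.
            keys.append(f"sample{n}_barcode_i5")
            tags.append(f"{keys[1]}:{i5}")
        tag_map.append(f"{name}:{','.join(keys)}")
        names.append(f"{name}:{name}")
    lines = [
        "[adapters]", f"i7:{i7_adapter}", f"i5:{i5_adapter}",
        "[tag sequences]", *tags,
        "[tag map]", *tag_map,
        "[names]", *names,
    ]
    return "\n".join(lines) + "\n"


# Rows from the single-indexed example table (no i5 barcodes).
SAMPLES = [
    ("U4172-A1", "CGATGT", ""),
    ("U4176-A2", "TGACCA", ""),
    ("U4155-A3", "ACAGTG", ""),
    ("U4315-A4", "GCCAAT", ""),
]
conf = build_conf(
    SAMPLES,
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG",
    "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT",
    single_indexed=True,
)
print(conf)
```

Passing single_indexed=False in this sketch adds a sampleN_barcode_i5 entry for each sample and lists both tags in the tag map.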
If you have a double-indexed library like the Illumina TruSeq HT, you'll need to specify both the i7 and i5 barcodes for each sample. Also note that the i5 adapter now has a * marking the location of the i5 barcode.
Sample Name | i7 Barcode | i5 Barcode | i7 Adapter | i5 Adapter |
---|---|---|---|---|
U4172-A1 | CGATGT | CAGATC | AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG | AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT |
U4176-A2 | TGACCA | CTTGTA | | |
U4155-A3 | ACAGTG | ATCACG | | |
U4315-A4 | GCCAAT | TTAGGC | | |
Finally, the illumiprocessor.conf content for this double-indexed CSV will be:
[adapters]
i7:AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
sample0_barcode_i7:CGATGT
sample0_barcode_i5:CAGATC
sample1_barcode_i7:TGACCA
sample1_barcode_i5:CTTGTA
sample2_barcode_i7:ACAGTG
sample2_barcode_i5:ATCACG
sample3_barcode_i7:GCCAAT
sample3_barcode_i5:TTAGGC
[tag map]
U4172-A1:sample0_barcode_i7,sample0_barcode_i5
U4176-A2:sample1_barcode_i7,sample1_barcode_i5
U4155-A3:sample2_barcode_i7,sample2_barcode_i5
U4315-A4:sample3_barcode_i7,sample3_barcode_i5
[names]
U4172-A1:U4172-A1
U4176-A2:U4176-A2
U4155-A3:U4155-A3
U4315-A4:U4315-A4
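Before running the trimming step, it can be useful to check that a generated file is internally consistent. The sketch below is not part of UCEasy or Illumiprocessor: the function check_conf is hypothetical, and the sketch assumes the file can be read with Python's standard configparser using ":" as the key/value delimiter. The sample text is a shortened version of the double-indexed example, with the adapters as listed in the double-indexed table.

```python
import configparser

CONF_TEXT = """\
[adapters]
i7:AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
sample0_barcode_i7:CGATGT
sample0_barcode_i5:CAGATC
[tag map]
U4172-A1:sample0_barcode_i7,sample0_barcode_i5
[names]
U4172-A1:U4172-A1
"""


def check_conf(text):
    """Return a list of consistency problems found in the conf text.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser(delimiters=(":",), interpolation=None)
    parser.optionxform = str  # keep sample names case-sensitive
    parser.read_string(text)
    problems = []
    tags = parser["tag sequences"]
    for sample, keys in parser["tag map"].items():
        # Every tag referenced in the tag map must be defined.
        for key in keys.split(","):
            if key not in tags:
                problems.append(f"{sample}: undefined tag {key}")
        # Every mapped sample must also have an output name.
        if sample not in parser["names"]:
            problems.append(f"{sample}: missing from [names]")
    return problems


print(check_conf(CONF_TEXT))
```

An empty list means every tag referenced in the tag map is defined under tag sequences and every mapped sample has an entry under names.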