
Creating the CSV file

Caio Raposo edited this page Aug 24, 2021 · 4 revisions

The CSV file is required for the Quality Control step. It contains the sample barcodes and the sequencing adapters. UCEasy uses it to create a configuration file for the Illumiprocessor adapter trimming program, so be sure to also check "Creating a Configuration File" in Illumiprocessor's documentation.

Single Indexed

Here is an example CSV file for an Illumina TruSeq v3 single-indexed adapter library:

| Sample Name | i7 Barcode | i5 Barcode | i7 Adapter | i5 Adapter |
|---|---|---|---|---|
| U4172-A1 | CGATGT | | AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG | AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT |
| U4176-A2 | TGACCA | | | |
| U4155-A3 | ACAGTG | | | |
| U4315-A4 | GCCAAT | | | |

Note that the samples only have an i7 barcode, and that the i7 adapter has a * in the middle of its sequence: the i7 barcode will be inserted at this "gap" in the adapter. There is no i5 barcode and no * in the i5 adapter, which means this is a single-indexed library. If this is the case for your samples, don't forget to pass the --single-indexed option (or --si for short) to the quality-control command so that UCEasy can create the appropriate configuration file for trimming.
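To make the role of the * concrete, here is a minimal Python sketch of the substitution described above. The helper name fill_adapter is hypothetical and is not part of UCEasy or Illumiprocessor. It only illustrates the placeholder idea and ignores any orientation handling the trimming program may perform.

```python
def fill_adapter(adapter: str, barcode: str) -> str:
    """Replace the single '*' placeholder in an adapter with a barcode.

    Hypothetical helper for illustration only.
    """
    if adapter.count("*") != 1:
        raise ValueError("adapter must contain exactly one '*' placeholder")
    return adapter.replace("*", barcode)


# i7 adapter and first sample barcode taken from the table above.
i7_adapter = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG"
full = fill_adapter(i7_adapter, "CGATGT")
print(full)
```

An adapter without a *, such as the i5 adapter of a single-indexed library, is rejected by this sketch because there is no position to fill.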

For this CSV table the illumiprocessor.conf file will look like this:

[adapters]
i7:AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT

[tag sequences]
sample0_barcode_i7:CGATGT
sample1_barcode_i7:TGACCA
sample2_barcode_i7:ACAGTG
sample3_barcode_i7:GCCAAT

[tag map]
U4172-A1:sample0_barcode_i7
U4176-A2:sample1_barcode_i7
U4155-A3:sample2_barcode_i7
U4315-A4:sample3_barcode_i7

[names]
U4172-A1:U4172-A1
U4176-A2:U4176-A2
U4155-A3:U4155-A3
U4315-A4:U4315-A4
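
The mapping from table rows to the four sections above can be sketched in Python as follows. This is an illustration under stated assumptions, not UCEasy's actual implementation: the function build_conf and its parameters are hypothetical, and the rows are given as in-memory tuples rather than read from a file, because the exact CSV layout is not reproduced here.

```python
def build_conf(samples, i7_adapter, i5_adapter, single_indexed=False):
    """Return illumiprocessor.conf-style text (hypothetical helper).

    samples: list of (sample_name, i7_barcode, i5_barcode) tuples.
    """
    tags, tag_map, names = [], [], []
    for n, (name, i7, i5) in enumerate(samples):
        labels = [f"sample{n}_barcode_i7"]
        tags.append(f"{labels[0]}:{i7}")
        if not single_indexed:
            # Double-indexed libraries also get an i5 tag per sample.
            labels.append(f"sample{n}_barcode_i5")
            tags.append(f"{labels[1]}:{i5}")
        tag_map.append(f"{name}:{','.join(labels)}")
        names.append(f"{name}:{name}")
    sections = [
        ("adapters", [f"i7:{i7_adapter}", f"i5:{i5_adapter}"]),
        ("tag sequences", tags),
        ("tag map", tag_map),
        ("names", names),
    ]
    blocks = [f"[{title}]\n" + "\n".join(lines) for title, lines in sections]
    return "\n\n".join(blocks) + "\n"


# Rows from the single-indexed example; the i5 barcode is left empty.
samples = [
    ("U4172-A1", "CGATGT", ""),
    ("U4176-A2", "TGACCA", ""),
    ("U4155-A3", "ACAGTG", ""),
    ("U4315-A4", "GCCAAT", ""),
]
conf = build_conf(
    samples,
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG",
    "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT",
    single_indexed=True,
)
print(conf)
```

Passing single_indexed=False together with i5 barcodes adds one sampleN_barcode_i5 entry per sample and lists both tags in the tag map.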

Double Indexed

If you have a double-indexed library, such as the Illumina TruSeq HT, you need to specify both the i7 and i5 barcodes for every sample. Also note that the i5 adapter now contains a * marking the location of the i5 barcode.

| Sample Name | i7 Barcode | i5 Barcode | i7 Adapter | i5 Adapter |
|---|---|---|---|---|
| U4172-A1 | CGATGT | CAGATC | AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG | AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT |
| U4176-A2 | TGACCA | CTTGTA | | |
| U4155-A3 | ACAGTG | ATCACG | | |
| U4315-A4 | GCCAAT | TTAGGC | | |

And finally the illumiprocessor.conf content for this double indexed CSV will be:

[adapters]
i7:AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT

[tag sequences]
sample0_barcode_i7:CGATGT
sample0_barcode_i5:CAGATC
sample1_barcode_i7:TGACCA
sample1_barcode_i5:CTTGTA
sample2_barcode_i7:ACAGTG
sample2_barcode_i5:ATCACG
sample3_barcode_i7:GCCAAT
sample3_barcode_i5:TTAGGC

[tag map]
U4172-A1:sample0_barcode_i7,sample0_barcode_i5
U4176-A2:sample1_barcode_i7,sample1_barcode_i5
U4155-A3:sample2_barcode_i7,sample2_barcode_i5
U4315-A4:sample3_barcode_i7,sample3_barcode_i5

[names]
U4172-A1:U4172-A1
U4176-A2:U4176-A2
U4155-A3:U4155-A3
U4315-A4:U4315-A4
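
Before running the quality-control step, it can help to check that the table agrees with the indexing mode you intend to use. The Python sketch below encodes the rules described on this page: the i7 adapter always carries a *, the i5 adapter carries one only for double-indexed libraries, and i5 barcodes are present only in that case. The function check_rows is a hypothetical helper, not something UCEasy provides.

```python
def check_rows(samples, i7_adapter, i5_adapter, single_indexed=False):
    """Return a list of problems found in the table (hypothetical helper).

    samples: list of (sample_name, i7_barcode, i5_barcode) tuples.
    """
    problems = []
    if i7_adapter.count("*") != 1:
        problems.append("i7 adapter needs exactly one '*'")
    expected_i5_stars = 0 if single_indexed else 1
    if i5_adapter.count("*") != expected_i5_stars:
        problems.append(f"i5 adapter should contain {expected_i5_stars} '*'")
    for name, i7, i5 in samples:
        if not i7:
            problems.append(f"{name}: missing i7 barcode")
        if single_indexed and i5:
            problems.append(f"{name}: i5 barcode given for a single-indexed library")
        if not single_indexed and not i5:
            problems.append(f"{name}: missing i5 barcode")
        if set(i7 + i5) - set("ACGT"):
            problems.append(f"{name}: barcode contains characters other than A, C, G, T")
    return problems


# Rows and adapters from the double-indexed example above.
I7 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG"
I5 = "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT"
samples = [
    ("U4172-A1", "CGATGT", "CAGATC"),
    ("U4176-A2", "TGACCA", "CTTGTA"),
    ("U4155-A3", "ACAGTG", "ATCACG"),
    ("U4315-A4", "GCCAAT", "TTAGGC"),
]
print(check_rows(samples, I7, I5))
```

Calling the same function with single_indexed=True on this table reports the extra * in the i5 adapter and the i5 barcode of every sample, which is the kind of mismatch that would otherwise surface only at trimming time.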