# Creating barcode/index pairs

In [1]:
import warnings
import os

def env_check(silent=False):
    """Checks first that the user is in an evSeq environment, then
    checks that evSeq can be imported (regardless of env). If the
    import fails, raises an error. Otherwise, warns or prints
    status of env/import, unless `silent` is True.
    """
    in_env = True
    # .get avoids a KeyError when no conda environment is active
    env = os.environ.get('CONDA_DEFAULT_ENV', '')
    if 'evSeq' not in env:
        in_env = False
        if not silent:
            warnings.warn('Not in an evSeq environment')
    try:
        import evSeq
        if not silent:
            print('Successfully imported evSeq')
    except ImportError:
        message = 'Could not import evSeq'
        if not in_env:
            message = f'{message} or activate environment'
        raise ImportError(f'{message}.')

env_check()

Successfully imported evSeq


In [2]:
from evSeq.util import (index_plate_maker, generate_index_map,
                        check_barcode_pairings, save_csv)

*IMPORTANT:* `evSeq` has only been tested thoroughly with the provided barcode sequencing primers and dual-index plates given in the `index_map.csv` file found [here](https://github.com/fhalab/evSeq/blob/master/evSeq/util/index_map.csv). Any changes to this file in your own repo/installation have not been tested or validated by us, so use them at your own risk. We *have* seen a barcode be incompatible with the rest of the `evSeq` machinery! This has happened for only one barcode out of 192 (~0.5%), but caution should be exercised. Additionally, our adapter regions have been thoroughly tested, and previous iterations caused significant issues, so we do not recommend changing them.

## Using new barcode primers

Barcodes in the NGS results are parsed against `index_map.csv` to assign forward and reverse barcode pairs back to specific plate-well positions. This index map is built by pairing barcodes from the forward and reverse master barcode primer plates according to a chosen layout. If you need more or different barcodes, order them and record them in a new file with the same format as the barcode primer file found [here](https://github.com/fhalab/evSeq/blob/master/lib_prep_tools/evSeq_barcode_primer_seqs.csv). The 'Sequence' column contains the `evSeq` adapter sequences:
```
Forward: 'CACCCAAGACCACTCTCCGG'
Reverse: 'CGGTGTGCGAAGTAGGTGC'
```

followed by a 7-base barcode.

Once a new file is made, the `index_map.csv` file can be updated with code from `evSeq.util` as follows:

In [3]:
# Set the path to the file
path_to_file = '../lib_prep_tools/IdtOrderForm.xlsx'

# Generate a new index map
new_map = generate_index_map(barcode_plate_seqs=path_to_file)

# Check
new_map

Unnamed: 0,IndexPlate,Well,FBC,RBC
0,DI01,A01,GATCATG,GAACTGC
1,DI01,A02,TACATGG,ACCAGGT
2,DI01,A03,AAGCACC,TCTAGAG
3,DI01,A04,TGGCTCA,CACACAA
4,DI01,A05,CTTGCTC,GTGGAAC
...,...,...,...,...
763,DI08,H08,ATTGCCT,GTGAGAT
764,DI08,H09,CATTCGA,TTGGCAG
765,DI08,H10,GCACAAT,ATGCCTG
766,DI08,H11,GCAGTAA,TCCGAAG
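
Because each barcode is expected to be 7 bases long, a quick sanity check on the `FBC` and `RBC` columns can catch malformed entries in a new primer file. The snippet below is a minimal sketch using only the standard library. The helper `malformed_barcodes` is hypothetical (it is not part of `evSeq`), and the barcodes are copied from the first rows of the output above.

```python
import re

def malformed_barcodes(barcodes, length=7):
    """Return the barcodes that are not exactly `length` bases of A/C/G/T.

    Hypothetical helper for illustration; not part of evSeq.
    """
    pattern = re.compile(rf'[ACGT]{{{length}}}')
    return [bc for bc in barcodes if not pattern.fullmatch(str(bc))]

# First forward and reverse barcodes from the output shown above
fbc = ['GATCATG', 'TACATGG', 'AAGCACC', 'TGGCTCA', 'CTTGCTC']
rbc = ['GAACTGC', 'ACCAGGT', 'TCTAGAG', 'CACACAA', 'GTGGAAC']

print(malformed_barcodes(fbc + rbc))                       # []
print(malformed_barcodes(['GATCATG', 'GATCNTG', 'GATC']))  # ['GATCNTG', 'GATC']
```

With a real map, the same function should accept the columns directly, e.g. `malformed_barcodes(new_map['FBC'])`, since it only iterates over the values.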


This function pulls the barcodes from the forward and reverse primer sequences and arrays them properly. You may then check that your pairings are all unique via:

In [4]:
check_barcode_pairings(new_map)

This will raise an error if there are non-unique pairings. If it passes silently, you can then save your new mapping with:

In [5]:
# Commented out to not create an unnecessary file
# save_csv(new_map, filename='new_index_map.csv')
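
To make concrete what a "non-unique pairing" means, the sketch below finds repeated (forward, reverse) barcode pairs in plain Python. It is an illustration only and not the implementation of `check_barcode_pairings`. The helper name `duplicated_pairs` and the toy lists are assumptions made for the example.

```python
from collections import Counter

def duplicated_pairs(fbcs, rbcs):
    """Return each (FBC, RBC) pair that occurs more than once.

    Hypothetical helper for illustration; not part of evSeq.
    """
    counts = Counter(zip(fbcs, rbcs))
    return [pair for pair, n in counts.items() if n > 1]

# The first and last entries form the same pair, so it is reported
fbcs = ['GATCATG', 'TACATGG', 'AAGCACC', 'GATCATG']
rbcs = ['GAACTGC', 'ACCAGGT', 'TCTAGAG', 'GAACTGC']
print(duplicated_pairs(fbcs, rbcs))  # [('GATCATG', 'GAACTGC')]

# Reusing a single barcode is fine as long as the pair itself is new
print(duplicated_pairs(['GATCATG', 'GATCATG'], ['GAACTGC', 'ACCAGGT']))  # []
```

Only the combination matters: the same forward barcode appears on every index plate, and it is the reverse barcode paired with it that changes.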

## Creating new index pair mappings

`evSeq.util` also provides code to create new index pair mappings (i.e., how to arrange the two plates of forward and reverse barcode primers), which can then be saved in the same way as above.

The simplest change that can be made is to allow twelve dual-index plates to be generated from the standard `evSeq` barcode primer plates, i.e., to cycle by columns (1-12) instead of rows (A-H).

For example, the current dual-index plates are made as follows:

In [6]:
index_plate_maker(
#     axis='row', # this is default
#     FBC_map='stamp', # this is default
#     RBC_map='A->N', # this is default for axis='row'
    hide_wells=True, # hide wells to show how the axis (row/column) is cycled
)

Unnamed: 0,IndexPlate,Destination,FBCSource,RBCSource
0,DI01,A,A,A
1,DI01,B,B,B
2,DI01,C,C,C
3,DI01,D,D,D
4,DI01,E,E,E
...,...,...,...,...
59,DI08,D,D,E
60,DI08,E,E,F
61,DI08,F,F,G
62,DI08,G,G,H


You could also change the mapping so that, instead of the *N*th row of plate DI*N* containing row A of the reverse barcode plate (`A->N`), row A of plate DI*N* contains the *N*th row of the reverse barcode plate (`N->A`):

In [7]:
index_plate_maker(
    axis='row',
    RBC_map='N->A', # swap how rows are cycled
    hide_wells=True,
)

Unnamed: 0,IndexPlate,Destination,FBCSource,RBCSource
0,DI01,A,A,A
1,DI01,B,B,B
2,DI01,C,C,C
3,DI01,D,D,D
4,DI01,E,E,E
...,...,...,...,...
59,DI08,D,D,C
60,DI08,E,E,D
61,DI08,F,F,E
62,DI08,G,G,F
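
The two row-cycling schemes can be written as a small piece of modular arithmetic. The sketch below is inferred from the two tables above and is not taken from the `evSeq` source. The function `rbc_source_row` is a hypothetical name used only for illustration.

```python
ROWS = 'ABCDEFGH'

def rbc_source_row(plate_number, destination_row, rbc_map='A->N'):
    """Return the reverse-barcode source row for a destination row.

    Inferred from the tables above; not the evSeq implementation.
    """
    shift = plate_number - 1          # plate DI01 has no shift
    dest = ROWS.index(destination_row)
    if rbc_map == 'A->N':
        # Source row A lands in the Nth destination row
        src = (dest - shift) % len(ROWS)
    elif rbc_map == 'N->A':
        # The Nth source row lands in destination row A
        src = (dest + shift) % len(ROWS)
    else:
        raise ValueError(f'Unknown rbc_map: {rbc_map!r}')
    return ROWS[src]

for rbc_map in ('A->N', 'N->A'):
    sources = [rbc_source_row(8, row, rbc_map) for row in 'DEFG']
    print(rbc_map, sources)
# A->N ['E', 'F', 'G', 'H']
# N->A ['C', 'D', 'E', 'F']
```

The printed rows match the `RBCSource` values shown for plate DI08 in the two outputs above, and on plate DI01 both schemes reduce to a straight stamp.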


As stated above, you could also cycle by columns and create 12 index plates:

In [8]:
index_plate_maker(
    axis='column', # swap to columns
#     FBC_map='stamp',
#     RBC_map='1->N', # this becomes default for axis='column'
    hide_wells=True,
)

Unnamed: 0,IndexPlate,Destination,FBCSource,RBCSource
0,DI01,1,1,1
1,DI01,2,2,2
2,DI01,3,3,3
3,DI01,4,4,4
4,DI01,5,5,5
...,...,...,...,...
139,DI12,8,8,9
140,DI12,9,9,10
141,DI12,10,10,11
142,DI12,11,11,12
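
One way to see why cycling produces usable layouts is to count source-well pairs by brute force. The sketch below assumes the default behavior described above (forward barcodes stamped into the same well, reverse barcodes shifted by one more position along the chosen axis on each successive plate). The function `source_well_pairs` is hypothetical and not part of `evSeq`.

```python
N_ROWS, N_COLS = 8, 12  # 96-well plate: rows A-H, columns 1-12

def source_well_pairs(axis):
    """Return (FBC well, RBC well) source pairs for every well of every plate.

    Assumes FBC is stamped and RBC is cyclically shifted along `axis`;
    an illustration only, not the evSeq implementation.
    """
    n_plates = N_ROWS if axis == 'row' else N_COLS
    pairs = []
    for plate in range(n_plates):
        for row in range(N_ROWS):
            for col in range(N_COLS):
                if axis == 'row':
                    rbc_well = ((row - plate) % N_ROWS, col)
                else:
                    rbc_well = (row, (col - plate) % N_COLS)
                pairs.append(((row, col), rbc_well))
    return pairs

for axis in ('row', 'column'):
    pairs = source_well_pairs(axis)
    print(axis, len(pairs), len(set(pairs)) == len(pairs))
# row 768 True
# column 1152 True
```

Unique source-well pairs only translate into unique barcode pairs if the barcodes within each primer plate are themselves distinct, which is why checking the final map is still worthwhile.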


As above, when you generate a full plate layout from `index_plate_maker`, you should confirm that there are no duplicate barcode pairings with the `check_barcode_pairings` function:

In [9]:
# Create the new layout
new_map = index_plate_maker(axis='row', RBC_map='N->A')

# Passes silently if there are no duplicates in the map
check_barcode_pairings(new_map)

Once you have a new layout, you may pair this layout with a list of your forward and reverse barcode primers (following the same format as [the standard barcode primer file](https://github.com/fhalab/evSeq/blob/master/lib_prep_tools/evSeq_barcode_primer_seqs.csv)) to generate actual barcode pairings with the function `generate_index_map`:

In [10]:
# Create the new mapping
new_mapping = index_plate_maker(
    plate_prefix='Column-DI',
    axis='column',
)

# Path to the barcode primer info
barcode_plate_seqs = '../lib_prep_tools/IdtOrderForm.xlsx'

# Create a new index_map
new_map = generate_index_map(
    barcode_plate_seqs,
    new_mapping,
#     NGS_adapter_f='TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG', # default is Nextera i5
#     NGS_adapter_r='GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG', # default is Nextera i7
)

new_map

Unnamed: 0,IndexPlate,Well,FBC,RBC
0,Column-DI01,A01,GATCATG,GAACTGC
1,Column-DI01,A02,TACATGG,ACCAGGT
2,Column-DI01,A03,AAGCACC,TCTAGAG
3,Column-DI01,A04,TGGCTCA,CACACAA
4,Column-DI01,A05,CTTGCTC,GTGGAAC
...,...,...,...,...
1147,Column-DI12,H08,ATTGCCT,ACTTGCA
1148,Column-DI12,H09,CATTCGA,ACGCGAT
1149,Column-DI12,H10,GCACAAT,TCGACAC
1150,Column-DI12,H11,GCAGTAA,ACTCAAC


You can again use `check_barcode_pairings` to confirm that these pairs are unique, and `save_csv` to save the new `index_map`:

In [11]:
check_barcode_pairings(new_map)

# save_csv(new_map, filename='columnwise_index_map.csv')

---
*Next page: [Running `evSeq` in a Jupyter Notebook](8-full_demo.html).*

*Back to the [main page](index.html).*