In [1]:
import pandas as pd
import csv, warnings
warnings.filterwarnings("ignore")

In [5]:
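def write_to_file(output_filename, df, subpool_name, taxon_id, crispr_mech, gene_id_col, gene_sym_col, guide_col):
    """Write the 6-column order file, one row per guide in df."""
    # newline='' stops the csv module from writing blank lines between rows on Windows
    with open(output_filename, 'w', newline='') as o:
        w = csv.writer(o)
        w.writerow(['Subpool Name', 'Taxon ID', 'CRISPR Mechanism', 'Gene ID', 'Gene Symbol', 'sgRNA sequence'])
        # Subpool name, taxon ID and mechanism are constant; the rest come from the named columns
        for _, r in df.iterrows():
            w.writerow([subpool_name, taxon_id, crispr_mech, r[gene_id_col], r[gene_sym_col], r[guide_col]])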

This notebook generates a 6-column order file to send to Xiaoping when ordering a CRISPR knockout or base-editor library.
A few points to note:
1. The input (design) file is a tab-delimited .txt file with columns for at least guide sequences, gene symbols and gene IDs. It can have additional columns.
2. Any controls to be included should be appended to the design file: the guide sequence column holds the control's guide sequence, the gene symbol column holds the type of control, and the gene ID column can also hold the type of control.
3. Run the code blocks below and enter the required information at the prompts.
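
To make point 2 concrete, here is a minimal sketch of a design file with one control appended. It uses only the standard library so it does not depend on the notebook's state. The column names, the gene entry, the `NON_TARGETING` label and both sequences are placeholders chosen for illustration, not values from a real design.

```python
import csv
import io

# Hypothetical column names; use whatever your design file actually calls them.
COLUMNS = ["Gene ID", "Gene Symbol", "sgRNA Sequence"]

# Placeholder targeting guide (not a real gene or guide sequence).
design_rows = [
    {"Gene ID": "GENE_ID_1", "Gene Symbol": "GENE1", "sgRNA Sequence": "ACGTACGTACGTACGTACGT"},
]

# Placeholder control: the gene symbol and gene ID columns both carry the control type.
control_rows = [
    {"Gene ID": "NON_TARGETING", "Gene Symbol": "NON_TARGETING", "sgRNA Sequence": "TTTTGGGGCCCCAAAATTTT"},
]


def build_design_text(rows, controls, columns=COLUMNS):
    """Return tab-delimited design-file text with the controls appended after the guides."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows + controls)
    return buffer.getvalue()


design_text = build_design_text(design_rows, control_rows)
print(design_text)
```

Saving `design_text` to a .txt file would give an input that the design-file prompt can read.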

In [1]:
input_file = input("Please enter design file: ")
input_df = pd.read_csv(input_file, sep='\t')
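
The column names entered at the later prompts must match the design file's header exactly, otherwise the row lookup fails with a `KeyError`. The sketch below shows one way to check this up front. `missing_columns` is a hypothetical helper, and the header list stands in for `list(input_df.columns)`.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in the order requested."""
    available = set(available)
    return [name for name in required if name not in available]


# Stand-in for list(input_df.columns); these names are illustrative only.
header = ["Gene ID", "Gene Symbol", "sgRNA Sequence", "Score"]

absent = missing_columns(header, ["Gene ID", "Gene Symbol", "Guide"])
print(absent)  # ['Guide']
```

An empty list means every requested column is present.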

Suggested inputs for the following prompts:
1. Subpool name: short description of the pool being ordered.
2. Taxon ID: 9606 for human, 10090 for mouse. The NCBI [Taxonomy Browser](https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi) tool can be used to look up the taxon ID for other species.

In [2]:
output_filename = input("Please enter output file name (.csv): ")
subpool_name = input('Please enter subpool name: ')
taxon_id = input('Please enter taxon ID: ')
crispr_mech = input('Please enter CRISPR mechanism (CRISPRbe, CRISPRko etc.): ')
gene_id_col = input('Please enter name of column with Gene ID: ')
gene_sym_col = input('Please enter name of column with Gene Symbol: ')
guide_col = input('Please enter name of column with guide sequences: ')

write_to_file(output_filename, input_df, subpool_name, taxon_id, crispr_mech, gene_id_col, gene_sym_col, guide_col)
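
Before sending the order file, it may be worth reading it back to confirm the header and the number of guides. The sketch below is one possible check and not part of the original workflow. `check_order_file` is a hypothetical helper, and the demo writes a throwaway file with placeholder values so the block runs on its own.

```python
import csv
import os
import tempfile

EXPECTED_HEADER = ['Subpool Name', 'Taxon ID', 'CRISPR Mechanism',
                   'Gene ID', 'Gene Symbol', 'sgRNA sequence']


def check_order_file(path):
    """Return the number of guide rows; raise ValueError if the file is malformed."""
    with open(path, newline='') as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header != EXPECTED_HEADER:
            raise ValueError(f"unexpected header: {header}")
        count = 0
        for line_number, row in enumerate(reader, start=2):
            # Every data row must have exactly six fields (blank rows fail this check).
            if len(row) != len(EXPECTED_HEADER):
                raise ValueError(f"row {line_number} has {len(row)} fields")
            count += 1
    return count


# Demo on a temporary file with placeholder values (not a real order).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "order.csv")
    with open(demo_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(EXPECTED_HEADER)
        writer.writerow(["demo_pool", "9606", "CRISPRko", "GENE_ID_1", "GENE1", "ACGTACGTACGTACGTACGT"])
    demo_count = check_order_file(demo_path)

print(demo_count)
```

In the notebook, calling `check_order_file(output_filename)` should return the same number as `len(input_df)`.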
