In [13]:
### TIgGER ###

# A computational method that improves VDJ gene segment allele assignments.
# First, it determines the complete set of gene segments carried by an individual
# (including novel alleles) from the VDJ rearrangement sequences.
# Second, it infers the subject's genotype from these sequences and uses
# this genotype to correct the initial VDJ gene segment allele assignments.

# - detect novel alleles.
# - infer a subject genotype.
# - correct preliminary gene segment allele assignments.


# Load packages required for this example
library(tigger)
library(dplyr)
library(ggplot2)

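library(data.table)  # provides fread()

# Read the Change-O germ-pass table and convert it to a plain data.frame
db <- as.data.frame(fread("../changeo10x/vac_heavy_germ-pass.tsv"))
head(db)

# Read the IMGT mouse IGHV germline reference sequences
germline_db <- readIgFasta("/usr/local/share/germlines/imgt/mouse/vdj/imgt_mouse_IGHV.fasta")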
# germline_db <- readIgFasta("vac_db-pass_sequences.fasta")
head(germline_db)

,sequence_id,sequence,rev_comp,productive,v_call,d_call,j_call,sequence_alignment,germline_alignment,junction,⋯,umi_count,v_call_10x,d_call_10x,j_call_10x,junction_10x,junction_10x_aa,germline_alignment_d_mask,germline_v_call,germline_d_call,germline_j_call
,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,⋯,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
1,ACGATACTCAACGCTA-1_contig_2,ACTGTTCTCTTTACAGTTACTGAGCACACAGGACCTCACCATGGGATGGAGCTGTATCATGCTCTTCTTGGCAGCAACAGCTACAGGTGTCCACTCCCAGGTCCAACTGCAGCAGCCTGGGGCTGAGCTTGTGAAGCCTGGGGCTTCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTCACCAGCTACTGGATGCACTGGGTGAAGCAGAGGCCTGGACGAGGCCTTGAGTGGATTGGAAGGATTGATCCTAATAGTGGTGGTACTAAGTACAATGAGAAGTTCAAGAGCAAGGCCACACTGACTGTAGACAAACCCTCCAGCACAGCCTACATGCAGCTCAGCAGCCTGACATCTGAGGACTCTGCGGTCTATTATTGTGCAAGATTAGGGGGCTACGGTAATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,IGHV1-72*01,IGHD1-1*01,IGHJ4*01,CAGGTCCAACTGCAGCAGCCTGGGGCT...GAGCTTGTGAAGCCTGGGGCTTCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACCAGCTACTGGATGCACTGGGTGAAGCAGAGGCCTGGACGAGGCCTTGAGTGGATTGGAAGGATTGATCCTAAT......AGTGGTGGTACTAAGTACAATGAGAAGTTCAAG...AGCAAGGCCACACTGACTGTAGACAAACCCTCCAGCACAGCCTACATGCAGCTCAGCAGCCTGACATCTGAGGACTCTGCGGTCTATTATTGTGCAAGATTAGGGGGCTACGGTAATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,CAGGTCCAACTGCAGCAGCCTGGGGCT...GAGCTTGTGAAGCCTGGGGCTTCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACCAGCTACTGGATGCACTGGGTGAAGCAGAGGCCTGGACGAGGCCTTGAGTGGATTGGAAGGATTGATCCTAAT......AGTGGTGGTACTAAGTACAATGAGAAGTTCAAG...AGCAAGGCCACACTGACTGTAGACAAACCCTCCAGCACAGCCTACATGCAGCTCAGCAGCCTGACATCTGAGGACTCTGCGGTCTATTATTGTGCAAGANNNNNNNNCTACGGTAATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,TGTGCAAGATTAGGGGGCTACGGTAATGCTATGGACTACTGG,⋯,6,IGHV1-72,,IGHJ4,TGTGCAAGATTAGGGGGCTACGGTAATGCTATGGACTACTGG,CARLGGYGNAMDYW,CAGGTCCAACTGCAGCAGCCTGGGGCT...GAGCTTGTGAAGCCTGGGGCTTCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACCAGCTACTGGATGCACTGGGTGAAGCAGAGGCCTGGACGAGGCCTTGAGTGGATTGGAAGGATTGATCCTAAT......AGTGGTGGTACTAAGTACAATGAGAAGTTCAAG...AGCAAGGCCACACTGACTGTAGACAAACCCTCCAGCACAGCCTACATGCAGCTCAGCAGCCTGACATCTGAGGACTCTGCGGTCTATTATTGTGCAAGANNNNNNNNNNNNNNNNATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTC
AG,IGHV1-72*01,IGHD1-1*01,IGHJ4*01
2,TGGCCAGTCTCTTATG-1_contig_1,GAACAACCCATGATCAGTATCCTCTCCACAGTCACTGAAGACACTGACTCAAACCATGGAATGGTGCTGGGTCTTTCTCTTCCTCCTGTCAGTAACTGCAGGTGTCCACTCCCAGGTCCAGCTGCAGCAGTCTGGAGCTGAGCTGGTGAAACCCGGGGCATCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTCACTGAGTATACTATACACTGGGTAAAGCAGAGGTCTGGACAGGGTCTTGAGTGGATTGGGTGGTTTTACCCTGGAAGTGGTAGTATAAAGTACAATGAGAAATTCAAGGACAAGGCCACATTGACTGCGGACAAATCCTCCAGCACAGTCTATATGGAGCTTAGTAGATTGACATCTGAAGACTCTGCGGTCTATTTCTGTGCAAGACACGAAGAAGACTACTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,"IGHV1-62-2*01,IGHV1-71*01",,IGHJ4*01,CAGGTCCAGCTGCAGCAGTCTGGAGCT...GAGCTGGTGAAACCCGGGGCATCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACTGAGTATACTATACACTGGGTAAAGCAGAGGTCTGGACAGGGTCTTGAGTGGATTGGGTGGTTTTACCCTGGA......AGTGGTAGTATAAAGTACAATGAGAAATTCAAG...GACAAGGCCACATTGACTGCGGACAAATCCTCCAGCACAGTCTATATGGAGCTTAGTAGATTGACATCTGAAGACTCTGCGGTCTATTTCTGTGCAAGACACGAAGAAGACTACTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,CAGGTCCAGCTGCAGCAGTCTGGAGCT...GAGCTGGTGAAACCCGGGGCATCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACTGAGTATACTATACACTGGGTAAAGCAGAGGTCTGGACAGGGTCTTGAGTGGATTGGGTGGTTTTACCCTGGA......AGTGGTAGTATAAAGTACAATGAGAAATTCAAG...GACAAGGCCACATTGACTGCGGACAAATCCTCCAGCACAGTCTATATGGAGCTTAGTAGATTGACATCTGAAGACTCTGCGGTCTATTTCTGTGCAAGACACGAAGANNNNTACTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,TGTGCAAGACACGAAGAAGACTACTATGCTATGGACTACTGG,⋯,9,IGHV1-71,,IGHJ4,TGTGCAAGACACGAAGAAGACTACTATGCTATGGACTACTGG,CARHEEDYYAMDYW,CAGGTCCAGCTGCAGCAGTCTGGAGCT...GAGCTGGTGAAACCCGGGGCATCAGTGAAGCTGTCCTGCAAGGCTTCTGGCTACACCTTC............ACTGAGTATACTATACACTGGGTAAAGCAGAGGTCTGGACAGGGTCTTGAGTGGATTGGGTGGTTTTACCCTGGA......AGTGGTAGTATAAAGTACAATGAGAAATTCAAG...GACAAGGCCACATTGACTGCGGACAAATCCTCCAGCACAGTCTATATGGAGCTTAGTAGATTGACATCTGAAGACTCTGCGGTCTATTTCTGTGCAAGACACGAAGANNNNTACTATGCTATGGACTACTGGGGTCAAGG
AACCTCAGTCACCGTCTCCTCAG,IGHV1-62-2*01,,IGHJ4*01
3,ATAAGAGCAAGAAAGG-1_contig_2,CTGGAATTGATTCCTAGTTCCTCACGTTCAGTGATGAGTACTGAACACAGACCCCTCACCATGAACTTCGGGCTCAGATTGATTTTCCTTGTCCTTACTTTAAAAGGTGTCCAGTGTGACGTGAAGCTGGTGGAGTCTGGGGAAGGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTCAGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCAGAGAAGAGGCTGGAGTGGGTCGCATACATTAGTAGTGGTGGTGATTACATCTACTATGCAGACACTGTGAAGGGCCGATTCACCATCTCCAGAGACAATGCCAGGAACACCCTGTACCTGCAAATGAGCAGTCTGAAGTCTGAGGACACAGCCATGTATTACTGTACAAGAGATTCTACTACCCTAGCACACTTTGACTACTGGGGCCAAGGCACCACTCTCACAGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,IGHV5-9-1*02,"IGHD2-1*01,IGHD2-13*01,IGHD2-2*01",IGHJ2*01,GACGTGAAGCTGGTGGAGTCTGGGGAA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCAGAGAAGAGGCTGGAGTGGGTCGCATACATTAGTAGTGGT......GGTGATTACATCTACTATGCAGACACTGTGAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAGGAACACCCTGTACCTGCAAATGAGCAGTCTGAAGTCTGAGGACACAGCCATGTATTACTGTACAAGAGATTCTACTACCCTAGCACACTTTGACTACTGGGGCCAAGGCACCACTCTCACAGTCTCCTCAG,GACGTGAAGCTGGTGGAGTCTGGGGAA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCAGAGAAGAGGCTGGAGTGGGTCGCATACATTAGTAGTGGT......GGTGATTACATCTACTATGCAGACACTGTGAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAGGAACACCCTGTACCTGCAAATGAGCAGTCTGAAGTCTGAGGACACAGCCATGTATTACTGTACAAGANNNTCTACTANNNNNNNNNACTTTGACTACTGGGGCCAAGGCACCACTCTCACAGTCTCCTCAG,TGTACAAGAGATTCTACTACCCTAGCACACTTTGACTACTGG,⋯,17,IGHV5-9-1,,IGHJ2,TGTACAAGAGATTCTACTACCCTAGCACACTTTGACTACTGG,CTRDSTTLAHFDYW,GACGTGAAGCTGGTGGAGTCTGGGGAA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCAGAGAAGAGGCTGGAGTGGGTCGCATACATTAGTAGTGGT......GGTGATTACATCTACTATGCAGACACTGTGAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAGGAACACCCTGTACCTGCAAATGAGCAGTCTGAAGTCTGAGGACACAGCCATGTATTACTGTACAAGANNNNNNNNNNNNNN
NNNNNACTTTGACTACTGGGGCCAAGGCACCACTCTCACAGTCTCCTCAG,IGHV5-9-1*02,IGHD2-1*01,IGHJ2*01
4,TTCTACAAGAATTCCC-1_contig_1,GACTAGTGTGCAGATATGGACAGGCTTACTTCCTCATTCCTGCTGCTGATTGTCCCTGCATATGTCCTGTCCCAGGTTACTCTGAAAGAGTCTGGCCCTGGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGCACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGATGATGACAAGCGCTATAACCCATCCCTGAAGAGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAAGGGTTCTACCCTTTCTGGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,IGHV8-12*01,"IGHD2-1*01,IGHD2-13*01,IGHD2-2*01",IGHJ4*01,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAAGGGTTCTACCCTTTCTGGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAAGNNNTCTACNNNNNNNNGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,TGTGCTCGAAGGGTTCTACCCTTTCTGGCTATGGACTACTGG,⋯,7,IGHV8-12,,IGHJ4,TGTGCTCGAAGGGTTCTACCCTTTCTGGCTATGGACTACTGG,CARRVLPFLAMDYW,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAAGNNNNNNNNNNNNNNNNGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTC
CTCAG,IGHV8-12*01,IGHD2-1*01,IGHJ4*01
5,AGCGTATAGAAACCGC-1_contig_1,AGTGTGCAGATATGGACAGGCTTACTTCCTCATTCCTGCTGCTGATTGTCCCTGCATATGTCCTGTCCCAGGTTACTCTGAAAGAGTCTGGCCCTGGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGCACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGATGATGACAAGCGCTATAACCCATCCCTGAAGAGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAGGGTTTACCGGGGGCTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,IGHV8-12*01,,IGHJ4*01,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGAGGGTTTACCGGGGGCTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGANNNNNNNNNNNNNNCTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,TGTGCTCGAGGGTTTACCGGGGGCTATGCTATGGACTACTGG,⋯,3,IGHV8-12,,IGHJ4,TGTGCTCGAGGGTTTACCGGGGGCTATGCTATGGACTACTGG,CARGFTGGYAMDYW,CAGGTTACTCTGAAAGAGTCTGGCCCT...GGGATATTGCAGTCCTCCCAGACCCTCAGTCTGACTTGTTCTTTCTCTGGGTTTTCACTGAGC......ACTTCTGGTATGGGTGTGAGCTGGATTCGTCAGCCTTCAGGAAAGGGTCTGGAGTGGCTGGCACACATTTACTGGGAT.........GATGACAAGCGCTATAACCCATCCCTGAAG...AGCCGGCTCACAATCTCCAAGGATACCTCCAGAAACCAGGTATTCCTCAAGATCACCAGTGTGGACACTGCAGATACTGCCACATACTACTGTGCTCGANNNNNNNNNNNNNNCTATGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,IGHV8-12*01,,IGHJ4*01
6,TTATGCTTCAATAAGG-1_contig_2,TGACAGAGGAGGCCGGTCCTGGATTCGAGTTCCTCACATTCAGTGATGAGCACTGAACACGGACCCCTCACCATGAACTTCGGGCTCAGCTTGATTTTCCTTGTCCTTGTTTTAAAAGGTGTCCAGTGTGAAGTGCAGCTGGTGGAGTCTGGGGGAGGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTCAGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCGGAAAAGAGGCTGGAGTGGGTCGCAACCATTAGTGATGGTGGTAGTTACACCTACTATCCAGACAATGTAAAGGGCCGATTCACCATCTCCAGAGACAATGCCAAGAACAACCTGTACCTGCAAATGAGCCATCTGAAGTCTGAGGACACAGCCATGTATTACTGTGCAAGAGAAAGTGGTCGGGAGAGTGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAGAGAGTCAGTCCTTCCCAAATGTCTTCCCCCTCGTCTCCTGCGAGAGCCCCCTGTCTGATAAGAATCTGGTGGCCATGGGCTGCCTGGCCCGGGACTTCCTGCCCAGCACCATTTCCTTCACCTGGAACTACCAGAACAACACTGAAGTCATCCAGGGTATCAGAACCTTCCCAACACTGAGGACAGGGGGCAAGTACCTAGCCACCTCGCA,F,T,IGHV5-4*01,IGHD1-3*01,IGHJ4*01,GAAGTGCAGCTGGTGGAGTCTGGGGGA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCGGAAAAGAGGCTGGAGTGGGTCGCAACCATTAGTGATGGT......GGTAGTTACACCTACTATCCAGACAATGTAAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAAGAACAACCTGTACCTGCAAATGAGCCATCTGAAGTCTGAGGACACAGCCATGTATTACTGTGCAAGAGAAAGTGGTCGGGAGAGTGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,GAAGTGCAGCTGGTGGAGTCTGGGGGA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCGGAAAAGAGGCTGGAGTGGGTCGCAACCATTAGTGATGGT......GGTAGTTACACCTACTATCCAGACAATGTAAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAAGAACAACCTGTACCTGCAAATGAGCCATCTGAAGTCTGAGGACACAGCCATGTATTACTGTGCAAGAGAAAGTGGTNNNNNNNNTGCTATGGACTACTGGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,TGTGCAAGAGAAAGTGGTCGGGAGAGTGCTATGGACTACTGG,⋯,38,IGHV5-4,,IGHJ4,TGTGCAAGAGAAAGTGGTCGGGAGAGTGCTATGGACTACTGG,CARESGRESAMDYW,GAAGTGCAGCTGGTGGAGTCTGGGGGA...GGCTTAGTGAAGCCTGGAGGGTCCCTGAAACTCTCCTGTGCAGCCTCTGGATTCACTTTC............AGTAGCTATGCCATGTCTTGGGTTCGCCAGACTCCGGAAAAGAGGCTGGAGTGGGTCGCAACCATTAGTGATGGT......GGTAGTTACACCTACTATCCAGACAATGTAAAG...GGCCGATTCACCATCTCCAGAGACAATGCCAAGAACAACCTGTACCTGCAAATGAGCCATCTGAAGTCTGAGGACACAGCCATGTATTACTGTGCAAGAGANNNNNNNNNNNNNNNTGCTATGGACTACT
GGGGTCAAGGAACCTCAGTCACCGTCTCCTCAG,IGHV5-4*01,IGHD1-3*01,IGHJ4*01


In [15]:
# Detect novel alleles
novel <- findNovelAlleles(db, germline_db, nproc = 8)

ERROR: Error in findNovelAlleles(db, germline_db, nproc = 8): Not enough sample sequences were assigned to any germline:
  (1) germline_min is too large or
  (2) sequences names don't match germlines.
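
The error message names two possible causes, and both can be checked before rerunning `findNovelAlleles()`. The sketch below uses base R only and runs on small made-up stand-ins (`db_toy`, `germline_toy`) rather than the real `db` and `germline_db`. The helper `summarize_v_calls()` is hypothetical and not part of tigger. It counts sequences per V allele call, reports whether each allele name appears among the germline names, and reports whether the count reaches a `germline_min` threshold. Dropping ambiguous (comma-separated) calls is a simplifying assumption of this sketch and may differ from how tigger counts them.

```r
# Toy stand-ins for the notebook's objects (hypothetical values, not real data).
db_toy <- data.frame(
  v_call = c("IGHV1-72*01", "IGHV1-62-2*01,IGHV1-71*01", "IGHV8-12*01",
             "IGHV8-12*01", "IGHV5-4*01"),
  stringsAsFactors = FALSE
)
germline_toy <- c("IGHV1-72*01" = "CAGGTC",
                  "IGHV8-12*01" = "CAGGTT",
                  "IGHV5-4*01"  = "GAAGTG")

# Hypothetical helper: tally sequences per unambiguous V allele call and flag
# (a) whether the allele name exists in the germline set and
# (b) whether the tally reaches germline_min.
# The germline_min argument mirrors the findNovelAlleles() argument of the same name.
summarize_v_calls <- function(v_call, germline_names, germline_min = 200) {
  # Assumption of this sketch: ignore calls that list several alleles
  single <- v_call[!grepl(",", v_call, fixed = TRUE)]
  counts <- table(single)
  data.frame(
    allele      = as.character(names(counts)),
    n           = as.integer(counts),
    in_germline = as.character(names(counts)) %in% germline_names,
    enough      = as.integer(counts) >= germline_min,
    stringsAsFactors = FALSE
  )
}

# A tiny threshold is used here only because the toy table has five rows.
checks <- summarize_v_calls(db_toy$v_call, names(germline_toy), germline_min = 2)
print(checks)
```

On the real data the equivalent call would be `summarize_v_calls(db$v_call, names(germline_db))`. If `in_germline` is mostly `FALSE`, the germline FASTA names are the likelier culprit (cause 2). If the names match but no allele reaches the threshold, which is plausible for a small 10x dataset given the documented default of `germline_min = 200`, then passing a lower `germline_min` to `findNovelAlleles()` is one option to try (cause 1).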
