In [1]:
import numpy as np
from IPython.display import HTML
from bokeh.plotting import output_notebook, show
import genomes_dnj.lct_interval.series_plots as dm
import genomes_dnj.lct_interval.chrom_plots as cp
output_notebook(hide_banner=True)

<h3>Dense Series</h3>
<div style="width:700px">
<p>
The studies of SNP series statistics showed large variations in the number of active
series at different positions of the autosomal chromosomes, the number of SNPs in those
series and lengths of the series in DNA bases.  These studies also showed that chromosomes
could be partitioned into regions between positions where the number of active series
went to zero.  The studies in these notebooks focus on one of these regions of chromosome 2
where the gene lct is located.
<p>
For chromosome 2, the mean number of SNPs in a series is 10 and the mean
length of a series is 55,000 DNA bases.  The lactase persistence region of chromosome
2, however, includes many series with SNP counts and lengths that far exceed these means
and that are expressed by large numbers of the 1000 genomes chromosome 2 samples.
Four of the most exceptional series are 117_1685, 123_1561, 62_1265, and 193_843.
Those series include a total of 495 SNPs.  Each of them covers a large part of the first
600,000 bases of the studied region of chromosome 2.  That DNA segment includes the
genes rab3gap1, zranb3 and the lower half of r3hdm1.  Over 2,000 of the 5,008 
1000 genomes chromosome 2 samples express at least one of these four series.
All four of them are expressed by 842 of those samples.
<p>
There is no obvious process for generating series that include these large numbers
of SNPs simply from some event that selects for the functional role of a single SNP.
The problem posed by the varying combinations of the series expressed by
different chromosome 2 samples is even more difficult.  The series do, however, tend to
be part of hierarchies in which one series is specific to the hierarchy and appears
to have played a role in the process that selected that hierarchy for
overexpression.  For example, the series 193_843 is the selector series for the
hierarchies that are associated with the SAS tree.
<p>
The series 28_434 is an example of a more complex selector series.
It acts as a selector series for multiple hierarchies with varying combinations
of the series 117_1685 and 123_1561.  Those hierarchies have been generated by
processes that include fragmentation of series, recombination events, and appearance
of new series that selected some 28_434 subtree for overexpression.
<p>
This notebook provides an introduction to these four series and some of the
events that appear to have caused them to be so widely expressed.
</div>
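
<div style="width:700px">
<p>
The SNP total quoted above can be checked from the series names alone.  The sketch
below assumes the naming convention that the notebook appears to use, where the first
number in a name is the SNP count and the second is the number of samples that express
the series.  The helper <code>parse_series_id</code> is hypothetical and is not part of
genomes_dnj.
</div>

```python
def parse_series_id(series_id):
    """Split an id such as '193_843' into (snp_count, sample_count)."""
    snps, samples = series_id.split("_")
    return int(snps), int(samples)

# The four dense series named in the introduction.
dense_series = ["117_1685", "123_1561", "62_1265", "193_843"]

# Sum the SNP counts encoded in the names.
total_snps = sum(parse_series_id(s)[0] for s in dense_series)
print(total_snps)  # 495
```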

<h3>Series 193_843</h3>
<div style="width:700px">
<p>
The 843 chromosome 2 samples that express the 193 SNPs in the 628,000 DNA base series
193_843 also all express the overlapping series 123_1561 and 62_1265.  All
but one of them express 117_1685.  The series 193_843 is a major part of the
root of the SAS tree, but no samples express just the root of that tree.
The large number of samples expressing 193_843 appears to be the result
of processes during the human expansion from Africa that generated a hierarchy of
overexpressed new series.  The process that generated 193_843 itself, however, has not
left any obvious trace.
</div>

In [2]:
plt_obj = dm.superset_yes_no([dm.di_193_843], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [3]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353244,135758231,618284,117,1685,0.5,842,1.0,79,0.47,26,0.49,143,1.23,167,0.99,83,0.49,116,1.9,228,2.21
353478,135933921,434642,123,1561,0.54,843,1.0,79,0.47,26,0.49,143,1.22,167,0.98,83,0.49,117,1.91,228,2.21
353814,136406646,31432,5,1460,0.5,731,0.87,74,0.5,24,0.52,138,1.36,139,0.94,81,0.55,88,1.66,187,2.09
353269,135766890,509095,62,1265,0.67,843,1.0,79,0.47,26,0.49,143,1.22,167,0.98,83,0.49,117,1.91,228,2.21
353790,136393157,48253,10,1218,0.62,750,0.89,76,0.5,26,0.55,138,1.33,141,0.93,81,0.54,92,1.69,196,2.13
353906,136496493,57432,9,1170,0.51,592,0.7,45,0.38,16,0.43,126,1.54,138,1.16,49,0.41,68,1.58,150,2.07
353849,136447707,42559,4,1149,0.62,711,0.84,50,0.35,20,0.45,134,1.36,144,1.01,78,0.55,90,1.74,195,2.24
353907,136496805,55824,9,1023,0.58,595,0.71,45,0.38,15,0.4,128,1.55,140,1.17,49,0.41,68,1.57,150,2.06
353984,136556805,190280,39,1014,0.51,518,0.61,31,0.3,15,0.46,121,1.69,99,0.95,45,0.43,72,1.91,135,2.13
353935,136511874,21321,5,976,0.48,468,0.56,44,0.47,13,0.44,77,1.19,129,1.37,18,0.19,61,1.79,126,2.2
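
<div style="width:700px">
<p>
The columns <code>alleles.1</code> and <code>matches.1</code> are not described in the
notebook.  The values in the table above are consistent with <code>alleles.1</code> being
<code>matches / alleles</code> and <code>matches.1</code> being <code>matches</code>
divided by the number of selected samples (843 here).  That reading is an inference from
the numbers, not a documented definition, and <code>ratio_columns</code> is a hypothetical
helper used only to recompute a few rows.
</div>

```python
def ratio_columns(alleles, matches, selected):
    """Return (matches / alleles, matches / selected), rounded to two places."""
    return round(matches / alleles, 2), round(matches / selected, 2)

selected = 843  # samples that express 193_843

# (alleles, matches) copied from three rows of the table above.
rows = {
    "117_1685": (1685, 842),
    "5_1460": (1460, 731),
    "10_1218": (1218, 750),
}

for name, (alleles, matches) in rows.items():
    print(name, ratio_columns(alleles, matches, selected))
```

<div style="width:700px">
<p>
Under the same assumption, <code>min_match=0.5</code> would keep only rows whose
<code>matches.1</code> value is at least 0.5, which agrees with every table in this notebook.
</div>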


<h3>Series 62_1265</h3>
<div style="width:700px">
<p>
The series 62_1265, the 62 SNP series expressed by 1265 samples, owes its identity as a
series independent of 193_843 primarily to chromosome 2 samples that express the series 67_329.
The 371 samples that express 62_1265, 123_1561, and 117_1685 without 193_843 include 309
that express 67_329.
</div>
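
<div style="width:700px">
<p>
Selections such as "express 62_1265, 123_1561, and 117_1685 without 193_843" can be read
as set operations on the samples that express each series.  The sketch below illustrates
only that logic with made-up sample ids.  The function <code>select_yes_no</code> is
hypothetical; it is not the implementation of <code>dm.superset_yes_no</code> and it
ignores the <code>min_match</code> argument.
</div>

```python
def select_yes_no(expressed_by, yes, no=()):
    """Return samples that express every series in `yes` and none in `no`."""
    # Start from the samples common to all required series.
    selected = set.intersection(*(expressed_by[s] for s in yes))
    # Remove any sample that expresses an excluded series.
    for s in no:
        selected -= expressed_by[s]
    return selected

# Toy data: the sample ids are invented for illustration only.
expressed_by = {
    "62_1265": {1, 2, 3, 4, 5},
    "117_1685": {1, 2, 3, 4, 6},
    "123_1561": {1, 2, 3, 7},
    "193_843": {1, 8},
}

picked = select_yes_no(
    expressed_by, ["62_1265", "117_1685", "123_1561"], ["193_843"]
)
print(sorted(picked))  # [2, 3]
```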

In [4]:
plt_obj = dm.superset_yes_no([dm.di_62_1265, dm.di_117_1685, dm.di_123_1561], [dm.di_193_843], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [5]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353921,136501840,53819,10,2206,0.13,279,0.75,150,2.41,44,2.28,19,0.51,41,0.73,20,0.32,4,0.24,1,0.04
353244,135758231,618284,117,1685,0.22,371,1.0,214,2.59,62,2.42,24,0.49,44,0.59,21,0.26,4,0.18,2,0.06
353478,135933921,434642,123,1561,0.24,371,1.0,214,2.59,62,2.42,24,0.49,44,0.59,21,0.26,4,0.18,2,0.06
353814,136406646,31432,5,1460,0.16,235,0.63,114,2.17,35,2.15,16,0.51,44,0.93,21,0.4,3,0.22,2,0.09
353269,135766890,509095,62,1265,0.29,371,1.0,214,2.59,62,2.42,24,0.49,44,0.59,21,0.26,4,0.18,2,0.06
353729,136309239,52321,9,887,0.42,369,0.99,213,2.59,62,2.43,24,0.49,44,0.59,21,0.26,4,0.18,1,0.03
353312,135784351,117869,8,718,0.43,312,0.84,172,2.47,50,2.32,19,0.46,44,0.7,21,0.3,4,0.22,2,0.07
353851,136448855,17976,4,612,0.36,218,0.59,105,2.16,33,2.19,16,0.55,40,0.91,20,0.41,3,0.23,1,0.05
353349,135810535,87488,9,545,0.68,369,0.99,213,2.59,62,2.43,24,0.49,44,0.59,21,0.26,4,0.18,1,0.03
353904,136495815,46924,9,378,0.55,208,0.56,100,2.16,29,2.02,14,0.51,41,0.98,20,0.43,3,0.24,1,0.05


<div style="width:700px">
<p>
The overexpression of the 209 SNP series 209_56, expressed by 56 chromosome 2 samples,
appears to have resulted from an independent selection process for chromosome 2
samples that express 62_1265, 123_1561, and 117_1685 but not 193_843.
This process appears to have started with a more extended series that included
209_56 and 14_48.  The full series of selected SNPs also included 117_1685,
123_1561, 62_1265, 26_1414 and 10_2206.  All of these series have participated
in other independent selection processes that formed different hierarchies.
</div>

In [6]:
plt_obj = dm.superset_yes_no([dm.di_209_56], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [7]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353921,136501840,53819,10,2206,0.02,51,0.91,36,3.51,10,3.13,5,0.71,0,0.0,0,0.0,0,0.0,0,0.0
353244,135758231,618284,117,1685,0.03,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
353478,135933921,434642,123,1561,0.04,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
354130,136653925,107928,24,1504,0.02,37,0.66,26,3.49,6,2.59,5,0.98,0,0.0,0,0.0,0,0.0,0,0.0
353797,136398174,75924,26,1414,0.04,51,0.91,37,3.6,10,3.13,4,0.57,0,0.0,0,0.0,0,0.0,0,0.0
353269,135766890,509095,62,1265,0.04,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.06,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
353764,136364916,22977,5,588,0.1,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
353349,135810535,87488,9,545,0.1,56,1.0,40,3.55,11,3.13,5,0.64,0,0.0,0,0.0,0,0.0,0,0.0
354189,136704466,27748,5,212,0.17,36,0.64,26,3.59,6,2.66,4,0.8,0,0.0,0,0.0,0,0.0,0,0.0


<div style="width:700px">
<p>
There are 43 samples that express 62_1265 and 117_1685 but not 123_1561 or 193_843.
Of those samples, 39 are the result of the appearance of the series 9_39.
The hierarchy selected by this series is rooted in a genetic event that recombined the series
8_267, which is part of the SAS tree, with the series 32_1361 and 81_857, which are part of
the EAS tree root.  Fragments of the series 193_843 and 123_1561
are still visible in the chromosome samples that express series that derive from this event.
</div>

In [8]:
plt_obj = dm.superset_yes_no([dm.di_117_1685, dm.di_62_1265], [dm.di_123_1561], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [9]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
354033,136588031,5647,7,1760,0.02,35,0.81,2,0.42,0,0.0,9,1.71,0,0.0,21,2.3,0,0.0,3,0.78
353244,135758231,618284,117,1685,0.03,43,1.0,2,0.35,0,0.0,13,2.01,0,0.0,25,2.22,0,0.0,3,0.64
354130,136653925,107928,24,1504,0.02,32,0.74,2,0.46,0,0.0,9,1.87,0,0.0,18,2.15,0,0.0,3,0.85
353925,136506375,32564,4,1442,0.02,35,0.81,2,0.42,0,0.0,9,1.71,0,0.0,21,2.3,0,0.0,3,0.78
353791,136393658,92684,32,1361,0.03,43,1.0,2,0.35,0,0.0,13,2.01,0,0.0,25,2.22,0,0.0,3,0.64
353958,136535876,19014,7,1303,0.03,35,0.81,2,0.42,0,0.0,9,1.71,0,0.0,21,2.3,0,0.0,3,0.78
354129,136652953,108222,5,1296,0.02,32,0.74,2,0.46,0,0.0,9,1.87,0,0.0,18,2.15,0,0.0,3,0.85
353269,135766890,509095,62,1265,0.03,43,1.0,2,0.35,0,0.0,13,2.01,0,0.0,25,2.22,0,0.0,3,0.64
353919,136500475,42085,13,1227,0.03,35,0.81,2,0.42,0,0.0,9,1.71,0,0.0,21,2.3,0,0.0,3,0.78
354127,136652491,80281,6,1114,0.03,32,0.74,2,0.46,0,0.0,9,1.87,0,0.0,18,2.15,0,0.0,3,0.85


<h3>Series 117_1685</h3>
<div style="width:700px">
The identity of 117_1685 as a distinct series is largely the result of the
210 chromosomes that express the 74 SNP series 74_210.  Those chromosomes
all express 117_1685 but not 123_1561, 62_1265, or 193_843.  There is also no
pattern of residual fragments of any of those series that might provide some
trace of a history of series SNP remodeling events.
</div>

In [10]:
plt_obj = dm.superset_yes_no([dm.di_117_1685, dm.di_74_210], [dm.di_28_434, dm.di_123_1561], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [11]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353244,135758231,618284,117,1685,0.12,210,1.0,61,2.66,12,1.67,29,0.89,3,0.06,66,1.15,14,0.92,25,1.04
353906,136496493,57432,9,1170,0.16,188,0.9,48,2.34,10,1.55,28,0.96,3,0.07,64,1.25,13,0.96,22,1.02
353907,136496805,55824,9,1023,0.19,190,0.9,47,2.27,10,1.54,28,0.95,3,0.07,65,1.25,14,1.02,23,1.06
353984,136556805,190280,39,1014,0.16,167,0.8,30,1.65,9,1.57,28,1.08,3,0.07,63,1.38,13,1.08,21,1.1
353935,136511874,21321,5,976,0.13,131,0.62,48,3.36,10,2.23,21,1.03,3,0.1,30,0.84,7,0.74,12,0.8
353938,136514709,28438,6,820,0.16,130,0.62,47,3.31,10,2.25,21,1.04,3,0.1,30,0.84,7,0.75,12,0.81
353312,135784351,117869,8,718,0.29,210,1.0,61,2.66,12,1.67,29,0.89,3,0.06,66,1.15,14,0.92,25,1.04
353249,135759145,604162,74,210,1.0,210,1.0,61,2.66,12,1.67,29,0.89,3,0.06,66,1.15,14,0.92,25,1.04
353325,135790329,493075,18,131,0.99,130,0.62,0,0.0,2,0.45,26,1.28,3,0.1,60,1.69,14,1.49,25,1.68


<div style="width:700px">
<p>
The series 290_16 is another case of chromosomes expressing only 117_1685.  Only
16 chromosomes express this series, but its 290 SNPs are a particularly large
number for a single series.
</div>

In [12]:
plt_obj = dm.superset_yes_no([dm.di_290_16], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [13]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353244,135758231,618284,117,1685,0.01,16,1.0,14,4.35,1,1.0,1,0.45,0,0.0,0,0.0,0,0.0,0,0.0
353906,136496493,57432,9,1170,0.01,10,0.62,9,4.47,0,0.0,1,0.72,0,0.0,0,0.0,0,0.0,0,0.0
353907,136496805,55824,9,1023,0.01,10,0.62,9,4.47,0,0.0,1,0.72,0,0.0,0,0.0,0,0.0,0,0.0
353984,136556805,190280,39,1014,0.01,10,0.62,9,4.47,0,0.0,1,0.72,0,0.0,0,0.0,0,0.0,0,0.0
353935,136511874,21321,5,976,0.01,10,0.62,9,4.47,0,0.0,1,0.72,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.02,16,1.0,14,4.35,1,1.0,1,0.45,0,0.0,0,0.0,0,0.0,0,0.0
353938,136514709,28438,6,820,0.01,10,0.62,9,4.47,0,0.0,1,0.72,0,0.0,0,0.0,0,0.0,0,0.0
353312,135784351,117869,8,718,0.02,16,1.0,14,4.35,1,1.0,1,0.45,0,0.0,0,0.0,0,0.0,0,0.0
353764,136364916,22977,5,588,0.03,16,1.0,14,4.35,1,1.0,1,0.45,0,0.0,0,0.0,0,0.0,0,0.0
353349,135810535,87488,9,545,0.03,16,1.0,14,4.35,1,1.0,1,0.45,0,0.0,0,0.0,0,0.0,0,0.0


<h3>Series 28_434</h3>
<div style="width:700px">
The 434 chromosome 2 samples that express the series 28_434 provide some insight into the history
that generated the four different hierarchies rooted in that series.  Those hierarchies include
varying combinations of 117_1685 and 123_1561.  That history included extensive remodeling
of series 117_1685 and 123_1561 SNPs, two recombination events, and four significant selection processes.
Each of those processes has selected an independent subtree of 28_434 for overexpression.
This plot shows the results of one of those selection processes.  It generated the hierarchy that
includes the 25 SNP series 25_27, which is associated with a combination of the series 28_434,
123_1561, and 117_1685.
</div>

In [14]:
plt_obj = dm.superset_yes_no([dm.di_123_1561, dm.di_117_1685], [dm.di_62_1265], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [15]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
354170,136682274,93624,7,1868,0.01,17,0.55,10,3.1,6,5.85,1,0.43,0,0.0,0,0.0,0,0.0,0,0.0
354033,136588031,5647,7,1760,0.01,20,0.65,11,2.9,8,6.62,1,0.36,0,0.0,0,0.0,0,0.0,0,0.0
353244,135758231,618284,117,1685,0.02,31,1.0,21,3.57,8,4.27,2,0.47,0,0.0,0,0.0,0,0.0,0,0.0
353478,135933921,434642,123,1561,0.02,31,1.0,21,3.57,8,4.27,2,0.47,0,0.0,0,0.0,0,0.0,0,0.0
353814,136406646,31432,5,1460,0.02,30,0.97,20,3.51,8,4.42,2,0.49,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.03,30,0.97,20,3.51,8,4.42,2,0.49,0,0.0,0,0.0,0,0.0,0,0.0
353851,136448855,17976,4,612,0.05,30,0.97,20,3.51,8,4.42,2,0.49,0,0.0,0,0.0,0,0.0,0,0.0
353764,136364916,22977,5,588,0.05,29,0.94,19,3.45,8,4.57,2,0.5,0,0.0,0,0.0,0,0.0,0,0.0
354061,136603638,28487,16,511,0.03,16,0.52,9,2.96,6,6.21,1,0.46,0,0.0,0,0.0,0,0.0,0,0.0
353614,136115507,269773,28,434,0.07,29,0.94,19,3.45,8,4.57,2,0.5,0,0.0,0,0.0,0,0.0,0,0.0


<div style="width:700px">
The next plot shows another hierarchy expressed by a subset of the 28_434 samples.
The selection process for this hierarchy worked with the 180 SNP series 180_251.  The
251 samples that express 180_251 and 28_434 also express 123_1561 and 13_1696 but not
117_1685.  These samples make up almost all of the samples that express 28_434 but not 117_1685,
and a large part of the samples that express 123_1561 but not 117_1685, 62_1265, or 193_843.
</div>

In [16]:
plt_obj = dm.superset_yes_no([dm.di_123_1561, dm.di_28_434], [dm.di_117_1685], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [17]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
354033,136588031,5647,7,1760,0.11,186,0.73,145,5.17,37,4.13,4,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353240,135757320,20184,13,1696,0.15,251,0.98,191,5.05,56,4.63,4,0.11,0,0.0,0,0.0,0,0.0,0,0.0
353478,135933921,434642,123,1561,0.16,256,1.0,195,5.05,56,4.54,5,0.14,0,0.0,0,0.0,0,0.0,0,0.0
353790,136393157,48253,10,1218,0.2,244,0.95,185,5.03,55,4.68,4,0.11,0,0.0,0,0.0,0,0.0,0,0.0
353906,136496493,57432,9,1170,0.12,138,0.54,100,4.81,36,5.42,2,0.1,0,0.0,0,0.0,0,0.0,0,0.0
353935,136511874,21321,5,976,0.14,138,0.54,100,4.81,36,5.42,2,0.1,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.29,256,1.0,195,5.05,56,4.54,5,0.14,0,0.0,0,0.0,0,0.0,0,0.0
353851,136448855,17976,4,612,0.37,225,0.88,168,4.95,52,4.8,5,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353764,136364916,22977,5,588,0.43,255,1.0,194,5.05,56,4.56,5,0.14,0,0.0,0,0.0,0,0.0,0,0.0
353614,136115507,269773,28,434,0.59,256,1.0,195,5.05,56,4.54,5,0.14,0,0.0,0,0.0,0,0.0,0,0.0


<div style="width:700px">
<p>
The next plot shows another hierarchy expressed by 73 of the 28_434 samples.
This hierarchy is formed by samples that express 28_434 and 117_1685 but not
123_1561 or 13_1696.  It is expressed by 67 of the 6_68 samples
and all 40 of the 14_40 samples.
</div>

In [18]:
plt_obj = dm.superset_yes_no([dm.di_117_1685, dm.di_28_434], [dm.di_13_1696, dm.di_123_1561], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [19]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
354033,136588031,5647,7,1760,0.03,44,0.6,33,8.04,10,6.08,1,0.13,0,0.0,0,0.0,0,0.0,0,0.0
353244,135758231,618284,117,1685,0.04,73,1.0,56,8.23,15,5.49,2,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353814,136406646,31432,5,1460,0.04,53,0.73,38,7.69,14,7.06,1,0.11,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.08,73,1.0,56,8.23,15,5.49,2,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353312,135784351,117869,8,718,0.1,71,0.97,54,8.16,15,5.65,2,0.16,0,0.0,0,0.0,0,0.0,0,0.0
353851,136448855,17976,4,612,0.08,49,0.67,38,8.32,10,5.46,1,0.11,0,0.0,0,0.0,0,0.0,0,0.0
353764,136364916,22977,5,588,0.12,73,1.0,56,8.23,15,5.49,2,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353349,135810535,87488,9,545,0.12,68,0.93,53,8.36,14,5.51,1,0.08,0,0.0,0,0.0,0,0.0,0,0.0
353614,136115507,269773,28,434,0.17,73,1.0,56,8.23,15,5.49,2,0.15,0,0.0,0,0.0,0,0.0,0,0.0
353804,136402778,90652,21,332,0.13,44,0.6,33,8.04,10,6.08,1,0.13,0,0.0,0,0.0,0,0.0,0,0.0


<div style="width:700px">
<p>
This plot shows the hierarchy rooted in the 28_434, 117_1685, and 13_1696 association.  This hierarchy
appears to have resulted from the appearance of the series 22_73.
</div>

In [20]:
plt_obj = dm.superset_yes_no([dm.di_117_1685, dm.di_28_434, dm.di_13_1696], [dm.di_123_1561], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [21]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353240,135757320,20184,13,1696,0.04,74,1.0,47,4.72,26,7.86,1,0.09,0,0.0,0,0.0,0,0.0,0,0.0
353244,135758231,618284,117,1685,0.04,74,1.0,47,4.72,26,7.86,1,0.09,0,0.0,0,0.0,0,0.0,0,0.0
354130,136653925,107928,24,1504,0.04,64,0.86,41,4.76,22,7.69,1,0.1,0,0.0,0,0.0,0,0.0,0,0.0
353925,136506375,32564,4,1442,0.05,66,0.89,42,4.73,23,7.8,1,0.1,0,0.0,0,0.0,0,0.0,0,0.0
353958,136535876,19014,7,1303,0.05,66,0.89,42,4.73,23,7.8,1,0.1,0,0.0,0,0.0,0,0.0,0,0.0
354129,136652953,108222,5,1296,0.05,64,0.86,41,4.76,22,7.69,1,0.1,0,0.0,0,0.0,0,0.0,0,0.0
354127,136652491,80281,6,1114,0.06,64,0.86,41,4.76,22,7.69,1,0.1,0,0.0,0,0.0,0,0.0,0,0.0
353504,135964764,136368,6,946,0.07,62,0.84,37,4.43,24,8.66,1,0.11,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.08,74,1.0,47,4.72,26,7.86,1,0.09,0,0.0,0,0.0,0,0.0,0,0.0
353312,135784351,117869,8,718,0.1,74,1.0,47,4.72,26,7.86,1,0.09,0,0.0,0,0.0,0,0.0,0,0.0
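
<div style="width:700px">
<p>
As a rough consistency check, the four selections shown in this section can be added up.
The counts below are read from the tables above: the number of matches for 28_434 in the
first two tables, and the full selection size in the last two, where 28_434 is itself a
required series.  The selections are mutually exclusive because each one requires a series
that another excludes.  The first selection also excludes 62_1265, so this is a sketch of
the bookkeeping rather than a complete partition of the 28_434 samples.
</div>

```python
# Counts of 28_434 samples in each of the four selections in this section.
subtree_counts = {
    "123_1561 and 117_1685": 29,
    "123_1561 without 117_1685": 256,
    "117_1685 without 123_1561 or 13_1696": 73,
    "117_1685 and 13_1696 without 123_1561": 74,
}

total_28_434 = 434  # samples that express 28_434, from the series name
covered = sum(subtree_counts.values())
print(covered, total_28_434 - covered)  # 432 2
```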


<div style="width:700px">
The plot below shows the series hierarchy for the 35 samples that express the
22 SNP series 22_35.  This hierarchy is the result of an independent process
that has selected the series 123_1561 for overexpression without 117_1685,
62_1265, or 193_843.
</div>

In [22]:
plt_obj = dm.superset_yes_no([dm.di_22_35], min_match=0.5)
plt = plt_obj.do_plot()
show(plt)

In [23]:
HTML(plt_obj.get_html())

index,first,length,snps,alleles,alleles.1,matches,matches.1,afr,afr.1,afx,afx.1,amr,amr.1,eas,eas.1,eur,eur.1,sas,sas.1,sax,sax.1
353240,135757320,20184,13,1696,0.02,35,1.0,21,2.98,4,1.82,1,0.21,0,0.0,0,0.0,5,1.97,4,0.93
353478,135933921,434642,123,1561,0.02,35,1.0,21,2.98,4,1.82,1,0.21,0,0.0,0,0.0,5,1.97,4,0.93
353814,136406646,31432,5,1460,0.02,26,0.74,21,4.01,4,2.45,1,0.28,0,0.0,0,0.0,0,0.0,0,0.0
353729,136309239,52321,9,887,0.04,35,1.0,21,2.98,4,1.82,1,0.21,0,0.0,0,0.0,5,1.97,4,0.93
353851,136448855,17976,4,612,0.04,26,0.74,21,4.01,4,2.45,1,0.28,0,0.0,0,0.0,0,0.0,0,0.0
353483,135939427,422689,22,35,1.0,35,1.0,21,2.98,4,1.82,1,0.21,0,0.0,0,0.0,5,1.97,4,0.93
353326,135791020,702063,219,26,1.0,26,0.74,21,4.01,4,2.45,1,0.28,0,0.0,0,0.0,0,0.0,0,0.0
