In [38]:
import numpy as np
from itertools import combinations
from scipy.special import comb
from collections import defaultdict, OrderedDict
import importlib
import better_pool_generator

PoolGenerator helps find a good pooled-testing arrangement for a given number of samples and pools, so that you can run as few tests as possible and still establish which individuals are positive.

Let's start with a simple scenario in which we have only 15 samples and 6 pools.  

In [39]:
importlib.reload(better_pool_generator)
PG = better_pool_generator.PoolGenerator(num_samples=15, num_pools=6)
pools = PG.get_pools()
for p in pools:
    print(p, pools[p])

2 [0, 3, 8, 11, 12]
3 [0, 4, 7, 9, 13]
0 [1, 4, 6, 11, 14]
5 [1, 5, 8, 10, 13]
1 [2, 3, 7, 10, 14]
4 [2, 5, 6, 9, 12]


The output lists each of the 6 pools (indexed 0 through 5) with the sample IDs it should contain. Pool 0, for example, should receive a portion of each of these samples:

In [40]:
pools[0]

[1, 4, 6, 11, 14]
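It can help to look at the same arrangement from the sample's point of view. The sketch below is not part of `better_pool_generator`; it copies the pools printed above into a plain dict (the names `pools_example` and `invert_pools` are made up for illustration) and inverts the mapping so that each sample ID points to the pools it belongs to.

```python
from collections import defaultdict

# Pools copied from the printed output above (pool ID -> sample IDs).
pools_example = {
    2: [0, 3, 8, 11, 12],
    3: [0, 4, 7, 9, 13],
    0: [1, 4, 6, 11, 14],
    5: [1, 5, 8, 10, 13],
    1: [2, 3, 7, 10, 14],
    4: [2, 5, 6, 9, 12],
}


def invert_pools(pools):
    """Map each sample ID to the sorted list of pool IDs that contain it."""
    sample_to_pools = defaultdict(list)
    for pool_id in sorted(pools):
        for sample_id in pools[pool_id]:
            sample_to_pools[sample_id].append(pool_id)
    return dict(sample_to_pools)


sample_to_pools = invert_pools(pools_example)
for sample_id in sorted(sample_to_pools):
    print(sample_id, sample_to_pools[sample_id])  # e.g. sample 0 -> [2, 3]

# Every sample gets its own pair of pools, so no two samples look alike.
signatures = {tuple(p) for p in sample_to_pools.values()}
print(len(signatures), "distinct pool pairs for", len(sample_to_pools), "samples")
```

In this small case all 15 samples end up with different pairs of pools, which is what makes it possible to tell them apart from the pool results alone.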

Notice that each sample is in 2 pools. We could also have found that out directly:

In [41]:
PG.show_stats()

We placed each sample in 2 pools.

With this arrangement, we might need 1 follow up tests for each pool that is positive

The pool sizes range from 5 to 5, with an average of 5.0


This also shows that all pools are exactly the same size. That won't usually happen, but the pools should stay close to the same size even when the numbers don't divide evenly.
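To see how such an arrangement would be used once the pool results come back, here is a minimal decoding sketch. It is an illustration only, not a method of `PoolGenerator`: the function `candidate_positives` is hypothetical, and it assumes ideal tests (a pool is positive exactly when it contains at least one positive sample). A sample remains a candidate only if every pool it was placed in tested positive.

```python
# Pools copied from the 15-sample, 6-pool output above (pool ID -> sample IDs).
pools_example = {
    2: [0, 3, 8, 11, 12],
    3: [0, 4, 7, 9, 13],
    0: [1, 4, 6, 11, 14],
    5: [1, 5, 8, 10, 13],
    1: [2, 3, 7, 10, 14],
    4: [2, 5, 6, 9, 12],
}


def candidate_positives(pools, positive_pools):
    """Return the samples whose pools ALL tested positive.

    Assumes error-free tests: any sample sitting in a negative pool
    must itself be negative and is ruled out.
    """
    positive_pools = set(positive_pools)
    membership = {}
    for pool_id, samples in pools.items():
        for sample_id in samples:
            membership.setdefault(sample_id, set()).add(pool_id)
    return sorted(s for s, ps in membership.items() if ps <= positive_pools)


# One positive sample (sample 4 sits in pools 0 and 3): identified directly.
print(candidate_positives(pools_example, {0, 3}))        # [4]

# Two positive samples (4 and 12) light up pools 0, 2, 3 and 4,
# which leaves several candidates that would need follow-up tests.
print(candidate_positives(pools_example, {0, 2, 3, 4}))  # [0, 4, 6, 9, 11, 12]
```

The second call shows the limit of the approach: once several samples are positive, the pool results narrow the field but no longer pin down the individuals.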

# Almost the real deal

Let's do a less trivial example with 192 samples and 12 pools.

In [42]:
PG = better_pool_generator.PoolGenerator(num_samples=192, num_pools=12)
pools = PG.get_pools()
for p in pools:
    print(p, pools[p])

3 [0, 6, 10, 12, 18, 20, 27, 30, 34, 36, 43, 44, 49, 52, 58, 59, 66, 71, 74, 79, 80, 83, 84, 92, 97, 102, 105, 109, 115, 116, 124, 130, 131, 133, 138, 144, 146, 148, 152, 155, 162, 167, 169, 172, 175, 181, 184, 187, 195, 199, 200, 204]
6 [0, 6, 11, 12, 16, 21, 24, 29, 32, 39, 40, 44, 50, 54, 60, 65, 67, 70, 75, 78, 82, 85, 91, 95, 99, 102, 109, 111, 113, 119, 121, 123, 127, 131, 136, 140, 142, 148, 154, 158, 167, 169, 170, 174, 178, 180, 185, 190, 198]
7 [0, 5, 9, 14, 17, 23, 24, 31, 34, 37, 40, 46, 49, 53, 57, 59, 61, 68, 74, 78, 83, 87, 90, 93, 96, 101, 106, 111, 112, 119, 122, 125, 128, 134, 137, 142, 144, 149, 155, 158, 161, 163, 171, 176, 179, 180, 183, 189, 191, 197, 200, 203]
0 [1, 4, 11, 14, 19, 22, 26, 31, 32, 39, 41, 45, 48, 55, 59, 62, 66, 69, 73, 76, 81, 86, 89, 93, 98, 105, 106, 107, 114, 117, 122, 124, 130, 136, 138, 139, 145, 150, 154, 156, 162, 163, 167, 172, 176, 182, 183, 188, 192, 198, 199]
1 [1, 7, 8, 12, 17, 21, 25, 31, 32, 38, 43, 46, 51, 52, 56, 60, 64, 68, 79, 8

OK, that's harder to read with so many samples, so let's check the stats.

In [43]:
PG.show_stats()

We placed each sample in 3 pools.

With this arrangement, we might need 1 follow up tests for each pool that is positive

The pool sizes range from 49 to 52, with an average of 51.5


This time we did have a bit of variability in pool sizes, but not too much.
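The number of pools per sample reported in both examples so far agrees with a simple counting argument: with p pools and k pools per sample there are C(p, k) distinct combinations of pools, and each sample needs its own combination to be uniquely identifiable. The sketch below computes the smallest such k. Whether `PoolGenerator` picks its default this way is an assumption; the helper `min_pools_per_sample` is hypothetical and only reproduces the arithmetic.

```python
from math import comb  # available in Python 3.8+


def min_pools_per_sample(num_samples, num_pools):
    """Smallest k with C(num_pools, k) >= num_samples, or None if no k works."""
    for k in range(1, num_pools + 1):
        if comb(num_pools, k) >= num_samples:
            return k
    return None


print(min_pools_per_sample(15, 6))    # 2, since C(6, 2) = 15
print(min_pools_per_sample(192, 12))  # 3, since C(12, 2) = 66 < 192 <= C(12, 3) = 220
```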

# The real deal

OK, now let's give it an impossible task, one with too many samples to identify the positives uniquely from the pool results alone, and see how close it gets.

This time, we'll also specify how many pools each sample should be in.  We might want to avoid diluting them too much.

In [44]:
importlib.reload(better_pool_generator)
PG2 = better_pool_generator.PoolGenerator(num_samples=300, num_pools=12, pools_per_sample=3)
pools = PG2.get_pools()
PG2.show_stats()

We placed each sample in 3 pools.

With this arrangement, we might need 2 follow up tests for each pool that is positive

The pool sizes range from 74 to 76, with an average of 75.0
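One plausible reading of the follow-up figure (an assumption, not something confirmed by the `better_pool_generator` source) is that it reflects how many samples are forced to share the same combination of pools. With 12 pools and 3 pools per sample there are only C(12, 3) = 220 combinations for 300 samples, so by the pigeonhole principle some combination must be shared by at least 2 samples. The hypothetical helper below does that arithmetic.

```python
from math import comb  # available in Python 3.8+


def signature_sharing(num_samples, num_pools, pools_per_sample):
    """Return (number of pool combinations, minimum worst-case samples per combination)."""
    combos = comb(num_pools, pools_per_sample)
    # Ceiling division: the fullest combination holds at least this many samples.
    worst_case = -(-num_samples // combos)
    return combos, worst_case


print(signature_sharing(300, 12, 3))  # (220, 2)
print(signature_sharing(15, 6, 2))    # (15, 1)
```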


In [45]:
for p in pools:
    print(p, pools[p])

0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 221, 226, 229, 233, 237, 244, 245, 249, 253, 256, 260, 266, 268, 272, 276, 280, 285, 288, 293, 299]
1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 222, 224, 230, 234, 238, 241, 243, 251, 253, 256, 263, 265, 270, 273, 279, 282, 287, 291, 293, 298, 299]
2 [0, 10, 11, 12, 13, 14, 15, 16, 17, 18, 55, 56, 57, 58, 59, 60, 61, 62, 63, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 220, 227, 230, 235, 238, 241, 244, 249, 254, 259, 262, 264, 267, 274, 279, 281, 286, 291, 294, 297]
3 [1, 10, 1

In [None]:
PG2.pool_counts + 2

In [None]:
len(PG2.sample_pools)

In [None]:
type(PG2.repeats)

In [47]:
importlib.reload(better_pool_generator)
PG2 = better_pool_generator.PoolGenerator(num_samples=3000, num_pools=25, pools_per_sample=4)
pools = PG2.get_pools()
PG2.show_stats()

We placed each sample in 4 pools.

With this arrangement, we might need 1 follow up tests for each pool that is positive

The pool sizes range from 522 to 523, with an average of 522.56


In [48]:
pools

defaultdict(list,
            {3: [0,
              12,
              15,
              22,
              30,
              35,
              39,
              47,
              50,
              61,
              62,
              72,
              75,
              85,
              88,
              93,
              100,
              111,
              115,
              124,
              125,
              132,
              141,
              146,
              150,
              160,
              165,
              169,
              180,
              185,
              187,
              194,
              204,
              209,
              214,
              222,
              229,
              236,
              239,
              245,
              251,
              259,
              265,
              270,
              273,
              284,
              288,
              295,
              301,
              310,
              312,
              322,
        