In [None]:
import numpy as np
from itertools import combinations
from scipy.special import comb
from collections import defaultdict, OrderedDict
import importlib
import better_pool_generator

PoolGenerator helps find a good pooled-testing design for a given number of samples and pools, so that as few tests as possible are needed to establish which individuals are positive.
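To make the goal concrete, here is a minimal sketch of how pooled results could be decoded once a design exists. It assumes the design is a dict mapping each pool ID to the sample IDs in it, and it uses the simple rule that a sample remains a candidate only if every pool containing it tested positive. `candidate_positives` is a hypothetical helper written for illustration; it is not part of `better_pool_generator`.

```python
def candidate_positives(pools, positive_pools):
    """Return the samples whose every pool tested positive.

    pools: dict mapping pool id -> iterable of sample ids (assumed layout).
    positive_pools: iterable of pool ids that tested positive.
    """
    positive_pools = set(positive_pools)

    # Invert the design: for each sample, collect the pools it appears in.
    membership = {}
    for pool_id, samples in pools.items():
        for sample_id in samples:
            membership.setdefault(sample_id, set()).add(pool_id)

    # A sample in any negative pool cannot be positive, so keep only
    # samples whose pools are all positive.
    return sorted(
        sample_id
        for sample_id, in_pools in membership.items()
        if in_pools <= positive_pools
    )


# Toy design: 3 samples, 3 pools, each sample in a distinct pair of pools.
toy_pools = {0: [0, 1], 1: [0, 2], 2: [1, 2]}

# If only sample 0 is positive, pools 0 and 1 test positive.
print(candidate_positives(toy_pools, {0, 1}))
```

Any candidate that this rule cannot separate from the true positives would need an individual follow-up test, which is why a design that gives samples distinct pool memberships is desirable.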

Let's start with a simple scenario in which we have only 15 samples and 6 pools.  

In [None]:
importlib.reload(better_pool_generator)
PG = better_pool_generator.PoolGenerator(num_samples=15, num_pools=6)
pools = PG.get_pools()
for p in pools:
    print(p, pools[p])

This lists each of the 6 pools (indexed 0 through 5) along with the sample IDs it should contain.  So the first pool should contain a portion of each of these samples:

In [None]:
pools[0]

Notice that each sample is in 2 pools.  We could also have found that out directly:

In [None]:
PG.show_stats()

This also shows that all pools are exactly the same size.  This won't usually happen, but the sizes should be close to equal even when the numbers don't divide evenly.
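One way to see why 15 samples and 6 pools work out so neatly: there are exactly 15 distinct pairs that can be drawn from 6 pools, so every sample can be given its own pair, and each pool then appears in 5 of those pairs. The sketch below builds such a design by hand. `unique_combination_pools` is a hypothetical helper for illustration, and `PoolGenerator` may well choose its assignment differently.

```python
from itertools import combinations


def unique_combination_pools(num_pools, pools_per_sample):
    """Assign one sample to every distinct combination of pools.

    Returns a dict mapping pool id -> list of sample ids.
    """
    pools = {p: [] for p in range(num_pools)}
    all_combos = combinations(range(num_pools), pools_per_sample)
    for sample_id, combo in enumerate(all_combos):
        # Each sample goes into every pool of its own unique combination.
        for p in combo:
            pools[p].append(sample_id)
    return pools


toy = unique_combination_pools(num_pools=6, pools_per_sample=2)
num_samples = len({s for members in toy.values() for s in members})
sizes = [len(members) for members in toy.values()]
print(num_samples, sizes)
```

Because no two samples share the same pair of pools, a single positive sample produces a pattern of positive pools that no other sample could produce.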

# Almost the real deal

Let's do a less trivial example with 192 samples and 12 pools.

In [None]:
PG = better_pool_generator.PoolGenerator(num_samples=192, num_pools=12)
pools = PG.get_pools()
for p in pools:
    print(p, pools[p])

OK, it's harder to read with that many samples, so let's check the stats.

In [None]:
PG.show_stats()

This time we did have a bit of variability in pool sizes, but not too much.

# The real deal

OK, now let's give it an impossible task, one where there are too many samples to uniquely identify the positives just from testing the pools, and see how close it gets.

This time, we'll also specify how many pools each sample should be in.  We might want to avoid diluting them too much.
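One way to quantify "impossible", assuming that identifying a single positive from pool results alone requires every sample to sit in its own distinct set of pools: the number of distinct sets is a binomial coefficient, and it must be at least the number of samples. The sketch below checks this for 300 samples, 12 pools and 3 pools per sample, the configuration used in the next cell. `max_unique_samples` is a hypothetical helper, not part of `better_pool_generator`.

```python
from math import comb  # available in Python 3.8+


def max_unique_samples(num_pools, pools_per_sample):
    """Number of distinct ways to choose pools_per_sample pools from num_pools."""
    return comb(num_pools, pools_per_sample)


capacity = max_unique_samples(12, 3)
# Prints the capacity and whether it covers 300 samples.
print(capacity, capacity >= 300)
```

With only 220 distinct triples available for 300 samples, some samples must share the same three pools, so pool results alone cannot always tell them apart.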

In [None]:
importlib.reload(better_pool_generator)
PG2 = better_pool_generator.PoolGenerator(num_samples=300, num_pools=12, pools_per_sample=3)
pools = PG2.get_pools()
PG2.show_stats()

In [None]:
for p in pools:
    print(p, pools[p])

Let's throw a bigger example at it to see the speed.  None of the code is highly optimized.

In [50]:
importlib.reload(better_pool_generator)
PG2 = better_pool_generator.PoolGenerator(num_samples=3000, num_pools=25, pools_per_sample=4)
pools = PG2.get_pools()
PG2.show_stats()

We placed each sample in 4 pools.

With this arrangement, we might need up to 0 follow up tests for each pool that is positive

The pool sizes range from 520 to 521, with an average of 520.8
