# Gillespie's algorithm for logistic branching process where all cells are different

I am interested in modelling evolution in cancer cells. The logistic branching process [1] is a stochastic process that follows logistic growth. I'm looking at a slightly modified version that works as follows. Consider a population of $N$ cells. Individual cells divide (into two) at a rate

$b = r - (N-1)I_b$

and die at a rate

$d = s + (N-1)I_d$

If the birth rate is ever negative, the negative excess is added to the death rate. The process will tend to hover around some population average $\hat{N}$. This is easily simulated with Gillespie's algorithm:

1. Calculate the birth and death rates
2. Get a waiting time from the exponential distribution based on the total rate
3. Select division or death with a probability proportional to their rates.
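For concreteness, here is a minimal C++ sketch of these three steps for the homogeneous case. The names `Rates`, `per_cell_rates` and `simulate` are made up for illustration and are not taken from my actual simulator; the sketch assumes the per-cell rates $b$ and $d$ above are multiplied by $N$ to get the population-level rates.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Per-cell birth and death rates (hypothetical helper type).
struct Rates {
    double birth;
    double death;
};

// b = r - (n-1)*I_b and d = s + (n-1)*I_d for a population of n cells.
// A negative birth rate is set to zero and its magnitude is added to the
// death rate, as described in the text.
Rates per_cell_rates(std::int64_t n, double r, double s, double I_b, double I_d) {
    assert(n >= 1);
    double b = r - static_cast<double>(n - 1) * I_b;
    double d = s + static_cast<double>(n - 1) * I_d;
    if (b < 0.0) {
        d += -b;
        b = 0.0;
    }
    return {b, d};
}

// Simulate until time t_end or extinction; returns the final population size.
std::int64_t simulate(std::int64_t n0, double r, double s, double I_b, double I_d,
                      double t_end, std::mt19937_64& rng) {
    std::int64_t n = n0;
    double t = 0.0;
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    while (n > 0) {
        // Step 1: rates (identical for every cell in this version).
        const Rates rates = per_cell_rates(n, r, s, I_b, I_d);
        const double total_birth = static_cast<double>(n) * rates.birth;
        const double total = static_cast<double>(n) * (rates.birth + rates.death);
        if (total <= 0.0) break;  // no event can occur

        // Step 2: exponential waiting time with the total rate.
        std::exponential_distribution<double> waiting(total);
        t += waiting(rng);
        if (t > t_end) break;

        // Step 3: division with probability total_birth / total, else death.
        if (uniform(rng) * total < total_birth) {
            ++n;
        } else {
            --n;
        }
    }
    return n;
}
```

Each event costs $O(1)$ here because all cells share the same rates, so only the population size has to be tracked.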

Simulating the process for a time $t$ takes time proportional to $\hat{N}$, since the number of events within a time interval depends on the number of cells. This is not a problem.

Consider a situation where every cell has a (near) unique base birth rate $r$. It is determined by the cell's genome, which changes upon each division, and also by time. The variation in time is slow, however, so we can ignore the detailed effects of it changing during the simulation. The algorithm now becomes:

1. Calculate birth and death rates for each cell
2. Get waiting time
3. Select event
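A rough C++ sketch of one such step, recomputing every cell's rates from scratch, might look like this. The names `Event` and `next_event` and the vector `base_rates` (holding each cell's own $r$) are hypothetical, and selecting the event by a linear scan is just one simple choice, not necessarily what my simulator does.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Outcome of one Gillespie step (hypothetical helper type).
struct Event {
    double dt;         // waiting time until the event
    std::size_t cell;  // index of the cell that divides or dies
    bool is_division;  // true: division, false: death
};

// One step for a population where cell i has its own base birth rate
// base_rates[i]; s, I_b and I_d are shared by all cells.
Event next_event(const std::vector<double>& base_rates, double s, double I_b,
                 double I_d, std::mt19937_64& rng) {
    assert(!base_rates.empty());
    const std::size_t n = base_rates.size();
    const double crowding = static_cast<double>(n - 1);

    // Step 1: birth and death rates for each cell, O(N) work.
    std::vector<double> birth(n), death(n);
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double b = base_rates[i] - crowding * I_b;
        double d = s + crowding * I_d;
        if (b < 0.0) {  // move the negative excess to the death rate
            d += -b;
            b = 0.0;
        }
        birth[i] = b;
        death[i] = d;
        total += b + d;
    }
    assert(total > 0.0);

    // Step 2: exponential waiting time with the total rate.
    std::exponential_distribution<double> waiting(total);
    const double dt = waiting(rng);

    // Step 3: pick (cell, event type) with probability proportional to its rate.
    std::uniform_real_distribution<double> uniform(0.0, total);
    double u = uniform(rng);
    for (std::size_t i = 0; i < n; ++i) {
        if (u < birth[i]) return {dt, i, true};
        u -= birth[i];
        if (u < death[i]) return {dt, i, false};
        u -= death[i];
    }
    // Floating-point rounding can push u past the end; fall back to the last cell.
    return {dt, n - 1, death[n - 1] <= 0.0};
}
```

The caller would then update `base_rates` after each event (remove the dead cell, or replace the parent with two daughters carrying new rates) and call `next_event` again.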

Note that the cost of step one grows with the number of cells. Thus simulating for a time $t$ is now $O(\hat{N}^2)$. I need to run these simulations many times to gather statistics. That is embarrassingly parallel and simple to do. But as it stands, the simulations themselves are slower than is practical. Thus I have two avenues of optimization:

1. Run multiple processes for gathering statistics on separate cores
2. Speed up the calculation of birth and death rates (with millions of cells, this is one large vector operation) by running it on a GPU.

It seems unlikely that parallelizing the rate calculation across CPU cores would provide any wall-clock speedup when running more than one simulation, since the other cores could be used more effectively by just running more simulations. But since it is a simple repeated operation, maybe a GPU can speed it up.

### Feasibility
The whole simulator is less than 200 lines of C++ code, designed to be flexible in how the cells' rates are determined. So there is not a lot of code to modify.