## Predicting Mammalian Cellular Circuit Behavior from Design Parameters   
#### Chris Coleman, ccol@u.northwestern.edu 
#### EECS 349, Northwestern University

#### Abstract

This project is related to my bioengineering research at Northwestern, which aims to understand engineered mammalian cells and their behavior. Specifically, I use different machine learning approaches to understand how specific design hyperparameters, e.g. quicker degradation or higher doses of certain cellular parts, affect behavior for a given design topology. Engineered mammalian cells can be used as both disease diagnostics and therapeutics, detecting attributes of an extracellular environment and actuating an appropriate biochemical response. For example, an ovarian cancer-detecting cell could sense the unique molecular profile of a specific tumor type and trigger a cascade of intracellular biological events that culminates in a specific response to the tumor.

Two distinct cellular topologies were created for this project: both were designed to sense two separate molecules and respond only if both molecules were present (i.e. both were Boolean AND gates). The first cellular design was easily learnable by all models because its output distribution was more uniform than that of the other design; the best predictor was a 6-layer neural network, with a classification accuracy of 98.69%. The second design was much more difficult to learn due to its imbalanced output data; the best predictor was an ensemble of 6 neural networks, achieving a classification accuracy of 92.23%. Despite the similarity of their desired behavior, the features that were important to each design varied considerably. This was interesting yet intuitive: designs underpinned by different biological mechanisms should not rely on their components in the same ways.

#### Methods

Cellular circuits rarely behave in the way that their digital counterparts do – the stochasticity underlying a population of cells ensures that different cells produce proteins at different rates, which causes their behavior to differ. At a high level, the diagram below shows how a cell can behave as an AND gate:

<img src="diagrams.png" style="width: 600px;"/>

The cells detect whether inputs are present using engineered receptors (the grey Y shapes on the thick black line, which represents the cell membrane). Detecting an input causes a receptor to release its payload. Together, a binding domain and an activation domain are required to trigger the creation of the output DNA signal. An extremely important concept is that the receptors are not perfect – sometimes they release their payloads even if their input is not present. Therefore, tuning the number of receptors we put on a cell membrane for a given design is essential; increasing the number of receptors increases the level of background noise in the absence of input, but more receptors may be needed to achieve a sufficient output signal for a given design.

The first design used for this project is analogous to the diagram above. However, the second design relies on the first receptor releasing an unstable binding domain and activation domain complex and the second receptor releasing an enzyme that will stabilize the first receptor’s product - both parts are required to trigger the output signal. For both designs, parameter sets were generated that varied receptor doses, protein production constants, and localization tags (which may inhibit activity of a component by sending it to a different place in the cell). The dataset for each design comprises 65,000 samples of distinct parameter sets generated by an ODE model developed for my research. The output for each sample is a measurement of AND-gate behavior, computed as the expression in the on-on state divided by the maximum expression across the other three states.
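The AND-gate metric described above can be sketched in a few lines. This is only an illustration, not the ODE model's actual code: the function name, the column order of the four input states, and the small `eps` guard are assumptions made for the example.

```python
import numpy as np

def and_gate_metric(expression, eps=1e-12):
    """Score AND-gate behavior from expression in the four input states.

    `expression` has shape (n_samples, 4). The column order
    (off-off, on-off, off-on, on-on) is an assumption for this sketch.
    """
    expression = np.asarray(expression, dtype=float)
    on_on = expression[:, 3]
    # Largest expression among the three states where the gate should stay off.
    worst_off = expression[:, :3].max(axis=1)
    # eps (an assumption) avoids division by zero if every off state is silent.
    return on_on / np.maximum(worst_off, eps)

# Two made-up parameter sets: a clean AND gate and a leaky one.
demo = np.array([
    [1.0, 2.0, 2.0, 20.0],
    [1.0, 8.0, 2.0, 10.0],
])
scores = and_gate_metric(demo)
print(scores)
```

A higher score means the on-on state stands out more clearly from the leakiest of the other three states.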

An important note is that the output distributions for the two designs are very different. The first design has a more uniform distribution than the other, which makes it an easier problem to learn. Two plots below depict this relationship. The first plot is a boxplot of the designs’ outputs, which shows the spread and central tendency of their distributions. From this, we can see that it is much harder to find parameter sets that lead to high-performing cells for the second design. The second plot is a t-SNE representation of the cell parameters, which learns a two-dimensional mapping of the high-dimensional parameter input. While the first design has small clusters of high-performing cells, it’s much more difficult to see clusters for the second design.

<img src="metrics.png" style="width: 500px;"/>
<img src="SNE.png">
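A t-SNE embedding like the one plotted above can be produced with sklearn's `TSNE`. The sketch below runs on synthetic parameters; the sample count, feature count, and perplexity are placeholder assumptions, not the settings used for the figure.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic stand-in for the design parameters: 300 samples, 8 features.
params = rng.normal(size=(300, 8))

# Map the high-dimensional parameters to two dimensions for plotting.
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
embedding = tsne.fit_transform(params)

print(embedding.shape)
```

Coloring each embedded point by its AND-gate metric would then show whether high-performing parameter sets cluster together.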

#### Results

When predicting the output of both Design 1 and Design 2, I used a 60/15/25% training/validation/testing split. I normalized the training/validation and testing data separately; all real-valued features were standardized to a mean of zero and a standard deviation of one, and all discrete features were min-max scaled to the range of zero to one. Both designs were analyzed with regression and classification approaches. Since the metric for AND-gate behavior is real-valued, a classification output was generated by finding the metric’s median and labeling any value below the median as 0 and any value above the median as 1. The median for the first design was 3.57, meaning that cells that were classified as zeros were either low-performing or medium-performing and the cells that were classified as ones were high-performing. On the other hand, the median for the second design was 1.37, meaning that cells that were classified as zeros were low-performing and the cells that were classified as ones were either medium-performing or high-performing. The median was used as a classification threshold to generate perfectly balanced classes.
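The preprocessing described above can be sketched with NumPy alone. The data here is synthetic, and the feature counts, helper names, and the choice to take the median over the full dataset are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 3 real-valued features, 1 discrete feature, 1 metric.
n = 1000
X_real = rng.lognormal(size=(n, 3))
X_disc = rng.integers(0, 4, size=(n, 1)).astype(float)
y = rng.lognormal(size=n)

# 60/15/25% training/validation/testing split on shuffled indices.
idx = rng.permutation(n)
n_train, n_val = int(0.60 * n), int(0.15 * n)
train_val, test = idx[: n_train + n_val], idx[n_train + n_val:]

def standardize(a):
    # Zero mean and unit standard deviation per column.
    return (a - a.mean(axis=0)) / a.std(axis=0)

def min_max(a):
    # Rescale each column to the range [0, 1].
    return (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))

def preprocess(rows):
    # Each partition is scaled using only its own statistics.
    return np.hstack([standardize(X_real[rows]), min_max(X_disc[rows])])

X_train_val, X_test = preprocess(train_val), preprocess(test)
X_train, X_val = X_train_val[:n_train], X_train_val[n_train:]

# Median split: 0 below the median, 1 above it, giving balanced classes.
labels = (y > np.median(y)).astype(int)
print(X_train.shape, X_val.shape, X_test.shape, labels.mean())
```

A common alternative is to fit the scaling statistics on the training data only and reuse them for the test data; the sketch follows the separate normalization described in the text instead.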
 
The results of predicting behavior for Design 1’s cells using methods from the sklearn package are shown below. The table shows that random forests performed best with the regression task and a 6-layer neural network performed best with the classification task.

<img src="table1.png" style="width: 600px;"/>
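For readers who want to reproduce this kind of comparison, here is a minimal sklearn sketch of the two model families named above. It runs on synthetic data; the hidden layer sizes, iteration limit, and the synthetic metric are placeholder assumptions, not the configurations behind the table.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 5 features and a positive, skewed behavior metric.
X = rng.normal(size=(600, 5))
metric = np.exp(X[:, 0] - 0.5 * X[:, 1] ** 2)
labels = (metric > np.median(metric)).astype(int)

X_tr, X_te, m_tr, m_te, c_tr, c_te = train_test_split(
    X, metric, labels, test_size=0.25, random_state=0
)

# Regression task: predict the real-valued metric.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_tr, m_tr)
r2 = forest.score(X_te, m_te)  # coefficient of determination on held-out data

# Classification task: predict the median-split label.
# Four hidden layers of 32 units are an arbitrary choice for this sketch.
net = MLPClassifier(hidden_layer_sizes=(32, 32, 32, 32), max_iter=500, random_state=0)
net.fit(X_tr, c_tr)
accuracy = net.score(X_te, c_te)

print(f"forest R^2: {r2:.3f}, network accuracy: {accuracy:.3f}")
```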

The results of predicting behavior for Design 2’s cells using the same sklearn methods are shown below. All models performed poorly with respect to the regression task but fared better with the more balanced classification task.

<img src="table2.png" style="width: 500px;"/>

To see if classification accuracy could be improved, I used Python’s Keras package to create my own neural network architectures, which were chosen through several rounds of tinkering. All architectures were trained with early stopping informed by validation loss. The last model trained was an ensemble of the other trained neural network models. The final row is a useful comparison to the performance of Random Forests.


<img src="table3new.png" style="width: 600px;"/>
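One simple way to combine several trained classifiers is soft voting, sketched below. Averaging predicted probabilities is an assumption about how the ensemble combines its members, and the probabilities are made up for the example.

```python
import numpy as np

def ensemble_predict(member_probs, threshold=0.5):
    """Soft vote: average each member's predicted P(class = 1), then threshold."""
    mean_prob = np.mean(np.asarray(member_probs, dtype=float), axis=0)
    return (mean_prob >= threshold).astype(int)

# Hypothetical outputs of three networks on four parameter sets.
member_probs = [
    [0.9, 0.4, 0.2, 0.6],
    [0.8, 0.6, 0.1, 0.4],
    [0.7, 0.7, 0.3, 0.3],
]
ensemble_labels = ensemble_predict(member_probs)
print(ensemble_labels)
```

Averaging lets members that disagree on a borderline sample offset each other, which is one reason an ensemble can beat its individual members.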

The table shows that a 6-model ensemble of well-trained neural networks was able to outperform a random forest of 100 estimators in terms of overall accuracy. Another important metric for use in an experimental context is the false positive count of the models. The process for building a cellular circuit can take over 6 months, meaning that if a design with poor-performing parameters is falsely identified as high-performing, both time and costly resources will be wasted. With that in mind, the second architecture in the table above would be the best model to use when false positives are the main concern.
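Counting false positives only requires the true and predicted labels. The sketch below uses made-up labels and a hypothetical helper name.

```python
import numpy as np

def false_positive_stats(y_true, y_pred):
    """Return (false positive count, false positive rate) for binary labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    fp = int(np.sum(~y_true & y_pred))   # predicted high-performing, actually not
    tn = int(np.sum(~y_true & ~y_pred))  # correctly predicted low-performing
    rate = fp / (fp + tn) if (fp + tn) else 0.0
    return fp, rate

# Made-up labels: 1 = high-performing, 0 = low-performing.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]
fp_count, fp_rate = false_positive_stats(y_true, y_pred)
print(fp_count, fp_rate)
```

Here one of the four truly low-performing designs is flagged as high-performing, so the false positive rate is 0.25.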

These models also showed that the feature ranges for high-performing cells varied across designs. High-performing cells in Design 1 favored far more receptors than those in Design 2, which suggests that Design 2 is more sensitive to receptors leaking their payloads in the absence of input. While there wasn't a strong localization preference for any of the parts in Design 1, high-performing cells for Design 2 frequently had a localization tag on the binding-activation domain complex, which made the complex less likely to interact with the "DNA Box." Both of these trends suggest that Design 1 is far more robust than Design 2 - Design 1 is able to perform well across a wide variety of parameter ranges, while Design 2 requires specific constraints on the parameters to be satisfied in order for a cell to be a high-performing AND-gate.

#### Conclusion

In this project, I showed that although different cellular topologies have different distributions of output metrics and different preferences for design parameters, it is still possible to classify a given set of parameters as low-performing or high-performing. This ability is extremely useful for understanding the ways in which cells are affected by parameters like receptor doses, protein production constants, and localization tags. A viable and exciting next step would be to follow the approach outlined here for hundreds of cellular designs and construct a database of neural network weights for those topologies. Then, it might be possible to compare designs via the weights of the models that represent them, asking questions like, "how similar are two high-performing AND-gate designs?", "how dissimilar are this AND-gate and this OR-gate?" and finally, "how can we use our understanding of previous designs to motivate the creation of new designs?"