# Reconstruction of a GSM for *Roseobacter litoralis* and laccase production

## 1. Introduction

### 1.1 Literature review of laccase

Laccase is an enzyme that belongs to the group of multi-copper oxidases. It contains four copper ions and occurs naturally in fungi, bacteria and plants. Laccase requires molecular oxygen as a co-substrate and produces only water as a by-product, which makes it eco-friendly and thus scientifically interesting (Brijwani et al. 2010). It attracts further interest through its wide range of applications in various industries, including the food, pharmaceutical and textile industries. The uses of laccase in these industries are summarized in table 1.



| Industry | Examples |
| --- | --- |
| Food | Beer production: removal of oxygen in finished beer |
| | Baking: increased strength and stability of the dough |
| | Wine production: selective polyphenol removal |
| Pharma | Development of antibiotics, anticancer drugs and sedatives; melanin synthesis |
| Textile | Bleaching, dyeing, printing, wash-off treatment and finishing |


<div align="center">
Table 1: Applications of laccase in different industries (Mayolo-Deloisa et al. 2020, Chaurasia et al. 2017, Garje et al. 2020).
</div>





One property of laccase that might be especially interesting for future applications is its major role in polyethylene degradation (Santo et al. 2013). This could provide a natural route to degrade the plastic polluting oceans and landscapes, which would be a breakthrough in environmental biotechnology.

Owing to the application of laccase in the food, textile and pharmaceutical industries, the global laccase market was valued at 2947.7 million USD in 2020. Due to the rise of alternative enzymes in the textile industry, the market size is projected to decrease to 2850 million USD by 2026 (Laccase Market Size 2020). The market size could, however, expand with the introduction of new laccase applications.

Laccase has mostly been researched in fungi, although it is also present in some bacterial and plant cells. It is an extracellular enzyme that is produced as a secondary metabolite in filamentous fungi. The laccase gene has been successfully cloned and expressed in organisms such as *Aspergillus niger*, *Aspergillus oryzae* and *Trichoderma reesei* (Brijwani et al. 2010).

### 1.2 Literature review of *Roseobacter litoralis*

*Roseobacter litoralis* is found in a wide range of marine habitats, from the open ocean to coastal areas, and can be isolated from seaweed. *R. litoralis* is an aerobic, pink-pigmented bacterium containing bacteriochlorophyll, which allows it to perform aerobic anoxygenic photosynthesis (Shiba et al., 1991; Kalhoefer et al., 2011). *R. litoralis* requires a saline environment and can grow on a broad range of carbon and nitrogen sources (Martens et al., 2007).

Due to its high tolerance to saline stress, *R. litoralis* is an interesting candidate cell factory for the bioremediation of plastic waste in the oceans. This could be made possible by heterologous expression of laccase in *R. litoralis*. Furthermore, the genome of *R. litoralis* reveals several potential heavy metal resistance genes, which could be an advantage in bioremediation.

Because of its diverse metabolism, *R. litoralis* can be considered for genetic engineering. It has already been described as a producer of secondary metabolites such as antibiotics (Martens et al., 2007; Brock et al., 2014; Wang et al., 2016b), and some techniques to manipulate its genome have been developed (Borg et al., 2016; Tang et al., 2016). However, compared to other chassis organisms, *R. litoralis* is a poorly studied cell factory with no available genome-scale models or metabolic flux analyses. <font color=red>It might therefore be easier to genetically engineer a better-studied organism to tolerate saline stress and then introduce the laccase production pathway into it.</font>
<font color=red>**To enable genetic engineering of *R. litoralis*, more theoretical work must be carried out to create a better understanding of its metabolism.**</font>

## 2. Problem definition

<font color=red>The aim of this project is to construct a genome-scale model (GSM) for *Roseobacter litoralis*. This GSM is to be used as a framework for the utilization of *R. litoralis* as a cell factory. As *R. litoralis* is not a well-studied organism, a GSM is a step towards providing the theoretical basis needed for genetic engineering.\
The GSM will be optimized and validated with available experimental data found in the literature.</font>

<font color=red>*R. litoralis* is a proposed heterologous producer of laccase, which is interesting for the bioremediation of plastic waste in marine ecosystems. The constructed GSM will therefore be used to introduce laccase production. The production and yield of laccase in *R. litoralis* will be evaluated and compared to those of *E. coli*.</font>

The aim of this project is to reconstruct a genome-scale model (GSM) for *Roseobacter litoralis*. This includes model optimization, medium optimization and model validation based on experimental data found in the literature.
Furthermore, the pathway for the production of laccase will be introduced and optimized in *R. litoralis* for the bioremediation of plastics in marine ecosystems.

## 3. Reconstruction of a new GSM for *R. litoralis*


A draft genome-scale model was constructed with CarveMe, using LB medium for gap-filling, as shown in the [GSM](GSM.ipynb) file. The data used for the draft were retrieved from the NCBI genome database entry for *Roseobacter litoralis* B14 (https://www.ncbi.nlm.nih.gov/nuccore/NZ_LGTP00000000.1).

### 3.1 Memote analysis

In order to analyze the performance of the model, it was tested using Memote. The results are summarized in the [Memote report](Memote_Roseobacter-litoralis-strain-B14.html). The total score of the Memote analysis is 22%. The score of each section is summarized in table 2.

| Section | Score |
| --- | --- |
| Consistency | 47% |
| Annotation of reactions | 25% |
| Annotation of metabolites | 25% |
| Annotation of genes | 0% |
| Annotation of SBO terms | 0% |
| Total score | 22% |

<div align="center">
Table 2: Sections and corresponding scores from Memote analysis.   
</div>

The model did not contain any cross-references to the databases evaluated by Memote. Therefore, the annotation scores for all these databases are 0%. Because there are no cross-references, no inconsistencies in the identifiers of genes and reactions can occur. Hence, the “Uniform Identifier Namespace” scores for reactions and metabolites are 100%.

In the “Consistency” section, the model generated good results in terms of “Charge Balance” and “Metabolite Connectivity”, meaning that almost all reactions are charge balanced and that all occurring metabolites are part of a reaction.

The model needs improvement in terms of “Mass Balance”, as more than 40% of the reactions are not mass balanced. Missing cofactors and incorrectly defined transport directions cause some fluxes to run at either their maximum or minimum bound. Further improvements are required to reduce the number of reactions that carry unbounded fluxes under default model conditions. Balanced reactions and bounded fluxes are important for reliable yield predictions.

In the section “Basic Information” it is reported that a total of 1111 genes are included in our model, while according to NCBI the genome of *R. litoralis* contains a total of 4755 coding genes. The genes missing from our model could, if included, possibly improve the performance of our model in the section “Consistency”. The model has a total of 1483 metabolites participating in 2139 reactions, located in 3 compartments: the cytosol, the periplasm and the extracellular space.

The “Metabolic Coverage” is 1.9 and thus above 1, which according to Memote indicates a model with a high level of detail. Furthermore, there are six unconserved metabolites, meaning that their net stoichiometries are inconsistent, resulting in an overall negative mass (Gevorgyan et al. 2008). A list of these can be found in [Appendix 1](#appendix_1).
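As a quick plausibility check of the reported coverage, the following sketch assumes that Memote computes metabolic coverage as the number of reactions divided by the number of genes. The function name `metabolic_coverage` is ours and only illustrative; the counts are those reported in the “Basic Information” section.

```python
# Counts reported in the "Basic Information" section of the Memote report.
n_reactions = 2139
n_genes = 1111


def metabolic_coverage(reactions: int, genes: int) -> float:
    """Ratio of reactions to genes (assumed definition of metabolic coverage)."""
    if genes <= 0:
        raise ValueError("gene count must be positive")
    return reactions / genes


coverage = metabolic_coverage(n_reactions, n_genes)
print(f"Metabolic coverage: {coverage:.1f}")
```

Under this assumed definition, the reported reaction and gene counts are consistent with the coverage value given in the report.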


### 3.2 Escher map

In order to visualize the reactions of our model, we loaded the Escher map of the *E. coli* core metabolism and added our model data using the Escher Builder. In the [Escher map](escher.html), the reactions of the *E. coli* core metabolism that are also present in *R. litoralis* are shown with blue lines, while the reactions in the map that are not present in *R. litoralis* are highlighted in red. The map provides an overview of which, and how many, reactions of the *E. coli* core metabolism are absent from the *R. litoralis* model. For some of these reactions, a homologous reaction might exist in *R. litoralis*.

### 3.3 Medium optimization

Due to the lack of experimental data on *R. litoralis*, the GSM could not be improved based on measured fluxes or similar data. However, a growth rate for *R. litoralis* was reported by Piekarski et al. (2009). We therefore decided to adjust the medium so that the growth rate predicted by the model would match the experimental growth rate from that paper.

In order to adjust our model to the experimental data, we compared our maximal growth rate on LB medium, which is found in the [GSM](GSM.ipynb) file, to the one measured on LB medium by Piekarski et al. This revealed that our growth rate is much higher than the value found in the literature ($0.70\ h^{-1}$ compared to $0.27\ h^{-1}$). Therefore, we wrote a loop that stepwise reduces the bounds of all exchange reactions in the medium except glucose and calculates the corresponding growth rates. These computations are shown in the [Medium optimization](MediumOptimization.ipynb) file. A plot of the upper bound of the exchange reactions against the growth rate is shown in figure 1.

It was found that, in order to reach a growth rate of 0.27 $h^{-1}$, all exchange reactions in the medium except glucose must be set to an upper bound of 2.5.
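The stepwise scan described above can be sketched as follows. This is a minimal illustration, not the code from the notebook: `scan_medium_bounds`, `closest_level` and `toy_growth` are hypothetical names, the exchange IDs are illustrative BiGG-style identifiers, and `toy_growth` is a made-up stand-in for the FBA solve, so the numbers it produces are not model results.

```python
from typing import Callable, Dict, List, Tuple


def scan_medium_bounds(
    medium: Dict[str, float],
    growth_fn: Callable[[Dict[str, float]], float],
    levels: List[float],
    keep_fixed: str,
) -> List[Tuple[float, float]]:
    """Set every exchange bound except `keep_fixed` to each level in turn
    and record the growth rate returned by `growth_fn`."""
    results = []
    for level in levels:
        trial = {
            rxn: (bound if rxn == keep_fixed else level)
            for rxn, bound in medium.items()
        }
        results.append((level, growth_fn(trial)))
    return results


def closest_level(results, target):
    """Return the (level, growth) pair whose growth is nearest to the target."""
    return min(results, key=lambda pair: abs(pair[1] - target))


# Hypothetical stand-in for an FBA solve: growth scales with the scarcest
# nutrient. With a real model, growth_fn would apply the trial medium to the
# model and return the optimized biomass flux instead.
def toy_growth(medium):
    return 0.07 * min(medium.values())


medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 10.0, "EX_nh4_e": 10.0}
levels = [10.0, 8.0, 6.0, 4.0, 2.0]
results = scan_medium_bounds(medium, toy_growth, levels, keep_fixed="EX_glc__D_e")
best = closest_level(results, target=0.27)
print(best)
```

Passing the growth computation in as a callable keeps the scan logic independent of the modelling library, so the same loop can be tested with a toy function and then reused with the actual model.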

Subsequently, we wanted to find the limiting factors for growth. We therefore reset the exchange reactions to their default values one at a time and computed the resulting growth rates. These growth rates were compared to 0.27 $h^{-1}$ to detect whether supplying more of the component in question would increase the growth rate. Oxygen was found to be the limiting factor, as it was the only component that led to a significant increase in growth rate (0.67 $h^{-1}$ compared to 0.27 $h^{-1}$) when reset to its default value of 10. An increase in the glucose exchange level naturally leads to increased growth, as glucose is the main carbon source. Fe(III) was also found to increase the growth rate slightly, but the effect is so small that it can be neglected. We also found that our *R. litoralis* model grows anaerobically with a growth rate of 0.14 $h^{-1}$.
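The one-at-a-time reset can be sketched in the same spirit. Again this is only an illustration under stated assumptions: `find_limiting_exchanges` and `toy_growth` are hypothetical, the toy growth function is deliberately constructed to be oxygen-limited, and its values are not taken from the model.

```python
def find_limiting_exchanges(reduced_medium, default_medium, growth_fn, tolerance=0.01):
    """Restore one exchange bound at a time to its default value and report
    the exchanges whose restoration raises growth by more than `tolerance`."""
    baseline = growth_fn(reduced_medium)
    gains = {}
    for rxn, default_bound in default_medium.items():
        trial = dict(reduced_medium)  # copy, so each test starts from the reduced medium
        trial[rxn] = default_bound
        gain = growth_fn(trial) - baseline
        if gain > tolerance:
            gains[rxn] = gain
    return baseline, gains


# Hypothetical stand-in for the FBA solve: growth depends only on glucose and
# oxygen and is oxygen-limited in the reduced medium.
def toy_growth(medium):
    return 0.06 * min(medium["EX_glc__D_e"], 2 * medium["EX_o2_e"])


default_medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 10.0, "EX_nh4_e": 10.0, "EX_fe3_e": 10.0}
reduced_medium = {"EX_glc__D_e": 10.0, "EX_o2_e": 2.5, "EX_nh4_e": 2.5, "EX_fe3_e": 2.5}

baseline, gains = find_limiting_exchanges(reduced_medium, default_medium, toy_growth)
print(baseline, gains)
```

The `tolerance` parameter plays the role of the "significant increase" criterion in the text; its value here is arbitrary.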

<center><img src='Growthrateplot.png'></center>
<div align="center">
Figure 1: Upper bound of the medium exchange reactions plotted against the obtained growth rate. The upper bound set for the exchange reactions in the medium (except glucose) is shown on the y-axis, and the growth rate in $h^{-1}$ on the x-axis. The black vertical line marks the point at which the desired growth rate is obtained.
</div>

### 3.4 Model validation

#### 3.4.1 Literature review on substrate test

As mentioned earlier, it is difficult to find experimental data concerning *R. litoralis* in the literature. Yet we found a substrate test that included 55 different carbon sources, 12 different nitrogen sources and 1 sulfur source (Kalhoefer et al., 2011). These data were used to validate our GSM of *R. litoralis*. If our model is accurate, it should predict growth on the same substrates as found in the paper.

Prior to the substrate test, Kalhoefer et al. characterized the genome of *Roseobacter litoralis* OCh149. This information on the genome of *R. litoralis* was used to predict which substrates the microorganism would grow on before conducting the actual growth experiment.

In the experiment, *R. litoralis* was grown on minimal media containing different carbon and nitrogen sources. Growth was measured as OD600 using a spectrophotometer and assigned to five categories: strong growth, medium growth, little growth, no growth, or growth equal to or less than the negative control.

For our model validation we reduced those five categories to two: growth and no growth.
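A minimal sketch of this reduction is shown below. The category labels and the cut-off are assumptions made for illustration: here any measurable growth (strong, medium or little) counts as growth, and the remaining two categories count as no growth. The actual mapping used in the notebook may differ.

```python
# Assumption: any category describing measurable growth maps to True.
# The label strings are illustrative and not taken from the paper's data file.
CATEGORY_TO_GROWTH = {
    "strong growth": True,
    "medium growth": True,
    "little growth": True,
    "no growth": False,
    "equal or less than negative control": False,
}


def binarize(category: str) -> bool:
    """Map one of the five experimental categories to growth / no growth."""
    return CATEGORY_TO_GROWTH[category.strip().lower()]


print(binarize("Strong growth"))
print(binarize("no growth"))
```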

#### 3.4.2 Model validation with carbon source experiments

To be able to use the experimental data described in section 3.4.1, the growth of our model on the different carbon sources had to be computed. The code for validating the GSM can be found in the [model validation](ModelValidation.ipynb) file. The process is described below.

To test the carbon sources described in the paper one at a time, the carbon sources in the medium were first removed. Subsequently, the exchange reactions for the 55 carbon sources from Kalhoefer et al. were looked up in the model, and it was found that the model lacked exchange reactions for 11 of the 55 carbon sources.

In this project, we decided to ignore the missing exchange reactions and only test the carbon sources whose exchange reactions are present in the model. The missing exchange reactions would indicate an error in our model if *R. litoralis* can in fact grow on the corresponding carbon sources. It is also possible that *R. litoralis* simply lacks the genes needed to take up these carbon sources, in which case our model is not in error in this regard. Further literature and computational research is needed to evaluate this problem. The discarded carbon sources are listed in [model validation](ModelValidation.ipynb).

The remaining 44 carbon sources were separately introduced to the medium, to test if our model could grow on them. However, the model only grew on arginine. 

After further examination of this outcome, we found that the medium lacked an appropriate nitrogen source, which caused the lack of growth. The model could grow on arginine only because of its nitrogen content, which allowed arginine to serve as both the nitrogen and the carbon source. We investigated whether any other nitrogen source exchange reaction would lead to growth on glucose, but the only nitrogen source that led to growth was arginine.

The solution was to rerun CarveMe using the minimal medium M9 for gap-filling. The medium is now minimal, and *R. litoralis* can grow on NH4 as the sole nitrogen source. The resulting model lacked 1 exchange reaction in addition to the 11 already identified, resulting in 12 ignored carbon sources. The change to M9 medium allowed us to test growth on the remaining 43 carbon sources. The results of the computational and experimental growth of *R. litoralis* can be seen in table 3.

```python
import pandas as pd
from IPython.display import display

# Load experimental vs. predicted growth; the first CSV column is the row index.
df = pd.read_csv("growthcsources.csv", index_col=0)
display(df)
```

| Carbon source | Growth in experiment | Growth in model |
| --- | --- | --- |
| L-Alanine | Yes | Yes |
| L-Arginine | Yes | Yes |
| L-Aspartate | Yes | Yes |
| L-Glutamate | No | Yes |
| L-Glutamine | Yes | Yes |
| Glycine | Yes | Yes |
| L-Histidine | Yes | Yes |
| L-Isoleucine | No | No |
| L-Leucine | No | No |
| L-Lysine | No | Yes |

Table 3: Carbon sources and whether growth of <i>R. litoralis</i> on them was observed experimentally and predicted by our model.

#### 3.4.3 Confusion matrix

To visualize the validation of our GSM, a confusion matrix from the data in table 3 was constructed and is presented in figure 2.

![title](Confusionmatrix.png)

<div align="center">
Figure 2: Confusion matrix describing the performance of our model on the set of carbon sources tested by Kalhoefer et al.
</div>

##### Structure of the matrix
The matrix consists of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). \
True positives are carbon sources on which the model predicted growth and on which *Roseobacter* also grew in the experiment. \
False positives are carbon sources on which the model predicted growth but on which *Roseobacter* did not grow in the experiment. \
True negatives are carbon sources on which the model predicted no growth and on which *Roseobacter* also did not grow in the experiment. \
False negatives are carbon sources on which the model predicted no growth but on which *Roseobacter* did grow in the experiment. \
From the values TP, FP, TN and FN, several performance measures of the model can be computed.
##### Precision
The precision is calculated as the true positives divided by all positive model predictions. Thus, it is a measure of how likely a positive prediction is to be true. For our model, the precision is 0.70, which means that 70% of the model's positive predictions were confirmed by a positive experimental result.
##### Negative predictive value
The negative predictive value is the counterpart of the precision for negative results, i.e. it is a measure of how likely a negative prediction is to be true. In our case it is 0.67, so 67% of the model's negative predictions were correct.
##### Sensitivity
The sensitivity is the fraction of positive experimental results detected correctly by the model. In our case it is 0.88, which is quite high and suggests that the model detects positive results well.
##### Specificity
The specificity is the fraction of negative experimental results detected correctly by the model. In our case it is only 0.38, which is very low: the model correctly detects a negative result in only 38% of the cases. This means that our model has a high rate of false positives and predicts growth on many carbon sources on which *R. litoralis* does not grow. This might be due to the model containing genes that are inactive in vivo, which could be determined by MFA.
##### Accuracy
The accuracy is the ratio of correct model predictions to all predictions. It can therefore be used as a measure of overall model performance. The model accuracy is 0.69, which is not particularly high and is mostly lowered by the many false positives. This suggests that several improvements can be made to the model to improve its performance in predicting the growth of *R. litoralis*.
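To make the five measures concrete, the sketch below computes them from the ten carbon sources listed in table 3. Because it uses only those ten rows, its output is not expected to reproduce the values reported above for the full set of tested carbon sources. The helper names `confusion_counts` and `performance` are ours.

```python
# (carbon source, growth in experiment, growth in model), from table 3.
rows = [
    ("L-Alanine", True, True),
    ("L-Arginine", True, True),
    ("L-Aspartate", True, True),
    ("L-Glutamate", False, True),
    ("L-Glutamine", True, True),
    ("Glycine", True, True),
    ("L-Histidine", True, True),
    ("L-Isoleucine", False, False),
    ("L-Leucine", False, False),
    ("L-Lysine", False, True),
]


def confusion_counts(rows):
    """Count TP, FP, FN and TN, treating the experiment as ground truth."""
    tp = sum(1 for _, exp, mod in rows if exp and mod)
    fp = sum(1 for _, exp, mod in rows if not exp and mod)
    fn = sum(1 for _, exp, mod in rows if exp and not mod)
    tn = sum(1 for _, exp, mod in rows if not exp and not mod)
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Safe division; returns NaN when a class is empty."""
    return numerator / denominator if denominator else float("nan")


def performance(tp, fp, fn, tn):
    return {
        "precision": ratio(tp, tp + fp),
        "negative predictive value": ratio(tn, tn + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


tp, fp, fn, tn = confusion_counts(rows)
scores = performance(tp, fp, fn, tn)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```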

### 3.6 Suggested experiments

As experimental data on *Roseobacter litoralis* are scarce, it has been difficult to improve our GSM. The next step for improving the GSM is therefore to obtain more experimental data for this organism. Some useful experiments might be:

- Flux analysis such as 13C-MFA
- Growth rates in minimal medium
- Determination of the biomass composition

Metabolic flux analysis can be used to improve and validate the present model by adjusting the flux bounds. Growth rates in minimal medium rather than LB medium will make it easier to validate the model, since the exact medium composition will be known. Determining the biomass composition will improve the model and make calculations involving biomass formation or depletion more accurate.

## 4. Computer-Aided Cell Factory Engineering

To introduce laccase into our model, we added a laccase-producing reaction that consumes the amino acids to our M9 gap-filled model. This can be seen in the [Yield Comparison](Yield_Comparison.ipynb) file. Our model shows a laccase yield of 0.00225 $\frac{\text{mmol laccase}}{\text{mmol glucose}}$ and a maximal laccase production of 0.015361 $mmol\ gDW^{-1} h^{-1}$ (see table 4). In order to compare these results to other microorganisms, we introduced the same reaction into *E. coli*. In *E. coli*, a laccase yield of 0.00224 $\frac{\text{mmol laccase}}{\text{mmol glucose}}$ and a maximal laccase production of 0.0224 $mmol\ gDW^{-1} h^{-1}$ were achieved. Interestingly, while the yields are comparable, the production rates differ by 0.007 $mmol\ gDW^{-1} h^{-1}$.
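One way to relate the two quantities is sketched below. It assumes that the yield was computed as the laccase production flux divided by the glucose uptake flux; this is an assumption about the notebook, which is not reproduced here. Under that assumption, the glucose uptake implied by each pair of reported values can be back-calculated.

```python
# Reported values: production in mmol gDW^-1 h^-1, yield in mmol laccase / mmol glucose.
reported = {
    "R. litoralis": {"production": 0.015361, "yield": 0.00225},
    "E. coli": {"production": 0.0224, "yield": 0.00224},
}


def implied_glucose_uptake(production: float, yield_: float) -> float:
    """Glucose uptake implied by yield = production / uptake (assumed definition)."""
    return production / yield_


uptake = {
    organism: implied_glucose_uptake(v["production"], v["yield"])
    for organism, v in reported.items()
}
for organism, value in uptake.items():
    print(f"{organism}: {value:.2f} mmol glucose gDW^-1 h^-1")
```

If the assumed definition holds, comparable yields combined with different production rates would point to different glucose uptake rates in the two models rather than to a difference in pathway efficiency.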

To find the reason for this difference in laccase production, we next compared the reactions of the two organisms to see where the fluxes differ between the models. The *E. coli* model has a total of 2714 reactions and our model has a total of 2141 reactions, but only about 1369 reactions are shared between the two models. Of these shared reactions, 256 carry flux in one model but not in the other. A list of these reactions can be found in [Appendix 2](#appendix_2). Since this is a large number, a detailed investigation was not possible within the timeframe of this work.
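The comparison described above can be sketched with plain dictionaries that map reaction IDs to fluxes. The function name `flux_mismatches`, the tolerance and the toy flux values are hypothetical; only the reaction IDs `PDH` and `CS` are taken from Appendix 2, and the other IDs are illustrative.

```python
def flux_mismatches(fluxes_a, fluxes_b, tol=1e-9):
    """Return shared reaction IDs that carry flux in exactly one of the two
    flux distributions (absolute flux above `tol` counts as carrying flux)."""
    shared = set(fluxes_a) & set(fluxes_b)
    return sorted(
        rxn
        for rxn in shared
        if (abs(fluxes_a[rxn]) > tol) != (abs(fluxes_b[rxn]) > tol)
    )


# Toy flux distributions, not model output.
fluxes_r_litoralis = {"PDH": 5.0, "CS": 0.0, "PGI": 3.0, "LACC": 0.01}
fluxes_e_coli = {"PDH": 0.0, "CS": 2.0, "PGI": 4.0, "EDA": 1.0}

mismatches = flux_mismatches(fluxes_r_litoralis, fluxes_e_coli)
print(mismatches)
```

Restricting the comparison to the intersection of reaction IDs mirrors the text: reactions that exist in only one model are excluded before the fluxes are compared.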

| | *R. litoralis* | *E. coli* |
| --- | --- | --- |
| **Max. laccase production** ($mmol\ gDW^{-1} h^{-1}$) | 0.0154 | 0.0224 |
| **Theoretical max. yield** (mmol laccase / mmol glucose) | 0.0023 | 0.0022 |

<div align="center">
Table 4: Comparison of maximal laccase production and yield between our <i>R. litoralis</i> model and the <i>E. coli</i> model.
</div>

## 5. Discussion

When analyzing the model with Memote, a total score of only 22% was achieved. Additionally, the low specificity of 38% in the validation confusion matrix suggests that the model needs further improvement. Due to the lack of flux data for *R. litoralis*, the next step to improve the model would be to generate data experimentally, as suggested in section 3.6 “Suggested experiments”. The data retrieved from those experiments should be incorporated into the model, and specific fluxes that are not already part of the model should be implemented.

The Memote report also pointed out that there are no cross references to other databases (no annotations), which should be changed in order to make the model more accessible to a broad range of scientists. 

As mentioned in section 3.4.2, our model lacks 12 exchange reactions compared to the carbon sources tested by Kalhoefer et al. The reason for the absence of those exchange reactions in our model needs further investigation. If *R. litoralis* does not grow on the corresponding carbon sources in the experiments, then our model is correct in not including these reactions. However, if *R. litoralis* does grow on these carbon sources, then those exchange reactions must be implemented in our model.

During the construction, improvement and validation of the model we assumed that there are no major differences between the B14 strain in the model and the *R. litoralis* strains in the papers we based the medium optimization and model validation on. We further assumed that the composition of the LB medium in the paper of Piekarski et al. and in the model are equal. Those assumptions could cause a deviation in our model. 

There is still a lot of room for improvements, but once further data on the fluxes are implemented, a successful application of the model will be possible. 

## 6. Conclusion

Our aim was to construct, improve and validate a GSM for *Roseobacter litoralis* and introduce a production pathway for laccase.

We were able to construct the GSM, but due to the lack of data on fluxes and morphology we were not able to improve it. Instead, we adjusted the medium of our model so that the predicted growth rate matches the experimental growth rate.

Fortunately, we found a paper that investigated the growth of *R. litoralis* on different carbon sources, which we could use to construct a confusion matrix and thus validate our model. It was not possible to validate our model in terms of fluxes, since no data could be found. The Memote analysis and the confusion matrix both suggest that further improvements are necessary before the model can be applied.

We successfully managed to implement the production of laccase in the model, but due to time limitations we were not able to further analyze and improve the laccase yield. 

## References

Borg, Y., Grigonyte, A. M., Boeing, P., Wolfenden, B., Smith, P., Beaufoy, W., ... & Nesbeth, D. N. (2016). Open source approaches to establishing Roseobacter clade bacteria as synthetic biology chassis for biogeoengineering. PeerJ, 4, e2031.

Brijwani, K., Rigdon, A., & Vadlani, P. V. (2010). Fungal Laccases: Production, Function, and Applications in Food Processing. Enzyme Research, 2010, 1-10. doi:10.4061/2010/149748

Brock, N. L., Menke, M., Klapschinski, T. A., & Dickschat, J. S. (2014). Marine bacteria from the Roseobacter clade produce sulfur volatiles via amino acid and dimethylsulfoniopropionate catabolism. Organic & biomolecular chemistry, 12(25), 4318-4323.

Chaurasia, P. K., Bharati, S. L., Sarma, C., et al. (2017). Laccases in Pharmaceutical Chemistry: A Comprehensive Appraisal. Mini-Reviews in Organic Chemistry, 13(6), 430-451. doi:10.2174/1570193x13666161019124854

Garje, A. N. (n.d.). Laccase Enzyme,Laccase Enzyme for Textile Processing,Laccase Biobleaching,Laccase Enzyme Product. Retrieved from https://www.fibre2fashion.com/industry-article/5480/green-revolution-in-textile-processing-by-using-laccases

Gevorgyan, A., Poolman, M. G., & Fell, D. A. (2008). Detection of stoichiometric inconsistencies in biomolecular models. Bioinformatics, 24(19), 2245-2251. https://doi.org/10.1093/bioinformatics/btn425

Kalhoefer, D., Thole, S., Voget, S., Lehmann, R., Liesegang, H., Wollher, A., et al. (2011). Comparative genome analysis and genome-guided physiological analysis of Roseobacter litoralis. Bmc Genomics, 12(1), 324. https://doi.org/10.1186/1471-2164-12-324

Laccase Market Size - Global Industry Analysis, Market Share, Growth, Trends, Top Countries Analysis & Top manufacturers, Segmentation and Forecast 2026. (2020, September 14). Retrieved from https://www.marketwatch.com/press-release/laccase-market-size---global-industry-analysis-market-share-growth-trends-top-countries-analysis-top-manufacturers-segmentation-and-forecast-2026-2020-09-14

Martens, T., Gram, L., Grossart, H. P., Kessler, D., Müller, R., Simon, M., ... & Brinkhoff, T. (2007). Bacteria of the Roseobacter clade show potential for secondary metabolite production. Microbial ecology, 54(1), 31-42.

Mayolo-Deloisa, K., González-González, M., & Rito-Palomares, M. (2020). Laccases in Food Industry: Bioprocessing, Potential Industrial and Biotechnological Applications. Frontiers in Bioengineering and Biotechnology, 8. doi:10.3389/fbioe.2020.00222

Piekarski, T., Buchholz, I., Drepper, T., Schobert, M., Wagner-Doebler, I., Tielen, P., & Jahn, D. (2009). Genetic tools for the investigation of Roseobacter clade bacteria. Bmc Microbiology, 9(1), 265. https://doi.org/10.1186/1471-2180-9-265

Santo, M., Weitsman, R., & Sivan, A. (2013). The role of the copper-binding enzyme – laccase – in the biodegradation of polyethylene by the actinomycete Rhodococcus ruber. International Biodeterioration & Biodegradation, 84, 204-210. doi:10.1016/j.ibiod.2012.03.001

Shiba, T. (1991). Roseobacter litoralis gen. nov., sp. nov., and Roseobacter denitrificans sp. nov., Aerobic Pink-Pigmented Bacteria which Contain Bacteriochlorophyll a. Systematic and Applied Microbiology, 14(2), 140-145. doi:10.1016/s0723-2020(11)80292-4 

Tang, K., Yang, Y., Lin, D., Li, S., Zhou, W., Han, Y., ... & Jiao, N. (2016). Genomic, physiologic, and proteomic insights into metabolic versatility in Roseobacter clade bacteria isolated from deep-sea water. Scientific reports, 6, 35528.

Wang, R., Gallant, É., & Seyedsayamdost, M. R. (2016). Investigation of the genetics and biochemistry of roseobacticide production in the Roseobacter clade bacterium Phaeobacter inhibens. MBio, 7(2).





## Appendix

<a id='appendix_1'></a>
### Appendix 1: List of unconserved metabolites
"h2_e","h2_c","h2_p","h_p","h_e","h_c"

<a id='appendix_2'></a>
### Appendix 2: List of reactions shared by *E. coli* and *R. litoralis* that carry flux in one model but not in the other

['3HAD100', '3HAD120', '3HAD140', '3HAD141', '3HAD160', '3HAD180', '3OAR100', '3OAR120', '3OAR121', '3OAR140', '3OAR141', 
 '3OAR160', '3OAR161', '3OAR180', '3OAS100', '3OAS121', '3OAS140', '3OAS141', '3OAS160', '3OAS161', '3OAS180', '4HTHRK', 
 'A5PISO', 'AACPS3', 'AACPS4', 'AACPS9', 'ACACT5r', 'ACACT6r', 'ACACT7r', 'ACCOAC', 'ACGK', 'ACGS', 'ACOAD1f', 'ACOAD5f',
 'ACOAD6f', 'ACOAD7f', 'ACOATA', 'ACODA', 'ACONTa', 'ACONTb', 'ACPPAT141', 'ACPPAT160', 'ADD', 'ADSK', 'AGPAT141', 'AGPAT160',
 'AGPAT161', 'AGPR', 'AKGDH', 'ALAR', 'AMMQLT8', 'AMPMS2', 'APG3PAT141', 'APG3PAT160', 'ARGSL', 'ARGSS', 'ASNS1', 'ASPO5', 
 'ATPM', 'ATPPRT', 'BPNT', 'BTS5', 'CBPS', 'CO2tex', 'CO2tpp', 'CPPPGO', 'CS', 'CTECOAI7', 'CTPS2', 'CYSS', 'CYSTL', 'DALAt2pp',
 'DAPDC', 'DASYN160', 'DASYN161', 'DHAD2', 'DHNAOT4', 'DHNCOAS', 'DHNCOAT', 'DHNPA2r', 'DHPPDA2', 'DHPS2', 'DRPA', 'DURADx', 
 'DURIPP', 'DUTPDP', 'E4PD', 'EAR120x', 'EAR121x', 'EAR140y', 'EAR141x', 'EAR160x', 'EAR161x', 'EAR180x', 'ECOAH5', 'ECOAH6',
 'ECOAH7', 'EDA', 'EDD', 'ETOHtex', 'ETOHtrpp', 'EX_acald_e', 'EX_ala__D_e', 'EX_arg__L_e', 'EX_etoh_e', 'EX_fe3_e', 'EX_gly_e',
 'EX_glyclt_e', 'EX_h2s_e', 'EX_his__L_e', 'EX_lys__L_e', 'EX_meoh_e', 'EX_met__L_e', 'EX_mobd_e', 'EX_nh4_e', 'EX_pheme_e', 
 'EX_thm_e', 'EX_thr__L_e', 'FACOAE100', 'FACOAE160', 'FACOAE161', 'FBA3', 'FE2tex', 'FLDR2', 'G3PAT161', 'G3PD2', 'G5SADs',
 'G5SD', 'GCALDD', 'GLCptspp', 'GLU5K', 'GLYCLTt2rpp', 'GLYCtpp', 'GPDDA4', 'GTHOr', 'GTPCII2', 'H2Otex', 'H2Otpp', 'HACD5', 
 'HACD6', 'HACD7', 'HBZOPT', 'HISTD', 'HISTP', 'HPPK2', 'HSDy', 'HSK', 'HSTPT', 'I2FE2SR', 'I2FE2SS2', 'I2FE2ST', 'I4FE4SR',
 'I4FE4ST', 'ICDHyr', 'ICYSDS', 'IG3PS', 'IGPDH', 'ILETA', 'IPDDI', 'IPDPS', 'K2L4Aabcpp', 'K2L4Aabctex', 'KAS14', 'KDOCT2',
 'KDOPP', 'KDOPS', 'Kt2pp', 'Ktex', 'LEUTAi', 'LPADSS', 'LPLIPAL2G141', 'MALS', 'MCTP1App', 'MECDPDH5', 'MEOHtex', 'MEOHtrpp',
 'METS', 'MG2tex', 'MI1PP', 'MMM', 'MNtex', 'MOAT', 'MOAT2', 'MOBDabcpp', 'MOBDtex', 'MPTG', 'MTHFR2', 'NADH16pp', 'NADH17pp',
 'NAtex', 'NH4tex', 'NH4tpp', 'NI2tex', 'NTD1', 'O2tpp', 'OHPBAT', 'OPHBDC', 'P5CR', 'PA141abcpp', 'PAPSR', 'PDH', 'PDX5PS', 
 'PE160abcpp', 'PE161abcpp', 'PERD', 'PG141abcpp', 'PGCD', 'PGMT', 'PHEMEabcpp', 'PHEMEtiex', 'PItex', 'PMPK', 'PPC', 'PPM2', 
 'PPNCL2', 'PRAMPC', 'PRATPP', 'PRMICI', 'PRPPS', 'PSD160', 'PSD161', 'PSERT', 'PSP_L', 'PSSA160', 'PSSA161', 'PTAr', 'RNDR4',
 'RNTR2c2', 'RNTR3c2', 'RNTR4c2', 'S2FE2ST', 'S4FE4ST', 'SADT2', 'SERAT', 'SHCHD2', 'SHCHF', 'SHSL1', 'SO4tex', 'SUCDi', 
 'T2DECAI', 'TDSK', 'THRS', 'THZPSN3', 'TMDS', 'TMK', 'TMPPP', 'TRDR', 'TRPS1', 'TRPS3', 'TYRL', 'U23GAAT', 'UAGAAT', 'UHGADA',
 'UPP3MT', 'USHD', 'Zn2tex']

['3HAD100',
 '3HAD120',
 '3HAD140',
 '3HAD141',
 '3HAD160',
 '3HAD180',
 '3OAR100',
 '3OAR120',
 '3OAR121',
 '3OAR140',
 '3OAR141',
 '3OAR160',
 '3OAR161',
 '3OAR180',
 '3OAS100',
 '3OAS121',
 '3OAS140',
 '3OAS141',
 '3OAS160',
 '3OAS161',
 '3OAS180',
 '4HTHRK',
 'A5PISO',
 'AACPS3',
 'AACPS4',
 'AACPS9',
 'ACACT5r',
 'ACACT6r',
 'ACACT7r',
 'ACCOAC',
 'ACGK',
 'ACGS',
 'ACOAD1f',
 'ACOAD5f',
 'ACOAD6f',
 'ACOAD7f',
 'ACOATA',
 'ACODA',
 'ACONTa',
 'ACONTb',
 'ACPPAT141',
 'ACPPAT160',
 'ADD',
 'ADSK',
 'AGPAT141',
 'AGPAT160',
 'AGPAT161',
 'AGPR',
 'AKGDH',
 'ALAR',
 'AMMQLT8',
 'AMPMS2',
 'APG3PAT141',
 'APG3PAT160',
 'ARGSL',
 'ARGSS',
 'ASNS1',
 'ASPO5',
 'ATPM',
 'ATPPRT',
 'BPNT',
 'BTS5',
 'CBPS',
 'CO2tex',
 'CO2tpp',
 'CPPPGO',
 'CS',
 'CTECOAI7',
 'CTPS2',
 'CYSS',
 'CYSTL',
 'DALAt2pp',
 'DAPDC',
 'DASYN160',
 'DASYN161',
 'DHAD2',
 'DHNAOT4',
 'DHNCOAS',
 'DHNCOAT',
 'DHNPA2r',
 'DHPPDA2',
 'DHPS2',
 'DRPA',
 'DURADx',
 'DURIPP',
 'DUTPDP',
 'E4PD',
 'EAR120x',
 'EAR121x',