## Get metabolite concentrations in murine, E. coli and yeast cells

The data are available in Supplementary Table 5 of:
Park JO, Rubin SA, Xu YF, Amador-Noguez D, Fan J, Shlomi T, Rabinowitz JD. Metabolite concentrations, fluxes and free energies imply efficient enzyme usage. Nat Chem Biol. 2016 Jul;12(7):482-9. doi: 10.1038/nchembio.2077. Epub 2016 May 2. PMID: 27159581

Here, the published concentrations are simply mapped to the reference metabolites by KEGG ID.

In [1]:
from pathlib import Path
import pandas as pd
import numpy as np

In [2]:
p = Path.cwd() / 'concentrations_from_cells'

# Concentration data from publication
p_data = p / 'concentrations_from_cells_opark.csv' 

# KEGG ids for the reference metabolites were put together earlier in the study for iPATH visualisation
# To match the data in the paper, id of 3-PG was changed to C00197 (previously C00597 was used)
p_kegg = p / 'compound_kegg_mapping_3.csv'

out_path = p / 'revision_concentrations_in_cells.csv'

In [3]:
# Find concentration from paper data for reference metabolites
compounds = pd.read_csv(p_kegg)
# Raw string: the first column name contains a literal backslash
concentration = pd.read_csv(p_data)[[r'Metabolite[compartment] \ Concentration(M)',
                                     'KEGG ID',
                                     'Murine cell line (iBMK)',
                                     'E.coli',
                                     'S.cerevisiae']]

# Reference metabolites for which there is no data are dropped via 'inner' merge
merged_df = compounds.merge(concentration, left_on='kegg_id', right_on='KEGG ID', how='inner')

# Convert from mol/L (M) to micromolar and save
conc_cols = ['Murine cell line (iBMK)', 'E.coli', 'S.cerevisiae']
# .copy() makes result an independent DataFrame, so the in-place scaling
# below does not trigger pandas' SettingWithCopyWarning
result = merged_df[['name_short', 'hmdb_primary'] + conc_cols].copy()
result[conc_cols] *= 1_000_000
result.to_csv(out_path, index=False)
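
The inner merge drops reference metabolites that have no entry in the published table, without reporting which ones. The following is a minimal sketch of one way to list them, assuming the same key columns (`kegg_id` and `KEGG ID`). The two small tables are toy stand-ins with made-up concentration values, and `C99999` is a placeholder ID used only for illustration.

```python
import pandas as pd

# Toy stand-ins for the two input tables (concentrations are made up).
compounds = pd.DataFrame({
    "name_short": ["ATP", "3-PG", "Made-up metabolite"],
    "kegg_id": ["C00002", "C00197", "C99999"],  # C99999 is a placeholder
})
concentration = pd.DataFrame({
    "KEGG ID": ["C00002", "C00197"],
    "E.coli": [9.6e-3, 1.5e-3],  # mol/L, illustrative values only
})

# A left merge with indicator=True keeps every reference metabolite and
# records in the `_merge` column whether a match was found.
check = compounds.merge(
    concentration,
    left_on="kegg_id",
    right_on="KEGG ID",
    how="left",
    indicator=True,
)

# Rows marked 'left_only' are the ones an inner merge would drop.
dropped = check.loc[check["_merge"] == "left_only", "name_short"].tolist()

# Rows marked 'both' correspond to the inner-merge result; take an
# explicit copy before modifying it in place.
matched = check[check["_merge"] == "both"].copy()
matched["E.coli"] *= 1_000_000  # mol/L to micromolar

print("No concentration data for:", dropped)
```

Running the same check on the real `compounds` and `concentration` tables before saving would show how many reference metabolites are missing from the output file.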
