### You can run this notebook in Colab by clicking here:

<a target="_blank" href="https://colab.research.google.com/github/NMRLipids/databank-template/blob/museum/scripts/plotQuality.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Show ranking tables of simulations against experimental data

This notebook shows several rankings of simulations against experimental data.
It reads the ranking tables from the `Data/Ranking/` folder.
Before using it, run `python makeRanking.py` in the `Scripts/BuildDatabank` folder.

# Initialize NMRlipids databank

In [1]:
# Install the NMRlipids Databank in the Colab environment.
# The same commands work on a local machine.

import sys
import os

os.environ["NMLDB_ROOT_PATH"] = os.path.abspath(".." + os.sep + ".." + os.sep + "Databank")

if 'google.colab' in sys.modules:
    # Clone into a directory named Databank so that the paths below resolve
    !git clone https://github.com/NMRlipids/MuseumDatabank Databank
    %cd Databank
    !sed -i '/numpy/s/^/# /' Scripts/DatabankLib/requirements.txt
    !pip3 install .
    os.environ["NMLDB_ROOT_PATH"] = "/content/Databank"

In [2]:
# json is needed to read the ranking files
import json

# These three lines import the core Databank routines and the Databank API
from DatabankLib import *
from DatabankLib.core import *
from DatabankLib.databankLibrary import *
# This is for showing statistics tables
from DatabankLib.jpyroutines import showTable

# This initializes the databank and stores the information of all simulations into a list.
# Each list item contains the information from README.yaml file of the given simulation.
systems = initialize_databank()

Databank initialized from the folder: /big/comcon1/repo/Databank/Data/Simulations
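
Before loading individual tables, it can help to check which ranking files `makeRanking.py` actually produced. The sketch below lists them. The helper name `list_ranking_files` is hypothetical and not part of DatabankLib; it assumes only the `Data/Ranking` layout and the `*_Ranking.json` naming used in this notebook.

```python
import os
from pathlib import Path


def list_ranking_files(root):
    """Return the sorted names of ranking JSON files under <root>/Data/Ranking.

    Hypothetical helper, not part of DatabankLib.
    """
    ranking_dir = Path(root) / "Data" / "Ranking"
    if not ranking_dir.is_dir():
        # Nothing has been generated yet (or the root path is wrong)
        return []
    return sorted(p.name for p in ranking_dir.glob("*_Ranking.json"))


# NMLDB_ROOT_PATH is set in the first cell; fall back to the current directory
for name in list_ranking_files(os.environ.get("NMLDB_ROOT_PATH", ".")):
    print(name)
```

An empty listing usually means that `makeRanking.py` has not been run yet or that `NMLDB_ROOT_PATH` points elsewhere.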


# Show ranking of simulations based on total, tail, headgroup and form factor quality

In [3]:
Fragments = ['total','tails','headgroup','FormFactor']

for SortBasedOn in Fragments:
    print('Sorted based on ', SortBasedOn, ' quality')

    FFrankingPath = os.path.join(
        NMLDB_ROOT_PATH,
        'Data', 'Ranking',
        f'SYSTEM_{SortBasedOn}_Ranking.json')

    with open(FFrankingPath) as json_file:
        FFranking = json.load(json_file)

    showTable(FFranking, 'TotalQuality')

Sorted based on  total  quality


**FAILURE:** no FF defined for the system

'c04/cdb/c04cdb61143bb72ba2ddee11599e6bbfaf6cb211/a31f42800216d84d2242eae2af8ac2cbd6b033ee/'

**FAILURE:** no FF defined for the system

'a7f/9f3/a7f9f36feaa77791483dadb0157f86a6a519f2c3/a30f01e8771c44dc59b78753d1dd27ca67920e38/'

|   | headgroup | tails | total | Forcefield | Molecules | Number of molecules | Temperature | ID | FFQuality |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.7 | 0.87 | 0.81 | OPLS4 | POPC:SOL | (200:4000) | 300.0 | 803 |  |
| 1 | 0.7 | 0.87 | 0.81 | OPLS3e | POPC:SOL | (200:4000) | 300.0 | 815 |  |
| 2 | 0.73 | 0.84 | 0.81 | OPLS3e | POPC:SOL | (200:8859) | 300.0 | 805 | 0.15 |
| 3 | 0.73 | 0.84 | 0.8 | OPLS4 | POPC:SOL | (200:9000) | 300.0 | 814 | 0.15 |
| 4 | 0.7 | 0.85 | 0.8 | OPLS4 | POPC:SOL | (200:2400) | 300.0 | 804 |  |
| 5 | 0.71 | 0.8 | 0.77 | OPLS4 | POPC:SOL | (200:1500) | 300.0 | 802 |  |
| 6 | 0.75 | 0.78 | 0.77 | OPLS3e | POPC:SOL | (200:1000) | 300.0 | 796 |  |
| 7 | 0.66 | 0.74 | 0.72 | C36_Slipids_Hybrid | POPC:SOL | (128:6400) | 298.0 | 830 | 0.76 |
| 8 | 0.68 | 0.72 | 0.71 | OPLS4 | POPC:SOL | (200:1000) | 300.0 | 817 |  |
| 9 | 0.67 | 0.71 | 0.7 | MacRog | POPC:SOL | (1024:51200) | 298.15 | 658 | 0.65 |


Sorted based on  tails  quality


**FAILURE:** no FF defined for the system

'c04/cdb/c04cdb61143bb72ba2ddee11599e6bbfaf6cb211/a31f42800216d84d2242eae2af8ac2cbd6b033ee/'

**FAILURE:** no FF defined for the system

'a7f/9f3/a7f9f36feaa77791483dadb0157f86a6a519f2c3/a30f01e8771c44dc59b78753d1dd27ca67920e38/'

|   | headgroup | tails | total | Forcefield | Molecules | Number of molecules | Temperature | ID | FFQuality |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.7 | 0.87 | 0.81 | OPLS4 | POPC:SOL | (200:4000) | 300.0 | 803 |  |
| 1 | 0.7 | 0.87 | 0.81 | OPLS3e | POPC:SOL | (200:4000) | 300.0 | 815 |  |
| 2 | 0.7 | 0.85 | 0.8 | OPLS4 | POPC:SOL | (200:2400) | 300.0 | 804 |  |
| 3 | 0.73 | 0.84 | 0.81 | OPLS3e | POPC:SOL | (200:8859) | 300.0 | 805 | 0.15 |
| 4 | 0.73 | 0.84 | 0.8 | OPLS4 | POPC:SOL | (200:9000) | 300.0 | 814 | 0.15 |
| 5 | 0.71 | 0.8 | 0.77 | OPLS4 | POPC:SOL | (200:1500) | 300.0 | 802 |  |
| 6 | 0.75 | 0.78 | 0.77 | OPLS3e | POPC:SOL | (200:1000) | 300.0 | 796 |  |
| 7 | 0.01 | 0.78 | 0.52 | Slipids | POPE:SOL | (500:25000) | 310.0 | 414 | 0.1 |
| 8 | 0.66 | 0.74 | 0.72 | C36_Slipids_Hybrid | POPC:SOL | (128:6400) | 298.0 | 830 | 0.76 |
| 9 | 0.01 | 0.73 | 0.49 | Berger | POPC:SOL | (256:10342) | 300.0 | 115 | 0.94 |


Sorted based on  headgroup  quality


**FAILURE:** no FF defined for the system

'a7f/9f3/a7f9f36feaa77791483dadb0157f86a6a519f2c3/a30f01e8771c44dc59b78753d1dd27ca67920e38/'

**FAILURE:** no FF defined for the system

'c04/cdb/c04cdb61143bb72ba2ddee11599e6bbfaf6cb211/a31f42800216d84d2242eae2af8ac2cbd6b033ee/'

|   | headgroup | tails | total | Forcefield | Molecules | Number of molecules | Temperature | ID | FFQuality |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.78 | 0.55 | 0.63 | OPLS4 | DMPC:SOL | (200:2000) | 314.0 | 810 |  |
| 1 | 0.75 | 0.78 | 0.77 | OPLS3e | POPC:SOL | (200:1000) | 300.0 | 796 |  |
| 2 | 0.73 | 0.84 | 0.81 | OPLS3e | POPC:SOL | (200:8859) | 300.0 | 805 | 0.15 |
| 3 | 0.73 | 0.84 | 0.8 | OPLS4 | POPC:SOL | (200:9000) | 300.0 | 814 | 0.15 |
| 4 | 0.71 | 0.8 | 0.77 | OPLS4 | POPC:SOL | (200:1500) | 300.0 | 802 |  |
| 5 | 0.7 | 0.87 | 0.81 | OPLS4 | POPC:SOL | (200:4000) | 300.0 | 803 |  |
| 6 | 0.7 | 0.87 | 0.81 | OPLS3e | POPC:SOL | (200:4000) | 300.0 | 815 |  |
| 7 | 0.7 | 0.85 | 0.8 | OPLS4 | POPC:SOL | (200:2400) | 300.0 | 804 |  |
| 8 | 0.69 | 0.59 | 0.62 | OPLS4 | DMPC:SOL | (200:1000) | 314.0 | 811 |  |
| 9 | 0.69 | 0.64 | 0.66 | CHARMM36 | DMPC:SOL | (200:1000) | 314.0 | 800 |  |


Sorted based on  FormFactor  quality


**FAILURE:** no FF defined for the system

'c04/cdb/c04cdb61143bb72ba2ddee11599e6bbfaf6cb211/a31f42800216d84d2242eae2af8ac2cbd6b033ee/'

**FAILURE:** no FF defined for the system

'a7f/9f3/a7f9f36feaa77791483dadb0157f86a6a519f2c3/a30f01e8771c44dc59b78753d1dd27ca67920e38/'

|   | FFQuality | Forcefield | Molecules | Number of molecules | Temperature | ID | headgroup | tails | total |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.02 | Lipid14 | POPC:SOL | (72:2234) | 303.0 | 154 |  |  |  |
| 1 | 0.07 | ECC-lipids | POPC:SOL | (128:6400) | 313.0 | 360 |  |  |  |
| 2 | 0.07 | Lipid17 | POPC:SOL:CHOL | (64:4000:16) | 298.15 | 662 |  |  |  |
| 3 | 0.08 | Slipids | POPC:SOL | (122:4880) | 303.0 | 376 |  |  |  |
| 4 | 0.1 | Poger GROMOS 53A6_L | DPPC:SOL | (128:5841) | 323.0 | 593 |  |  |  |
| 5 | 0.1 | Poger GROMOS 53A6_L | DPPC:SOL | (128:5841) | 323.0 | 365 |  |  |  |
| 6 | 0.1 | Slipids | POPE:SOL | (500:25000) | 310.0 | 414 | 0.01 | 0.78 | 0.52 |
| 7 | 0.13 | ECC-lipids | POPC:SOL | (128:6400) | 313.0 | 103 |  |  |  |
| 8 | 0.14 | Lipid17 | POPC:SOL | (64:3200) | 298.15 | 715 | 0.49 | 0.73 | 0.65 |
| 9 | 0.15 | OPLS4 | POPC:SOL | (200:9000) | 300.0 | 814 | 0.73 | 0.84 | 0.8 |
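
The file names read in the loop above follow one pattern, `<subject>_<fragment>_Ranking.json`, where the subject is `SYSTEM` here and a lipid name such as `POPC` in the per-lipid rankings. A minimal sketch of a loader built on that pattern follows. `ranking_path` and `load_ranking` are hypothetical helper names, not DatabankLib functions, and returning `None` for a missing file is a design choice made for this example.

```python
import json
import os


def ranking_path(root, subject, fragment):
    """Build <root>/Data/Ranking/<subject>_<fragment>_Ranking.json."""
    return os.path.join(
        root, "Data", "Ranking", f"{subject}_{fragment}_Ranking.json")


def load_ranking(root, subject, fragment):
    """Return the parsed ranking list, or None if the file does not exist."""
    try:
        with open(ranking_path(root, subject, fragment)) as json_file:
            return json.load(json_file)
    except FileNotFoundError:
        # Not every subject/fragment combination has a ranking file
        return None
```

Catching only `FileNotFoundError` keeps malformed JSON or permission problems visible instead of hiding them behind a "not found" message.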


# Show ranking separately for each lipid based on total, sn-1, sn-2, or headgroup quality

In [4]:
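Fragments = ['total','sn-1','sn-2','headgroup']

for SortBasedOn in Fragments:
    for lipid in lipids_set:
        print('Quality of',SortBasedOn,' of ',lipid)
        # str(lipid) renders as 'Lipid(NAME)', which matches no ranking file,
        # so build the file name from the bare lipid name. The `name` attribute
        # is an assumption about the lipid objects; fall back to str(lipid).
        lipid_name = getattr(lipid, 'name', str(lipid))
        FFrankingPath = os.path.join(
            NMLDB_ROOT_PATH,
            'Data', 'Ranking',
            f'{lipid_name}_{SortBasedOn}_Ranking.json')

        try:
            with open(FFrankingPath) as json_file:
                FFranking = json.load(json_file)
        except FileNotFoundError:
            print('File not found')
            continue

        showTable(FFranking, lipid_name)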

Quality of total  of  Lipid(DRPC)
File not found
Quality of total  of  Lipid(DOPE)
File not found
Quality of total  of  Lipid(SM16)
File not found
Quality of total  of  Lipid(SLiPC)
File not found
Quality of total  of  Lipid(SDPE)
File not found
Quality of total  of  Lipid(SDG)
File not found
Quality of total  of  Lipid(DMPC)
File not found
Quality of total  of  Lipid(GB3)
File not found
Quality of total  of  Lipid(PAzePCdeprot)
File not found
Quality of total  of  Lipid(POPG)
File not found
Quality of total  of  Lipid(SAPI25)
File not found
Quality of total  of  Lipid(DOPS)
File not found
Quality of total  of  Lipid(SM18)
File not found
Quality of total  of  Lipid(DLIPC)
File not found
Quality of total  of  Lipid(DPPG)
File not found
Quality of total  of  Lipid(DDOPC)
File not found
Quality of total  of  Lipid(POPE)
File not found
Quality of total  of  Lipid(DPPC)
File not found
Quality of total  of  Lipid(POPC)
File not found
Quality of total  of  Lipid(DOPC)
File not found
[output truncated]

# Show form factor ranking only for systems with a specific lipid (cholesterol as an example)

In [5]:
# Show the ranking only for systems containing cholesterol

FFrankingPath = os.path.join(
    NMLDB_ROOT_PATH, 'Data', 'Ranking', 'SYSTEM_FormFactor_Ranking.json')
lipid = 'CHOL'

with open(FFrankingPath) as json_file:
    FFranking = json.load(json_file)

NewRank = []
for i in FFranking:
    if lipid in i['system']['COMPOSITION']:
        NewRank.append(i)
   
showTable(NewRank,'TotalQuality')

|   | FFQuality | Forcefield | Molecules | Number of molecules | Temperature | ID | headgroup | tails | total |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.07 | Lipid17 | POPC:SOL:CHOL | (64:4000:16) | 298.15 | 662 |  |  |  |
| 1 | 0.17 | Lipid17 | POPC:SOL:CHOL | (256:16000:64) | 298.15 | 666 |  |  |  |
| 2 | 0.17 | Lipid17 | POPC:SOL:CHOL | (1024:64000:256) | 298.15 | 667 |  |  |  |
| 3 | 0.23 | MacRog | POPC:SOL:CHOL | (1024:64000:256) | 298.15 | 677 |  |  |  |
| 4 | 0.23 | MacRog | POPC:SOL:CHOL | (256:16000:64) | 298.15 | 706 |  |  |  |
| 5 | 0.26 | MacRog | POPC:SOL:CHOL | (64:3600:8) | 298.15 | 665 |  |  |  |
| 6 | 0.26 | Lipid17 | POPC:SOL:CHOL | (64:3600:8) | 298.15 | 680 |  |  |  |
| 7 | 0.27 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (110:18:8481) | 298.0 | 589 | 0.02 | 0.4 | 0.27 |
| 8 | 0.33 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (84:44:6794) | 298.0 | 206 | 0.04 | 0.07 | 0.06 |
| 9 | 0.36 | MacRog | POPC:SOL:CHOL | (256:14400:32) | 298.15 | 695 |  |  |  |
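
The filtering loop above can be wrapped into a small reusable function. This is a sketch that assumes only what the cell above relies on, namely that each ranking entry exposes its molecules as keys of `entry['system']['COMPOSITION']`. The function name `filter_by_molecule` and the toy entries are hypothetical.

```python
def filter_by_molecule(ranking, molecule):
    """Keep entries whose composition contains `molecule`, preserving order."""
    return [entry for entry in ranking
            if molecule in entry['system']['COMPOSITION']]


# Toy entries that mimic only the keys used above (not real databank data)
toy_ranking = [
    {'system': {'COMPOSITION': {'POPC': {}, 'SOL': {}}}},
    {'system': {'COMPOSITION': {'POPC': {}, 'CHOL': {}, 'SOL': {}}}},
]
print(len(filter_by_molecule(toy_ranking, 'CHOL')))  # prints 1
```

Because the input order is preserved, the filtered list keeps the ranking order of the original file.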


# Show sn-1 ranking only for systems with POPC and cholesterol

In [6]:
# Show the ranking only for systems containing POPC and cholesterol

FFrankingPath = os.path.join(
    NMLDB_ROOT_PATH, 'Data', 'Ranking', 'POPC_sn-1_Ranking.json')
lipid = 'CHOL'

with open(FFrankingPath) as json_file:
    FFranking = json.load(json_file)

NewRank = []
for i in FFranking:
    if lipid in i['system']['COMPOSITION']:
        NewRank.append(i)
   
showTable(NewRank,'POPC')


|   | sn-1 | sn-2 | headgroup | total | Forcefield | Molecules | Number of molecules | Temperature | ID |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.82 | 0.5 | 0.03 | 0.45 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (120:8:7290) | 298 | 305 |
| 1 | 0.6 | 0.49 | 0.71 | 0.6 | slipids | CHOL:POPC:SOL | (256:256:20334) | 298 | 82 |
| 2 | 0.37 | 0.71 | 0.03 | 0.37 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (110:18:8481) | 298 | 589 |
| 3 | 0.17 | 0.24 | 0.07 | 0.16 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (64:64:10314) | 298 | 299 |
| 4 | 0.12 | 0.18 | 0.08 | 0.13 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (50:78:5782) | 298 | 15 |
| 5 | 0.08 | 0.26 | 0.1 | 0.15 | Berger and Modified Höltje model for cholesterol | POPC:CHOL:SOL | (84:44:6794) | 298 | 206 |
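
The `Molecules` and `Number of molecules` columns in the table above encode the composition as colon-separated strings. A short sketch of turning them into counts and a cholesterol mole fraction is given below. The helper names are hypothetical, and treating `SOL` as the only non-lipid component is an assumption that holds for the rows shown here but not necessarily for systems with ions.

```python
def parse_composition(molecules, counts):
    """Map 'POPC:CHOL:SOL' and '(110:18:8481)' to {'POPC': 110, ...}."""
    names = molecules.split(':')
    numbers = [int(n) for n in counts.strip('()').split(':')]
    if len(names) != len(numbers):
        raise ValueError('molecule names and counts differ in length')
    return dict(zip(names, numbers))


def mole_fraction(composition, molecule, exclude=('SOL',)):
    """Fraction of `molecule` among all components not listed in `exclude`."""
    total = sum(n for name, n in composition.items() if name not in exclude)
    return composition.get(molecule, 0) / total


# Values copied from the row with ID 589 in the table above
comp = parse_composition('POPC:CHOL:SOL', '(110:18:8481)')
print(round(mole_fraction(comp, 'CHOL'), 3))
```

Comparing these fractions across rows makes it easier to see how the sn-1 quality changes with cholesterol content for the same force field.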
