 
# <center> Stats on possibilities with Ingrahams dataset </center>

**Takes:** J. Ingrahams datasplit file from (1) which is based on the structural database CATH4.2 (2) and use this for a basis for splitting  data for this simulation dataset.
In addition, it extends the filtering of PDBs as most proteins in Ingraham's already prefiltered CATH4.2 dataset are not useable for simulations. The filtering of pdbs is based on the below filtering conditions. 
    
**Outputs**  a json `CATH4.2_INITIAL_STATUS.json` that shows the state of all proteins in the Ingraham data split and has the below format.
It is additionally updated in the `data/CATH4.2_ALL_STATUS.json` by the notebook `track_simulations.ipynb` with information of which proteins has successfully been simulated or not : 
    
```
# The dictionary in the json has the format: 
dictionary = {pdbID:  {
                'cath_nodes': str, 
                'size': int, # len of residues 
                'method':str #'NMR' or 'X-ray',
                'chains':int, 
                'split':str, # 'train', 'test' or 'validation'
                'prefilter status': str # 'removed' or 'kept',
                'prefilter reasons': str # 'N.A', 'resolution' 'membrane protein', ... etc,
                'simulation status':str #'N.A', 'finished' or 'unsuccessfull',
                'unsuccessfull reason':str #'>RMSD', 'NaN'...etc
               }}

```

    
The conditions used to filter out proteins are as follows: 

    - is a membrane protein
    - unsuitable experimental method  for solving the structure (unsuitable only in the sense of usefull for simulations)
    - low X-ray resolution 
    - to high RMSD (>3.5) between models in NMR models 
    - contains unknown amino acids
    - gaps/unresolved parts inside the chain of a protein (in terminals these are allowed)
    - unsuitable ligand(s) (unsuitable only in the sense of usefull for automation of simulations)
    - too big a protein (>500 residues)
    - any chain too small in the  protein (<50 residues). Meaning it is unlikely to process a (somewhat) 3D structure by itself

**OBS!** It should be stressed that neither of the filtering conditions descriped above necessarily say anything about the quality of the given structure, but are here used as a way to avoid issues when automating simulations. 
                                               

```
(1) 
@inproceedings{ingraham2019generative,
author = {Ingraham, J. and Garg, V. K. and Barzilay, R. and Jaakkola, T.},
title = {Generative Models for Graph-Based Protein Design},
journal = {Advances in Neural Information Processing Systems}
year = {2019}
}
``` 

``` 
(2) 
@article{CATH,
author = {Sillitoe, I. and Bordin, N. and Dawson, N. and Waman, V. P. and Ashford, P. and Scholes, H. M. and Pang, C. S. M. and Woodridge, L. and Rauer, C. and Sen, N. and Abbasian, M. and Le Cornu, S. and Lam, S. D. and Berka, K. and V., Ivana H. and Svobodova, R. and Lees, J. and Orengo, C. A.},
title = "{CATH: increased structural coverage of functional space}",
journal = {Nucleic Acids Research},
year = {2020},
doi = {10.1093/nar/gkaa1079}}
```

  

In [60]:
import sys,os,json,time, gzip, time
import numpy as np

# Bio packages
import mdtraj as md
import Bio.PDB.mmtf as BioMMTF
from Bio.PDB.Polypeptide import PPBuilder
import mmtf


# STATUS OF FILTERING

In [100]:
with open('CATH4.2_INITIAL_STATUS.json', 'r') as fp:
    init_data = json.load(fp)  

print('\n\nOVERVIEW:\n=============')    
print(f"\nTotal pdbs in Ingraham's original dataset:\n\t {len(init_data)}")

filtered_out = [pdb for pdb in init_data if init_data[pdb]['prefilter status']=='removed']
print(f"\nTotal number of pdbs  which are filtered out from simulation dataset:\n\t {len(filtered_out)} ")
simulation_suitable = [pdb for pdb in init_data if init_data[pdb]['prefilter status']=='kept']
print(f"\nTotal number of pdbs  which are deemed initially suitable to keep in simulation dataset:\n\t {len(simulation_suitable)} ")


# prefilter reasons 
print('\n\n\nREASONS for FILTERING OUT PDBS :\n================================')    

obsolete =  [pdb for pdb in init_data if 'obsolete' in init_data[pdb]['prefilter reasons']]
print(f"\tObsolete in pdb database : {len(obsolete)}")

membranes =  [pdb for pdb in init_data if 'membrane protein' in init_data[pdb]['prefilter reasons']]
print(f"\tMembrane protein: {len(membranes)}")

exp =  [pdb for pdb in init_data if 'unsuitable exp method' in init_data[pdb]['prefilter reasons']]
print(f"\tUnsuitable experimental method : {len(exp)}")

resolution =  [pdb for pdb in init_data if 'low resolution' in init_data[pdb]['prefilter reasons']]
print(f"\tResolution of X-ray structure to high: {len(resolution)}")

NMR =  [pdb for pdb in init_data if 'NMR rmsd to high' in init_data[pdb]['prefilter reasons']]
print(f"\tRMSD between NMR models >3.5 : {len(NMR)}")

aa =  [pdb for pdb in init_data if 'unknown_aa' in init_data[pdb]['prefilter reasons']]
print(f"\tContains unknown amino acid : {len(aa)}")

chain =  [pdb for pdb in init_data if 'one chain too short' in init_data[pdb]['prefilter reasons']]
print(f"\tAny one chain in structure >50 residues : {len(chain)}")

gap =  [pdb for pdb in init_data if 'gap in chain' in init_data[pdb]['prefilter reasons']]
print(f"\tGap in chain (unresolved parts in the middel of the structure) : {len(gap)}")

ligand =  [pdb for pdb in init_data if 'unsuitable ligand' in init_data[pdb]['prefilter reasons']]
print(f"\tUsuitable ligand for automatic simulation : {len(ligand)}")

big =  [pdb for pdb in init_data if 'too big' in init_data[pdb]['prefilter reasons']]
print(f"\tTotal number of residues >500 : {len(big)}")

small =  [pdb for pdb in init_data if 'too small' in init_data[pdb]['prefilter reasons']]
print(f"\tTotal number of residues <50 : {len(small)}")

error =  [pdb for pdb in init_data if 'error in reading pdb' in init_data[pdb]['prefilter reasons']]
print(f"\tError in reading pdb: {len(error)}")



OVERVIEW:

Total pdbs in Ingraham's original dataset:
	 18783

Total number of pdbs  which are filtered out from simulation dataset:
	 16588 

Total number of pdbs  which are deemed initially suitable to keep in simulation dataset:
	 2195 



REASONS for FILTERING OUT PDBS :
	Obsolete in pdb database : 7
	Membrane protein: 286
	Unsuitable experimental method : 4
	Resolution of X-ray structure to high: 2649
	RMSD between NMR models >3.5 : 1718
	Contains unknown amino acid : 130
	Any one chain in structure >50 residues : 990
	Gap in chain (unresolved parts in the middel of the structure) : 8435
	Usuitable ligand for automatic simulation : 7700
	Total number of residues >500 : 10164
	Total number of residues <50 : 95
	Error in reading pdb: 23


# Functions used to filter out pdbs

In [61]:

def load_membrane_protein_list():
    '''
    Read membrane protein database into list of pdb IDs.
    Using Stephen White's MPstruc database for finding membrane proteins: 
    https://blanco.biomol.uci.edu/mpstruc/?expandTblOnLoad=1
    load xml over all protein structures that are membrane proteins
    '''
    
    import xml.etree.ElementTree as ET
    mpstruc = '/home/trz846/MD_ML/CATH4.3/helper_files/membrane_protein_database_mpstruc.xml'
    tree = ET.parse(mpstruc)
    root = tree.getroot()
    # load pdb IDs of all  membrane proteins
    membrane_proteins = [pdb.text for pdb in root.iter('pdbCode')]
    
    
    return membrane_proteins



def membrane_protein(pdb_name):
    '''check if categorised as membrane protein in MPStruc'''
    
    pdb_name = pdb_name.upper() 
    if pdb_name in membrane_proteins:
        return True
    else:
        return False

def unsuitable_experimental_method(mmtf_record):
    '''check that method used to solve protein structure is allowed''' 
    # define allowed methods == good for MD simulations
    allowed_exp = ['ELECTRON MICROSCOPY', 
                    'SOLID-STATE NMR', 
                    'SOLUTION NMR',
                    'X-RAY DIFFRACTION',
                    'ELECTRON CRYSTALLOGRAPHY']

    # get list of methods used for solving structure
    used_exp = mmtf_record.experimental_methods
    
    # see if any of them is among suitable. 
    is_allowed = [exp for exp  in used_exp if exp in allowed_exp]

    if  is_allowed: return False
    else:  return True 


def low_resolution(mmtf_record):
    '''check if resolution lower than 2.5 AA if experimental method is x-ray crystallization'''
    
    try: # can only get resolution for some exp methods 
        resolution = mmtf_record.resolution 
        if resolution < 2.5: return False 
        elif resolution >2.5: return True
        else: return False
    
    except: # lazy as ... I know... 
        return False 


    


def polymeric(biopython_structure):
    count_chains = len(biopython_structure)
    return count_chains


def aa_selenocystein(mmtf_record):
    '''Test if selenocystein is in the protein'''
    protein_entities = [entity for entity in  mmtf_record.entity_list if entity['type'] == 'polymer']
    for chain in protein_entities:
        sequence = chain['sequence']
        seleno_cys_present = False
        if 'U' in sequence:
            seleno_cys_present = True
            break
    
    return seleno_cys_present 


def unknown_aa(mmtf_record):
    '''filters out proteins structures that contains amino acids that are not 
    uniqly determined, such as X,J,B etc
    '''
    
    known_aa = ['A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S'
                ,'T','Y','V','W', 'U'] # U = Selenocystein.
    
    bad_aa = False
    protein_entities = [entity for entity in  mmtf_record.entity_list if entity['type'] == 'polymer']
    for chain in protein_entities:
        sequence = chain['sequence']
        bad_aa = [aa for aa in sequence if aa not in known_aa]
        if len(bad_aa) > 0: 
            bad_aa = True
            break 
            
    if bad_aa: return True
    else: return False
    
    
    
def missing_aa_inside_protein(biopython_structure):
    '''check if any missing/unsolved residues in structure in the 
    middle of the sequence. Checks based on structure not on jumps
    in sequence (the latter can happen when lab cuts out part of seq)
    ''' 
    
    ppb = PPBuilder()
    for i, chain in enumerate(biopython_structure):
        pps = ppb.build_peptides(chain)
        if len(pps) > 1:
            # chain breaks in structure (not in sequence) if missing residues in pps => pps>1
            return True

        elif len(pps)==1:
            return False


    
def too_big(biopython_structure, max_len):
    '''checks if protein is larger than defined maximum lenght (in residues)
    '''
    
    size = get_size(biopython_structure)
    if size > max_len: return True
    else: return False
    
    
def too_small(biopython_structure, min_len):
    '''checks if protein is smaller than defined minimum lenght (in residues)
    and thus will not have any structure in inself. 
    '''
    
    size = get_size(biopython_structure)
    if size < min_len: return True
    else: return False

    
def get_size(biopython_structure):
    '''get total residue size of all chains in protein
    ''' 
    size = 0
    for chain in biopython_structure:
        size+=len(chain)
    return size


def polymeric_with_single_chain_too_short(biopython_structure):
    '''returns false if polymeric i.e. multiple chains 
    OBS! does not work for NMR - it reads each model as 1 chain 
    '''
    
    # if homo-polymer then each of the identical chains are put in 1 entity, i.e. 1 entity contains all homomers in pdb
    any_chain_too_short = False
    for chain in biopython_structure:
        if len(chain) < 50:
            any_chain_too_short = True
    return any_chain_too_short



def get_biopython_structure(mmtf_record):
    '''Uses biopython loader to get (first) model in pdb structure
    Needed for some processing.
    '''
    
    structure = BioMMTF.get_from_decoded(mmtf_record)
    first_model = structure.get_list()[0]
    return first_model


def unsuitable_ligands(mmtf_record):
    '''Check which ligands exist in pdb structure and assess whether 
    structure should be discarded based on it
    '''
    
    
    standard_crystallization_ligands = [
        'water', 'ACETATE ION', 'ACETIC ACID',"ADENOSINE-5'-DIPHOSPHATE",
        'ALPHA-D-MANNOSE','AMMONIUM ION','S-ADENOSYL-L-HOMOCYSTEINE',
        "ADENOSINE MONOPHOSPHATE","S-ADENOSYLMETHIONINE","BENZOIC ACID",
        'CHLORIDE ION', 'CITRIC ACID',"CITRATE ANION","BETA-CAROTENE",
        "COENZYME A","ACETYL COENZYME *A",'DI(HYDROXYETHYL)ETHER' ,
        "ETHANOL",'GLYCEROL', "ALPHA-D-GLUCOSE", "HEXAETHYLENE GLYCOL",
        "GLYCINE",'BETA-D-GLUCOSE',"2-(ACETYLAMINO)-2-DEOXY-A-D-GLUCOPYRANOSE",
        "BETA-D-GALACTOSE","GUANOSINE-5'-DIPHOSPHATE","GUANOSINE-5'-TRIPHOSPHATE",
        "GLUTAMIC ACID",'FLAVIN-ADENINE DINUCLEOTIDE','FLAVIN MONONUCLEOTIDE',
        'FORMIC ACID',"HEME",'ALPHA-L-FUCOSE','IMIDAZOLE',"ISOPROPYL ALCOHOL",
        'BETA-D-MANNOSE','BETA-MERCAPTOETHANOL','MALONATE ION', "MALONIC ACID",
        'D-MALATE','N-ACETYL-D-GLUCOSAMINE' ,'NICOTINAMIDE-ADENINE-DINUCLEOTIDE',
        "NADPH DIHYDRO-NICOTINAMIDE-ADENINE-DINUCLEOTIDE PHOSPHATE","NITRATE",
        'PENTAETHYLENE GLYCOL','PHOSPHATE ION',"PYRIDOXAL-5'-PHOSPHATE 161",
        "OXYGEN MOLECULE",'SULFATE ION',"SUCROSE","SUCCINIC ACID",
        'TETRAETHYLENE GLYCOL', 'TRIETHYLENE GLYCOL','1,2-ETHANEDIOL',
        '(4S)-2-METHYL-2,4-PENTANEDIOL' ,'2-AMINO-2-HYDROXYMETHYL-PROPANE-1,3-DIOL',
        '4-(2-HYDROXYETHYL)-1-PIPERAZINE ETHANESULFONIC ACID',
        '2-(N-MORPHOLINO)-ETHANESULFONIC ACID',"ADENOSINE-5'-TRIPHOSPHATE",
        "NADP NICOTINAMIDE-ADENINE-DINUCLEOTIDE PHOSPHATE"]
    
    not_allowed_metals = ['CALCIUM ION','COPPER (II) ION',  'FE (II) ION',
                          'FE (III) ION', "IODIDE ION",'MAGNESIUM ION',
                          'MANGANESE (II) ION','POTASSIUM ION', 'ZINC ION',]



    
    
    # define if ligand is unsuitable/not allowed
    unsuitable = False

    # get all ligands/non_proteins in pdb
    non_proteins = [entity for entity in  mmtf_record.entity_list if entity['type'] == 'non-polymer']

    # if len(non_proteins) > 0: # if = 0 => no other molecyles than protein
    for entity in non_proteins:
        ligand = entity['description']
        if ligand  in not_allowed_metals or ligand not in standard_crystallization_ligands:
            unsuitable = True

    return unsuitable  



def rmsd_of_NMR_models_to_high(pdb_name, method):
    '''takes all models of NMR model and calculated the RMSD between them 
    '''
    
    if 'SOLUTION NMR' in method or 'SOLID-STATE NMR' in method:
        try:
            traj = md.load(f'/home/trz846/protein_dynamics/data/pdbs_raw/{pdb_name}.pdb')
            distances = np.empty((traj.n_frames, traj.n_frames))
            for i in range(traj.n_frames):
                distances[i] = md.rmsd(traj, traj, i)
            max_rmsd = np.max(distances)
            if max_rmsd > 0.45:
                return True
            else: 
                return False 
        except:
            return True    
    else:
        return False

    
    
def rmsd_of_NMR_models(pdb_name, method):
    '''takes all models of NMR model and calculated the RMSD between them 
    '''
    
    if 'SOLUTION NMR' in method or 'SOLID-STATE NMR' in method:
        try:
            traj = md.load(f'/home/trz846/protein_dynamics/data/pdbs_raw/{pdb_name}.pdb')
            distances = np.empty((traj.n_frames, traj.n_frames))
            for i in range(traj.n_frames):
                distances[i] = md.rmsd(traj, traj, i)
            max_rmsd = np.max(distances)
            return float("{:.2f}".format(max_rmsd))
        except:
            return np.nan
    else:
        return np.nan

def get_resolution(mmtf_record, method  ):
    '''check if resolution lower than 2.0 AA if experimental method is x-ray
    crystallization
    '''
    
    if 'X-RAY DIFFRACTION' in method or 'ELECTRON MICROSCOPY' in method or 'ELECTRON CRYSTALLOGRAPHY' in method : 
        # can only get resolution for some exp methods 
        resolution = mmtf_record.resolution 
        return float("{:.2f}".format(resolution))
    else: 
        return np.nan
               
def add_prefiltered(pdb_name, all_data):
    '''used to create ALL_CATH json database '''
    all_data[pdb_name]['prefilter status'] = 'removed'
    all_data[pdb_name]['simulation status'] = 'N.A.'
    all_data[pdb_name]['simulation reason'] = 'N.A.'
    
    return all_data


def mmtf_fetch(pdb, cache_dir='/home/trz846/graph_transformers/data/ingraham_data/cath/mmtf/'):
    """ Retrieve mmtf record from PDB with local caching. 
    From J. Ingraham's code """
    
    mmtf_file = cache_dir + pdb + '.mmtf.gz'
    if not os.path.isfile(mmtf_file):
        url = 'http://mmtf.rcsb.org/v1.0/full/' + pdb + '.mmtf.gz'
        mmtf_file = download_cached(url, mmtf_file)

    if mmtf_file is not None:
        mmtf_record = mmtf.parse_gzip(mmtf_file)
    else:
        mmtf_record = None  # if pdb obsolete

    return mmtf_record

def download_cached(url, target_location):
    """ Download with caching. Modified From J. Ingraham's code """
#     target_dir = os.path.dirname(target_location)
#     if not os.path.isfile(target_location):
#         if not os.path.exists(target_dir):
#             os.makedirs(target_dir)

    # Use MMTF for speed
    try:
        response = urllib.request.urlopen(url)
        size = int(float(response.headers['Content-Length']) / 1e3)
        print('Downloading {}, {} KB'.format(target_location, size))
        with open(target_location, 'wb') as f:
            f.write(response.read())
    except:
        print('pdb is likely obsolete, cannot download:', url)
        target_location = None

    return target_location


# Create dictionary of filtered pdbs from Ingraham's datasplilt 

In [62]:
# Define filtering settings 
MAX_LENGTH = 500
MIN_LENGTH = 50 



# initiate data container
all_data = {}


# load list of membrane proteins
membrane_proteins = load_membrane_protein_list()


# get split into data class from cath types
cath_split = './Ingrahams_chain_set_splits.json'
with open(cath_split) as f:
    pdbs = json.load(f)
    
    
for ds_type in ['train', 'validation', 'test']:
    # get all pdbs in each dataset 
    split_pdbs_w_chains = pdbs[ds_type]
    
    # remove the chain specific split of data from Ingraham
    split_pdbs = [[pdb_chain[:-2],pdb_chain[-1]] for pdb_chain in split_pdbs_w_chains]
    
    
    # loop over all pdbs in given data split
    for idx, (pdb_name, chain) in enumerate(split_pdbs):
        
        all_data[pdb_name] = {'split':ds_type, 'prefilter reasons':[]}
        all_data[pdb_name]['cath_nodes'] = pdbs['cath_nodes'][pdb_name+'.'+chain]
        
        # assume not to filter out, will be changed if filtering conditions are met
        filter_out = False

        # load structure via mmtf parser 
        cache_dir = '/home/trz846/graph_transformers/data/ingraham_data/cath/mmtf/'
        mmtf_pdb = mmtf_fetch(pdb_name, cache_dir)
        try:
                    # skip if pdb has become obsolete in pdb database
            if  mmtf_pdb is None:
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['size'] = np.nan
                all_data[pdb_name]['chains'] = np.nan
                all_data[pdb_name]['method'] = 'N.A'
                all_data[pdb_name]['prefilter reasons'].append('obsolete')
                filter_out = True
                continue

            # load as biopython object as works better for polychains and missing aa's
            biopython_structure =  get_biopython_structure(mmtf_pdb)

            # add basics about pdb
            all_data[pdb_name]['size'] = get_size(biopython_structure)
            all_data[pdb_name]['chains'] = polymeric(biopython_structure)
            methods = mmtf_pdb.experimental_methods
            all_data[pdb_name]['method'] = methods
    #         all_data[pdb_name]['rmsd NMR'] = rmsd_of_NMR_models(pdb_name, methods)
    #         all_data[pdb_name]['resolution'] =  get_resolution(mmtf_pdb, methods )

            if membrane_protein(pdb_name):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('membrane protein')
                filter_out = True
            if unsuitable_experimental_method(mmtf_pdb):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('unsuitable exp method')
                filter_out = True
            if low_resolution(mmtf_pdb): 
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('low resolution')
                filter_out = True

            if rmsd_of_NMR_models_to_high(pdb_name, all_data[pdb_name]['method'] ):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('NMR rmsd to high')
                filter_out = True

            if unknown_aa(mmtf_pdb):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('unknown_aa')
                filter_out = True

            if polymeric_with_single_chain_too_short(biopython_structure):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('one chain too short')
                filter_out = True
            if missing_aa_inside_protein(biopython_structure):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('gap in chain')
                filter_out = True
            if unsuitable_ligands(mmtf_pdb):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('unsuitable ligand')
                filter_out = True
            if too_big(biopython_structure, max_len = MAX_LENGTH):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('too big')
                filter_out = True
            if too_small(biopython_structure, min_len = MIN_LENGTH):
                all_data = add_prefiltered(pdb_name, all_data)
                all_data[pdb_name]['prefilter reasons'].append('too small')
                filter_out = True

            all_data[pdb_name]['rmsd NMR'] = rmsd_of_NMR_models(pdb_name, methods)
            all_data[pdb_name]['resolution'] =  get_resolution(mmtf_pdb, methods )

            if not filter_out:
                all_data[pdb_name]['prefilter status'] = 'kept'
                all_data[pdb_name]['simulation status'] = '-'
                all_data[pdb_name]['unsuccessfull reason'] = '-'
                all_data[pdb_name]['size'] = get_size(biopython_structure)
                all_data[pdb_name]['chains'] = polymeric(biopython_structure)
                all_data[pdb_name]['method'] = mmtf_pdb.experimental_methods
        except:
            all_data = add_prefiltered(pdb_name, all_data)
            all_data[pdb_name]['prefilter reasons'] = 'error in reading pdb'
            all_data[pdb_name]['size'] = np.nan
            all_data[pdb_name]['chains'] = np.nan
            all_data[pdb_name]['method'] = 'N.A'
            print(f'\n ERROR for {pdb_name}')
       
    
    
    
with open('CATH4.2_INITIAL_STATUS.json', 'w') as fp:
    json.dump(all_data, fp)  


 ERROR for 5iz7

 ERROR for 5gka

 ERROR for 3j7y

 ERROR for 3jb9

 ERROR for 3jb9

 ERROR for 3jb9
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/5d95.mmtf.gz





 ERROR for 3jcu
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/2kea.mmtf.gz
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/5a0j.mmtf.gz





 ERROR for 3j26





 ERROR for 3jam





 ERROR for 2n1f

 ERROR for 3jb8





 ERROR for 5tj5

 ERROR for 5gae

 ERROR for 5gae

 ERROR for 5gae

 ERROR for 5gae

 ERROR for 5jzg

 ERROR for 3jci
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/3fsp.mmtf.gz

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/4otw.mmtf.gz

 ERROR for 3jc2

 ERROR for 3zeu
pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/4eln.mmtf.gz

 ERROR for 3jcu

 ERROR for 3jck

 ERROR for 5h1s

 ERROR for 5h1s

 ERROR for 5ll6

 ERROR for 3j27

 ERROR for 3j27

 ERROR for 5k0u

 ERROR for 3jam

 ERROR for 3jam




pdb is likely obsolete, cannot download: http://mmtf.rcsb.org/v1.0/full/3kwu.mmtf.gz






 ERROR for 3j7a

 ERROR for 3j7a





 ERROR for 3j4u

 ERROR for 3j7y

 ERROR for 2k9y





 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7y

 ERROR for 3j7a

 ERROR for 3jcu
