# IDP comparison tool

This notebook takes two IDP ensembles as input and applies the statistical tool that evaluates both their global and local (residue-specific) differences, together with an overall distance between the two ensembles. Please read the instructions below carefully to see how ensemble data can be provided as input. **The Python version must be set to 3.8**, and the modules listed at the top of each notebook (and in the readme file) must be installed.

In [None]:
# Load notebooks with required functions
from ipynb.fs.full.build_frames import *
from ipynb.fs.full.sample_independent_replicas import *
from ipynb.fs.full.multiframe_conversion import *
from ipynb.fs.full.wmatrix import *
from ipynb.fs.full.wvector import *
from ipynb.fs.full.graphical_representation import *

import os
import time
import sys
import shutil
import numpy as np  # np.save, np.min, np.arange and np.random.choice are used in comparison_tool
import warnings # Optional
#warnings.filterwarnings("ignore") # Optional

The function ``comparison_tool`` takes two IDP ensembles as arguments and performs the entire comparison analysis by chaining the appropriate functions. This tool works with three folders:
* ``ensemble_1_path``: the path to a folder where data for the first IDP ensemble is included.
* ``ensemble_2_path``: the path to a folder where data for the second IDP ensemble is included.
* ``results_path``: a path to an *empty* directory where results must be saved. If ``None``, it will be automatically created.

``ensemble_1_path`` and ``ensemble_2_path`` must be two *different* folders, each containing:
* One .xtc file per replica, together with a .pdb file with topology information **or**,
* One multiframe .pdb file per replica **or**,
* If ensembles are given as a list of .pdb files (one file per conformation), one **folder** per replica, each containing one .pdb file per conformation.
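As a rough illustration of the three layouts above, the sketch below counts what a folder contains, in the same spirit as the inventory ``comparison_tool`` prints before asking for confirmation. The helper name ``describe_ensemble_folder`` is hypothetical and is not part of the notebooks.

```python
import os

def describe_ensemble_folder(path):
    """Hypothetical helper: list .xtc files, .pdb files and subfolders found directly in `path`."""
    entries = sorted(os.listdir(path))
    return {
        "xtc_files": [e for e in entries if e.endswith(".xtc")],
        "pdb_files": [e for e in entries if e.endswith(".pdb")],
        "folders": [e for e in entries if os.path.isdir(os.path.join(path, e))],
    }
```

A folder that follows a single layout should report entries for only one of the accepted combinations, for example several .xtc files plus exactly one .pdb file.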

Although ``comparison_tool`` should be able to distinguish files when multiple extensions or formats are present in ``ensemble_1_path`` or ``ensemble_2_path``, we recommend choosing one of the accepted input layouts and placing *only* its corresponding data in the folders. Consequently, please do not set ``ensemble_1_path`` or ``ensemble_2_path`` to a folder where multiple or redundant ensembles are located; create a specific directory per ensemble instead.

The function will ask you for some information to make sure that it correctly understands the input you are giving. If just one replica is provided, it will also ask whether it should create a given number of independent replicas from that ensemble. Finally, it will ask how many CPUs it may use, and launch the computation afterwards.

During computation (which may take some time depending on the number of conformations and the protein length), the function will create and move folders within ``ensemble_1_path``, ``ensemble_2_path`` and ``results_path``. Some of them are temporary and will be deleted at the end of the computation but, if an error occurs or if you stop the computation, please delete all folders and files that you did not provide as input before rerunning. In the same way, if you wish to repeat the comparison or reuse one of the ensembles, please make sure that ``ensemble_1_path`` and ``ensemble_2_path`` do not contain any extra files, and choose a different ``results_path`` for the new computation (or delete the files inside the previous one if they are no longer needed). To make this easier, you can set ``results_path`` to ``None``, and a new folder carrying the names of both ensembles will be created in the same directory where ``ensemble_1_path`` is located. In summary, we kindly ask you to make sure that ``results_path`` is *empty* before running the function.
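The default location described above can be reproduced in a few lines. This is a minimal sketch: the helper names ``default_results_path`` and ``is_empty_dir`` are hypothetical and do not exist in the notebooks, and the ``results_<name1>_<name2>`` pattern is taken from the body of ``comparison_tool``.

```python
import os

def default_results_path(ensemble_1_path, ensemble_1_name, ensemble_2_name):
    """Hypothetical helper: folder used when results_path is None (sibling of ensemble_1_path)."""
    parent = os.path.abspath(os.path.join(ensemble_1_path, os.pardir))
    return os.path.join(parent, "_".join(["results", ensemble_1_name, ensemble_2_name]))

def is_empty_dir(path):
    """Hypothetical helper: True only if `path` is an existing directory with no entries."""
    return os.path.isdir(path) and len(os.listdir(path)) == 0
```

Checking ``is_empty_dir(results_path)`` before launching a new comparison is one way to avoid mixing outputs from different runs.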

The two remaining arguments that the function needs are:
* ``ensemble_1_name``: The name to be given to the first ensemble, which will appear in the final plot and also in all files involved in the computation. Therefore, please avoid spaces and unauthorized characters. Example: 'My_ensemble_1'.
* ``ensemble_2_name``: The name to be given to the second ensemble, with the same considerations.

In ``results_path`` you will find the output of the entire algorithm:
* A .pdf file containing the visual representation (in matrix form) of the computed global and local differences. If you wish to modify any of the plot parameters (e.g. font size), this can be done inside the ``wmatrix_plot`` function of the ``graphical_representation`` notebook.
* The folder named ``all_coordinates``, where the computed coordinates of each replica are stored in .hdf5 files.
* The two folders named ``intra_ensemble_wmatrices`` and ``intra_ensemble_wvectors``, where the matrices (resp. vectors) accounting for intra-ensemble global (resp. local) distances are stored in .npy files.
* The two folders named ``inter_ensemble_wmatrices`` and ``inter_ensemble_wvectors``, where the matrices (resp. vectors) accounting for inter-ensemble global (resp. local) distances are stored in .npy files.
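The .npy outputs can be read back with NumPy for further analysis. The sketch below is only an illustration: the helper name ``load_npy_folder`` is hypothetical, and it assumes that every .npy file in the chosen result folder holds a plain numeric array.

```python
import os
import numpy as np

def load_npy_folder(folder):
    """Hypothetical helper: return {file name: array} for every .npy file in `folder`."""
    return {
        name: np.load(os.path.join(folder, name))
        for name in sorted(os.listdir(folder))
        if name.endswith(".npy")
    }
```

Calling it on one of the result folders returns a dictionary keyed by file name, so that the arrays of different replica pairs can be compared or averaged.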

As you know, this is a **beta version** of the practical implementation of our IDP comparison tool, so we apologize in advance for any unexpected error or computational issue. If you encounter any, we will fix them quickly. Some examples are provided at the end of the notebook to illustrate how the computation can be run.

We really appreciate your help and time as beta tester, so many thanks and happy IDP comparison :)

In [None]:
def comparison_tool(ensemble_1_path, ensemble_1_name, ensemble_2_path, ensemble_2_name, results_path = None):
     
    if results_path == ensemble_1_path or results_path == ensemble_2_path:
        sys.exit("Please choose results_path different from ensemble_1_path and ensemble_2_path.")
    if ensemble_1_path == ensemble_2_path:
        sys.exit("Please place each ensemble in one different folder.")
    if results_path is None:
        # Default: a "results_<name1>_<name2>" folder next to ensemble_1_path.
        # The path is always assigned, even when the folder already exists.
        results_path = "/".join([os.path.abspath(os.path.join(ensemble_1_path, os.pardir)),"_".join(['results',ensemble_1_name,ensemble_2_name])])
        if not os.path.exists(results_path):
            os.mkdir(results_path)
     
    # Initial parameters
    var_dict = {'multiframe' : 'n', 'check_folder' : True, 'do_xtc_1' : False, 'N_rep_1' : 1, 'ignore_uncertainty_1' : False, 'do_pdb_1' : False,
                'do_xtc_2' : False,  'N_rep_2' : 1, 'ignore_uncertainty_2' : False, 'do_pdb_2' : False, 'N1' : 1, 'N2' : 1,
                'ensemble_1_name' : ensemble_1_name, 'ensemble_2_name' : ensemble_2_name, 'ensemble_1_path' : ensemble_1_path, 'ensemble_2_path' : ensemble_2_path}
    
    var_dict['xtc_files_1'] = [file for file in os.listdir(ensemble_1_path)  if file.endswith(".xtc")] 
    var_dict['pdb_files_1'] = [file for file in os.listdir(ensemble_1_path)  if file.endswith(".pdb")]
    var_dict['folders_1'] = [file for file in os.listdir(ensemble_1_path)  if os.path.isdir("/".join([ensemble_1_path,file]))]
    
    var_dict['xtc_files_2'] = [file for file in os.listdir(ensemble_2_path)  if file.endswith(".xtc")]
    var_dict['pdb_files_2'] = [file for file in os.listdir(ensemble_2_path)  if file.endswith(".pdb")]
    var_dict['folders_2'] = [file for file in os.listdir(ensemble_2_path)  if os.path.isdir("/".join([ensemble_2_path,file]))]
    
    print("\n----------------------------------------------------------------------------------\n")
    print(' \\(·_·)                                                                  \\(·_·)')
    print('   ) )z      Welcome to the beta version of this IDP comparison tool       ) )z')
    print("   / \\                                                                     / \\ \n")
    print("Before launching the computation, let me check I understood everything correctly...")
    print("\n----------------------------------------------------------------------------------\n")
    
    # File processing
   
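    for which_ens in ['1','2']:
        # Reset the detection flags so that answers given for the first ensemble do not carry over to the second
        var_dict['multiframe'] = 'n'
        var_dict['check_folder'] = True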
        
        print("".join(["For the ensemble named ",var_dict["_".join(['ensemble',which_ens,'name'])],', I found ',
                       str(len(var_dict["_".join(['xtc_files',which_ens])])),' .xtc file(s), ',str(len(var_dict["_".join(['pdb_files',which_ens])])),' .pdb file(s) and ',
                       str(len(var_dict["_".join(['folders',which_ens])])),' folder(s).']))
        
        if len(var_dict["_".join(['xtc_files',which_ens])]) + len(var_dict["_".join(['folders',which_ens])]) + len(var_dict["_".join(['pdb_files',which_ens])]) == 0:
            sys.exit("".join(['Folder for ', var_dict["_".join(['ensemble',which_ens,'name'])], ' ensemble is empty...']))
        
        # .xtc files with a .pdb topology file
    
        if len(var_dict["_".join(['xtc_files',which_ens])]) >= len(var_dict["_".join(['pdb_files',which_ens])]) and len(var_dict["_".join(['pdb_files',which_ens])]) == 1:
            print('\nShould I interpret this input as:\n')
            for j in range(len(var_dict["_".join(['xtc_files',which_ens])])):
                print("".join([str(var_dict["_".join(['xtc_files',which_ens])][j]),' : ',str(j+1),'-th independent replica of ',var_dict["_".join(['ensemble',which_ens,'name'])],',']))
            print("".join([str(var_dict["_".join(['pdb_files',which_ens])][0]),' : topology file for all ',var_dict["_".join(['ensemble',which_ens,'name'])],' replicas.']))
            ens_input = input('...? (y/n)')
            if ens_input == 'n':
                var_dict['multiframe'] = input("Should I ignore .xtc files and consider the .pdb file as a multiframe file? (y/n)")
            else:
                var_dict["_".join(['do_xtc',which_ens])] = True
                var_dict["_".join(['xtc_root_path',which_ens])] = var_dict["_".join(['ensemble',which_ens,'path'])]
                var_dict['check_folder'] = False
                
        # multiframe .pdb files
   
        if var_dict['multiframe'] == 'y' or (len(var_dict["_".join(['pdb_files',which_ens])]) >= 1 and len(var_dict["_".join(['xtc_files',which_ens])]) == 0):
            print('\nShould I interpret this input as:\n')
            for j in range(len(var_dict["_".join(['pdb_files',which_ens])])):
                if j < len(var_dict["_".join(['pdb_files',which_ens])])-1:
                    end = ','
                else:
                    end = '.'
                print("".join([str(var_dict["_".join(['pdb_files',which_ens])][j]),' : ',str(j+1),'-th independent replica of ',var_dict["_".join(['ensemble',which_ens,'name'])],end]))
            ens_input = input('...? (y/n)')
            if ens_input == 'y':
                print('Replicas have been given as multiframe .pdb files, which are not supported.')
                print("Converting files to .xtc + topology .pdb...\n ")
                if not os.path.exists("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'converted_files'])):
                    os.mkdir("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'converted_files']))
                for file_j in var_dict["_".join(['pdb_files',which_ens])]:
                    multiframe_pdb_to_xtc(pdb_file = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],file_j]), save_path = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'converted_files']), prot_name = var_dict["_".join(['ensemble',which_ens,'name'])])
                    print("".join(['Done for ',file_j]))
                var_dict["_".join(['do_xtc',which_ens])] = True
                var_dict["_".join(['xtc_root_path',which_ens])] = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'converted_files'])
                var_dict["_".join(['xtc_files',which_ens])] = [file for file in os.listdir(var_dict["_".join(['xtc_root_path',which_ens])]) if file.endswith(".xtc")]
                var_dict["_".join(['pdb_files',which_ens])] = [file for file in os.listdir(var_dict["_".join(['xtc_root_path',which_ens])]) if file.endswith(".pdb")]
                var_dict['check_folder'] = False
                
        # folder with .pdb files
     
        if len(var_dict["_".join(['folders',which_ens])]) >= 1 and var_dict['check_folder'] == True:
            print('\nShould I interpret this input as:\n')
            for j in range(len(var_dict["_".join(['folders',which_ens])])):
                if j < len(var_dict["_".join(['folders',which_ens])])-1:
                    end = ','
                else:
                    end = '.'
                print("".join([var_dict["_".join(['folders',which_ens])][j],' folder contains: ',str(j+1),'-th independent replica of ',var_dict["_".join(['ensemble',which_ens,'name'])],end]))
            ens_input = input('...? (y/n)')
            if ens_input == 'y':
                var_dict["_".join(['do_pdb',which_ens])] = True
    
        if var_dict["_".join(['do_pdb',which_ens])] == False and var_dict["_".join(['do_xtc',which_ens])] == False:
            sys.exit("".join(['\n Sorry, I did not understand the input. Please follow the guidelines described in the function documentation to create the ',var_dict["_".join(['ensemble',which_ens,'name'])],' folder.\n']))
            
        print("\n----------------------------------------------------------------------------------\n")
            
        # Sample independent replicas if needed (this will be done after building frames)
        
        if (len(var_dict["_".join(['xtc_files',which_ens])]) == 1 and var_dict["_".join(['do_xtc',which_ens])] == True) or (len(var_dict["_".join(['folders',which_ens])]) == 1 and var_dict["_".join(['do_pdb',which_ens])] == True):
            print("".join(['Only one replica is available for ensemble ',var_dict["_".join(['ensemble',which_ens,'name'])],'.']))
            print("It is possible to extract independent replicas by subsampling from the available one.")
            print("This may not be appropriate if the ensemble corresponds to an MD simulation.")
            subsampling = input("Should I extract independent replicas? (y/n)")
            if subsampling == 'y':
                N_rep = input('Ok. Please take into account the number of conformations and choose how many independent replicas I should extract: (integer)')
                if int(N_rep) <= 0:
                    sys.exit('The number of replicas to sample must be a positive integer.')
                var_dict["_".join(['N_rep',which_ens])] = int(N_rep)
                print("".join(["After computing reference systems, I will extract ",str(N_rep), ' independent replicas for ensemble ',var_dict["_".join(['ensemble',which_ens,'name'])],'.\n']))
                print("\n----------------------------------------------------------------------------------\n")
            else:
                var_dict["_".join(['ignore_uncertainty',which_ens])]  = True
                print("\n----------------------------------------------------------------------------------\n")
                
    
    n_cores = int(input("Everything seems OK! Please specify the number of CPUs you would like to use:"))
    print("\n----------------------------------------------------------------------------------\n")
    print("3..."); time.sleep(1); 
    print("2..."); time.sleep(1)
    print("1..."); time.sleep(1)
    print("Go!"); time.sleep(0.2)
    print("\n----------------------------------------------------------------------------------")
    
    # Build frames and save coordinates
    
    for which_ens in ['1','2']:
        
        print('\nBuilding reference frames for ' + var_dict["_".join(['ensemble',which_ens,'name'])] + '...\n')
        
        if not os.path.exists("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates'])):
            os.mkdir("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates']))
      
        if var_dict["_".join(['do_xtc',which_ens])] == True:
            
            for j in range(len(var_dict["_".join(['xtc_files',which_ens])])):
                
                if int(var_dict["_".join(['N_rep',which_ens])]) == 1:
                    pname = var_dict["_".join(['ensemble',which_ens,'name'])] + '_' + str(j)
                else:
                    pname = var_dict["_".join(['ensemble',which_ens,'name'])]
                
                print('\nComputing for ' + str(j+1) + '-th replica...\n'); time.sleep(0.5)
                
                define_frames(xtc_file = "/".join([var_dict["_".join(['xtc_root_path',which_ens])],var_dict["_".join(['xtc_files',which_ens])][j]]), top_file = "/".join([var_dict["_".join(['xtc_root_path',which_ens])],var_dict["_".join(['pdb_files',which_ens])][0]]),
                          pdb_folder = None, num_cores = n_cores, prot_name = pname, save_to =  "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates']),
                             name_variable = 'ipynb.fs.full.build_frames')
            
        if var_dict["_".join(['do_pdb',which_ens])] == True:
            
            for j in range(len(var_dict["_".join(['folders',which_ens])])):
                
                if int(var_dict["_".join(['N_rep',which_ens])]) == 1:
                    pname = var_dict["_".join(['ensemble',which_ens,'name'])] + '_' + str(j)
                else:
                    pname = var_dict["_".join(['ensemble',which_ens,'name'])]
                
                print('\n Computing for ' + str(j+1) + '-th replica...\n')
                define_frames(xtc_file = None, top_file = None,
                          pdb_folder = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],var_dict["_".join(['folders',which_ens])][j]]), num_cores = n_cores,
                              prot_name = pname, save_to =  "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates']),
                              name_variable = 'ipynb.fs.full.build_frames')
                
        # Sample independent replicas if needed
        
        if int(var_dict["_".join(['N_rep',which_ens])]) > 1:
            
            if not os.path.exists("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates_ind_replicas'])):
                os.mkdir("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates_ind_replicas']))
            
            sample_ind_replicas(prot_name = var_dict["_".join(['ensemble',which_ens,'name'])], coordinates_path = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates']), 
                                dihedrals_path = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates']),
                                save_to = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates_ind_replicas']),
                                N_replicas = int(var_dict["_".join(['N_rep',which_ens])]))
    
        # Collect computed frames
        
        if os.path.exists("/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates_ind_replicas'])):
            coor_path = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates_ind_replicas'])
            var_dict["_".join(['coor_path',which_ens])] = coor_path
        else: 
            coor_path = "/".join([var_dict["_".join(['ensemble',which_ens,'path'])],'coordinates'])
        
        var_dict["_".join(['list_global_replicas',which_ens])] = sorted([file for file in os.listdir(coor_path) if file.endswith('coordinates.hdf5')])
        var_dict["_".join(['list_local_replicas',which_ens])] = sorted([file for file in os.listdir(coor_path) if file.endswith('dihedrals.hdf5')])
        
        if len(var_dict["_".join(['list_global_replicas',which_ens])])!=len(var_dict["_".join(['list_local_replicas',which_ens])]):
            print('An error occurred during frames computation. The number of coordinates and dihedrals files must be the same.')
            print('Computation proceeds by taking the minimum number of replicas.')
            NR = min(len(var_dict["_".join(['list_global_replicas',which_ens])]),len(var_dict["_".join(['list_local_replicas',which_ens])]))
        else:
            NR = len(var_dict["_".join(['list_global_replicas',which_ens])])
        
        if NR == 1:
            var_dict["_".join(['ignore_uncertainty',which_ens])] = True
        
        var_dict['N' + which_ens] = int(NR)
        
        # Compute intra - ensemble differences
        
        if not os.path.exists("/".join([results_path,'intra_ensemble_wmatrices'])):
            os.mkdir("/".join([results_path,'intra_ensemble_wmatrices']))
        if not os.path.exists("/".join([results_path,'intra_ensemble_wvectors'])):
            os.mkdir("/".join([results_path,'intra_ensemble_wvectors']))
        if not os.path.exists("/".join([results_path,'all_coordinates'])):
            os.mkdir("/".join([results_path,'all_coordinates']))
        
        if var_dict["_".join(['ignore_uncertainty',which_ens])] == False:
                
            for j in range(1,NR):
                
                wmat = w_matrix(prot_1 = var_dict["_".join(['list_global_replicas',which_ens])][0].split('_coordinates.hdf5')[0], prot_2 = var_dict["_".join(['list_global_replicas',which_ens])][j].split('_coordinates.hdf5')[0] , N_centers = 2000, N_cores = n_cores, data_path = coor_path, name_variable = 'ipynb.fs.full.wmatrix')
                os.chdir("/".join([results_path,'intra_ensemble_wmatrices']))
                np.save("_".join([var_dict["_".join(['ensemble',which_ens,'name'])],'0',str(j),'wmatrix.npy']), wmat)
            
                wvec = w_vector(prot_1 = var_dict["_".join(['list_local_replicas',which_ens])][0].split('_dihedrals.hdf5')[0], prot_2 = var_dict["_".join(['list_local_replicas',which_ens])][j].split('_dihedrals.hdf5')[0] , N_centers = 2000, N_cores = n_cores, data_path = coor_path, name_variable = 'ipynb.fs.full.wvector')
                os.chdir("/".join([results_path,'intra_ensemble_wvectors']))
                np.save("_".join([var_dict["_".join(['ensemble',which_ens,'name'])],'0',str(j),'wvector.npy']), wvec)
            
        for file in os.listdir(coor_path):
            shutil.move("/".join([coor_path,file]),"/".join([results_path,'all_coordinates']))
        os.rmdir(coor_path)            
    
    # Compute inter - ensemble differences
    
    coor_path = "/".join([results_path,'all_coordinates'])
    
    m = np.min([var_dict['N1'], var_dict['N2']])
    a = np.arange(var_dict['N1']); b = np.arange(var_dict['N2'])
    # Pair replicas one-to-one first; surplus replicas of the larger ensemble are paired at random below
    pairs = [(i,i) for i in range(m)]

    if len(a) > len(b):
        for k in range(m,len(a)):
            l = [(a[k],j) for j in b]
            pairs.append(l[int(np.random.choice(np.arange(len(l)), 1)[0])])
    
    if len(a) < len(b):
        for k in range(m,len(b)):
            l = [(j, b[k]) for j in a]
            pairs.append(l[int(np.random.choice(np.arange(len(l)), 1)[0])])
    
    if not os.path.exists("/".join([results_path,'inter_ensemble_wmatrices'])):
        os.mkdir("/".join([results_path,'inter_ensemble_wmatrices']))
    
    if not os.path.exists("/".join([results_path,'inter_ensemble_wvectors'])):
        os.mkdir("/".join([results_path,'inter_ensemble_wvectors']))
    
    for j in range(len(pairs)):
        
        wmat = w_matrix(prot_1 = var_dict['list_global_replicas_1'][pairs[j][0]].split('_coordinates.hdf5')[0], prot_2 = var_dict['list_global_replicas_2'][pairs[j][1]].split('_coordinates.hdf5')[0] , N_centers = 2000, N_cores = n_cores, data_path = coor_path, name_variable = 'ipynb.fs.full.wmatrix')
        os.chdir("/".join([results_path,'inter_ensemble_wmatrices']))
        np.save("_".join([ensemble_1_name,ensemble_2_name,str(j),'wmatrix.npy']), wmat)
            
        wvec = w_vector(prot_1 = var_dict['list_local_replicas_1'][pairs[j][0]].split('_dihedrals.hdf5')[0], prot_2 = var_dict['list_local_replicas_2'][pairs[j][1]].split('_dihedrals.hdf5')[0] , N_centers = 2000, N_cores = n_cores, data_path = coor_path, name_variable = 'ipynb.fs.full.wvector')
        os.chdir("/".join([results_path,'inter_ensemble_wvectors']))
        np.save("_".join([ensemble_1_name,ensemble_2_name,str(j),'wvector.npy']), wvec)
    
    # Remove the temporary 'coordinates' folders left inside the input directories
    if os.path.exists("/".join([var_dict['ensemble_1_path'],'coordinates'])):
        shutil.rmtree("/".join([var_dict['ensemble_1_path'],'coordinates']))
    if os.path.exists("/".join([var_dict['ensemble_2_path'],'coordinates'])):
        shutil.rmtree("/".join([var_dict['ensemble_2_path'],'coordinates']))
    
    print("\n----------------------------------------------------------------------------------\n")
    print("Computation done! Here is the result, which has been saved as pdf:")
    print("\n----------------------------------------------------------------------------------\n")
    
    # Print the results
    
    if len(os.listdir("/".join([results_path,'intra_ensemble_wmatrices']))) == 0:
        ind_mat = None
    else:
        ind_mat = "/".join([results_path,'intra_ensemble_wmatrices'])
    
    if len(os.listdir("/".join([results_path,'intra_ensemble_wvectors']))) == 0:
        ind_vec = None
    else:
        ind_vec = "/".join([results_path,'intra_ensemble_wvectors'])
    
    wmatrix_plot(prot_name_1 = ensemble_1_name, prot_name_2 = ensemble_2_name,
             wmatrix_path = "/".join([results_path,'inter_ensemble_wmatrices']), 
             wmatrix_ind_folder = ind_mat,
             wvector_path = "/".join([results_path,'inter_ensemble_wvectors']),
             wvector_ind_folder = ind_vec,
             save_path = results_path)

## Executing the function
#### The ensemble is given as a folder per replica containing one .pdb file per conformation

In [None]:
path_to_notebook = '/home/jgonzale/IDP_analysis'

histatin_filtered_path = "/".join([path_to_notebook,'Examples','histatin_filtered'])
histatin_pool_path = "/".join([path_to_notebook,'Examples','histatin_pool'])

comparison_tool(ensemble_1_path = histatin_filtered_path,
                ensemble_1_name = 'histatin_filtered', 
                ensemble_2_path = histatin_pool_path,
                ensemble_2_name = 'histatin_pool', 
                results_path = None)

#### The ensemble is given as one .xtc file per replica with one .pdb file containing topology information

In [None]:
path_to_notebook = '/home/jgonzale/IDP_analysis'

a7_c36idp_path = "/".join([path_to_notebook,'Examples','a7_c36idp'])
a7_c36m_path = "/".join([path_to_notebook,'Examples','a7_c36m'])

comparison_tool(ensemble_1_path = a7_c36idp_path,
                ensemble_1_name = 'a7_c36idp', 
                ensemble_2_path = a7_c36m_path,
                ensemble_2_name = 'a7_c36m', 
                results_path = None)

#### The ensemble is given as a multiframe .pdb file per replica

In [None]:
path_to_notebook = '/home/jgonzale/IDP_analysis'

tau_1_path = "/".join([path_to_notebook,'Examples','CHARMM36m*-tip3p'])
tau_2_path = "/".join([path_to_notebook,'Examples','CHARMM36m-tip3p'])

comparison_tool(ensemble_1_path = tau_1_path,
                ensemble_1_name = 'CHARMM36m*-tip3p', 
                ensemble_2_path = tau_2_path,
                ensemble_2_name = 'CHARMM36m-tip3p', 
                results_path = None)

In [None]:
comparison_tool(ensemble_1_path = '/home/jgonzale/Documents/test_jupy/ens_1',
                ensemble_1_name = 'Ensemble_1', 
                ensemble_2_path = '/home/jgonzale/Documents/test_jupy/ens_2',
                ensemble_2_name = 'Ensemble_2', 
                results_path = None)