In [5]:
from omsdetector import MofCollection 
import warnings

## Creating a MOF Collection
### Loading from path names.

We can build a **MofCollection** from a list of paths to MOF CIF files. We can also specify the **analysis_folder** where the results of all analyses performed on this MOF collection will be stored. The default value is analysis_folder = 'analysis_folder'.

In [7]:
path_list = ['cif_files_example/JUTCUW_clean.cif', 
             'cif_files_example/PIYZAZ_clean.cif', 
             'cif_files_example/FIGXAU_clean.cif']

a_mof_collection = MofCollection(path_list = path_list, 
                                 analysis_folder="analysis_folder_example")

print("There are {} MOFs in the collection.".format(len(a_mof_collection)))

Loading CIF files...
100.0 %
All Done.
There are 3 MOFs in the collection.


In [8]:
print(a_mof_collection)

--------------------------------------------------
This collection holds information for 3 MOFs.
Analysis folder is: /run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/analysis_folder_example

List of cif files in collection:

cif_files_example/JUTCUW_clean.cif
cif_files_example/PIYZAZ_clean.cif
cif_files_example/FIGXAU_clean.cif
--------------------------------------------------
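
When the CIF files are numerous, the path list can be assembled programmatically instead of being typed by hand. The following is a minimal sketch using only the standard library. The helper name `collect_cif_paths` is hypothetical and not part of omsdetector, and the files it lists here are empty placeholders created in a temporary folder.

```python
import tempfile
from pathlib import Path


def collect_cif_paths(folder):
    """Return a sorted list of paths (as strings) to the .cif files in folder.

    Hypothetical helper, not part of omsdetector.
    """
    return sorted(str(p) for p in Path(folder).glob("*.cif"))


# Demonstrate on a throwaway folder holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("JUTCUW_clean.cif", "PIYZAZ_clean.cif", "notes.txt"):
        (Path(tmp) / name).touch()
    path_list = collect_cif_paths(tmp)
    print([Path(p).name for p in path_list])
```

A list built this way could then be passed as the `path_list` argument when creating the collection.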


### Loading CIFs from folder.

It is often convenient to load all the CIF files located in a specific folder. 

We can do this by using the **from_folder()** method to create the MofCollection object.

In [9]:
a_mof_collection = MofCollection.from_folder(collection_folder="cif_files_example", 
                                             analysis_folder="analysis_folder_example")

print("There are {} MOFs in the collection.".format(len(a_mof_collection)))

Loading CIF files...
100.0 %
All Done.
There are 50 MOFs in the collection.


### Loading selected CIFs from folder.

We might have a large number of CIFs in a folder but be interested in examining only a small subset of them. To accomplish this, we can create a MofCollection from a folder as before, this time providing the additional argument __name_list__, a list of the names of the MOFs we are interested in. 

For example:

In [10]:
a_mof_collection_name_list = MofCollection.from_folder(collection_folder="cif_files_example", 
                                                       analysis_folder="analysis_folder_example",
                                                       name_list=['JUTCUW_clean.cif', 
                                                                  'PIYZAZ_clean.cif', 
                                                                  'FIGXAU_clean.cif'])

print("There are {} MOFs in the collection.".format(len(a_mof_collection_name_list)))

--------------------------------------------------
Using only MOFs in the name list.
--------------------------------------------------
Loading CIF files...
100.0 %
All Done.
There are 3 MOFs in the collection.
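
Before passing a name list, it can help to check which of the requested names actually exist in the folder. The sketch below works on plain lists of file names. `select_by_name` is a hypothetical helper, not an omsdetector function, and `ABCDEF_clean.cif` is a made-up name used only to show a miss.

```python
def select_by_name(available, name_list):
    """Split name_list into names present in `available` and names that are missing."""
    available_set = set(available)
    found = [n for n in name_list if n in available_set]
    missing = [n for n in name_list if n not in available_set]
    return found, missing


# File names as they might be listed from the collection folder.
available = ["FIGXAU_clean.cif", "JUTCUW_clean.cif",
             "OGIYAF_clean.cif", "PIYZAZ_clean.cif"]
# "ABCDEF_clean.cif" is a made-up name that is not in the folder.
wanted = ["JUTCUW_clean.cif", "PIYZAZ_clean.cif",
          "FIGXAU_clean.cif", "ABCDEF_clean.cif"]

found, missing = select_by_name(available, wanted)
print("found:", found)
print("missing:", missing)
```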


## Filtering a Collection
Before analyzing the MOFs in the collection we created, we might want to filter the collection and keep only MOFs with certain characteristics, for example MOFs that contain certain metal atoms.

We can do this using the **filter_collection()** method on the collection object. The CIF files that match the filter are included in the new collection that the method returns. 

### Keep CIF files in original location

By default, the CIF paths in the filtered collection still point to the original location. This is useful if we want to analyze a subset of MOFs without copying the files.

In [11]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"]})

--------------------------------------------------

Validating property : "metal_species"
FIGXEY_clean 14.0 %                                                                                                       



Validated 100 %                                                                                                           
--------------------------------------------------
Filtering collection.

11 MOFs were matched using the provided filter.
Returning a new collection using the matched MOFs.
Loading CIF files...
100.0 %
All Done.
--------------------------------------------------
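
To picture how a dictionary such as `using_filter` can select MOFs, here is a small stand-alone sketch. The matching rule is an assumption made for illustration, not the library's confirmed behaviour: a list-valued entry requires every listed species to be present, and any other value must be equal. The records `mof_a`, `mof_b` and `mof_c` are invented placeholders.

```python
def matches(record, using_filter):
    """Return True if record satisfies every entry of using_filter.

    Assumed semantics (not confirmed by the library): a list-valued filter
    requires every listed species to be present in the record; any other
    value must be equal to the record's value.
    """
    for key, wanted in using_filter.items():
        value = record.get(key)
        if isinstance(wanted, list):
            if value is None or not set(wanted).issubset(value):
                return False
        elif value != wanted:
            return False
    return True


# Invented records standing in for the properties of three MOFs.
records = {
    "mof_a": {"metal_species": ["Co"], "non_metal_species": ["C", "H", "O"]},
    "mof_b": {"metal_species": ["Zn"], "non_metal_species": ["C", "H", "N"]},
    "mof_c": {"metal_species": ["Co", "Zn"], "non_metal_species": ["C", "O"]},
}

using_filter = {"metal_species": ["Co"], "non_metal_species": ["C"]}
matched = [name for name, rec in records.items() if matches(rec, using_filter)]
print(matched)
```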


### Define a new CIF folder when filtering
If the keyword **new_collection_folder** is set when filtering a collection, the CIF files will be copied to that folder and the CIF paths of the collection will be updated to point to the new location.

In [12]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"]}, 
                                             new_collection_folder="Co_mofs")

--------------------------------------------------

Validating property : "metal_species"
Validated 100 %                                                                                                           
--------------------------------------------------
Filtering collection.

11 MOFs were matched using the provided filter.
Returning a new collection using the matched MOFs.
Loading CIF files...
100.0 %
All Done.
--------------------------------------------------
--------------------------------------------------
The cif files for this collection will be copied to the specified folder:
"/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs"
The cif paths will be updated.
--------------------------------------------------


### Copy filtered collection later

The third option is to create a filtered collection and explicitly copy the files at a later stage.

In [13]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"], "non_metal_species":["C"]})
print(co_coll)

--------------------------------------------------

Validating properties : "metal_species, non_metal_species"
Validated 100 %                                                                                                           
--------------------------------------------------
Filtering collection.

11 MOFs were matched using the provided filter.
Returning a new collection using the matched MOFs.
Loading CIF files...
100.0 %
All Done.
--------------------------------------------------
--------------------------------------------------
This collection holds information for 11 MOFs.
Analysis folder is: /run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/analysis_folder_example

List of cif files in collection:

cif_files_example/OGIYAF_clean.cif
cif_files_example/QATPUX_clean.cif
cif_files_example/REGJIW01_clean.cif
cif_files_example/REGJIW02_clean.cif
cif_files_example/RENWEM01_clean.cif
cif_files_example/RENWEM_clean.cif
cif_files_example/YUCNEQ_clean.cif
cif_files

In [14]:
co_coll.copy_cifs(target_folder="Co_mofs")
print(co_coll)

--------------------------------------------------
The cif files for this collection will be copied to the specified folder:
"/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs"
The cif paths will be updated.
--------------------------------------------------
--------------------------------------------------
This collection holds information for 11 MOFs.
Analysis folder is: /run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/analysis_folder_example

List of cif files in collection:

/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs/OGIYAF_clean.cif
/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs/QATPUX_clean.cif
/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs/REGJIW01_clean.cif
/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs/REGJIW02_clean.cif
/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/Co_mofs/RENWEM01_clean.cif
/run/media/em

The same operations can be performed for any results that might be present, using the **new_analysis_folder** keyword or the **copy_results()** function.
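
The copy-then-repoint behaviour seen in the output above can be pictured with a short standard-library sketch. `copy_and_repoint` is a hypothetical helper, not the library's implementation, and the files are placeholders written to a temporary folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_and_repoint(paths, target_folder):
    """Copy each file into target_folder and return the new absolute paths."""
    target = Path(target_folder).resolve()
    target.mkdir(parents=True, exist_ok=True)
    new_paths = []
    for p in paths:
        destination = target / Path(p).name
        shutil.copy2(p, destination)
        new_paths.append(str(destination))
    return new_paths


with tempfile.TemporaryDirectory() as tmp:
    # Create a source folder with two placeholder files.
    source = Path(tmp) / "cif_files_example"
    source.mkdir()
    originals = []
    for name in ("OGIYAF_clean.cif", "QATPUX_clean.cif"):
        path = source / name
        path.write_text("placeholder, not a real CIF\n")
        originals.append(str(path))

    # Copy them and keep the updated paths.
    copied = copy_and_repoint(originals, Path(tmp) / "Co_mofs")
    all_exist = all(Path(p).exists() for p in copied)
    copied_names = [Path(p).name for p in copied]
    parent_name = Path(copied[0]).parent.name
    print(copied_names, "now live in", parent_name)
```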

## Analyze a MOF Collection

Once we have a MOF collection we can run the **analyse_mofs()** method on it, which will detect all the open metal sites (OMS) in the collection. 

In [15]:
a_mof_collection.analyse_mofs()

--------------------------------------------------
Running OMS Analysis...
--------------------------------------------------
1 batch requested. 
Overwrite is set to False. 
Storing results in analysis_folder_example/oms_results. 
--------------------------------------------------

Validating property : "load_balancing_index"
Validated 100 %                                                                                                           
--------------------------------------------------
Checking if results for any of the MOFs exist...
Will not skip any MOFs
--------------------------------------------------
Batch 1 has 50 MOFs
--------------------------------------------------
Batch 1 Finished.                                                                                                                             
Validating property : "has_oms"
Validated 100 %                                                                                                           

Analy

### How to overwrite results

If we try to re-run the analysis code it will by default only analyze MOFs for which no results can be found. This makes it easy to resume a calculation that ended prematurely.

In [16]:
a_mof_collection.analyse_mofs()

--------------------------------------------------
Running OMS Analysis...
--------------------------------------------------
1 batch requested. 
Overwrite is set to False. 
Storing results in analysis_folder_example/oms_results. 
--------------------------------------------------

Validating property : "load_balancing_index"
Validated 100 %                                                                                                           
--------------------------------------------------
Checking if results for any of the MOFs exist...
Skipping 50 MOFs because results were found. 
--------------------------------------------------
Batch 1 has 0 MOFs
--------------------------------------------------
Batch 1 Finished.                                                                                                    
Validating property : "has_oms"
Validated 100 %                                                                                                           

Analysis

To change this behavior and force the MOFs for which results already exist to be reanalyzed, we can set the keyword **overwrite** to True.

In [17]:
a_mof_collection.analyse_mofs(overwrite=True)

--------------------------------------------------
Running OMS Analysis...
--------------------------------------------------
1 batch requested. 
Overwrite is set to True. 
Storing results in analysis_folder_example/oms_results. 
--------------------------------------------------

Validating property : "load_balancing_index"
Validated 100 %                                                                                                           
--------------------------------------------------
--------------------------------------------------
Batch 1 has 50 MOFs
--------------------------------------------------
Batch 1 Finished.                                                                                                                             
Validating property : "has_oms"
Validated 100 %                                                                                                           

Analysis Finished. Time required:10.82 sec
-----------------------------------
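
The skip-or-overwrite decision can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it assumes a MOF counts as analysed when a file named `<name>.json` exists in the results folder, which is a hypothetical layout, and `mofs_to_analyse` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def mofs_to_analyse(names, results_folder, overwrite=False):
    """Return the names that still need analysis.

    Assumption: a MOF counts as analysed when <results_folder>/<name>.json
    exists. This file layout is hypothetical.
    """
    if overwrite:
        return list(names)
    folder = Path(results_folder)
    return [n for n in names if not (folder / f"{n}.json").exists()]


with tempfile.TemporaryDirectory() as tmp:
    # Pretend that one MOF already has results on disk.
    (Path(tmp) / "JUTCUW_clean.json").touch()
    names = ["JUTCUW_clean", "PIYZAZ_clean", "FIGXAU_clean"]

    pending = mofs_to_analyse(names, tmp)
    everything = mofs_to_analyse(names, tmp, overwrite=True)
    print("default:  ", pending)
    print("overwrite:", everything)
```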

### Run analysis in parallel

Since every MOF can be analyzed separately, we can parallelize the analysis by splitting the structures into batches and running each batch as a separate process. The number of batches is specified using the **num_batches** keyword, whose default value is 1. The structures are first ordered by the square of their number of atoms and then split into batches. This balances the batches so that they take roughly the same time to run, which keeps the total time of the analysis low.

In [18]:
a_mof_collection.analyse_mofs(num_batches=3, overwrite=True)

--------------------------------------------------
Running OMS Analysis...
--------------------------------------------------
3 batches requested. 
Overwrite is set to True. 
Storing results in analysis_folder_example/oms_results. 
--------------------------------------------------

Validating property : "load_balancing_index"
Validated 100 %                                                                                                           
--------------------------------------------------
--------------------------------------------------
Batch 1 has 25 MOFs
Batch 2 has 15 MOFs
Batch 3 has 10 MOFs
--------------------------------------------------
Batch 1 Finished.|**| Batch 2 Finished.|**| Batch 3 Finished.                                                                                                                                                                             
Validating property : "has_oms"
Validated 100 %                                                     
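
The batches above hold different numbers of MOFs (25, 15 and 10) because they are balanced by estimated cost rather than by count. The sketch below shows one way such a split could work. The cost model (number of atoms squared) follows the description above, but the greedy assignment to the lightest batch is an assumption and may differ from what the library does. The structure names and atom counts are invented.

```python
import heapq


def split_into_batches(atom_counts, num_batches):
    """Greedy split: assign each structure, largest first, to the lightest batch.

    atom_counts maps a structure name to its number of atoms. The cost model
    (atoms squared) follows the text; the greedy assignment is an assumption.
    """
    # Heap entries: (total cost so far, batch index, names in the batch).
    heap = [(0, i, []) for i in range(num_batches)]
    heapq.heapify(heap)
    ordered = sorted(atom_counts.items(), key=lambda kv: kv[1] ** 2, reverse=True)
    for name, n_atoms in ordered:
        load, idx, names = heapq.heappop(heap)  # lightest batch so far
        names.append(name)
        heapq.heappush(heap, (load + n_atoms ** 2, idx, names))
    # Report the batches in index order as (total cost, names).
    return [(load, names) for load, _, names in sorted(heap, key=lambda b: b[1])]


# Invented atom counts for six structures.
atom_counts = {"s1": 50, "s2": 40, "s3": 30, "s4": 30, "s5": 20, "s6": 10}
batches = split_into_batches(atom_counts, num_batches=3)
for number, (load, names) in enumerate(batches, start=1):
    print(f"Batch {number}: {len(names)} structures, cost {load}")
```

In this toy run the batch with the single largest structure holds one item while the others hold two and three, yet the estimated costs stay close to one another.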

## Summarizing Results

We can get a table that summarizes the findings using the __summarize_results()__ function.

In [19]:
a_mof_collection.summarize_results()


Validating property : "has_oms"
Validated 100 %                                                                                                           
--------------------------------------------------
Number of total MOFs: 50
Number of total MOFs with open metal sites: 37
Number of total unique sites: 74
Number of total unique open metal sites: 45
--------------------------------------------------
Summary Table

    MOFs  MOFs_with_OMS  Metal Sites  OMS MOFs_with_OMS(%)   OMS (%)
Co    11              6           18    9          54.55 %   50.00 %
Cd     9              4           13    4          44.44 %   30.77 %
Ni     5              5            6    6         100.00 %  100.00 %
Gd     3              2            3    2          66.67 %   66.67 %
Zn     3              2            4    3          66.67 %   75.00 %
Er     3              2            4    2          66.67 %   50.00 %
Cu     3              3            3    3         100.00 %  100.00 %
Fe     2              0   
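
The two percentage columns appear to be plain ratios of the count columns. As a check, the sketch below recomputes them for the first three rows shown above, assuming `MOFs_with_OMS(%)` is MOFs_with_OMS divided by MOFs and `OMS (%)` is OMS divided by Metal Sites. The helper `percentages` is hypothetical.

```python
# Counts copied from the summary table: (MOFs, MOFs_with_OMS, Metal Sites, OMS).
rows = {
    "Co": (11, 6, 18, 9),
    "Cd": (9, 4, 13, 4),
    "Ni": (5, 5, 6, 6),
}


def percentages(mofs, mofs_with_oms, sites, oms):
    """Return (share of MOFs with OMS, share of metal sites that are open), in %."""
    return (round(100 * mofs_with_oms / mofs, 2), round(100 * oms / sites, 2))


summary = {metal: percentages(*counts) for metal, counts in rows.items()}
for metal, (mof_pct, site_pct) in summary.items():
    print(f"{metal}: {mof_pct:.2f} % of MOFs, {site_pct:.2f} % of sites")
```

The recomputed values agree with the table: 54.55 % and 50.00 % for Co, 44.44 % and 30.77 % for Cd, and 100.00 % for both Ni columns.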

### Filter collection using results

Finally, we can use the filter function to isolate, for example, MOFs that have Co metal sites and that contain open metal sites. We can also copy the cif files and result files for this subset to new locations by providing values for the **new_collection_folder** and **new_analysis_folder** keywords.

In [20]:
co_oms = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"], "has_oms":True},
                                            new_collection_folder='co_oms',
                                            new_analysis_folder='co_oms_analysis')

print(co_oms)

--------------------------------------------------

Validating properties : "metal_species, has_oms"
Validated 100 %                                                                                                           
--------------------------------------------------
Filtering collection.

6 MOFs were matched using the provided filter.
Returning a new collection using the matched MOFs.
Loading CIF files...
100.0 %
All Done.
--------------------------------------------------
--------------------------------------------------
The cif files for this collection will be copied to the specified folder:
"/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/co_oms"
The cif paths will be updated.
--------------------------------------------------
--------------------------------------------------
The result files for this collection will be copied to the specified folder:
/run/media/emmhald/tbDrive/build_msi/open_metal_detector/examples/co_oms_analysis
The analysis folder wi