In [2]:
from omsdetector import MofCollection
import warnings

## Creating a MOF Collection
### Loading from path names.

We can build a **MofCollection** from a list of paths to MOF CIF files. We can also specify the **analysis_folder** where the results of all analyses performed on this MOF collection will be stored. The default value is analysis_folder = 'analysis_folder'.

In [None]:
path_list = ['cif_files_example/HKUST-1_ASR_FIQCEN_clean.cif', 
             'cif_files_example/MgMOF-74_ASR_RAVVUH_clean.cif', 
             'cif_files_example/MOF-5_ASR_MIBQAR_clean.cif']

a_mof_collection = MofCollection(path_list = path_list, 
                                 analysis_folder="analysis_folder_example")

print("There are {} MOFs in the collection.".format(len(a_mof_collection)))

In [None]:
print(a_mof_collection)

### Loading CIFs from folder.

It is often convenient to load all the CIF files located in a specific folder.

We can do this by using the **from_folder()** method to create the MofCollection object.

In [None]:
a_mof_collection = MofCollection.from_folder(collection_folder="cif_files_example", 
                                             analysis_folder="analysis_folder_example")

print("There are {} MOFs in the collection.".format(len(a_mof_collection)))

### Loading selected CIFs from folder.

We might have a large number of CIFs in a folder but only be interested in examining a small subset of them. To do this, we can create a MofCollection from a folder as before, but this time provide the additional argument __name_list__, which is a list of the names of the MOFs we are interested in.

For example:

In [None]:
a_mof_collection_name_list = MofCollection.from_folder(collection_folder="cif_files_example", 
                                                       analysis_folder="analysis_folder_example",
                                                       name_list=['HKUST-1_ASR_FIQCEN_clean.cif', 
                                                                  'MgMOF-74_ASR_RAVVUH_clean.cif', 
                                                                  'MOF-5_ASR_MIBQAR_clean.cif'])

print("There are {} MOFs in the collection.".format(len(a_mof_collection_name_list)))
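
When the folder is large, writing the name list by hand gets tedious. The sketch below shows one way to build it from the file names on disk. The helper `cif_names_matching` is hypothetical (it is not part of omsdetector), and it assumes, as in the example above, that the names passed to __name_list__ include the `.cif` extension.

```python
from pathlib import Path


def cif_names_matching(folder, keyword):
    """Return the sorted names of .cif files in `folder` whose name contains `keyword`.

    Hypothetical helper for illustration; not part of omsdetector.
    """
    return sorted(p.name for p in Path(folder).glob("*.cif") if keyword in p.name)


# Possible usage (assumes the folder exists):
# name_list = cif_names_matching("cif_files_example", "MOF-74")
```

The resulting list could then be passed as `name_list` to `MofCollection.from_folder()`.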

## Filtering a Collection
Before analyzing the MOFs in the collection we created, we might want to filter it and keep only MOFs with certain characteristics, for example MOFs that contain certain metal atoms.

We can do this using the **filter_collection()** method on the collection object. The CIF files that match the filter will be included in the returned collection.

### Keep CIF files in original location

By default, the filtered collection still points to the CIF files in their original location. This is useful if we want to analyze a subset of MOFs without copying the files.

In [None]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"]})

### Define a new CIF folder when filtering
If the keyword **new_collection_folder** is set when filtering a collection, the CIF files will be copied to that folder and the paths of the CIF files in the collection will be updated to point to the new location.

In [None]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"]}, 
                                             new_collection_folder="Co_mofs")

### Copy filtered collection later

The third option is to create a filtered collection and explicitly copy the files at a later stage.

In [None]:
co_coll = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"], "non_metal_species":["C"]})
print(co_coll)

In [None]:
co_coll.copy_cifs(target_folder="Co_mofs")
print(co_coll)

The same operations can be performed for any results that might be present using the **new_analysis_folder** keyword or the **copy_results()** function.

## Analyze a MOF Collection

Once we have a MOF collection we can run the **analyse_mofs()** method on it, which will detect all the open metal sites (OMS) in the collection. 

In [None]:
a_mof_collection.analyse_mofs()

### How to overwrite results

If we try to re-run the analysis code it will by default only analyze MOFs for which no results can be found. This makes it easy to resume a calculation that ended prematurely.

In [None]:
a_mof_collection.analyse_mofs()

To change this behavior and force MOFs that already have results to be reanalyzed, we can set the keyword **overwrite** to True.

In [None]:
a_mof_collection.analyse_mofs(overwrite=True)
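
The resume-versus-overwrite logic described above can be pictured with a small standalone sketch. This is not the library's implementation: the function `mofs_to_analyze` and the one-result-file-per-MOF layout (`<name>.json`) are assumptions made purely for illustration.

```python
from pathlib import Path


def mofs_to_analyze(mof_names, results_folder, overwrite=False):
    """Return the MOF names that still need to be analyzed.

    Assumption (hypothetical layout): each analyzed MOF leaves a file
    named '<name>.json' in `results_folder`.
    """
    if overwrite:
        # Reanalyze everything, regardless of existing results.
        return list(mof_names)
    folder = Path(results_folder)
    # Keep only the MOFs for which no result file is found.
    return [name for name in mof_names if not (folder / f"{name}.json").exists()]
```

With `overwrite=False`, an interrupted run picks up only the structures that have no results yet; with `overwrite=True`, every structure is processed again.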

### Run analysis in parallel

Since every MOF can be analyzed independently, we can parallelize the analysis by splitting the structures into batches and running each batch as a separate process. The number of batches is set with the **num_batches** keyword, which defaults to 1. The structures are first ordered by the square of the number of atoms and then split into batches, so that all batches take roughly the same time to run and the analysis completes efficiently.

In [None]:
a_mof_collection.analyse_mofs(num_batches=3, overwrite=True)
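
To make the batching idea concrete, here is a minimal sketch of one way to balance batches by estimated cost. It orders structures by the square of their atom count, as described above, and then deals them out round-robin. The round-robin step, the function name `split_into_batches`, and the atom counts in the usage comment are assumptions for illustration; the library may split the ordered list differently.

```python
def split_into_batches(atom_counts, num_batches=1):
    """Distribute structure names over `num_batches` lists of similar total cost.

    `atom_counts` maps a structure name to its number of atoms. The cost of a
    structure is estimated as atoms**2 (an illustrative cost model).
    """
    # Order names from most to least expensive.
    ordered = sorted(atom_counts, key=lambda name: atom_counts[name] ** 2, reverse=True)
    # Deal the ordered names out round-robin so each batch gets a mix of
    # expensive and cheap structures.
    return [ordered[i::num_batches] for i in range(num_batches)]


# Possible usage with made-up atom counts:
# split_into_batches({"a": 10, "b": 40, "c": 20, "d": 30, "e": 50}, num_batches=2)
```

Dealing from a cost-sorted list keeps any single batch from collecting all the largest structures, which is the property the ordering step is meant to provide.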

## Summarizing Results

### Summary for each metal type

We can get a table that summarizes the findings for each metal type using the __summarize_results()__ function.

In [None]:
a_mof_collection.summarize_results()

### Summary for each MOF

We can obtain a DataFrame for the OMS for each MOF using the __mof_oms_df__ variable of the __MofCollection__ object.

In [None]:
mofs_df = a_mof_collection.mof_oms_df
print(mofs_df)

We can then use standard pandas operations to select MOFs, for example, with certain metals or all MOFs that have OMS.

In [None]:
print("MOFs that contain Cu")
print(mofs_df[mofs_df["Metal Types"].str.contains("Cu")])
print("\nMOFs that have OMS")
print(mofs_df[mofs_df["Has OMS"] == "Yes"])
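
The two selections can also be combined with a boolean mask. The sketch below uses a small toy DataFrame rather than real results: the rows and the `MOF` column name are made up for illustration, while the `Metal Types` and `Has OMS` columns follow the cell above.

```python
import pandas as pd

# Toy data for illustration only; these are not real analysis results.
toy_df = pd.DataFrame({
    "MOF": ["mof_a", "mof_b", "mof_c", "mof_d"],
    "Metal Types": ["Cu", "Mg", "Co,Cu", "Cu"],
    "Has OMS": ["Yes", "Yes", "No", "Yes"],
})

# Build one boolean mask per condition, then combine them with `&`.
has_cu = toy_df["Metal Types"].str.contains("Cu")
has_oms = toy_df["Has OMS"] == "Yes"
cu_with_oms = toy_df[has_cu & has_oms]

print(cu_with_oms)
```

Note that `str.contains` matches substrings, so a pattern such as "C" would also match "Cu" and "Co"; a more specific pattern may be needed for some metals.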

### Filter collection using results

Finally, we can use the filter function to isolate, for example, MOFs that contain Co and have open metal sites. We can also copy the CIF files and result files for this subset to new locations by providing values for the **new_collection_folder** and **new_analysis_folder** keywords.

In [None]:
co_oms = a_mof_collection.filter_collection(using_filter={"metal_species":["Co"], "has_oms":True},
                                            new_collection_folder='Co_oms',
                                            new_analysis_folder='Co_oms_analysis')

print(co_oms)