# Tutorial 2: Phase search
Dara is equipped with a tree-search-based algorithm to predict the phases present in a given pattern. 

In this tutorial, we will identify the phases in a sample from a pairwise reaction between `GeO2` and `ZnO`.

In [1]:
from dara import search_phases
from pathlib import Path

In [2]:
pattern_path = "tutorial_data/GeO2-ZnO_700C_60min.xrdml"

# three elements are present in the sample
chemical_system = "Zn-Ge-O" 

## Step 1: Prepare reference phases

Dara pre-builds an index of all the unique, low-energy phases in the ICSD and COD databases. It can also download CIF structures from the COD data server, so there is no need to obtain an offline copy of the database.

Before every search, we will need to gather all the reference phases in the chemical system for the search algorithm. Dara provides `ICSDDatabase` and `CODDatabase` to do the filtering.

In this example, we will use `CODDatabase` to download all the phases in the `Zn-Ge-O` chemical system.

In [3]:
from dara.structure_db import CODDatabase

# The COD database contains methods to filter phases in the chemical system
cod_database = CODDatabase()

# gather reference phases and save them to a directory called "cifs"
all_cod_ids = cod_database.get_cifs_by_chemsys(chemical_system, dest_dir="cifs")



Downloading CIFs from COD...:   0%|          | 0/43 [00:00<?, ?it/s]

2024-06-26 00:11:57,630 INFO dara.structure_db Saving downloaded CIFs to dara_downloaded_cifs
Skipping high-energy phase: 1528389 (Ge, 96): e_hull = 0.1494
Skipping high-energy phase: 9013109 (Ge, 64): e_hull = 0.3137
2024-06-26 00:11:57,643 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,643 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,644 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,644 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,644 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,644 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,645 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,645 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,645 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,645 INFO dara.structure_db Skipping common gas: O2
2024-06-26 00:11:57,646 INFO dara.structure_db Skipping common gas: O2
2
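
The log above lists elemental Ge and O2 entries, which suggests that a query for `Zn-Ge-O` also covers its sub-systems. The following is a minimal sketch of that kind of membership test in plain Python. It is not Dara's implementation: `elements_in` and `in_chemical_system` are hypothetical helpers, and the candidate list is made up for illustration.

```python
import re


def elements_in(formula: str) -> set[str]:
    """Return the element symbols that appear in a simple formula such as 'Zn2GeO4'."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def in_chemical_system(formula: str, chemical_system: str) -> bool:
    """True if every element of `formula` belongs to a system such as 'Zn-Ge-O'."""
    allowed = set(chemical_system.split("-"))
    return elements_in(formula) <= allowed


# Illustrative candidates only; a real database query returns structure entries.
candidates = ["ZnO", "GeO2", "Zn2GeO4", "Ge", "ZnS", "SiO2"]
kept = [f for f in candidates if in_chemical_system(f, "Zn-Ge-O")]
print(kept)
```

Under this assumption, the order of the elements in the string does not matter, so `Ge-Zn-O` and `Zn-Ge-O` describe the same system.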

## Step 2: Search for phases

After preparing the reference phases, we can start the phase search. The algorithm needs two inputs: the reference phases and the measured pattern.

In [15]:
# gather all the phases in the "cifs" directory
all_cifs = list(Path("cifs").glob("*.cif"))

search_results = search_phases(
    pattern_path=pattern_path,
    phases=all_cifs,
    wavelength="Cu",
    instrument_name="Aeris-fds-Pixcel1d-Medipix3",
)

2024-06-26 00:14:42,671 INFO dara.search.tree Detecting peaks in the pattern.
2024-06-26 00:15:01,929 INFO dara.search.tree The wmax is automatically adjusted to 57.88.
2024-06-26 00:15:01,931 INFO dara.search.tree The intensity threshold is automatically set to 10.00 % of maximum peak intensity.
2024-06-26 00:15:01,931 INFO dara.search.tree Creating the root node.
2024-06-26 00:15:01,931 INFO dara.search.tree Refining all the phases in the dataset.
2024-06-26 00:15:15,907 INFO dara.search.tree Finished refining 16 phases, with 7 phases removed.
(_remote_expand_node pid=64087) 2024-06-26 00:15:15,932 INFO dara.search.tree Expanding node cde0f362-338b-11ef-adb6-5e65b5588e19 with current phases [], Rwp = None
(_remote_expand_node pid=64088) 2024-06-26 00:15:17,974 INFO dara.search.tree Expanding node d7491a24-338b-11ef-ba03-5e65b5588e19 with current phases [RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={})], Rwp = 49.03
(_remote_expand_

## Step 3: Result analysis
The search returns a list of `SearchResult` objects.

In [16]:
search_results

[SearchResult(refinement_result=RefinementResult(lst_data=LstResult(raw_lst='Rietveld refinement to file(s) GeO2-ZnO_700C_60min.xy\nBGMN version 4.2.23, 4416 measured points, 121 peaks, 24 parameters\nStart: Wed Jun 26 00:15:20 2024; End: Wed Jun 26 00:15:22 2024\n19 iteration steps\n\nRp=9.96%  Rpb=19.06%  R=10.98%  Rwp=12.04% Rexp=2.69%\nDurbin-Watson d=0.10\n1-rho=1.99%\n\nGlobal parameters and GOALs\n****************************\nQGeO2152cod23003650=0.4771+-0.0021\nQZnO186cod90041780=0.3870+-0.0024\nQZn2GeO4148cod90146310=0.1359+-0.0013\nEPS2=-0.002856+-0.000013\n\nLocal parameters and GOALs for phase GeO2152cod23003650\n******************************************************\nSpacegroupNo=152\nHermannMauguin=P3_121\nXrayDensity=4.276\nRphase=10.88%\nUNIT=NM\nA=0.499110+-0.000024\nC=0.564766+-0.000033\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.2793+-0.0012\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase GeO2152cod23003650\n---------------------------------------------\n  3     0.

For this pattern, the search found only one solution, with `Rwp = 12.04 %`.
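
For readers unfamiliar with the metric, Rwp is the weighted profile R-factor, `sqrt(sum(w * (y_obs - y_calc)**2) / sum(w * y_obs**2))`, usually quoted in percent. The sketch below computes it for a toy pattern. The function name `weighted_profile_r` and the default weights `w = 1 / y_obs` (counting statistics) are assumptions made for illustration; the weighting used internally by the refinement engine may differ.

```python
import math


def weighted_profile_r(y_obs, y_calc, weights=None):
    """Weighted profile R-factor (Rwp) in percent.

    Rwp = sqrt( sum w_i (y_obs_i - y_calc_i)^2 / sum w_i y_obs_i^2 )
    """
    if weights is None:
        # Assumption: counting statistics, so w_i = 1 / y_obs_i (requires y_obs_i > 0).
        weights = [1.0 / y for y in y_obs]
    numerator = sum(w * (o - c) ** 2 for w, o, c in zip(weights, y_obs, y_calc))
    denominator = sum(w * o ** 2 for w, o in zip(weights, y_obs))
    return 100.0 * math.sqrt(numerator / denominator)


# Toy intensities, not taken from the tutorial data.
y_obs = [100.0, 400.0, 900.0]
y_calc = [90.0, 420.0, 870.0]
rwp = weighted_profile_r(y_obs, y_calc)
print(f"Rwp = {rwp:.2f} %")
```

A lower value means the calculated pattern follows the measured one more closely; a perfect fit gives 0 %.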

In [17]:
for i, result in enumerate(search_results):
    print(f"Rwp of solution {i} = {result.refinement_result.lst_data.rwp} %")

Rwp of solution 0 = 12.04 %
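
When a search returns several solutions, one simple way to compare them is to rank them by Rwp. The sketch below does this on stand-in objects that expose only the `refinement_result.lst_data.rwp` attribute path used above. `make_stub` and `rank_by_rwp` are hypothetical helpers, and every value except 12.04 is invented for illustration. Whether `search_phases` already returns its results in sorted order is not stated here, so sorting explicitly is a cautious choice rather than a requirement.

```python
from types import SimpleNamespace


def make_stub(rwp):
    """Stand-in object exposing only the attribute path used in this tutorial."""
    lst_data = SimpleNamespace(rwp=rwp)
    return SimpleNamespace(refinement_result=SimpleNamespace(lst_data=lst_data))


def rank_by_rwp(results):
    """Return the results ordered from lowest (best) to highest Rwp."""
    return sorted(results, key=lambda r: r.refinement_result.lst_data.rwp)


# Illustrative values; only 12.04 appears in the tutorial output.
stub_results = [make_stub(15.3), make_stub(12.04), make_stub(13.7)]
ranked = rank_by_rwp(stub_results)
best_rwp = ranked[0].refinement_result.lst_data.rwp
print(f"Best Rwp = {best_rwp} %")
```

With real results, the same `rank_by_rwp` call would take the `search_results` list in place of `stub_results`.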


Each `SearchResult` has a `visualize` method to visualize the refined pattern and missing/extra peaks in the solution.

In [18]:
search_results[0].visualize()

You can also view all the alternative phases in a solution through the `SearchResult.phases` attribute.

In [25]:
print("Phases found in solution 0:")
for i, phases_ in enumerate(search_results[0].phases):
    print(f"    - Phase {i}: {[phase.path.name for phase in phases_]}")

Phases found in solution 0:
    - Phase 0: ['GeO2_152_(cod_2300365)-0.cif', 'GeO2_154_(cod_9007477)-0.cif']
    - Phase 1: ['ZnO_186_(cod_9004178)-0.cif']
    - Phase 2: ['Zn2GeO4_148_(cod_9014631)-0.cif']


From the result, you can see that for `GeO2` the algorithm identifies two similar phases with slightly different space groups (SG 152 and 154).
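
The file names printed above appear to follow the pattern `<formula>_<space group>_(cod_<id>)-<index>.cif`. Assuming that pattern holds, which is inferred only from the output shown in this tutorial, a small parser can summarize the space groups in each group of alternatives. `parse_cif_name` is a hypothetical helper, not part of Dara.

```python
import re

# Pattern inferred from the file names printed in this tutorial (an assumption).
CIF_NAME = re.compile(
    r"^(?P<formula>[^_]+)_(?P<space_group>\d+)_\(cod_(?P<cod_id>\d+)\)-\d+\.cif$"
)


def parse_cif_name(name):
    """Split a name like 'GeO2_152_(cod_2300365)-0.cif' into its parts."""
    match = CIF_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected CIF file name: {name}")
    return {
        "formula": match["formula"],
        "space_group": int(match["space_group"]),
        "cod_id": int(match["cod_id"]),
    }


# File names copied from the output of solution 0.
phase_groups = [
    ["GeO2_152_(cod_2300365)-0.cif", "GeO2_154_(cod_9007477)-0.cif"],
    ["ZnO_186_(cod_9004178)-0.cif"],
    ["Zn2GeO4_148_(cod_9014631)-0.cif"],
]

summary = {}
for group in phase_groups:
    parsed = [parse_cif_name(name) for name in group]
    summary[parsed[0]["formula"]] = sorted(p["space_group"] for p in parsed)

for formula, space_groups in summary.items():
    print(f"{formula}: space groups {space_groups}")
```

Grouping by formula this way makes it easy to spot entries, such as `GeO2` here, where the pattern alone does not separate closely related structures.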