# Finding the starting lattice parameters using GSAS-II

In the prior tutorial, we found the starting lattice parameters using MILK.
We can also use other Rietveld packages such as GSAS-II.

In this additional notebook we will find the starting lattice parameters using GSAS-II.

**Note, this example uses the lead sulphate data found here: https://github.com/lanl/spotlight/tree/master/examples/lead_sulphate**

**If you run the notebook, first copy the data files into the directory from which you run it.**

**Note, this notebook was tested using GSAS-II revision r5609.**

## Define the cost function

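The cost function below sets up the lead sulphate example from the GSAS-II tutorial. For each trial point `p` it refines the background, uses `p` to rescale the a, b, and c lattice parameters, refines the unit cell, and returns the weighted-profile R-factor (Rwp).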

In [1]:
import GSASIIscriptable as gsasii
import GSASIIlattice as lattice
import io
import multiprocess
import os
import shutil
import sys
import time
from mystic import models
from spotlight import filesystem

class CostFunction(models.AbstractFunction):
    
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        
        # if True then create the subdir and copy data files there
        self.initialized = False
        
        # if True then print GSAS-II output
        self.debug = False
    
        # define a list of detectors
        self.detectors = [dict(data_file="./PBSO4.xra",
                               detector_file="./INST_XRY.prm",
                              min_two_theta=16.0,
                              max_two_theta=158.4),
                         dict(data_file="./PBSO4.cwn",
                              detector_file="./inst_d1a.prm",
                              min_two_theta=19.0,
                              max_two_theta=153.0)]

        # define a list of phases
        self.phases = [dict(phase_file="./PbSO4-Wyckoff.cif",
                            phase_label="PBSO4")]
    
    def function(self, p):

        # get start time of this step for stdout
        t0 = time.time()
        
        # create run dir
        dir_name = f"opt_{multiprocess.current_process().name}"
        if not self.initialized:
            filesystem.mkdir(dir_name)
            for detector in self.detectors:
                filesystem.cp([detector["data_file"], detector["detector_file"]], dest=dir_name)
            for phase in self.phases:
                filesystem.cp([phase["phase_file"]], dest=dir_name)
            self.initialized = True
                
        # create a text trap and redirect stdout
        # this is just to make the stdout easier to follow
        if not self.debug:
            silent_stdout = io.StringIO()
            sys.stdout = sys.stderr = silent_stdout
            
        # create a GSAS-II project
        gpx = gsasii.G2Project(newgpx=f"{dir_name}/lead_sulphate.gpx")

        # add histograms
        for det in self.detectors:
            gpx.add_powder_histogram(det["data_file"], det["detector_file"])

        # add phases
        for phase in self.phases:
            gpx.add_phase(phase["phase_file"], phase["phase_label"],
                          histograms=gpx.histograms())

        # turn on background refinement
        args = {
                "Background": {
                    "no. coeffs" : 3,
                    "refine": True,
                }
        }
        for hist in gpx.histograms():
            hist.set_refinements(args)

        # refine
        gpx.do_refinements([{}])
        gpx.save(f"{dir_name}/step_1.gpx")
        
        # reload the project saved in step 1 and save it under a new name for step 2
        gpx = gsasii.G2Project(f"{dir_name}/step_1.gpx")
        gpx.save(f"{dir_name}/step_2.gpx")

        # change lattice parameters
        for phase in gpx["Phases"].keys():

            # ignore data key
            if phase == "data":
                continue

            # handle PBSO4 phase
            elif phase == "PBSO4":
                cell = gpx["Phases"][phase]["General"]["Cell"]
                a, b, c = p
                t11, t22, t33 = cell[1] / a, cell[2] / b, cell[3] / c
                gpx["Phases"][phase]["General"]["Cell"][1:] = lattice.TransformCell(
                    cell[1:7], [[t11, 0.0, 0.0],
                                [0.0, t22, 0.0],
                                [0.0, 0.0, t33]])

            # otherwise raise error because refinement plan does not support this phase
            else:
                raise NotImplementedError("Refinement plan cannot handle phase {}".format(phase))

        # turn on unit cell refinement
        args = {
            "set": {
                "Cell" : True,
            }
        }

        # refine
        gpx.set_refinement(args)
        gpx.do_refinements([{}])
        
        # now restore stdout and stderr
        if not self.debug:
            sys.stdout = sys.__stdout__
            sys.stderr = sys.__stderr__

        # get minimization statistic
        stat = gpx["Covariance"]["data"]["Rvals"]["Rwp"]

        # print a message to follow the results
        print(f"Our R-factor is {stat} and it took {time.time() - t0}s to compute")

        return stat

GSAS-II binary directory: /Users/cmbiwer/anaconda3/envs/spotlight-gsas2/GSASII/bindist
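
The cost function above hides GSAS-II's messages by reassigning `sys.stdout` and `sys.stderr` and restoring them at the end. As a minimal sketch of an alternative, assuming only the Python standard library, `contextlib.redirect_stdout` and `contextlib.redirect_stderr` restore the previously active streams automatically, even if the wrapped call raises. The names `noisy_step` and `run_quietly` are hypothetical stand-ins and are not part of GSAS-II or spotlight.

```python
import contextlib
import io


def noisy_step():
    # hypothetical stand-in for a GSAS-II call that prints progress messages
    print("refinement output that we do not want to see")
    return 42.0


def run_quietly(func, debug=False):
    """Call func(), hiding anything it prints unless debug is True."""
    if debug:
        return func(), ""
    trap = io.StringIO()
    # both streams are restored on exit, even if func() raises
    with contextlib.redirect_stdout(trap), contextlib.redirect_stderr(trap):
        result = func()
    return result, trap.getvalue()


result, captured = run_quietly(noisy_step)
print(f"result = {result}, hidden {len(captured.splitlines())} line(s)")
# prints: result = 42.0, hidden 1 line(s)
```

Like the approach in the cost function, this only captures text written through Python's `sys.stdout` and `sys.stderr`. Output that compiled extensions write directly to the process's file descriptors would not be trapped.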


## An ensemble using only the cost function

Below is an example that runs an ensemble of optimizers in parallel with GSAS-II to search for the global minimum.
**Note, this will take a while. The run time depends on the number of processors available on your machine.**
To limit the run time, each optimizer is restricted to 50 evaluations of the cost function.

In [2]:
from mystic import tools
from mystic.solvers import BuckshotSolver
from mystic.solvers import NelderMeadSimplexSolver
from mystic.termination import VTR
from pathos.pools import ProcessPool as Pool

# set the ranges
target = [8.474, 5.394, 6.954]
lower_bounds = [x * 0.95 for x in target]
upper_bounds = [x * 1.05 for x in target]
    
# get number of parameters in model
ndim = len(target)
    
# set random seed so we can reproduce results
tools.random_seed(0)
    
# create a solver
solver = BuckshotSolver(dim=ndim, npts=8)
    
# set multi-processing pool
solver.SetMapper(Pool().map)
    
# since we have a search solver,
# we specify which optimization algorithm to use within the search
# and limit that optimizer to at most 50 evaluations of our cost function
subsolver = NelderMeadSimplexSolver(ndim)
subsolver.SetEvaluationLimits(50, 50)
solver.SetNestedSolver(subsolver)
    
# set the range to search for all parameters
solver.SetStrictRanges(lower_bounds, upper_bounds)
    
# find the minimum
solver.Solve(CostFunction(ndim), VTR())
    
# print the best parameters
print(f"The best solution is {solver.bestSolution} with Rwp {solver.bestEnergy}")
print(f"The reference solution is {target}")
ratios = [x / y for x, y in zip(target, solver.bestSolution)]
print(f"The ratios of the reference values to the best solution are {ratios}")

Our R-factor is 51.451971374756155 and it took 7.922705888748169s to compute
Our R-factor is 50.417913808304995 and it took 8.017292022705078s to compute
Our R-factor is 51.28232042340777 and it took 8.144327878952026s to compute
Our R-factor is 51.091942862938346 and it took 8.279250860214233s to compute
Our R-factor is 51.27628865991811 and it took 9.38934588432312s to compute
Our R-factor is 50.08923603492271 and it took 9.453613042831421s to compute
Our R-factor is 50.35337155042529 and it took 9.481461763381958s to compute
Our R-factor is 50.81434198032311 and it took 10.025512218475342s to compute
Our R-factor is 51.38764331916333 and it took 8.434493780136108s to compute
Our R-factor is 51.20099807240431 and it took 9.807249069213867s to compute
Our R-factor is 50.44817967359154 and it took 9.921422719955444s to compute
Our R-factor is 50.95111290919368 and it took 10.085067987442017s to compute
Our R-factor is 51.1257501038701 and it took 9.93289589881897s to compute
Our R-fact
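
To see the ensemble idea in isolation, here is a toy sketch that needs neither GSAS-II nor mystic. It draws several random starting points inside the bounds, runs a short local search from each, and keeps the best result. This is not mystic's implementation: a simple coordinate pattern search is swapped in for Nelder-Mead, the searches run one after another rather than in parallel, and `toy_cost` is a hypothetical quadratic standing in for Rwp. All function names here are illustrative assumptions.

```python
import random


def toy_cost(p):
    # hypothetical stand-in for Rwp: a smooth bowl whose minimum
    # sits at the reference lattice parameters
    target = (8.474, 5.394, 6.954)
    return sum((x - t) ** 2 for x, t in zip(p, target))


def clip(p, lower, upper):
    # keep every coordinate inside its [lower, upper] range
    return [min(max(x, lo), hi) for x, lo, hi in zip(p, lower, upper)]


def local_search(cost, start, lower, upper, max_evals=50, step=0.1):
    """Coordinate pattern search limited to max_evals cost evaluations."""
    best, best_cost, evals = list(start), cost(start), 1
    while evals < max_evals and step > 1e-6:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                if evals >= max_evals:
                    break
                trial = list(best)
                trial[i] += delta
                trial = clip(trial, lower, upper)
                trial_cost = cost(trial)
                evals += 1
                if trial_cost < best_cost:
                    best, best_cost, improved = trial, trial_cost, True
        if not improved:
            # no move helped at this step size, so search more finely
            step /= 2.0
    return best, best_cost


def multi_start(cost, lower, upper, npts=8, seed=0):
    """Run a local search from npts random starts and return the best."""
    rng = random.Random(seed)
    results = []
    for _ in range(npts):
        start = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        results.append(local_search(cost, start, lower, upper))
    return min(results, key=lambda r: r[1])


target = [8.474, 5.394, 6.954]
lower = [x * 0.95 for x in target]
upper = [x * 1.05 for x in target]
best, best_cost = multi_start(toy_cost, lower, upper)
print(f"best point {best} with cost {best_cost}")
```

Starting from several points reduces the chance that a single local search stalls in a poor region, which is the motivation for using an ensemble with the real, much more expensive cost function.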

This concludes the example using GSAS-II.