# Preparations at home

## Installation of the package

 1. (Only needed on Windows) Install the Windows Subsystem for Linux following [these instructions](https://learn.microsoft.com/en-us/windows/wsl/setup/environment). Then execute the following steps (including starting the notebook) within a WSL environment. You can start one by typing `wsl` in PowerShell.
 2. Take a look at the script that is downloaded *via* `wget` below and confirm that it is safe. Then download and run the installation script by executing:
    ```bash
    wget https://github.com/Niolon/XHARPy/raw/refs/heads/main/install.sh && bash install.sh
    ```
    Do **not** execute it via `sudo`, because that can make your conda environments unusable.

## Starting the notebook 


## Test the installation
Execute the following cells by clicking on a cell and pressing `Shift + Enter`. The way they are set up is not how you should do things in practice, but they are optimised to give a reproducible result relatively quickly. If every cell prints `passed` underneath: congratulations, you are ready for the workshop and have a running version of XHARPy on your computer.




# Introduction: What does XHARPy do?

 - Hirshfeld Atom Refinement (with partitioned valence densities)
 - Atomic form factor calculation
 - Both of these with periodic densities evaluated on rectangular grids, with separately evaluated frozen core densities

# Prerequisites

## The two-minute introduction to Python programming

As you might imagine, this is not a Python course, just a short glossary of the syntax that will be used.

Values from the right are assigned to a variable on the left with the `=` operator.

In [24]:
a = 1

To store text, we need to start and end it with the same type of quotation mark.

In [25]:
a_string = "String content"

A dictionary starts and ends with curly braces and contains `key: value` pairs, where key and value are separated by a colon and the pairs by commas. Dictionaries are used for settings in this tutorial. If you know how to write JSON, you can basically transfer what you know.

In [26]:
my_dict = {"setting_name": 1, "another_setting_name": "another value"}

We access, assign or modify a value in a dict using square brackets.

In [27]:
b = my_dict["setting_name"]
my_dict["a_third_setting"] = True
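Dictionaries can also be nested, which is how the more detailed settings further down are grouped. A minimal sketch with made-up setting names (the keys below are purely illustrative and are not XHARPy options):

```python
# Hypothetical settings, only for illustrating nested dictionaries
settings = {
    "general": {"name": "test", "cores": 2},
    "verbose": False,
}

# Chain square brackets to reach a value in an inner dictionary
n_cores = settings["general"]["cores"]

# .get returns a fallback value instead of raising an error for a missing key
max_steps = settings.get("max_steps", 10)

print(n_cores, max_steps)
```

Accessing a missing key with square brackets raises a `KeyError`, so `.get` is handy whenever a setting is optional.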

A function has a name followed by round brackets, in which you pass arguments by position and/or by name. It can produce one or more *return values* that can be assigned to variables.

In [28]:
print("Output")  # no return value

Output


In [29]:
result = dict(name="Alice", age=30, city="New York")
print(result)  # we can use variables as arguments

{'name': 'Alice', 'age': 30, 'city': 'New York'}
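You do not need to write your own functions for this tutorial, but seeing one may make the calls further down easier to read. A small sketch (the function `scale_and_shift` is made up for this illustration) with one positional argument, two named arguments with default values, and two return values:

```python
def scale_and_shift(value, factor=2.0, shift=0.0):
    """Return the scaled value and the scaled value plus a shift."""
    scaled = value * factor
    return scaled, scaled + shift


# First argument by position, shift by name, factor keeps its default of 2.0
scaled, shifted = scale_and_shift(3.0, shift=1.0)
print(scaled, shifted)
```

Several return values are unpacked into several variables on the left of the `=`, which is the same pattern used for the output of `cif2data` later on.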


We get more functionality by importing functions from installed packages.

In [30]:
import numpy as np
import shutil
from pathlib import Path

# Using XHARPy
First we will import the functions needed to load the data, execute the refinement and write the results back to disk. The first time you import something, you should get a warning about JAX using the CPU. This is expected (otherwise you would need a 64-bit capable GPU, and refinement on the GPU is untested).

In [31]:
# import functions for reading the intensities
from xharpy import shelxl_hkl2pd  # For a SHELX hkl file in HKLF 4 format (which needs to be merged!)
from xharpy import xd_hkl2pd  # For an XD hkl file containing intensities
from xharpy import fcf2hkl_pd  # For reading the observed intensities from an fcf4 (do not correct the extinction against the IAM).

# import function to read other data
from xharpy import cif2data, lst2constraint_dict

# import functions needed for refinement
from xharpy import create_construction_instructions, refine

# import function to write out the results
from xharpy import write_cif, write_res, write_fcf, add_density_entries_from_fcf

# import functions to create a tsc file
from xharpy import cif2tsc

# Path is a really convenient way to work with paths in Python (should already be present from the example above)
from pathlib import Path

All functions within XHARPy should have working in-function documentation (a docstring), which you can access via Python's `help` function.

In [32]:
help(refine)

Help on function refine in module xharpy.refine:

refine(cell: jax.Array, symm_mats_vecs: Tuple[jax.Array, jax.Array], hkl: pandas.core.frame.DataFrame, construction_instructions: List[xharpy.structure.common.AtomInstructions], parameters: jax.Array, wavelength: Optional[float] = None, refinement_dict: dict = {}, computation_dict: dict = {}) -> Tuple[jax.Array, jax.Array, Dict[str, Any]]
    Refinement routine. The routine will refine for the given intensities
    against wR2(F^2).

    Parameters
    ----------
    cell : jnp.ndarray
        array with the lattice constants (Angstroem, Degree)
    symm_mats_vecs : Tuple[jnp.ndarray, jnp.ndarray]
        size (K, 3, 3) array of symmetry matrices and (K, 3) array of translation
        vectors for all symmetry elements in the unit cell
    hkl : pd.DataFrame
        pandas DataFrame containing the reflection data. Needs to have at least
        five columns: h, k, l, intensity, weight. Alternatively weight can be
        substituted by 

## Refining with Quantum Espresso: CaF<sub>2</sub>
Note that there is no deeper logic behind which density program is assigned to which structure in this tutorial: the aim is to show the two backend engines and two different structures.

Let us start to work on the first dataset by creating references to the folders needed

In [33]:
folder_caf2 = Path("CaF2")
output_folder_caf2 = folder_caf2 / "xharpy_output"
output_folder_caf2.mkdir(exist_ok=True)
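If `Path` is new to you, the following sketch shows the same pattern in a temporary directory, so that nothing in your working folder is touched. The folder and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

# Work inside a temporary directory that is deleted again afterwards
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "example_data"  # hypothetical folder name
    output_folder = folder / "output"    # "/" joins path components

    # parents=True also creates "example_data" if it is missing
    output_folder.mkdir(parents=True, exist_ok=True)
    # A second call does not fail because of exist_ok=True
    output_folder.mkdir(parents=True, exist_ok=True)

    created = output_folder.is_dir()
    file_name = (output_folder / "result.cif").name

print(created, file_name)
```

Without `parents=True`, `mkdir` raises an error if the parent folder does not exist. In the cell above this is not a problem as long as the `CaF2` folder with the tutorial data is already present.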

### Loading the data and merging the hkl
Many different objects are created from the cif file by the `cif2data` function.

In [34]:
atom_table_caf2, cell_caf2, cell_esd_caf2, symm_mats_vecs_caf2, symm_strings_caf2, wavelength_caf2 = cif2data(
    folder_caf2 / "CaF2.cif", 0
)

First, it is instructive to take a quick look at the imported `atom_table_caf2`. This is an object which has a nice representation in the Jupyter notebook (for reference: it is a pandas DataFrame).

In [35]:
atom_table_caf2

| | label | type_symbol | fract_x | fract_y | fract_z | U_iso_or_equiv | adp_type | occupancy | site_symmetry_order | calc_flag | ... | U_23 | U_13 | U_12 | U_11_esd | U_22_esd | U_33_esd | type_description | type_scat_dispersion_real | type_scat_dispersion_imag | type_scat_source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ca01 | Ca | 0.5 | 0.5 | 0.5 | 0.00331 | Uani | 0.020833 | 48 | d | ... | 0.0 | 0.0 | 0.0 | 5e-05 | 5e-05 | 5e-05 | Ca | 0.1378 | 0.1937 | International Tables Vol C Tables 4.2.6.8 and ... |
| 1 | F002 | F | 0.25 | 0.75 | 0.75 | 0.00491 | Uani | 0.041667 | 24 | d | ... | 0.0 | 0.0 | 0.0 | 8e-05 | 8e-05 | 8e-05 | F | 0.006 | 0.0061 | International Tables Vol C Tables 4.2.6.8 and ... |


We can now load the hkl. Make sure that the reflections are merged and that the space group is correct. XHARPy will run with unmerged reflections, but the refinement will be slow and the result will not be correct.

In [47]:
hkl_caf2 = fcf2hkl_pd(folder_caf2 / 'CaF2.fcf')
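To illustrate what merging means, here is a minimal sketch that averages repeated measurements of the same index triple with inverse-variance weights. This rests on simplifying assumptions and is not the routine used by XHARPy or SHELXL: it only merges entries with identical `(h, k, l)` and ignores symmetry-equivalent reflections. The function name and all numbers are made up.

```python
from collections import defaultdict
from math import sqrt


def merge_identical(reflections):
    """Merge repeated (h, k, l) entries by an inverse-variance weighted mean.

    reflections: iterable of (h, k, l, intensity, esd) tuples.
    Returns a dict mapping (h, k, l) to (merged_intensity, merged_esd).
    """
    sum_w = defaultdict(float)
    sum_wi = defaultdict(float)
    for h, k, l, intensity, esd in reflections:
        weight = 1.0 / esd**2
        sum_w[(h, k, l)] += weight
        sum_wi[(h, k, l)] += weight * intensity
    return {
        hkl: (sum_wi[hkl] / sum_w[hkl], 1.0 / sqrt(sum_w[hkl]))
        for hkl in sum_w
    }


# Made-up example: the reflection 1 1 1 was measured twice
data = [
    (1, 1, 1, 100.0, 2.0),
    (1, 1, 1, 104.0, 2.0),
    (2, 0, 0, 50.0, 1.0),
]
merged = merge_identical(data)
print(merged[(1, 1, 1)])
```

After merging, every index triple appears once, so fewer structure factors need to be evaluated in each refinement step.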

The final file we need is a SHELXL lst file for the special position constraints. These can also be generated manually (as described [here](https://xharpy.readthedocs.io/en/latest/library/library_symm_con.html)). However, creating them from an lst file is more convenient. For reference, this is what the function is looking for. It might be possible to recreate this without SHELXL (make sure to keep the spaces around the `*` signs):
```
 Special position constraints for Ca01
 x =  0.5000   y =  0.5000   z =  0.5000   U22 = 1.0 * U11   U33 = 1.0 * U11   
 U23 = 0   U13 = 0   U12 = 0   sof = 0.02083   
 Input constraints retained (at least in part) for  sof

 Special position constraints for F002
 x =  0.2500   y =  0.7500   z =  0.7500   U22 = 1.0 * U11   U33 = 1.0 * U11   
 U23 = 0   U13 = 0   U12 = 0   sof = 0.04167   
 Input constraints retained (at least in part) for  sof
```

In [37]:
constraints_caf2 = lst2constraint_dict(folder_caf2 / 'CaF2.lst')
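To make the expected text format more tangible, the following sketch pulls the individual expressions out of such a block with regular expressions. This is not the implementation of `lst2constraint_dict`: the function name `read_special_position_blocks` and the returned structure are made up for illustration, and the real function may store the constraints in a very different form.

```python
import re

LST_TEXT = """
 Special position constraints for Ca01
 x =  0.5000   y =  0.5000   z =  0.5000   U22 = 1.0 * U11   U33 = 1.0 * U11
 U23 = 0   U13 = 0   U12 = 0   sof = 0.02083
 Input constraints retained (at least in part) for  sof

 Special position constraints for F002
 x =  0.2500   y =  0.7500   z =  0.7500   U22 = 1.0 * U11   U33 = 1.0 * U11
 U23 = 0   U13 = 0   U12 = 0   sof = 0.04167
 Input constraints retained (at least in part) for  sof
"""


def read_special_position_blocks(text):
    """Return {atom label: {parameter: expression}} for each constraint block."""
    header = re.compile(r"Special position constraints for\s+(\S+)")
    # "name = expression", the expression ends at two or more spaces or the line end
    assignment = re.compile(r"(\w+)\s*=\s*(.+?)(?=\s{2,}|\s*$)")
    blocks = {}
    current = None
    for line in text.splitlines():
        match = header.search(line)
        if match:
            # A header line starts the block of a new atom
            current = blocks.setdefault(match.group(1), {})
        elif current is not None and "=" in line:
            for name, expression in assignment.findall(line):
                current[name] = expression.strip()
    return blocks


blocks = read_special_position_blocks(LST_TEXT)
print(blocks["Ca01"]["U22"])
```

In this sketch, entries are told apart by the run of several spaces between them. That makes it possible to read `1.0 * U11` as one expression even though it contains single spaces.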

### Setting up the refinement

In [38]:
refinement_dict_caf2 = {
    'f0j_source': 'qe',
    'reload_step': 1,
    'core': 'constant', # should be constant; for very low grid sizes the scaling between core and valence density can be refined using scale.
}

In [39]:
construction_instructions_caf2, parameters_caf2 = create_construction_instructions(
    atom_table=atom_table_caf2,
    constraint_dict=constraints_caf2,
    refinement_dict=refinement_dict_caf2
)

In [40]:
computation_dict_caf2 = {
    'symm_equiv': 'once', # XHARPy specific
    'mpicores': 2, # Sets the number of mpi cores. Make sure that n(MPI) * n(OMP) < N(Threads)
    'control': {
        'prefix': 'CaF2',
        'pseudo_dir': './CaF2/pseudo/',
    },
    'system': {
        'ibrav': 1,
        'a': float(cell_caf2[0]),
        'ecutwfc': 100,
        'ecutrho': 400,
    },
    'paw_files': {
        'Ca': 'Ca.paw.upf',
        'F': 'F.paw.upf',
    },
    'k_points':{
        'mode': 'automatic',
        'input': '1 1 1 0 0 0'
    }
}
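The rule from the `mpicores` comment can be checked before starting a long calculation. The sketch below is a hypothetical helper, not part of XHARPy. It assumes that `n_omp` is the number of OpenMP threads per MPI process (commonly set through the `OMP_NUM_THREADS` environment variable) and that `os.cpu_count()` is a reasonable estimate of the available threads.

```python
import os


def cores_fit(n_mpi, n_omp=1, n_threads=None):
    """Check the rule n(MPI) * n(OMP) < N(Threads)."""
    if n_threads is None:
        # os.cpu_count() can return None on unusual systems, fall back to 1
        n_threads = os.cpu_count() or 1
    return n_mpi * n_omp < n_threads


# Explicit thread counts make the example independent of your machine
fits_small = cores_fit(2, n_omp=1, n_threads=8)  # 2 < 8
fits_large = cores_fit(4, n_omp=2, n_threads=8)  # 8 < 8 is not fulfilled
print(fits_small, fits_large)
```

Calling `cores_fit(2)` without `n_threads` uses the logical CPU count of the machine the notebook runs on.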

In [41]:
parameters_caf2, var_cov_mat_caf2, information_caf2 = refine(
    cell=cell_caf2, 
    symm_mats_vecs=symm_mats_vecs_caf2,
    hkl=hkl_caf2,
    construction_instructions=construction_instructions_caf2,
    parameters=parameters_caf2,
    wavelength=wavelength_caf2,
    refinement_dict=refinement_dict_caf2,
    computation_dict=computation_dict_caf2
)

Started refinement at  2025-03-31 18:01:01.366118
Preparing
  calculating core density for Ca from Ca.paw.upf
  calculating core density for F from F.paw.upf
  calculating first atomic form factors
  theoretical calculation started at  2025-03-31 18:01:01.433969
  convergence has been achieved with energy of -1011.09002157 Ry
  theoretical calculation ended at  2025-03-31 18:02:49.362487
  partitioning started at  2025-03-31 18:02:49.362596


  self.pid = os.fork()
  self.pid = os.fork()
  self.pid = os.fork()
  self.pid = os.fork()


  partitioning ended at  2025-03-31 18:03:16.231424
  building least squares function
  setting up gradients
step 0: Optimizing scaling
  wR2: 0.041019, number of iterations: 3
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 9
step 1: atom_positions are converged. No new structure factor calculation.
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 0
step 2: atom_positions are converged. No new structure factor calculation.
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 0
step 3: atom_positions are converged. No new structure factor calculation.
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 0
step 4: atom_positions are converged. No new structure factor calculation.
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 0
step 5: atom_positions are converged. No new structure factor calculation.
  minimizing least squares sum
  wR2: 0.028286, number of iterations: 0
step 6: atom_pos
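The wR2 values in the log are agreement factors on F². As a reference, here is a minimal sketch of the conventional definition, wR2 = sqrt(Σ w (Fo² − Fc²)² / Σ w (Fo²)²), as used for example by SHELXL. That XHARPy reports exactly this quantity is an assumption based on the docstring of `refine`; the numbers below are made up.

```python
from math import sqrt


def wr2(f2_obs, f2_calc, weights):
    """Conventional wR2(F^2): sqrt(sum w (Fo^2 - Fc^2)^2 / sum w (Fo^2)^2)."""
    numerator = sum(
        w * (obs - calc) ** 2 for obs, calc, w in zip(f2_obs, f2_calc, weights)
    )
    denominator = sum(w * obs**2 for obs, w in zip(f2_obs, weights))
    return sqrt(numerator / denominator)


# Made-up observed and calculated intensities with unit weights
f2_obs = [100.0, 50.0, 10.0]
f2_calc = [98.0, 51.0, 10.0]
weights = [1.0, 1.0, 1.0]

value = wr2(f2_obs, f2_calc, weights)
print(round(value, 4))
```

A value of zero would mean perfect agreement between observed and calculated intensities.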

In [42]:
write_cif(
    output_cif_path=output_folder_caf2 / 'xharpy.cif',
    cif_dataset='xharpy',
    shelx_cif_path=folder_caf2 / 'CaF2.cif',
    shelx_dataset=0,
    cell=cell_caf2,
    cell_esd=cell_esd_caf2,
    symm_mats_vecs=symm_mats_vecs_caf2,
    hkl=hkl_caf2,
    construction_instructions=construction_instructions_caf2,
    parameters=parameters_caf2,
    var_cov_mat=var_cov_mat_caf2,
    refinement_dict=refinement_dict_caf2,
    computation_dict=computation_dict_caf2,
    information=information_caf2
)



In [43]:
write_fcf(
    fcf_path=output_folder_caf2 / 'xharpy.fcf',
    fcf_dataset='xharpy',
    fcf_mode=4,
    cell=cell_caf2,
    hkl=hkl_caf2,
    construction_instructions=construction_instructions_caf2,
    parameters=parameters_caf2,
    wavelength=wavelength_caf2,
    refinement_dict=refinement_dict_caf2,
    symm_strings=symm_strings_caf2,
    information=information_caf2,
);

In [44]:
write_fcf(
    fcf_path=output_folder_caf2 / 'xharpy_6.fcf',
    fcf_dataset='xharpy_6',
    fcf_mode=6,
    cell=cell_caf2,
    hkl=hkl_caf2,
    construction_instructions=construction_instructions_caf2,
    parameters=parameters_caf2,
    wavelength=wavelength_caf2,
    refinement_dict=refinement_dict_caf2,
    symm_strings=symm_strings_caf2,
    information=information_caf2,
);

In [45]:
write_res(
    out_res_path=output_folder_caf2 / 'xharpy_6.res',
    in_res_path=folder_caf2 / 'CaF2.lst',
    cell=cell_caf2,
    cell_esd=cell_esd_caf2,
    construction_instructions=construction_instructions_caf2,
    parameters=parameters_caf2,
    wavelength=wavelength_caf2
)

In [46]:
add_density_entries_from_fcf(
    output_folder_caf2 / 'xharpy.cif',
    str(output_folder_caf2 / 'xharpy_6.fcf')
)

## Refining with GPAW: Ice VI

## Working outside of Jupyter