In [1]:
from mpi4py import MPI
import coqui

# Create CoQui MPI handler and set logging verbosity in the beginning 
coqui_mpi = coqui.MpiHandler()
coqui.set_verbosity(coqui_mpi, output_level=1)



# Preparing CoQuí inputs from DFT outputs

<figure style="text-align: center;"> <img src="../images/coqui_workflow_wan90.png" alt="Workflow of CoQui" width="60%"> </figure>

As shown in the highlighted portion of the CoQuí workflow, this notebook introduces the **entry point of a CoQuí calculation**: how to prepare the inputs for many-body simulations in CoQuí.  

To set up a many-body calculation in CoQuí, two kinds of information are essential:
1. **Crystal metadata**, such as the k-point mesh and unit cell information.
2. **Single-particle Bloch orbitals** that provide the basis for representing the interacting problem.

The single-particle orbitals in (2) are most commonly obtained from a mean-field calculation such as density functional theory (DFT), Hartree–Fock (HF), or another single-particle method. These orbitals provide both a **physically motivated basis** for the material of interest and, in many-body perturbation theory, a **reference state** for perturbative expansions. 

> 💡 Alternatively, one may also start from a **pre-optimized localized basis** (e.g. Gaussian-type orbitals) or other basis sets tailored to the problem at hand. CoQuí is flexible in this regard: what matters is that the chosen orbitals and crystal metadata are expressed in the standardized format understood by CoQuí.

Often, a subset of these Bloch orbitals is further transformed into **maximally localized Wannier functions (MLWFs)** [[1](https://journals.aps.org/prb/abstract/10.1103/PhysRevB.65.035109), [2](https://journals.aps.org/prb/abstract/10.1103/PhysRevB.56.12847)]. MLWFs serve two crucial roles:
- They give a **compact real-space representation** that allows **interpolation** of spectral properties to dense k-meshes.  
- They define **physically motivated subspaces** needed for embedding methods such as DMFT or EDMFT.

A practical obstacle is that each electronic-structure package (QE, VASP, PySCF, …) stores these quantities in its own format. CoQuí therefore provides **custom converters** that translate external outputs into a CoQuí-readable format. Currently, [Quantum ESPRESSO](https://www.quantum-espresso.org) (QE) and [PySCF](https://pyscf.org) are supported. In this tutorial, we focus on the QE interface.

### What this notebook covers

1. **Converting DFT outputs into a CoQuí-readable format**  
   – using the `pw2coqui.x` interface to translate Quantum ESPRESSO (QE) results into a standardized HDF5 file.
   
2. **Declaring a physical system with `coqui.Mf`**  
   – the `Mf` class encapsulates the essential single-particle and structural information (orbitals, k-mesh, unit cell) that declares the system in CoQuí.  

3. **Constructing MLWFs with CoQuí’s Wannier90 interface**  
   – calling `coqui.wannier90` to run the Wannierization from within a CoQuí script.
   
### Learning Goals
By the end of this notebook, you will be able to:
- Convert QE `scf`/`nscf` results into a CoQuí-ready HDF5 file using `pw2coqui.x`.
- Initialize and inspect a `coqui.Mf` object to declare a problem.
- Construct MLWFs with CoQuí’s Wannier90 interface.


### Prerequisites
- Basic familiarity with writing and running input files for [Quantum ESPRESSO](https://www.quantum-espresso.org) and [Wannier90](https://wannier90.readthedocs.io/en/latest/).

> 💡 For more on QE, see the [official tutorials](https://www.quantum-espresso.org/tutorials/).  
> 💡 For more on Wannier90, see the [documentation](https://wannier90.readthedocs.io/en/latest/) and [tutorials](https://docs.epw-code.org/_downloads/f2f39ac545ca12dc2838f7359b3633a9/Mon.4.Pizzi.pdf).

## 🔹 Converting QE outputs for CoQuí

<figure style="text-align: center;"> 
    <img src="../images/coqui_dft_converter.png" alt="Workflow of input preparation for CoQuí" width="60%"> 
</figure>

Here we show how to convert the Quantum ESPRESSO (QE) outputs into a format recognized by CoQuí using the [customized converter **`pw2coqui.x`**](https://github.com/AbInitioQHub/coqui/tree/main/qe_converter). 

### ↔️ Running the QE converter 
Starting from a finished QE scf/nscf run, execute the converter: 
```bash
pw2coqui.x -in {prefix}.pw2coqui.in
```
The input file `{prefix}.pw2coqui.in` requires only minimal parameters: 
``` text
&input_pw2coqui
  prefix = "{prefix}"
  outdir = "{outdir}"
/
```
where `{prefix}` and `{outdir}` must match those used in the QE scf/nscf calculations.
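
The input file can also be generated from a script. The following is a minimal sketch that renders the namelist shown above and assembles the converter command line. The helper names `render_pw2coqui_input` and `pw2coqui_command` are hypothetical, the values `"svo"` and `"./out"` are illustrative, and nothing beyond the `&input_pw2coqui` fields and the `-in` flag from this section is assumed.

```python
def render_pw2coqui_input(prefix: str, outdir: str) -> str:
    """Return the minimal pw2coqui.x namelist for a finished QE run."""
    return (
        "&input_pw2coqui\n"
        f'  prefix = "{prefix}"\n'
        f'  outdir = "{outdir}"\n'
        "/\n"
    )


def pw2coqui_command(prefix: str) -> list:
    """Converter command as an argument list (suitable for subprocess.run)."""
    return ["pw2coqui.x", "-in", f"{prefix}.pw2coqui.in"]


# Illustrative values: use the prefix/outdir from your own QE input files.
input_text = render_pw2coqui_input("svo", "./out")
command = pw2coqui_command("svo")
print(input_text)
print(" ".join(command))
```

To run the conversion, write `input_text` to `svo.pw2coqui.in` and pass `command` to `subprocess.run(command, check=True)`, with `pw2coqui.x` available on your `PATH`.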

### 📂 Output and what it contains
Running the converter produces:

- **`{prefix}.coqui.h5`** – a standardized HDF5 file containing crystal metadata (k-point mesh, lattice vectors, pseudopotential info, etc.).  

In addition, CoQuí expects access to the QE wavefunction directory created earlier by the `nscf` run:

- **`{outdir}/{prefix}.save/`** – produced by QE itself, this directory stores the Kohn–Sham (KS) orbitals $\phi^{\mathbf{k}}_i(\mathbf{r})$ in the `wfc*.hdf5` files. These orbitals serve directly as the **Bloch single-particle basis**, the other essential input to a CoQuí calculation. 

Together, these two pieces form the **complete CoQuí input**.

> ⚠️ **Important**  
> Do **not** delete or move the `{prefix}.save/` tree after conversion. CoQuí needs both the lightweight `{prefix}.coqui.h5` metadata file and the heavy orbital data in `{prefix}.save/`. Removing either will break subsequent steps.
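
A quick pre-flight check can catch a missing piece before a longer calculation starts. The sketch below uses only the Python standard library. The function name `missing_coqui_inputs` is hypothetical, and the default location of `{prefix}.coqui.h5` (inside `outdir`) is an assumption; pass `h5_dir` if your file lives elsewhere.

```python
from pathlib import Path


def missing_coqui_inputs(prefix, outdir, h5_dir=None):
    """Return the expected input paths that are absent (empty list = all found)."""
    outdir = Path(outdir)
    # Assumption: the metadata file sits in `outdir` unless `h5_dir` is given.
    h5_file = Path(h5_dir if h5_dir is not None else outdir) / f"{prefix}.coqui.h5"
    save_dir = outdir / f"{prefix}.save"

    missing = []
    if not h5_file.is_file():
        missing.append(h5_file)
    if not save_dir.is_dir():
        missing.append(save_dir)
    elif not any(save_dir.glob("wfc*.hdf5")):
        # The directory exists but holds no QE wavefunction files.
        missing.append(save_dir / "wfc*.hdf5")
    return missing
```

For the SrVO$_3$ data used in this notebook, the call would be `missing_coqui_inputs("svo", "data/qe_inputs/svo/222/out")`; an empty list means both pieces were found.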

## 🔹 Declaring a physical system using `Mf` class

<figure style="text-align: center;"> <img src="../images/coqui_workflow_mf.png" alt="The Mf declaration step in the CoQuí workflow" width="60%"> <figcaption><em>Figure&nbsp;1:</em> The <code>Mf</code> declaration step in the CoQuí workflow. </figcaption> </figure>

Every CoQuí calculation begins by declaring a *physical system*. This specifies the ingredients required to construct the non-interacting Hamiltonian:
$$
(H_{0})^{\mathbf{k}}_{ij} = \int d\mathbf{r}\, \phi^{\mathbf{k}*}_{i}(\mathbf{r}) \Big[ -\frac{\nabla^{2}}{2} + V_{\mathrm{ext}}(\mathbf{r}) \Big] \phi^{\mathbf{k}}_{j}(\mathbf{r})
$$

To build this object, CoQuí needs the **two essential inputs introduced earlier**:
1. **Crystal metadata** — structure, pseudopotentials, k-mesh, cutoffs, etc.  
2. **Single-particle basis functions** $\phi^{\mathbf{k}}_{i}(\mathbf{r})$ — the orbitals used to expand the many-body Hamiltonian (Kohn–Sham orbitals, Gaussian-type orbitals, grid-based functions, …).

### 🧱 The `Mf` container

In CoQuí, the declaration of a simulated system is encapsulated in the **`Mf` class**, one of CoQuí’s core building blocks. It is a read-only container that standardizes the interface between external DFT codes and CoQuí: 
```python
# Mf for the target system
params = {
  "prefix": "svo",                        # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                              # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")
```
The construction of `Mf` is done via: 
```python
coqui.make_mf(mpi: coqui.MpiHandler, params: dict, mf_type: str) -> coqui.Mf
```
with arguments:
- `mpi`: an `MpiHandler` object managing the MPI context.
- `params`: dictionary of parameters used for the construction of `Mf`.
- `mf_type`: string specifying the DFT backend. Currently, CoQuí supports: 
  -  [Quantum ESPRESSO](https://www.quantum-espresso.org): `mf_type = "qe"`
  -  [PySCF](https://pyscf.org): `mf_type = "pyscf"` 

**Key parameters:**
- `params["prefix"]` - the `prefix` used in Quantum ESPRESSO scf/nscf calculations.
- `params["outdir"]` - the `outdir` used in Quantum ESPRESSO scf/nscf calculations. This, together with `prefix`, tells CoQuí where to look for the DFT output files.
- `params["nbnd"]` - number of bands imported from DFT outputs (affects cost/accuracy).

Once `Mf` is created, all subsequent CoQuí routines become agnostic to how the mean-field data was generated. At this point, you have declared your system, and the same downstream interface applies regardless of the underlying DFT code.
  
> 💡**Note**: Once initialized, the `Mf` object is **read-only** and can be **reused** across the full range of CoQuí features — GW calculations, embedding routines, post-processing, etc.

> 💡**Note**: The resulting `Mf` retains the `MpiHandler` passed to `make_mf`, so any function that receives `Mf` as an argument will automatically have access to the same `MpiHandler`.
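
When the same system is declared in several scripts, it can help to build `params` in one place. The sketch below is an illustration only: `qe_mf_params` is a hypothetical helper, not part of CoQuí. It relies on the behavior explored in the hands-on below, where leaving out `"nbnd"` makes `Mf` include all KS orbitals available from QE.

```python
def qe_mf_params(prefix, outdir, nbnd=None):
    """Build the `params` dictionary for coqui.make_mf(..., mf_type="qe")."""
    if nbnd is not None and nbnd < 1:
        raise ValueError("nbnd must be a positive integer")
    params = {"prefix": prefix, "outdir": outdir}
    if nbnd is not None:
        # Restrict the number of bands; omit the key to keep every band.
        params["nbnd"] = nbnd
    return params


params = qe_mf_params("svo", "data/qe_inputs/svo/222/out", nbnd=40)
all_bands = qe_mf_params("svo", "data/qe_inputs/svo/222/out")
print(params)
print(all_bands)

# With CoQuí and the QE data available, the dictionary is used as before:
# mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")
```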

### ▶️ Hands-on 1: Declare Your First Mean-Field Container of CoQuí

Now it’s your turn to create an `Mf` object for **SrVO$_3$**. Copy and run the Python snippet above to initialize the container. Then:  

1. **Describe the system from the log output**. How many KS orbitals were read? How many k-points and spins are there?
2. **Cross-check with QE input/output (optional)**  
   - Open `data/qe_inputs/svo/222/out/svo.nscf.in` and `data/qe_inputs/svo/222/out/svo.nscf.out`.  
   - Verify that the information reported by CoQuí matches the original QE settings. 
3. **Experiment with the `"nbnd"` parameter**  
   - Remove `"nbnd"` from the parameter dictionary. This sets `Mf` to include *all* KS orbitals available from QE. What is that number?  
   - Reintroduce `"nbnd"` with different values within this range. Does the log correctly reflect the number of orbitals included?
  
> 💡 The `"nbnd"` parameter provides a convenient way to restrict the basis size without rerunning the DFT calculation.
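
If you capture the reader log as text, its `key = value` lines are easy to turn into a dictionary for cross-checking against QE. The sketch below parses a sample block copied from this notebook's output. The parser `parse_reader_log` is a hypothetical helper, and how you capture the log (for example by redirecting the script's output to a file) is left open.

```python
import math
import re

SAMPLE_LOG = """\
  Quantum ESPRESSO reader
  -----------------------
  Number of spins                = 1
  Number of polarizations        = 1
  Number of bands                = 40
  Monkhorst-Pack mesh            = (2,2,2)
  K-points                       = 8 total, 4 in the IBZ
  Number of electrons            = 41.0
"""


def parse_reader_log(text):
    """Collect the 'key = value' lines of the reader block as strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip the title and separator lines, which contain no '='
            fields[key.strip()] = value.strip()
    return fields


info = parse_reader_log(SAMPLE_LOG)
nbnd = int(info["Number of bands"])
mesh = tuple(int(n) for n in re.findall(r"\d+", info["Monkhorst-Pack mesh"]))
nk_total = int(re.match(r"(\d+) total", info["K-points"]).group(1))

# Consistency check: a full Monkhorst-Pack mesh has n1 * n2 * n3 k-points.
assert math.prod(mesh) == nk_total
print(nbnd, mesh, nk_total)
```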

In [2]:
# Mf for the target system
params = {
  "prefix": "svo",                        # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                              # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# remove "nbnd"
del params["nbnd"]
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# set "nbnd" = 5
params["nbnd"] = 5
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

  Quantum ESPRESSO reader
  -----------------------
  Number of spins                = 1
  Number of polarizations        = 1
  Number of bands                = 40
  Monkhorst-Pack mesh            = (2,2,2)
  K-points                       = 8 total, 4 in the IBZ
  Number of electrons            = 41.0
  Electron density energy cutoff = 360.000 a.u. | FFT mesh = (45,45,45)
  Wavefunction energy cutoff     = 51.704 a.u. | FFT mesh = (23,23,23), Number of PWs = 6859

  Quantum ESPRESSO reader
  -----------------------
  Number of spins                = 1
  Number of polarizations        = 1
  Number of bands                = 100
  Monkhorst-Pack mesh            = (2,2,2)
  K-points                       = 8 total, 4 in the IBZ
  Number of electrons            = 41.0
  Electron density energy cutoff = 360.000 a.u. | FFT mesh = (45,45,45)
  Wavefunction energy cutoff     = 51.704 a.u. | FFT mesh = (23,23,23), Number of PWs = 6859

  Quantum ESPRESSO reader
  -----------------------
  Numbe

## 🔹 Construct MLWFs via CoQuí's Wannier90 interface

<figure style="text-align: center;"> <img src="../images/coqui_workflow_wan90.png" alt="Workflow of CoQui's Wannier90 interface" width="60%"> <figcaption><em>Figure&nbsp;1:</em> Workflow of CoQuí's Wannier90 interface. </figcaption> </figure>

Maximally localized Wannier functions (MLWFs) [[1](https://journals.aps.org/prb/abstract/10.1103/PhysRevB.65.035109), [2](https://journals.aps.org/prb/abstract/10.1103/PhysRevB.56.12847)] are a key intermediate between different components in CoQuí. They serve two essential purposes:
- Providing a **compact real-space representation** that allows **interpolation** of spectral properties to dense k-meshes.  
- Defining **physically motivated subspaces** needed for embedding methods such as DMFT or EDMFT.

In practice, MLWFs are usually obtained by **coupling DFT codes with Wannier90**, either in standalone mode (through file-based workflows with `.win`, `.amn`, `.mmn` files) or in library mode (where Wannier90 routines are called directly by the host code). 

**CoQuí instead provides a direct Python interface to Wannier90.** 
This lets you run the entire Wannierization procedure directly inside a CoQuí script. It keeps the workflow unified, avoids redundant file handling, and ensures that MLWF construction integrates seamlessly with later CoQuí steps. Because the interface is not tied to a specific DFT code, it also enables MLWFs to be constructed from electronic structures obtained *beyond DFT* within CoQuí.

---
### Example: Calling Wannier90 from CoQuí
```python
from mpi4py import MPI
import coqui

# Create CoQui MPI handler and set logging verbosity 
coqui_mpi = coqui.MpiHandler()
coqui.set_verbosity(coqui_mpi, output_level=1)

# 1) Build a mean-field handle from existing QE outputs
params = {
  "prefix": "svo",                        # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                              # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# 2) Run Wannier90 through CoQuí
w90_params = {
  "prefix": "svo",     # equivalent to wannier90's seedname 
}
coqui.wannier90(mf=mf, params=w90_params)
```
In this example, we first declare the **simulated system** (`Mf`) from QE outputs, then call `coqui.wannier90`. This routine runs Wannier90 in library mode, performs the Wannierization, and stores the MLWFs in a CoQuí HDF5 archive.

#### ❓ **What happens within `coqui.wannier90`?**  
Under the hood, CoQuí executes the standard Wannier90 library-mode workflow:
1. **Preprocessing** – read the `{prefix}.win` file and initialize internal data structures (`wannier_setup`).
2. **Interface step** – compute overlaps and projections between Bloch states on neighboring k-points (the role of `pw2wannier90.x` in file-based mode).
3. **Disentanglement (if requested)** – optimize the subspace when more bands are present than needed for Wannierization.
4. **Localization** – iteratively minimize the spread functional to obtain maximally localized Wannier functions.
5. **Standard Wannier90 outputs** – a collection of text/binary files (`*.wout`, `*.chk`, etc.) are produced, just as in standalone mode.

**CoQuí adds one more step:**  
6. **HDF5 export** – the Wannier90 outputs are automatically read back and stored in a **standardized HDF5 archive**. This HDF5 serves as the **canonical format** for subsequent CoQuí routines, ensuring that MLWF data can be passed seamlessly to interpolation, cRPA, DMFT/EDMFT, and other workflows.

> 💡 **Note**: The `{prefix}.win` file must be prepared in advance by the user.  
> It only needs to contain the **Wannierization parameters** (e.g. number of Wannier functions, projections, excluded bands, disentanglement settings).  
> Crystal metadata such as lattice vectors and k-points is already read from `coqui.Mf` and does not need to be repeated in the `{prefix}.win` file.
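
To make "Wannierization parameters only" concrete, the sketch below renders a small `.win` fragment from Python. The keywords (`num_wann`, `num_iter`, the `begin projections` block) are standard Wannier90 input syntax, but the helper `render_win` and the values shown are illustrative assumptions, not the contents of the `svo.win` shipped with this tutorial. When more bands are read than Wannier functions are requested, excluded bands or disentanglement windows would also be needed.

```python
def render_win(num_wann, projections, extra=None):
    """Render Wannierization-only settings in Wannier90's `.win` syntax."""
    lines = [f"num_wann = {num_wann}"]
    for key, value in (extra or {}).items():
        lines.append(f"{key} = {value}")
    # No unit cell, atomic positions, or k-points: those come from `Mf`.
    lines += ["", "begin projections", *projections, "end projections", ""]
    return "\n".join(lines)


# Hypothetical example: three t2g-like functions centered on vanadium.
win_text = render_win(
    num_wann=3,
    projections=["V:dxy;dxz;dyz"],
    extra={"num_iter": 200},
)
print(win_text)
```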

## ▶️ Hands-on 2: MLWFs of SrVO$_{3}$

Construct the MLWFs via CoQuí's Wannier90 interface for SrVO$_3$. 

Step 1: Copy the `.win` file to the current directory by executing the cell below. 

In [3]:
%%bash
cp data/qe_inputs/svo/222/mlwf/svo.win .

Step 2: Execute the following Python snippet to call Wannier90. 
```python
# 1) Build a mean-field handle from existing QE outputs
params = {
  "prefix": "svo",                    # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                          # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# 2) Run Wannier90 through CoQuí
w90_params = {
  "prefix": "svo",     # equivalent to wannier90's seedname 
}
coqui.wannier90(mf=mf, params=w90_params)
```
1. How many MLWFs do you get?
2. What are their spreads and centers? 
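
Both questions can be answered from the standard Wannier90 output file, assuming the run writes `svo.wout` as in standalone mode. The sketch below extracts centres and spreads from the `Final State` block of a `.wout` file. The numbers in `SAMPLE_WOUT` are placeholders that only mimic the file layout; they are not results for SrVO$_3$.

```python
import re

# Placeholder text in the layout of a Wannier90 .wout file (not real results).
SAMPLE_WOUT = """\
 Initial State
  WF centre and spread    1  (  0.000000,  0.000000,  0.000000 )     9.00000000
 Final State
  WF centre and spread    1  (  0.000000,  0.000000,  0.000000 )     1.00000000
  WF centre and spread    2  (  0.000000,  0.000000,  0.000000 )     1.00000000
  WF centre and spread    3  (  0.000000,  0.000000,  0.000000 )     1.00000000
  Sum of centres and spreads (  0.000000,  0.000000,  0.000000 )     3.00000000
"""

WF_LINE = re.compile(
    r"WF centre and spread\s+(\d+)\s+"
    r"\(\s*([-\d.]+),\s*([-\d.]+),\s*([-\d.]+)\s*\)\s+([-\d.]+)"
)


def final_wannier_functions(wout_text):
    """Return index, centre, and spread for each WF in the last 'Final State' block."""
    _, found, tail = wout_text.rpartition("Final State")
    if not found:
        raise ValueError("no 'Final State' block found")
    return [
        {
            "index": int(m[1]),
            "centre": (float(m[2]), float(m[3]), float(m[4])),
            "spread": float(m[5]),
        }
        for m in WF_LINE.finditer(tail)
    ]


wfs = final_wannier_functions(SAMPLE_WOUT)
print(len(wfs), "Wannier functions")
for wf in wfs:
    print(wf["index"], wf["centre"], wf["spread"])
```

On a real run, replace `SAMPLE_WOUT` with `open("svo.wout").read()`.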

In [4]:
# 1) Build a mean-field handle from existing QE outputs
params = {
  "prefix": "svo",                    # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                          # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# 2) Run Wannier90 through CoQuí
w90_params = {
  "prefix": "svo",     # equivalent to wannier90's seedname 
}
coqui.wannier90(mf=mf, params=w90_params)

  Quantum ESPRESSO reader
  -----------------------
  Number of spins                = 1
  Number of polarizations        = 1
  Number of bands                = 40
  Monkhorst-Pack mesh            = (2,2,2)
  K-points                       = 8 total, 4 in the IBZ
  Number of electrons            = 41.0
  Electron density energy cutoff = 360.000 a.u. | FFT mesh = (45,45,45)
  Wavefunction energy cutoff     = 51.704 a.u. | FFT mesh = (23,23,23), Number of PWs = 6859

*************************************************
       Running Wannier90 in library-mode         
*************************************************

 *---------------------------------- K-MESH ----------------------------------*
 +----------------------------------------------------------------------------+
 |                    Distance to Nearest-Neighbour Shells                    |
 |                    ------------------------------------                    |
 |          Shell             Distance (Ang^-1)          Mu

Step 3: Execute the cell below to clean up the Wannier90 outputs.

In [6]:
%%bash
# clean up Wannier90 outputs
if ls svo_* 1> /dev/null 2>&1; then
    rm svo_*
fi

## ▶️ Hands-on 3: MLWFs of SrVO$_{3}$ with large energy window

In this exercise you will modify `svo.win` to construct MLWFs for both V $d$ and O $p$ states, using a **large energy window** with disentanglement. 

**Steps**  
1. Edit `svo.win` to include the O $p$ bands and the full V $d$ shell.  
2. Run `coqui.wannier90` again to generate the new MLWFs.  
3. Compare the results with the MLWFs obtained from a smaller energy window.  

**Checklist**
- The spreads of the MLWFs should be **noticeably smaller** than those from the small-window case.  
- Each oxygen atom should have **three well-localized $p$ orbitals**.

> 💡 Note: The positions of the oxygen atoms can be found in `data/qe_inputs/svo/222/out/svo.nscf.out`
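
Before editing `svo.win`, it is worth counting how many Wannier functions the V $d$ + O $p$ model needs. The short sketch below does the arithmetic for one SrVO$_3$ formula unit (one Sr, one V, three O), with a full $d$ shell on V and a $p$ shell on each O. The variable names are illustrative.

```python
# Number of orbitals in a full shell of each angular momentum.
ORBITALS_PER_SHELL = {"s": 1, "p": 3, "d": 5, "f": 7}

# One SrVO3 formula unit per cell.
atoms_per_cell = {"Sr": 1, "V": 1, "O": 3}

# Shells to Wannierize in the large-window model.
projections = {"V": "d", "O": "p"}

num_wann = sum(
    atoms_per_cell[element] * ORBITALS_PER_SHELL[shell]
    for element, shell in projections.items()
)
print(num_wann)  # 1 * 5 (V d) + 3 * 3 (O p)
```

The resulting count is the value to compare against `num_wann` and the projections block in your edited `svo.win`.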

In [7]:
%%bash
cp data/qe_inputs/svo/222/mlwf_dp/svo.win .

In [8]:
# 1) Build a mean-field handle from existing QE outputs
params = {
  "prefix": "svo",                    # QE prefix (matches {prefix}.save)
  "outdir": "data/qe_inputs/svo/222/out", # QE outdir containing {prefix}.save/
  "nbnd": 40                          # number of bands read from QE outputs
}
mf = coqui.make_mf(coqui_mpi, params=params, mf_type="qe")

# 2) Run Wannier90 through CoQuí
w90_params = {
  "prefix": "svo",     # equivalent to wannier90's seedname 
}
coqui.wannier90(mf=mf, params=w90_params)

  Quantum ESPRESSO reader
  -----------------------
  Number of spins                = 1
  Number of polarizations        = 1
  Number of bands                = 40
  Monkhorst-Pack mesh            = (2,2,2)
  K-points                       = 8 total, 4 in the IBZ
  Number of electrons            = 41.0
  Electron density energy cutoff = 360.000 a.u. | FFT mesh = (45,45,45)
  Wavefunction energy cutoff     = 51.704 a.u. | FFT mesh = (23,23,23), Number of PWs = 6859

*************************************************
       Running Wannier90 in library-mode         
*************************************************

 *---------------------------------- K-MESH ----------------------------------*
 +----------------------------------------------------------------------------+
 |                    Distance to Nearest-Neighbour Shells                    |
 |                    ------------------------------------                    |
 |          Shell             Distance (Ang^-1)          Mu

In [9]:
%%bash
# clean up Wannier90 outputs
if ls svo_* 1> /dev/null 2>&1; then
    rm svo_*
fi