# YamboConvergence: automated GW/BSE convergence

The highest-level workflow is the ``YamboConvergence`` workchain, 
which fully automates the convergence algorithm described in [Bonacci, M., Qiao, J., Spallanzani, N. et al. Towards high-throughput many-body perturbation theory: efficient algorithms and automated workflows. npj Comput Mater 9, 74 (2023)](https://doi.org/10.1038/s41524-023-01027-2). 
Simulations are organized on the fly, without any user intervention. 
The purpose of the algorithm is to obtain an accurately converged 
result with as few calculations as possible. This requires a reliable description of the convergence space, which also yields a 
precise guess for the converged point, i.e. the converged parameters. The space is described by fitting the results of a few calculations that the workchain runs. 
A simple functional form is assumed for the space:

$f(\textbf{x}) = \prod_i^N \left( \frac{A_i}{x_i^{\alpha_i}} + b_i \right)$

With this form it is straightforward to compute first and second partial derivatives and to impose constraints on them to locate the converged region of the space. 
The algorithm is specifically designed to solve the coupled convergence between the 
summation over empty states (for example ``BndsRnXp`` or ``BndsRnXs``, and ``GbndRnge``) and the plane-wave (PW) expansion (``NGsBlkXp`` or ``NGsBlkXs``), but it can also be used to 
accelerate convergence tests with respect to the k-point mesh or ``FFTGvecs``, as shown later.
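
The functional form and the derivative constraint can be sketched in plain Python. This is only an illustration under stated assumptions: the coefficients `A`, `alpha` and `b` are made-up numbers rather than fitted values, the helper names are hypothetical, and only a first-derivative threshold is checked, whereas the workchain also constrains second derivatives.

```python
from math import prod

def model(x, A, alpha, b):
    """f(x) = prod_i (A_i / x_i**alpha_i + b_i)."""
    return prod(a / xi**al + bi for xi, a, al, bi in zip(x, A, alpha, b))

def partial_derivative(x, A, alpha, b, i):
    """Analytic df/dx_i: differentiate the i-th factor, keep the other factors."""
    others = prod(A[j] / x[j]**alpha[j] + b[j] for j in range(len(x)) if j != i)
    return -alpha[i] * A[i] / x[i]**(alpha[i] + 1) * others

def first_converged(grid, x_fixed, i, coeffs, threshold):
    """Smallest grid value along axis i with |df/dx_i| below the threshold."""
    for value in grid:
        x = list(x_fixed)
        x[i] = value
        if abs(partial_derivative(x, *coeffs, i)) < threshold:
            return value
    return None

# Hypothetical coefficients (A, alpha, b) for two parameters, e.g. bands and cutoff.
coeffs = ([50.0, 4.0], [1.0, 2.0], [2.0, 3.0])

# Scan the first parameter on a grid while the second one is kept at 8.
converged_bands = first_converged(range(50, 650, 50), [50, 8], 0, coeffs, 1e-3)
print(converged_bands)
```

Because the model is a product of one-dimensional factors, each partial derivative only requires differentiating a single factor, which keeps the search for the converged region cheap.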

Each simulation is performed by calling the `YamboWorkflow` workchain, which ensures automation at all levels.

In [2]:
from aiida import orm, load_profile
load_profile()

Profile<uuid='b35700dae723411ea16ebc82d58f16bc' name='mb'>

In [3]:
from aiida.plugins import WorkflowFactory

In [4]:
YamboConvergence = WorkflowFactory('yambo.yambo.yamboconvergence')

## Minimal inputs needed for protocols

In [5]:
options = {
    'pwcode_id': 'pw-7.1@hydralogin', 
    'pseudo_family':"PseudoDojo/0.4/PBE/SR/standard/upf",
    'yamboprecode_id':'p2y-5.1@hydralogin',
    'yambocode_id':'yambo-5.1@hydralogin',
    'protocol':'fast',
    #'parent_id':274, # optional: pk of a previous nscf calculation, used to skip the DFT part.
    'structure_id':161,
}

In [7]:
from aiida_quantumespresso.common.types import ElectronicType

In [8]:
builder = YamboConvergence.get_builder_from_protocol(
            pw_code = options['pwcode_id'],
            preprocessing_code = options['yamboprecode_id'],
            code = options['yambocode_id'],
            protocol=options['protocol'],
            protocol_qe=options['protocol'],
            structure=orm.load_node(options['structure_id']),
            overrides={},
            #parent_folder=orm.load_node(options['parent_id']).outputs.remote_folder,
            electronic_type=ElectronicType.INSULATOR, #default is METAL: smearing is used
            calc_type='gw', #or 'bse'; default is 'gw'
)

Summary of the main inputs:
BndsRnXp = 200
GbndRnge = 200
NGsBlkXp = 6 Ry
FFTGvecs = 18 Ry


kpoint mesh for nscf: [6, 6, 2]




In [9]:
#You can also try different protocols:
    
YamboConvergence.get_available_protocols()

{'fast': {'description': 'Fast protocol for a GW convergence: grid -> poor; thresholds -> poor'},
 'moderate': {'description': 'Moderate protocol for a GW convergence: grid -> enough good for standard materials; thresholds -> moderate (5 percent)'},
 'precise': {'description': 'precise protocol for a GW convergence: grid -> same as moderate; thresholds -> precise (1 percent)'},
 'molecule': {'description': 'Moderate protocol for a GW convergence in molecules'}}

If you now inspect the prepopulated inputs, you can see the default values set by the chosen protocol:

In [10]:
builder.ywfl.nscf.pw.parameters.get_dict()

{'CONTROL': {'calculation': 'nscf',
  'forc_conv_thr': 0.001,
  'tprnfor': True,
  'tstress': True,
  'etot_conv_thr': 0.0004},
 'SYSTEM': {'nosym': False,
  'occupations': 'fixed',
  'ecutwfc': 60.0,
  'ecutrho': 480.0,
  'force_symmorphic': True,
  'nbnd': 200},
 'ELECTRONS': {'electron_maxstep': 80,
  'mixing_beta': 0.4,
  'conv_thr': 1.6e-09}}

In [11]:
builder.ywfl.yres.yambo.parameters.get_dict()

{'arguments': ['dipoles', 'ppa', 'HF_and_locXC', 'gw0'],
 'variables': {'Chimod': 'hartree',
  'DysSolver': 'n',
  'GTermKind': 'BG',
  'X_and_IO_nCPU_LinAlg_INV': [1, ''],
  'NGsBlkXp': [6, 'Ry'],
  'FFTGvecs': [18, 'Ry'],
  'BndsRnXp': [[1, 200], ''],
  'GbndRnge': [[1, 200], ''],
  'QPkrange': [[[1, 1, 32, 32]], '']}}

### Computational resources

In [12]:
builder.ywfl.scf.pw.metadata.options = {
    'max_wallclock_seconds': 2*60*60, # in seconds
    'resources': {
            "num_machines": 1, # nodes
            "num_mpiprocs_per_machine": 16, # MPI per nodes
            "num_cores_per_mpiproc": 1, # OPENMP
        },
    'prepend_text': u"export OMP_NUM_THREADS="+str(1), # if needed
    #'account':'project_name',
    'queue_name':'s3par',
    #'qos':'',
}

builder.ywfl.nscf.pw.metadata.options = builder.ywfl.scf.pw.metadata.options
builder.ywfl.yres.yambo.metadata.options = builder.ywfl.scf.pw.metadata.options

### Overrides

The default inputs can also be modified while the builder is being created, rather than afterwards. This is done through overrides:

In [13]:
overrides_scf = {
    'pseudo_family': "PseudoDojo/0.4/PBE/SR/standard/upf",
    'pw': {
        'metadata': {
            'options': {
                'max_wallclock_seconds': 60*60, # in seconds
                'resources': {
                    "num_machines": 1, # nodes
                    "num_mpiprocs_per_machine": 16, # MPI per nodes
                    "num_cores_per_mpiproc": 1, # OPENMP
                },
                'prepend_text': u"export OMP_NUM_THREADS="+str(1), # if needed
                #'account':'project_name',
                'queue_name': 's3par',
                #'qos':'',
            },
        },
    },
}

overrides_nscf = {
    'pseudo_family': "PseudoDojo/0.4/PBE/SR/standard/upf",
    'pw': {
        'parameters': {
            'CONTROL': {}, #not needed if you don't override something
            'SYSTEM': {},
            'ELECTRONS': {'diagonalization': 'cg'},
        },
        'metadata': {
            'options': {
                'max_wallclock_seconds': 60*60, # in seconds
                'resources': {
                    "num_machines": 1, # nodes
                    "num_mpiprocs_per_machine": 16, # MPI per nodes
                    "num_cores_per_mpiproc": 1, # OPENMP
                },
                'prepend_text': u"export OMP_NUM_THREADS="+str(1), # if needed
                #'account':'project_name',
                'queue_name': 's3par',
                #'qos':'',
            },
        },
    },
}

overrides_yambo = {
    "yambo": {
        "parameters": {
            "arguments": [
                "rim_cut",
            ],
            "variables": {
                "NGsBlkXp": [4, "Ry"],
                "FFTGvecs": [20, "Ry"],
            },
        },
        'metadata': {
            'options': {
                'max_wallclock_seconds': 60*60, # in seconds
                'resources': {
                    "num_machines": 1, # nodes
                    "num_mpiprocs_per_machine": 16, # MPI per nodes
                    "num_cores_per_mpiproc": 1, # OPENMP
                },
                'prepend_text': u"export OMP_NUM_THREADS="+str(1), # if needed, e.g. with PBS/Torque
                #'account':'project_name',
                'queue_name': 's3par',
                #'qos':'',
            },
        },
    },
}


### Providing the convergence space

In [None]:

#Be careful with the mesh choice!!! 
overrides_meta = {
        'FFTGvecs': {
            'start_ratio': 0.25,
            'stop_ratio': 0.7,
            'delta_ratio': 0.1,
            'max_ratio': 1,
        },
        'bands': {
            'start': 50,
            'stop': 400,
            'delta': 50,
            'max': 600,
            'ratio':[10,25,50],
        },
        'G_vectors': {
            'start': 2,
            'stop': 8,
            'delta': 1,
            'max': 10,
        },
        'kpoint_density': {
            'start': 0.8,
            'stop': 0.2,
            'delta': 3,
            'max': 0.1,
        } ,
        'conv_thr_k': 10,
        'conv_thr_bG': 10,
        'conv_thr_FFT': 10,
        'conv_thr_units': '%', # 'eV'

        
    }

## Providing additional convergence settings

The following additional convergence settings are provided:

- 'what': a list of quantities to be computed, following the same naming style as the `YamboWorkflow` additional parsing list;
- 'type': 'heavy' or 'cheap'. With 'heavy', the converged parameters obtained in previous iterations are kept: for example, if the k-mesh is converged first and then the bands, the bands convergence uses the converged k-mesh. This makes the calculations more and more computationally demanding, but in the end it yields the true converged results, not only the converged parameter values.

In [None]:
overrides_wfl_settings = {
        
        'what':['gap_'],
        'type': 'heavy', #or cheap; heavy uses converged value for parameters that we are not converging in a given iteration.
        
    }
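
The effect of the `type` setting can be pictured with a toy function. This is not the plugin's implementation: `fixed_values` is a hypothetical helper, and the behaviour shown for 'cheap' (keeping the starting values) is an assumption, since only 'heavy' is described above.

```python
def fixed_values(mode, starting, converged):
    """Values used for the parameters that are NOT being converged in a step.

    Toy model only: 'heavy' carries forward previously converged values,
    while 'cheap' is ASSUMED here to keep the starting values.
    """
    if mode == 'heavy':
        return {**starting, **converged}
    if mode == 'cheap':
        return dict(starting)
    raise ValueError(f"unknown mode: {mode!r}")

starting = {'kpoint_mesh': [4, 4, 2], 'BndsRnXp': 200}
converged = {'kpoint_mesh': [7, 7, 4]}  # result of an earlier k-mesh step

heavy = fixed_values('heavy', starting, converged)
cheap = fixed_values('cheap', starting, converged)
print(heavy['kpoint_mesh'], cheap['kpoint_mesh'])
```

In the 'heavy' picture, every later step runs on the denser converged mesh, which is why the cost grows from one step to the next.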

In [1]:
#setting the overrides dictionary.

overrides = {
        'meta_parameters':overrides_meta,
        'ywfl':{'scf':overrides_scf,'nscf':overrides_nscf,'yres':overrides_yambo},
        'workflow_settings':overrides_wfl_settings,
    }


In [14]:
builder = YamboConvergence.get_builder_from_protocol(
            pw_code = options['pwcode_id'],
            preprocessing_code = options['yamboprecode_id'],
            code = options['yambocode_id'],
            protocol=options['protocol'],
            protocol_qe=options['protocol'],
            structure= orm.load_node(options['structure_id']),
            overrides=overrides,
            #parent_folder=orm.load_node(options['parent_id']).outputs.remote_folder,
            electronic_type=ElectronicType.INSULATOR, #default is METAL: smearing is used
            calc_type='gw', #or 'bse'; default is 'gw'
)

family = orm.load_group("PseudoDojo/0.4/PBE/SR/standard/upf")
structure = orm.load_node(options['structure_id'])  # same structure used to build the inputs
#builder.<sublevels_up_to .pw>.pseudos = family.get_pseudos(structure=structure) 
builder.ywfl.scf.pw.pseudos = family.get_pseudos(structure=structure) 
builder.ywfl.nscf.pw.pseudos = family.get_pseudos(structure=structure) 

Summary of the main inputs:
BndsRnXp = 200
GbndRnge = 200
NGsBlkXp = 4 Ry
FFTGvecs = 20 Ry


kpoint mesh for nscf: [6, 6, 2]




In [31]:
builder.parameters_space.get_list()

[{'max': 84,
  'var': ['FFTGvecs'],
  'stop': 58,
  'delta': 8,
  'start': 21,
  'steps': 4,
  'conv_thr': 10,
  'conv_thr_units': '%',
  'max_iterations': 4,
  'convergence_algorithm': 'new_algorithm_1D'},
 {'max': [30, 30, 10],
  'var': ['kpoint_mesh'],
  'stop': [16, 16, 6],
  'delta': [3, 3, 3],
  'start': [4, 4, 2],
  'steps': 4,
  'conv_thr': 10,
  'conv_thr_units': '%',
  'max_iterations': 4,
  'convergence_algorithm': 'new_algorithm_1D'},
 {'max': [600, 600, 10],
  'var': ['BndsRnXp', 'GbndRnge', 'NGsBlkXp'],
  'stop': [400, 400, 8],
  'delta': [50, 50, 1],
  'start': [80, 80, 2],
  'steps': 6,
  'conv_thr': 10,
  'conv_thr_units': '%',
  'max_iterations': 8,
  'convergence_algorithm': 'new_algorithm_2D'}]
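
The `FFTGvecs` entry above is consistent with the ratios given in `overrides_meta` being multiplied by the reported `max` of 84 Ry and truncated to integers. The short check below only reproduces that arithmetic; the truncation rule and the use of 84 as the reference value are assumptions inferred from the printed numbers, not a statement about the plugin's internals.

```python
# Assumption: ratios are applied to the reported 'max' (84 Ry) and truncated.
reference = 84
ratios = {'start': 0.25, 'stop': 0.7, 'delta': 0.1, 'max': 1}

fft_space = {key: int(ratio * reference) for key, ratio in ratios.items()}
print(fft_space)
```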

### Parameter-dependent resources

Increasing the simulation parameters may also require changing the computational resources, so that the calculations can complete successfully. 
Before submitting the workchain, we can provide instructions on how to continuously adapt the resources as the parameters change.

In particular, we provide two dictionaries:
- parallelism instructions
- explicit resources instructions

In [2]:
dict_para_medium = {}
dict_para_medium['X_and_IO_CPU'] = '2 1 1 8 1'
dict_para_medium['X_and_IO_ROLEs'] = 'q k g c v'
dict_para_medium['DIP_CPU'] = '1 16 1'
dict_para_medium['DIP_ROLEs'] = 'k c v'
dict_para_medium['SE_CPU'] = '1 2 8'
dict_para_medium['SE_ROLEs'] = 'q qp b'

dict_res_medium = {
        "num_machines": 1,
        "num_mpiprocs_per_machine":8,
        "num_cores_per_mpiproc":2,
    }

dict_para_high = {}
dict_para_high['X_and_IO_CPU'] = '2 1 1 8 1' 
dict_para_high['X_and_IO_ROLEs'] = 'q k g c v'
dict_para_high['DIP_CPU'] = '1 16 1'
dict_para_high['DIP_ROLEs'] = 'k c v'
dict_para_high['SE_CPU'] = '1 2 8'
dict_para_high['SE_ROLEs'] = 'q qp b'

dict_res_high = {
        "num_machines": 1,
        "num_mpiprocs_per_machine":16,
        "num_cores_per_mpiproc":1,
    }

parallelism_instructions_manual = orm.Dict(dict={'manual': {
    'std_1': {
        'BndsRnXp': [1, 100], #range for bands where to use the dict_para_medium and dict_res_medium instructions.
        'NGsBlkXp': [2, 18],
        'parallelism': dict_para_medium,
        'resources': dict_res_medium,
    },
    'std_2': {
        'BndsRnXp': [101, 1000],
        'NGsBlkXp': [2, 18],
        'parallelism': dict_para_high,
        'resources': dict_res_high,
    },
}})


If we are unsure about the explicit parallelism instructions, we can instead provide, together with the resources, the `mode` that yambo uses to set up its parallelism automatically.

In [None]:
parallelism_instructions_auto = orm.Dict(dict={'automatic': {
    'std_1': {
        'BndsRnXp': [1, 100],
        'NGsBlkXp': [1, 18],
        'mode': 'balanced',
        'resources': dict_res_medium,
    },
    'std_2': {
        'BndsRnXp': [101, 1000],
        'NGsBlkXp': [1, 18],
        'mode': 'memory',
        'resources': dict_res_high,
    },
}})

In [None]:
builder.parallelism_instructions = parallelism_instructions_auto
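
The range-based instructions can be read as a lookup: for given values of `BndsRnXp` and `NGsBlkXp`, the set whose ranges contain both values applies. The sketch below illustrates that reading only. `select_instructions` is a hypothetical helper, not the plugin's lookup, and it assumes inclusive ranges and that the band value compared is the upper bound of the band range.

```python
def select_instructions(instructions, bands, g_cutoff):
    """Return (label, entry) of the first set whose ranges contain both values.

    Hypothetical helper: it only illustrates the range semantics and is not
    the lookup implemented by the plugin.
    """
    for label, entry in instructions.items():
        b_min, b_max = entry['BndsRnXp']
        g_min, g_max = entry['NGsBlkXp']
        if b_min <= bands <= b_max and g_min <= g_cutoff <= g_max:
            return label, entry
    return None, None

# Same ranges and modes as the 'automatic' instructions, resources omitted.
instructions = {
    'std_1': {'BndsRnXp': [1, 100], 'NGsBlkXp': [1, 18], 'mode': 'balanced'},
    'std_2': {'BndsRnXp': [101, 1000], 'NGsBlkXp': [1, 18], 'mode': 'memory'},
}

label_small, _ = select_instructions(instructions, bands=80, g_cutoff=2)
label_large, entry_large = select_instructions(instructions, bands=400, g_cutoff=8)
print(label_small, label_large, entry_large['mode'])
```

Values outside every range return `(None, None)` in this sketch, which is a reminder to make the ranges cover the whole convergence space defined earlier.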

## Providing a group to collect all the convergence simulations

When `YamboConvergence` is submitted, it automatically creates a group in which all the simulations are collected. Each time a simulation is ready to be submitted, the workchain checks the group to see
whether an identical simulation has already been performed. If so, the submission is skipped and the existing results are reused in the analysis. This is a sort of ad hoc [caching](https://aiida.readthedocs.io/projects/aiida-core/en/latest/topics/provenance/caching.html), which however does not duplicate the nodes involved. 

We prefer to reuse the results because even the retrieved files of yambo simulations are often heavy (`ndb.QP` for example).

It is also possible to provide a custom group, by means of the `group_label` input String.

In [19]:
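from aiida.common.exceptions import NotExistent

try:
    g = orm.load_group('tutorial/hBN/convergence')
except NotExistent:
    # the group does not exist yet: create and store it
    g = orm.Group(label='tutorial/hBN/convergence')
    g.store()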

In [20]:
builder.group_label = orm.Str('tutorial/hBN/convergence') # CLI equivalent: verdi group create tutorial/hBN/convergence; all calculations are added to the group

### Submission

In [22]:
from aiida.engine import submit

In [23]:
run = None

In [24]:
if run:
    print('run is already running -> {}'.format(run.pk))
    print('Are you sure you want to run again? If so, copy the else instruction into the cell below and run it!')
else:
    run = submit(builder)

print(run)



uuid: 3eb9b563-99d5-4d84-90c3-4f359f037aeb (pk: 2150) (aiida.workflows:yambo.yambo.yamboconvergence)


## Inspecting the workflow

The process report shows the sequence of steps, including which parameters are converged first.

In [30]:
!verdi process report {run.pk}

2024-01-08 14:20:09 [855  | REPORT]: [2150|YamboConvergence|start_workflow]: group: tutorial/hBN/convergence
2024-01-08 14:20:09 [856  | REPORT]: [2150|YamboConvergence|start_workflow]: Workflow type: heavy; looking for convergence of ['gap_']
2024-01-08 14:20:09 [857  | REPORT]: [2150|YamboConvergence|start_workflow]: Workflow initilization step completed, the parameters will be: ['FFTGvecs'].
2024-01-08 14:20:10 [858  | REPORT]: [2150|YamboConvergence|has_to_continue]: Still iteration on ['FFTGvecs']
2024-01-08 14:20:10 [859  | REPORT]: [2150|YamboConvergence|pre_needed]: {'FFTGvecs': [21, 37, 45, 58], 'kpoint_mesh': [[4, 4, 2], [7, 7, 2], [13, 13, 5], [16, 16, 6]], 'BndsRnXp': [80, 400, 80, 180, 280, 400], 'NGsBlkXp': [2, 2, 8, 6, 4, 8], 'GbndRnge': [80, 400, 80, 180, 280, 400]}
2024-01-08 14:20:10 [860  | REPORT]: [2150|YamboConvergence|pre_needed]: ['GW bands are: 200', 'scf inputs found', 'nscf inputs found']
2024-01-08 14:20:10 [861  | REPORT]: [2150|YamboConvergence|do_pre

## Output analysis

Suppose that your calculation completed successfully; you can then access the outputs via the `outputs` attribute of the `run` instance: 

In [47]:
run.is_finished_ok

True

The converged parameters can be obtained via the "infos" output Dict:

In [32]:
run.outputs.infos.get_dict()

{'gap_': 6.0239447147232,
 'E_ref': 5.5690319668779,
 'BndsRnXp': 80.0,
 'FFTGvecs': 21,
 'GbndRnge': 80.0,
 'NGsBlkXp': 2.0,
 'kpoint_mesh': [7, 7, 4],
 'extrapolation': 5.4063928865123}

The full convergence history can be visualized in a table form using pandas:

In [49]:
import pandas as pd

In [50]:
history = run.outputs.history.get_dict()

In [51]:
history_table = pd.DataFrame(history)

In [52]:
history_table

|   | gap_ | uuid | failed | useful | BndsRnXp | FFTGvecs | GbndRnge | NGsBlkXp | global_step | kpoint_mesh | parameters_studied |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5.87903 | e599d05f-07a3-4ab5-9256-ac1bbbe88e89 | False | False | 200 | 21 | 200 | 4 | 1 | [4, 4, 2] | FFTGvecs |
| 1 | 5.865918 | a36714a5-9b99-40dc-abb7-aa53472f97cc | False | False | 200 | 37 | 200 | 4 | 2 | [4, 4, 2] | FFTGvecs |
| 2 | 5.865277 | 9a033a7b-54e6-409c-8b35-bfe627dada4a | False | False | 200 | 45 | 200 | 4 | 3 | [4, 4, 2] | FFTGvecs |
| 3 | 5.859447 | 64455883-6bb4-41c5-8951-a6b6ff0db1d4 | False | False | 200 | 58 | 200 | 4 | 4 | [4, 4, 2] | FFTGvecs |
| 4 | 5.87903 | e599d05f-07a3-4ab5-9256-ac1bbbe88e89 | False | False | 200 | 21 | 200 | 4 | 5 | [4, 4, 2] | kpoint_mesh |
| 5 | 5.566768 | 5b7b85aa-e5a3-483f-94c3-b47d0512a30e | False | False | 200 | 21 | 200 | 4 | 6 | [7, 7, 2] | kpoint_mesh |
| 6 | 5.444812 | c81befb4-eaf6-4c47-89f9-582cda7e557f | False | False | 200 | 21 | 200 | 4 | 7 | [13, 13, 5] | kpoint_mesh |
| 7 | 5.39493 | b7b6120d-3f16-414c-9019-2884a1fa299a | False | False | 200 | 21 | 200 | 4 | 8 | [16, 16, 6] | kpoint_mesh |
| 8 | 5.701608 | f745b588-b406-4cb7-95a4-68d1620a11a2 | False | False | 200 | 21 | 200 | 4 | 9 | [7, 7, 4] | kpoint_mesh |
| 9 | 6.02394 | ed6f567b-f60c-4e55-8394-d8ad309a46b1 | False | True | 80 | 21 | 80 | 2 | 10 | [7, 7, 2] | BndsRnXp, GbndRnge, NGsBlkXp |


The calculations flagged as `useful`, here the last one, can be selected with:

In [53]:
history_table[history_table['useful']==True]

|   | gap_ | uuid | failed | useful | BndsRnXp | FFTGvecs | GbndRnge | NGsBlkXp | global_step | kpoint_mesh | parameters_studied |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | 6.02394 | ed6f567b-f60c-4e55-8394-d8ad309a46b1 | False | True | 80 | 21 | 80 | 2 | 10 | [7, 7, 2] | BndsRnXp, GbndRnge, NGsBlkXp |


In [28]:
# TODO: some fine plotting of the results, similar to the ones in the paper.
# For each entry in parameters_studied, collect the data; subplots could show everything at once.
# plotly is a nice option for this: https://plotly.com/python/3d-line-plots/
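
The "collect the data" step of the note above can be sketched without any plotting library. The snippet groups the history rows by `parameters_studied`, so that each group can later go into its own subplot. It rests on assumptions: `collect_by_parameter` is a hypothetical helper, the history is taken to be laid out as column name to list of values, and `parameters_studied` is taken to be a comma-separated string as displayed in the table. Only a few rows are copied from the table.

```python
from collections import defaultdict

# A few rows copied from the history table, laid out column -> list.
# (Assumption: this mirrors the layout that pd.DataFrame(history) receives.)
history = {
    'gap_': [5.87903, 5.865918, 5.865277, 5.859447, 5.87903, 5.566768],
    'FFTGvecs': [21, 37, 45, 58, 21, 21],
    'kpoint_mesh': [[4, 4, 2]] * 5 + [[7, 7, 2]],
    'parameters_studied': ['FFTGvecs'] * 4 + ['kpoint_mesh'] * 2,
}

def collect_by_parameter(history, what='gap_'):
    """Group (parameter values, result) pairs by the parameters being studied."""
    series = defaultdict(list)
    for idx, studied in enumerate(history['parameters_studied']):
        # A step may study several parameters at once, e.g. "BndsRnXp, GbndRnge".
        names = studied.split(', ')
        x = tuple(history[name][idx] for name in names)
        series[studied].append((x, history[what][idx]))
    return dict(series)

series = collect_by_parameter(history)
for studied, points in series.items():
    print(studied, points)
```

Each value of `series` is a list of points ready to be passed to a plotting library of choice, one subplot per key.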