# Programming for Chemistry 2025/2026 @ UniMI

![logo](logo_small.png "Logo")

## Lecture 17: Atomic Simulation Environment (ASE)

The Atomic Simulation Environment [ASE](https://ase-lib.org/) is a set of tools and Python modules for setting up, manipulating, running, visualizing and analyzing atomistic simulations.
Do you remember the `Molecule` class we developed in the previous lecture? ASE extends this concept to represent not only molecules but also periodic crystals, or low-dimensional systems like surfaces.

In addition to that, ASE provides interfaces to different codes through **Calculators** which are used together with the central Atoms object and the many available algorithms in ASE.
![calculators](ase_calculators.png "ASE Calculators")


ASE is:
* **Flexible:** Since ASE is based on the Python scripting language it is possible to perform very complicated simulation tasks without any code modifications. For example, a sequence of calculations may be performed with the use of simple *for-loop* constructions. There exist ASE modules for performing many standard simulation tasks.
* **Customizable:** The Python code in ASE is structured in modules intended for different purposes. There are `ase.calculators` for calculating energies, forces and stresses, `ase.md` and `ase.optimize` modules for controlling the motion of atoms, `constraints` objects and filters for performing constrained simulations, modules for nudged-elastic-band calculations, etc. The modularity of the object-oriented code makes it simple to contribute new functionality to ASE.
* **Pythonic:** It fits nicely into the rest of the Python world with use of the popular **NumPy** package for numerical work. The use of the Python language allows ASE to be used both interactively as well as in scripts.


## 1. Getting started with ASE
ASE can be imported with `import ase`. You can also import its sub-packages, like `import ase.io`, or individual classes, like `from ase.calculators.siesta import Siesta`.

Since ASE depends heavily on NumPy, we also need to `import numpy as np`. If you have a C/C++ and a Fortran compiler, and optionally an MPI library (for parallel execution), you can also install the `gpaw` and `asap3` packages. The first provides a DFT-PAW *calculator*, the second provides empirical potentials for large-scale molecular dynamics. In this notebook we will simply use the built-in `EMT` *calculator*.

ASE should already be installed in Anaconda. Under Linux/WSL you can install the official packages from your distribution. Otherwise, you can install any version of ASE in a virtual environment using `pip` or `conda`.

Typically one does:
```bash
conda create -n myenvironment
conda activate myenvironment
conda install ase
conda install gpaw asap3    # must have C/C++ compilers
```
or
```bash
python -m venv myenvironment
. myenvironment/bin/activate
pip install ase
pip install gpaw asap3      # must have C/C++ compilers
```

In addition, if you want to use *external calculators* you must install other codes, like **Quantum Espresso**, **SIESTA**, etc.

Nowadays, many **Machine Learning Interatomic Potentials (MLIPs)**, like **MACE** and **MatterSim**, provide an *ASE calculator* interface. Refer to those packages for the installation procedure.

In [None]:
import ase
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

print(ase.__version__)

## 2. The `Atoms` class
The `Atoms` class is used to represent molecules and materials. At minimum, an `Atoms` object represents a collection of atoms of any chemical species and their positions.

### 2.1 Molecules and crystals
We can define a molecule with lists of symbols and positions, or for convenience we can compress the list of symbols to a chemical formula.

Next, we can visualize the object using `ase.visualize.view()`.

In [None]:
from ase import Atoms
from ase.visualize import view

In [None]:
# create a nitrogen molecule
d = 1.10
molecule = Atoms(['N', 'N'], positions=[(0., 0., 0.), (0., 0., d)])

# alternatively
molecule = Atoms('N2', positions=[(0., 0., 0.), (0., 0., d)])

# inspect the molecule object
print(type(molecule))
print(molecule)

In [None]:
# let's visualize the molecule in the ASE GUI
view(molecule)

In [None]:
# let's visualize the molecule in the jupyter notebook
#view(molecule, viewer='x3d')

To make a crystal we have to set the `cell` and `pbc` keywords

* If the cell is specified with three values, they are taken as the edge lengths of an orthorhombic cell (cubic if all three are equal). In other cases we might use the full 3x3 matrix to describe off-diagonal terms, e.g. `cell=[[a, -a, 0], [a, a, 0], [0, 0, a]]`
* Once a cell is specified we can use the `scaled_positions` keyword to specify atomic positions relative to lattice vectors (aka *fractional coordinates*)
* We set `pbc=True` to indicate periodic boundary conditions in all directions or along each direction, e.g. `pbc=[True, True, False]` for a *slab* calculation with exposed surfaces.
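
The relation between fractional and Cartesian coordinates can be sketched with plain NumPy. This is only an illustration, assuming the convention that each row of the cell matrix is one lattice vector; the primitive fcc cell and the variable names are chosen here for the example.

```python
import numpy as np

a = 5.387  # lattice parameter in Å (same value used for ZnS in this lecture)

# Primitive fcc cell: each ROW is one lattice vector (assumed convention)
cell = 0.5 * a * np.array([[0.0, 1.0, 1.0],
                           [1.0, 0.0, 1.0],
                           [1.0, 1.0, 0.0]])

# Fractional coordinates of the two atoms of a zincblende primitive cell
scaled = np.array([[0.00, 0.00, 0.00],
                   [0.25, 0.25, 0.25]])

# Cartesian position = sum_i s_i * (i-th lattice vector)
cartesian = scaled @ cell

# Going back: solve cell^T x = r for the fractional coordinates x
recovered = np.linalg.solve(cell.T, cartesian.T).T

print(cartesian)
print(recovered)
```

The second atom ends up at `(a/4, a/4, a/4)`, as expected for the zincblende structure.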

In [None]:
a = 5.387
crystal = Atoms('Zn4S4',
                scaled_positions=[[0., 0., 0.],
                                  [0., 0.5, 0.5],
                                  [0.5, 0., 0.5],
                                  [0.5, 0.5, 0.],
                                  [0.25, 0.75, 0.75],
                                  [0.25, 0.25, 0.25],
                                  [0.75, 0.75, 0.25],
                                  [0.75, 0.25, 0.75]],
                cell=[a, a, a],
                pbc=True)

view(crystal)

### 2.2 Access and change information in `Atoms`
We can easily extract the information about chemical species and atomic positions that we provided when creating the object. It is returned in the form of Python lists or NumPy ndarrays.

In [None]:
print("N2 symbols:", molecule.get_chemical_symbols())
print("N2 masses:", molecule.get_masses())
print("N2 center of mass:", molecule.get_center_of_mass())
print("N2 positions:")
print(molecule.get_positions())

In [None]:
print("Zn4S4 cell:", crystal.cell.cellpar())
print("Zn4S4 cartesian coordinates:")
print(crystal.get_positions())

The `Atoms` class behaves also as a **list**, thanks to the *special methods* like `__len__()` and `__getitem__()`. Thus we can iterate on the `Atoms` and each atom will be an object of the class `Atom`.
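
To see why these two special methods are enough, here is a toy class written only for illustration. `TinyAtoms` is a hypothetical stand-in and not how ASE implements `Atoms`.

```python
class TinyAtoms:
    """Hypothetical minimal container, for illustration only."""

    def __init__(self, symbols, positions):
        self.symbols = list(symbols)
        self.positions = [tuple(p) for p in positions]

    def __len__(self):
        # called by len(obj)
        return len(self.symbols)

    def __getitem__(self, i):
        # called by obj[i]; a for-loop calls it with 0, 1, 2, ...
        # until the underlying list raises IndexError
        return self.symbols[i], self.positions[i]


n2 = TinyAtoms(['N', 'N'], [(0., 0., 0.), (0., 0., 1.1)])
print(len(n2))

items = [item for item in n2]
for symbol, position in items:
    print(symbol, position)
```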

In [None]:
print("number of atoms:", len(crystal))
for atom in crystal:
    print(atom)

In [None]:
for i in range(len(crystal)):
    atom = crystal[i]
    print(atom.symbol, atom.index, atom.position)

### 2.3 Reading and writing `Atoms`
The `ase.io` submodule provides convenience functions to read and write `Atoms` from and to a variety of formats. These include many crystallographic formats (like CIF, XSF, VASP-POSCAR, ...) and molecular formats (like XYZ, EXTXYZ, PDB, ...). It can also read multiple `Atoms` frames from a Molecular Dynamics (MD) run or from an energy minimization.

Finally, `ase.io` is able to parse the output of many simulation codes (e.g. Quantum Espresso, Abinit, SIESTA, GULP, LAMMPS) and to write the input files for some of them. Unsurprisingly, this ability is used in the `Calculator` class. One useful application of `ase.io` is to convert between file formats.

In [None]:
import ase.io

In [None]:
# let's read a crystal in CIF format
crystal = ase.io.read('CrSb2.cif')
print(crystal)
view(crystal)

In [None]:
# let's write in VASP-POSCAR format
ase.io.write('CrSb2.vasp', crystal, direct=True)

Here is an example of how to create a minimalistic input file for Quantum Espresso:

In [None]:
from ase.io.espresso import write_espresso_in

pseudo = {'Cr': 'cr_pbesol_v1.5.uspp.F.UPF', 'Sb': 'sb_pbesol_v1.4.uspp.F.UPF'}

with open('CrSb2-scf.in', 'wt') as f:
    write_espresso_in(f, crystal, pseudopotentials=pseudo, kspacing=0.03,
                      prefix='crsb2', calculation='scf',
                      ecutwfc=35, ecutrho=350,
                      occupations='smearing', degauss=0.005,
                      mixing_beta=0.1, conv_thr=1e-8)

To get a list of the implemented file formats:

```python
from ase.cli.info import print_formats
print_formats()
```

It is also possible to read and write **lists** of `Atoms`, i.e. **trajectories**. Usually those are obtained as time evolution (like in Molecular Dynamics) or from a Markov chain (like in Monte Carlo).
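
When frames are collected in a list while the same object is modified in place, each frame must be stored as a copy. The sketch below shows the pitfall with a plain dictionary used as a hypothetical stand-in for an `Atoms` object; with real `Atoms` one would call its `copy()` method instead of `copy.deepcopy`.

```python
import copy

# hypothetical stand-in for an object that is modified in place
frame = {'positions': [0.0, 0.0]}

aliased = []   # stores references to the SAME object
copied = []    # stores independent snapshots

for step in range(3):
    frame['positions'][0] += 0.1
    aliased.append(frame)
    copied.append(copy.deepcopy(frame))

# every entry of `aliased` shows the final state
print([f['positions'][0] for f in aliased])
# `copied` keeps the history
print([f['positions'][0] for f in copied])
```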

In [None]:
# let's create a short animation with rattling atoms
crystal = ase.io.read('CrSb2.cif')
atoms_list = []
num_frames = 10

for frame in range(num_frames):
    crystal.rattle(stdev=0.05, seed=frame)    # rattle modifies the atoms in-place
    atoms_list.append(crystal.copy())         # important to make a copy, otherwise the list will store the reference to the same object
    
ase.io.write('CrSb2_rattle.cif', atoms_list, format='cif')

### 2.4 Building and manipulating `Atoms`
The `ase.build` sub-package provides convenience functions to build molecules from their formula, predefined crystal structures, supercells, surfaces and much more.

In [None]:
import ase.build
from ase.collections import g2

In [None]:
# ase.build.molecule has a set of predefined molecules from the G2 dataset
print(len(g2.names))
print(g2.names)

methane = ase.build.molecule('CH4')
view(methane)

In [None]:
# ase.build.bulk is used to create crystals with predefined lattices
crystal = ase.build.bulk('ZnS', crystalstructure='zincblende', a=5.387, cubic=True)    # cubic=True creates the conventional cell
print(crystal)

crystal = ase.build.bulk('ZnS', crystalstructure='zincblende', a=5.387, cubic=False)    # cubic=False creates the primitive cell
print(crystal)

# to see the available options
#help(ase.build.bulk)

In [None]:
# to make a supercell, multiply a crystal by a list of integers
si = ase.build.bulk('Si', cubic=True)
print(si)

supercell = si * [4, 4, 8]
view(supercell)

In [None]:
# to make tilted supercells, one must use ase.build.make_supercell()
supercell = ase.build.make_supercell(si, [[2,2,0],[0,3,0],[0,0,4]])
print(supercell.cell)
view(supercell)

In [None]:
# To create defects (vacancies or interstitials) you can exploit the fact that Atoms behaves like a list
supercell = si * [4, 4, 4]

In [None]:
# substitute one Si with one P
P_doped = supercell.copy()
P_doped[5].symbol = 'P'
view(P_doped)

In [None]:
# make an interstitial
interstitial = si.copy()
interstitial.append(ase.Atom('F', position=[0, 0, 2.2]))
view(interstitial)

The `ase.build.surface` function can be used to create 2D slabs:

In [None]:
# read the CeO2 structure
ceo2 = ase.io.read('CeO2.cif')

# let's create some CeO2 surfaces, use VESTA to visualize them
surfaces = [ (1,0,0), (1,1,0), (1,1,1), (2,1,0) ]
for indices in surfaces:
    ceo2_slab = ase.build.surface(ceo2, indices=indices, layers=6, vacuum=10, periodic=True)
    ase.io.write(f'CeO2_surf{indices[0]}{indices[1]}{indices[2]}.cif', ceo2_slab)

Finally, `ase.build` has functions to create nanotubes and graphene ribbons:

In [None]:
from ase.build.tube import nanotube

tube = nanotube(n=9, m=0, vacuum=10)
print(tube)
view(tube)

In [None]:
from ase.build.ribbon import graphene_nanoribbon

ribbon = graphene_nanoribbon(n=4, m=8, type='zigzag', saturated=True, vacuum=10)
print(ribbon)
view(ribbon)

It is easy to write your own function to generate other kinds of structures, e.g. heterostructures and nanoparticles (see XYZ and CIF files).
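
As an illustration, a spherical nanoparticle can be carved out of an fcc lattice with a few lines of NumPy. This is only a sketch: the function name `fcc_sphere` is hypothetical, and the lattice parameter (roughly that of Au) and the radius are example values.

```python
import itertools
import numpy as np

def fcc_sphere(a, radius):
    """Return the Cartesian positions of fcc lattice sites within `radius` of the origin."""
    # four-atom basis of the conventional fcc cell, in fractional coordinates
    basis = np.array([[0.0, 0.0, 0.0],
                      [0.0, 0.5, 0.5],
                      [0.5, 0.0, 0.5],
                      [0.5, 0.5, 0.0]])
    # enough repetitions of the cell to cover the sphere in every direction
    n = int(np.ceil(radius / a)) + 1
    cells = np.array(list(itertools.product(range(-n, n + 1), repeat=3)))
    # all lattice sites: (cell origin + basis) * a
    points = (cells[:, None, :] + basis[None, :, :]).reshape(-1, 3) * a
    # keep only the sites inside the sphere
    keep = np.linalg.norm(points, axis=1) <= radius
    return points[keep]

cluster = fcc_sphere(a=4.08, radius=3.0)
print(len(cluster))   # central atom plus its 12 nearest neighbours
```

The resulting array could then be passed as the `positions` argument of `Atoms`, together with a matching formula such as `'Au13'`.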

![slides1](slides1.png "Slides1")

## 3. Calculators
The `Calculator` class calculates basic properties of an `Atoms` object, such as energy, forces and stress tensor.

There are three types of calculators:
1. **built-in calculators** that run the simulation within the same Python interpreter process, e.g. **EMT**, **GPAW**
2. **file-based calculators** that run the simulation as a sub-process, with communication mediated through input and output files, e.g. **Quantum Espresso**, **SIESTA**, **ORCA**, ...
3. **external calculators** which are provided by extra packages, such as **MACE**, **Quippy (GAP)**, ...

The calculators take care of converting the `Atoms` data to the units used by each code, and of converting the energy and forces from the code units back into ASE units, i.e. eV, eV/Å, etc.

### 3.1 Internal calculators
The **EMT** calculator implements the **Effective Medium Theory** potentials for Ni, Cu, Pd, Ag, Pt and Au. Some other elements are included *for fun*, but really this is a method for alloys of those metals.
EMT is implemented in Python and it is fast enough for our demonstrations. However, the accuracy of the EMT potentials is not comparable to that of DFT calculations.

In [None]:
from ase.calculators.emt import EMT

In [None]:
# let's create a mono-atomic Au wire (since it is periodic, you need only one atom)
def make_wire(spacing=2.5, box_size=10.0):
    return ase.Atoms('Au',
                    positions=[[0., box_size/2, box_size/2]],
                    cell=[spacing, box_size, box_size],
                    pbc=[True, False, False])

wire = make_wire()
#view(wire)

In [None]:
# we attach an EMT calculator and compute some properties
wire.calc = EMT()
print(wire.get_potential_energy())   # in eV
print(wire.get_forces())             # in eV/Å

In [None]:
# let's stretch the wire and compute the potential energy as a function of strain
# for this purpose we'll use numpy.linspace, fit to a polynomial, then plot the result with matplotlib
d = np.linspace(2.0, 3.0, 21)
energy = np.zeros_like(d)

for i in range(len(d)):
    wire = make_wire(spacing=d[i])
    wire.calc = EMT()
    energy[i] = wire.get_potential_energy()

In [None]:
# fit polynomial, calculate first derivative, find the zero
poly = np.polyfit(d, energy, deg=4)
dpoly = np.polyder(poly, 1)

dopt = 0.0
for root in np.roots(dpoly):
    if abs(root.imag) < 1e-10 and 2.0 < root.real < 3.0:
        dopt = root.real
        
print(f'optimal d = {dopt:.4} Å')
eopt = np.polyval(poly, dopt)

In [None]:
plt.figure(figsize=(4,4))

plt.scatter(d, energy-eopt, label='calculated')
plt.plot(d, np.polyval(poly, d)-eopt, label='poly fit')

plt.xlabel('Au-Au dist. (Å)')
plt.ylabel('Energy (eV)')
plt.legend()
plt.show()

### 3.2 Energy minimization
Given that we can compute energy and forces, we can optimize the atomic coordinates to find the minimum of the energy. In the `ase.optimize` sub-package you will find classes to perform various types of energy/enthalpy minimization, with and without constraints.
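
The idea behind a force-based minimizer can be illustrated with a steepest-descent loop on a Lennard-Jones dimer. This is a deliberately simple sketch in reduced units and is not one of the algorithms implemented in `ase.optimize`; the function names, the step size and the starting distance are arbitrary choices. As in ASE, the loop stops when the force falls below a threshold `fmax`.

```python
def lj_energy_force(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy and force (-dE/dr) of a dimer at separation r."""
    sr6 = (sigma / r) ** 6
    energy = 4.0 * epsilon * (sr6 ** 2 - sr6)
    force = 24.0 * epsilon * (2.0 * sr6 ** 2 - sr6) / r
    return energy, force

def steepest_descent(r, step=0.01, fmax=1e-6, max_steps=10000):
    """Move along the force until its magnitude drops below fmax."""
    for n_steps in range(max_steps):
        energy, force = lj_energy_force(r)
        if abs(force) < fmax:
            break
        r += step * force      # a positive force pushes the atoms apart
    return r, energy, n_steps

r_min, e_min, n_steps = steepest_descent(r=1.3)
print(f"r_min = {r_min:.6f}, E_min = {e_min:.6f}, steps = {n_steps}")
print(f"analytical minimum: r = {2 ** (1 / 6):.6f}, E = -1")
```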

In [None]:
# let's create a simple Au nanocluster and randomize the atomic positions a bit
molecule = ase.Atoms('Au8', positions=[ [0,0,0], [3,0,0], [0,3,0], [0,0,3],
                                        [3,3,0], [3,0,3], [0,3,3], [3,3,3]] )

molecule.rattle(stdev=0.01)
#view(molecule)

In [None]:
# attach EMT calculator
molecule.calc = EMT()
print('initial energy:', molecule.get_potential_energy())

In [None]:
from ase.optimize import BFGS, FIRE

#minimizer = BFGS(molecule, trajectory='Au8.traj')
minimizer = FIRE(molecule, trajectory='Au8.traj')

minimizer.run(fmax=1e-3)
print('Final energy:', molecule.get_potential_energy())
ase.io.write('Au8.xyz', molecule)


### 3.3 Molecular dynamics
**Molecular dynamics (MD)** is the next thing to do. In MD we integrate Newton's equations of motion for a system of particles while sampling a **thermodynamic ensemble**. This can be **microcanonical (NVE)** (i.e. constant energy), **canonical (NVT)** (i.e. constant temperature), or **isothermal-isobaric (NPT)** (i.e. constant temperature and pressure).

* The main purpose of MD is to evaluate physical properties at finite T (and P) from the time average of the instantaneous values.
* Another purpose of MD is to explore the mechanism of phase transitions, chemical reactions, etc.
* MD can also be used to attempt to find the global minimum of the energy of a system.
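
To make the integration step concrete, here is a minimal sketch of the velocity Verlet scheme applied to a one-dimensional harmonic oscillator in reduced units. The function name and all parameters are illustrative assumptions. The total energy stays nearly constant, which is what we expect from an NVE run.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation of motion with the velocity Verlet scheme."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass    # half kick with the old force
        x += dt * v                 # drift
        f = force(x)                # force at the new position
        v += 0.5 * dt * f / mass    # half kick with the new force
    return x, v

k, mass = 1.0, 1.0                  # spring constant and mass (reduced units)

def harmonic_force(x):
    return -k * x

def total_energy(x, v):
    return 0.5 * mass * v ** 2 + 0.5 * k * x ** 2

x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, harmonic_force, mass, dt=0.01, n_steps=1000)

e_start = total_energy(x0, v0)
e_end = total_energy(x1, v1)
print(f"E(start) = {e_start:.8f}, E(end) = {e_end:.8f}")
```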

Let's start with a piece of gold and let's perform 1000 steps in the NVT ensemble, then 1000 steps in the NVE ensemble.
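
For the NVT part we will use a Berendsen thermostat, which rescales the velocities at every step so that the temperature relaxes towards the target with a time constant `taut`. The sketch below uses the textbook form of the scaling factor and is not copied from the ASE source. It also applies the thermostat alone, ignoring the dynamics of the atoms, so it is an idealization; the starting temperature of 150 K is an arbitrary choice.

```python
import math

def berendsen_scale(t_now, t_target, dt, tau):
    """Velocity scaling factor of the Berendsen thermostat (textbook form)."""
    return math.sqrt(1.0 + (dt / tau) * (t_target / t_now - 1.0))

dt, tau = 0.5, 250.0       # fs, same ratio as time_step/taut used in this lecture
t_target = 300.0           # K
temperature = 150.0        # K, arbitrary starting point

for step in range(1000):
    scale = berendsen_scale(temperature, t_target, dt, tau)
    # kinetic energy, and hence temperature, scales with the velocity squared
    temperature *= scale ** 2

print(f"temperature after 1000 steps: {temperature:.1f} K")
```

After 1000 steps of 0.5 fs (two time constants) the temperature has covered most of the distance to the target, but has not reached it yet.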

In [None]:
# let's import some packages
import ase.build
from ase.calculators.emt import EMT
from ase import units
from ase.md.velocitydistribution import MaxwellBoltzmannDistribution, Stationary

# let's create a piece of Au and attach the EMT calculator
au_cube = ase.build.bulk('Au', cubic=True) * [3, 3, 3]
print("number of atoms:", len(au_cube))
au_cube.calc = EMT()

# Initialize the velocities to T=300K
temperature = 300
MaxwellBoltzmannDistribution(au_cube, temperature_K=temperature)
Stationary(au_cube)   # remove the motion of the center of mass

In [None]:
from ase.md.nvtberendsen import NVTBerendsen
from ase.md.verlet import VelocityVerlet
from ase.md import MDLogger
import os

In [None]:
# perform NVT run
time_step      = 0.5*units.fs     # 0.5 fs
taut           = 250*units.fs     # 250 fs
num_md_steps   = 1000
print_interval = 10

log_filename = 'MD-NVT.log'
traj_filename = 'MD-NVT.traj'
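# remove old files independently, since the logger opens them in append mode
for filename in (log_filename, traj_filename):
    if os.path.exists(filename):
        os.remove(filename)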

print('===== NVT =====')
dyn = NVTBerendsen(au_cube, time_step, temperature_K=temperature, taut=taut, trajectory=traj_filename,
                   logfile='-', loginterval=print_interval)

logger = MDLogger(dyn, au_cube, log_filename, header=True, stress=True, peratom=False, mode='a')
dyn.attach(logger, interval=print_interval)
status = dyn.run(num_md_steps)

In [None]:
# perform NVE run
time_step      = 0.5*units.fs     # 0.5 fs
num_md_steps   = 1000
print_interval = 10

log_filename = 'MD-NVE.log'
traj_filename = 'MD-NVE.traj'
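# remove old files independently, since the logger opens them in append mode
for filename in (log_filename, traj_filename):
    if os.path.exists(filename):
        os.remove(filename)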

print('===== NVE =====')
dyn = VelocityVerlet(au_cube, time_step, trajectory = traj_filename, 
                     logfile='-', loginterval=print_interval)

logger = MDLogger(dyn, au_cube, log_filename, header=True, stress=True, peratom=False, mode='a')
dyn.attach(logger, interval=print_interval)
status = dyn.run(num_md_steps)

In [None]:
# let's plot the log files
nvt = np.loadtxt('MD-NVT.log', skiprows=1)
last_nvt = nvt[-1,0]
nve = np.loadtxt('MD-NVE.log', skiprows=1)

In [None]:
fig = plt.figure()

# total energy
plt.plot(nvt[:,0], nvt[:,1], color='C0', label='etot')
plt.plot(nve[:,0]+last_nvt, nve[:,1], color='C0')

# potential energy
plt.plot(nvt[:,0], nvt[:,2], color='C1', label='epot')
plt.plot(nve[:,0]+last_nvt, nve[:,2], color='C1')

# kinetic energy
plt.plot(nvt[:,0], nvt[:,3], color='C2', label='ekin')
plt.plot(nve[:,0]+last_nvt, nve[:,3], color='C2')

plt.axvline(last_nvt, color='black', linestyle='dashed')
plt.legend()
plt.xlabel('time (ps)')
plt.ylabel('energy (eV)')

plt.show()

In [None]:
fig = plt.figure()

# temperature
plt.plot(nvt[:,0], nvt[:,4], color='C0', label='temp')
plt.plot(nve[:,0]+last_nvt, nve[:,4], color='C0')

plt.axvline(last_nvt, color='black', linestyle='dashed')
plt.xlabel('time (ps)')
plt.ylabel('temperature (K)')

plt.show()

In [None]:
def press(data):
    # pressure is minus one third of the trace of the stress tensor (columns 5-7: xx, yy, zz)
    return -(data[:,5]+data[:,6]+data[:,7]) / 3.0

fig = plt.figure()

# pressure
plt.plot(nvt[:,0], press(nvt), color='C0', label='pressure')
plt.plot(nve[:,0]+last_nvt, press(nve), color='C0')

plt.axvline(last_nvt, color='black', linestyle='dashed')
plt.xlabel('time (ps)')
plt.ylabel('pressure (GPa)')

plt.show()