# **The Electronic Density of States (DOS)**

**Authors:** Dou Du, Taylor James Baird and Giovanni Pizzi

<i class="fa fa-home fa-2x"></i><a href="../index.ipynb" style="font-size: 20px"> Go back to index</a>

**Source code:** https://github.com/osscar-org/quantum-mechanics/blob/master/notebook/band-theory/density_of_states.ipynb

This notebook demonstrates various approaches for the numerical calculation of the density of states (DOS) for a 3D free-electron model with periodic boundary conditions.

<hr style="height:1px;border:none;color:#cccccc;background-color:#cccccc;" />

## **Goals**
   
* Familiarize yourself with various numerical methods employed to calculate the electronic density of states.
* Examine the resulting DOS and compare the accuracy and computational cost of the various methods.

## **Background theory**

[More on the background theory.](./theory/theory_density_of_states.ipynb)
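
For reference, the analytical curve that the notebook plots can be read off from the plotting code below. The expressions here are a transcription of that code, assuming units with $\hbar = m = 1$ (implied by the dispersion $E = \frac{1}{2}|\mathbf{k}|^2$ used throughout); the prefactor matches the standard free-electron result with spin degeneracy included.

$$
g(E) = \frac{V}{2\pi^2}\,2^{3/2}\sqrt{E},
\qquad
\int_0^{E_\mathrm{max}} g(E)\,\mathrm{d}E = \frac{2}{3}\,\frac{V}{\pi^2}\,\sqrt{2}\,E_\mathrm{max}^{3/2}
$$

Here $V$ is the unit-cell volume, `(alat_bohr/2)**3` in the code, and the integral on the right is the quantity the code calls `norm_an`, which is used to rescale each numerical DOS before it is compared with the analytical curve.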

## **Tasks and exercises**

1. Investigate the influence of the number of k-points on the resulting DOS.
    <details>
    <summary style="color: red">Solution</summary>
    In the right panel, the blue line is the analytical result for the DOS of the free-electron model. 
    By choosing different numbers of k-points with the "Number of k-points per dimension" slider, we can investigate how 
    the quality of the calculated results varies with the density of the k-point mesh. You will observe that the numerical results converge to the analytical result as the number of k-points increases. This is because the DOS can be interpreted as a probability density of electronic states as a function of energy. Since the energy is determined by the magnitude of the k-vector, how well we resolve the range of energy eigenvalues is directly controlled by how finely we sample the k-point mesh.
    </details>   

2. Which method gives the most accurate results? Which method is the fastest, and why?

    <details>
    <summary style="color: red">Solution</summary>
    The linear tetrahedron interpolation (LTI) method is an accurate numerical approach that linearly interpolates the eigenvalues on the 3D k-point grid within a set of tetrahedra into which reciprocal space has been subdivided (see the <a href="./theory/theory_density_of_states.ipynb">background theory notebook</a>). The LTI method can yield much better results than a simple histogram. Gaussian smearing renders the 
    histogram plot considerably smoother, and closer to the analytical 
    solution. The histogram method is a simple statistical representation of the eigenvalues (one simply counts how often eigenenergies fall within a given range). This latter approach is typically the fastest of those discussed, but it will also generally give the results with the poorest resolution.
    </details>

3. Set the number of k-points to the maximum value and consider Gaussian smearing with a reasonable value for the smearing parameter (say $\sigma=0.07$). Looking at the calculated DOS with the G-vector range set to $0$, how does it compare with the analytical result? Can you explain any discrepancies you notice? Increasing the G-vector range to $1$, how does the calculated DOS now compare?

    <details>
    <summary style="color: red">Solution</summary>
    You will notice that even with a high number of k-points and an appropriate smearing parameter, the calculated DOS starts to deviate significantly from the analytical result above a certain energy. This discrepancy arises because, when we only consider the G-vector $G=0$, we neglect several bands in the electronic energy dispersion that still contribute eigenvalues within the energy range of interest. You can verify this by increasing the G-vector slider to 1: several bands are folded back into the first Brillouin zone. You can see which bands are folded by tracking those that change from black to red when you increment the G-vector slider.
    </details>
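
The effect described in the third exercise can be checked without the widgets. The following is a minimal sketch, not the notebook's own implementation: the helper names `free_electron_eigs` and `states_below` are hypothetical, the Monkhorst-Pack grid is built by hand with NumPy (assumed to match the fractional coordinates produced by `monkhorst_pack`), and the cell is taken to be simple cubic with side `alat_bohr/2 = 3.86`. It counts the states below `emax` (spin included) and compares the count with the analytical integrated DOS, $V(2E_\mathrm{max})^{3/2}/(3\pi^2)$.

```python
import numpy as np

def free_electron_eigs(nk, grange, a=3.86):
    """Eigenvalues 0.5*|k+G|^2 on an nk^3 grid for a simple cubic cell of side a.

    Hypothetical helper: returns an array of shape (nk, nk, nk, (2*grange+1)**3).
    """
    b = 2.0 * np.pi / a                       # reciprocal lattice constant
    frac = (np.arange(nk) + 0.5) / nk - 0.5   # Monkhorst-Pack fractional coordinates
    kx, ky, kz = np.meshgrid(frac * b, frac * b, frac * b, indexing="ij")
    shifts = np.arange(-grange, grange + 1) * b
    eigs = [0.5 * ((kx + gx)**2 + (ky + gy)**2 + (kz + gz)**2)
            for gx in shifts for gy in shifts for gz in shifts]
    return np.stack(eigs, axis=-1)

def states_below(eigs, emax):
    """Number of states per unit cell (factor 2 for spin) with energy below emax."""
    nkpts = eigs[..., 0].size
    return 2.0 * np.count_nonzero(eigs < emax) / nkpts

a, emax = 3.86, 1.5
analytic = a**3 * (2.0 * emax)**1.5 / (3.0 * np.pi**2)
counts = {g: states_below(free_electron_eigs(20, g, a), emax) for g in (0, 1)}

# Energy at which the free-electron sphere first touches the Brillouin-zone face
e_edge = 0.5 * (np.pi / a)**2

for g, n in counts.items():
    print(f"G-vector range {g}: {n:.3f} states below emax (analytical: {analytic:.3f})")
print(f"Sphere touches the zone boundary at E = {e_edge:.3f}")
```

With a G-vector range of 0 only the lowest band is counted, so the total falls well short of the analytical value; including the neighbouring G-vectors recovers the folded bands that lie below `emax`.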

<hr style="height:1px;border:none;color:#cccccc;background-color:#cccccc;" />

## **Interactive visualization**
(be patient, it might take a few seconds to load)

In [16]:
import numpy as np
import seekpath
import re
import os
import matplotlib
from ase.dft.dos import linear_tetrahedron_integration as lti
from ase.dft.kpoints import monkhorst_pack
from ase.cell import Cell
from scipy.stats import norm
import plotly.graph_objects as go
import plotly.express as px
import time
import matplotlib.pyplot as plt
from ipywidgets import Button, RadioButtons, Layout, IntSlider, HBox, Text, VBox, Checkbox, Label, FloatSlider, Output, HTML
from datetime import datetime

%matplotlib widget

In [17]:
def _compute_dos(kpts, G, ranges):
    """Compute the free-electron eigenvalues at all the k-points.
    
    Args: 
        kpts: an array of k-points (kx, ky, kz)
        G: the reciprocal lattice vectors (3x3)
        ranges: the range of the reciprocal lattice 
        
    Returns:
        The eigenvalues of the free electron model.
    """
    eigs = []
    n = ranges
    
    for i in range(-n, n+1):
        for j in range(-n, n+1):
            for k in range(-n, n+1):
                g_vector = i*G[0] + j*G[1] + k*G[2]
                eigs.append(np.sum(0.5*(kpts + g_vector)**2, axis=3))

    eigs = np.moveaxis(eigs, 0, -1)
    return eigs
   
def _compute_total_kpts(G, grange=0):
    """Get all the k-points, including those shifted by reciprocal lattice vectors.
    
    Args:
        G: the reciprocal lattice vectors (3x3)
        grange: the range of the reciprocal lattice
        
    Returns:
        The k-points (kx, ky, kz) as an array
    """
    tot_kpts = []
    n = grange
    
    shape = (nkpt.value, nkpt.value, nkpt.value)
    kpts = np.dot(monkhorst_pack(shape), G).reshape(shape + (3,))
    kpts = kpts.reshape(nkpt.value**3, 3)

    for i in range(-n, n+1):
        for j in range(-n, n+1):
            for k in range(-n, n+1):
                g_vector = i*G[0] + j*G[1] + k*G[2]
                tot_kpts.extend(kpts+g_vector)
    return np.array(tot_kpts)
    

In [18]:
alat_bohr = 7.72
emax=1.5 # consider DOS up to 1.5 eV

# Use a cubic lattice so that the linear tetrahedron method (ASE) can be applied
real_lattice_bohr = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) * alat_bohr / 2.0

In [19]:
# modifications

style = {'description_width': 'initial'}

output_bs = Output()

line1 = Label(
    value=r'Number of k-points per dimension, $N_k$'
)

line2 = Label(
    value=r'(total number $=N_k^3$)'
)
nkpt = IntSlider(value=10, min=5, max=40, description="", style={'description_width': 'initial'}, continuous_update=False)

grange = IntSlider(value=1, min=0, max=1, description="G-vector range:", style=style)
grange_hint = HTML(value="<b>Note that there may be a delay while the DOS is computed for a G-vector range > 0.</b>")
gcov = FloatSlider(value=0.5, min=0.1, max=1.0, description="Gaussian covariance:", style=style)

def prettify(label):
    """
    Prettifier for matplotlib, using LaTeX syntax
    :param label: a string to prettify
    """

    label = (
        label
            .replace('GAMMA', r'$\Gamma$')
            .replace('DELTA', r'$\Delta$')
            .replace('LAMBDA', r'$\Lambda$')
            .replace('SIGMA', r'$\Sigma$')
    )
    label = re.sub(r'_(.?)', r'$_{\1}$', label)

    return label


def _get_band_energies(kpoints_list, b1, b2, b3, g_vectors_range):
    # np.float_ was removed in NumPy 2.0; np.float64 is the equivalent dtype
    energy_data_curves = np.zeros(((2*g_vectors_range+1)**3, len(kpoints_list)), dtype=np.float64)

    cnt = 0
    for g_i in range(-g_vectors_range,g_vectors_range+1):
        for g_j in range(-g_vectors_range,g_vectors_range+1):
            for g_k in range(-g_vectors_range,g_vectors_range+1):
                g_vector = b1 * g_i + b2*g_j + b3 * g_k
                energy_data_curves[cnt] = np.sum(0.5*(kpoints_list + g_vector)**2, axis=1)  # This is 0.5*|k+G|^2 - NOTE: units to be double checked!
                cnt += 1


    # bands are ordered as follows: first band, second band, ...
    return energy_data_curves

def get_bands(real_lattice_bohr, reference_distance = 0.025, g_vectors_range = 3):

    # Simple way to get automatically the band path:
    # I go back to real space, just put a single atom at the origin,
    # then go back with seekpath.
    # NOTE! This might not give the most general path, as e.g. there are two
    # options for cubic FCC (cF1 and cF2 in seekpath).
    # But this should be general enough for this tool.

    structure = (real_lattice_bohr, [[0., 0., 0.]], [1])
    # Use a H atom at the origin
    seekpath_path = seekpath.get_explicit_k_path(structure, reference_distance=reference_distance)
    b1, b2, b3 = np.array(seekpath_path['reciprocal_primitive_lattice'])

    all_kpoints_x = np.array(seekpath_path['explicit_kpoints_linearcoord'])
    all_kpoints_list = np.array(seekpath_path['explicit_kpoints_abs'])

    segments_data = []
    for segment_indices in seekpath_path['explicit_segments']:
        start_label = seekpath_path['explicit_kpoints_labels'][segment_indices[0]]
        end_label = seekpath_path['explicit_kpoints_labels'][segment_indices[1]-1]

        kpoints_x = all_kpoints_x[slice(*segment_indices)]
        kpoints_list = all_kpoints_list[slice(*segment_indices)]

        energy_bands = _get_band_energies(kpoints_list, b1, b2, b3, g_vectors_range)

        segments_data.append({
            'start_label': start_label,
            'end_label': end_label,
            'kpoints_list': kpoints_list,
            'kpoints_x': kpoints_x,
            'energy_bands': energy_bands,
            'b1': b1,
            'b2': b2,
            'b3': b3,
        })

    return segments_data


def plot_bandstructure(c):
    global G, segments_data, lbands
    
    segments_data = get_bands(real_lattice_bohr)
    G = np.array([segments_data[0]['b1'], segments_data[0]['b2'], segments_data[0]['b3']])
    
    x_ticks = []
    x_labels = []
    lbands = []

    for segment_data in segments_data:
        if not x_labels:
            x_labels.append(prettify(segment_data['start_label']))
            x_ticks.append(segment_data['kpoints_x'][0])
        else:
            if x_labels[-1] != prettify(segment_data['start_label']):
                x_labels[-1] += "|" + prettify(segment_data['start_label'])
        x_labels.append(prettify(segment_data['end_label']))
        x_ticks.append(segment_data['kpoints_x'][-1])

        for energy_band in segment_data['energy_bands']:
            line, = ax_bs.plot(segment_data['kpoints_x'], energy_band, 'k')
            lbands.append(line)

    ax_bs.set_ylim([0, 1.0])
    ax_bs.yaxis.tick_right()
    ax_bs.yaxis.set_label_position("right")
    ax_bs.set_ylabel('Free-electron energy (eV)')
    ax_bs.set_xlim([np.min(x_ticks), np.max(x_ticks)])
    ax_bs.set_xticks(x_ticks)
    ax_bs.set_xticklabels(x_labels)
    ax_bs.grid(axis='x', color='red', linestyle='-', linewidth=0.5)
    fig.tight_layout()
    
    update_bands_color('bands')
    
def update_bands_color(c):
    n = 3
    
    shape = (nkpt.value, nkpt.value, nkpt.value)
    kpts = np.dot(monkhorst_pack(shape), G).reshape(shape + (3,))
    eigs = _compute_dos(kpts, G, grange.value)

    index = 0
    
    for segment_data in segments_data:
        for i in range(-n, n+1):
            for j in range(-n, n+1):
                for k in range(-n, n+1): 
                    if abs(i) <= grange.value and abs(j) <= grange.value and abs(k) <=grange.value:
                        lbands[index].set_color('r')
                    else:
                        lbands[index].set_color('k')
                    index+=1

grange.observe(update_bands_color, names="value")


In [20]:
style = {'description_width': 'initial'}

line1 = Label(
    value=r'Number of k-points per dimension, $N_k$'
)

line2 = Label(
    value=r'(total number $=N_k^3$)'
)
nkpt_box = HBox([VBox([line1, line2]), nkpt])
nbin = IntSlider(value=50, min=20, max=100, description="Number of bins:", layout=Layout(width="400px"), style={'description_width': 'initial'})
gstd = FloatSlider(value=0.01, min=0.01, max=0.1, step=0.01, description=r"Gaussian $\sigma$ (eV):", layout=Layout(width="300px"), style={'description_width': 'initial'})

#All buttons
btlti = Button(description="Add plot", style = {'button_color':'green'})
bthist = Button(description="Add plot", style = {'button_color':'green'})
btgas = Button(description="Add plot", style = {'button_color':'green'})
btclear = Button(description="Clear plot", style = {'button_color':'green'})

# Output for the DOS figure
output = Output()

def compute_dos_lti(c):
    """Compute the DOS using the ASE linear tetrahedron interpolation method.
    """
    global llti
    
    btlti.disabled = True
    bthist.disabled = True
    btgas.disabled = True
    btclear.disabled = True
    btlti.style = {'button_color':'red'}
    btlti.description = "Running"
    
    
    try:
        llti.remove()
    except:
        pass
    
    shape = (nkpt.value, nkpt.value, nkpt.value)
    G = Cell(real_lattice_bohr).reciprocal()*2*np.pi
    kpts = np.dot(monkhorst_pack(shape), G).reshape(shape + (3,))

    eigs = _compute_dos(kpts, G, grange.value)

    dosx = np.linspace(0, emax, 500)
    dosy = lti(real_lattice_bohr, eigs, dosx)

    # normalize lti dos for comparison with analytical result

    norm_lti=(dosx[1]-dosx[0])*np.sum(dosy)
    
    vol=(alat_bohr/2.)**3
    norm_an=(2./3.)*(vol/np.pi**2)*2**(1./2)*emax**(3./2) # analytical value of integrated DOS

    dosy/=norm_lti
    dosy*=norm_an
    
    
    llti, = ax_dos.plot(dosy, dosx, 'r-', label='LTI')
    ax_dos.legend(loc=1, bbox_to_anchor=(1.3, 1.0))
    
    btlti.disabled = False
    bthist.disabled = False
    btgas.disabled = False
    btclear.disabled = False
    btlti.style = {'button_color':'green'}
    btlti.description="Add plot"

btlti.on_click(compute_dos_lti)

def compute_dos_histogram(c):
    """Compute the DOS as a histogram.
    """
    global lhist
    
    btlti.disabled = True
    bthist.disabled = True
    btgas.disabled = True
    btclear.disabled = True
    bthist.style = {'button_color':'red'}  # red while running, as for the other methods
        
    try:
        lhist.remove()
    except:
        pass
    
    shape = (nkpt.value, nkpt.value, nkpt.value)
    G = Cell(real_lattice_bohr).reciprocal()*2*np.pi
    kpts = np.dot(monkhorst_pack(shape), G).reshape(shape + (3,))
    eigs = _compute_dos(kpts, G, grange.value)
    num_bins=nbin.value

    vol=(alat_bohr/2.)**3

    analy_x = np.linspace(0, emax, 500)
    analy_y = (1.0/(2.0*np.pi**2))*(2.0)**(3./2.)*analy_x**0.5*(alat_bohr / 2.0)**3.0

    # numerical integral of the analytical curve (named norm_num to avoid shadowing scipy.stats.norm)
    norm_num = np.sum(analy_y*(analy_x[1]-analy_x[0]))
    norm_an=(2./3.)*(vol/np.pi**2)*2**(1./2)*emax**(3./2) # analytical value of integrated DOS
  

    hy, hx = np.histogram(eigs.ravel(),range=[0,emax],bins=num_bins )
   
    hy = hy/np.sum(hy*np.diff(hx)) # normalize histogram to make total area 1
    hy*=norm_an # rescale histogram for comparison with analytical DOS (norm_an is the integrated analytical DOS)
  

    # barh centers each bar on its y value by default, so place the bars at the bin centers
    lhist = ax_dos.barh(hx[:-1]+0.5*np.diff(hx), hy, color='yellow', edgecolor='black', 
                       height=np.diff(hx), label="Histogram")
    ax_dos.legend(loc=1, bbox_to_anchor=(1.3, 1.0))

    btlti.disabled = False
    bthist.disabled = False
    btgas.disabled = False
    btclear.disabled = False
    bthist.style = {'button_color':'green'}
    
bthist.on_click(compute_dos_histogram)

def compute_dos_gaussian(c):
    """Compute the DOS using the Gaussian smearing method.
    """
    global lgas
    
    btlti.disabled = True
    bthist.disabled = True
    btgas.disabled = True
    btclear.disabled = True
    btgas.style = {'button_color':'red'}
    btgas.description = "Running"
        
    try:
        lgas.remove()
    except:
        pass
    
    shape = (nkpt.value, nkpt.value, nkpt.value)
    G = Cell(real_lattice_bohr).reciprocal()*2*np.pi
    kpts = np.dot(monkhorst_pack(shape), G).reshape(shape + (3,))
    eigs = _compute_dos(kpts, G, grange.value)
    num_bins=nbin.value
    
    knum=nkpt.value
    Gnum=grange.value
    
    # Accelerated approach to smearing: instead of summing Gaussians centered on every eigenvalue,
    # we first histogram the eigenvalues and then smear the histogram

    vol=(alat_bohr/2.)**3
    norm_an=(2./3.)*(vol/np.pi**2)*2**(1./2)*emax**(3./2) # analytical value of integrated DOS

    
    hy, hx = np.histogram(eigs.ravel(),range=[0,emax],bins=num_bins ) # hx holds the num_bins+1 bin edges
    hx=0.5*(hx[:-1]+hx[1:]) # bin centers
    gx=hx
    gy=0.0*hx

    # place a Gaussian on each bin center, weighted by the number of eigenvalues in that bin
    for i,Ei in enumerate(hx):
        gy += hy[i]*norm(Ei, gstd.value).pdf(hx)

    gy = gy/np.sum(gy*(hx[1]-hx[0])) # normalize smeared DOS to make total area 1
    gy*=norm_an # rescale for comparison with analytical DOS (norm_an is the integrated analytical DOS)
    
      
    
    # Standard approach to Gaussian smearing
#     for eig in eigs.ravel():
#         gy += norm(eig, gstd.value).pdf(gx)

#     norm_g=np.sum((gx[1]-gx[0])*gy)
#     gy = norm_an*gy/norm_g


    lgas, = ax_dos.plot(gy, gx, 'k--', label="Gaussian smearing")
    ax_dos.legend(loc=1, bbox_to_anchor=(1.3, 1.0))
    
    btlti.disabled = False
    bthist.disabled = False
    btgas.disabled = False
    btclear.disabled = False
    btgas.style = {'button_color':'green'}
    btgas.description = "Add plot"
    
btgas.on_click(compute_dos_gaussian)


def init_dos_plot():
    """Init the DOS plot.
    """
    global hline, ann,ax_dos
    btlti.disabled = True
    bthist.disabled = True
    btgas.disabled = True
    
    analy_x = np.linspace(0, emax, 500)
    analy_y = (1.0/(2.0*np.pi**2))*(2.0)**(3./2.)*analy_x**0.5*(alat_bohr / 2.0)**3.0
    lanaly, = ax_dos.plot(analy_y,analy_x, 'b', label='Analytical solution')
    
    ax_dos.legend(loc=1, bbox_to_anchor=(1.3, 1.0))
    ax_dos.yaxis.tick_right()
    ax_dos.yaxis.set_label_position("right")
    ax_dos.set_xlabel('Density of States')

    ax_dos.set_ylabel('Energy (eV)')
    fig.tight_layout()
    
    btlti.disabled = False
    bthist.disabled = False
    btgas.disabled = False
     
    
with output:
    # Set up the figure for the band structure and the DOS
    # (fig, ax_bs and ax_dos are module-level names used by the callbacks above)
    fig, (ax_bs, ax_dos) = plt.subplots(1, 2, sharey=True)
    fig.set_size_inches(8.5, 5.0)
    fig.canvas.header_visible = False
    fig.canvas.layout.width = "850px"
    fig.tight_layout()
    init_dos_plot()
    plot_bandstructure('bands')
    plt.show()    
    
def clear_plot(c):
    """Clear the DOS calculated results when the "Clear" button is clicked.
    """
    ax_dos.clear()
    init_dos_plot()
    
btclear.on_click(clear_plot)

In [21]:
df = px.data.gapminder()

X, Y, Z = np.mgrid[-2:2:40j, -2:2:40j, -2:2:40j]

# Fermi surface
values = 0.5*(X * X + Y * Y + Z * Z)

G = Cell(real_lattice_bohr).reciprocal()*2*np.pi
kpts = _compute_total_kpts(G)

def update_kpts_fig(c):
    """Update the k-points plot when tuning the k-points slider.
    """
    kpts = _compute_total_kpts(G)
    

nkpt.observe(update_kpts_fig, names="value")

In [22]:
#Group buttons with descriptions as labels
method1 = HBox([HBox([HTML(value=f"<b>Simple histogram of the eigenvalues</b>"), nbin]), bthist])
method2 = HBox([HBox([HTML(value=f"<b>Gaussian smearing method</b>"), gstd]),btgas ])
method3 = HBox([HTML(value="<b>Linear tetrahedron interpolation method</b>"),btlti ])
method4 = HBox([btclear, Label(value="(Clear the calculated results)")])
nkpt_box = HBox([VBox([line1, line2]), nkpt,method4])

label1 = HTML(value="<b><font color='red'>Choose a method to calculate the DOS:</font></b>")
display(VBox([output,VBox([ nkpt_box,VBox([grange,grange_hint]), label1, method1, method2, method3])]))


<hr style="height:1px;border:none;color:#cccccc;background-color:#cccccc;" />

## **Legend**

(How to use the interactive visualization)

## **Interactive figures**

The left panel shows the electronic band structure of a free-electron gas in three dimensions, while the right panel displays the corresponding density of states. You can choose the number of k-points used to compute the DOS with the k-points slider. Similarly, the number of G-vectors employed in the calculation can be varied with the "G-vector range" slider. The value is capped at 1 to avoid excessive computational time with the LTI method; even so, expect long waiting times in this case when $N_k$ is large.
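
To see how the G-vector range controls the number of bands drawn in the left panel, the sketch below lists the free-electron band energies at the $\Gamma$ point. It is an illustration under stated assumptions rather than the notebook's code: `bands_at_k` is a hypothetical helper, and the lattice is taken to be simple cubic with reciprocal lattice constant `b = 2*pi/3.86`.

```python
from itertools import product

import numpy as np

def bands_at_k(k, b, grange):
    """Sorted free-electron energies 0.5*|k+G|^2 at one k-point (simple cubic lattice).

    Hypothetical helper: G runs over integer triples in [-grange, grange]^3 times b.
    """
    ints = np.array(list(product(range(-grange, grange + 1), repeat=3)))
    g_vectors = ints * b
    return np.sort(0.5 * np.sum((np.asarray(k, dtype=float) + g_vectors)**2, axis=1))

b = 2.0 * np.pi / 3.86
for grange in (0, 1):
    energies = bands_at_k([0.0, 0.0, 0.0], b, grange)
    # Express the levels in units of 0.5*b^2 and count how many bands share each level
    levels, degeneracy = np.unique(np.round(energies / (0.5 * b**2), 8), return_counts=True)
    print(f"G-vector range {grange}: {len(energies)} bands, "
          f"levels {levels.tolist()}, degeneracies {degeneracy.tolist()}")
```

A range of $n$ gives $(2n+1)^3$ bands, which is why the cost of every method grows quickly when the slider is increased.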

## **Controls**

Three buttons allow one to compute the DOS with the three methods discussed 
earlier. The calculated DOS will appear in the figure on the right, superimposed on the analytic curve.
Computing the DOS with a large number of k-points may take several seconds.
When using the simple histogramming method, you can adjust the number of bins employed by varying the "Number of bins" slider.
For the Gaussian smearing method, you can also tune the standard 
deviation, $\sigma$, of the Gaussian functions by adjusting the "Gaussian $\sigma$" slider.
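
The interplay between the number of bins and $\sigma$ can also be explored outside the widgets. The sketch below follows the histogram-then-smear idea described in the code comments, with several simplifying assumptions: `smeared_dos` is a hypothetical helper, the Gaussian is written out with NumPy instead of `scipy.stats.norm`, and the eigenvalues come from random k-points in a cube rather than from a Monkhorst-Pack grid. The result is normalized to unit area and compared with the correspondingly normalized $\sqrt{E}$ curve.

```python
import numpy as np

def smeared_dos(eigs, emax, nbins, sigma):
    """Histogram the eigenvalues, then broaden each bin with a Gaussian of width sigma.

    Hypothetical helper: returns the bin centers and a DOS normalized to unit area.
    """
    counts, edges = np.histogram(eigs, bins=nbins, range=(0.0, emax))
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    # kernel[i, j]: Gaussian centered on bin i, evaluated at the center of bin j
    diff = centers[None, :] - centers[:, None]
    kernel = np.exp(-0.5 * (diff / sigma)**2) / (sigma * np.sqrt(2.0 * np.pi))
    dos = counts @ kernel
    return centers, dos / (dos.sum() * width)

def exact_dos(energy, emax):
    """Free-electron sqrt(E) DOS normalized to unit area on [0, emax]."""
    return 1.5 * np.sqrt(energy) / emax**1.5

# Assumption: random k-points in a cube that encloses the sphere |k| = sqrt(2*emax)
rng = np.random.default_rng(0)
emax = 1.5
kmax = np.sqrt(2.0 * emax)
k = rng.uniform(-kmax, kmax, size=(400_000, 3))
eigs = 0.5 * np.sum(k**2, axis=1)

for sigma in (0.01, 0.05):
    centers, dos = smeared_dos(eigs, emax, 50, sigma)
    inner = (centers > 0.3) & (centers < 1.2)   # stay away from the edges of the window
    deviation = np.max(np.abs(dos[inner] / exact_dos(centers[inner], emax) - 1.0))
    print(f"sigma = {sigma}: largest relative deviation in the interior = {deviation:.3f}")
```

When $\sigma$ is much smaller than the bin width the smeared curve simply reproduces the histogram, while a very large $\sigma$ washes out the $\sqrt{E}$ shape, especially near the edges of the energy window.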