# CafChem tools for using Microsoft's Skala DFT functional with ASE

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/MauricioCafiero/CafChem/blob/main/notebooks/SkalaDFT_CafChem.ipynb)

## This notebook allows you to:
- Create ASE atoms objects from SMILES or from an XYZ file.
- Calculate energies, optimize structures, and compute dipole moments and vibrational frequencies.

## Requirements:
- This notebook installs RDKit, Skala, and py3Dmol.
- A GPU is needed for inference.

## Set-up

### Install Skala and RDKit

In [1]:
! pip install -q microsoft-skala
! pip install -q rdkit
! pip install py3Dmol

### Import libraries and pull CafChem from GitHub

In [2]:
!git clone https://github.com/MauricioCafiero/CafChem.git

Cloning into 'CafChem'...
remote: Enumerating objects: 1047, done.
remote: Counting objects: 100% (369/369), done.
remote: Compressing objects: 100% (112/112), done.
remote: Total 1047 (delta 336), reused 257 (delta 257), pack-reused 678 (from 1)
Receiving objects: 100% (1047/1047), 44.75 MiB | 39.95 MiB/s, done.
Resolving deltas: 100% (617/617), done.


In [4]:
import pandas as pd
import matplotlib.pyplot as plt
import shutil
import numpy as np
from skala.ase import Skala

import CafChem.CafChemSkala as ccsk

## Skala for QM
- Functionals available:
  * Skala
  * lda
  * spw92 (LDA with PW92 correlation)
  * pbe
  * tpss

- Basis set options:
  * def2-svp
  * def2-tzvp
  * def2-qzvp
  * ma-def2-qzvp
- Dispersion:
  * Skala calculator property: with_dftd3=True
- Internal distance units are Bohr. If you read in an XYZ file, its coordinates are assumed to be in Angstroms.
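
Because both Bohr and Angstrom appear in this workflow, a small conversion helper can make comparisons less error-prone. The sketch below is illustrative only: `BOHR_TO_ANGSTROM`, `bohr_to_angstrom`, and `angstrom_to_bohr` are hypothetical names, not part of CafChem or Skala, and the constant is the CODATA Bohr radius rounded to six decimals.

```python
# Hypothetical helpers for illustration; not part of CafChem or Skala.
BOHR_TO_ANGSTROM = 0.529177  # 1 Bohr in Angstroms (CODATA value, rounded)


def bohr_to_angstrom(length_bohr: float) -> float:
    """Convert a length from Bohr to Angstroms."""
    return length_bohr * BOHR_TO_ANGSTROM


def angstrom_to_bohr(length_angstrom: float) -> float:
    """Convert a length from Angstroms to Bohr."""
    return length_angstrom / BOHR_TO_ANGSTROM


# A bond of about 1.20 Angstrom expressed in Bohr
print(round(angstrom_to_bohr(1.20), 3))
```

A length in Bohr is numerically larger than the same length in Angstroms by a factor of roughly 1.89, which gives a quick sanity check on which unit a printed value is in.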

### Energy, geometry optimization, dipoles

In [5]:
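# Build an ASE Atoms object for formaldehyde from its SMILES string
atoms = ccsk.smiles_to_atoms("C=O")

# Attach the Skala calculator: neutral singlet, def2-svp basis, density fitting enabled
atoms.calc = Skala(xc="skala", basis="def2-svp", verbose=2, with_density_fit=True, charge=0, multiplicity=1)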

In [7]:
xyz_string = ccsk.atoms_to_xyz(atoms, None, False)
ccsk.visualize_molecule(xyz_string)

### Check bond lengths

In [10]:
lines = xyz_string.split('\n')
ccsk.test_units(lines, 1,2)
ccsk.test_units(lines, 1,3)
ccsk.test_units(lines, 1,4)

1.2246316120946734
1.1016994534980955
1.1016996291398944
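
The three values above are consistent with a C=O distance and two C-H distances in Angstroms. For readers who want to reproduce such a check by hand, here is a minimal sketch that measures a distance directly from the lines of an XYZ block. It assumes the standard XYZ layout (atom count, comment line, then one `symbol x y z` line per atom) and 1-based atom indices. `bond_length` is a hypothetical helper and is not claimed to match how `ccsk.test_units` works internally. The geometry in the example is illustrative, not the notebook's actual structure.

```python
import math


def bond_length(xyz_lines, i, j):
    """Distance between atoms i and j (1-based) in a standard XYZ block."""

    def coords(k):
        # Atom k sits at index k + 1: skip the atom-count and comment lines.
        fields = xyz_lines[k + 1].split()
        return [float(value) for value in fields[1:4]]

    return math.dist(coords(i), coords(j))


# Illustrative formaldehyde-like geometry (made up for this example)
example_xyz = """4
illustrative geometry
C 0.0 0.0 0.0
O 1.2 0.0 0.0
H -0.55 0.95 0.0
H -0.55 -0.95 0.0"""

example_lines = example_xyz.split("\n")
print(bond_length(example_lines, 1, 2))  # C-O distance
print(bond_length(example_lines, 1, 3))  # C-H distance
```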


In [11]:
for key, value in atoms.calc.parameters.items():
    print(key, value)

xc skala
basis def2-svp
with_density_fit True
with_newton False
with_dftd3 True
charge 0
multiplicity 1
verbose 2


In [12]:
changed = atoms.calc.set(verbose=2)

In [14]:
# Second argument False: single-point energy at the current geometry (no optimization)
energy = ccsk.opt_energy(atoms, False)
print(energy)

skala-1.0.fun:   0%|          | 0.00/1.23M [00:00<?, ?B/s]

Overwritten attributes  nuc_grad_method  of <class 'pyscf.df.df_jk.DFSkalaRKS'>


Initial energy: -114.331671 ha
-114.33167098440488


### Optimization scheme options include:
- LBFGSLineSearch (takes longer but uses less memory)
- BFGS

In [15]:
# Second argument True: optimize the geometry with the chosen optimizer
energy = ccsk.opt_energy(atoms, True, opt_type='BFGS')
print(energy)

Initial energy: -114.331671 ha
      Step     Time          Energy          fmax
BFGS:    0 14:51:43    -3111.125137        2.335576
BFGS:    1 14:52:09    -3111.096663        3.753077
BFGS:    2 14:52:39    -3111.157656        0.329941
BFGS:    3 14:52:58    -3111.158472        0.093734
Final energy: -114.332896 ha
Energy difference: -0.001225 ha
-114.33289604629732
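
The "Initial energy" and "Final energy" lines are in hartree, while the energies in the BFGS table are about 27.2 times larger in magnitude. That is consistent with ASE's default energy unit, eV. The sketch below converts the reported hartree values so the two sets of numbers can be compared. `HARTREE_TO_EV` and `hartree_to_ev` are hypothetical names introduced here, and the constant is the CODATA value rounded to six decimals.

```python
# Hypothetical helper for illustration; not part of CafChem or Skala.
HARTREE_TO_EV = 27.211386  # 1 hartree in eV (CODATA value, rounded)


def hartree_to_ev(energy_ha: float) -> float:
    """Convert an energy from hartree to eV."""
    return energy_ha * HARTREE_TO_EV


# Values copied from the output above
initial_ha = -114.331671
final_ha = -114.332896

print(f"Initial: {hartree_to_ev(initial_ha):.3f} eV")
print(f"Final:   {hartree_to_ev(final_ha):.3f} eV")

# Energy lowering from the optimization, in both units
delta_ha = final_ha - initial_ha
print(f"Difference: {delta_ha:.6f} ha = {hartree_to_ev(delta_ha):.4f} eV")
```

With this constant, the converted values match the first and last rows of the BFGS table to within about 0.002 eV.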


### Check bond lengths

In [16]:
opt_xyz_string = ccsk.atoms_to_xyz(atoms, None, False)
lines = opt_xyz_string.split('\n')
ccsk.test_units(lines, 1,2)
ccsk.test_units(lines, 1,3)
ccsk.test_units(lines, 1,4)

1.1973879813765842
1.1017845488943099
1.1020026476100113


In [17]:
dipole_val = ccsk.calc_dipole(atoms)

Dipole moment magnitude: 0.451
Dipole moment vector:
x-component:   -0.445
y-component:    0.069
z-component:    0.006
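
The reported magnitude is the Euclidean norm of the three components. A quick check using the printed values:

```python
import math

# Components copied from the output above (rounded to three decimals there)
components = (-0.445, 0.069, 0.006)

# Magnitude is the square root of the sum of squared components
magnitude = math.sqrt(sum(c * c for c in components))
print(f"{magnitude:.3f}")
```

The result agrees with the printed 0.451 to within the rounding of the components.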


In [18]:
new_xyz = ccsk.atoms_to_xyz(atoms, None, False)
ccsk.visualize_molecule(new_xyz)

### Vibrations

In [20]:
vibs, real_vibs = ccsk.calculate_vibrations(atoms)

Vibrational frequency 1: 1202.640 cm-1
Vibrational frequency 2: 1285.359 cm-1
Vibrational frequency 3: 1536.687 cm-1
Vibrational frequency 4: 1881.517 cm-1
Vibrational frequency 5: 3019.921 cm-1
Vibrational frequency 6: 3170.687 cm-1
Also calculated the following low frequency motions:
Frequency 1: 0.146 cm-1
Frequency 2: 26.147 cm-1
Frequency 3: 116.567 cm-1
Also found the following number of imaginary frequencies: 3
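
As a sanity check on this output, a nonlinear molecule with N atoms has 3N degrees of freedom, of which 3N - 6 are true vibrations (3N - 5 for a linear molecule). The rest are overall translations and rotations, which would ideally have zero frequency and in practice often show up as small real or imaginary values. The sketch below does that bookkeeping for formaldehyde (4 atoms). `expected_vibrational_modes` is a hypothetical helper, not a CafChem function.

```python
def expected_vibrational_modes(n_atoms: int, linear: bool = False) -> int:
    """3N - 6 internal modes for a nonlinear molecule, 3N - 5 if linear."""
    return 3 * n_atoms - (5 if linear else 6)


n_atoms = 4  # formaldehyde, CH2O

# Counts read from the output above
n_vibrations = 6
n_low_frequency = 3
n_imaginary = 3

print(expected_vibrational_modes(n_atoms))           # true vibrations expected
print(n_vibrations + n_low_frequency + n_imaginary)  # total modes reported
print(3 * n_atoms)                                   # total degrees of freedom
```

The six listed vibrational frequencies match the 3N - 6 count. The three low-frequency and three imaginary modes together account for the remaining six degrees of freedom, which presumably correspond to translations and rotations.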
