# ISDF with PYSCF and Octopus

The serial ISDF implementation in Octopus gives erroneous results. As such, this notebook:
* Reimplements ISDF using PYSCF as a base
    * See prior notebooks [isdf_vectors.ipynb] and [qr_on_orbitals.ipynb]
* Uses QR decomposition instead of kmeans to find the interpolation points
* Parses Octopus wave functions, to see whether one can get good results with them



In [3]:
"""Build Benzene molecule with a minimal GTO basis, using PYSCF
"""
%load_ext autoreload
%autoreload 2

import numpy as np
from pathlib import Path

from isdf_prototypes.clean_isdf import benzene_from_pyscf, bohr_to_ang, \
    construct_interpolation_vectors_more_parts, \
    approximate_product_basis, \
    error_l2, mean_norm, \
    find_interpolation_points_factory
from isdf_prototypes.math_ops import face_splitting_product




In [16]:
""" Get wave functions, grid and density.
* PYSCF 
* Read from file, for Octopus
"""
method = 'pyscf'

if method == 'pyscf':
    output_root = Path('../pyscf_nov2024_outputs')
    data: dict = benzene_from_pyscf(output_root, [10, 10, 10])
    print(data.keys())
    # assert list(data) == ['wfs', 'rho', 'cube_grid'], 'Expected output data from pyscf calculation'
    
    # Add real-space points, and volume element
    data.update({'grid_points': data['cube_grid'].get_coords()})
    data.update({'dV': data['cube_grid'].get_volume_element()})

elif method == 'octopus':
    output_root = Path('../oct_nov2024_outputs')
    # Require wfs, rho, grid_points and dV - volume element
    raise NotImplementedError('Octopus read-in and reshaping still need to be added')
    
else:
    raise ValueError(f'Erroneous GS Method: {method}')


converged SCF energy = -229.930646528207
dict_keys(['wfs', 'rho', 'cube_grid'])


In [155]:
""" Compute interpolation indices

QR Approach 1
1. Sample from a Gaussian distribution to form a Gaussian matrix
  - Apply QR decomposition to G, and take Q, such that the matrix is orthogonalised
2. Contract phi and G over the state index. Take the face-splitting product of this to form Z_tilde

QR Approach 3
1. Compute interpolation indices using the QR Approach, BUT without orthogonalising the Gaussian
sampling matrices

KMeans Approach
"""
from isdf_prototypes.visualise import write_xyz

# Options: kmeans  orthogonalised_sampling  nonorthogonalised_sampling

n_int = 200
interpolation_method = 'nonorthogonalised_sampling'

n_grid_points = data['grid_points'].shape[0]
n_states = data['wfs'].shape[1]
assert data['wfs'].shape[0] == n_grid_points, 'Shape of wfs inconsistent with the grid'

indices = find_interpolation_points_factory(interpolation_method)(n_int, **data)

# Output grid points from indices to .xyz
with open(file=output_root / f"indices_{interpolation_method}.xyz", mode='w') as fid:
    string = write_xyz(['B'] * n_int, data['grid_points'][indices] * bohr_to_ang)
    fid.write(string)
    

Interpolation Method is nonorthogonalised_sampling
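
The QR steps listed in the docstring above can be sketched as follows. This is a minimal illustration, not the implementation behind `find_interpolation_points_factory`: the function names, the choice `p = ceil(sqrt(n_int))`, and the use of two independent Gaussian matrices are all assumptions made for the example.

```python
import numpy as np
from scipy.linalg import qr


def face_splitting(a, b):
    """Row-wise Kronecker product: (n_grid, p) and (n_grid, q) -> (n_grid, p*q)."""
    return (a[:, :, None] * b[:, None, :]).reshape(a.shape[0], -1)


def sampling_matrix(rng, n_states, p, orthogonalise):
    """Gaussian matrix of shape (n_states, p), optionally with orthonormal columns."""
    g = rng.standard_normal((n_states, p))
    if orthogonalise:
        # Reduced QR: keep Q, whose columns are orthonormal (needs p <= n_states)
        g, _ = np.linalg.qr(g)
    return g


def qr_interpolation_points(wfs, n_int, orthogonalise=True, seed=0):
    """Hypothetical helper: pick n_int grid indices by pivoted QR on Z_tilde^T."""
    rng = np.random.default_rng(seed)
    n_grid, n_states = wfs.shape
    p = int(np.ceil(np.sqrt(n_int)))  # assumed choice, so that p*p >= n_int
    assert p <= n_states, 'Sketch assumes sqrt(n_int) does not exceed n_states'

    # Contract the orbitals with each sampling matrix over the state index
    g_phi = sampling_matrix(rng, n_states, p, orthogonalise)
    g_psi = sampling_matrix(rng, n_states, p, orthogonalise)
    z_tilde = face_splitting(wfs @ g_phi, wfs @ g_psi)  # (n_grid, p*p)

    # Column-pivoted QR of Z_tilde^T ranks the grid points; keep the first n_int
    _, _, pivots = qr(z_tilde.T, mode='economic', pivoting=True)
    return pivots[:n_int]
```

Under these assumptions, `qr_interpolation_points(data['wfs'], n_int, orthogonalise=False)` would play the role of the `nonorthogonalised_sampling` option, although the prototype may differ in its details.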


In [156]:
""" Compute ISDF Vectors
"""
isdf_vectors = construct_interpolation_vectors_more_parts(data['wfs'], indices)
assert isdf_vectors.shape == (n_grid_points, n_int)
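
For reference, a minimal sketch of how interpolation vectors of this shape can be built, assuming real orbitals and the standard least-squares form $\Theta = Z C^T [C C^T]^{-1}$. The name `isdf_vectors_sketch` and the demo arrays are hypothetical, and `construct_interpolation_vectors_more_parts` may be implemented differently.

```python
import numpy as np


def isdf_vectors_sketch(wfs, indices):
    """Theta = Z C^T [C C^T]^-1 for pair products of one set of real orbitals."""
    phi_mu = wfs[indices, :]             # orbitals at the interpolation points
    # For Z_ij(r) = phi_i(r) phi_j(r), the contractions over (i, j) reduce to
    # element-wise squares of orbital overlap-like matrices
    zct = (wfs @ phi_mu.T) ** 2          # Z C^T, shape (n_grid, n_int)
    cct = (phi_mu @ phi_mu.T) ** 2       # C C^T, shape (n_int, n_int)
    # C C^T is symmetric, so solve (C C^T) Theta^T = (Z C^T)^T rather than inverting
    theta_t, *_ = np.linalg.lstsq(cct, zct.T, rcond=None)
    return theta_t.T


# Demo on random data (not benzene): 200 grid points, 6 states, 10 points
rng = np.random.default_rng(0)
wfs_demo = rng.standard_normal((200, 6))
indices_demo = rng.choice(200, size=10, replace=False)
theta_demo = isdf_vectors_sketch(wfs_demo, indices_demo)
```

A useful sanity check is that, when $C C^T$ is well-conditioned, the rows of `theta_demo` at the interpolation points form the identity matrix.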
    

In [157]:
""" Compute full product matrix and approximate product matrix
"""
# Construct product basis matrix
z = face_splitting_product(data['wfs'])
assert z.shape == (n_grid_points, n_states**2)

# Approximate product basis matrix
z_isdf = approximate_product_basis(data['wfs'], indices, isdf_vectors)
assert z_isdf.shape == (n_grid_points, n_states**2)

error = error_l2(z, z_isdf, data['dV'])
rel_l2_error = error['mean'] / mean_norm(z, data['dV'])

print(f"{indices.size}, {error['min']:.2e}, {error['max']:.2e}, {error['mean']:.2e}, {rel_l2_error:.2e}")

if method == 'pyscf':
    print('Outputting a subset of the product states for visualisation')
    # Note, this is hard-coded in `benzene_from_pyscf`
    nx, ny, nz  = 10, 10, 10
    
    print()
    
    # Total range from index 0, to n_states**2-1
    
    for ij in [0, 25, 80, n_states**2 - 1]:
        # Exact product state
        fname = output_root / f'z_{ij}.cube'
        data['cube_grid'].write(field=z[:, ij].reshape(nx, ny, nz), fname=fname.as_posix())
        # Approximate product state
        fname = output_root / f'zisdf_{ij}.cube'
        data['cube_grid'].write(field=z_isdf[:, ij].reshape(nx, ny, nz), fname=fname.as_posix())



200, 4.91e-10, 6.71e-08, 1.84e-08, 1.25e-05
Outputting a subset of the product states for visualisation
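
The five numbers printed above are the number of interpolation points, the min, max and mean error, and the relative L2 error. The definitions of `error_l2` and `mean_norm` are not shown in this notebook, so the sketch below is only one plausible reading: a discretised L2 norm per pair state, with statistics taken over the pair index. The `_sketch` names are hypothetical.

```python
import numpy as np


def error_l2_sketch(z, z_approx, dV):
    """Assumed metric: per-pair L2 norm of the difference, integrated on the grid."""
    diff = z - z_approx
    per_pair = np.sqrt(dV * np.sum(np.abs(diff) ** 2, axis=0))
    return {'min': per_pair.min(), 'max': per_pair.max(), 'mean': per_pair.mean()}


def mean_norm_sketch(z, dV):
    """Assumed normalisation: mean L2 norm of the exact pair states."""
    return np.mean(np.sqrt(dV * np.sum(np.abs(z) ** 2, axis=0)))


# Tiny worked example: 4 grid points, 2 pair states, dV = 0.25
z_demo = np.ones((4, 2))
err_demo = error_l2_sketch(z_demo, np.zeros((4, 2)), dV=0.25)
rel_demo = err_demo['mean'] / mean_norm_sketch(z_demo, dV=0.25)
```

In the worked example each pair state has norm `sqrt(0.25 * 4) = 1`, so approximating it by zero gives a relative error of exactly 1.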


## Results

### KMeans

Repeat each run 3 times, and highlight the most favourable.

Note, for 200 centroids $[CC^T]^{-1}$ fails the check `scipy.linalg.issymmetric(inv_cct, rtol=1.e-4)`. TODO - Work out why that is, starting with the condition number of $CC^T$.

TODO - Plot this convergence
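
A small diagnostic for the symmetry failure noted above, assuming `cct` holds the $CC^T$ matrix as a NumPy array. Ill-conditioning is only a hypothesis here, not a verified cause, and both helper names are hypothetical.

```python
import numpy as np


def inverse_diagnostics(cct):
    """Report the condition number of C C^T and how asymmetric its inverse is."""
    inv = np.linalg.inv(cct)
    rel_asymmetry = np.linalg.norm(inv - inv.T) / np.linalg.norm(inv)
    return {'cond': np.linalg.cond(cct), 'rel_asymmetry': rel_asymmetry}


def symmetric_pinv(cct, rcond=1e-12):
    """Pseudo-inverse that exploits symmetry, then symmetrises the result.

    hermitian=True makes NumPy use an eigendecomposition of the symmetric
    input; rcond discards eigenvalues below rcond times the largest one.
    """
    inv = np.linalg.pinv(cct, rcond=rcond, hermitian=True)
    return 0.5 * (inv + inv.T)
```

If `inverse_diagnostics` reports a condition number approaching the inverse of machine precision, rounding error in the explicit inverse would be a likely explanation, and solving the linear system (or using a pseudo-inverse such as the one above) may be more robust than forming $[CC^T]^{-1}$ directly.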


| N Interpolation Points | Min Error                   | Max Error                 | Mean Error               | Relative (L2) Error       | Notes                                                           |
|------------------------|-----------------------------|---------------------------|--------------------------|---------------------------|-----------------------------------------------------------------|
| 10                     | 1.38e-4  1.07e-4   1.27e-3  | 2.58e-3  2.57e-3  2.72e-3 | 1.09e-3  1.04e-3 1.07e-3 | 0.746    0.713    0.732   | Clearly the same functions, but not full visual agreement       |
| 25                     | 7.58e-5  7.50e-5   6.75e-5  | 2.02e-3  2.12e-3  2.18e-3 | 6.82e-4  6.95e-4 7.07e-4 | 0.466    0.475    0.48    | Visual agreement improved, but clearly differ                   | 
| 50                     | 3.10e-5  4.61e-5   4.75e-5  | 1.14e-3  1.86e-3  1.29e-3 | 2.61e-4  3.31e-4 3.03e-4 | 0.178    0.23     0.21    | Visual agreement to the point where the functions look the same |
| 100                    | 2.61e-6  1.31e-6   1.89e-6  | 3.15e-5  3.97e-5  2.38e-5 | 1.11e-5  7.81e-6 8.49e-6 | 7.60e-3  5.33e-3  5.80e-3 | Didn't check - assume fine                                      |
| 200                    | 1.14e-9  1.85e-9   7.77e-10 | 8.76e-8  1.05e-7  8.54e-8 | 2.50e-8  2.81e-8 2.41e-8 | 1.71e-5  1.92e-5  1.65e-5 | Perfect visual agreement                                        |


### Sub-sampling with Orthogonal G Matrix

Repeat each run 3 times, and highlight the most favourable

| N Interpolation Points | Min Error                  | Max Error                    | Mean Error                  | Relative (L2) Error        | Notes                                                                               |
|------------------------|----------------------------|------------------------------|-----------------------------|----------------------------|-------------------------------------------------------------------------------------|
| 10                     | 6.79e-05 7.43e-05 9.52e-05 | 2.36e-03  2.29e-03  2.49e-03 | 9.10e-04  9.52e-04 1.05e-03 | 6.22e-01 6.50e-01 7.18e-01 |                                                                                     |
| 25                     | 5.36e-05 3.71e-05 3.88e-05 | 1.86e-03  2.00e-03  1.88e-03 | 6.08e-04  5.63e-04 5.67e-04 | 4.16e-01 3.85e-01 3.87e-01 |                                                                                     | 
| 50                     | 1.30e-05 1.09e-05 1.31e-05 | 1.02e-03  5.15e-04  8.41e-04 | 2.30e-04  1.58e-04 1.95e-04 | 1.57e-01 1.08e-01 1.33e-01 | Visual agreement very good for final pair state, but clearly some small differences |
| 100                    | 1.21e-06 9.01e-07 1.45e-06 | 3.90e-05  4.36e-05  4.13e-05 | 6.49e-06  5.62e-06 6.78e-06 | 4.43e-03 3.84e-03 4.63e-03 |                                                                                     |
| 200                    | 6.26e-10 5.15e-10 1.08e-09 | 1.03e-07  1.17e-07  1.37e-07 | 2.70e-08  3.02e-08 3.37e-08 | 1.84e-05 2.06e-05 2.30e-05 |                                                                                     |

In general, this method looks slightly more effective than kmeans.


TODO: Come back and try this at the end

The same as the above, but using:
$$
\tilde{Z}_{\alpha \beta}(\mathbf{r}) =
       \left(\sum_{i=1}^m \varphi_i(\mathbf{r}) G_{i \alpha}^{\varphi}\right)
       \left(\sum_{j=1}^m \varphi_j(\mathbf{r}) G_{j \beta}^{\psi}\right)
$$
rather than:
$$
\tilde{Z}_{\alpha \beta}(\mathbf{r}) =
       \left(\sum_{i=1}^m \varphi_i(\mathbf{r}) G_{i \alpha}^{\varphi}\right)
       \left(\sum_{j=1}^m \varphi_j(\mathbf{r}) G_{j \beta}^{\varphi}\right)
$$

i.e. use two different sampling matrices, $G^{\varphi}$ and $G^{\psi}$, on the same set of KS states.
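
One consequence of the choice of sampling matrices can be checked numerically. With a single $G$, $\tilde{Z}_{\alpha\beta} = \tilde{Z}_{\beta\alpha}$, so at most $p(p+1)/2$ of the $p^2$ columns are linearly independent, whereas two independent matrices can give up to $p^2$. The sketch below uses random data in place of the benzene orbitals, and all names and sizes are illustrative.

```python
import numpy as np


def face_splitting(a, b):
    """Row-wise Kronecker product: (n_grid, p) and (n_grid, q) -> (n_grid, p*q)."""
    return (a[:, :, None] * b[:, None, :]).reshape(a.shape[0], -1)


rng = np.random.default_rng(0)
n_grid, n_states, p = 300, 8, 4
wfs_demo = rng.standard_normal((n_grid, n_states))
g_phi = rng.standard_normal((n_states, p))
g_psi = rng.standard_normal((n_states, p))

# Same sampling matrix on both factors: symmetric in (alpha, beta)
z_same = face_splitting(wfs_demo @ g_phi, wfs_demo @ g_phi)
# Two independent sampling matrices on the same set of states
z_indep = face_splitting(wfs_demo @ g_phi, wfs_demo @ g_psi)

rank_same = np.linalg.matrix_rank(z_same)
rank_indep = np.linalg.matrix_rank(z_indep)
print(rank_same, rank_indep, p * (p + 1) // 2, p * p)
```

Whether this difference in rank matters for the quality of the interpolation points on the real system is untested here.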


### Sub-sampling with Non-Orthogonal G Matrix

Repeat each run 3 times, and highlight the most favourable
- Comment on whether orthogonalisation of `G` has any effect. Can one make a theoretical argument for the empirical observation?


| N Interpolation Points | Min Error                    | Max Error                    | Mean Error                   | Relative (L2) Error          | Notes |
|------------------------|------------------------------|------------------------------|------------------------------|------------------------------|-------|
| 10                     | 1.22e-04  7.00e-05  8.08e-05 | 2.56e-03  2.31e-03  2.49e-03 | 1.09e-03  9.01e-04  9.96e-04 | 7.48e-01  6.16e-01  6.80e-01 |       |
| 25                     | 3.73e-05  3.06e-05  3.53e-05 | 1.99e-03  1.66e-03  1.97e-03 | 5.21e-04  4.92e-04  4.20e-04 | 3.56e-01  3.36e-01  2.87e-01 |       | 
| 50                     | 8.40e-06  9.94e-06  9.36e-06 | 5.39e-04  4.86e-04  4.56e-04 | 1.21e-04  1.26e-04  1.22e-04 | 8.26e-02  8.59e-02  8.37e-02 |       |
| 100                    | 9.00e-07  9.52e-07  1.09e-06 | 2.73e-05  1.43e-05  1.29e-05 | 4.44e-06  4.41e-06  4.37e-06 | 3.03e-03  3.01e-03  2.99e-03 |       |
| 200                    | 5.17e-10  3.76e-10  4.91e-10 | 6.60e-08  7.89e-08  6.71e-08 | 1.89e-08  1.92e-08  1.84e-08 | 1.29e-05  1.31e-05  1.25e-05 |       |
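
To compare the three methods on an equal footing, the relative L2 errors from the tables above can be collected and reduced to the most favourable of the three runs. The values below are transcribed from the tables; the dictionary layout is just one convenient choice.

```python
# Relative L2 errors from the three tables: {method: {n_int: (run1, run2, run3)}}
rel_l2 = {
    'kmeans': {
        10: (0.746, 0.713, 0.732),
        25: (0.466, 0.475, 0.48),
        50: (0.178, 0.23, 0.21),
        100: (7.60e-3, 5.33e-3, 5.80e-3),
        200: (1.71e-5, 1.92e-5, 1.65e-5),
    },
    'orthogonalised_sampling': {
        10: (6.22e-01, 6.50e-01, 7.18e-01),
        25: (4.16e-01, 3.85e-01, 3.87e-01),
        50: (1.57e-01, 1.08e-01, 1.33e-01),
        100: (4.43e-03, 3.84e-03, 4.63e-03),
        200: (1.84e-05, 2.06e-05, 2.30e-05),
    },
    'nonorthogonalised_sampling': {
        10: (7.48e-01, 6.16e-01, 6.80e-01),
        25: (3.56e-01, 3.36e-01, 2.87e-01),
        50: (8.26e-02, 8.59e-02, 8.37e-02),
        100: (3.03e-03, 3.01e-03, 2.99e-03),
        200: (1.29e-05, 1.31e-05, 1.25e-05),
    },
}

# Most favourable (smallest) relative error of the three runs
best = {m: {n: min(runs) for n, runs in table.items()} for m, table in rel_l2.items()}

for n_int in (10, 25, 50, 100, 200):
    row = ', '.join(f"{m}: {best[m][n_int]:.2e}" for m in rel_l2)
    print(f"{n_int:4d}  {row}")
```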



## What I have inferred from these results

The choice of random seed causes the final result to fluctuate for all sampling methods, but the error is always consistent to within the same order of magnitude.
Sub-sampling with G outperforms kmeans by a small amount, but for 200 interpolation points, the relative L2 errors are of the same order of magnitude.
Orthogonalisation of G appears to have no effect on the error associated with the pair product expansion. If anything, lack of orthogonality appears to improve the results slightly (although this could be because
I use two random matrices in this test).

Visual agreement is very good from 100 points onwards.

It's not clear how many interpolation points one should choose, and the answer is probably system-dependent.
Prior to any TD runs, one could therefore converge the number of interpolation points until satisfied with the error introduced.
Things to add would be the Coulomb error metric, as this indicates the errors in the integrals, and some studies of the relation between the error in the integrals and the error in the
exchange energy. My guess is that it's a linear relationship. Better yet, a numerical expression that the code could evaluate to estimate this would be very desirable.
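
The convergence procedure suggested above could look something like the following sketch. `converge_n_int` is a hypothetical helper, and the tolerance is arbitrary. The lookup table stands in for an actual ISDF run and reuses the first-run relative L2 errors from the non-orthogonal sampling table.

```python
def converge_n_int(relative_error, candidates, tol):
    """Return the smallest candidate n_int whose relative error is below tol.

    relative_error: callable mapping n_int -> relative L2 error
    candidates: iterable of interpolation point counts to try
    tol: target relative error
    Returns (n_int, error), or None if no candidate satisfies tol.
    """
    for n_int in sorted(candidates):
        err = relative_error(n_int)
        if err < tol:
            return n_int, err
    return None


# Stand-in for a real calculation: first-run relative L2 errors,
# non-orthogonal sampling, taken from the table in this notebook
first_run = {10: 7.48e-01, 25: 3.56e-01, 50: 8.26e-02, 100: 3.03e-03, 200: 1.29e-05}

converged = converge_n_int(first_run.__getitem__, first_run, tol=1e-2)
print(converged)
```

In practice `relative_error` would wrap the interpolation point search, the construction of the ISDF vectors and the error evaluation for a given `n_int`.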

