MO-coefficients get mangled when written and read again #142

eljost · 2024-03-20T09:43:32Z

Dear developers,

thanks for this nice library. I encountered an issue that seems related to #127. Please see the reproducer below:

Everything was tested on Python 3.12.1 and trexio==2.4.2.

It mimics writing and reading back data from an unrestricted calculation on 2 atoms with a total of 2 AOs, 2 alpha MOs (Ca), and 2 beta MOs (Cb).

from pathlib import Path

import numpy as np
import trexio

np.set_printoptions(suppress=True, precision=4, linewidth=180)

nucleus_num = 2
ao_num = 2

fn = Path("issue.h5")
if fn.exists():
    fn.unlink()
tf = trexio.File(str(fn), mode="w", back_end=trexio.TREXIO_HDF5)

# Write file
coords3d = np.zeros((2, 3))
coords3d[1, 2] = 1.889
trexio.write_nucleus_num(tf, nucleus_num)
trexio.write_nucleus_coord(tf, coords3d)
mo_num = ao_num
Ca = np.random.rand(ao_num, mo_num)
Cb = np.random.rand(ao_num, mo_num)
# MOs are in columns; C.shape is (ao_num, 2 * ao_num)
# Concatenate alpha and beta MOs
C = np.concatenate((Ca, Cb), axis=1)
mo_spin = ([0] * mo_num) + ([1] * mo_num)
mo_num = C.shape[1]
trexio.write_ao_num(tf, ao_num)
trexio.write_mo_num(tf, mo_num)
trexio.write_mo_coefficient(tf, C)
trexio.write_mo_spin(tf, mo_spin)
tf.close()

# Read again
with trexio.File(str(fn), mode="r", back_end=trexio.TREXIO_HDF5) as tf:
    coords3d_read = trexio.read_nucleus_coord(tf)
    # coords3d is fine
    np.testing.assert_allclose(coords3d, coords3d_read)
    C_read = trexio.read_mo_coefficient(tf)
    # C_read is mangled ...
    np.testing.assert_allclose(C, C_read)

When the MO-coefficients are stored with the shape outlined in the paper/documentation, (ao_num, mo_num), they come back mangled when read again.

When I build the matrix C with shape (mo_num, ao_num) and store it, everything is fine, but C is then also read back in this shape, which seems inconsistent with the documentation.

In contrast, the 3d Cartesian coordinates seem fine; nothing gets transposed. So it does not look as if I have to provide the data in column-major order to begin with.

So, did I misuse the API or is this a bug/documentation issue?

All the best
Johannes


scemama · 2024-03-20T13:15:39Z

Hello,
I think that you are confused by the column-major ordering :-)

In the documentation, the nuclear coordinates are:

coord	float	(3,nucleus.num)	Coordinates of the atoms

The shapes in the documentation are given in column-major order, so in Python, if you use the default ordering (row major), the dimensions are reversed and you are supposed to make a numpy array of shape (nucleus_num, 3). This is indeed what you do:

coords3d = np.zeros((2, 3))

and everything is fine.

For the MO coefficients, the documentation states that it is:

coefficient	float	(ao.num, mo.num)

So in Python, you should use a numpy array of shape (mo_num, ao_num) where mo_num = 2*ao_num in your particular case. But your MO coefficients array in Python has shape (ao_num, mo_num), which is wrong.
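The reversal rule can be checked with NumPy alone. The following is a minimal sketch, not TREXIO code: `to_row_major` is a hypothetical helper introduced here for illustration, and the last step only assumes that a buffer written in the wrong shape gets reinterpreted in the expected one, which may not be exactly what the library does internally.

```python
import numpy as np

ao_num, mo_num = 2, 4  # e.g. 2 AOs with 2 alpha + 2 beta MOs

# Usual quantum-chemistry convention: one MO per column, shape (ao_num, mo_num).
C = np.arange(ao_num * mo_num, dtype=float).reshape(ao_num, mo_num)


def to_row_major(a):
    """Hypothetical helper: reverse the axes and return a C-contiguous copy."""
    return np.ascontiguousarray(a.T)


# Array with the shape a row-major caller is expected to pass: one MO per row.
C_py = to_row_major(C)
assert C_py.shape == (mo_num, ao_num)

# The row-major (mo_num, ao_num) buffer holds the elements in the same order
# as a column-major (ao_num, mo_num) buffer.
assert np.array_equal(C_py.ravel(order="C"), C.ravel(order="F"))

# Transposing back recovers the original matrix.
assert np.array_equal(C_py.T, C)

# Assumption: passing C untransposed amounts to reinterpreting its row-major
# buffer with the reversed shape. That is a reshape, not a transpose, so the
# coefficients end up in the wrong places.
C_mangled = C.reshape(mo_num, ao_num)
assert not np.array_equal(C_mangled, C_py)
```

Applied to the reproducer, this would mean passing the transposed matrix to the write call and transposing the array that is read back, if the (ao_num, mo_num) convention is wanted on the Python side.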

Thanks for giving feedback. As you are the 2nd person getting confused with this, we will try to make things clearer in the documentation.

Sorry for the confusion...

eljost · 2024-03-20T13:29:23Z

Thanks for getting back to me so quickly. I seem to have misread the shape of the coordinate array ... Good to know the library is fine and that the error is on my side.

eljost closed this as completed Mar 20, 2024

scemama mentioned this issue Mar 20, 2024

Added both row-major and column-major representations in documentation. #143

Merged
