# Data types and IO

#### Import general modules

mpi4py is always required when using these tools. NumPy is also useful whenever the data needs to be manipulated.

In [1]:
# Import required modules
from mpi4py import MPI  # equivalent to calling MPI_Init() in C
import matplotlib.pyplot as plt
import numpy as np

# Get mpi info
comm = MPI.COMM_WORLD

#### Import modules from pynektools

In this case we will import all the data types that we currently support, as well as the io functions required to populate them.

In [2]:
# Data types
from pynektools.datatypes.msh import Mesh
from pynektools.datatypes.coef import Coef
from pynektools.datatypes.field import Field, FieldRegistry

# Readers
from pynektools.io.ppymech.neksuite import preadnek, pynekread

# Writers
from pynektools.io.ppymech.neksuite import pwritenek, pynekwrite

fname_2d = '../data/mixlay0.f00001'
fname_3d = '../data/rbc0.f00001'

## Mesh

The mesh object is designed to interact with the coordinates of the SEM domain. Consequently, if a file is used to initialize the object, that file must contain the mesh.

### Initializing

The mesh object can be initialized in several ways that we have identified as typical when post-processing data. Here we show some of them.

#### Initializing an empty object

We can initialize an empty mesh object by just providing the communicator.

Note that at this stage we must indicate whether we want to find the connectivity of the points. This is important if one later wishes to average points at element interfaces. Be careful, however, as it requires more memory.

In [3]:
msh_2d = Mesh(comm, create_connectivity=True)
msh_3d = Mesh(comm, create_connectivity=True)

2024-08-25 19:40:00,997 - Mesh - INFO - Initializing empty Mesh object.
2024-08-25 19:40:00,998 - Mesh - INFO - Initializing empty Mesh object.
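
To make the role of connectivity concrete, the following is a minimal NumPy sketch, not the library's implementation. It groups points that share a coordinate and averages a field over each group. The 1D arrays x and f and the rounding tolerance are assumptions made for illustration.

```python
import numpy as np

# Two 1D "elements" with three points each; x = 1.0 belongs to both.
x = np.array([[0.0, 0.5, 1.0],
              [1.0, 1.5, 2.0]])
# A field that is discontinuous at the shared point.
f = np.array([[0.0, 1.0, 2.0],
              [4.0, 5.0, 6.0]])

# Group points by coordinate (rounded to absorb round-off differences).
_, inverse = np.unique(np.round(x.ravel(), 10), return_inverse=True)
inverse = inverse.ravel()

# Sum the field and count the contributions for every unique point.
sums = np.bincount(inverse, weights=f.ravel())
counts = np.bincount(inverse)

# Scatter the averaged values back to the element-wise layout.
f_avg = (sums / counts)[inverse].reshape(f.shape)
print(f_avg)
```

After averaging, both copies of the interface point hold the same value. The map from points to groups is the extra data that has to be stored, which is why requesting connectivity costs memory.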


#### Initializing from a file

The standard way to initialize the data in a mesh object is to first initialize an empty object and then call the pynekread function. This function takes the empty object and reads only the mesh data from the file.

One must give the file name and the empty object that will hold the data. Additionally, the data type is an input.

In [4]:
pynekread(fname_2d, comm, data_dtype=np.double, msh=msh_2d)
pynekread(fname_3d, comm, data_dtype=np.single, msh=msh_3d)

2024-08-25 19:40:01,007 - pynekread - INFO - Reading file: ../data/mixlay0.f00001
2024-08-25 19:40:01,012 - Mesh - INFO - Initializing Mesh object from x,y,z ndarrays.
2024-08-25 19:40:01,013 - Mesh - INFO - Initializing common attributes.
2024-08-25 19:40:01,014 - Mesh - INFO - Creating connectivity
2024-08-25 19:40:01,210 - Mesh - INFO - Mesh object initialized.
2024-08-25 19:40:01,211 - Mesh - INFO - Mesh data is of type: float64
2024-08-25 19:40:01,212 - Mesh - INFO - Elapsed time: 0.199450671s
2024-08-25 19:40:01,212 - pynekread - INFO - File read
2024-08-25 19:40:01,213 - pynekread - INFO - Elapsed time: 0.206250271s
2024-08-25 19:40:01,213 - pynekread - INFO - Reading file: ../data/rbc0.f00001
2024-08-25 19:40:01,218 - Mesh - INFO - Initializing Mesh object from x,y,z ndarrays.
2024-08-25 19:40:01,219 - Mesh - INFO - Initializing common attributes.
2024-08-25 19:40:01,219 - Mesh - INFO - Creating connectivity
2024-08-25 19:40:01,454 - Mesh - INFO - Mesh object initialized.
2024-
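
The data_dtype argument appears to set the precision of the arrays held in memory, as the log lines reporting the mesh data type suggest. As a rough guide that is independent of the library, the sketch below compares the footprint and round-off of single and double precision for an array with a hypothetical shape (elements, lz, ly, lx).

```python
import numpy as np

# Hypothetical size: 100 elements with 8 x 8 x 8 points each.
shape = (100, 8, 8, 8)

a64 = np.linspace(0.0, 1.0, int(np.prod(shape)), dtype=np.float64).reshape(shape)
a32 = a64.astype(np.float32)  # astype returns a new array

print(a64.nbytes / 2**20, "MiB in double precision")
print(a32.nbytes / 2**20, "MiB in single precision")

# Largest difference introduced by storing the data in single precision.
err = np.max(np.abs(a64 - a32.astype(np.float64)))
print(f"max round-off error: {err:.2e}")
```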

#### Initializing from a hexadata object

We have previously used the pymech module to post-process data. Because of this, we wrote io routines that produce objects of this type, in case some existing workflows already rely on them. Note that this is done in parallel, and each rank will have a hexadata object with a portion of the total data.

Note that in our implementation, creating the hexadata object always reads the full file.

In general, this should not be the main way to initialize objects, but we provide the option.

The initialization steps are the following:

1. Read the hexadata object.
2. Initialize the mesh object from it.

In [5]:
# 1.
data_2d = preadnek(fname_2d, comm, data_dtype=np.double)
data_3d = preadnek(fname_3d, comm, data_dtype=np.single)
# 2. 
msh_2d = Mesh(comm, data=data_2d, create_connectivity=True)
msh_3d = Mesh(comm, data=data_3d, create_connectivity=True)

2024-08-25 19:40:01,467 - preadnek - INFO - Reading file: ../data/mixlay0.f00001
2024-08-25 19:40:01,510 - preadnek - INFO - Elapsed time: 0.043632612000000015s
2024-08-25 19:40:01,511 - preadnek - INFO - Reading file: ../data/rbc0.f00001
2024-08-25 19:40:01,533 - preadnek - INFO - Elapsed time: 0.02223868699999998s
2024-08-25 19:40:01,534 - Mesh - INFO - Initializing Mesh object from HexaData object.
2024-08-25 19:40:01,538 - Mesh - INFO - Initializing common attributes.
2024-08-25 19:40:01,539 - Mesh - INFO - Creating connectivity
2024-08-25 19:40:01,734 - Mesh - INFO - Mesh object initialized.
2024-08-25 19:40:01,735 - Mesh - INFO - Mesh data is of type: float64
2024-08-25 19:40:01,735 - Mesh - INFO - Elapsed time: 0.20075867199999997s
2024-08-25 19:40:01,738 - Mesh - INFO - Initializing Mesh object from HexaData object.
2024-08-25 19:40:01,741 - Mesh - INFO - Initializing common attributes.
2024-08-25 19:40:01,742 - Mesh - INFO - Creating connectivity
2024-08-25 19:40:01,990 - Mesh
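
When a file is read in parallel, each rank holds only a portion of the elements. The sketch below shows one common way to split a global element count into contiguous blocks. The function name partition_elements and the near-even split rule are assumptions for illustration, and the library's actual distribution may differ.

```python
def partition_elements(glb_nelv, size):
    """Return a (count, offset) pair per rank for a near-even contiguous split."""
    base, extra = divmod(glb_nelv, size)
    parts = []
    offset = 0
    for rank in range(size):
        # The first `extra` ranks take one additional element.
        count = base + (1 if rank < extra else 0)
        parts.append((count, offset))
        offset += count
    return parts


for rank, (count, offset) in enumerate(partition_elements(10, 4)):
    print(f"rank {rank}: {count} elements starting at global element {offset}")
```

With mpi4py, the rank and the number of ranks would come from comm.Get_rank() and comm.Get_size().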

#### Initializing from an ndarray

In some instances, one might create the SEM coordinates directly in Python. If that is the case, starting directly from the coordinates as arrays is also a possibility. One would first create an empty object and then initialize it from the coordinates as follows:

In [6]:
# 1. Copy the coordinates from before, just for illustration

x = msh_3d.x.copy().astype(np.float64)
y = msh_3d.y.copy().astype(np.float64)
z = msh_3d.z.copy().astype(np.float64)

# 2. Initialize a new mesh object
msh_3d_sub = Mesh(comm, create_connectivity=True)
msh_3d_sub.init_from_coords(comm, x=x, y=y, z=z)

2024-08-25 19:40:02,007 - Mesh - INFO - Initializing empty Mesh object.
2024-08-25 19:40:02,008 - Mesh - INFO - Initializing Mesh object from x,y,z ndarrays.
2024-08-25 19:40:02,009 - Mesh - INFO - Initializing common attributes.
2024-08-25 19:40:02,010 - Mesh - INFO - Creating connectivity
2024-08-25 19:40:02,187 - Mesh - INFO - Mesh object initialized.
2024-08-25 19:40:02,187 - Mesh - INFO - Mesh data is of type: float64
2024-08-25 19:40:02,188 - Mesh - INFO - Elapsed time: 0.17974101399999998s


In this case, just to show a simple manipulation, we cast the arrays from single to double precision.
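
If no file is available, coordinate arrays can also be built from scratch. The sketch below creates x, y and z for a row of hexahedral elements on Gauss-Lobatto-Legendre points. It assumes the arrays are indexed as (element, z, y, x), which is consistent with the four-index arrays used elsewhere in this notebook but should be checked against the library. The helper name gll_points is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre


def gll_points(n):
    """Gauss-Lobatto-Legendre points on [-1, 1] for n points (n >= 2)."""
    # Interior points are the roots of the derivative of P_{n-1}.
    interior = Legendre.basis(n - 1).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))


lx = 4                             # points per direction in each element
nelv = 3                           # unit elements stacked along x
r = 0.5 * (gll_points(lx) + 1.0)   # reference points mapped to [0, 1]

x = np.zeros((nelv, lx, lx, lx))
y = np.zeros_like(x)
z = np.zeros_like(x)
for e in range(nelv):
    # Last index runs along x, the middle one along y, the first local one along z.
    x[e] = e + r[np.newaxis, np.newaxis, :]
    y[e] = r[np.newaxis, :, np.newaxis]
    z[e] = r[:, np.newaxis, np.newaxis]

print(x.shape, x.min(), x.max())
```

Arrays built this way could then be passed to init_from_coords in the same way as the copied coordinates in the cell above.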

## Coef

The coef object holds the components of the Jacobian matrix, the mass matrix, and routines to compute derivatives and integrals on the SEM mesh. It is always initialized from a mesh object.

One additional option for 3D meshes is to also obtain integration weights for the area of the facets in the SEM mesh. If you do not need them, do not activate this option, as it takes extra time and memory.

The data type of the coef object attributes will match the type of the mesh object.

In [7]:
coef_2d = Coef(msh_2d, comm, get_area=True)
coef_3d = Coef(msh_3d, comm, get_area=True)

2024-08-25 19:40:02,195 - Coef - INFO - Initializing Coef object
2024-08-25 19:40:02,197 - Coef - INFO - Getting derivative matrices
2024-08-25 19:40:02,199 - Coef - INFO - Calculating the components of the jacobian
2024-08-25 19:40:02,210 - Coef - INFO - Calculating the jacobian determinant and inverse of the jacobian matrix
2024-08-25 19:40:02,212 - Coef - INFO - Calculating the mass matrix
2024-08-25 19:40:02,213 - Coef - INFO - Coef object initialized
2024-08-25 19:40:02,213 - Coef - INFO - Coef data is of type: float64
2024-08-25 19:40:02,214 - Coef - INFO - Elapsed time: 0.01826359200000005s
2024-08-25 19:40:02,214 - Coef - INFO - Initializing Coef object
2024-08-25 19:40:02,215 - Coef - INFO - Getting derivative matrices
2024-08-25 19:40:02,216 - Coef - INFO - Calculating the components of the jacobian
2024-08-25 19:40:02,275 - Coef - INFO - Calculating the jacobian determinant and inverse of the jacobian matrix
2024-08-25 19:40:02,282 - Coef - INFO - Calculating the mass matrix
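
As background on what the mass matrix is used for, the following self-contained sketch integrates a function over two 1D elements with Gauss-Lobatto-Legendre quadrature. Here the diagonal mass matrix is the quadrature weight times the Jacobian of the element mapping. The helper gll_points_weights and the 1D setup are assumptions for illustration, not the Coef implementation.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre


def gll_points_weights(n):
    """GLL points and weights on [-1, 1] for n points (n >= 2)."""
    p = Legendre.basis(n - 1)
    interior = np.sort(p.deriv().roots().real)
    pts = np.concatenate(([-1.0], interior, [1.0]))
    # w_i = 2 / (N (N + 1) P_N(x_i)^2) with N = n - 1.
    wts = 2.0 / (n * (n - 1) * p(pts) ** 2)
    return pts, wts


pts, wts = gll_points_weights(3)

# Two elements covering [0, 1] and [1, 2].
edges = np.array([[0.0, 1.0], [1.0, 2.0]])
jac = 0.5 * (edges[:, 1] - edges[:, 0])                # dx/dr per element
x = edges[:, :1] + jac[:, None] * (pts[None, :] + 1)   # physical points
mass = jac[:, None] * wts[None, :]                     # diagonal mass matrix

f = x ** 2
integral = np.sum(mass * f)
print(integral)  # should be close to the exact value 8/3
```

Summing the field times the mass matrix is all that is needed for the integral, which is why the mass matrix is worth precomputing once per mesh.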

## Field

The field object is designed to hold the data from SEM fields in a way that makes interfacing with a nek5000 binary file easy.

Initialization can be done in ways similar to those for the mesh, i.e., from hexadata or with the pynekread routine.

### Initializing

#### Initializing an empty object

We follow the same procedure to initialize an empty object.

In [8]:
fld_2d = Field(comm)
fld_3d = Field(comm)

2024-08-25 19:40:02,300 - Field - INFO - Initializing empty Field object
2024-08-25 19:40:02,301 - Field - INFO - Initializing empty Field object


#### Initializing from a file

Using the pynekread routine, one can follow the same procedure. In this case, if only the fld keyword is given as input, the mesh in the file will not be read.

In [9]:
pynekread(fname_2d, comm, data_dtype=np.double, fld=fld_2d)
pynekread(fname_3d, comm, data_dtype=np.single, fld=fld_3d)

2024-08-25 19:40:02,308 - pynekread - INFO - Reading file: ../data/mixlay0.f00001
2024-08-25 19:40:02,310 - pynekread - INFO - Reading field data
2024-08-25 19:40:02,313 - pynekread - INFO - File read
2024-08-25 19:40:02,314 - pynekread - INFO - Elapsed time: 0.005880368999999996s
2024-08-25 19:40:02,314 - pynekread - INFO - Reading file: ../data/rbc0.f00001
2024-08-25 19:40:02,316 - pynekread - INFO - Reading field data
2024-08-25 19:40:02,321 - pynekread - INFO - File read
2024-08-25 19:40:02,322 - pynekread - INFO - Elapsed time: 0.008018121999999961s


Note that you can read both the mesh and the fields with pynekread if you specify both keywords, i.e., msh= and fld=.


#### Initializing from hexadata

Just as for the mesh, the same interface is valid.

In [10]:
# 1.
data_2d = preadnek(fname_2d, comm, data_dtype=np.double)
data_3d = preadnek(fname_3d, comm, data_dtype=np.single)
# 2. 
fld_2d = Field(comm, data=data_2d)
fld_3d = Field(comm, data=data_3d)

2024-08-25 19:40:02,329 - preadnek - INFO - Reading file: ../data/mixlay0.f00001
2024-08-25 19:40:02,371 - preadnek - INFO - Elapsed time: 0.042511914999999956s
2024-08-25 19:40:02,374 - preadnek - INFO - Reading file: ../data/rbc0.f00001
2024-08-25 19:40:02,395 - preadnek - INFO - Elapsed time: 0.020635636999999818s
2024-08-25 19:40:02,397 - Field - INFO - Initializing Field object from HexaData
2024-08-25 19:40:02,404 - Field - INFO - Field object initialized
2024-08-25 19:40:02,405 - Field - INFO - Elapsed time: 0.008466679999999949s
2024-08-25 19:40:02,406 - Field - INFO - Initializing Field object from HexaData
2024-08-25 19:40:02,410 - Field - INFO - Field object initialized
2024-08-25 19:40:02,411 - Field - INFO - Elapsed time: 0.005523057999999859s



### Contents of the field object

The field object stores all its data in a dictionary called fields, whose keys follow the conventions of the nek5000 binary format.

The keys are:

In [11]:
for key in fld_2d.fields.keys():
    print(f'{key} has {len(fld_2d.fields[key])} fields')

print('=================')

for key in fld_3d.fields.keys():
    print(f'{key} has {len(fld_3d.fields[key])} fields')

vel has 2 fields
pres has 1 fields
temp has 1 fields
scal has 2 fields
=================
vel has 3 fields
pres has 1 fields
temp has 1 fields
scal has 0 fields


Each key holds a list of fields that depends on the contents of the file. Note how the two files that we test here contain different information.

To access the contents of the fields, one can do something like this:

In [12]:
u = fld_3d.fields['vel'][0]
v = fld_3d.fields['vel'][1]
w = fld_3d.fields['vel'][2]
p = fld_3d.fields['pres'][0]
t = fld_3d.fields['temp'][0]

### Adding new ndarrays to the field

If one wishes to add new data, it is as simple as appending arrays to the list under any of the keys of the field object. Given that nek5000 readers follow a certain logic, it is always safer to add new data to the scalars, unless one wishes to overwrite velocity, pressure, and/or temperature.

For example, we can add the velocity magnitude as a scalar with:

In [13]:
mag = np.sqrt(u**2 + v**2 + w**2)
fld_3d.fields['scal'].append(mag)
fld_3d.update_vars()

After adding new data, one must call the update_vars method to update the attributes that keep track of the number of fields in the field object. This is needed, for example, if one wishes to write the data to disk.
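
The reason for the extra call can be illustrated with a small stand-in class. MiniField below is hypothetical and only mimics the idea of cached counters. Its attribute names are not taken from the library.

```python
import numpy as np


class MiniField:
    """Hypothetical stand-in: a dict of lists plus cached field counts."""

    def __init__(self):
        self.fields = {"vel": [], "pres": [], "temp": [], "scal": []}
        self.update_vars()

    def update_vars(self):
        # Recount the arrays stored under each key.
        self.counts = {key: len(val) for key, val in self.fields.items()}


fld = MiniField()
fld.fields["scal"].append(np.zeros((2, 1, 4, 4)))

stale = dict(fld.counts)   # counters still describe the old state
fld.update_vars()
fresh = dict(fld.counts)   # counters now include the new scalar

print(stale["scal"], fresh["scal"])
```

Appending to a list does not notify the object that owns it, so any bookkeeping derived from the lists has to be refreshed explicitly.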

## Writing out data

Writing data out always requires both a mesh and a field object, even if one does not wish to write the mesh.

The procedure is as follows:

In [14]:
fname_out = '../data/out_rbc0.f00001'

pynekwrite(fname_out, comm, msh=msh_3d, fld=fld_3d, write_mesh=True, wdsz=4)

2024-08-25 19:40:02,441 - pynekwrite - INFO - Writing file: ../data/out_rbc0.f00001
2024-08-25 19:40:02,453 - pynekwrite - INFO - Elapsed time: 0.01182564400000019s
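
The call above passes wdsz=4. Assuming this is the word size in bytes, as in the nek5000 file header (4 for single and 8 for double precision), the sketch below shows with plain NumPy what storing double precision data in 4-byte words does to the values. No library routine is called.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((10, 8, 8, 8))          # double precision in memory

wdsz = 4                                   # assumed: bytes per value on disk
disk_dtype = np.float32 if wdsz == 4 else np.float64

stored = data.astype(disk_dtype)           # values as they would be written
restored = stored.astype(np.float64)       # values as they would be read back

print("bytes on disk:", stored.nbytes)
print("exactly equal:", np.array_equal(data, restored))
print("close at single precision:", np.allclose(data, restored, rtol=1e-6))
```

Data that is already in single precision, such as the 3D fields read earlier with np.single, would not lose anything under this assumption.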


## Field registry

The field registry is a class that extends the field class. We believe this class should be used instead of the field class, as it allows one to do the same things with some extra flexibility.

The methods to initialize it and to write data with it are the same. It also contains the fields attribute with the lists of the fields present.

It has, however, an additional registry attribute that names and points to each field.

In [15]:
fld_3d_r = FieldRegistry(comm)
pynekread(fname_3d, comm, data_dtype=np.single, fld=fld_3d_r)

2024-08-25 19:40:02,461 - Field - INFO - Initializing empty Field object
2024-08-25 19:40:02,462 - pynekread - INFO - Reading file: ../data/rbc0.f00001
2024-08-25 19:40:02,464 - pynekread - INFO - Reading field data
2024-08-25 19:40:02,467 - pynekread - INFO - File read
2024-08-25 19:40:02,468 - pynekread - INFO - Elapsed time: 0.006165125999999965s


In [16]:
for key in fld_3d_r.registry.keys():
    print(f'{key} has {fld_3d_r.registry[key].dtype} dtype')

u has float32 dtype
v has float32 dtype
w has float32 dtype
p has float32 dtype
t has float32 dtype


Note that all the fields that were previously addressed by index in a list have now been assigned names.

### Adding new fields

#### From memory

Adding new fields is now also very easy. Here we add a new field named mag, which is the velocity magnitude calculated earlier.

This field will be added as a scalar to the fields attribute.

In [17]:
fld_3d_r.add_field(comm, field_name='mag', field=mag, dtype=mag.dtype)
print(f'Field mag added to registry and fields directory in pos {fld_3d_r.registry_pos["mag"]}')

Field mag added to registry and fields directory in pos scal_0


#### From disk

You can also read just one field from a file, which reduces the memory footprint.

The procedure is shown below. Note that here we know that we have written the field mag as a scalar.

In [18]:
fld_3d_r.add_field(comm, field_name='mag_r', file_name=fname_out, file_type='fld', file_key='scal_0', dtype=mag.dtype)
print(f'Field mag_r added to registry and fields directory in pos {fld_3d_r.registry_pos["mag_r"]}')

2024-08-25 19:40:02,493 - pynekread_field - INFO - Reading field: scal_0 from file: ../data/out_rbc0.f00001
2024-08-25 19:40:02,497 - pynekread_field - INFO - File read
2024-08-25 19:40:02,498 - pynekread_field - INFO - Elapsed time: 0.004441335999999962s
Field mag_r added to registry and fields directory in pos scal_1
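
The memory saving comes from reading only the bytes of the requested field. The sketch below illustrates the idea with a raw binary file and np.fromfile. The file layout (three equally sized arrays written back to back, with no header) is an assumption for illustration and is not the nek5000 format.

```python
import os
import tempfile

import numpy as np

shape = (5, 1, 4, 4)
n = int(np.prod(shape))
fields = [np.full(shape, float(i), dtype=np.float32) for i in range(3)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fields.bin")
    with open(path, "wb") as fh:
        for arr in fields:
            fh.write(arr.tobytes())

    # Read only the third array: skip two arrays' worth of bytes.
    itemsize = np.dtype(np.float32).itemsize
    one = np.fromfile(path, dtype=np.float32, count=n, offset=2 * n * itemsize)
    one = one.reshape(shape)

print(one.nbytes, "bytes read out of", 3 * n * itemsize)
```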


Let's compare the field that we previously wrote with the one we calculated:

In [19]:
eq = np.allclose(fld_3d_r.registry['mag'], fld_3d_r.registry['mag_r'])
print(eq)

True


### Some considerations when dealing with the registry

In our implementation, each member of the registry is linked to a member of the lists contained in the fields attribute.

For the link to be maintained, we must modify the registry entries in place rather than reassign fields to them.

If you would like to replace an entry in the field registry, use the add_field method with the same key.

Let's experiment by adding a field full of ones. Based on the order of our operations, we know it is stored in scal 2. You can also check the registry_pos attribute:

In [20]:
fld_3d_r.add_field(comm, field_name='ones', field=np.ones_like(mag), dtype=mag.dtype)
print(fld_3d_r.registry_pos['ones'])
print(fld_3d_r.registry['ones'][100,0,:,:])
print(fld_3d_r.fields['scal'][2][100,0,:,:])

scal_2
[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]]
[[1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1.]]


If the registry entry is modified in place, the list is modified as well:

In [21]:
fld_3d_r.registry['ones'][100,0,:,:] = 2.0
print(fld_3d_r.fields['scal'][2][100,0,:,:])

[[2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2. 2. 2. 2.]]


The same happens in the opposite direction:

In [22]:
fld_3d_r.fields['scal'][2][100,0,:,:] = 3.0
print(fld_3d_r.registry['ones'][100,0,:,:])

[[3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3. 3. 3. 3.]]


However, if you try to assign a field directly to the registry, you will get an error:



In [23]:
zeros = np.zeros_like(mag)

try:
    fld_3d_r.registry['ones'] = zeros
except KeyError as e:
    print(e)

"Key 'ones' already exists. Cannot overwrite existing array without the add field method"


However, if you add the field again with the proper method, it works as expected:

In [24]:
fld_3d_r.add_field(comm, field_name='ones', field=zeros, dtype=zeros.dtype)
print(fld_3d_r.fields['scal'][2][100,0,:,:])

[[0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0.]]
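
The behaviour shown in the cells above follows from how Python and NumPy handle references. The sketch below reproduces it with a plain list and a small dictionary subclass. GuardedRegistry is a hypothetical stand-in written for this example, not the library class.

```python
import numpy as np


class GuardedRegistry(dict):
    """Hypothetical dict that refuses to rebind an existing key."""

    def __setitem__(self, key, value):
        if key in self:
            raise KeyError(f"Key '{key}' already exists.")
        super().__setitem__(key, value)


scal = [np.ones((2, 3))]          # plays the role of fields['scal']
registry = GuardedRegistry()
registry["ones"] = scal[0]        # both names refer to the same array

registry["ones"][0, :] = 2.0      # in-place edit is visible through the list
print(scal[0][0])

try:
    registry["ones"] = np.zeros((2, 3))   # rebinding would break the link
except KeyError as err:
    print("refused:", err)

print(np.shares_memory(registry["ones"], scal[0]))
```

Indexed assignment writes into the existing array, so every reference sees the change. Plain assignment to a key would instead bind the key to a different array and leave the list pointing at the old one, which is the situation the guard prevents.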


## Inspecting memory usage

To inspect the memory usage, one can use the monitoring module:

In [25]:
from pynektools.monitoring.memory_monitor import MemoryMonitor

mm = MemoryMonitor()

You can choose to inspect the total memory used, or check the usage of each attribute.

In [26]:
mm.object_memory_usage(comm, msh_3d, "mesh_3d", print_msg=False)
mm.object_memory_usage_per_attribute(comm, msh_3d, "mesh_3d", print_msg=False)

mm.object_memory_usage(comm, coef_3d, "coef_3d", print_msg=False)
mm.object_memory_usage_per_attribute(comm, coef_3d, "coef_3d", print_msg=False)

mm.object_memory_usage(comm, fld_3d_r, "fld_3d", print_msg=False)
mm.object_memory_usage_per_attribute(comm, fld_3d_r, "fld_3d", print_msg=False)
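
For a rough cross-check of the numbers reported by the monitor, the size of a NumPy array can be computed from its nbytes attribute. The shape below is hypothetical and the conversion assumes 1 MB = 2**20 bytes. How the monitor itself measures memory is not shown here, and it may include a small per-object overhead.

```python
import sys

import numpy as np

# Hypothetical coordinate array: 600 elements, 8 x 8 x 8 points, float32.
x = np.zeros((600, 8, 8, 8), dtype=np.float32)

data_mb = x.nbytes / 2**20             # raw data only
total_mb = sys.getsizeof(x) / 2**20    # data plus the ndarray object itself

print(f"data:  {data_mb} MB")
print(f"total: {total_mb} MB")
```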

### Report

Report the information.

Note that for field registry objects, the information of registry and fields will appear duplicated. We have seen, however, that they point to the same places in memory, so there is no need to worry about memory duplication.

In [27]:
if comm.Get_rank() == 0:
    for key in mm.object_report.keys():
        mm.report_object_information(comm, key)
        print('===================================')


Rank: 0 - Memory usage of mesh_3d: 25.22057342529297 MB
Rank: 0 - Memory usage of mesh_3d attributes:
Rank: 0 - Memory usage of mesh_3d attr - coord_hash_to_shared_map: 21.699501037597656 MB
Rank: 0 - Memory usage of mesh_3d attr - create_connectivity_bool: 3.0517578125e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - gdim: 3.0517578125e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - glb_nelv: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - global_element_number: 0.00469970703125 MB
Rank: 0 - Memory usage of mesh_3d attr - lx: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - lxyz: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - ly: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - lz: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - nelv: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - offset_el: 4.57763671875e-05 MB
Rank: 0 - Memory usage of mesh_3d attr - x: 1.1719970703125 MB
Rank: 0 - Memory usage of mes