In [1]:
import numpy as np
import tensorflow as tf
import atomdnn
from atomdnn import data
from atomdnn import network
from atomdnn.data import *
from atomdnn.network import Network

# Create a Data object and read input data from LAMMPS dump files

### The read_inputdata function:

### read_inputdata(fp_filename=None, der_filename=None, image_num=None, read_der=TFAtomDNN.compute_force)

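- **fp_filename**: path to the fingerprints file; use the wildcard `*` in the name to read a series of files.

- **der_filename**: path to the derivatives file; use the wildcard `*` in the name to read a series of files.

- **image_num**: if not set, all images are read.

- **read_der**: set to True to read the derivative data.

- **append**: not listed in the signature above; the second call in this notebook passes `append=True` to add the new images to those already read.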
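Before reading a long series, it can help to preview which files a wildcard pattern would pick up. The sketch below does not use atomdnn; it only illustrates wildcard expansion with the standard library. The helper name `list_series` is hypothetical, and the assumption that files end in an integer suffix (`dump_fp.1`, `dump_fp.2`, ...) is inferred from the `dump_fp.i` shown in the log, not confirmed by the library.

```python
import glob
import os
import tempfile


def list_series(pattern):
    """Return files matching `pattern`, ordered by their integer suffix.

    Hypothetical helper: assumes each file name ends in '.<integer>'.
    """
    files = glob.glob(pattern)
    # Sort numerically so that dump_fp.10 comes after dump_fp.2.
    return sorted(files, key=lambda f: int(f.rsplit(".", 1)[1]))


# Demo on throwaway empty files in a temporary directory.
tmpdir = tempfile.mkdtemp()
for i in (1, 2, 10):
    open(os.path.join(tmpdir, f"dump_fp.{i}"), "w").close()

matched = [os.path.basename(f) for f in list_series(os.path.join(tmpdir, "dump_fp.*"))]
print(matched)  # ['dump_fp.1', 'dump_fp.2', 'dump_fp.10']
```

A plain lexicographic sort would place `dump_fp.10` before `dump_fp.2`, which is why the sketch sorts on the numeric suffix.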


`read_inputdata` generates input data in the form of TensorFlow tensors, which can be accessed using the following keys:

- **input_dict [ 'fingerprints' ] [ i ] [ j ] [ k ]** gives the k-th fingerprint of the j-th atom in the i-th image.

- **input_dict [ 'atom_type' ] [ i ] [ j ]** gives the atom type of the j-th atom in the i-th image.

- **input_dict [ 'dGdr' ] [ i ] [ j ] [ k ] [ m ]** gives the derivative of the k-th fingerprint in the j-th derivative data block of the i-th image w.r.t. the m-th coordinate.

- **input_dict [ 'center_atom_id' ] [ i ] [ j ]** gives the center atom id in the j-th derivative data block of the i-th image.

- **input_dict [ 'neighbor_atom_id' ] [ i ] [ j ]** gives the neighbor atom id in the j-th derivative data block of the i-th image. Note that the neighbors could be ghost atoms.

- **input_dict [ 'neighbor_atom_coord' ] [ i ] [ j ] [ m ] [ 0 ]** gives the m-th coordinate of the neighbor atom in the j-th derivative data block of the i-th image. Note that the last dimension of the array has size 1; it is added for the matrix multiplication in the neural network's force_stress_layer.
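To make the indexing concrete, here is a minimal sketch that builds placeholder NumPy arrays with the shapes described in the list. It does not call atomdnn. The 24 atoms and 59 fingerprints come from the log output in this notebook; `n_images = 2` and `n_blocks = 5` are arbitrary values chosen for illustration, since the real number of derivative blocks depends on the neighbor lists.

```python
import numpy as np

# 24 atoms and 59 fingerprints match the log; the other sizes are illustrative.
n_images, n_atoms, n_fp = 2, 24, 59
n_blocks = 5  # hypothetical number of derivative data blocks per image

# Placeholder arrays laid out as described in the key list (all values are dummies).
input_dict = {
    "fingerprints": np.zeros((n_images, n_atoms, n_fp)),
    "atom_type": np.ones((n_images, n_atoms), dtype=int),
    "dGdr": np.zeros((n_images, n_blocks, n_fp, 3)),
    "center_atom_id": np.zeros((n_images, n_blocks), dtype=int),
    "neighbor_atom_id": np.zeros((n_images, n_blocks), dtype=int),
    "neighbor_atom_coord": np.zeros((n_images, n_blocks, 3, 1)),
}

shapes = {key: value.shape for key, value in input_dict.items()}
for key, shape in shapes.items():
    print(f"{key:20s} {shape}")

# Derivative of fingerprint k=7 in block j=3 of image i=0 w.r.t. coordinate m=2.
value = input_dict["dGdr"][0][3][7][2]
```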

In [2]:
grdata = Data()

In [3]:
fp_filename = '/workspace/data/group_share/data_graphene_ase/deform/dump_fp.*'
der_filename = '/workspace/data/group_share/data_graphene_ase/deform/dump_der.*'
grdata.read_inputdata(fp_filename=fp_filename,der_filename=der_filename)


Reading fingerprints data from LAMMPS dump files /workspace/data/group_share/data_graphene_ase/deform/dump_fp.i
  so far read 50 images ...
  so far read 100 images ...
  so far read 150 images ...
  so far read 200 images ...
  so far read 250 images ...
  so far read 300 images ...
  Finish reading fingerprints from total 300 images.

  image number = 300
  max number of atom = 24
  number of fingerprints = 59
  type of atoms = 1

Reading derivative data from a series of files /workspace/data/group_share/data_graphene_ase/deform/dump_der.i
This may take a while ...
  so far read 50 images ...
  so far read 100 images ...
  so far read 150 images ...
  so far read 200 images ...
  so far read 250 images ...
  so far read 300 images ...
  Finish reading dGdr derivatives from total 300 images.

  Pad zeros to derivatives data if needed ...
  so far paded 50 images ...
  so far paded 100 images ...
  so far paded 150 images ...
  so far paded 200 images ...
  so far paded 250 images ...

In [4]:
fp_filename = '/workspace/data/group_share/data_graphene_ase/move/dump_fp.*'
der_filename = '/workspace/data/group_share/data_graphene_ase/move/dump_der.*'
grdata.read_inputdata(fp_filename=fp_filename,der_filename=der_filename,append=True)


Reading fingerprints data from LAMMPS dump files /workspace/data/group_share/data_graphene_ase/move/dump_fp.i
  so far read 350 images ...
  so far read 400 images ...
  so far read 450 images ...
  so far read 500 images ...
  so far read 550 images ...
  so far read 600 images ...
  so far read 650 images ...
  so far read 700 images ...
  so far read 750 images ...
  so far read 800 images ...
  so far read 850 images ...
  so far read 900 images ...
  Finish reading fingerprints from total 900 images.

  image number = 900
  max number of atom = 24
  number of fingerprints = 59
  type of atoms = 1

Reading derivative data from a series of files /workspace/data/group_share/data_graphene_ase/move/dump_der.i
This may take a while ...
  so far read 350 images ...
  so far read 400 images ...
  so far read 450 images ...
  so far read 500 images ...
  so far read 550 images ...
  so far read 600 images ...
  so far read 650 images ...
  so far read 700 images ...
  so far read 750 imag
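
Judging from the log, `append=True` adds the 600 images in `move/` to the 300 already read from `deform/`, for 900 in total. How atomdnn does this internally is not shown here; conceptually it amounts to concatenating along the image axis, which can be sketched with placeholder NumPy arrays (the array names are hypothetical):

```python
import numpy as np

# Placeholder fingerprint arrays: [image, atom, fingerprint].
deform_fp = np.zeros((300, 24, 59))  # first read: 300 images
move_fp = np.zeros((600, 24, 59))    # second read: 600 more images

# Appending stacks the new images after the existing ones (axis 0 = image index).
combined = np.concatenate([deform_fp, move_fp], axis=0)
print(combined.shape)  # (900, 24, 59)
```

Because the image index is shared by inputs and outputs, the output files should be appended in the same order as the input files.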

# Read output data generated using LAMMPS

In [5]:
output_file = '/workspace/data/group_share/data_graphene_ase/deform/lmp_output.dat'

grdata.read_outputdata(filename=output_file)


Reading outputs from /workspace/data/group_share/data_graphene_ase/deform/lmp_output.dat ...
  so far read 50 images ...
  so far read 100 images ...
  so far read 150 images ...
  so far read 200 images ...
  so far read 250 images ...
  Finish reading outputs from total 300 images.



In [6]:
output_file = '/workspace/data/group_share/data_graphene_ase/move/lmp_output.dat'

grdata.read_outputdata(filename=output_file,append=True)


Reading outputs from /workspace/data/group_share/data_graphene_ase/move/lmp_output.dat ...
  so far read 350 images ...
  so far read 400 images ...
  so far read 450 images ...
  so far read 500 images ...
  so far read 550 images ...
  so far read 600 images ...
  so far read 650 images ...
  so far read 700 images ...
  so far read 750 images ...
  so far read 800 images ...
  so far read 850 images ...
  Finish reading outputs from total 900 images.



# Create a TensorFlow dataset for training

In [7]:
# convert data to tensors
grdata.convert_data_to_tensor()

Conversion may take a while ...


2021-10-07 02:31:19.727665: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9673 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:19:00.0, compute capability: 7.5
2021-10-07 02:31:19.728748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9673 MB memory:  -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:1a:00.0, compute capability: 7.5
2021-10-07 02:31:19.729570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:2 with 9673 MB memory:  -> device: 2, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:67:00.0, compute capability: 7.5
2021-10-07 02:31:19.730386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:3 with 9673 MB memory:  -> device: 3, name: NVIDIA GeForce RTX

It took 69.3440 second.


In [8]:
tf_dataset = tf.data.Dataset.from_tensor_slices((grdata.input_dict,grdata.output_dict))
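
`tf.data.Dataset.from_tensor_slices` slices every array in the two dictionaries along the first axis, so each dataset element is one image: a pair of (input dict, output dict). The sketch below shows this on tiny placeholder arrays rather than on `grdata`. The key `'energy'` in the output dictionary is a hypothetical name used only for illustration; the actual keys of `grdata.output_dict` are not shown in this notebook.

```python
import numpy as np
import tensorflow as tf

n_images = 4  # small illustrative size

# Placeholder dictionaries; the first axis of every array is the image index.
input_dict = {"fingerprints": np.zeros((n_images, 24, 59), dtype=np.float32)}
output_dict = {"energy": np.zeros((n_images,), dtype=np.float32)}  # hypothetical key

ds = tf.data.Dataset.from_tensor_slices((input_dict, output_dict))

# Each element drops the leading image axis.
inputs_spec, outputs_spec = ds.element_spec
n_elements = int(ds.cardinality().numpy())
print(n_elements, inputs_spec["fingerprints"].shape)
```

All arrays must share the same first dimension, which is why the padded derivative data and the outputs need to cover the same set of images.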

In [9]:
tf.data.experimental.save(tf_dataset, '/workspace/data/group_share/graphene_tfdataset')

2021-10-07 02:32:57.012530: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
