# Input and output representations
## Inputs
Each particle's input state vector represents:
- Position, $p_i^{t}$
- The $C=5$ most recent velocities, each computed as the difference in position between consecutive time steps: $\dot{p}^t_i=p^t_i-p^{t-1}_i$
- Features that capture the static material properties (e.g. water, sand, rigid). The material is expressed as a particle feature, $a_i$, represented with a learned embedding vector of size 16.
- The global properties of the system, $g$, which include external forces and global material properties.
- For datasets with fixed flat orthogonal walls, instead of adding boundary particles, a feature is added to each node indicating the vector distance to each wall, $d^{t}_i$. To maintain spatial translation invariance, this distance is clipped to the connectivity radius $R$, achieving a similar effect to that of the boundary particles.
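
The clipped wall-distance feature can be sketched as follows. This is a minimal illustration, not the dataset code: it assumes an axis-aligned box described by hypothetical `lower` and `upper` bounds, and the function name `clipped_wall_distances` is made up for this example.

```python
import numpy as np

def clipped_wall_distances(positions, lower, upper, radius):
    """Distances from each particle to each axis-aligned wall, clipped to `radius`.

    positions: (N, D) particle positions.
    lower, upper: (D,) box bounds (hypothetical inputs, not part of the dataset).
    Returns an (N, 2*D) array: [distances to lower walls, distances to upper walls].
    """
    positions = np.asarray(positions, dtype=float)
    dist_lower = positions - np.asarray(lower, dtype=float)
    dist_upper = np.asarray(upper, dtype=float) - positions
    distances = np.concatenate([dist_lower, dist_upper], axis=1)
    # Walls farther away than the connectivity radius all map to the same value,
    # so the feature no longer depends on where the particle sits inside the box.
    return np.clip(distances, -radius, radius)

R = 0.08
positions = np.array([[0.05, 0.50],
                      [0.50, 0.98]])
d = clipped_wall_distances(positions, lower=[0.0, 0.0], upper=[1.0, 1.0], radius=R)
print(d)
```

Only the first particle's distance to the left wall and the second particle's distance to the top wall fall below $R$; every other entry saturates at $R$.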

The feature vector of each particle is the concatenation of these components:
$$x^{t}_i = [p^{t}_i,\dot{p}^{t-C+1}_i,...,\dot{p}^{t}_i,a_i, g, d^{t}_i]$$
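
As a rough sketch of this concatenation, the snippet below assembles $x^t_i$ for all particles from a position trajectory. The function name `build_node_features` and its arguments are illustrative assumptions; in particular, `material` stands in for whatever per-particle material encoding is used (one-hot or learned embedding), and `wall_dist` is assumed to be precomputed.

```python
import numpy as np

C = 5  # number of velocity steps in the input window

def build_node_features(positions, t, material, global_feats, wall_dist):
    """Assemble x^t = [p^t, v^{t-C+1}, ..., v^t, a, g, d^t] for every particle.

    positions: (T, N, D) trajectory; material: (N, A); global_feats: (G,);
    wall_dist: (N, W). All argument names are illustrative.
    """
    assert t >= C, "C velocities require C + 1 consecutive positions"
    window = positions[t - C:t + 1]            # p^{t-C} .. p^t, shape (C+1, N, D)
    velocities = window[1:] - window[:-1]      # v^{t-C+1} .. v^t, shape (C, N, D)
    n = positions.shape[1]
    # Put the particle axis first, then flatten time and space into one axis
    vel_flat = velocities.transpose(1, 0, 2).reshape(n, -1)   # (N, C*D)
    global_feats = np.asarray(global_feats, dtype=float)
    g = np.broadcast_to(global_feats, (n, global_feats.shape[0]))
    return np.concatenate([positions[t], vel_flat, material, g, wall_dist], axis=1)

rng = np.random.default_rng(0)
positions = rng.random((7, 4, 3))              # 7 steps, 4 particles, 3D
material = np.eye(2)[[0, 0, 1, 1]]             # one-hot stand-in for a_i
gravity = np.array([0.0, -9.8, 0.0])           # example global feature g
wall_dist = np.zeros((4, 6))                   # placeholder for d^t_i
t = 5
x = build_node_features(positions, t, material, gravity, wall_dist)
print(x.shape)
```

With these example sizes each row has 3 + 15 + 2 + 3 + 6 = 29 entries.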


## Outputs
The prediction targets for supervised learning are the per-particle average accelerations:
$$\ddot{p}^t_i=\dot{p}^{t+1}_i-\dot{p}^t_i=p^{t+1}_i-2p^{t}_i+p^{t-1}_i$$
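
The second-difference form of the target can be checked numerically. The sketch below (the helper name `acceleration_targets` is an assumption for this example) computes the targets directly from positions and compares them with the difference of consecutive velocities on a trajectory with constant acceleration.

```python
import numpy as np

def acceleration_targets(positions):
    """Return p^{t+1} - 2 p^t + p^{t-1} for t = 1 .. T-2.

    positions: (T, N, D). Result: (T-2, N, D); entry k corresponds to t = k + 1.
    """
    positions = np.asarray(positions, dtype=float)
    return positions[2:] - 2.0 * positions[1:-1] + positions[:-2]

# One particle following p(t) = 0.5 * a * t^2, sampled at unit time steps
a = np.array([0.0, -2.0, 0.0])
steps = np.arange(6).reshape(-1, 1, 1)
positions = 0.5 * a * steps**2                 # shape (6, 1, 3)

acc = acceleration_targets(positions)

# Same quantity via velocities: v^t = p^t - p^{t-1}, target = v^{t+1} - v^t
vel = positions[1:] - positions[:-1]
acc_from_vel = vel[1:] - vel[:-1]
print(acc[0, 0], np.allclose(acc, acc_from_vel))
```

For this quadratic trajectory the finite-difference acceleration equals `a` at every step, and both expressions agree.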


## Normalization
...TODO...

## Noise
...TODO...



In [16]:
%%writefile ../open_gns/dataset.py

import numpy as np
import h5py
import torch
from torch_geometric.data import InMemoryDataset, Data
from torch_geometric.transforms import RadiusGraph

R=0.08 # Connectivity radius $R$

class GNSDataset(InMemoryDataset):
    def __init__(self, root, transform=None, pre_transform=None):
        super(GNSDataset, self).__init__(root, transform, pre_transform)
        self.data, self.slices = torch.load(self.processed_paths[0])
        
    @property
    def raw_file_names(self):
        return ['box_bath.hdf5']
    
    @property
    def processed_file_names(self):
        return ['box_bath_samples.pt']
    
    def process(self):
        # TODO: Split into train, test, val
        data_list = []
        # Read all positions & transform into features
        # raw_paths resolves raw_file_names against the dataset's raw directory
        f = h5py.File(self.raw_paths[0],'r')
        for k in range(1): # TODO: f.get('rollouts').keys():
            # Read positions
            positions = np.array(f.get(f'rollouts/{k}/positions'))
            num_steps = len(positions)
            # Calculate velocities
            velocities = np.concatenate(([np.zeros(positions[0].shape)],
                                        positions[1:] - positions[0:-1]),axis=0)
            # Calculate accelerations
            accelerations = np.concatenate(([np.zeros(velocities[0].shape)],
                                        velocities[1:] - velocities[0:-1]),axis=0)
            # Material properties (using one-hot encoding for now)
            m = np.zeros((len(positions[0]), 2))
            m[0:64] = [0,1] # First 64 particles are solid
            m[64:] = [1,0]
            # TODO: Global forces
            # TODO: Distance to walls
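            # Skip the first 6 steps (incomplete velocity history) and the last step
            # (the target at step t needs the position at step t+1)
            for t in range(6,num_steps-1):
                # Velocity window covers steps t-4..t, i.e. the C=5 most recent velocities
                x = np.concatenate((positions[t], m, np.concatenate(velocities[t-4:t+1], axis=1)), axis=1)
                # accelerations[t+1] = velocities[t+1] - velocities[t], the target for step t
                y = torch.tensor(accelerations[t+1]).float()
                data = Data(x=torch.tensor(x).float(), y=y, pos=torch.as_tensor(positions[t]))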
                # Apply pre-transform to get edges
                calculate_edges = self.pre_transform or RadiusGraph(R)
                data = calculate_edges(data)
                data_list.append(data)
        f.close()
        torch.save(self.collate(data_list), self.processed_paths[0])
    


Overwriting ../open_gns/dataset.py
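
To make the edge construction concrete, here is a brute-force sketch of what a radius graph computes: a directed edge between every pair of distinct particles closer than the connectivity radius. It is an illustration only, not the `RadiusGraph` implementation used above. The function name `radius_edges` is made up, the $O(N^2)$ distance matrix is only practical for small particle counts, and any limit on neighbors per node that the library transform may apply is not reproduced here.

```python
import numpy as np

def radius_edges(pos, r):
    """Return a (2, E) array of directed edges between points closer than `r`."""
    pos = np.asarray(pos, dtype=float)
    diff = pos[:, None, :] - pos[None, :, :]   # pairwise offsets, (N, N, D)
    dist = np.linalg.norm(diff, axis=-1)       # pairwise distances, (N, N)
    mask = dist < r
    np.fill_diagonal(mask, False)              # exclude self-loops
    src, dst = np.nonzero(mask)
    return np.stack([src, dst])

pos = np.array([[0.00, 0.0, 0.0],
                [0.05, 0.0, 0.0],
                [0.20, 0.0, 0.0]])
edges = radius_edges(pos, r=0.08)
print(edges)
```

Only the first two particles are within 0.08 of each other, so the result contains the two directed edges between them and the third particle stays isolated.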


In [17]:
dataset = GNSDataset('.')
print(len(dataset), dataset.num_edge_features, dataset.num_node_features)
print(dataset[0])

Processing...
Done!
143 0 20
Data(edge_index=[2, 15440], pos=[1024, 3], x=[1024, 20], y=[1024, 3])
