# A quick guide to some WEAVE tools

## The problem

In this demo we simulate dropping a ball and letting it bounce.

We will show how to take the simulation results and convert them to the Sina format for ingestion in other scripts or in a Sina store.

## The "simulation"

We will be using [this script](ball_bounce.py) to generate a single simulation of a bouncing ball:

```bash
usage: ball_bounce.py [-h] [--xpos XPOS] [--ypos YPOS] [--zpos ZPOS]
                      [--xvel XVEL] [--yvel YVEL] [--zvel ZVEL]
                      [--gravity GRAVITY] [--box_side_length BOX_SIDE_LENGTH]
                      [--runtime RUNTIME] [--frequency FREQUENCY]
                      [--drag DRAG] [--output OUTPUT] [--group GROUP]
                      [--run RUN]

optional arguments:
  -h, --help            show this help message and exit
  --xpos XPOS, -x XPOS  initial x position (default: 0.0)
  --ypos YPOS, -y YPOS  initial y position (default: 0.0)
  --zpos ZPOS, -z ZPOS  initial z position (default: 0.0)
  --xvel XVEL, -X XVEL  initial x velocity (default: 0.0)
  --yvel YVEL, -Y YVEL  initial y velocity (default: 0.0)
  --zvel ZVEL, -Z ZVEL  initial z velocity (default: 0.0)
  --gravity GRAVITY, -g GRAVITY
                        gravity (default: 9.81)
  --box_side_length BOX_SIDE_LENGTH, -b BOX_SIDE_LENGTH
                        length of the box's sides (default: 10)
  --runtime RUNTIME, -r RUNTIME
                        length of time we let the simualtion run for (default:
                        20)
  --frequency FREQUENCY, --ticks_per_seconds FREQUENCY
                        sampling rate (default: 20)
  --drag DRAG, -d DRAG  drag coefficient (default: 0.1)
  --output OUTPUT, -o OUTPUT
                        output file (default: None)
  --group GROUP, -G GROUP
                        group id (default: 1)
  --run RUN, -R RUN     run id (default: 1)
```


This simulation produces a delimiter-separated values (`dsv`) file containing the results.
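
As a rough illustration of what reading such a file can look like, the sketch below parses delimiter-separated text with Python's `csv` module. The `%` delimiter, the column names, and the sample rows are all assumptions made for this example; check a file actually written by `ball_bounce.py` for the real layout.

```python
import csv
import io


def read_dsv(handle, delimiter="%"):
    """Return the rows of a delimiter-separated file as dicts of floats.

    The first row is treated as the header. The "%" default is an
    assumption for this sketch, not necessarily what ball_bounce.py writes.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [{name: float(value) for name, value in row.items()} for row in reader]


# Hypothetical file contents: one header row, then one row per sample.
sample = io.StringIO("time%x_pos%y_pos\n0.0%1.0%2.0\n0.05%1.5%1.9\n")
rows = read_dsv(sample)
print(rows[1]["x_pos"])
```

Passing an open file object instead of the `io.StringIO` stand-in would read a file from disk in the same way.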

## Running many parameters

### Basic maestro

We can easily run many of these simulations with Maestro using [this yaml file](ball_bounce_simple.yaml):

```bash
maestro run ball_bounce_simple.yaml
```
You can use the `maestro status` command to check on the progress of your study.

In [None]:
input("Press enter when study is done")

### PGEN

As you can guess, if the number of simulations increases, it becomes very tedious to enter all these numbers in the yaml file by hand.

Fortunately, Maestro supports Python generation of the parameters (PGEN). [This file](pgen.py) will generate 20 random samples for us.
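
To give an idea of what a parameter generator can look like, here is a minimal sketch. It is not the content of `pgen.py`: the parameter names and ranges are hypothetical, and the `get_custom_generator` function assumes Maestro's `ParameterGenerator` interface and needs `maestrowf` installed when Maestro calls it.

```python
import random


def sample_parameters(n_samples=20, seed=None):
    """Draw n_samples random initial conditions, one list per parameter."""
    rng = random.Random(seed)
    # Hypothetical parameter names and ranges, chosen only for illustration.
    ranges = {
        "X_POS_INITIAL": (0.0, 100.0),
        "Y_POS_INITIAL": (0.0, 100.0),
        "Z_POS_INITIAL": (0.0, 100.0),
        "X_VEL_INITIAL": (-10.0, 10.0),
    }
    return {
        name: [round(rng.uniform(low, high), 2) for _ in range(n_samples)]
        for name, (low, high) in ranges.items()
    }


def get_custom_generator(env, **kwargs):
    """Entry point Maestro looks for in the file passed with `-p`."""
    # Imported here so the sampling code above runs without maestrowf.
    from maestrowf.datastructures.core import ParameterGenerator

    p_gen = ParameterGenerator()
    for name, values in sample_parameters().items():
        # "%%" in the label is replaced by the value of each sample.
        p_gen.add_parameter(name, values, f"{name}.%%")
    return p_gen


params = sample_parameters(n_samples=3, seed=0)
print({name: len(values) for name, values in params.items()})
```

Keeping the sampling in its own function makes it easy to check the generated values without launching a study.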

## Keeping track of what we ran: Sina

As the number of simulations grows, it quickly becomes hard to keep track of what we ran.

Sina can help with this.

## Creating sina records from the simulation results

The [following script](dsv_to_sina.py) can comb through our generated `dsv` files and ingest them into a Sina catalog.

Some LLNL codes have Sina built in and produce the `.json` files as they run. You could also run the `sina ingest` command on these files to create the store.
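
To make the record format more concrete, the sketch below assembles a Sina-style JSON document using only the standard library. It mirrors the structure printed by `rec.raw` later in this guide, but it is an illustration rather than the code in `dsv_to_sina.py`; the record id and the values are made up.

```python
import json


def to_sina_document(record_id, scalars, curves, record_type="csv_rec"):
    """Build a dict in Sina's JSON layout holding a single record."""
    record = {
        "id": record_id,
        "type": record_type,
        # Each scalar is wrapped as {"value": ...}.
        "data": {name: {"value": value} for name, value in scalars.items()},
        "curve_sets": {
            "physics_cycle_series": {
                # "cycle" is the independent variable, everything else depends on it.
                "independent": {"cycle": {"value": curves["cycle"]}},
                "dependent": {
                    name: {"value": values}
                    for name, values in curves.items()
                    if name != "cycle"
                },
            }
        },
    }
    return {"records": [record]}


doc = to_sina_document(
    "example_1",  # hypothetical id
    {"gravity": 9.81, "num_bounces": 3},
    {"cycle": [0, 1, 2], "time": [0.0, 0.05, 0.1], "x_pos": [1.0, 1.5, 2.0]},
)
print(json.dumps(doc, indent=2))
```

Writing `doc` to a `.json` file with `json.dump` would give a file of the kind that `sina ingest` is meant to read.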

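In [this maestro yaml file](ball_bounce_suite.yaml) we add an extra step that generates the store after the simulations have run.
Notice the `*` in the step dependency: it makes the step wait for every parameterized instance of the simulation step, funneling them all into a single step.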

Let's run the following command to generate data

```bash
maestro run -p pgen.py ball_bounce_suite.yaml
```
You can use the `maestro status` command to check on the progress of your study.

In [None]:
input("Press enter when study is done")

### Loading the store

Now that we have a store, let's open it up and run some queries on it.

In [3]:
import sina

store = sina.connect("output.sqlite")

In [4]:
# let's see what is in the store:
print(len(list(store.records.find())))

12


In [5]:
# let's open the record with the maximum number of bounces
rec = next(store.records.find_with_max("num_bounces", 1))
print(rec.raw)

{'id': '208393_9', 'type': 'csv_rec', 'data': {'x_pos_initial': {'value': 63.0}, 'y_pos_initial': {'value': 50.0}, 'z_pos_initial': {'value': 65.0}, 'x_vel_initial': {'value': 9.0}, 'y_vel_initial': {'value': 6.0}, 'z_vel_initial': {'value': -10.0}, 'gravity': {'value': 10.0}, 'box_side_length': {'value': 100.0}, 'group_id': {'value': 208393.0}, 'x_pos_final': {'value': 11.013506608296156}, 'y_pos_final': {'value': 0.0}, 'z_pos_final': {'value': 96.84875366741407}, 'x_vel_final': {'value': 0.3102552749812439}, 'y_vel_final': {'value': 0.0}, 'z_vel_final': {'value': 0.3332205903240224}, 'num_bounces': {'value': 8.0}}, 'curve_sets': {'physics_cycle_series': {'independent': {'cycle': {'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 7


## Sina

In [this notebook](visualization.ipynb) we take a look at some of Sina's query and viz capabilities.


## Kosh

We have seen how Sina can help us track our simulations and search through them.

Kosh is built on top of Sina and lets the user access data that is too big to be kept in the store.

In this example we will be working with small files.

In [6]:
import kosh

store = kosh.connect("output.sqlite")  # Similar syntax to Sina
# Let's confirm we have the same number of records (called datasets in Kosh)
print(len(list(store.find())))
# Let's open a record using the id we found in Sina above (the record with the maximum number of bounces)
# Kosh gives us direct access to Sina's queries as well.
rec = next(store.get_sina_records().find_with_max("num_bounces", 1))
dataset = store.open(rec["id"])
print(dataset)

12
KOSH DATASET
	id: 208393_9
	name: ???
	creator: ???

--- Attributes ---
	box_side_length: 100.0
	gravity: 10.0
	group_id: 208393.0
	num_bounces: 8.0
	x_pos_final: 11.013506608296156
	x_pos_initial: 63.0
	x_vel_final: 0.3102552749812439
	x_vel_initial: 9.0
	y_pos_final: 0.0
	y_pos_initial: 50.0
	y_vel_final: 0.0
	y_vel_initial: 6.0
	z_pos_final: 96.84875366741407
	z_pos_initial: 65.0
	z_vel_final: 0.3332205903240224
	z_vel_initial: -10.0
--- Associated Data (1)---
	Mime_type: sina/curve
		internal ( physics_cycle_series )
--- Ensembles (0)---
	[]
--- Ensemble Attributes ---
--- Alias Feature Dictionary ---


In [7]:
# Attributes on a Kosh dataset are easy to alter and instantly updated in the db by default
print("N bounces:",dataset.num_bounces)
dataset.my_new_attribute = 6.
print("New:", dataset.my_new_attribute)

N bounces: 8.0
New: 6.0


In [8]:
# One can also easily access curves:
print(dataset.list_features())
# Let's print only the first 5 times
dataset["physics_cycle_series/time"][:5]

['physics_cycle_series', 'physics_cycle_series/cycle', 'physics_cycle_series/time', 'physics_cycle_series/x_pos', 'physics_cycle_series/y_pos', 'physics_cycle_series/z_pos']


array([0.  , 0.05, 0.1 , 0.15, 0.2 ])

In [None]:
# Let's loop through all the records/datasets in this group, compute their velocities,
# and store them in an hdf5 file (outside of the db)
import h5py
import numpy
for ds in store.find(group_id=dataset.group_id):
    x_pos = ds["physics_cycle_series/x_pos"]
    y_pos = ds["physics_cycle_series/y_pos"]
    z_pos = ds["physics_cycle_series/z_pos"]
    time = ds["physics_cycle_series/time"]
    # Forward differences: each velocity array has one fewer entry than the positions
    dt = time[1:] - time[:-1]
    x_vel = (x_pos[1:] - x_pos[:-1]) / dt
    y_vel = (y_pos[1:] - y_pos[:-1]) / dt
    z_vel = (z_pos[1:] - z_pos[:-1]) / dt
    # Speed is the magnitude of the velocity vector, not the mean of its components
    speed = numpy.sqrt(x_vel**2 + y_vel**2 + z_vel**2)
    nm = f"output/vel_{ds.id}.hdf5"
    # The context manager closes the file even if a write fails
    with h5py.File(nm, "w") as h5:
        h5["x_vel"] = x_vel
        h5["y_vel"] = y_vel
        h5["z_vel"] = z_vel
        h5["speed"] = speed
    # Associate this new external data with the dataset
    ds.associate(nm, "hdf5")

print(ds)
print(ds.list_features())
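
The velocities in the loop above come from forward differences, so each velocity array has one fewer entry than the position arrays. The following pure-Python sketch checks that arithmetic on a made-up trajectory with constant velocity; the numbers are illustrative and do not come from the study.

```python
import math


def finite_difference(values, times):
    """Forward differences: (v[i+1] - v[i]) / (t[i+1] - t[i])."""
    return [
        (v1 - v0) / (t1 - t0)
        for v0, v1, t0, t1 in zip(values, values[1:], times, times[1:])
    ]


# Toy trajectory (made-up numbers) moving at a constant velocity of (3, 4, 0).
time = [0.0, 0.5, 1.0, 1.5]
x_pos = [0.0, 1.5, 3.0, 4.5]
y_pos = [0.0, 2.0, 4.0, 6.0]
z_pos = [1.0, 1.0, 1.0, 1.0]

x_vel = finite_difference(x_pos, time)
y_vel = finite_difference(y_pos, time)
z_vel = finite_difference(z_pos, time)
# Speed taken as the magnitude of the velocity vector at each step.
speed = [
    math.sqrt(vx**2 + vy**2 + vz**2)
    for vx, vy, vz in zip(x_vel, y_vel, z_vel)
]
print(len(time), len(x_vel), speed)
```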

In [None]:
# We can access both curves and external data in the same way:
print(dataset["physics_cycle_series/x_pos"][:5])
print(dataset["x_vel"][:5])

In [None]:
# Kosh also offers the notion of ensembles, which is based on Sina's relationships
my_group = store.create_ensemble()
# Attributes of a group are shared by all members
my_group.a_group_attribute = "foo"

# Let's add our group members to this ensemble:
for ds in store.find(group_id=dataset.group_id):
    my_group.add(ds)

In [None]:
print(my_group)


In [None]:
print(ds.a_group_attribute)

In [None]:
# We could search the ensemble
dss = list(my_group.find_datasets(num_bounces=sina.utils.DataRange(min=6)))
print(len(dss))

In [None]:
# let's compute the average speed for this ensemble
# for this we will use an operator
@kosh.numpy_operator
def Avg(*inputs):
    avg = inputs[0][:].copy()  # copy so the in-place additions below do not modify the first input
    for input_ in inputs[1:]:
        avg += input_[:]
    return avg/len(inputs)


avg_speed = Avg(*( _["speed"] for _ in my_group.find_datasets()))[:]
print(avg_speed[:5])
            
    

In [None]:
# we can now store that result in a file and associate that file with the group
import numpy
nm = f"output/avg_speed_{my_group.id}.hdf5"
h5 = h5py.File(nm, "w")
h5["avg_speed"]= avg_speed
h5.close()

my_group.associate(nm, "hdf5")
my_group.group_speed = float(numpy.average(avg_speed))
print(my_group)
print(ds.group_speed)

In [None]:
print(my_group["avg_speed"][:5])