# Example use of the ChunkAvg class

The `ChunkAvg` class reads chunked output data from files generated by the LAMMPS `ave/chunk` command. It parses an `ave/chunk` output file and extracts the data into a set of [Pandas DataFrames](https://pandas.pydata.org/). Kaggle has a nice [Learn module on Pandas](https://www.kaggle.com/learn/pandas) if you are unfamiliar with Pandas and its DataFrame objects.

In this example we will read a sample `ave/chunk` output file named `temperature.chunkavg` that is included in the `sample-data/chunk-avg` folder. This file contains a few timepoints of a chunk-averaged temperature variable used to estimate a radial temperature profile.

We will go over how to use the `ChunkAvg` class to parse the data.

First, we can import the `ChunkAvg` class from the [lmpoa](https://github.com/NTBEL/lammps_output_analysis) package:

In [1]:
from lmpoa import ChunkAvg

We will also import the standard library package `os` to handle file paths:

In [2]:
import os

Then let's set the path to the sample file in the `sample-data` folder:

In [4]:
file = os.path.abspath('../sample-data/chunk-avg/temperature.chunkavg')

## Read the data

To read the output file we can create an instance of the `ChunkAvg` class with the path to the file:

In [5]:
chunk_avg = ChunkAvg(file)

For the `ChunkAvg` class the data is parsed into a set of Pandas DataFrames accessible with the `frames` attribute:

In [6]:
chunk_avg.frames

[<lmpoa.chunkavg.ChunkFrame at 0x12c674ee740>,
 <lmpoa.chunkavg.ChunkFrame at 0x12c674ef400>,
 <lmpoa.chunkavg.ChunkFrame at 0x12c07730d90>,
 <lmpoa.chunkavg.ChunkFrame at 0x12c07731000>,
 <lmpoa.chunkavg.ChunkFrame at 0x12c07731270>,
 <lmpoa.chunkavg.ChunkFrame at 0x12c077314e0>]

The `frames` attribute is a list of internally defined `ChunkFrame` objects, but each DataFrame can be accessed as follows:

In [9]:
# We'll index frames and call the object like a function to return the corresponding Pandas DataFrame.
# Index 0 will be the first timepoint.
type( chunk_avg.frames[0]() )

pandas.core.frame.DataFrame

In [10]:
df_0 = chunk_avg.frames[0]()

In [11]:
df_0.head()

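|   | Chunk | Coord1 | Ncount | v_temp |
|---|-------|--------|--------|---------|
| 0 | 1 | 0.5 | 1.0 | 355.238 |
| 1 | 2 | 1.5 | 0.0 | 0.0 |
| 2 | 3 | 2.5 | 10.0 | 311.14 |
| 3 | 4 | 3.5 | 3.0 | 142.148 |
| 4 | 5 | 4.5 | 12.0 | 370.698 |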


To get the chunk-averaged temperature data at that timepoint, we can access the `v_temp` column:

In [12]:
df_0['v_temp']

0     355.238
1       0.000
2     311.140
3     142.148
4     370.698
       ...   
63    297.798
64    305.971
65    297.029
66    300.546
67    298.426
Name: v_temp, Length: 68, dtype: float64

The exact columns and elements of each DataFrame will depend on the type of chunking and the outputs specified in the corresponding `ave/chunk` command. 
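
To illustrate why the columns vary, here is a minimal sketch of reading an `ave/chunk`-style file with plain Pandas. It is not the `lmpoa` implementation. It assumes the layout described in the LAMMPS documentation (three `#` header lines, then for each snapshot one line with the timestep, number of chunks, and total count, followed by one line per chunk). The `read_chunk_frames` helper, the fix ID `tprofile`, and the sample values are hypothetical and are not the contents of `temperature.chunkavg`.

```python
import io

import pandas as pd

# Illustrative text in the assumed ave/chunk layout (made-up values).
SAMPLE = """\
# Chunk-averaged data for fix tprofile and group all
# Timestep Number-of-chunks Total-count
# Chunk Coord1 Ncount v_temp
1000 3 11
  1 0.5 1 355.238
  2 1.5 0 0
  3 2.5 10 311.14
2000 3 13
  1 0.5 2 301.5
  2 1.5 1 298.0
  3 2.5 10 305.25
"""


def read_chunk_frames(handle):
    """Hypothetical helper: return {timestep: DataFrame} from an open text file."""
    handle.readline()  # title line
    handle.readline()  # description of the per-snapshot line
    # The third header line names the per-chunk columns, so the DataFrame
    # columns follow whatever the ave/chunk command was asked to output.
    columns = handle.readline().lstrip("#").split()
    frames = {}
    while True:
        line = handle.readline()
        if not line.strip():  # end of file or trailing blank line
            break
        timestep, n_chunks, _total_count = line.split()
        rows = [handle.readline().split() for _ in range(int(n_chunks))]
        frames[int(timestep)] = pd.DataFrame(rows, columns=columns).astype(float)
    return frames


frames = read_chunk_frames(io.StringIO(SAMPLE))
print(list(frames))
print(frames[1000].columns.tolist())
```

Because the column names are taken from the header rather than hard-coded, the same sketch would pick up extra or different columns from another `ave/chunk` command, provided the file follows the assumed layout.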

## Utility functions

The `ChunkAvg` class has a set of built-in utility functions that we can use to perform some secondary analysis/processing. They include:

  * `bin3d_chunk_pressures(cid)` - estimates the pressure of each 3d-binned chunk, given that the stress tensor values are included in the chunk averaging with compute id `cid`, and appends it as a new `pressure` column in the DataFrames.
  * `bin3d_chunk_volumes(x_delta, y_delta, z_delta)` - computes the volume of each 3d-binned chunk using the given chunk element side lengths (`x_delta`, etc.) and appends it as a new `volume` column in the DataFrames.
  * `block_average(block_size, start=0, end=-1)` - computes a time-block average with the given block size over the specified frames.
  * `radial_chunk_densities()` - computes the number density of radial chunks for each frame and appends it as a new `density` column in the DataFrames.
  * `radial_chunk_volumes()` - same as the `bin3d_chunk_volumes` function but for radial chunks.
  * `remove_empty_chunks(ncount_threshold=0)` - removes empty chunks from the set of chunks. A chunk is removed if it is empty at any frame. This can also be used to filter out chunks with fewer atoms than a threshold value specified by the optional `ncount_threshold` argument.
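
As a rough illustration of the "removed if empty at any frame" rule described for `remove_empty_chunks`, the sketch below filters a list of plain DataFrames. It is not the `lmpoa` code: the `drop_sparse_chunks` helper is hypothetical, and the choice to keep only chunks with `Ncount` strictly greater than the threshold is an assumption about how the threshold is applied.

```python
import pandas as pd


def drop_sparse_chunks(frames, ncount_threshold=0):
    """Hypothetical helper: keep chunks whose Ncount exceeds the threshold in every frame."""
    keep = None
    for df in frames:
        # Chunk ids that are sufficiently populated in this frame.
        populated = set(df.loc[df["Ncount"] > ncount_threshold, "Chunk"])
        # A chunk survives only if it is populated in all frames seen so far.
        keep = populated if keep is None else keep & populated
    return [
        df[df["Chunk"].isin(list(keep))].reset_index(drop=True) for df in frames
    ]


# Two made-up frames: chunk 2 is empty in the first, chunk 3 in the second.
frame_a = pd.DataFrame({"Chunk": [1, 2, 3], "Ncount": [1.0, 0.0, 10.0]})
frame_b = pd.DataFrame({"Chunk": [1, 2, 3], "Ncount": [2.0, 1.0, 0.0]})

filtered = drop_sparse_chunks([frame_a, frame_b])
print(filtered[0]["Chunk"].tolist())  # [1]
```

Filtering on the intersection across frames keeps every DataFrame the same shape, which matters if the frames are later averaged together.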
  

As an example we will compute the density of radial chunks (since our chunk-averaged data is for radial chunks):

In [14]:
chunk_avg.radial_chunk_densities()

In [17]:
# We now have a new density column.
chunk_avg.frames[0]().head()

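|   | Chunk | Coord1 | Ncount | v_temp | density |
|---|-------|--------|--------|---------|----------|
| 0 | 1 | 0.5 | 1.0 | 355.238 | 0.238732 |
| 1 | 2 | 1.5 | 0.0 | 0.0 | 0.0 |
| 2 | 3 | 2.5 | 10.0 | 311.14 | 0.125649 |
| 3 | 4 | 3.5 | 3.0 | 142.148 | 0.019357 |
| 4 | 5 | 4.5 | 12.0 | 370.698 | 0.046964 |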
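
The density values above are consistent with dividing `Ncount` by the volume of a spherical shell centered on `Coord1`. The sketch below checks this by hand for the five rows shown. It assumes a shell width of 1.0 (inferred from the spacing of the `Coord1` values) and is not necessarily how `radial_chunk_densities` is implemented internally. The `density_check` column name is made up for this illustration.

```python
import math

import pandas as pd

# The first five rows shown above (Coord1 is the shell midpoint).
df = pd.DataFrame(
    {
        "Coord1": [0.5, 1.5, 2.5, 3.5, 4.5],
        "Ncount": [1.0, 0.0, 10.0, 3.0, 12.0],
    }
)

width = 1.0  # assumed radial bin width, inferred from the Coord1 spacing
r_in = df["Coord1"] - width / 2
r_out = df["Coord1"] + width / 2

# Volume of a spherical shell: 4/3 * pi * (r_out^3 - r_in^3)
shell_volume = 4.0 / 3.0 * math.pi * (r_out**3 - r_in**3)

# Number density = atoms in the shell / shell volume
df["density_check"] = df["Ncount"] / shell_volume
print(df["density_check"].round(6).tolist())
# [0.238732, 0.0, 0.125649, 0.019357, 0.046964]
```

Note that the shell volume grows with radius, so inner shells with only a few atoms can still show a comparatively high number density.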
