# Finite volume methods as self-interacting graphs
Timothy Tyree<br>
April 25, 2020<br>


- Finite volume methods can be recast as a strictly more general class of synchronous (or asynchronous) graph problems. The definition follows naturally from the analogy volume $\rightarrow$ node, boundary $\rightarrow$ edge.

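- The Pregel paradigm is used by Spark's GraphX library to implement scalable user-defined functions on graphs over a distributed cluster, such as a cluster of AWS EC2 instances. Note that GraphX itself only has a Scala/Java API (there is no `pyspark.graphx` module); from pyspark, the separate GraphFrames package offers comparable graph functionality.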

- As I understand it, there are methods (see below) for running Spark graph computations that make efficient use of the GPU (via CUDA cores, with numba or other JIT backends). These GPU methods can apparently also be used on distributed clusters, perhaps both at once.
    - If so, we can develop, test, and run finite volume simulations on the GPU of a local server, and then run the same simulation code at scale on distributed high-throughput resources.
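
The volume $\rightarrow$ node, boundary $\rightarrow$ edge analogy can be made concrete without Spark. The sketch below is a plain-Python illustration, not the GraphX or GraphFrames API: it assumes a periodic 1D ring of finite volumes and a simple diffusive flux, and the names `build_ring`, `superstep`, and `coupling` are hypothetical. Each step is synchronous in the Pregel sense: every edge computes its message from the old state, and only then do the nodes update.

```python
from collections import defaultdict

def build_ring(n):
    """Return (state, edges) for n finite volumes arranged on a periodic 1D ring."""
    state = {i: 0.0 for i in range(n)}            # one node per finite volume
    edges = [(i, (i + 1) % n) for i in range(n)]  # one edge per shared face
    return state, edges

def superstep(state, edges, coupling):
    """One synchronous step: edges send fluxes, nodes sum their inbox and update."""
    inbox = defaultdict(float)
    for src, dst in edges:
        # Assumed diffusive flux through the face, directed src -> dst
        flux = coupling * (state[src] - state[dst])
        inbox[src] -= flux   # what leaves src ...
        inbox[dst] += flux   # ... arrives at dst, so the total is conserved
    # All messages were computed from the old state before any node changes
    return {node: value + inbox[node] for node, value in state.items()}

state, edges = build_ring(8)
state[0] = 1.0                       # put all of the "stuff" in one volume
for _ in range(200):
    state = superstep(state, edges, coupling=0.25)
print(sum(state.values()))           # total amount of "stuff" after 200 steps
```

Because every flux is subtracted from one node and added to another, the total is conserved up to rounding, which is the property a finite volume scheme is built around. An asynchronous variant would let nodes update as soon as their own messages arrive.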

## Separate the problem into two parts

| Part 1 | Part 2 |
| --- | --- |
| make the timestep | use the timestep |

__Part 1__
* TODO: write down the equations of motion as a system of PDEs
* TODO: measure/integrate the net "stuff" located in an abstract finite volume, $\Delta\subset\mathcal M$, at a given time, for an arbitrary smooth orientable manifold $\mathcal M$
* TODO: apply Stokes' theorem to $\Delta$
* TODO: partition the boundary of that small finite volume, $\partial\Delta$, into a sum over faces (boundaries shared with neighboring finite volumes)
    * $\Delta$ will be a single node in our graph, which will have
         * __local state properties__ (field values of equations of motion)
         * __addresses of neighbors__ (ID/location for accessing neighbor field values, one-to-one with faces on the boundary)
    * The faces between two finite volumes will be a single edge in our graph, which will have a
        * __src__ "source" node
        * __dst__ "destination" node
        * __method for time_step__: for each type of boundary, there will be a time_step method determined by the boundary type
        * __type__: the type of boundary
            * for electrophysiological modeling of cardiac arrhythmias, types could be an active-active boundary for excitable tissue and an active-passive or passive-passive boundary for anything else
            * for mechanical cardiac modeling, the same types apply, but instead of a static mesh we have a mesh discretizing a symplectic manifold
            * for neuronal networks, types could be excitatory synapses and/or inhibitory synapses
* TODO: test that these methods do what they're supposed to do
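
The first four TODOs of Part 1 can be sketched for a generic case. Since the equations of motion are not written down yet, the block below assumes they can be put in conservation form for a single field $u$ with flux $\mathbf F$ and source $s$, and that each volume is fixed in time. The index $i$, the faces $\Gamma_{ij}$, the neighbor set $N(i)$, and the face fluxes $\Phi_{ij}$ are notation introduced here for illustration.

$$
\begin{aligned}
\partial_t u + \nabla\cdot\mathbf{F} &= s \\
\frac{d}{dt}\int_{\Delta_i} u\,dV &= -\oint_{\partial\Delta_i}\mathbf{F}\cdot\hat{\mathbf n}\,dA + \int_{\Delta_i} s\,dV \\
|\Delta_i|\,\frac{d\bar u_i}{dt} &= -\sum_{j\in N(i)}\Phi_{ij} + |\Delta_i|\,\bar s_i,
\qquad \Phi_{ij} = \int_{\Gamma_{ij}}\mathbf{F}\cdot\hat{\mathbf n}_{ij}\,dA
\end{aligned}
$$

The second line integrates over $\Delta_i$ and applies Stokes' theorem (here in the form of the divergence theorem). The third line uses the partition $\partial\Delta_i=\bigcup_{j\in N(i)}\Gamma_{ij}$ and the cell averages $\bar u_i$ and $\bar s_i$. Since $\hat{\mathbf n}_{ij}=-\hat{\mathbf n}_{ji}$, the face fluxes satisfy $\Phi_{ij}=-\Phi_{ji}$: each edge of the graph carries one flux that leaves its source node and enters its destination node.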

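The node and edge description in Part 1 might look like the following in plain Python. This is a minimal sketch under stated assumptions: the state is a single scalar per node, and the two flux rules (`diffusive_flux`, `no_flux`) are placeholders chosen only to show dispatch on the boundary type, not physiological models. All class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    """One finite volume: local state plus the addresses of its neighbors."""
    node_id: int
    state: float                                   # local field value (one scalar here)
    neighbors: List[int] = field(default_factory=list)

@dataclass
class Edge:
    """One face shared by two finite volumes."""
    src: int
    dst: int
    kind: str                                      # boundary type, e.g. "active-active"

def diffusive_flux(u_src, u_dst, dt):
    """Placeholder rule: linear diffusive coupling across the face."""
    return dt * (u_src - u_dst)

def no_flux(u_src, u_dst, dt):
    """Placeholder rule: nothing crosses the face."""
    return 0.0

# Assumed mapping from boundary type to its time_step rule
TIME_STEP_BY_KIND = {
    "active-active": diffusive_flux,
    "active-passive": no_flux,
    "passive-passive": no_flux,
}

def time_step(nodes, edges, dt):
    """Advance every node by dt, using the rule selected by each edge's type."""
    change = {node_id: 0.0 for node_id in nodes}
    for e in edges:
        flux = TIME_STEP_BY_KIND[e.kind](nodes[e.src].state, nodes[e.dst].state, dt)
        change[e.src] -= flux
        change[e.dst] += flux
    for node_id, delta in change.items():
        nodes[node_id].state += delta

# Three volumes in a row: 0 and 1 share an active face, 1 and 2 a passive one
nodes = {0: Node(0, 1.0, [1]), 1: Node(1, 0.0, [0, 2]), 2: Node(2, 5.0, [1])}
edges = [Edge(0, 1, "active-active"), Edge(1, 2, "active-passive")]
total_before = sum(n.state for n in nodes.values())
time_step(nodes, edges, dt=0.1)
total_after = sum(n.state for n in nodes.values())
```

Comparing `total_before` with `total_after` is one cheap way to start on the last TODO of Part 1: whatever rule an edge uses, the amount it removes from `src` must equal the amount it adds to `dst`.
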
__Part 2__
 * TODO: put an example mesh of an atrium configuration into python as a graph
 * TODO: put that graph into Spark (GraphX, or GraphFrames when working from pyspark)
 * TODO: run a trivial time_step on a CUDA GPU and test that it does what it's supposed to do
 * TODO: run the time_step from Part 1
 * TODO: scale the local Spark context over AWS or over the Open Science Grid.
     * Pegasus can use Spark: https://github.com/pegasus-isi/spark-workflow
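
The first TODO of Part 2 can be sketched as follows, assuming the atrium mesh is a triangulated surface given as a list of vertex-index triples (the actual mesh format is not fixed in these notes). Each triangle becomes a node and each pair of triangles sharing a mesh edge becomes a graph edge. The function name `mesh_to_graph` is hypothetical; the column names `id`, `src`, and `dst` are chosen because they are the ones GraphFrames expects in its vertex and edge DataFrames.

```python
from collections import defaultdict

def mesh_to_graph(triangles):
    """Turn a triangle mesh into vertex rows and edge rows of its dual graph.

    Each triangle (finite volume) becomes a node; two triangles that share
    a mesh edge (a face of the finite volume) are joined by a graph edge.
    """
    owners = defaultdict(list)                    # mesh edge -> triangles touching it
    for tri_id, (a, b, c) in enumerate(triangles):
        for p, q in ((a, b), (b, c), (c, a)):
            owners[tuple(sorted((p, q)))].append(tri_id)
    vertices = [{"id": tri_id} for tri_id in range(len(triangles))]
    # Mesh edges with a single owner lie on the outer boundary and are skipped
    edges = [{"src": t[0], "dst": t[1]} for t in owners.values() if len(t) == 2]
    return vertices, edges

# Toy mesh: a square with corners 0..3, split into two triangles along 0-2
triangles = [(0, 1, 2), (0, 2, 3)]
vertices, edges = mesh_to_graph(triangles)
print(vertices)
print(edges)
```

The two lists of rows can then be converted to Spark DataFrames for the next TODO. Outer-boundary faces are dropped here for brevity; a real simulation would probably keep them as edges of their own boundary type.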


## TODO(later): learn to use Arrow by playing with user-defined functions (UDFs) in pyspark.sql
(snippet below)

In [1]:
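from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf, PandasUDFType

# A small example DataFrame with one double column `v`,
# added here so that the cell is self-contained
spark = SparkSession.builder.appName("pandas_udf_demo").getOrCreate()
df = spark.createDataFrame([(1.0,), (2.0,), (3.0,)], ['v'])

# Use pandas_udf to define a scalar Pandas UDF (requires PyArrow).
# Input and output are both a pandas.Series of doubles.
# PandasUDFType is the Spark 2.x spelling; Spark 3 prefers Python type hints.
@pandas_udf('double', PandasUDFType.SCALAR)
def pandas_plus_one(v):
    return v + 1

# Add a column v2 = v + 1 and display the result
df.withColumn('v2', pandas_plus_one(df.v)).show()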

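ImportError: PyArrow >= 0.8.0 must be installed; however, it was not found.

This error means PyArrow is missing from the Python environment used by Spark. Installing it (for example with `pip install pyarrow`) should let the cell run.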

## TODO(better): learn to use GraphX on a GPU.

### do anything with pyspark and findspark

In [3]:
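import random

import findspark
# Locate the Spark installation. If SPARK_HOME is not set, pass the path
# explicitly instead, e.g. findspark.init('/path/to/spark').
findspark.init()

import pyspark  # imported after findspark.init(), which adds pyspark to sys.path

sc = pyspark.SparkContext(appName="Pi")
num_samples = 100000000

def inside(_):
    """Draw a random point in the unit square; True if it lies inside the quarter circle."""
    x, y = random.random(), random.random()
    return x*x + y*y < 1

try:
    # Monte Carlo estimate: the fraction of points inside approaches pi/4
    count = sc.parallelize(range(num_samples)).filter(inside).count()
    pi = 4 * count / num_samples
    print(pi)
finally:
    sc.stop()  # release the Spark context even if the job fails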

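ValueError: Couldn't find Spark, make sure SPARK_HOME env is set or Spark is in an expected location (e.g. from homebrew installation).

This error means findspark could not locate a Spark installation. Set the `SPARK_HOME` environment variable to the Spark directory, or pass that path to `findspark.init()`, and re-run the cell.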
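
Both failed cells above point to missing prerequisites rather than to problems in the code itself. A small pre-flight check like the one below, run before anything else, would surface that earlier. It is a sketch using only the standard library; the function name `check_environment` is hypothetical, and it only tests whether each package can be found, not whether its version is recent enough (for example PyArrow >= 0.8.0).

```python
import importlib.util
import os

def check_environment(packages=("findspark", "pyspark", "pyarrow")):
    """Report which prerequisites of the notebook cells are available."""
    # find_spec returns None when a top-level package cannot be located
    report = {name: importlib.util.find_spec(name) is not None for name in packages}
    # findspark looks at SPARK_HOME first when searching for Spark
    report["SPARK_HOME"] = bool(os.environ.get("SPARK_HOME"))
    return report

for name, ok in check_environment().items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```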