In [1]:
%run StdPackages.ipynb

Load wheels

In [2]:
os.chdir(d['py'])
from _Equations import *
from DBWheels_eq import gpy_eq
import FunctionsPd
os.chdir(d['curr'])

### 1: Create useful data

In [3]:
rng = np.random.default_rng(seed=43)

*Create some useful data:*
* Three sets (i,j,t),
* Subsets for each of them (added suffix '_ss'),
* Mappings combining subsets (name convention 'i2j').
* Mapping combining all three sets ('i2j2t').
* Variables defined over all sets; simple range values (named 'var_'+s).
* A scalar ('scalar').

In [4]:
db = Database.GpyDB(**{'name': 'test'})
sets = ['i','j','t']
subsets = [s+'_ss' for s in sets]
# sets
for s in sets:
    db[s] = pd.Index(range(200), name = s)
    # db[s] = pd.Index(range(11+rng.integers(0,100)), name = s)
db.update_alias(pd.MultiIndex.from_tuples([(k,2*k) for k in sets]))
# subsets
for ss in subsets:
    s = db.get(ss.split('_')[0])
    db[ss] = s[s>5]
# mappings
for i,j in [(i,j) for i in subsets for j in subsets if i!=j]:
    db[i.split('_')[0]+'2'+j.split('_')[0]] = pd.MultiIndex.from_product([db.get(i),db.get(j)])
db['2'.join(sets)] = pd.MultiIndex.from_product([db.get(s) for s in sets])
# Variables
for s in set(db.gettypes(('set','subset','mapping')).keys())-{'alias_set','alias_map2','alias_'}:
    db['var_'+s] = pd.Series(range(1,len(db.get(s))+1), index = db.get(s), name = 'var_'+s)
db['scalar'] = 2

### 2: Broadcasting methods

We use three main broadcasting methods, depending on the situation:
1. ```broadcast(vlist, d = None, c = None, db = None, index = None, bc = 'sparse', bc_scalar=False)```:
    * ```vlist``` = list of variables to broadcast,
    * ```d``` = list of domain names,
    * ```index``` = pandas index,
    * ```c``` = conditions to subset on,
    * ```bc``` $\in$ {'sparse','full'},
    * ```bc_scalar``` = if True, broadcast scalar variables as well; otherwise keep them as scalars.
2. ```broadcast_windex```: Version of ```broadcast``` that returns a tuple (vals, index).
3. ```broadcast_infdoms```: Broadcasts the variables in ```vlist``` to a common domain inferred from the domains of the variables in ```vlist```.
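
To make the difference between the 'sparse' and 'full' settings concrete, here is a minimal sketch in plain pandas. It is not the implementation of ```broadcast```: the helper ```broadcast_sketch``` and the small example data are assumptions made for illustration, and scalars, conditions and the database argument are ignored.

```python
import pandas as pd

def broadcast_sketch(variables, index, bc="sparse"):
    """Hypothetical stand-in for broadcast(): repeat each variable over a common index."""
    used = {n for v in variables for n in v.index.names}
    if bc == "sparse":
        # Keep only the target levels that at least one variable is defined over.
        extra = [n for n in index.names if n not in used]
        target = index.droplevel(extra).unique() if extra else index
    else:
        # 'full': every variable is repeated over the complete target index.
        target = index
    out = []
    for v in variables:
        # Look up each target row by the levels the variable is defined over.
        other = [n for n in target.names if n not in v.index.names]
        keys = target.droplevel(other) if other else target
        out.append(pd.Series(v.reindex(keys).to_numpy(), index=target))
    return out

# Small example: equation index over (i, j, t), variables over i and j only.
idx = pd.MultiIndex.from_product([range(3), range(2), range(2)], names=["i", "j", "t"])
var_i = pd.Series([1, 2, 3], index=pd.Index(range(3), name="i"))
var_j = pd.Series([1, 2], index=pd.Index(range(2), name="j"))

sparse = broadcast_sketch([var_i, var_j], idx, bc="sparse")  # index over (i, j)
full = broadcast_sketch([var_i, var_j], idx, bc="full")      # index over (i, j, t)
```

With the 'sparse' setting the result has one row per (i, j) pair, whereas 'full' repeats the same values over every t as well, which is why the full version is much larger.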

In [5]:
v = [db['var_'+s] for s in sets if s!='t']+[db['scalar']]
eq = gpy_eq(name='test', db = db, domains = ['i','j','t'])

*1. Broadcast sparse: Consider an equation defined over [i,j,t], but with variables defined over only some of these sets (i,j). With the 'sparse' setting, the broadcast maps all variables to [i,j]:*

In [6]:
v1 = broadcast(v, index = eq.index, bctype = 'sparse',db=db)
v[0].vals,v1[0] # variable 0 before and after broadcast

(i
 0        1
 1        2
 2        3
 3        4
 4        5
       ... 
 195    196
 196    197
 197    198
 198    199
 199    200
 Name: var_i, Length: 200, dtype: int64,
 i    j  
 0    0        1
      1        1
      2        1
      3        1
      4        1
            ... 
 199  195    200
      196    200
      197    200
      198    200
      199    200
 Length: 40000, dtype: int64)

*If we add ```bc_scalar = True```, the method also broadcasts the scalar variable to this domain:*

In [7]:
v2 = broadcast(v, index = eq.index, bctype = 'sparse', db = db, bc_scalar=True)
v[2].vals, v2[2]

(2,
 i    j  
 0    0      2
      1      2
      2      2
      3      2
      4      2
            ..
 199  195    2
      196    2
      197    2
      198    2
      199    2
 Length: 40000, dtype: int64)
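
In plain pandas, the effect of broadcasting a scalar can be reproduced by constructing a constant series on the target index. This is only a sketch of the resulting shape on a small example index, not the code path that ```broadcast``` takes.

```python
import pandas as pd

# Small (i, j) index standing in for the sparse domain.
idx = pd.MultiIndex.from_product([range(3), range(2)], names=["i", "j"])
scalar = 2

# A scalar passed as data is repeated for every row of the index.
scalar_bc = pd.Series(scalar, index=idx)
```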

*2. Broadcast non-sparsely, i.e. fit all variables to a common index:*

In [8]:
v3 = broadcast(v, index = eq.index, bctype = 'full', db = db)
v[0].vals,v3[0] # variable 0 before and after broadcast

(i
 0        1
 1        2
 2        3
 3        4
 4        5
       ... 
 195    196
 196    197
 197    198
 198    199
 199    200
 Name: var_i, Length: 200, dtype: int64,
 i    j    t  
 0    0    0        1
           1        1
           2        1
           3        1
           4        1
                 ... 
 199  199  195    200
           196    200
           197    200
           198    200
           199    200
 Length: 8000000, dtype: int64)

*3. Return the broadcast values together with the index:*

In [9]:
v4,v4doms = broadcast_windex(v, index = eq.index, bc = 'sparse', db = db)
v4[0]

i    j  
0    0        1
     1        1
     2        1
     3        1
     4        1
           ... 
199  195    200
     196    200
     197    200
     198    200
     199    200
Length: 40000, dtype: int64

*4. Broadcast a list of variables to a common domain inferred from the variables passed:*

In [10]:
v5 = broadcast(v, index = eq.index, db = db)
v5[0]

i    j  
0    0        1
     1        1
     2        1
     3        1
     4        1
           ... 
199  195    200
     196    200
     197    200
     198    200
     199    200
Length: 40000, dtype: int64

### 3: The main method

When applying functions, we generally do the following:
1. Infer domains from the variables that are used within the function.
2. Evaluate the function and return an output.
3. Broadcast the output to a specific domain. 
3. Broadcast the output to a specific domain. 
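
The three steps can be sketched in plain pandas as follows. The helpers ```infer_domains```, ```align``` and ```apply_sketch``` are hypothetical names introduced here for illustration; they are not part of ```FunctionsPd```, and the sketch assumes at least one variable is passed and that every variable is defined over levels of the target index.

```python
import pandas as pd

def infer_domains(variables, index):
    # Step 1: levels of the target index that the variables are defined over.
    used = {n for v in variables for n in v.index.names}
    return [n for n in index.names if n in used]

def align(v, target):
    # Repeat v over the levels of target that it is not defined over.
    other = [n for n in target.names if n not in v.index.names]
    keys = target.droplevel(other) if other else target
    return pd.Series(v.reindex(keys).to_numpy(), index=target)

def apply_sketch(func, variables, index):
    doms = infer_domains(variables, index)
    extra = [n for n in index.names if n not in doms]
    small = index.droplevel(extra).unique() if extra else index
    # Step 2: evaluate the function on the inferred (smaller) domain.
    result = func(*[align(v, small) for v in variables])
    # Step 3: broadcast the output to the full target domain.
    return align(result, index)

idx = pd.MultiIndex.from_product([range(3), range(2), range(2)], names=["i", "j", "t"])
var_i = pd.Series([1, 2, 3], index=pd.Index(range(3), name="i"))
var_j = pd.Series([10, 20], index=pd.Index(range(2), name="j"))

out = apply_sketch(lambda a, b: a + b, [var_i, var_j], idx)
```

Evaluating the function on the inferred domain first means the arithmetic runs on one row per (i, j) pair rather than on every (i, j, t) combination; only the final result is expanded.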

#### 3.1: The Sum method

As an example, consider the function ```fSum(args,**kwargs)```, which sums over ```args```. Consider again the case where the sum defines an equation over [i,j,t]. If ```args``` is empty, the function simply returns zeros on the right domain. Finding the right domain is done using the broadcast methods above:

In [11]:
fSum_noargs = FunctionsPd.fSum([], **{'index':eq.index})
fSum_noargs

i    j    t  
0    0    0      0
          1      0
          2      0
          3      0
          4      0
                ..
199  199  195    0
          196    0
          197    0
          198    0
          199    0
Length: 8000000, dtype: int64

If we pass a single element, the ```fSum``` method similarly uses the broadcast function to return that argument on the target domain:

In [15]:
fSum_1arg = FunctionsPd.fSum([v[0]], **{'index': eq.index})
fSum_1arg

i    j    t  
0    0    0        1
          1        1
          2        1
          3        1
          4        1
                ... 
199  199  195    200
          196    200
          197    200
          198    200
          199    200
Length: 8000000, dtype: int64

*NB: If the variable passed to fSum is defined over indices that are not part of the target index, this returns a domain error.*
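
The behaviour described in this section (zeros for an empty list, a broadcast copy for a single argument, and an error for a mismatched domain) can be mimicked with a small stand-alone function. ```f_sum_sketch``` is a hypothetical name, and raising ```ValueError``` is an assumption; the error type and message used by ```fSum``` itself may differ.

```python
import pandas as pd

def f_sum_sketch(args, index):
    """Hypothetical stand-in for fSum: sum the variables in args over index."""
    # An empty sum is a series of zeros on the target domain.
    total = pd.Series(0, index=index)
    for v in args:
        # Domain check: every level of the variable must exist in the target index.
        unknown = [n for n in v.index.names if n not in index.names]
        if unknown:
            raise ValueError(f"variable is defined over {unknown}, not in {list(index.names)}")
        # Repeat the variable over the remaining levels and add it to the total.
        other = [n for n in index.names if n not in v.index.names]
        keys = index.droplevel(other) if other else index
        total = total + v.reindex(keys).to_numpy()
    return total

idx = pd.MultiIndex.from_product([range(3), range(2), range(2)], names=["i", "j", "t"])
var_i = pd.Series([1, 2, 3], index=pd.Index(range(3), name="i"))
var_k = pd.Series([1, 2], index=pd.Index(range(2), name="k"))  # 'k' is not a level of idx

zeros = f_sum_sketch([], idx)
one = f_sum_sketch([var_i], idx)
```

Calling ```f_sum_sketch([var_k], idx)``` raises the error, since 'k' is not among the levels of the target index.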

In [13]:
eq.index.names == ['i','j','t']

True

In [14]:
broadcast([], index = eq.index, bctype='full',db = db)

i    j    t  
0    0    0      0
          1      0
          2      0
          3      0
          4      0
                ..
199  199  195    0
          196    0
          197    0
          198    0
          199    0
Length: 8000000, dtype: int64