In [1]:
import numpy as np, scipy, pandas as pd
from pyDbs.__init__ import *

# Documentation for ```pyDbs.Gpy```

Symbol class. Includes three subclasses: ```GpyVariable```, ```GpySet```, ```GpyScalar```. Symbols can be initialized using the convenience methods ```Gpy.c``` or ```Gpy.create```. The symbol classes enforce a common structure, e.g. in how symbols are named and in how a symbol's name, index, domains, and values are returned. They also have built-in methods for merging.
* ```GpyVariable``` values are defined as ```pd.Series``` defined over ```pd.Index``` or ```pd.MultiIndex```.
* ```GpySet``` values are defined as ```pd.Index``` or ```pd.MultiIndex``` instances.
* ```GpyScalar``` values are defined as scalars *not* defined over indices.

## 1. ```GpySet```

### A. Attributes

* ```v```: The index symbol (pandas object).
* ```name```.
* ```type == 'set'```.

*Note*: The ```self.name``` attribute is not necessarily the same as ```self.v.name```, where ```self.v``` is a ```pd.Index```. If no name is provided at initialization, however, ```self.name``` defaults to ```self.v.name```.

### B. Properties/methods

* ```self.index```: Returns ```self.v```.
* ```self.__len__```: Returns the length of ```self.v```.
* ```self.domains```: Returns ```self.index.names```.
* ```self.merge(symbol, priority = 'second', union = True, **kwargs)```:
    * Merges the index ```self.v``` with the pandas index in ```symbol```. The index domains have to be consistent.
    * If ```priority == 'replace'```: Use index from symbol (drop ```self.v```). Else:
    * If ```union == True```: Combine ```self.v``` with index ```symbol``` by ```pd.Index.union```,
    * else: Combine ```self.v``` with index ```symbol``` by ```pd.Index.intersection```.
* ```self.mergeGpy(symbol, priority = 'second', union = True, **kwargs)```: Equivalent to ```merge``` except that ```symbol``` has to be a ```GpySet``` instance.
* ```self.adj(rc, **kwargs)```: Updates the pandas index in ```self.v``` after adjustments. The arguments ```rc, **kwargs``` are passed to the adjustment method ```pyDbs.adj.rc_pd``` (see the documentation in notebook ```docs_adj.ipynb``` for more on this).
* ```self.array(**kwargs)```: Returns values as a numpy array (in this case ```self.v.values```). The ```**kwargs``` argument is only included here for symmetry with the other Gpy classes (see below).
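The two index-combination rules that ```merge``` relies on can be tried out in plain pandas. The snippet below is only a sketch of ```pd.Index.union``` and ```pd.Index.intersection``` on two small, made-up indices; it does not call ```pyDbs``` and does not reproduce the handling of ```priority```.

```python
import pandas as pd

# Two hypothetical indices defined over the same domain ('testName')
current = pd.Index([0, 1, 2, 3], name="testName")
other = pd.Index([2, 3, 4, 5], name="testName")

# union = True: keep every element found in either index
merged_union = current.union(other)

# union = False: keep only the elements found in both indices
merged_intersection = current.intersection(other)

print(list(merged_union))         # [0, 1, 2, 3, 4, 5]
print(list(merged_intersection))  # [2, 3]
```

Because both indices share the name ```testName```, the combined index keeps that name.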

### C. Examples

Initialize from a ```pd.Index``` (or ```pd.MultiIndex```):

In [2]:
idx = pd.Index(range(10), name = 'testName')
GpySetInst = GpySet(idx) # use GpySet class directly
GpySetInst = Gpy.c(idx) # Convenience class accepts an index --> identifies as a set. 

Adjust the name of the symbol (can be different than the index name):

In [3]:
GpySetInst = Gpy.c(idx, name = 'rook')

We can initialize from an instance as well; this simply creates a copy:

In [4]:
GpySetCopy = Gpy.c(GpySetInst)

Merge with another symbol (domains have to match for this to work, unless we use ```priority = 'replace'```):

In [5]:
GpySetInst.merge(pd.Index(['a','b', 0], name = 'testName'), union = True) # this adds new elements 'a','b' - 0 is already in the original index
GpySetInst.v

Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'a', 'b'], dtype='object', name='testName')

Adjust the ```self.v``` index using a condition consistent with ```pyDbs.adj.rc_pd``` methods:

In [6]:
GpySetInst.adj(pd.Index(range(5,15), name = 'testName')) # rc_pd keeps the elements that overlap with the condition index on the shared domain
GpySetInst.v

Index([5, 6, 7, 8, 9], dtype='object', name='testName')

The adjustment methods also take other arguments (see ```pyDbs.adj.rc_pd``` method for more on the use):

In [7]:
GpySetInst.adj(('not', pd.Index([8,9], name = 'testName'))) # a tuple with 'not' in the first entry negates the condition in the second entry
GpySetInst.v

Index([5, 6, 7], dtype='object', name='testName')

... or add multiple conditions:

In [8]:
GpySetInst.adj(('and', [pd.Index(range(10), name = 'testName'), pd.Index([5], name = 'testName')])) # a tuple with 'and' in the first entry means that all conditions in the second entry (a list) have to hold
GpySetInst.v

Index([5], dtype='object', name='testName')
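For a single-level index, the three adjustments above select the same elements as the boolean masks below. This is a plain-pandas sketch for intuition only: it starts from a fresh integer index rather than the merged one, and it is not how ```pyDbs.adj.rc_pd``` is implemented.

```python
import pandas as pd

idx = pd.Index(range(10), name="testName")

# Plain condition: keep the elements that also appear in the condition index
overlap = idx[idx.isin(pd.Index(range(5, 15)))]

# 'not': drop the elements that appear in the condition index
negated = overlap[~overlap.isin([8, 9])]

# 'and': keep only the elements that appear in every condition index
conditions = [pd.Index(range(10)), pd.Index([5])]
both = negated
for cond in conditions:
    both = both[both.isin(cond)]

print(list(overlap))  # [5, 6, 7, 8, 9]
print(list(negated))  # [5, 6, 7]
print(list(both))     # [5]
```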

## 2. ```GpyVariable```

### A. Attributes

* ```v```: Pandas series.
* ```lo```: Defaults to None. Used to store lower bounds on variables, specified as pandas series if provided.
* ```up```: Defaults to None. Used to store upper bounds on variables, specified as pandas series if provided. 
* ```name```: Defaults to ```self.v.name``` if no name is provided.
* ```type == 'variable'```.

### B. Properties/methods

* ```index```: Returns ```self.v.index```.
* ```__len__```: length of ```self.v```.
* ```domains```: returns ```self.index.names```.
* ```merge(symbol, lo = None, up = None, priority = 'second', **kwargs)```:
    * Merges the pandas series from ```symbol``` with ```self.v```. The same merge is applied to ```lo``` and ```up``` if these are provided.
    * Allows ```priority = 'second','first','replace'```. Assigns the priority of the new symbol when merging data, i.e. priority = 'second' means that the new values are only used where the symbol currently has no data.  
* ```mergeGpy(symbol, priority = 'second', **kwargs)```: Equivalent to ```merge``` except that ```symbol``` has to be a ```GpyVariable``` instance.
* ```self.array(attr = 'v', **kwargs)```: Returns a numpy array from the attribute ```attr```. If we call ```self.array(attr = 'lo')``` but the lower bound is not specified (i.e. ```self.lo``` is ```None```), this returns a numpy array of ```np.nan``` with the same length as ```self.v```. The same goes for the upper bound (```self.up```).
* ```self.adj(rc, **kwargs)```: Updates the pandas series in (```self.v```, ```self.lo```, ```self.up```) after adjustments. The arguments ```rc, **kwargs``` are passed to the adjustment method ```pyDbs.adj.rc_pd``` (see the documentation in notebook ```docs_adj.ipynb``` for more on this).
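The two directions a priority rule can take when series overlap, and the ```np.nan``` fallback of ```array```, can be sketched in plain pandas and numpy. The example below is an illustration under assumptions: it uses ```pd.Series.combine_first``` as a stand-in for the merge, ```as_array``` is a hypothetical helper, and the mapping to the labels ```'first'``` and ```'second'``` is deliberately left out.

```python
import numpy as np
import pandas as pd

idx = pd.Index(range(4), name="testName")
existing = pd.Series(1.0, index=idx[0:3])  # defined on 0, 1, 2
new = pd.Series(5.0, index=idx[1:4])       # defined on 1, 2, 3

# New values win on the overlap; existing values fill the rest
new_wins = new.combine_first(existing)

# Existing values win on the overlap; new values are only used where data is missing
existing_wins = existing.combine_first(new)

def as_array(series, like):
    """Hypothetical helper: values of `series`, or a NaN vector as long as `like` if `series` is None."""
    return np.full(len(like), np.nan) if series is None else series.to_numpy()

lo_array = as_array(None, existing)  # no lower bound stored

print(new_wins.tolist())       # [1.0, 5.0, 5.0, 5.0]
print(existing_wins.tolist())  # [1.0, 1.0, 1.0, 5.0]
print(lo_array)                # [nan nan nan]
```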

### C. Examples

Initialize as ```pd.Series```:

In [9]:
s = pd.Series(1, index = idx, name = 'testVariable')
GpyVarInst = GpyVariable(s) # use GpyVariable class directly
GpyVarInst = Gpy.c(s) # Convenience class accepts a series --> identifies as a variable.

Initialize with ```pd.DataFrame```: Columns "v", "lo", "up" specify different attributes:

In [10]:
df = pd.DataFrame({'v': 1, 'lo': 0, 'up': 10}, index = idx)
GpyVarInst = GpyVariable(df)

We can initialize from an instance as well; this simply creates a copy:

In [11]:
GpyVarCopy = Gpy.c(GpyVarInst)

The ```merge``` method takes other series as inputs. The following adjusts the main attribute ```self.v```. The default priority is ```'first'```, such that overlapping domains use the new values:

In [12]:
GpyVarInst.merge(pd.Series(5, index = idx[0:2]))
GpyVarInst.v

testName
0    5
1    5
2    1
3    1
4    1
5    1
6    1
7    1
8    1
9    1
dtype: int64

We can adjust ```.lo, .up``` attributes in similar ways by calling ```self.mergeLo``` or ```self.mergeUp``` specifically:

In [13]:
GpyVarInst.mergeLo(pd.Series(-1, index = idx[3:5]))
GpyVarInst.lo

testName
0    0
1    0
2    0
3   -1
4   -1
5    0
6    0
7    0
8    0
9    0
dtype: int64

The ```merge``` method also allows for more than one attribute to be merged:

In [14]:
GpyVarInst.merge(pd.Series(np.e, index = idx), 
                 lo = pd.Series(0, index = idx), 
                 up = pd.Series(1e3, index = idx[0:5])) # this adjusts all three attributes v, lo, up

testName
0    2.718282
1    2.718282
2    2.718282
3    2.718282
4    2.718282
5    2.718282
6    2.718282
7    2.718282
8    2.718282
9    2.718282
dtype: float64

The ```self.array(attr = 'v')``` returns the numpy array version of the attribute:

In [15]:
GpyVarInst.array() # get 'v' - default 

array([2.71828183, 2.71828183, 2.71828183, 2.71828183, 2.71828183,
       2.71828183, 2.71828183, 2.71828183, 2.71828183, 2.71828183])

In [16]:
GpyVarInst.array(attr = 'up') # get np.array from 'up' attribute

array([1000., 1000., 1000., 1000., 1000.,   10.,   10.,   10.,   10.,
         10.])

Reset the 'lo' attribute to None; the ```self.array``` call then returns a vector of ```np.nan``` with suitable length:

In [17]:
GpyVarInst.lo = None
GpyVarInst.array(attr = 'lo')

array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])

The ```self.adj``` method works in a similar way as for ```GpySet``` instances, except that the adjustments are made to all three attributes (v, lo, up):

In [18]:
GpyVarInst.adj(idx[0:5]) # this makes the adjustments - only keep the first 5 entries in the index
pd.concat([getattr(GpyVarInst,attr).rename(attr) for attr in ('v','lo','up') if getattr(GpyVarInst, attr) is not None], axis = 1) # this collects adjusted values in a dataframe to show the changes

                 v      up
testName
0         2.718282  1000.0
1         2.718282  1000.0
2         2.718282  1000.0
3         2.718282  1000.0
4         2.718282  1000.0


## 3. ```GpyScalar```

Similar to ```GpyVariable```, except that it does not store multiple values and is thus *not* defined as a ```pd.Series```. Instead, it simply stores numbers.

* Main value stored at ```self.v```,
* optional values stored at ```self.lo``` and ```self.up```,
* name has to be specified,
* ```self.index``` defaults to None,
* ```self.domains``` defaults to an empty list.

In [19]:
GpyScalarVar = Gpy.c(1, lo = 0, up = 10, name = 'testScalrVar')
GpyScalarVar.__dict__

{'v': 1, 'lo': 0, 'up': 10, 'name': 'testScalrVar', 'type': 'scalar'}

```merge``` and ```mergeGpy``` are implemented as for variables. The ```array``` and ```adj``` methods are added for completeness; ```self.array(attr = 'v')``` simply returns the attribute and ```self.adj(rc, **kwargs)``` simply returns ```self.v```.