In [1]:
%run StdPackages.ipynb

Load wheels and test databases:

In [2]:
fs = [f"{d['data']}\\test_size1000.gdx", f"{d['data']}\\baselinerun.gdx"] # files
ws = gams.GamsWorkspace() 
g2np = gams2numpy.Gams2Numpy(ws.system_directory)
dbs = {'gms1': ws.add_database_from_gdx(fs[0]), 'gms2': ws.add_database_from_gdx(fs[1]),
       'gpy1': Database.GpyDB(db=fs[0],**{'name': 'testdb1'}), 'gpy2': Database.GpyDB(db=fs[1],**{'name': 'testdb1'})}
db = dbs['gpy2']

In [3]:
os.chdir(d['py'])
# import DBWheels_rc
from DBWheels_rc import rctree_pd, rc_AdjGpy, rc_AdjPd, rctree_admissable_types
os.chdir(d['curr'])

# DBWheels_rc.py

The wheels ```DBWheels_rc.py``` is used to subset ```gpy``` or pandas symbols (```pd.Index, pd.MultiIndex, pd.Series pd.DataFrame```) using nested condition trees. Four main methods: ```rctree_pd, rc_AdjGpy, rc_AdjPd```. 

### 1: ```rctree_pd(s=None, c = None, alias = {}, lag = {}, pm = False)``` 

Before going through the conditions in ```c``` this uses the ```rc_AdjPd``` to apply the 'alias' and 'lag' arguments (see section 2 for more). After this, we have the following:

```python
if isinstance(c, rctree_admissable_types):
``` 
    The method compares indices in 'c' and the symbol 's'. How this is done depends on the setting for pm = True or False.
```python
elif isinstance(c, tuple):
``` 
    The tuple has to include exactly two elements: The first element specifies how to handle the second element; c[0] belongs to ('and','or','not'). The second element can be a new condition or a list of those (unless the first argument is 'not'). 
    
```python
elif isinstance(c, dict):
``` 
    The dict indicates a symbol that belongs to the types rctree_admissable_types, but is adjusted in some way that relies on the rctree_pd method itself. Thus, the dict should be arranged with {'s': s, 'c': c, ...}.
```python
elif c is None:
```
    Returns the full symbol s without slicing.

#### 1.1. Single condition

Let's start with a simple condition. Both ```s``` and ```c``` can be of the following four types:

In [4]:
rctree_admissable_types

(_Database.gpy,
 pandas.core.indexes.base.Index,
 pandas.core.series.Series,
 pandas.core.frame.DataFrame)

The following yields the same outcome:

In [5]:
s = db['Eff']
cs = [db['ElCap'], db.get('ElCap'), db['ElCap'].index, db.get('ElCap').to_frame()]
c = cs[0]
rctree_pd(s=s, c=c)

aggid   t   
agg5    2018    0.165076
agg87   2018    0.148073
agg71   2018    0.429286
agg123  2018        0.31
agg23   2018    0.339071
                  ...   
agg217  2018    0.315455
agg221  2018      0.3883
agg222  2018      0.3883
agg223  2018    0.377681
agg224  2018    0.377681
Name: level, Length: 139, dtype: object

Before running the subsetting, we could have (1) altered the set names using the alias method or (2) lagged a set value (if if is a numerical index):

In [6]:
rctree_pd(s, alias = {'t':'tt'}, lag = {'t':-1}) # adjust name of index t to tt, and lag the set with -1

aggid   tt  
agg127  2017         0.0
agg80   2017    1.023287
agg5    2017    0.165076
agg108  2017    0.834509
agg87   2017    0.148073
                  ...   
agg224  2017    0.377681
agg225  2017         0.0
agg226  2017         0.0
agg227  2017         0.0
agg228  2017         0.0
Name: level, Length: 228, dtype: object

If we subset this new symbol using the condition from before we get an empty series: This is because (1) domains do not overlap (t in one, tt in the other), and (2) the years in one variable is 2017, and 2018 in the other. One of the conditions is enough to yield the empty result:

In [7]:
rctree_pd(s, c=c, alias = {'t':'tt'}), rctree_pd(s, c=c, lag = {'t':-1})

(Series([], Name: level, dtype: object),
 Series([], Name: level, dtype: object))

We can also pass the symbol in ```c``` with similar adjustments as carried out on ```s```: While this is a weird scenario, we can get at the same observations as in ```rctree_pd(s,c)``` (only lagged and renamed) if we alias and lag the two variables in the same way. We pass such adjustments to ```c``` using a dictionary input:

In [8]:
alias,lag = {'t': 'tt'}, {'t':-1}
c_dict = {'s': c, 'alias': alias, 'lag': lag}
rctree_pd(s, c = c_dict, alias = alias, lag = lag)

aggid   tt  
agg5    2017    0.165076
agg87   2017    0.148073
agg71   2017    0.429286
agg123  2017        0.31
agg23   2017    0.339071
                  ...   
agg217  2017    0.315455
agg221  2017      0.3883
agg222  2017      0.3883
agg223  2017    0.377681
agg224  2017    0.377681
Name: level, Length: 139, dtype: object

#### 1.2. Tuples and multiple conditions

If we want to add multiple conditions (and/or), we use a tuple with two elements: (keyword, list of conditions):

In [9]:
rctree_pd(s, c = ('and', [db['ElCap'], db.get('chp')])) # both elcap and chp

aggid   t   
agg5    2018    0.165076
agg87   2018    0.148073
agg71   2018    0.429286
agg23   2018    0.339071
agg89   2018     0.17309
agg15   2018    0.395134
agg62   2018    0.392741
agg111  2018        0.47
agg79   2018    0.227883
agg122  2018         0.5
agg121  2018    0.400667
agg19   2018    0.401083
agg61   2018    0.368741
agg124  2018    0.122748
agg109  2018    0.262387
agg21   2018       0.206
agg32   2018    0.402759
agg56   2018    0.376404
agg22   2018    0.339948
agg70   2018       0.382
agg114  2018       0.388
agg78   2018    0.386496
agg12   2018    0.130029
agg110  2018         0.3
agg85   2018    0.184003
agg90   2018    0.271083
agg77   2018    0.153599
agg91   2018    0.207216
agg103  2018        0.22
agg83   2018    0.242192
agg3    2018    0.206905
agg20   2018     0.08652
agg24   2018    0.401724
agg99   2018    0.162306
agg100  2018    0.162306
agg101  2018    0.162306
agg96   2018    0.161218
agg97   2018    0.121251
agg104  2018    0.140284
agg105  2018

In [10]:
rctree_pd(s, c = ('or', [db['ElCap'].index, db.get('chp')])) # either elcap or chp

aggid   t   
agg127  2018         0.0
agg5    2018    0.165076
agg87   2018    0.148073
agg71   2018    0.429286
agg123  2018        0.31
                  ...   
agg224  2018    0.377681
agg225  2018         0.0
agg226  2018         0.0
agg227  2018         0.0
agg228  2018         0.0
Name: level, Length: 154, dtype: object

#### 1.3. Tuples and negating conditions

We negating conditions, we use the tuple approach with the second argument being a single element (instead of e.g. a list of conditions as with and/or):

In [11]:
rctree_pd(s, c = ('not', db['ElCap']))

aggid   t   
agg127  2018         0.0
agg80   2018    1.023287
agg108  2018    0.834509
agg40   2018    0.852302
agg126  2018         0.0
                  ...   
agg220  2018         0.0
agg225  2018         0.0
agg226  2018         0.0
agg227  2018         0.0
agg228  2018         0.0
Name: level, Length: 89, dtype: object

#### 1.4. Nested conditions

We nest conditions using tuple inputs: we need to specify whether or not the various arguments need to passed as and/or/not conditions. The different types of conditions can be arbitrarily nested. Here are some examples:

*ElCap and chp, and not HeatCap lagged with -1:*

In [12]:
rctree_pd(s, c = ('and', [db['ElCap'], db['chp'], ('not', {'s':db['HeatCap'], 'lag':{'t':-1}}) ]   ))

aggid   t   
agg5    2018    0.165076
agg87   2018    0.148073
agg71   2018    0.429286
agg23   2018    0.339071
agg89   2018     0.17309
agg15   2018    0.395134
agg62   2018    0.392741
agg111  2018        0.47
agg79   2018    0.227883
agg122  2018         0.5
agg121  2018    0.400667
agg19   2018    0.401083
agg61   2018    0.368741
agg124  2018    0.122748
agg109  2018    0.262387
agg21   2018       0.206
agg32   2018    0.402759
agg56   2018    0.376404
agg22   2018    0.339948
agg70   2018       0.382
agg114  2018       0.388
agg78   2018    0.386496
agg12   2018    0.130029
agg110  2018         0.3
agg85   2018    0.184003
agg90   2018    0.271083
agg77   2018    0.153599
agg91   2018    0.207216
agg103  2018        0.22
agg83   2018    0.242192
agg3    2018    0.206905
agg20   2018     0.08652
agg24   2018    0.401724
agg99   2018    0.162306
agg100  2018    0.162306
agg101  2018    0.162306
agg96   2018    0.161218
agg97   2018    0.121251
agg104  2018    0.140284
agg105  2018

#### 1.5. Partial matching of domains/levels

The ```rctree_pd``` method searches for overlaps in relevant indices. Consider the simple case where ```s```, ```c``` are both pandas multiindex. The default behavior when comparing the two indices, is to evaluate to ```False``` if the index ```c``` has levels that are not in ```s```; not the other way around though. For example: ```Eff[aggid,t]``` can returns a non-empty index when comparing to ```chp[aggid]```, but the reverse is empty per construction:

In [13]:
s,c = db['Eff'],db['chp']
rctree_pd(s,c)

aggid   t   
agg127  2018         0.0
agg5    2018    0.165076
agg87   2018    0.148073
agg71   2018    0.429286
agg23   2018    0.339071
                  ...   
agg224  2018    0.377681
agg225  2018         0.0
agg226  2018         0.0
agg227  2018         0.0
agg228  2018         0.0
Name: level, Length: 71, dtype: object

In [14]:
rctree_pd(c,s)

Index([], dtype='object', name='aggid')

If we want this second method to return a non-empty solution, by only comparing domains that are in both symbols, we can add kwarg ```pm = True```:

In [15]:
rctree_pd(c,s,pm=True)

Index(['agg127', 'agg5', 'agg87', 'agg71', 'agg23', 'agg89', 'agg126', 'agg15',
       'agg62', 'agg131', 'agg111', 'agg79', 'agg122', 'agg121', 'agg19',
       'agg61', 'agg113', 'agg124', 'agg109', 'agg119', 'agg21', 'agg120',
       'agg32', 'agg56', 'agg22', 'agg70', 'agg114', 'agg78', 'agg12',
       'agg110', 'agg128', 'agg85', 'agg90', 'agg77', 'agg91', 'agg65',
       'agg103', 'agg83', 'agg1', 'agg3', 'agg20', 'agg24', 'agg99', 'agg100',
       'agg101', 'agg96', 'agg97', 'agg104', 'agg105', 'agg72', 'agg107',
       'agg81', 'agg106', 'agg115', 'agg94', 'agg2', 'agg112', 'agg92',
       'agg117', 'agg218', 'agg217', 'agg219', 'agg220', 'agg221', 'agg222',
       'agg223', 'agg224', 'agg225', 'agg226', 'agg227', 'agg228'],
      dtype='object', name='aggid')

### 2: ```rc_AdjPd(s, alias = {}, lag = {})``` 

The method receives a symbol ```s``` that belong  to the types ```DBWheels_rc.rctree_admissable_types```, and returns an aliased/lagged version of the symbol. Alias, lag should be specified as dictionaries: Key = level in index, level = new name (in alias) or adjustment to the index (in lag). A few examples illustrate this:

*The efficiency variable defeind over [aggid,t]:*

In [16]:
s.vals

aggid   t   
agg127  2018         0.0
agg80   2018    1.023287
agg5    2018    0.165076
agg108  2018    0.834509
agg87   2018    0.148073
                  ...   
agg224  2018    0.377681
agg225  2018         0.0
agg226  2018         0.0
agg227  2018         0.0
agg228  2018         0.0
Name: level, Length: 228, dtype: object

*Adjust the name 't':*

In [17]:
rc_AdjPd(s, alias = {'t': 'newname_t'})

aggid   newname_t
agg127  2018              0.0
agg80   2018         1.023287
agg5    2018         0.165076
agg108  2018         0.834509
agg87   2018         0.148073
                       ...   
agg224  2018         0.377681
agg225  2018              0.0
agg226  2018              0.0
agg227  2018              0.0
agg228  2018              0.0
Name: level, Length: 228, dtype: object

*Adjust the index level 't' with +10:*

In [18]:
rc_AdjPd(s, lag = {'t': 10})

aggid   t   
agg127  2028         0.0
agg80   2028    1.023287
agg5    2028    0.165076
agg108  2028    0.834509
agg87   2028    0.148073
                  ...   
agg224  2028    0.377681
agg225  2028         0.0
agg226  2028         0.0
agg227  2028         0.0
agg228  2028         0.0
Name: level, Length: 228, dtype: object

### 3: ```rc_AdjGpy(s, c= None, alias = {}, lag = {}, pm = False)``` 

The method is similar to ```rctree_pd```, but returns a ```gpy``` instance of the symbol instead of pandas-like object.

In [19]:
new_gpy = rc_AdjGpy(s, alias = {'t':'newname_t'}, lag = {'t': 10})
new_gpy.vals

aggid   newname_t
agg127  2028              0.0
agg80   2028         1.023287
agg5    2028         0.165076
agg108  2028         0.834509
agg87   2028         0.148073
                       ...   
agg224  2028         0.377681
agg225  2028              0.0
agg226  2028              0.0
agg227  2028              0.0
agg228  2028              0.0
Name: level, Length: 228, dtype: object

## 4. ```rctree_gms```

The ```rctree_gms``` method automates writing simple pieces of gams code. In particular, it allows us to use the same type of condition trees as ```rctree_pd``` uses to subset pandas objects, to write the corresponding gams code. However, ```rctree_gms``` only uses ```gpy``` symbols as inputs (not pandas objects), as the gams code depends on more information than what pandas objects provides.

Examples:

*A single element:*

In [20]:
c = db['chp']
DBWheels_rc.rctree_gms(c)

NameError: name 'DBWheels_rc' is not defined

*Write a symbol with this conditional:*

In [None]:
DBWheels_rc.write_symbol(symbol, c)

*Write the level of this symbol with conditionals*

In [None]:
DBWheels_rc.write_symbol(symbol, c, l = ".l")

*Multiple conditions:*

In [None]:
c = {'and': [db['chp'], db['DK']]}
DBWheels_rc.rctree_gms(c)

In [None]:
DBWheels_rc.write_symbol(symbol,c)

*Write aliased symbol:*

In [None]:
DBWheels_rc.write_symbol(symbol,alias={'aggid': 'p'},lag={'t': -1})

In [None]:
symbol.vals