In [1]:
clean_up = True
%run StdPackages.ipynb

# GR18: Data

The GreenReform model uses processed IO data from another project. So that work on the data project can continue without affecting the model, we choose here whether or not to copy in the latest update of the data:

In [2]:
import shutil
updateFromIoProject = False
if updateFromIoProject:
    shutil.copy(os.path.join(d['project'], 'IOdata', 'data', 'IO2018'), os.path.join(d['data'], 'IO2018'))

### 1. Load data

*Load full 2018 data and relevant mappings used to aggregate the model:*

In [3]:
name = 'GR18'
error = 1e-7 # tolerance when testing IO balance
db = GpyDB(pickle_path = os.path.join(d['data'], 'IO2018'))
db.name = f"IO_{name}"
file_mappings = os.path.join(d['data'], 'GR2018_mappings.xlsx')
glob = gmsPyGlobals.SmallOpen(kwargs_vals = {'t': range(2018,2051)}) # global settings used throughout; interest rates, long run growth rates, time index etc.

Total value:

### 2. Aggregation

In [5]:
wb_mappings = read.simpleLoad(file_mappings)
auxMaps = read.maps(wb_mappings['AuxMaps'])

#### 2.1. Aggregate sectors

Sectors are aggregated based on two mappings:
1. ```s146tosGR```: Identifies how 146 domestic sectors are aggregated to roughly 60.
2. ```inv7toinvGR```: Maps the 7 types of durables into two - building and machines.

In [6]:
m = auxMaps['s146tosGR'].vals
m = m.set_levels([level.astype(str) for level in m.levels]) # force to string format
mDur = auxMaps['inv7toinvGR'].vals
mDur = mDur.set_levels([level.astype(str) for level in mDur.levels])
m_s = m.union(pd.MultiIndex.from_frame(mDur.to_frame(index=False).assign(temp = lambda x: 'I_'+x['nn'])[['n','temp']]).rename(['s','ss']))

Sectors not included in these mappings are left unchanged, which is implemented by mapping them to themselves with (x,x) tuples. These include the aggregate sectors defined by us (foreign sector, government sector, households, inventory):

In [7]:
m_sector = m_s.union(pd.MultiIndex.from_arrays([adj.rc_pd(db.get('s'), ('not', m_s.levels[0])), adj.rc_pd(db.get('s'), ('not', m_s.levels[0])).rename('ss')]))
aggregateDB.aggDB(db, m_sector)

<pyDatabases.gpyDB.gpyDB.GpyDB at 0x22252615ca0>
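
To illustrate what such a mapping does, the following is a minimal sketch in plain pandas. The sector names, the values, and the helper `aggregate_level` are hypothetical; the sketch only mimics the idea and is not the implementation of `aggregateDB.aggDB`.

```python
import pandas as pd

# Hypothetical toy data: values indexed by (sector, good)
v = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [("a1", "x"), ("a2", "x"), ("b1", "x"), ("HH", "x")], names=["s", "n"]
    ),
)

# Mapping from detailed sectors (s) to aggregate sectors (ss)
m = pd.MultiIndex.from_tuples([("a1", "A"), ("a2", "A"), ("b1", "B")], names=["s", "ss"])

# Sectors not covered by the mapping are mapped to themselves: (x, x) tuples
rest = v.index.get_level_values("s").unique().difference(m.get_level_values("s"))
m_full = m.union(pd.MultiIndex.from_arrays([rest, rest.rename("ss")]))


def aggregate_level(series, mapping, level="s"):
    """Sum a series after replacing `level` by its aggregate from `mapping`."""
    lookup = dict(mapping.tolist())  # detailed name -> aggregate name
    frame = series.rename("value").reset_index()
    frame[level] = frame[level].map(lookup)
    return frame.groupby(list(series.index.names))["value"].sum()


v_agg = aggregate_level(v, m_full)
```

In this toy example the sectors `a1` and `a2` are summed into `A`, while `HH` passes through unchanged and the grand total is preserved.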

#### 2.2. Aggregate sector outputs 

Next, we apply the same type of mapping to aggregate goods types. At this stage, the goods index ```n``` includes:
* Domestically produced goods ```n_p```,
* foreign produced goods ```n_F``` (with syntax ```x_F``` where 'x' is an element from ```n_p```)
* investment sector goods (corresponding to ```s_i```)
* residual income (```resIncome```) and wages (```L```)

We map ```n_p, n_F``` types as well as durables/investments using the same approach as for sectors: 

In [8]:
m_goods = m.rename(['n','nn']).union(m.set_levels([level+'_F' for level in m.levels]).rename(['n','nn'])).union(mDur)
m_goods = m_goods.union(pd.MultiIndex.from_arrays([adj.rc_pd(db.get('n'), ('not', m_goods.levels[0])),
                                                   adj.rc_pd(db.get('n'), ('not', m_goods.levels[0])).rename('nn')]))
aggregateDB.aggDB(db, m_goods)

<pyDatabases.gpyDB.gpyDB.GpyDB at 0x22252615ca0>
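
The step that derives the foreign-goods mapping from the domestic one can be seen in isolation in the sketch below. The goods names are hypothetical; the only assumption carried over from the notebook is the `x_F` naming syntax.

```python
import pandas as pd

# Hypothetical domestic mapping: detailed good (n) -> aggregate good (nn)
m = pd.MultiIndex.from_tuples([("a1", "A"), ("a2", "A"), ("b1", "B")], names=["n", "nn"])

# Foreign goods follow the x_F syntax, so append the suffix to the labels of every level
m_foreign = m.set_levels([level + "_F" for level in m.levels])

# Combined mapping for domestic and foreign goods
m_goods = m.union(m_foreign)
```

Because `set_levels` only relabels the levels, each foreign pair `(x_F, y_F)` mirrors a domestic pair `(x, y)`.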

### 3. Clean up taxes, government consumption, etc.

A final bit of clean-up of the relevant data: we only use total government consumption, not the consumption split into the many types ```gc```. The total is already recorded in the ```vD``` variable, so we remove the more detailed accounts (```vC```, ```vC_tax```, ```gc```):

In [9]:
for k in ('gc','vC','vC_tax'):
    del(db.series[k])

Remove zeros:

In [10]:
for k in ('vD','vD_inv','vD_dur','vD_depr','vAssets','vTax'):
    db[k] = db.get(k)[db.get(k)!=0] # keep only non-zero entries

### 4. Process data on durables, investments, and depreciation rates

* Depreciation of durables is translated to rates. 
* Distinguish between investment goods and durables: Define investment goods with syntax ```I_x``` for durable x.
* Define the mapping dur2inv and relevant subsets (```dur_p``` and ```inv_p```).

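*NB: Only run this cell once. It relabels the index of ```vD_inv``` in place, so a second run would add the ```I_``` prefix again.*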

In [11]:
db['rDepr'] = db.get('vD_depr') / (db.get('vD_dur').replace(0,1))
db['dur2inv'] = pd.MultiIndex.from_frame(db.get('vD_dur').index.to_frame(index = False).assign(nn = lambda x: 'I_'+x['n'])).reorder_levels(['s','n','nn'])
db['dur_p'] = db.get('dur2inv').droplevel('nn').unique() # what variables are durables (K)
db['inv_p'] = db.get('dur2inv').droplevel('n').unique().rename({'nn':'n'}) # what variables are investment goods (I)
db.get('vD_inv').index = db.get('vD_inv').index.set_levels('I_'+db.get('vD_inv').index.levels[1], level=1)
db['vD'] = db.get('vD_inv').combine_first(db.get('vD')).combine_first(db.get('vD_dur'))
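
The first two steps can be sketched on toy data as follows. The sectors, durables, and values are hypothetical; the sketch only reproduces the rate calculation and the `I_x` naming in plain pandas.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("s1", "iB"), ("s1", "iM"), ("s2", "iB")], names=["s", "n"]
)
vD_dur = pd.Series([100.0, 50.0, 0.0], index=idx)  # hypothetical stock of durables
vD_depr = pd.Series([2.0, 5.0, 0.0], index=idx)    # hypothetical depreciation

# Replace zero stocks by 1 so the division is defined; the rate is then 0
rDepr = vD_depr / vD_dur.replace(0, 1)

# Mapping (s, n, nn) from durable n to its investment good I_n
dur2inv = pd.MultiIndex.from_frame(
    idx.to_frame(index=False).assign(nn=lambda x: "I_" + x["n"])
)
```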

*Clean up data:*

In [10]:
# for k in ('vD_inv','vD_dur','vD_depr'):
#     del(db.series[k])

### 5. Eliminate small and negative values

We create RAS-like adjustments *within* a number of blocks. We keep the sub-totals fixed in the following blocks:
* Block A and I: Input-output from/to domestic production sectors (```n_p,s_p```) and the domestic investment sectors.
* Block B and J: Domestic production and investment sectors' demand for imported goods (```n_F, s_p, s_i```). For this block, we do not require row-sums to be the same before and after. The implication is that imports of a specific type $n^F_i$ may not be the same after the adjustment.

We do not make any adjustments to consumption components (in particular because there are not enough consumption categories to balance the blocks). This approach ensures that most totals, e.g. total imports per sector, are the same before and after the adjustment.

In [12]:
ws = gams.GamsWorkspace(working_directory=d['work']) # specify where you want to run the GAMS models from (here the repository referred to in d['work'])
threshold = 1 # anything below 1 million is removed from the data
ras_settings = IOfunctions.standardCleanSettings(db, threshold)
# Run RAS adjustment:
vs, ms = {}, {}
for k,v in ras_settings.items():
    vs[k] = RAS.shareRAS(v['v0'], v['vBar'], **v['kwargs']) # Initialize small gams model
    vs[k].compile() # set up model
    vs[k].write(); # write gams code
    ms[k] = vs[k].run(exportTo = d['work'], ws = ws) # solve
gpyDB.add_or_merge_vals(db, pd.concat([ms[k].out_db.get('vD') for k in ms]+[ras_settings[k]['vBar'] for k in ras_settings],axis=0), name = 'vD') # add data to database
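
For intuition, the sketch below runs a textbook RAS procedure on a small hypothetical block: entries below the threshold are set to zero, and rows and columns are then rescaled in turn until the original totals are restored. This is a stand-in for illustration only; it is not the ```RAS.shareRAS``` GAMS model used above, and unlike some of the blocks above it keeps both row and column sums fixed.

```python
import numpy as np


def ras(x0, row_totals, col_totals, tol=1e-10, max_iter=1000):
    """Textbook RAS: alternately scale rows and columns to hit the target totals."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x *= (row_totals / x.sum(axis=1))[:, None]  # match row sums
        x *= (col_totals / x.sum(axis=0))[None, :]  # match column sums
        if np.abs(x.sum(axis=1) - row_totals).max() < tol:
            break
    return x


# Hypothetical block of IO values (in millions)
v0 = np.array([[10.0, 0.5, 4.0],
               [3.0, 8.0, 0.2],
               [6.0, 2.0, 9.0]])
threshold = 1
cleaned = np.where(np.abs(v0) < threshold, 0.0, v0)  # drop small entries
balanced = ras(cleaned, v0.sum(axis=1), v0.sum(axis=0))
```

The removed entries stay at zero, since scaling never changes a zero, while the remaining entries absorb the difference.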

*Remove zero values and residual income category:*

In [13]:
db['vD'] = adj.rc_pd(db.get('vD')[db.get('vD')!=0], ('not', pd.Index(['resIncome'], name = 'n')))

*Rescale values: divide by 10000 (measure in tens of billions of DKK):*

In [14]:
for k in [i for i in db.getTypes(['variable','parameter']) if i.startswith(('q','v'))]+['TotalTax']:
    db[k] = db.get(k)/10000 # rescale values and quantities

### 6. Create variables

#### 6.1. Value of supply

At this stage, supply comes from (1) households supplying labor and (2) domestic production and investment sectors.

In [15]:
def repeatIndex(s, i1 = 'n', i2 = 's'):
    return s.reset_index().assign(**{i2: s.index}).set_index([i2,i1]).iloc[:,0]

In [16]:
vS = repeatIndex(adj.rc_pd(db.get('vD'), ('or', [db.get('n_p'), db.get('inv_p')])).groupby('n').sum()) # domestic production/investment supply
vS.loc[('HH','L')] = db.get('vD').xs('L',level='n').sum() # add value of household supply of labor
gpyDB.add_or_merge_vals(db, vS, name = 'vS') # add to database
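
The construction can be traced on toy data with the sketch below. The sectors, goods, and values are hypothetical; the sketch assumes, as in the cells above, that each sector supplies the good of the same name.

```python
import pandas as pd

# Hypothetical demand values by (sector, good)
vD = pd.Series(
    [5.0, 3.0, 2.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [("a", "a"), ("b", "a"), ("a", "b"), ("a", "L")], names=["s", "n"]
    ),
)

# The value supplied of each produced good is the total demand for it
by_good = vD.drop("L", level="n").groupby("n").sum()

# Each sector supplies the good of the same name: repeat the index as (s, n) = (x, x)
vS = by_good.copy()
vS.index = pd.MultiIndex.from_arrays([by_good.index.rename("s"), by_good.index])

# Households supply labor
vS.loc[("HH", "L")] = vD.xs("L", level="n").sum()
```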

#### 6.2. Prices

If no prices have been loaded, set them all to 1:

In [17]:
if 'p' not in db.symbols:
    db['p'] = pd.Series(1, index = db.get('vS').index.levels[-1].union(db.get('n_F')))

#### 6.3 Durables

Set the quantity of durables equal to their value, and define the price ```pD_dur``` as the static user cost term:

In [18]:
db['qD'] = adj.rc_pd(db.get('vD'), db.get('dur_p')).rename('qD')
db['pD_dur'] = adjMultiIndex.applyMult(db.get('p').rename_axis(index = {'n':'nn'}), db.get('dur2inv')).dropna().droplevel('nn') * (glob.db['R_LR'].vals/(1+glob.db['infl_LR'].vals)+db.get('rDepr')-1)
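
Read off the expression above, the user cost is the price of the investment good times `R/(1+infl) + rDepr - 1`. The sketch below evaluates this for hypothetical numbers; the helper `user_cost` is ours, and it assumes that `R_LR` is an interest factor (one plus the interest rate).

```python
def user_cost(p_inv, R, infl, depr):
    """Static user cost of a durable: p * (R / (1 + infl) + depr - 1)."""
    return p_inv * (R / (1 + infl) + depr - 1)


# Hypothetical numbers: interest factor 1.03, 2% inflation, 5% depreciation
pD_dur = user_cost(p_inv=1.0, R=1.03, infl=0.02, depr=0.05)
```

With these numbers the user cost is roughly 6% of the investment price: about 1% real interest plus 5% depreciation.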

#### 6.4. Quantities

Back out quantities from values and prices: Don't keep residual income.

In [19]:
db['qD'] = db.get('qD').combine_first( adj.rc_pd(db.get('vD'), ('not', db.get('dur_p'))) / db.get('p'))
db['qS'] = db['vS'].vals / db.get('p')

#### 6.5. Effective prices

Initialize the prices ```pD``` and ```pS``` at the equilibrium prices:

In [20]:
if 'pD' not in db.symbols:
    db['pD'] = adjMultiIndex.bc(db.get('p'), adj.rc_pd(db.get('qD'), ('not', db.get('dur_p')))).reorder_levels(db['qD'].domains).rename('pD') # span the pure prices 'p' to fit entire qD domain
if 'pS' not in db.symbols:
    db['pS'] = adjMultiIndex.bc(adj.rc_pd(db.get('p'), ('not', db.get('n_F'))), db.get('qS')).reorder_levels(db['qS'].domains).rename('pS') # span 'p' to fit domain of qS. Drop prices on foreign goods.
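
Spanning a price defined over goods to a larger (s,n) domain amounts to a lookup on the goods level. The sketch below shows this in plain pandas on hypothetical data; it is not the implementation of `adjMultiIndex.bc`.

```python
import pandas as pd

p = pd.Series({"a": 1.0, "b": 2.0}).rename_axis("n")  # hypothetical prices by good
qD = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.MultiIndex.from_tuples(
        [("s1", "a"), ("s1", "b"), ("s2", "a")], names=["s", "n"]
    ),
)

# Look up the price of good n for every (s, n) pair in the domain of qD
pD = pd.Series(
    p.reindex(qD.index.get_level_values("n")).to_numpy(), index=qD.index, name="pD"
)
```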

### 7. Create relevant subsets and mappings

#### 7.1. Domains for general equilibrium

In [21]:
db['nEqui'] = db['vS'].index.levels[-1] # what goods does the model need to identify an equilibrium for
db['d_qS']  = db['vS'].index # what (s,n) combinations does supply come from
db['d_qD'] = adj.rc_pd(db['vD'], db['nEqui']).index # what (s,n) combinations does demand come from
db['d_qSEqui'] = adj.rc_pd(db['d_qS'].vals, ('not', db['s_HH'])) # Going from partial to general equilibrium, what 'qS' values should be endogenized
db['d_pEqui']  = pd.Index(['L'], name = 'n') # Going from partial to general equilibrium, what 'p' values should be endogenized

#### 7.2. Trade mappings

Define the mapping from domestic to the equivalent foreign goods (with syntax ```x,x_F```):

In [22]:
db['dom2for'] = pd.MultiIndex.from_arrays([db.get('n_p').sort_values(), db.get('n_F').sort_values().rename('nn')])
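
Pairing the two sorted indices works when every domestic good has a foreign counterpart and both indices sort in the same order. As an alternative, the pairs can be built from the ```x_F``` syntax itself. The sketch below does this for hypothetical goods, one of which has no foreign counterpart.

```python
import pandas as pd

n_p = pd.Index(["a", "b", "c"], name="n")  # hypothetical domestic goods
n_F = pd.Index(["c_F", "a_F"], name="n")   # hypothetical foreign goods (no b_F)

# Pair each domestic good x with x_F, keeping only pairs whose foreign good exists
candidates = pd.MultiIndex.from_arrays([n_p, (n_p + "_F").rename("nn")])
dom2for = candidates[candidates.get_level_values("nn").isin(n_F)]
```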

Define the subset ```dExport[s,n]``` as the foreign sectors' demand for domestic goods:

In [23]:
db['dExport'] = adj.rc_pd(db.get('vD'), db.get('s_f')).index # foreign sectors' demand for domestic goods

Define the subset ```dImport``` as the (sector, domestic good, foreign good) combinations (s,n,nn) in the data, i.e. where the sector demands both the domestic and the foreign product:

In [24]:
vD_dom = adjMultiIndex.applyMult(adj.rc_pd(db.get('vD'), db.get('n_p')), db.get('dom2for')) # demand for domestic goods mapped to foreign goods types
vD_for = adj.rc_pd(db.get('vD'), db.get('n_F')).rename_axis(index={'n':'nn'}) # demand for foreign goods
db['dImport'] = adj.rc_pd(vD_dom, vD_for).reorder_levels(['s','n','nn']).index

Define the subset ```dImport_dom``` as the sector, domestic good combination (s,n) where the sector only demands the domestic and not the corresponding foreign good:

In [25]:
db['dImport_dom'] = adj.rc_pd(vD_dom, ('not', vD_for)).droplevel('nn').reorder_levels(['s','n']).index

Define the subset ```dImport_for``` as the sector, foreign good combinations (s,n) where the sector only demands the foreign and not the corresponding domestic good:

In [26]:
db['dImport_for'] = adj.rc_pd(vD_for, ('not', db['dImport'])).index.rename(['s','n']).reorder_levels(['s','n'])
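
The three import subsets split the observed demand into "both", "domestic only", and "foreign only". The sketch below reproduces that split with plain set operations on hypothetical (s,n) pairs. Unlike the cells above, it names the foreign-only pairs by their domestic counterpart to keep the example short.

```python
import pandas as pd

# Hypothetical (sector, good) pairs with positive demand
dom = pd.MultiIndex.from_tuples(
    [("s1", "a"), ("s1", "b"), ("s2", "a")], names=["s", "n"]
)
fgn = pd.MultiIndex.from_tuples([("s1", "a_F"), ("s2", "c_F")], names=["s", "n"])

# Express foreign demand in terms of the domestic counterpart (strip the _F suffix)
fgn_as_dom = pd.MultiIndex.from_arrays(
    [
        fgn.get_level_values("s"),
        fgn.get_level_values("n").str.replace("_F$", "", regex=True),
    ],
    names=["s", "n"],
)

both = dom.intersection(fgn_as_dom)    # demands domestic and foreign variety
dom_only = dom.difference(fgn_as_dom)  # demands only the domestic variety
for_only = fgn_as_dom.difference(dom)  # demands only the foreign variety
```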

### 8. Export

In [27]:
aggregateDB.readSets(db) # read sets from the symbols in data
db.export(repo = d['data'])
with open(os.path.join(d['data'], f"glob_{name}"), "wb") as file: # build the path portably instead of hard-coding a backslash
    pickle.dump(glob,file)