# 6 of 15 central calculations

The first year course, Descriptive Economics A, presents 15 calculations for doing basic descriptive statistics.
This project shows how to conveniently apply these calcuations on selected data from Statistics Denmark to easily see the many calculation done on a short time series. We have selected six of these 15 calculations to describe the change in expenditure and income in the public sector (OFF3).

\begin{align}
    \text{Absolute change: }& x_t - x_{t-1} \\
    \text{Average absolute change: }& \frac{x_n - x_0}{n} \\
    \text{Percentage change: }& \left(\frac{x_t}{x_{t-1}}-1\right)\times 100 \\
    \text{Average percentage change: }& \left[\left(\frac{x_n}{x_0}\right)^{\frac{1}{n}}-1\right]\times 100 \\
    \text{Change in percentage points: }& \text{pct.-points}_t - \text{pct. points}_{t-1} \\
    \text{Simple index: }& \frac{x_t}{x_0}\times 100 \\
\end{align}

## Importing modules

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import pydst
import pprint as pp
import ipywidgets as widgets
from ipywidgets import interact
dst = pydst.Dst(lang='en')

## Collecting data

In [2]:
variables = dst.get_variables(table_id = 'OFF3') # collecting table OFF3 from Statistics Denmark
OFF3 = dst.get_data(table_id = 'OFF3', variables={'UI':['1.8','1.13','1.16','1.17','2.13','2.16','2.17','2.18',
                                                        '2.19'], # sub-categories
                                                  'Tid':['*'], # all time
                                                  'SEKTOR':['TOTAL']}) # total public sector
OFF3.sort_values(by = 'TID', inplace = True)

## Cleaning data

In [3]:
del OFF3['SEKTOR'] # deleting irrelevant variable
names = OFF3['UI'][0:9] # generating names variable based on different sub-categories

# Renaming variables
columns_dict = {}
columns_dict['UI'] = 'Variable'
columns_dict['TID'] = 'Time'
columns_dict['INDHOLD'] = 'm DKK'
OFF3.rename(columns = columns_dict,inplace=True)

In [4]:
# Initialize empty dictionary
rename_dict = {}
# List of wanted names
wanted_names = ['1.2: Capital accumulation',
                '1.3: Capital expenses',
                '1.4: Current and capital expenditure (1+3)',
                '1.1: Current expenditure',
                '2.1: Current revenue',
                '2.2: Capital revenue',
                '2.3: Current plus capital revenue (1+2)',
                '2.4: Currents surplus=Gross saving (2.1-1.1)',
                '2.5: Overall surplus=Net lending/borrowing (2.3-1.4)']
# Create rename dictionary for variable names
for name, wantedname in zip(names,wanted_names):
    rename_dict[name] = wantedname
# Rename the variables
for key,value in rename_dict.items():
    OFF3.Variable.replace(key, value, inplace=True)

### Create subsetting booleans

In [5]:
# Initializes list of lists for subsetting
Ilist = [[] for eachlist in range(9)]

In [6]:
# List i in Ilist is the true/false boolean for name i in wanted_names
for number, name in enumerate(wanted_names):
    Ilist[number] = OFF3['Variable']==name

In [8]:
names = ['capital_accumulation', 'capital_expenses', 'current_and_capital_expenditure',
         'current_expenditure', 'current_revenue', 'capital_revenue',
         'current_plus_capital_revenue', 'current_surplus', 'overall_surplus']
for i, name in enumerate(names):
    namespace = locals()
    namespace[name] = (OFF3[Ilist[i]])

In [9]:
dataframes = [capital_accumulation, capital_expenses, current_and_capital_expenditure,
              current_expenditure, current_revenue, capital_revenue,
              current_plus_capital_revenue, current_surplus, overall_surplus]

# Resetting index
for i in dataframes:
    i.reset_index(drop = True, inplace = True)

## Calculations

We can now make each calculation, (1)-(6), on every data frame as a new variable.

In [28]:
for i in dataframes:
    
    # Absolute change
    i['Absolute change'] = i['m DKK'].diff()
    
    # Average absolute change
    i['Average absolute change'] = i['m DKK'].mean()
    
    # Percentage change
    i['Percentage change'] = i['m DKK'].pct_change()*100
    
    # Average percentage change
    i['Average percentage change'] = ((i['m DKK'].iloc[-1]/i['m DKK'][0])**(1/len(i['m DKK']))-1)*100
    
    # Change in percentage points
    i['Change in percentage points'] = i['Pct. change'].diff()
    
    # Simple index
    i['Simple index'] = np.nan # Create 'Simple index'-variable with NaNs
    
    for j in range(0, len(i['m DKK'])):
        i['Simple index'][j] = i['m DKK'].iloc[j]/i['m DKK'].iloc[0]*100

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  after removing the cwd from sys.path.
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  import sys
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  # Remove the CWD from sys.path while we load stuff.
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the document

## Plotting the data

In [23]:
print(type(dataframes()))

AttributeError: 'list' object has no attribute 'tolist'

In [25]:
dataframes

[                     Variable  Time  m DKK  Absolute change          Mean  \
 0   1.2: Capital accumulation  1971   6571              NaN  32813.604167   
 1   1.2: Capital accumulation  1972   6814            243.0  32813.604167   
 2   1.2: Capital accumulation  1973   7044            230.0  32813.604167   
 3   1.2: Capital accumulation  1974   8844           1800.0  32813.604167   
 4   1.2: Capital accumulation  1975   9625            781.0  32813.604167   
 5   1.2: Capital accumulation  1976  10621            996.0  32813.604167   
 6   1.2: Capital accumulation  1977  11689           1068.0  32813.604167   
 7   1.2: Capital accumulation  1978  12342            653.0  32813.604167   
 8   1.2: Capital accumulation  1979  14506           2164.0  32813.604167   
 9   1.2: Capital accumulation  1980  15017            511.0  32813.604167   
 10  1.2: Capital accumulation  1981  15519            502.0  32813.604167   
 11  1.2: Capital accumulation  1982  17011           1492.0  32

In [33]:
rows = ['m DKK', 'Absolute change', 'Average absolute change', 'Percentage change',
        'Average percentage change', 'Change in percentage points', 'Simple index']

data_names = ['Capital accumulation', 'Capital expenses', 'Current and capital expenditure',
             'Current expenditure', 'Current revenue', 'Capital revenue', 
             'Current plus capital revenue', 'Current surplus', 'Overall surplus']

def view2(x, df):
    plt.plot(dataframes[df]['Time'], dataframes[df][x])
    plt.ylabel(x)
    plt.xlabel(data_names[df])
    return plt.show()
w = widgets.Select(options=rows, description='Calculation')
q = widgets.Select(options=range(0,9),value=1, description='Data set')
interact(view2, x=w,df=q)

interactive(children=(Select(description='Calculation', options=('m DKK', 'Absolute change', 'Average absolute…

<function __main__.view2(x, df)>

In [15]:
capital_expenses

Unnamed: 0,Variable,Time,m DKK,Absolute change,Mean,Pct. change,Avg. pct. change,Change in pct. points,Simple index
0,1.3: Capital expenses,1971,7221,,38328.75,,5.263928,,100.0
1,1.3: Capital expenses,1972,7172,-49.0,38328.75,-0.678576,5.263928,,99.321424
2,1.3: Capital expenses,1973,7892,720.0,38328.75,10.039041,5.263928,10.717617,109.292342
3,1.3: Capital expenses,1974,9800,1908.0,38328.75,24.176381,5.263928,14.13734,135.715275
4,1.3: Capital expenses,1975,10903,1103.0,38328.75,11.255102,5.263928,-12.921279,150.990168
5,1.3: Capital expenses,1976,12636,1733.0,38328.75,15.894708,5.263928,4.639606,174.989614
6,1.3: Capital expenses,1977,13928,1292.0,38328.75,10.224755,5.263928,-5.669953,192.881872
7,1.3: Capital expenses,1978,14448,520.0,38328.75,3.733487,5.263928,-6.491268,200.083091
8,1.3: Capital expenses,1979,16421,1973.0,38328.75,13.655869,5.263928,9.922383,227.406176
9,1.3: Capital expenses,1980,17253,832.0,38328.75,5.066683,5.263928,-8.589186,238.928126
