# ESTIMATION OF TAX EXPENDITURES

In general, tax expenditures are intended to support particular economic sectors, activities, or social groups.
There is no unified definition of tax expenditures in the literature. According to the World Bank, tax
expenditures in a broad sense are tax provisions that deviate from a normative or a specific tax system and
may take a number of forms: exemptions, allowances, deductions, rebates, credits, preferential tax rates, or
tax deferrals (WB 2006). According to the Organisation for Economic Cooperation and Development (OECD),
a tax expenditure is a transfer of public resources that is achieved by reducing tax obligations with respect
to a benchmark tax (i.e., the standard tax system), rather than by a direct expenditure. Because no single
definition is applied everywhere, most countries use the OECD definition. Whatever their form, tax
expenditures can be delivered through several types of direct and indirect taxes. Each provision is usually
tied to a specific goal, and the beneficiaries can be natural persons or legal entities.

Estimating tax expenditures requires a robust micro-simulation model that can estimate them across different groups of taxpayers.
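
As a minimal sketch of the benchmark idea, the tax expenditure for a single taxpayer can be written as the tax due under the benchmark rules minus the tax actually paid. The rates below (27.5% social security contributions, a 10% flat personal income tax, and an allowance of 8,000) are the ones used in the synthetic data later in this notebook; the function name `benchmark_pit` and the example taxpayer are illustrative assumptions.

```r
# Benchmark PIT: 10% of gross income net of SSC (27.5%) and the personal allowance.
# The function name and the example taxpayer are hypothetical.
benchmark_pit <- function(gross_i, personal_allowance = 8000,
                          ssc_rate = 0.275, pit_rate = 0.10) {
  ssc <- gross_i * ssc_rate
  (gross_i - (ssc + personal_allowance)) * pit_rate
}

gross_i  <- 100000   # gross income of an example taxpayer
pit_paid <- 0        # the taxpayer benefits from a full exemption

# Tax expenditure = tax due under the benchmark minus tax actually paid
tax_expenditure <- benchmark_pit(gross_i) - pit_paid
print(tax_expenditure)
```

For this taxpayer the forgone revenue is about 6,450: the PIT that would have been due without the exemption.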

In [None]:
options(warn=-1)
library(tidyverse)
library(ggQC)
library(data.table)
options(scipen = 999) 


# 1. Creating synthetic data for simulation

In [None]:
# Creating artificial data for simulation
SOURCE_DATA_WAGES<-data.frame(stringsAsFactors = FALSE, # keep strings as character, not factors
                                              m = as.integer(runif(10000, 1, 13)),
                                              id = as.integer(runif(10000, 1, 1000000)),
                           nace = sample(x=c("01","02","03","B","10-12","13-15","16","17","18","19","20","21",   
                                                       "22","23","24","25","26","27","28","29","30","31-32","33","35","F","45","46","47",
                                                       "49","50","52","53","I","58","59-60","61","62-63","64","66","68B","69-70","71","73",
                                                        "74-75","77","79","80-82","85","86","87-88","90-92","93","94","95","96","T","36","65","51"),
                                            prob = c(0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,	
                                                        0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,
                                                        0.0169,0.0169,0.066,0.0169,0.0169,0.0169,0.0169,
                                                        0.0169,0.0169,0.0169,0.0169,0.0332,0.0169,0.0169,
                                                        0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,
                                                        0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,0.0169,
                                                        0.0169,0.0169,0.0169,0.0169,0.0169,0.015,0.0158,
                                                        0.0169,0.0169,0.0169,0.0169,0.0169,0.012,0.01,0.001,0.001,0.001),size=10000,replace=TRUE),
                             yearofbirth = as.integer(runif(10000, 1957, 2003)),
                             typeofincome = sample(x=c("1","2","3"), prob = c(.4, .4,.2),size=10000,replace=TRUE),
                             sex = sample(x=c("M","F"), prob = c(.6, .4),size=10000,replace=TRUE),
                             # rnorm() takes the number of draws as its first argument
                             gross_i = abs(rnorm(10000, mean = 350000, sd = 700000)),
                            personal_allowance = as.double(sample(x=c(0,8000), prob = c(.1, .9),size=10000,replace=TRUE)))%>%
# SSC is 27.5% of gross income; PIT is 10% of gross income net of SSC and the personal allowance
dplyr::mutate(ssc = gross_i*0.275,pit = as.integer((gross_i-(ssc+personal_allowance))*0.10))
# Introduce tax expenditures: set PIT and SSC to zero for 3,000 randomly chosen records each
SOURCE_DATA_WAGES$pit[sample(nrow(SOURCE_DATA_WAGES),3000)] <- 0
SOURCE_DATA_WAGES$ssc[sample(nrow(SOURCE_DATA_WAGES),3000)] <- 0
SOURCE_DATA_WAGES<-SOURCE_DATA_WAGES%>%
dplyr::mutate( net_i = gross_i-(ssc+pit))%>%
data.table()

In [None]:
nrow(SOURCE_DATA_WAGES)
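
Because the data are drawn at random, every run of the data-generating cell produces a different table. If reproducible results are needed, one option is to fix the seed before sampling. The sketch below is an assumption about workflow rather than part of the original notebook; the seed value 42 is arbitrary.

```r
# Fixing the seed makes random draws repeatable (the seed value is arbitrary)
set.seed(42)
draw_1 <- sample(c("1", "2", "3"), size = 5, replace = TRUE, prob = c(.4, .4, .2))

set.seed(42)
draw_2 <- sample(c("1", "2", "3"), size = 5, replace = TRUE, prob = c(.4, .4, .2))

identical(draw_1, draw_2)  # TRUE: same seed, same draws
```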

# 2. Defining a simple function for detecting tax expenditures

This function is a simple example of detecting TEs. A record is flagged as a TE when the taxpayer's income type is 2 or 3 and the PIT amount is 0.

In [None]:
# Flag a tax expenditure when the income type is 2 or 3 and no PIT was paid
detectig_TE <- function(typeofincome,pit){
                  taxexpenditures <- ifelse(typeofincome %in% c("2","3") & pit == 0, "Yes", "No")
                  return(taxexpenditures)
                }

In [None]:
Estimated_TE<-mutate(SOURCE_DATA_WAGES,
                    TE=detectig_TE(typeofincome,pit)
                    )

In [None]:
head(Estimated_TE)
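
To see the rule in isolation, the same logic can be applied to a small hand-made table. This is a sketch: the helper name `flag_te` and the four example rows are illustrative and not part of the original data.

```r
# Toy data: four taxpayers with different income types and PIT amounts
toy <- data.frame(
  typeofincome = c("1", "2", "3", "2"),
  pit          = c(0L, 0L, 500L, 0L),
  stringsAsFactors = FALSE
)

# Flag a TE when the income type is 2 or 3 and no PIT was paid
flag_te <- function(typeofincome, pit) {
  ifelse(typeofincome %in% c("2", "3") & pit == 0, "Yes", "No")
}

toy$TE <- flag_te(toy$typeofincome, toy$pit)
print(toy)
```

The first row has zero PIT but income type 1, so it is not flagged; the third row has an eligible income type but paid PIT. Only rows 2 and 4 meet both conditions.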

# 3. Estimation of TEs

In [None]:
TE_TABLE_GROUP<-Estimated_TE%>%
dplyr::filter(TE=="Yes")%>%
dplyr::mutate(estimated_te = as.integer((gross_i-(ssc+personal_allowance))*0.10))%>%
dplyr::select(nace,estimated_te)%>%
dplyr::group_by(nace)%>%
dplyr::summarise(value=sum(estimated_te))%>%
dplyr::slice_max(value, n = 10) # keep the 10 NACE codes with the largest estimated TE

In [None]:
head(TE_TABLE_GROUP)

In [None]:
summary(TE_TABLE_GROUP)
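
The aggregation step can be checked by hand on a few rows. The sketch below assumes the `dplyr` package (loaded earlier through `tidyverse`); the table `toy_te` and its values are made up for illustration.

```r
library(dplyr)

# Hypothetical flagged records: NACE code and estimated TE per taxpayer
toy_te <- data.frame(
  nace         = c("F", "F", "47", "I", "47", "F"),
  estimated_te = c(100L, 250L, 400L, 50L, 200L, 150L),
  stringsAsFactors = FALSE
)

# Sum the estimated TE per NACE code and keep the two largest totals
top_te <- toy_te %>%
  group_by(nace) %>%
  summarise(value = sum(estimated_te)) %>%
  slice_max(value, n = 2)

print(top_te)
```

Here code 47 totals 600 and code F totals 500, so those two rows are kept and code I (50) is dropped.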

# 4. Pareto plot (top 10 TEs)

In [None]:
ggplot(TE_TABLE_GROUP, aes(x=nace, y=value)) +
  stat_pareto(point.color = "red",
              point.size = 3,
              line.color = "black",
              size.line = 1,
              bars.fill = c("blue", "orange")
  )+
  xlab('NACE chapters') +
  ylab('Estimated TE (currency units)')
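
A Pareto chart orders the bars from largest to smallest and overlays the cumulative share of the total. The underlying numbers can be computed directly, as in the sketch below, which uses base R only; the totals in `te_by_nace` are invented for illustration.

```r
# Hypothetical TE totals by NACE code
te_by_nace <- c("47" = 600, "F" = 300, "I" = 100)

# Sort descending, then compute each code's cumulative share of the total
sorted    <- sort(te_by_nace, decreasing = TRUE)
cum_share <- cumsum(sorted) / sum(sorted) * 100

print(round(cum_share, 1))
```

In this toy case the largest code alone accounts for 60% of the total and the top two for 90%, which is the kind of concentration the cumulative line in the plot makes visible.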