This repo contains the data and scripts behind "Where in The U.S. Are You Most Likely to Be Audited by the IRS?" published April 1, 2019.
The earned income tax credit, or EITC, is a program designed to help boost low-income workers out of poverty. In response to pressure from congressional Republicans to root out incorrect payments of the credit, the IRS audits EITC recipients at higher rates than all but the richest Americans.
Kim M. Bloomquist, who served as a senior economist with the IRS’ research division for two decades, decided to map the distribution of audits to illustrate the dramatic regional effects of the agency's emphasis on EITC recipients. In a study first published in Tax Notes, he found that because more than a third of all audits are of EITC recipients, the number of audits in each county is largely a reflection of how many taxpayers there claimed the credit.
The included data covers the total number of income tax filings and the estimated number of audits per county, for the combined tax years 2012-15.
All raw data is in the data/raw/ subfolder.
- Bloomquist - Regional Bias in IRS Audit Selection Data.xlsx: Covers the estimated number of tax exams (aka audits) per county for tax years 2012-15. This data was calculated and provided by Kim M. Bloomquist. Rates were estimated using audit coverage rates published in the annual IRS Data Book in combination with county tax return data on the IRS website.
- County-2012.xlsx, County-2013.xlsx, County-2014.xlsx, County-2015.xlsx: Cover the number of filings per county for tax years 2012-15. These were downloaded from the IRS website.
See dataToJSON.R for all data cleaning.
All cleaned data and documentation are in the data/cleaned/ subfolder.
auditsData_2019.04.03.csv and auditsData_2019.04.03.json have the same data, just saved in different formats.
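As a quick sanity check that the two files line up, the sketch below compares row counts and column names between the CSV and the JSON. It is written in Python rather than R for illustration and is not part of this repo. It assumes the JSON file holds a top-level array of objects, one per row, which has not been verified here; the helper names `load_csv_rows`, `load_json_rows` and `compare_shapes` are made up for this example.

```python
import csv
import json


def load_csv_rows(path):
    """Read a CSV file into a list of dicts, one per row (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_json_rows(path):
    """Read a JSON file. ASSUMPTION: it is a top-level array of row objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def compare_shapes(csv_rows, json_rows):
    """Report whether row counts match and which column names differ."""
    csv_cols = set(csv_rows[0]) if csv_rows else set()
    # Take the union of keys across all JSON rows, because some exporters
    # omit a key entirely when the value is missing.
    json_cols = set().union(*(row.keys() for row in json_rows)) if json_rows else set()
    return {
        "same_row_count": len(csv_rows) == len(json_rows),
        "columns_only_in_csv": sorted(csv_cols - json_cols),
        "columns_only_in_json": sorted(json_cols - csv_cols),
    }
```

To use it, pass the two paths under data/cleaned/ to `load_csv_rows` and `load_json_rows`, then hand the results to `compare_shapes`. The check deliberately stops at shape: comparing values cell by cell would need rules for number formatting and missing values that this README does not specify.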
auditsData_dicitonary.csv contains column definitions.
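Because the data pairs filings with estimated audits for each county, one natural derived measure is audits per 1,000 filings. The sketch below shows one way to compute it in Python. It is an illustration under stated assumptions, not the method used in dataToJSON.R. The column names are passed in as arguments because this README does not list them; look them up in the dictionary file first. The function name `audits_per_thousand` is hypothetical.

```python
def audits_per_thousand(rows, audits_col, filings_col):
    """Return audits per 1,000 filings for each row, or None where it cannot be computed.

    rows        -- list of dicts, e.g. from csv.DictReader
    audits_col  -- name of the estimated-audits column (check the data dictionary)
    filings_col -- name of the total-filings column (check the data dictionary)
    """
    rates = []
    for row in rows:
        try:
            audits = float(row[audits_col])
            filings = float(row[filings_col])
        except (KeyError, TypeError, ValueError):
            # Missing column, empty cell, or a non-numeric marker such as "NA".
            rates.append(None)
            continue
        # Skip rows with no filings to avoid dividing by zero.
        rates.append(1000 * audits / filings if filings > 0 else None)
    return rates
```

Rows that cannot be converted to numbers yield `None` rather than raising, so a single malformed row does not stop the whole pass.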