## Predicting National Firearms Registry Processing Times using Supervised Machine Learning

Author: Keigan Kincaid

Creation Date: 11th March 2022

Last Updated: 12th March 2022

Project Repository: https://github.com/Keigan/SLOW_AtF.py

Webpage: keigankincaid.com

Ethereum Address (ENS Domain): keigan.eth

### Abstract

Social media platforms have made it relatively easy to analyze populations. Software companies, marketing firms, and advertisers focus heavily on analyzing user data in order to predict and influence the behavior of social media users. Data pipelines are commonly built by scraping social media posts, trends, and activity, but self-reported data often gives access to datasets that were cleaned before collection. These precompiled datasets enable rapid production of statistics, models, and insights that can meaningfully impact users. In this study we use a dataset from r/NFA (https://www.reddit.com/r/NFA/), a National Firearms Act (NFA) community on the popular internet forum Reddit. The post in question is a thread of replies about the processing of NFA tax stamps for Short Barreled Rifles (SBRs) and suppressors in the United States.
    
In 1934 the National Firearms Act was passed, requiring that purchases and all transfers of ownership of registered NFA firearms and items be recorded through the National Firearms Registration and Transfer Record. Among other things, the NFA also requires that the owner report the permanent transport of NFA firearms across state lines to the Bureau of Alcohol, Tobacco, Firearms and Explosives (ATF).
    
Users on Reddit have used the platform to share their displeasure with the bureaucratic process, to educate one another on navigating it, and, fortunately for machine-learning researchers and data engineers, to maintain a tabular dataset that is publicly available in Google Sheets. Here we take a deep dive into predicting NFA tax stamp processing times using multiple methods and attempt to develop an accurate predictive model.

## Variable Descriptions and Key Values

#### Type Of Application (Individual or Trust)
###### TYPE
- Individual = 0
- Trust = 1

#### Date Application Pending / Date Approved

###### P_DAY
- DD

###### P_MONTH
- MM

###### P_YEAR
- YYYY

###### A_DAY
- DD

###### A_MONTH
- MM

###### A_YEAR
- YYYY

#### State for which Application Submitted
###### STATE
- State key values are the standardized FIPS state codes

#### Item for which Application Submitted
###### ITEM
- Suppressor = 0
- Short Barrel Rifle = 1
- Short Barrel Shotgun = 2
- Machine Gun = 3
- Destructive Device = 4
- Other Weapon = 5

#### Form for which Application Submitted
###### FORM
- Form 1 Traditional Filing = 0 
- Form 1 Electronic Filing = 1
- Form 4 Traditional = 2
- Form 4 Electronic Filing = 3
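
The key values listed above can be turned back into readable labels with plain dictionaries. The following is a minimal sketch, not part of the original notebook: `decode_row` is a hypothetical helper, and the FIPS table is deliberately partial (only a handful of state codes are included for illustration).

```python
# Key-value tables mirroring the variable descriptions in this document.
TYPE_KEYS = {0: "Individual", 1: "Trust"}
ITEM_KEYS = {
    0: "Suppressor",
    1: "Short Barrel Rifle",
    2: "Short Barrel Shotgun",
    3: "Machine Gun",
    4: "Destructive Device",
    5: "Other Weapon",
}
FORM_KEYS = {
    0: "Form 1 Traditional Filing",
    1: "Form 1 Electronic Filing",
    2: "Form 4 Traditional",
    3: "Form 4 Electronic Filing",
}
# Partial FIPS state-code table (assumption: a few codes only, for illustration).
FIPS_STATES = {
    29: "Missouri",
    34: "New Jersey",
    42: "Pennsylvania",
    48: "Texas",
    51: "Virginia",
    53: "Washington",
}


def decode_row(row):
    """Return readable labels for one encoded row (hypothetical helper)."""
    return {
        "type": TYPE_KEYS[row["type"]],
        "item": ITEM_KEYS[row["item"]],
        "form": FORM_KEYS[row["form"]],
        # Fall back to the raw code when the state is not in the partial table.
        "state": FIPS_STATES.get(row["state"], f"FIPS {row['state']}"),
    }


example = {"type": 0, "item": 5, "form": 1, "state": 34}
print(decode_row(example))
```

Keeping the mappings in one place makes it easier to check that the encoded CSV and the descriptions here stay in sync.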

#### Overall Application Time Elapsed (from Pending to Approved)
###### DURATION
- Integer number of days from the date the application was marked pending to the date it was marked approved.
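
As a rough illustration of how DURATION relates to the six date columns, the sketch below recomputes it with the standard library. `duration_days` is a hypothetical helper, not code from the project repository, and the second call uses made-up dates.

```python
from datetime import date


def duration_days(p_day, p_month, p_year, a_day, a_month, a_year):
    """Days from the pending date to the approval date (hypothetical helper)."""
    pending = date(p_year, p_month, p_day)
    approved = date(a_year, a_month, a_day)
    return (approved - pending).days


# Pending 28 Sep 2019, approved 17 Oct 2019.
print(duration_days(28, 9, 2019, 17, 10, 2019))  # 19

# Made-up dates: an approval date earlier than the pending date yields a
# negative value, which in self-reported data usually signals an entry error.
print(duration_days(5, 1, 2021, 1, 1, 2021))  # -4
```

Because the duration is fully determined by the two dates, a model that receives both the pending and the approval columns as features already has the target encoded in its inputs.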

In [11]:
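import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report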

%matplotlib inline

In [36]:
nfa_tax_stamp_dataset = pd.read_csv('key_value_dataset.csv')

nfa_tax_stamp_dataset

Unnamed: 0,p_day,p_month,p_year,a_day,a_month,a_year,type,item,form,duration,state
0,28,9,2019,17,10,2019,0,5,1,19,34
1,9,6,2020,1,7,2020,0,5,1,22,34
2,30,10,2020,1,12,2020,0,5,1,32,34
3,28,2,2021,1,4,2021,0,5,1,32,34
4,27,4,2021,9,6,2021,0,5,1,43,53
...,...,...,...,...,...,...,...,...,...,...,...
2450,21,3,2019,28,5,2019,1,2,1,68,48
2451,15,9,2020,7,1,2021,0,2,2,114,42
2452,16,1,2021,24,8,2021,0,2,0,220,29
2453,13,7,2018,24,3,2019,1,2,0,254,51


In [37]:
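# Features: application type, pending and approval dates, state, item, and form.
# Caution: the approval-date columns combined with the pending-date columns
# determine `duration` exactly, so keeping both as features leaks the target.
X = nfa_tax_stamp_dataset[['type', 'p_day','p_month','p_year','a_day','a_month','a_year', 'state', 'item', 'form']]
# Target: days elapsed between the pending and approved dates.
Y = nfa_tax_stamp_dataset['duration']

# Hold out 25% of the rows for testing; the fixed seed keeps the split reproducible.
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.25, random_state=1776)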

X_train = X_train.reset_index(drop=True)
Y_train = Y_train.reset_index(drop=True)

print(X_train.shape)
print(Y_train.shape)
print(X_test.shape)
print(Y_test.shape)

(1841, 2)
(1841,)
(614, 2)
(614,)


In [38]:
# SVC is a classifier, so every distinct duration (in days) is treated as its own class.
model = svm.SVC(C=.01)
model.fit(X_train, Y_train)

y_pred = model.predict(X_test)

print("Accuracy Score:", accuracy_score(Y_test, y_pred), "\n")

print(classification_report(Y_test, y_pred))

Accuracy Score: 0.008143322475570033 

              precision    recall  f1-score   support

          -4       0.00      0.00      0.00         1
           7       0.00      0.00      0.00         1
           9       0.00      0.00      0.00         1
          10       0.00      0.00      0.00         1
          12       0.00      0.00      0.00         5
          13       0.00      0.00      0.00         5
          14       0.00      0.00      0.00         8
          15       0.00      0.00      0.00         7
          16       0.00      0.00      0.00         4
          17       0.00      0.00      0.00        11
          18       0.00      0.00      0.00         4
          19       0.00      0.00      0.00        11
          20       0.00      0.00      0.00        13
          21       0.00      0.00      0.00        11
          22       0.00      0.00      0.00        13
          23       0.00      0.00      0.00         8
          24       0.00      0.00      0.0

  _warn_prf(average, modifier, msg_start, len(result))
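
The accuracy score above counts a prediction as correct only when it matches the true duration to the exact day, which is a harsh measure for a numeric target. The sketch below contrasts exact-match accuracy with mean absolute error, an error-based metric that rewards near misses. Both functions are hand-written illustrations, and the values are hypothetical (not taken from the dataset or from the model above).

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true value exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, in days, between true and predicted durations."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical durations in days, chosen only for illustration.
y_true = [30, 45, 60, 90]
y_pred = [31, 44, 60, 87]

print(exact_match_accuracy(y_true, y_pred))  # 0.25
print(mean_absolute_error(y_true, y_pred))   # 1.25
```

In this toy case three of four predictions count as wrong under accuracy even though the average miss is just over one day, which suggests that a regression formulation with an error-based metric may be a better fit for predicting processing times.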


## References

- Google Form used to collect data: https://docs.google.com/forms/d/e/1FAIpQLSd7LcqXHZBflBfoRzfezM_ZRkmXfdjPHYzew5KdMtXN0mJ8Fg/viewform
- Original response dataset in Google Sheets: https://docs.google.com/spreadsheets/d/1RsR8JOt8fcKIAbuv6ParmYTfSsUHwk3mEhVt0encK6g/edit#gid=1955807116
- NFA tax stamp law: https://uslaw.link/citation/us-law/public/73/474