# Methods

- We have the discretized CRSS dataset in '../../Big_Files/CRSS_Discretized_All_05_19_23.csv'
- MissForest is a round-robin imputation method implemented in R, generally considered one of the best imputation methods.  It has several Python implementations.
- I tried to use MissForest, https://pypi.org/project/MissForest/, to impute missing values, but it raised errors; tracking down their source led me to write my own round-robin implementation.
- I compare here three methods:
    - Round-Robin Random Forest (my own implementation of Round Robin, using scikit-learn's random forest)
    - Imputation by mode
    - IVEware, using the hyperparameters in the CRSS Imputation report
- To compare, I followed the example for MissForest.
    - I dropped all samples with a missing value, so I would have ground truth.
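    - I erased about 15% of the values in each feature.  (The code draws the row indices with np.random.choice, which samples with replacement by default, so slightly fewer than 15% are actually erased.)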
    - I used each imputation method to impute the missing values, and, for each feature, counted how many did not match the ground truth.
- My round-robin method
    - In data_NaN, change all of the 'Unknown' values to np.nan.
    - In each feature, count the number of missing samples.
    - In another copy, data_Mode, impute by mode in all of the features.
    - Starting with the feature with the fewest (nonzero) missing samples:
        - Copy that feature from data_NaN into data_Mode, so that only that feature has missing values.
        - Separate the dataframe into two, one with known values in the target variable (X) and one with unknown values (Z).
        - From the dataframe with known values (X), separate out the target variable (call it 'y')
        - Using Random Forest, build a model that maps X to y.  
        - Use the model to impute the missing values
    - At each iteration we replace the mode-imputed values with RF-imputed values.
- IVEware is available on several platforms, but Python is not one of them, so I run it in R outside this notebook.  Be aware that the random selection of values to erase is different for each run, so the IVEware imputation must be run anew each time.
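
The comparison protocol and the round-robin loop described above can be sketched on toy data. This is a minimal illustration, not the notebook's implementation: the toy features `a`, `b`, `c`, the helper names `impute_mode` and `impute_round_robin`, and the forest size are assumptions made for the example, and the masking here draws rows without replacement.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def impute_mode(df):
    """Fill each column's missing values with that column's most frequent value."""
    return df.apply(lambda col: col.fillna(col.mode()[0]))


def impute_round_robin(df, n_estimators=20, random_state=0):
    """One round-robin pass: features with the fewest missing values go first."""
    missing = df.isna().sum()
    order = missing[missing > 0].sort_values().index
    filled = impute_mode(df)              # starting point: mode everywhere
    for feature in order:
        unknown = df[feature].isna()      # rows to impute for this feature
        predictors = filled.drop(columns=feature)
        model = RandomForestClassifier(
            n_estimators=n_estimators, random_state=random_state
        )
        model.fit(predictors[~unknown], df.loc[~unknown, feature])
        # Replace the mode-imputed values with the model's predictions.
        filled.loc[unknown, feature] = model.predict(predictors[unknown])
    return filled


# Toy ground truth: three categorical features; "c" is determined by "a" and "b".
rng = np.random.default_rng(0)
truth = pd.DataFrame({"a": rng.integers(0, 3, 400), "b": rng.integers(0, 2, 400)})
truth["c"] = (truth["a"] + truth["b"]) % 3

# Erase about 15% of each feature (hypothetical choice: without replacement).
masked = truth.astype("float64")
for col in masked.columns:
    rows = rng.choice(masked.index, size=int(0.15 * len(masked)), replace=False)
    masked.loc[rows, col] = np.nan

erased = masked.isna()
rf_filled = impute_round_robin(masked)
mode_filled = impute_mode(masked)

# Error rate = share of erased cells whose imputed value differs from the truth.
rf_error = ((rf_filled != truth) & erased).sum().sum() / erased.sum().sum()
mode_error = ((mode_filled != truth) & erased).sum().sum() / erased.sum().sum()
print(f"RF error: {rf_error:.3f}, mode error: {mode_error:.3f}")
```

Because `filled` is updated inside the loop, later features are predicted from the RF-imputed values of earlier ones, which is the sense in which the mode-imputed values are replaced at each iteration.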

# Results of Comparison of Three Imputation Methods

- Note that these results use the CRSS_Discretized_All_12_22_22.csv file, which included the 2016-2020 data but not 2021.
- We ran the imputation on 78 features with 224,850 samples.  
    - The features are those CRSS features that have data for all of 2016-2020, are not the result of imputation by CRSS, may carry a pattern (unlike effectively random identifiers such as VINs), and have no more than 20% of their samples missing.
    - The features were discretized (binned) down to 2-10 categories before imputation.
    - The samples are those of the 619,027 that have no missing values in any of the 78 features.
- First Run
    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.25% |
| Mode Imputation | 28.51% |
| IVEware | 24.23% |

    - Comparison of number of errors in the 78 features:

|  | Fewer | Equal | More | Total |
| --- | --- | --- | --- | --- |
| Compare RF to Mode | 45 | 33 | 0 | 78 |
| Compare RF to IVEware | 50 | 0 | 28 | 78 |
| Compare Mode to IVEware | 39 | 0 | 39 | 78 |


- Second Run
    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.17% |
| Mode Imputation | 28.42% |
| IVEware | 23.84% |


    - Comparison of number of errors in the 78 features:

|  | Fewer | Equal | More |
| --- | --- | --- | --- |
| Compare RF to Mode | 46 | 31 | 1 |
| Compare RF to IVEware | 49 | 0 | 29 |
| Compare Mode to IVEware |  36 | 1 | 41 |

    - Number of NaN Imputed Differently by Different Methods

| Quantity | Count |
| --- | --- |
|Total Number of NaN|  2,443,202|
|RF Different from Mode|  273,351|
|RF Different from IVEware|  606,751|
|Mode Different from IVEware|  738,833|

- Third run with 79 features (I had neglected to include AGE)


    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.52% |
| Mode Imputation | 28.63% |
| IVEware | 22.73% |



    - Comparison of number of errors in the 79 features:

|  | Fewer | Equal | More |
| --- | --- | --- | --- |
| Compare RF to Mode | 47 | 31 | 1 |
| Compare RF to IVEware | 47 | 0 | 32 |
| Compare Mode to IVEware |  38 | 0 | 41 |

    - Number of NaN Imputed Differently by Different Methods

| Quantity | Count |
| --- | --- |
|Total Number of NaN|  2,417,148|
|RF Different from Mode|  279,104|
|RF Different from IVEware|  580,863|
|Mode Different from IVEware|  713,171|



## Discussion

- Random Forest is as good as or better than Mode for (nearly) every feature.
- Random Forest has fewer errors than IVEware on more than half of the features (47 to 50 of them, depending on the run), though not overwhelmingly, and it is slightly better in the overall share of missing values correctly imputed.
- IVEware and Mode are comparable in the number of features on which each has fewer errors, but IVEware is much better in the overall share of missing values correctly imputed.
- Random Forest and Mode largely make the same mistakes: they disagree with each other on far fewer values than either does with IVEware.
- IVEware makes different mistakes from Random Forest and Mode.
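
The last two points can be checked against the counts reported for the second run. The sketch below only redoes that arithmetic from the published numbers. The final comparison rests on an assumption: that the erased rows were drawn with replacement, which is the default behaviour of `np.random.choice` as it is called in the masking step.

```python
import math

# Counts reported for the second run (78 features, 224,850 samples).
total_nan = 2_443_202
different = {
    "RF vs Mode": 273_351,
    "RF vs IVEware": 606_751,
    "Mode vs IVEware": 738_833,
}

# Share of erased values on which each pair of methods disagrees.
for pair, count in different.items():
    print(f"{pair}: {100 * count / total_nan:.1f}%")

# Realized fraction of erased cells, compared with the expected fraction
# when 15% of the row indices are drawn with replacement for each feature.
realized = total_nan / (78 * 224_850)
expected = 1 - math.exp(-0.15)
print(f"realized: {realized:.4f}, expected with replacement: {expected:.4f}")
```

Random Forest and Mode disagree on roughly one erased value in nine, while each disagrees with IVEware on a quarter or more, and the realized erasure fraction sits just under 14% rather than at 15%.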

## Conclusion

- Use Random Forest

In [1]:
%%latex
\tableofcontents

<IPython.core.display.Latex object>

# Setup
## Import Libraries

In [2]:
import sys, copy, math, time, os

print ('Python version: {}'.format(sys.version))

import numpy as np
print ('NumPy version: {}'.format(np.__version__))
np.set_printoptions(suppress=True)


import pandas as pd
print ('Pandas version:  {}'.format(pd.__version__))
pd.set_option('display.max_rows', 500)

import sklearn
print ('SciKit-Learn version: {}'.format(sklearn.__version__))
from sklearn.model_selection import train_test_split

import sklearn.neighbors._base
sys.modules['sklearn.neighbors.base'] = sklearn.neighbors._base

from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import RandomForestRegressor

# Set Randomness.  Copied from https://www.kaggle.com/code/abazdyrev/keras-nn-focal-loss-experiments
import random
#np.random.seed(42) # NumPy
#random.seed(42) # Python
#tf.set_random_seed(42) # Tensorflow

from IPython.display import Audio
sound_file = './beep.wav'

import warnings
warnings.filterwarnings('ignore')

print ('Finished Importing Libraries')


Python version: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:25:13) [Clang 14.0.6 ]
NumPy version: 1.24.0
Pandas version:  1.5.2
SciKit-Learn version: 1.2.0
Finished Importing Libraries


# Import Data

## Get Data
- The Get_Data() function reads the discretized CRSS file from a folder outside this GitHub repo (because the files are too large for my subscription), drops the columns that CRSS has already imputed (those with '_IM' in the name), and returns the dataframe.
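
The data-loading cell that follows drops every column whose name contains `_IM`, which are the values CRSS has already imputed. A minimal sketch of the same filter on a toy frame, written as a single boolean column mask instead of a loop; the toy frame and the name `kept` are made up for illustration:

```python
import pandas as pd

# Toy frame with two original columns and their CRSS-imputed counterparts.
toy = pd.DataFrame(
    {"HOUR": [1, 2], "HOUR_IM": [1, 2], "SEX": [0, 1], "SEX_IM": [0, 1]}
)

# Keep only the columns whose names do not contain "_IM".
kept = toy.loc[:, ~toy.columns.str.contains("_IM")]
print(list(kept.columns))
```

The mask avoids modifying the frame while iterating over its columns, at the cost of not printing each dropped name.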

In [3]:
def Get_Data():
    print ('Get_Data')
    data = pd.read_csv('../../Big_Files/CRSS_Discretized_All_05_19_23.csv', low_memory=False)
    print ('data.shape = ', data.shape)
    print ('Drop Imputed Columns')
    for feature in data:
        if '_IM' in feature:
            print (feature)
            data.drop(columns=feature, inplace=True)
    
    print ('data.shape = ', data.shape)
    print ()
    
    return data

In [4]:
data = Get_Data()


Get_Data
data.shape =  (747342, 107)
Drop Imputed Columns
HOUR_IM
LGTCON_IM
RELJCT2_IM
WEATHR_IM
WKDY_IM
NO_INJ_IM
MANCOL_IM
EVENT1_IM
ALCHL_IM
MAXSEV_IM
RELJCT1_IM
BDYTYP_IM
IMPACT1_IM
MXVSEV_IM
NUMINJ_IM
PCRASH1_IM
V_ALCH_IM
VEVENT_IM
AGE_IM
EJECT_IM
INJSEV_IM
PERALCH_IM
SEAT_IM
SEX_IM
VEH_AGE_IM
data.shape =  (747342, 82)



In [5]:
def Impute_Round_Robin(data):
    print ('Impute()')
    pd.set_option('display.max_columns', None)
    
    # Replace 'Unknown' with np.NaN
    data.replace({'Unknown': np.nan}, inplace=True)
    display(data.head(20))
    print ()
    
#    data.sort_values(by = ['CASENUM', 'VEH_NO', 'PER_NO'], ascending = [True, True, True])
    
    # Make a list of features with missing samples, 
    #     ordered by the number of missing samples, 
    #     from least to most.  
    Missing = []
    Complete = []
    for feature in data:
        s = data[feature].isna().sum()
        if s==0:
            Complete.append([feature, s])
        if s>0:
            Missing.append([feature, s])
    Missing = sorted (Missing, key=lambda x:x[1], reverse=False)
    print ()
    print ('Complete[]')
    display(Complete)
    print ()
    print ('Missing[]')
    display(Missing)
    print ()
    
    print ('Make data_Mode')
    print ()
    data_Mode = pd.DataFrame()
    for X in Complete:
        feature = X[0]
        data_Mode[feature] = data[feature]
    for M in Missing:
        feature = M[0]
        m = data[feature].mode()[0]
        print (feature, M[1], m)
        data_Mode[feature] = data[feature].fillna(m)
    print ('data_Mode')
    display(data_Mode.head(20))
#    data.sort_values(
#        by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#        ascending = [True, True, True], 
#        inplace=True
#    )
#    print ()
#    print ('data.PER_NO.equals(data__Mode.PER_NO)')
#    print (data.PER_NO.equals(data_Mode.PER_NO))
#    print ()
#    
    print ()
    print ('Make starting point for data_Imputed')
    data_Imputed = pd.DataFrame()
    for X in Complete:
        feature = X[0]
        data_Imputed[feature] = data[feature]
    for X in Missing:
        feature = X[0]
        data_Imputed[feature] = data_Mode[feature]
    print ('data_Imputed')
    display(data_Imputed.head(20))
    print ()
#    data_Imputed.sort_values(
#        by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#        ascending = [True, True, True], 
#        inplace=True
#    )
#    print ()
#    print ('data.PER_NO.equals(data_Imputed.PER_NO)')
#    print (data.PER_NO.equals(data_Imputed.PER_NO))
#    print ()
    
    print ('Start Loop')
    print ()
    n = 0
    for M in Missing:
        n += 1
        print (M)
        feature = M[0]
        data_Imputed[feature] = data[feature]
#        print ()
#        print ('data[feature].isna().sum()')
#        print (data[feature].isna().sum())
#        print ('data_Imputed[feature].isna().sum()')
#        print (data_Imputed[feature].isna().sum())
#        print ()
        # Split the rows by whether the target feature is known.
        known = data_Imputed[feature].notna()
        X = data_Imputed.loc[known].drop(columns=feature)   # predictors where the target is known
        y = data_Imputed.loc[known, feature]                # known target values
        Z = data_Imputed.loc[~known].drop(columns=feature)  # predictors where the target is missing
        clf = RandomForestClassifier(max_depth=2, random_state=0)
        clf.fit(X,y)
        z = clf.predict(Z)
        print (len(z))
        display(z)
        # Write the predictions back in place, which keeps the original
        # row and column order (pd.concat([Z, W]) reordered both).
        data_Imputed.loc[~known, feature] = z
#        display(data_Imputed.head(60))
        print (data_Imputed.shape)
        print ()
#        data_Imputed.sort_values(
#            by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#            ascending = [True, True, True], 
#            inplace=True
#        )
#        print ()
#        print ('data.PER_NO.equals(data_Imputed.PER_NO)')
#        print (data.PER_NO.equals(data_Imputed.PER_NO))
#        print ()
               
        Check_Feature(data, data_Imputed, feature)
#        if n==10:
#            return data_Imputed
    
    
    
    
    print ()
    return data_Imputed

In [7]:
def Impute_Full(data):
    print ('Impute()')
    data.replace({'Unknown': np.nan}, inplace=True)
    for feature in data:
        print (feature, len(pd.unique(data[feature])))
    print ()
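    # NOTE: MissForest is never imported in this notebook, so this call raises
    # NameError as written; import it from the installed MissForest package first.
    mf = MissForest()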
    data = mf.fit_transform(data)
    return data

In [8]:
def Check(data, data_Imputed):
    Features = data.columns
    print (Features)
    for feature in Features:
        U = pd.unique(data[feature]).tolist()
        print (U)
        A = []
        for u in U:
            a = len(data[data[feature]==u])
            b = len(data_Imputed[data_Imputed[feature]==u])
            A.append([u, a, b])
        display(A)
        print ()


In [9]:
def Check_Feature(data, data_Imputed, feature):
    U = pd.unique(data[feature]).tolist()
    U = [x for x in U if x == x]
    print (U)
    A = []
    for u in U:
        a = len(data[data[feature]==u])
        b = len(data_Imputed[data_Imputed[feature]==u])
        A.append([u, a, b, b-a])
    a = data[feature].isna().sum()
    b = data_Imputed[feature].isna().sum()
    A.append(['NaN', a, b, 0])
    A = pd.DataFrame(A, columns=['Value', 'Original', 'Imputed', 'Difference'])
    display(A)
    print ()


# Test_Accuracy

In [None]:
def Compare_Imputation_Methods_Part_1():
    print ()
    print ('Compare_Imputation_Methods_Part_1()')
    data = Get_Data()
    data.drop(columns=['CASENUM', 'VEH_NO', 'PER_NO'], inplace=True)
    print (data.shape)

    # Drop all samples with missing data, so we have ground truth
    data.replace({'Unknown':np.nan}, inplace=True)
    data.dropna(inplace=True)
    data.reset_index(inplace=True, drop=True)
    for feature in data:
        data[feature] = pd.to_numeric(data[feature])
    data = data.astype('int64')  # astype returns a new frame; assign it back

    data_Ground_Truth = data.copy(deep=True)
    for feature in data_Ground_Truth:
        data_Ground_Truth[feature] = pd.to_numeric(data_Ground_Truth[feature])
    data_Ground_Truth = data_Ground_Truth.astype('int64')
    print ('data_Ground_Truth.shape')
    print (data_Ground_Truth.shape)
    display(data_Ground_Truth.head())

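    # For each feature (column), randomly pick 15% of the row indices
    # and set those values to be missing.  np.random.choice samples with
    # replacement by default, so slightly fewer than 15% of rows are erased.
    print ('Remove 15% of values from each feature')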
    frac = .15
    N = data.shape[0] * frac # Number of NaN in each feature
    for c in data.columns:
        idx = np.random.choice(a=data.index, size=int(len(data) * frac))
        data.loc[idx, c] = np.nan
    data_NaN = data.copy(deep=True)
    print ('data_NaN.shape')
    print (data_NaN.shape)
    display(data_NaN.head())

    data_IVEware = data.fillna('')
    data_IVEware.to_csv('../../Big_Files/data_IVEware.txt', sep='\t', index=False)
    
    data_Mode = pd.DataFrame()
    for feature in data:
        data_Mode[feature] = data[feature].fillna(data[feature].mode()[0])
    data_Mode = data_Mode.astype('int64')
    print ('data_Mode.shape')
    print (data_Mode.shape)
    display(data_Mode.head())
    
    data_RF = Impute_Round_Robin(data)
    data_RF.sort_index(inplace=True)
    data_RF = data_RF[data.columns]  
    data_RF = data_RF.astype('int64')
    
    print ('data_RF.shape')
    print (data_RF.shape)
    display(data_RF.head())
#    print ()

    return data_Ground_Truth, data_NaN, data_RF, data_Mode

def Compare_Imputation_Methods_Part_2(
    data_Ground_Truth, data_NaN, data_RF, data_Mode, data_IVEware
):
    print ('Compare_Imputation_Methods_Part_2')
    A = []
    for feature in data_NaN:
        N = data_NaN[feature].isna().sum()
#        print (feature, N)
#        print ()
        D = data_Ground_Truth[feature] != data_RF[feature]
        d = D.sum()
        E = data_Ground_Truth[feature] != data_Mode[feature]
        e = E.sum()
        F = data_Ground_Truth[feature] != data_IVEware[feature]
        f = F.sum()
        G = data_RF[feature] != data_Mode[feature]
        g = G.sum()
        H = data_RF[feature] != data_IVEware[feature]
        h = H.sum()
        I = data_Mode[feature] != data_IVEware[feature]
        i = I.sum()
        print (feature, N, d, e, f, g, h, i)
        print (
            feature, 
            data_Ground_Truth.dtypes[feature],
            data_NaN.dtypes[feature],
            data_RF.dtypes[feature],
            data_Mode.dtypes[feature],
            data_IVEware.dtypes[feature],
        )
        A.append([
            feature, N, 
            d, int(d/N*100), 
            e, int(e/N*100), 
            f, int(f/N*100),
            g, int(g/N*100),
            h, int(h/N*100),
            i, int(i/N*100),
        ])
#        print (D[:10])
        print ()
    print ()
    
    A = sorted(A, key=lambda x:x[3])
    B = pd.DataFrame(
        A, 
        columns=[
            'Feature', 'nNaN', 
            'nRF Incorrect', 'pRF Incorrect', 
            'nMode Incorrect', 'pMode Incorrect', 
            'nIVEware Incorrect', 'pIVEware Incorrect',
            'RF and Mode Different', 'RF v/s Mode %',
            'RF and IVEware Different', 'RF v/s IVEware %',
            'Mode and IVEware Different', 'Mode v/s IVEware %'
        ]
    )
    display(B)
    a = sum([x[1] for x in A])
    b = sum([x[2] for x in A])
    c = sum([x[4] for x in A])
    d = sum([x[6] for x in A])
    e = round(b/a*100,2)
    f = round(c/a*100,2)
    g = round(d/a*100,2)
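    # Number of features where the first method has more errors than the second
    s = sum([x[2] > x[4] for x in A])  # RF worse than Mode
    t = sum([x[2] > x[6] for x in A])  # RF worse than IVEware
    u = sum([x[4] > x[6] for x in A])  # Mode worse than IVEware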

    RF_less_Mode = sum([x[2] < x[4] for x in A])
    RF_equal_Mode = sum([x[2] == x[4] for x in A])
    RF_greater_Mode = sum([x[2] > x[4] for x in A])

    RF_less_IVEware = sum([x[2] < x[6] for x in A])
    RF_equal_IVEware = sum([x[2] == x[6] for x in A])
    RF_greater_IVEware = sum([x[2] > x[6] for x in A])

    Mode_less_IVEware = sum([x[4] < x[6] for x in A])
    Mode_equal_IVEware = sum([x[4] == x[6] for x in A])
    Mode_greater_IVEware = sum([x[4] > x[6] for x in A])

    print ()
    print ('Error RF = ', e)
    print ('Error Mode = ', f)
    print ('Error IVEware = ', g)
    print ('nRF > nMode: ', s)
    print ('nRF > nIVEware: ', t)
    print ('nMode > nIVEware: ', u)
    print ('Compare RF to Mode: ', RF_less_Mode, RF_equal_Mode, RF_greater_Mode)
    print ('Compare RF to IVEware: ', RF_less_IVEware, RF_equal_IVEware, RF_greater_IVEware)
    print ('Compare Mode to IVEware: ', Mode_less_IVEware, Mode_equal_IVEware, Mode_greater_IVEware)
    print ()
    print ('Number of NaN in data_NaN: ', data_NaN.isna().sum().sum())
    print ('RF Different from Mode: ', sum([x[8] for x in A]))
    print ('RF Different from IVEware: ', sum([x[10] for x in A]))
    print ('Mode Different from IVEware: ', sum([x[12] for x in A]))
        
    display(Audio(sound_file, autoplay=True))
    
    
        

In [None]:
data_Ground_Truth, data_NaN, data_RF, data_Mode = Compare_Imputation_Methods_Part_1()

## Now do IVEware Imputation:  IVE_12_22_22.xml

In [None]:
data_IVEware = pd.read_csv('../../Big_Files/data_IVEware.csv')
data_IVEware.drop(columns='Unnamed: 0', inplace=True)

print ('data_Ground_Truth', data_Ground_Truth.shape)
display(data_Ground_Truth.head(10))
print ('data_NaN', data_NaN.shape)
display(data_NaN.head(10))
print ('data_RF', data_RF.shape)
display(data_RF.head(10))
print ('data_Mode', data_Mode.shape)
display(data_Mode.head(10))
print ('data_IVEware', data_IVEware.shape)
display(data_IVEware.head(10))


In [None]:
Compare_Imputation_Methods_Part_2(
    data_Ground_Truth, data_NaN, data_RF, data_Mode, data_IVEware
)

In [10]:
def Main():
    data = Get_Data()
    
#    data_Imputed = Impute_Full(data)
    data_Imputed = Impute_Round_Robin(data)
    data_Imputed.to_csv('../../Big_Files/CRSS_Imputed_All_05_19_23.csv', index=False)
#    display(data_Imputed.head(50))
    
    Check(data, data_Imputed)
    display(Audio(sound_file, autoplay=True))
    return 0
Main()

Get_Data
data.shape =  (747342, 107)
Drop Imputed Columns
HOUR_IM
LGTCON_IM
RELJCT2_IM
WEATHR_IM
WKDY_IM
NO_INJ_IM
MANCOL_IM
EVENT1_IM
ALCHL_IM
MAXSEV_IM
RELJCT1_IM
BDYTYP_IM
IMPACT1_IM
MXVSEV_IM
NUMINJ_IM
PCRASH1_IM
V_ALCH_IM
VEVENT_IM
AGE_IM
EJECT_IM
INJSEV_IM
PERALCH_IM
SEAT_IM
SEX_IM
VEH_AGE_IM
data.shape =  (747342, 82)

Impute()


Unnamed: 0,CASENUM,HOUR,INT_HWY,LGT_COND,MONTH,PEDS,PERMVIT,REL_ROAD,RELJCT2,SCH_BUS,URBANICITY,VE_TOTAL,WEATHER,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,NUM_INJ,PSU,PJ,MAN_COLL,HARM_EV,TYP_INT,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,ACC_TYPE,BODY_TYP,BUS_USE,CARGO_BT,DR_PRES,EMER_USE,FIRE_EXP,HAZ_CNO,HAZ_INV,HAZ_PLAC,HAZ_REL,HIT_RUN,IMPACT1,J_KNIFE,M_HARM,MAK_MOD,MAKE,MAX_VSEV,MODEL,NUM_INJV,NUMOCCS,P_CRASH1,P_CRASH2,PCRASH4,PCRASH5,ROLLOVER,SPEC_USE,SPEEDREL,TOW_VEH,TOWED,VALIGN,VEH_ALCH,VPROFILE,VSPD_LIM,VSURCOND,VTCONT_F,VTRAFCON,VTRAFWAY,AGE,AIR_BAG,ALC_RES,ALC_STATUS,EJECTION,HOSPITAL,INJ_SEV,PER_NO,PER_TYP,REST_MIS,REST_USE,SEAT_POS,SEX,VEH_AGE
0,201600014311,2,0,3,0,0,1,1,1,0,2,2,1,1,0,2,0,0,0,4,4,4,1,1.0,2016,4,2,0,0,1,4,1.0,1,0,1,1,1,1,0,0,1,0,3,1,1,2,0.0,2.0,1,3.0,3.0,5,5,1.0,2.0,1,1,1,0,2,1,1.0,1,7,1,,,,3.0,1.0,,,1,0.0,3.0,1,2,1,1.0,3,1.0,3.0
1,201600014311,2,0,3,0,0,1,1,1,0,2,2,1,1,0,2,0,0,0,4,4,4,1,1.0,2016,4,2,0,0,2,3,2.0,1,0,1,1,1,1,0,0,1,0,0,1,1,4,4.0,2.0,4,3.0,3.0,1,4,1.0,3.0,1,1,1,0,2,1,1.0,1,7,1,,,,2.0,1.0,,,1,0.0,3.0,1,2,1,1.0,3,1.0,0.0
2,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,1,1,5.0,1,0,1,1,1,1,0,0,1,0,1,1,1,3,8.0,1.0,4,1.0,3.0,2,1,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,,,1,0.0,1.0,1,2,1,1.0,3,1.0,4.0
3,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,,,1,0.0,1.0,1,2,1,1.0,3,0.0,2.0
4,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,0.0,1.0,1,0.0,1.0,2,1,1,1.0,0,1.0,2.0
5,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,0.0,1.0,1,0.0,1.0,3,1,1,1.0,4,1.0,2.0
6,201600014316,0,0,1,0,0,0,0,1,0,2,1,1,1,0,1,0,0,1,4,4,1,3,1.0,2016,4,2,3,0,1,0,1.0,1,0,1,1,1,1,0,0,1,0,2,1,0,0,0.0,0.0,1,1.0,3.0,1,0,0.0,1.0,0,1,1,0,0,1,1.0,1,1,1,,,2.0,2.0,1.0,,,0,1.0,0.0,1,2,1,0.0,3,0.0,3.0
7,201600014335,5,0,1,0,0,1,1,0,0,2,2,0,1,0,2,0,0,0,4,4,0,1,,2016,4,9,0,0,1,0,,1,0,1,1,1,1,0,0,1,1,1,1,1,4,,,4,,,1,2,,,1,1,1,0,2,1,,2,5,3,,,,,,0.0,1.0,1,0.0,,1,2,1,,3,,
8,201600014335,5,0,1,0,0,1,1,0,0,2,2,0,1,0,2,0,0,0,4,4,0,1,,2016,4,9,0,0,2,0,1.0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,6.0,2.0,0,3.0,3.0,1,0,1.0,3.0,1,1,1,0,2,1,1.0,2,5,3,,,,2.0,1.0,0.0,1.0,1,0.0,3.0,1,2,1,1.0,3,0.0,2.0
9,201600014586,1,0,3,0,1,0,1,2,0,2,1,1,1,0,1,0,1,1,0,2,1,2,1.0,2016,4,2,2,0,1,4,1.0,1,0,1,1,1,1,0,0,1,0,1,1,2,2,4.0,2.0,2,3.0,3.0,2,5,1.0,3.0,1,1,0,0,2,1,1.0,1,2,1,1.0,1.0,2.0,3.0,,0.0,1.0,1,0.0,3.0,1,2,1,1.0,3,1.0,4.0




Complete[]


[['CASENUM', 0],
 ['INT_HWY', 0],
 ['MONTH', 0],
 ['PEDS', 0],
 ['PERMVIT', 0],
 ['REL_ROAD', 0],
 ['SCH_BUS', 0],
 ['URBANICITY', 0],
 ['VE_TOTAL', 0],
 ['DAY_WEEK', 0],
 ['WRK_ZONE', 0],
 ['VE_FORMS', 0],
 ['PVH_INVL', 0],
 ['PERNOTMVIT', 0],
 ['PSU', 0],
 ['PJ', 0],
 ['YEAR', 0],
 ['REGION', 0],
 ['ALCOHOL', 0],
 ['MAX_SEV', 0],
 ['RELJCT1', 0],
 ['VEH_NO', 0],
 ['FIRE_EXP', 0],
 ['HAZ_INV', 0],
 ['J_KNIFE', 0],
 ['MAK_MOD', 0],
 ['MODEL', 0],
 ['ROLLOVER', 0],
 ['PER_NO', 0],
 ['PER_TYP', 0],
 ['REST_MIS', 0]]


Missing[]


[['DR_PRES', 21],
 ['HIT_RUN', 29],
 ['HAZ_PLAC', 33],
 ['HAZ_REL', 53],
 ['HAZ_CNO', 122],
 ['HARM_EV', 259],
 ['M_HARM', 285],
 ['TOW_VEH', 1117],
 ['HOUR', 2198],
 ['MAN_COLL', 2909],
 ['PCRASH5', 3239],
 ['LGT_COND', 4215],
 ['ACC_TYPE', 4298],
 ['P_CRASH2', 5126],
 ['EMER_USE', 5695],
 ['NUM_INJ', 6774],
 ['BUS_USE', 8607],
 ['SEAT_POS', 9511],
 ['P_CRASH1', 11339],
 ['SPEC_USE', 13059],
 ['SPEEDREL', 13910],
 ['HOSPITAL', 14671],
 ['CARGO_BT', 14947],
 ['IMPACT1', 15206],
 ['MAKE', 16087],
 ['VEH_AGE', 23853],
 ['MAX_VSEV', 24007],
 ['NUM_INJV', 24007],
 ['VSURCOND', 26192],
 ['INJ_SEV', 26317],
 ['BODY_TYP', 28221],
 ['NUMOCCS', 28960],
 ['SEX', 31438],
 ['WEATHER', 31892],
 ['PCRASH4', 33529],
 ['TOWED', 40934],
 ['RELJCT2', 43900],
 ['EJECTION', 44473],
 ['VALIGN', 46788],
 ['AGE', 47949],
 ['VTRAFCON', 59388],
 ['VTCONT_F', 59582],
 ['REST_USE', 68557],
 ['AIR_BAG', 70506],
 ['TYP_INT', 79585],
 ['VSPD_LIM', 101121],
 ['VPROFILE', 102951],
 ['ALC_RES', 129056],
 ['ALC_STATUS'


Make data_Mode

DR_PRES 21 1
HIT_RUN 29 0
HAZ_PLAC 33 0
HAZ_REL 53 1
HAZ_CNO 122 1
HARM_EV 259 1
M_HARM 285 1
TOW_VEH 1117 0
HOUR 2198 3
MAN_COLL 2909 3
PCRASH5 3239 3
LGT_COND 4215 3
ACC_TYPE 4298 2
P_CRASH2 5126 0
EMER_USE 5695 1
NUM_INJ 6774 0
BUS_USE 8607 1
SEAT_POS 9511 3
P_CRASH1 11339 1
SPEC_USE 13059 1
SPEEDREL 13910 1
HOSPITAL 14671 0
CARGO_BT 14947 0
IMPACT1 15206 1
MAKE 16087 8
VEH_AGE 23853 0.0
MAX_VSEV 24007 2
NUM_INJV 24007 3
VSURCOND 26192 1
INJ_SEV 26317 3
BODY_TYP 28221 1
NUMOCCS 28960 3
SEX 31438 1
WEATHER 31892 1
PCRASH4 33529 1
TOWED 40934 2
RELJCT2 43900 1
EJECTION 44473 1
VALIGN 46788 1
AGE 47949 2
VTRAFCON 59388 1
VTCONT_F 59582 1
REST_USE 68557 1
AIR_BAG 70506 1
TYP_INT 79585 1
VSPD_LIM 101121 2
VPROFILE 102951 1
ALC_RES 129056 0
ALC_STATUS 129056 1
VTRAFWAY 129865 0
VEH_ALCH 150648 1
data_Mode


Unnamed: 0,CASENUM,INT_HWY,MONTH,PEDS,PERMVIT,REL_ROAD,SCH_BUS,URBANICITY,VE_TOTAL,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,PSU,PJ,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,FIRE_EXP,HAZ_INV,J_KNIFE,MAK_MOD,MODEL,ROLLOVER,PER_NO,PER_TYP,REST_MIS,DR_PRES,HIT_RUN,HAZ_PLAC,HAZ_REL,HAZ_CNO,HARM_EV,M_HARM,TOW_VEH,HOUR,MAN_COLL,PCRASH5,LGT_COND,ACC_TYPE,P_CRASH2,EMER_USE,NUM_INJ,BUS_USE,SEAT_POS,P_CRASH1,SPEC_USE,SPEEDREL,HOSPITAL,CARGO_BT,IMPACT1,MAKE,VEH_AGE,MAX_VSEV,NUM_INJV,VSURCOND,INJ_SEV,BODY_TYP,NUMOCCS,SEX,WEATHER,PCRASH4,TOWED,RELJCT2,EJECTION,VALIGN,AGE,VTRAFCON,VTCONT_F,REST_USE,AIR_BAG,TYP_INT,VSPD_LIM,VPROFILE,ALC_RES,ALC_STATUS,VTRAFWAY,VEH_ALCH
0,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,1,1,0,1,2,1,1,1,2,1,1,0,0,1,1,1,1,0,2,4,2,3,4,5,1,0,1,3,5,1,1,0,0,3,0,3.0,2,3,1,3,1,3,1,1,1,2,1,1,1,3,1,1,1,1,1,7,1,0,1,0,1
1,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,2,1,0,1,4,4,1,1,2,1,1,0,0,1,1,1,1,0,2,4,3,3,3,4,1,0,1,3,1,1,1,0,0,0,4,0.0,2,3,1,3,2,3,1,1,1,2,1,1,1,2,1,1,1,1,1,7,1,0,1,0,1
2,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,1,1,0,1,3,4,1,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,1,1,1,4,1,3,2,1,1,0,0,1,8,4.0,1,1,1,1,5,3,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
3,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,3,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,0,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
4,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,2,1,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,0,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
5,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,3,1,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,4,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
6,201600014316,0,0,0,0,0,0,2,1,1,0,1,0,0,4,4,2016,4,2,3,0,1,1,0,1,0,1,0,1,2,1,1,0,0,1,1,3,0,0,0,1,1,1,0,0,1,1,1,3,1,1,1,1,0,2,0,3.0,0,1,1,0,1,3,0,1,0,0,1,0,1,2,1,1,0,1,1,1,1,0,1,2,1
7,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,1,1,0,1,4,4,1,1,2,1,1,1,0,1,1,1,1,0,5,0,3,1,0,2,1,0,1,3,1,1,1,0,0,1,8,0.0,2,3,3,3,1,3,1,0,1,2,0,1,1,2,1,1,1,1,1,5,2,0,1,0,1
8,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,2,1,0,1,0,0,1,1,2,1,1,0,0,1,1,1,1,0,5,0,3,1,0,0,1,0,1,3,1,1,1,0,0,1,6,2.0,2,3,3,3,1,3,0,0,1,2,0,1,1,2,1,1,1,1,1,5,2,0,1,0,1
9,201600014586,0,0,1,0,1,0,2,1,1,0,1,0,1,0,2,2016,4,2,2,0,1,1,0,1,2,2,1,1,2,1,1,0,0,1,1,2,2,0,1,1,3,3,4,5,1,1,1,3,2,1,0,0,0,1,4,4.0,2,3,1,3,1,3,1,1,1,2,2,1,1,3,1,1,1,1,1,2,1,0,1,2,1



Make starting point for data_Imputed
data_Imputed


Unnamed: 0,CASENUM,INT_HWY,MONTH,PEDS,PERMVIT,REL_ROAD,SCH_BUS,URBANICITY,VE_TOTAL,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,PSU,PJ,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,FIRE_EXP,HAZ_INV,J_KNIFE,MAK_MOD,MODEL,ROLLOVER,PER_NO,PER_TYP,REST_MIS,DR_PRES,HIT_RUN,HAZ_PLAC,HAZ_REL,HAZ_CNO,HARM_EV,M_HARM,TOW_VEH,HOUR,MAN_COLL,PCRASH5,LGT_COND,ACC_TYPE,P_CRASH2,EMER_USE,NUM_INJ,BUS_USE,SEAT_POS,P_CRASH1,SPEC_USE,SPEEDREL,HOSPITAL,CARGO_BT,IMPACT1,MAKE,VEH_AGE,MAX_VSEV,NUM_INJV,VSURCOND,INJ_SEV,BODY_TYP,NUMOCCS,SEX,WEATHER,PCRASH4,TOWED,RELJCT2,EJECTION,VALIGN,AGE,VTRAFCON,VTCONT_F,REST_USE,AIR_BAG,TYP_INT,VSPD_LIM,VPROFILE,ALC_RES,ALC_STATUS,VTRAFWAY,VEH_ALCH
0,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,1,1,0,1,2,1,1,1,2,1,1,0,0,1,1,1,1,0,2,4,2,3,4,5,1,0,1,3,5,1,1,0,0,3,0,3.0,2,3,1,3,1,3,1,1,1,2,1,1,1,3,1,1,1,1,1,7,1,0,1,0,1
1,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,2,1,0,1,4,4,1,1,2,1,1,0,0,1,1,1,1,0,2,4,3,3,3,4,1,0,1,3,1,1,1,0,0,0,4,0.0,2,3,1,3,2,3,1,1,1,2,1,1,1,2,1,1,1,1,1,7,1,0,1,0,1
2,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,1,1,0,1,3,4,1,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,1,1,1,4,1,3,2,1,1,0,0,1,8,4.0,1,1,1,1,5,3,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
3,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,3,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,0,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
4,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,2,1,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,0,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
5,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,3,1,1,1,0,0,1,1,1,1,0,4,0,3,1,0,0,1,4,1,4,1,1,1,0,0,1,6,2.0,1,0,1,1,0,5,1,1,1,0,0,1,1,1,1,1,1,0,1,7,2,0,1,0,1
6,201600014316,0,0,0,0,0,0,2,1,1,0,1,0,0,4,4,2016,4,2,3,0,1,1,0,1,0,1,0,1,2,1,1,0,0,1,1,3,0,0,0,1,1,1,0,0,1,1,1,3,1,1,1,1,0,2,0,3.0,0,1,1,0,1,3,0,1,0,0,1,0,1,2,1,1,0,1,1,1,1,0,1,2,1
7,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,1,1,0,1,4,4,1,1,2,1,1,1,0,1,1,1,1,0,5,0,3,1,0,2,1,0,1,3,1,1,1,0,0,1,8,0.0,2,3,3,3,1,3,1,0,1,2,0,1,1,2,1,1,1,1,1,5,2,0,1,0,1
8,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,2,1,0,1,0,0,1,1,2,1,1,0,0,1,1,1,1,0,5,0,3,1,0,0,1,0,1,3,1,1,1,0,0,1,6,2.0,2,3,3,3,1,3,0,0,1,2,0,1,1,2,1,1,1,1,1,5,2,0,1,0,1
9,201600014586,0,0,1,0,1,0,2,1,1,0,1,0,1,0,2,2016,4,2,2,0,1,1,0,1,2,2,1,1,2,1,1,0,0,1,1,2,2,0,1,1,3,3,4,5,1,1,1,3,2,1,0,0,0,1,4,4.0,2,3,1,3,1,3,1,1,1,2,2,1,1,3,1,1,1,1,1,2,1,0,1,2,1



Start Loop

['DR_PRES', 21]
21


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,747155,747176,21
1,0.0,166,166,0
2,,21,0,0



['HIT_RUN', 29]
29


array(['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,707761,707790,29
1,1.0,39552,39552,0
2,,29,0,0



['HAZ_PLAC', 33]
33


array(['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '1', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,747045,747077,32
1,1.0,264,265,1
2,,33,0,0



['HAZ_REL', 53]
53


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,747045,747098,53
1,2.0,192,192,0
2,0.0,52,52,0
3,,53,0,0



['HAZ_CNO', 122]
122


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,747045,747167,122
1,2.0,165,165,0
2,0.0,10,10,0
3,,122,0,0



['HARM_EV', 259]
259


array(['3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1', '3', '3',
       '3', '3', '1', '1', '3', '3', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '1', '3', '3',
       '3', '3', '3', '3', '3', '1', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '1', '3', '3', '1', '1', '1', '3', '3', '3',
       '3', '3', '3', '1', '3', '1', '3', '3', '3', '1', '1', '3', '3',
       '1', '3', '1', '1', '3', '1', '3', '1', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1',
       '1', '3', '3', '1', '1', '3', '3', '1', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1',
       '1', '1', '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1',
       '3', '3', '3', '3', '3', '3', '3', '1', '3', '1', '3', '3', '1',
       '1', '1', '1', '3', '1', '3', '3', '3', '3', '1', '1', '1

(747342, 82)

['1', '3', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,609079,609163,84
1,3.0,61659,61834,175
2,2.0,62961,62961,0
3,0.0,13384,13384,0
4,,259,0,0



['M_HARM', 285]
285


array(['0', '0', '0', '0', '0', '0', '1', '1', '0', '0', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0',
       '1', '1', '1', '0', '0', '0', '0', '0', '0', '1', '0', '1', '0',
       '0', '0', '1', '1', '0', '0', '1', '0', '1', '1', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '1', '1', '1', '1', '1', '1', '0', '0', '1', '1', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '1',
       '1', '1', '1', '1', '1', '0', '0', '0', '0', '0', '0', '0', '0',
       '1', '1', '1', '0', '0', '0', '0', '0', '0', '0', '1', '0', '1',
       '0', '0', '1', '1', '1', '1', '0', '1', '0', '0', '0', '0', '1',
       '1', '1', '1', '1', '1', '0', '1', '1', '1', '1', '0', '0

(747342, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,605660,605774,114
1,0.0,74748,74919,171
2,2.0,66649,66649,0
3,,285,0,0



['TOW_VEH', 1117]
1117


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,727133,728250,1117
1,1.0,19092,19092,0
2,,1117,0,0



['HOUR', 2198]
2198


array(['3', '2', '2', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['2', '4', '0', '5', '1', '3', '6']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,182534,182983,449
1,4.0,89321,89321,0
2,0.0,26925,26925,0
3,5.0,72502,72699,197
4,1.0,126939,126939,0
5,3.0,196937,198190,1253
6,6.0,49986,50285,299
7,,2198,0,0



['MAN_COLL', 2909]
2909


array(['1', '1', '1', ..., '2', '3', '3'], dtype=object)

(747342, 82)

['4', '0', '1', '3', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,4.0,89711,89711,0
1,0.0,29362,29362,0
2,1.0,138141,138241,100
3,3.0,270087,271877,1790
4,2.0,217132,218151,1019
5,,2909,0,0



['PCRASH5', 3239]
3239


array(['1', '1', '3', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['2', '3', '1', '4', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,79635,79635,0
1,3.0,572454,575551,3097
2,1.0,80227,80369,142
3,4.0,10320,10320,0
4,0.0,1467,1467,0
5,,3239,0,0



['LGT_COND', 4215]
4215


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['3', '1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,530999,535214,4215
1,1.0,132968,132968,0
2,2.0,18095,18095,0
3,0.0,61065,61065,0
4,,4215,0,0



['ACC_TYPE', 4298]
4298


array(['0', '0', '0', ..., '0', '0', '4'], dtype=object)

(747342, 82)

['4', '3', '1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,4.0,161582,161840,258
1,3.0,165651,167055,1404
2,1.0,104379,104809,430
3,0.0,144619,146642,2023
4,2.0,166813,166996,183
5,,4298,0,0



['P_CRASH2', 5126]
5126


array(['0', '0', '0', ..., '1', '5', '5'], dtype=object)

(747342, 82)

['5', '4', '1', '0', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,5.0,148145,149718,1573
1,4.0,91965,91965,0
2,1.0,123112,123591,479
3,0.0,148665,150055,1390
4,2.0,85938,85938,0
5,3.0,144391,146075,1684
6,,5126,0,0



['EMER_USE', 5695]
5695


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,739284,744979,5695
1,0.0,1453,1453,0
2,2.0,910,910,0
3,,5695,0,0



['NUM_INJ', 6774]
6774


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['0', '4', '1', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,344652,351408,6756
1,4.0,34194,34194,0
2,1.0,231614,231632,18
3,2.0,91425,91425,0
4,3.0,38683,38683,0
5,,6774,0,0



['BUS_USE', 8607]
8607


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,735432,744039,8607
1,2.0,3167,3167,0
2,0.0,136,136,0
3,,8607,0,0



['SEAT_POS', 9511]
9511


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['3', '0', '4', '2', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,551741,559299,7558
1,0.0,5316,5316,0
2,4.0,48163,48258,95
3,2.0,30483,30483,0
4,1.0,102128,103986,1858
5,,9511,0,0



['P_CRASH1', 11339]
11339


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['5', '1', '2', '4', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,5.0,75895,75895,0
1,1.0,372952,384282,11330
2,2.0,76409,76409,0
3,4.0,112349,112358,9
4,0.0,50269,50269,0
5,3.0,48129,48129,0
6,,11339,0,0



['SPEC_USE', 13059]
13059


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,725434,738493,13059
1,2.0,5160,5160,0
2,0.0,3689,3689,0
3,,13059,0,0



['SPEEDREL', 13910]
13910


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,688441,702351,13910
1,0.0,44991,44991,0
2,,13910,0,0



['HOSPITAL', 14671]
14671


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,619627,634018,14391
1,1.0,113044,113324,280
2,,14671,0,0



['CARGO_BT', 14947]
14947


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,706341,721288,14947
1,1.0,26054,26054,0
2,,14947,0,0



['IMPACT1', 15206]
15206


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['3', '0', '1', '2', '4', '5']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,106668,106668,0
1,0.0,70148,70148,0
2,1.0,310120,325005,14885
3,2.0,54380,54380,0
4,4.0,161451,161772,321
5,5.0,29369,29369,0
6,,15206,0,0



['MAKE', 16087]
16087


array(['8', '8', '8', ..., '0', '8', '0'], dtype=object)

(747342, 82)

['0', '4', '8', '6', '7', '2', '1', '3', '5']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,144992,146025,1033
1,4.0,86201,86201,0
2,8.0,149518,164572,15054
3,6.0,96905,96905,0
4,7.0,30669,30669,0
5,2.0,92031,92031,0
6,1.0,104707,104707,0
7,3.0,15057,15057,0
8,5.0,11175,11175,0
9,,16087,0,0



['VEH_AGE', 23853]
23853


array(['0.0', '0.0', '0.0', ..., '0.0', '0.0', '0.0'], dtype=object)

(747342, 82)

['3.0', '0.0', '4.0', '2.0', '1.0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,109090,109090,0
1,0.0,216288,240120,23832
2,4.0,52262,52262,0
3,2.0,153534,153555,21
4,1.0,192315,192315,0
5,,23853,0,0



['MAX_VSEV', 24007]
24007


array(['2', '2', '2', ..., '2', '2', '2'], dtype=object)

(747342, 82)

['2', '1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,472536,496543,24007
1,1.0,125197,125197,0
2,0.0,125602,125602,0
3,,24007,0,0



['NUM_INJV', 24007]
24007


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['3', '1', '0', '2', '13']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,472554,496561,24007
1,1.0,175863,175863,0
2,0.0,74831,74831,0
3,2.0,73,73,0
4,13.0,14,14,0
5,,24007,0,0



['VSURCOND', 26192]
26192


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '3', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,590442,616634,26192
1,3.0,32753,32753,0
2,2.0,95954,95954,0
3,0.0,2001,2001,0
4,,26192,0,0



['INJ_SEV', 26317]
26317


array(['3', '3', '3', ..., '0', '3', '0'], dtype=object)

(747342, 82)

['3', '1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,514577,540778,26201
1,1.0,104953,104959,6
2,0.0,101495,101605,110
3,,26317,0,0



['BODY_TYP', 28221]
28221


array(['5', '5', '5', ..., '1', '5', '5'], dtype=object)

(747342, 82)

['1', '2', '5', '0', '3', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,259168,263133,3965
1,2.0,70533,70533,0
2,5.0,165193,189449,24256
3,0.0,55048,55048,0
4,3.0,115649,115649,0
5,4.0,53530,53530,0
6,,28221,0,0



['NUMOCCS', 28960]
28960


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(747342, 82)

['3', '5', '6', '1', '2', '0', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,393857,421716,27859
1,5.0,74090,74090,0
2,6.0,61642,61642,0
3,1.0,178102,179203,1101
4,2.0,8357,8357,0
5,0.0,1789,1789,0
6,4.0,545,545,0
7,,28960,0,0



['SEX', 31438]
31438


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,390459,418272,27813
1,0.0,325445,329070,3625
2,,31438,0,0



['WEATHER', 31892]
31892


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0', '2', '3', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,528256,560148,31892
1,0.0,3366,3366,0
2,2.0,65026,65026,0
3,3.0,106753,106753,0
4,4.0,12049,12049,0
5,,31892,0,0



['PCRASH4', 33529]
33529


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,687339,720868,33529
1,0.0,26474,26474,0
2,,33529,0,0



['TOWED', 40934]
40934


array(['2', '2', '2', ..., '2', '2', '2'], dtype=object)

(747342, 82)

['2', '0', '3', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,412065,449742,37677
1,0.0,231259,234516,3257
2,3.0,36255,36255,0
3,1.0,26829,26829,0
4,,40934,0,0



['RELJCT2', 43900]
43900


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,258855,298755,39900
1,0.0,203458,205945,2487
2,2.0,67599,67599,0
3,3.0,173530,175043,1513
4,,43900,0,0



['EJECTION', 44473]
44473


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,680014,724487,44473
1,0.0,22855,22855,0
2,,44473,0,0



['VALIGN', 46788]
46788


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,623585,670373,46788
1,2.0,18828,18828,0
2,0.0,58141,58141,0
3,,46788,0,0



['AGE', 47949]
47949


array(['2', '2', '2', ..., '2', '2', '2'], dtype=object)

(747342, 82)

['3', '2', '1', '0', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,134761,134761,0
1,2.0,429588,477494,47906
2,1.0,51078,51078,0
3,0.0,57382,57425,43
4,4.0,26584,26584,0
5,,47949,0,0



['VTRAFCON', 59388]
59388


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '3', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,424046,483428,59382
1,2.0,175546,175552,6
2,3.0,75283,75283,0
3,0.0,13079,13079,0
4,,59388,0,0



['VTCONT_F', 59582]
59582


array(['1', '1', '1', ..., '3', '3', '1'], dtype=object)

(747342, 82)

['1', '3', '4', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,424046,483493,59447
1,3.0,262469,262604,135
2,4.0,482,482,0
3,0.0,719,719,0
4,2.0,44,44,0
5,,59582,0,0



['REST_USE', 68557]
68557


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,585540,654070,68530
1,0.0,51513,51540,27
2,2.0,41732,41732,0
3,,68557,0,0



['AIR_BAG', 70506]
70506


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,540166,610672,70506
1,0.0,136670,136670,0
2,,70506,0,0



['TYP_INT', 79585]
79585


array(['1', '1', '1', ..., '1', '2', '2'], dtype=object)

(747342, 82)

['1', '2', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,380911,411038,30127
1,2.0,200684,250142,49458
2,0.0,80715,80715,0
3,3.0,5447,5447,0
4,,79585,0,0



['VSPD_LIM', 101121]
101121


array(['2', '2', '7', ..., '2', '2', '2'], dtype=object)

(747342, 82)

['7', '1', '5', '2', '4', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,7.0,131769,164741,32972
1,1.0,95805,99430,3625
2,5.0,123643,123643,0
3,2.0,139112,203636,64524
4,4.0,72882,72882,0
5,0.0,70352,70352,0
6,3.0,12658,12658,0
7,,101121,0,0



['VPROFILE', 102951]
102951


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,535360,638311,102951
1,2.0,70806,70806,0
2,0.0,38225,38225,0
3,,102951,0,0



['ALC_RES', 129056]
129056


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,615155,744211,129056
1,1.0,3131,3131,0
2,,129056,0,0



['ALC_STATUS', 129056]
129056


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,605020,734076,129056
1,0.0,13266,13266,0
2,,129056,0,0



['VTRAFWAY', 129865]
129865


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(747342, 82)

['2', '0', '4', '3', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,35377,35377,0
1,0.0,279599,404813,125214
2,4.0,52101,52101,0
3,3.0,145165,149816,4651
4,1.0,105235,105235,0
5,,129865,0,0



['VEH_ALCH', 150648]
150648


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(747342, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,577872,728520,150648
1,0.0,18822,18822,0
2,,150648,0,0









[[201600014311, 2, 2],
 [201600014315, 4, 4],
 [201600014316, 1, 1],
 [201600014335, 2, 2],
 [201600014586, 1, 1],
 [201600014593, 5, 5],
 [201600014603, 4, 4],
 [201600014610, 2, 2],
 [201600014622, 3, 3],
 [201600014624, 1, 1],
 [201600015222, 2, 2],
 [201600015227, 4, 4],
 [201600015251, 1, 1],
 [201600015256, 2, 2],
 [201600015257, 1, 1],
 [201600015268, 1, 1],
 [201600015305, 3, 3],
 [201600015805, 3, 3],
 [201600015883, 3, 3],
 [201600015924, 3, 3],
 [201600015934, 2, 2],
 [201600015940, 1, 1],
 [201600015944, 2, 2],
 [201600015948, 1, 1],
 [201600015958, 1, 1],
 [201600015960, 2, 2],
 [201600015967, 1, 1],
 [201600016011, 3, 3],
 [201600016014, 2, 2],
 [201600016016, 4, 4],
 [201600016018, 5, 5],
 [201600016026, 1, 1],
 [201600016027, 2, 2],
 [201600016163, 3, 3],
 [201600016164, 2, 2],
 [201600016166, 2, 2],
 [201600016167, 2, 2],
 [201600016169, 1, 1],
 [201600016171, 3, 3],
 [201600016175, 3, 3],
 [201600016176, 3, 3],
 [201600016184, 1, 1],
 [201600016195, 2, 2],
 [201600016


['2', '4', '0', '5', '1', '3', '6', nan]


[['2', 182534, 182983],
 ['4', 89321, 89321],
 ['0', 26925, 26925],
 ['5', 72502, 72699],
 ['1', 126939, 126939],
 ['3', 196937, 198190],
 ['6', 49986, 50285],
 [nan, 0, 0]]


[0, 1]


[[0, 670365, 670365], [1, 76977, 76977]]


['3', '1', '2', nan, '0']


[['3', 530999, 535214],
 ['1', 132968, 132968],
 ['2', 18095, 18095],
 [nan, 0, 0],
 ['0', 61065, 61065]]


[0, 1, 2]


[[0, 229579, 229579], [1, 255069, 255069], [2, 262694, 262694]]


[0, 1, 2]


[[0, 713566, 713566], [1, 32648, 32648], [2, 1128, 1128]]


[1, 2, 0]


[[1, 243234, 243234], [2, 419946, 419946], [0, 84162, 84162]]


[1, 0, 2]


[[1, 663460, 663460], [0, 73134, 73134], [2, 10748, 10748]]


['1', '0', '2', '3', nan]


[['1', 258855, 298755],
 ['0', 203458, 205945],
 ['2', 67599, 67599],
 ['3', 173530, 175043],
 [nan, 0, 0]]


[0, 1]


[[0, 743652, 743652], [1, 3690, 3690]]


[2, 1]


[[2, 166282, 166282], [1, 581060, 581060]]


[2, 1, 4, 3]


[[2, 520942, 520942],
 [1, 115365, 115365],
 [4, 28152, 28152],
 [3, 82883, 82883]]


['1', '0', nan, '2', '3', '4']


[['1', 528256, 560148],
 ['0', 3366, 3366],
 [nan, 0, 0],
 ['2', 65026, 65026],
 ['3', 106753, 106753],
 ['4', 12049, 12049]]


[1, 0]


[[1, 559695, 559695], [0, 187647, 187647]]


[0, 1, 2, 3]


[[0, 732983, 732983], [1, 13339, 13339], [2, 833, 833], [3, 187, 187]]


[2, 1, 4, 3]


[[2, 511854, 511854],
 [1, 129899, 129899],
 [4, 26322, 26322],
 [3, 79267, 79267]]


[0, 1]


[[0, 728884, 728884], [1, 18458, 18458]]


[0, 1]


[[0, 711108, 711108], [1, 36234, 36234]]


['0', '4', '1', nan, '2', '3']


[['0', 344652, 351408],
 ['4', 34194, 34194],
 ['1', 231614, 231632],
 [nan, 0, 0],
 ['2', 91425, 91425],
 ['3', 38683, 38683]]


[4, 0, 2, 1, 3]


[[4, 158192, 158192],
 [0, 144375, 144375],
 [2, 143823, 143823],
 [1, 139355, 139355],
 [3, 161597, 161597]]


[4, 2, 0, 1, 3, 4154, 1051]


[[4, 148375, 148375],
 [2, 150345, 150345],
 [0, 145674, 145674],
 [1, 158672, 158672],
 [3, 142352, 142352],
 [4154, 1895, 1895],
 [1051, 29, 29]]


['4', '0', '1', '3', '2', nan]


[['4', 89711, 89711],
 ['0', 29362, 29362],
 ['1', 138141, 138241],
 ['3', 270087, 271877],
 ['2', 217132, 218151],
 [nan, 0, 0]]


['1', '3', '2', '0', nan]


[['1', 609079, 609163],
 ['3', 61659, 61834],
 ['2', 62961, 62961],
 ['0', 13384, 13384],
 [nan, 0, 0]]


['1', nan, '2', '0', '3']


[['1', 380911, 411038],
 [nan, 0, 0],
 ['2', 200684, 250142],
 ['0', 80715, 80715],
 ['3', 5447, 5447]]


[2016, 2017, 2018, 2019, 2020, 2021]


[[2016, 113405, 113405],
 [2017, 133408, 133408],
 [2018, 115774, 115774],
 [2019, 129980, 129980],
 [2020, 126460, 126460],
 [2021, 128315, 128315]]


[4, 2, 3, 1]


[[4, 122856, 122856],
 [2, 131518, 131518],
 [3, 407287, 407287],
 [1, 85681, 85681]]


[2, 9, 1, 8]


[[2, 536231, 536231], [9, 180103, 180103], [1, 30915, 30915], [8, 93, 93]]


[0, 1, 3, 2, 9, 4, 5, 6]


[[0, 344639, 344639],
 [1, 186329, 186329],
 [3, 77957, 77957],
 [2, 114342, 114342],
 [9, 6774, 6774],
 [4, 14321, 14321],
 [5, 2967, 2967],
 [6, 13, 13]]


[0, 1, 8, 9]


[[0, 555107, 555107], [1, 30269, 30269], [8, 161702, 161702], [9, 264, 264]]


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]


[[1, 406310, 406310],
 [2, 299592, 299592],
 [3, 33218, 33218],
 [4, 6301, 6301],
 [5, 1310, 1310],
 [6, 349, 349],
 [7, 129, 129],
 [8, 66, 66],
 [9, 30, 30],
 [10, 11, 11],
 [11, 8, 8],
 [12, 5, 5],
 [13, 6, 6],
 [14, 2, 2],
 [15, 5, 5]]


['4', '3', '1', '0', '2', nan]


[['4', 161582, 161840],
 ['3', 165651, 167055],
 ['1', 104379, 104809],
 ['0', 144619, 146642],
 ['2', 166813, 166996],
 [nan, 0, 0]]


['1', '2', '5', '0', nan, '3', '4']


[['1', 259168, 263133],
 ['2', 70533, 70533],
 ['5', 165193, 189449],
 ['0', 55048, 55048],
 [nan, 0, 0],
 ['3', 115649, 115649],
 ['4', 53530, 53530]]


['1', '2', nan, '0']


[['1', 735432, 744039], ['2', 3167, 3167], [nan, 0, 0], ['0', 136, 136]]


['0', '1', nan]


[['0', 706341, 721288], ['1', 26054, 26054], [nan, 0, 0]]


['1', '0', nan]


[['1', 747155, 747176], ['0', 166, 166], [nan, 0, 0]]


['1', nan, '0', '2']


[['1', 739284, 744979], [nan, 0, 0], ['0', 1453, 1453], ['2', 910, 910]]


[1, 0]


[[1, 745713, 745713], [0, 1629, 1629]]


['1', nan, '2', '0']


[['1', 747045, 747167], [nan, 0, 0], ['2', 165, 165], ['0', 10, 10]]


[0, 1]


[[0, 747045, 747045], [1, 297, 297]]


['0', '1', nan]


[['0', 747045, 747077], ['1', 264, 265], [nan, 0, 0]]


['1', '2', '0', nan]


[['1', 747045, 747098], ['2', 192, 192], ['0', 52, 52], [nan, 0, 0]]


['0', '1', nan]


[['0', 707761, 707790], ['1', 39552, 39552], [nan, 0, 0]]


['3', '0', '1', '2', '4', '5', nan]


[['3', 106668, 106668],
 ['0', 70148, 70148],
 ['1', 310120, 325005],
 ['2', 54380, 54380],
 ['4', 161451, 161772],
 ['5', 29369, 29369],
 [nan, 0, 0]]


[1, 2, 0]


[[1, 728312, 728312], [2, 18694, 18694], [0, 336, 336]]


['1', '0', '2', nan]


[['1', 605660, 605774], ['0', 74748, 74919], ['2', 66649, 66649], [nan, 0, 0]]


[2, 4, 3, 1, 0, 69063, 69055, 89883, 39403, 42498, 58498, 32498, 59398, 36499, 35046, 32057, 94988, 30498, 55498, 34498]


[[2, 149713, 149713],
 [4, 153133, 153133],
 [3, 151310, 151310],
 [1, 146445, 146445],
 [0, 146684, 146684],
 [69063, 3, 3],
 [69055, 1, 1],
 [89883, 12, 12],
 [39403, 1, 1],
 [42498, 7, 7],
 [58498, 6, 6],
 [32498, 1, 1],
 [59398, 2, 2],
 [36499, 2, 2],
 [35046, 1, 1],
 [32057, 3, 3],
 [94988, 2, 2],
 [30498, 4, 4],
 [55498, 10, 10],
 [34498, 2, 2]]


['0', '4', '8', '6', nan, '7', '2', '1', '3', '5']


[['0', 144992, 146025],
 ['4', 86201, 86201],
 ['8', 149518, 164572],
 ['6', 96905, 96905],
 [nan, 0, 0],
 ['7', 30669, 30669],
 ['2', 92031, 92031],
 ['1', 104707, 104707],
 ['3', 15057, 15057],
 ['5', 11175, 11175]]


['2', '1', '0', nan]


[['2', 472536, 496543],
 ['1', 125197, 125197],
 ['0', 125602, 125602],
 [nan, 0, 0]]


[1, 4, 3, 0, 2, 63]


[[1, 174610, 174610],
 [4, 168811, 168811],
 [3, 148912, 148912],
 [0, 119830, 119830],
 [2, 135176, 135176],
 [63, 3, 3]]


['3', '1', '0', nan, '2', '13']


[['3', 472554, 496561],
 ['1', 175863, 175863],
 ['0', 74831, 74831],
 [nan, 0, 0],
 ['2', 73, 73],
 ['13', 14, 14]]


['3', '5', nan, '6', '1', '2', '0', '4']


[['3', 393857, 421716],
 ['5', 74090, 74090],
 [nan, 0, 0],
 ['6', 61642, 61642],
 ['1', 178102, 179203],
 ['2', 8357, 8357],
 ['0', 1789, 1789],
 ['4', 545, 545]]


['5', '1', '2', '4', nan, '0', '3']


[['5', 75895, 75895],
 ['1', 372952, 384282],
 ['2', 76409, 76409],
 ['4', 112349, 112358],
 [nan, 0, 0],
 ['0', 50269, 50269],
 ['3', 48129, 48129]]


['5', '4', '1', '0', '2', '3', nan]


[['5', 148145, 149718],
 ['4', 91965, 91965],
 ['1', 123112, 123591],
 ['0', 148665, 150055],
 ['2', 85938, 85938],
 ['3', 144391, 146075],
 [nan, 0, 0]]


['1', '0', nan]


[['1', 687339, 720868], ['0', 26474, 26474], [nan, 0, 0]]


['2', '3', '1', nan, '4', '0']


[['2', 79635, 79635],
 ['3', 572454, 575551],
 ['1', 80227, 80369],
 [nan, 0, 0],
 ['4', 10320, 10320],
 ['0', 1467, 1467]]


[1, 0]


[[1, 725160, 725160], [0, 22182, 22182]]


['1', '2', nan, '0']


[['1', 725434, 738493], ['2', 5160, 5160], [nan, 0, 0], ['0', 3689, 3689]]


['1', '0', nan]


[['1', 688441, 702351], ['0', 44991, 44991], [nan, 0, 0]]


['0', '1', nan]


[['0', 727133, 728250], ['1', 19092, 19092], [nan, 0, 0]]


['2', '0', nan, '3', '1']


[['2', 412065, 449742],
 ['0', 231259, 234516],
 [nan, 0, 0],
 ['3', 36255, 36255],
 ['1', 26829, 26829]]


['1', '2', nan, '0']


[['1', 623585, 670373], ['2', 18828, 18828], [nan, 0, 0], ['0', 58141, 58141]]


['1', nan, '0']


[['1', 577872, 728520], [nan, 0, 0], ['0', 18822, 18822]]


['1', '2', '0', nan]


[['1', 535360, 638311], ['2', 70806, 70806], ['0', 38225, 38225], [nan, 0, 0]]


['7', '1', '5', '2', '4', '0', nan, '3']


[['7', 131769, 164741],
 ['1', 95805, 99430],
 ['5', 123643, 123643],
 ['2', 139112, 203636],
 ['4', 72882, 72882],
 ['0', 70352, 70352],
 [nan, 0, 0],
 ['3', 12658, 12658]]


['1', '3', '2', nan, '0']


[['1', 590442, 616634],
 ['3', 32753, 32753],
 ['2', 95954, 95954],
 [nan, 0, 0],
 ['0', 2001, 2001]]


[nan, '1', '3', '4', '0', '2']


[[nan, 0, 0],
 ['1', 424046, 483493],
 ['3', 262469, 262604],
 ['4', 482, 482],
 ['0', 719, 719],
 ['2', 44, 44]]


[nan, '1', '2', '3', '0']


[[nan, 0, 0],
 ['1', 424046, 483428],
 ['2', 175546, 175552],
 ['3', 75283, 75283],
 ['0', 13079, 13079]]


[nan, '2', '0', '4', '3', '1']


[[nan, 0, 0],
 ['2', 35377, 35377],
 ['0', 279599, 404813],
 ['4', 52101, 52101],
 ['3', 145165, 149816],
 ['1', 105235, 105235]]


['3', '2', '1', nan, '0', '4']


[['3', 134761, 134761],
 ['2', 429588, 477494],
 ['1', 51078, 51078],
 [nan, 0, 0],
 ['0', 57382, 57425],
 ['4', 26584, 26584]]


['1', '0', nan]


[['1', 540166, 610672], ['0', 136670, 136670], [nan, 0, 0]]


[nan, '0', '1']


[[nan, 0, 0], ['0', 615155, 744211], ['1', 3131, 3131]]


[nan, '1', '0']


[[nan, 0, 0], ['1', 605020, 734076], ['0', 13266, 13266]]


['1', '0', nan]


[['1', 680014, 724487], ['0', 22855, 22855], [nan, 0, 0]]


['0', '1', nan]


[['0', 619627, 634018], ['1', 113044, 113324], [nan, 0, 0]]


['3', '1', '0', nan]


[['3', 514577, 540778],
 ['1', 104953, 104959],
 ['0', 101495, 101605],
 [nan, 0, 0]]


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75]


[[1, 552043, 552043],
 [2, 130552, 130552],
 [3, 40749, 40749],
 [4, 15947, 15947],
 [5, 5359, 5359],
 [6, 1552, 1552],
 [7, 577, 577],
 [8, 231, 231],
 [9, 98, 98],
 [10, 56, 56],
 [11, 44, 44],
 [12, 27, 27],
 [13, 16, 16],
 [14, 13, 13],
 [15, 7, 7],
 [16, 2, 2],
 [17, 2, 2],
 [18, 2, 2],
 [19, 2, 2],
 [20, 2, 2],
 [21, 2, 2],
 [22, 2, 2],
 [23, 2, 2],
 [24, 2, 2],
 [25, 2, 2],
 [26, 2, 2],
 [27, 1, 1],
 [28, 1, 1],
 [29, 1, 1],
 [30, 1, 1],
 [31, 1, 1],
 [32, 1, 1],
 [33, 1, 1],
 [34, 1, 1],
 [35, 1, 1],
 [36, 1, 1],
 [37, 1, 1],
 [38, 1, 1],
 [39, 1, 1],
 [40, 1, 1],
 [41, 1, 1],
 [42, 1, 1],
 [43, 1, 1],
 [44, 1, 1],
 [45, 1, 1],
 [46, 1, 1],
 [47, 1, 1],
 [48, 1, 1],
 [49, 1, 1],
 [50, 1, 1],
 [51, 1, 1],
 [52, 1, 1],
 [53, 1, 1],
 [54, 1, 1],
 [55, 1, 1],
 [56, 1, 1],
 [57, 1, 1],
 [58, 1, 1],
 [59, 1, 1],
 [60, 1, 1],
 [61, 1, 1],
 [62, 1, 1],
 [63, 1, 1],
 [64, 1, 1],
 [65, 1, 1],
 [66, 1, 1],
 [67, 1, 1],
 [68, 1, 1],
 [69, 1, 1],
 [70, 1, 1],
 [71, 1, 1],
 [72, 1, 1],
 [73,


[2, 1, 0]


[[2, 551843, 551843], [1, 195338, 195338], [0, 161, 161]]


[1, 0]


[[1, 680772, 680772], [0, 66570, 66570]]


['1', '0', nan, '2']


[['1', 585540, 654070], ['0', 51513, 51540], [nan, 0, 0], ['2', 41732, 41732]]


['3', '0', '4', '2', '1', nan]


[['3', 551741, 559299],
 ['0', 5316, 5316],
 ['4', 48163, 48258],
 ['2', 30483, 30483],
 ['1', 102128, 103986],
 [nan, 0, 0]]


['1', '0', nan]


[['1', 390459, 418272], ['0', 325445, 329070], [nan, 0, 0]]


['3.0', '0.0', '4.0', '2.0', nan, '1.0']


[['3.0', 109090, 109090],
 ['0.0', 216288, 240120],
 ['4.0', 52262, 52262],
 ['2.0', 153534, 153555],
 [nan, 0, 0],
 ['1.0', 192315, 192315]]




0