# Methods

- The discretized CRSS dataset is in '../../Big_Files/CRSS_Discretized_All_12_22_22.csv'.
- MissForest is a round-robin imputation method, originally implemented in R, that is generally considered one of the best imputation methods.  It has several Python implementations.
- I tried to use the MissForest package, https://pypi.org/project/MissForest/, to impute missing values, but it raised errors, and tracking down their source led me to write my own round-robin implementation.
- I compare three methods here:
    - Round-Robin Random Forest (my own implementation of Round Robin, using scikit-learn's random forest)
    - Imputation by mode
    - IVEware, using the hyperparameters in the CRSS Imputation report
- To compare, I followed the example for MissForest.
    - I dropped all samples with a missing value, so I would have ground truth.
    - I erased up to 15% of the values in each feature (the rows to erase are drawn independently for each feature).
    - I used each imputation method to impute the missing values, and, for each feature, counted how many did not match the ground truth.
- My round-robin method
    - In data_NaN, change all of the 'Unknown' values to np.nan.
    - In each feature, count the number of missing samples.
    - In another copy, data_Mode, impute by mode in all of the features.
    - Starting with the feature with the fewest (nonzero) missing samples:
        - Copy that feature from data_NaN into data_Mode, so that only that feature has missing values.
        - Split the dataframe in two: the rows where the target feature is known (X) and the rows where it is unknown (Z).
        - From the rows with known values (X), separate out the target feature (call it 'y').
        - Using Random Forest, build a model that maps X to y.
        - Use the model to predict the missing values from Z.
    - At each iteration we replace the mode-imputed values of one feature with RF-imputed values.
- IVEware is available on several platforms, but Python is not one of them, so I run it in R outside this notebook, on the file this notebook writes ('../../Big_Files/data_IVEware.txt').  Be aware that the random selection of values to erase is different on each run, so the IVEware imputation must be rerun every time the values are re-erased.
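The round-robin steps listed above can be sketched on a toy dataframe. This is a minimal illustration, not the notebook's `Impute_Round_Robin`: the function name `round_robin_impute`, the toy columns, and the forest settings are all assumptions made for the example, and it writes predictions back in place with `.loc` rather than concatenating the known and unknown rows.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def round_robin_impute(data_nan, **rf_kwargs):
    """Impute each incomplete column once, fewest-missing first (hypothetical helper)."""
    # Start from a mode-imputed copy so every predictor column is complete.
    filled = data_nan.apply(lambda col: col.fillna(col.mode()[0]))
    n_missing = data_nan.isna().sum()
    order = n_missing[n_missing > 0].sort_values().index
    for feature in order:
        missing = data_nan[feature].isna()
        predictors = filled.drop(columns=feature)
        clf = RandomForestClassifier(random_state=0, **rf_kwargs)
        # Train only on rows where the target feature was actually observed.
        clf.fit(predictors[~missing], data_nan.loc[~missing, feature])
        # Overwrite the mode-imputed placeholders with model predictions.
        filled.loc[missing, feature] = clf.predict(predictors[missing])
    return filled

# Toy data (assumed): column 'b' copies 'a', so 'a' is a perfect predictor of 'b'.
rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=300)
toy = pd.DataFrame({'a': a, 'b': a, 'c': rng.integers(0, 2, size=300)}).astype(float)
toy.loc[toy.sample(frac=0.15, random_state=0).index, 'b'] = np.nan

imputed = round_robin_impute(toy, n_estimators=20)
print(imputed.isna().sum().sum(), (imputed['a'] == imputed['b']).mean())
```

Because each feature is predicted from a copy in which the other features are already filled in, later features benefit from the RF-imputed values of earlier ones.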

# Results of Comparison of Three Imputation Methods

- We ran the imputation on 78 features with 224,850 samples.  
    - These are the CRSS features that have data for all of 2016 - 2020, are not the result of imputation by CRSS, may carry a pattern (unlike effectively random values such as VINs), and have no more than 20% of their samples missing.
    - The features were discretized (binned) down to 2-10 categories before imputation.
    - The samples are those of the 619,027 that have no missing values in any of the 78 features.
- First Run
    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.25% |
| Mode Imputation | 28.51% |
| IVEware | 24.23% |

    - Comparison of number of errors in the 78 features:

|  | Fewer | Equal | More | Total |
| --- | --- | --- | --- | --- |
| Compare RF to Mode | 45 | 33 | 0 | 78 |
| Compare RF to IVEware | 50 | 0 | 28 | 78 |
| Compare Mode to IVEware | 39 | 0 | 39 | 78 |
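Each row of the table above tallies, feature by feature, whether the first method made fewer, equally many, or more errors than the second. A small sketch of that tally, using three hypothetical features and made-up values (the names `tally`, `meth_a`, and `meth_b` are assumptions for illustration):

```python
import pandas as pd

def tally(errors_a, errors_b):
    """Count features where method A has fewer, equal, or more errors than B."""
    return {
        'Fewer': int((errors_a < errors_b).sum()),
        'Equal': int((errors_a == errors_b).sum()),
        'More': int((errors_a > errors_b).sum()),
    }

# Hypothetical ground truth and two imputed copies of a three-feature dataset.
truth  = pd.DataFrame({'f1': [0, 1, 1, 0], 'f2': [2, 2, 0, 1], 'f3': [1, 1, 1, 1]})
meth_a = pd.DataFrame({'f1': [0, 1, 1, 0], 'f2': [2, 2, 0, 0], 'f3': [1, 0, 1, 1]})
meth_b = pd.DataFrame({'f1': [0, 0, 1, 0], 'f2': [2, 2, 0, 0], 'f3': [1, 1, 1, 1]})

# Per-feature mismatch counts against the ground truth.
errors_a = (truth != meth_a).sum()
errors_b = (truth != meth_b).sum()
result = tally(errors_a, errors_b)
print(result)
```

The three counts always sum to the number of features, which is what the Total column records.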


- Second Run
    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.17% |
| Mode Imputation | 28.42% |
| IVEware | 23.84% |


    - Comparison of number of errors in the 78 features:

|  | Fewer | Equal | More |
| --- | --- | --- | --- |
| Compare RF to Mode | 46 | 31 | 1 |
| Compare RF to IVEware | 49 | 0 | 29 |
| Compare Mode to IVEware |  36 | 1 | 41 |

    - Number of NaN Imputed Differently by Different Methods

|  |  |
| --- | --- |
|Total Number of NaN|  2,443,202|
|RF Different from Mode|  273,351|
|RF Different from IVEware|  606,751|
|Mode Different from IVEware|  738,833|
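
The total number of NaN is about 13.9% of the 78 × 224,850 = 17,538,300 cells rather than a full 15%. That is consistent with the row indices for each feature being drawn with replacement (the default for `np.random.choice`), so that repeated draws erase the same cell twice. The sketch below illustrates the effect for a single column; the helper name `masked_fraction` and the seed are assumptions made for the example.

```python
import numpy as np

def masked_fraction(n_rows, frac, replace, seed=0):
    """Fraction of distinct rows hit when drawing int(n_rows * frac) row indices."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(n_rows, size=int(n_rows * frac), replace=replace)
    return np.unique(idx).size / n_rows

with_replacement = masked_fraction(224_850, 0.15, replace=True)
without_replacement = masked_fraction(224_850, 0.15, replace=False)
# Expected distinct fraction when sampling with replacement: 1 - exp(-frac).
expected = 1 - np.exp(-0.15)

print(with_replacement, without_replacement, expected)
```

Drawing without replacement would erase essentially the full 15% of each column; either choice is fine for the comparison as long as all three methods see the same erased cells.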

- Third run, with 79 features (I had neglected to include AGE in the first two runs)


    - Percentage of Samples Incorrectly Imputed

| | Percentage of Samples Incorrectly Imputed |
| --- | --- |
| Random Forest | 22.52% |
| Mode Imputation | 28.63% |
| IVEware | 22.73% |



    - Comparison of number of errors in the 79 features:

|  | Fewer | Equal | More |
| --- | --- | --- | --- |
| Compare RF to Mode | 47 | 31 | 1 |
| Compare RF to IVEware | 47 | 0 | 32 |
| Compare Mode to IVEware |  38 | 0 | 41 |

    - Number of NaN Imputed Differently by Different Methods

|  |  |
| --- | --- |
|Total Number of NaN|  2,417,148|
|RF Different from Mode|  279,104|
|RF Different from IVEware|  580,863|
|Mode Different from IVEware|  713,171|
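
The pairwise counts above can be read against the total number of NaN because, assuming every method leaves the non-erased cells untouched, two imputed tables can only disagree in cells that were erased. A toy sketch of that bookkeeping, with made-up values and a hypothetical `mask` marking the erased cells:

```python
import pandas as pd

truth = pd.DataFrame({'f1': [0, 1, 1, 0, 2], 'f2': [1, 1, 0, 0, 1]})
# True where a value was erased before imputation (assumed pattern).
mask = pd.DataFrame({'f1': [True, False, True, False, False],
                     'f2': [False, True, False, False, True]})

# Two stand-in "imputations": method A fills erased cells with 1, method B with 0.
imp_a = truth.where(~mask, other=1)
imp_b = truth.where(~mask, other=0)

differ = imp_a != imp_b
n_masked = int(mask.sum().sum())
n_differ = int(differ.sum().sum())
errors_a = int((imp_a != truth).sum().sum())
errors_b = int((imp_b != truth).sum().sum())

print(n_masked, n_differ, errors_a, errors_b)
```

Dividing a pairwise count by the total number of NaN gives the share of erased cells on which two methods disagree.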



## Discussion

- Random Forest is as good as or better than Mode on (nearly) every feature.
- Random Forest has fewer errors than IVEware on more than half of the features (47 to 50, depending on the run), though not overwhelmingly, and its overall error rate is slightly lower (22.2% to 22.5% versus 22.7% to 24.2%).
- IVEware and Mode each win on roughly half of the features, but IVEware's overall error rate is much lower (22.7% to 24.2% versus 28.4% to 28.6%).
- Random Forest and Mode make largely the same mistakes: they disagree on only about 11% to 12% of the erased values.
- IVEware makes different mistakes from Random Forest and Mode: it disagrees with Random Forest on about 24% to 25% of the erased values and with Mode on about 30%.

## Conclusion

- Use Random Forest

In [1]:
%%latex
\tableofcontents


# Setup
## Import Libraries

In [2]:
import sys, copy, math, time, os

print ('Python version: {}'.format(sys.version))

import numpy as np
print ('NumPy version: {}'.format(np.__version__))
np.set_printoptions(suppress=True)


import pandas as pd
print ('Pandas version:  {}'.format(pd.__version__))
pd.set_option('display.max_rows', 500)

import sklearn
print ('SciKit-Learn version: {}'.format(sklearn.__version__))
from sklearn.model_selection import train_test_split

import sklearn.neighbors._base
sys.modules['sklearn.neighbors.base'] = sklearn.neighbors._base

from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import RandomForestRegressor

# Set Randomness.  Copied from https://www.kaggle.com/code/abazdyrev/keras-nn-focal-loss-experiments
import random
#np.random.seed(42) # NumPy
#random.seed(42) # Python
#tf.set_random_seed(42) # Tensorflow

from IPython.display import Audio
sound_file = './beep.wav'

import warnings
warnings.filterwarnings('ignore')

print ('Finished Importing Libraries')


Python version: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:25:13) [Clang 14.0.6 ]
NumPy version: 1.24.0
Pandas version:  1.5.2
SciKit-Learn version: 1.2.0
Finished Importing Libraries


# Import Data

## Get Data
- Get_Data() reads the discretized CRSS file from a folder outside this GitHub repo (the files are too large for my subscription), drops the columns that hold values imputed by CRSS (those with '_IM' in the name), and returns the dataframe.

In [3]:
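def Get_Data():
    print ('Get_Data')
    data = pd.read_csv('../../Big_Files/CRSS_Discretized_All_12_22_22.csv', low_memory=False)
    print ('data.shape = ', data.shape)
    print ('Drop Imputed Columns')
    # Columns with '_IM' in the name hold values imputed by CRSS.
    # Collect them first, then drop them in one call, rather than
    # dropping columns while iterating over the dataframe.
    imputed_columns = [feature for feature in data if '_IM' in feature]
    data.drop(columns=imputed_columns, inplace=True)
    
    print ('data.shape = ', data.shape)
    print ()
    
    return data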

In [4]:
#data = Get_Data()


In [5]:
def Impute_Round_Robin(data):
    print ('Impute()')
    pd.set_option('display.max_columns', None)
    
    # Replace 'Unknown' with np.NaN
    data.replace({'Unknown': np.nan}, inplace=True)
    display(data.head(20))
    print ()
    
#    data.sort_values(by = ['CASENUM', 'VEH_NO', 'PER_NO'], ascending = [True, True, True])
    
    # Make a list of features with missing samples, 
    #     ordered by the number of missing samples, 
    #     from least to most.  
    Missing = []
    Complete = []
    for feature in data:
        s = data[feature].isna().sum()
        if s==0:
            Complete.append([feature, s])
        if s>0:
            Missing.append([feature, s])
    Missing = sorted (Missing, key=lambda x:x[1], reverse=False)
    print ()
    print ('Complete[]')
    display(Complete)
    print ()
    print ('Missing[]')
    display(Missing)
    print ()
    
    print ('Make data_Mode')
    print ()
    data_Mode = pd.DataFrame()
    for X in Complete:
        feature = X[0]
        data_Mode[feature] = data[feature]
    for M in Missing:
        feature = M[0]
        m = data[feature].mode()[0]
        print (feature, M[1], m)
        data_Mode[feature] = data[feature].fillna(m)
    print ('data_Mode')
    display(data_Mode.head(20))
#    data.sort_values(
#        by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#        ascending = [True, True, True], 
#        inplace=True
#    )
#    print ()
#    print ('data.PER_NO.equals(data__Mode.PER_NO)')
#    print (data.PER_NO.equals(data_Mode.PER_NO))
#    print ()
#    
    print ()
    print ('Make starting point for data_Imputed')
    data_Imputed = pd.DataFrame()
    for X in Complete:
        feature = X[0]
        data_Imputed[feature] = data[feature]
    for X in Missing:
        feature = X[0]
        data_Imputed[feature] = data_Mode[feature]
    print ('data_Imputed')
    display(data_Imputed.head(20))
    print ()
#    data_Imputed.sort_values(
#        by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#        ascending = [True, True, True], 
#        inplace=True
#    )
#    print ()
#    print ('data.PER_NO.equals(data_Imputed.PER_NO)')
#    print (data.PER_NO.equals(data_Imputed.PER_NO))
#    print ()
    
    print ('Start Loop')
    print ()
    n = 0
    for M in Missing:
        n += 1
        print (M)
        feature = M[0]
        data_Imputed[feature] = data[feature]
#        print ()
#        print ('data[feature].isna().sum()')
#        print (data[feature].isna().sum())
#        print ('data_Imputed[feature].isna().sum()')
#        print (data_Imputed[feature].isna().sum())
#        print ()
        # Rows where the target feature is known: W keeps the target, X drops it.
        W = data_Imputed.dropna(subset=[feature])
        X = W.drop(columns=feature)
        y = W[feature]
        # Rows where the target feature is missing, without the (all-NaN) target column.
        Z = data_Imputed[data_Imputed[feature].isna()].drop(columns=feature)
#        Z.reset_index(drop=True, inplace=True)
#        print (data.shape)
#        print (X.shape)
#        display(X.head(40))
#        display(y.head(40))
#        print (Z.shape)
#        display(Z)
        clf = RandomForestClassifier(max_depth=2, random_state=0)
        clf.fit(X,y)
#        print ('clf.predict(Z)')
        z = clf.predict(Z)
        print (len(z))
        display(z)
        Z[feature] = z
#        display(Z)
        data_Imputed = pd.concat([Z, W])
#        display(data_Imputed.head(60))
        print (data_Imputed.shape)
        print ()
#        data_Imputed.sort_values(
#            by = ['CASENUM', 'VEH_NO', 'PER_NO'], 
#            ascending = [True, True, True], 
#            inplace=True
#        )
#        print ()
#        print ('data.PER_NO.equals(data_Imputed.PER_NO)')
#        print (data.PER_NO.equals(data_Imputed.PER_NO))
#        print ()
               
        Check_Feature(data, data_Imputed, feature)
#        if n==10:
#            return data_Imputed
    
    
    
    
    print ()
    return data_Imputed

In [6]:
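def Impute_Full(data):
    # NOTE: MissForest is never imported in this notebook, so this function
    # raises NameError as written.  Import MissForest from the package being
    # tested before calling it.  Main() leaves the call commented out.
    print ('Impute()')
    data.replace({'Unknown': np.nan}, inplace=True)
    for feature in data:
        print (feature, len(pd.unique(data[feature])))
    print ()
    mf = MissForest()
    data = mf.fit_transform(data)
    return data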

In [7]:
def Check(data, data_Imputed):
    Features = data.columns
    print (Features)
    for feature in Features:
        U = pd.unique(data[feature]).tolist()
        print (U)
        A = []
        for u in U:
            a = len(data[data[feature]==u])
            b = len(data_Imputed[data_Imputed[feature]==u])
            A.append([u, a, b])
        display(A)
        print ()


In [8]:
def Check_Feature(data, data_Imputed, feature):
    U = pd.unique(data[feature]).tolist()
    U = [x for x in U if x == x]
    print (U)
    A = []
    for u in U:
        a = len(data[data[feature]==u])
        b = len(data_Imputed[data_Imputed[feature]==u])
        A.append([u, a, b, b-a])
    a = data[feature].isna().sum()
    b = data_Imputed[feature].isna().sum()
    A.append(['NaN', a, b, 0])
    A = pd.DataFrame(A, columns=['Value', 'Original', 'Imputed', 'Difference'])
    display(A)
    print ()


# Test_Accuracy

In [None]:
def Compare_Imputation_Methods_Part_1():
    print ()
    print ('Compare_Imputation_Methods_Part_1()')
    data = Get_Data()
    data.drop(columns=['CASENUM', 'VEH_NO', 'PER_NO'], inplace=True)
    print (data.shape)

    # Drop all samples with missing data, so we have ground truth
    data.replace({'Unknown':np.nan}, inplace=True)
    data.dropna(inplace=True)
    data.reset_index(inplace=True, drop=True)
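    for feature in data:
        data[feature] = pd.to_numeric(data[feature])
    # `data` is not cast to int64 here: values are erased (set to NaN) below,
    # and NaN cannot be stored in an int64 column.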

    data_Ground_Truth = data.copy(deep=True)
    for feature in data_Ground_Truth:
        data_Ground_Truth[feature] = pd.to_numeric(data_Ground_Truth[feature])
    data_Ground_Truth = data_Ground_Truth.astype('int64')
    print ('data_Ground_Truth.shape')
    print (data_Ground_Truth.shape)
    display(data_Ground_Truth.head())

    # For each feature (column), draw 15% of the row indices at random
    # and set those values to missing.  np.random.choice samples with
    # replacement by default, so repeated draws leave slightly fewer
    # than 15% of the values in each feature erased.
    print ('Remove 15% of values from each feature')
    frac = .15
    for c in data.columns:
        idx = np.random.choice(a=data.index, size=int(len(data) * frac))
        data.loc[idx, c] = np.nan
    data_NaN = data.copy(deep=True)
    print ('data_NaN.shape')
    print (data_NaN.shape)
    display(data_NaN.head())

    data_IVEware = data.fillna('')
    data_IVEware.to_csv('../../Big_Files/data_IVEware.txt', sep='\t', index=False)
    
    data_Mode = pd.DataFrame()
    for feature in data:
        data_Mode[feature] = data[feature].fillna(data[feature].mode()[0])
    data_Mode = data_Mode.astype('int64')
    print ('data_Mode.shape')
    print (data_Mode.shape)
    display(data_Mode.head())
    
    data_RF = Impute_Round_Robin(data)
    data_RF.sort_index(inplace=True)
    data_RF = data_RF[data.columns]  
    data_RF = data_RF.astype('int64')
    
    print ('data_RF.shape')
    print (data_RF.shape)
    display(data_RF.head())
#    print ()

    return data_Ground_Truth, data_NaN, data_RF, data_Mode

def Compare_Imputation_Methods_Part_2(
    data_Ground_Truth, data_NaN, data_RF, data_Mode, data_IVEware
):
    print ('Compare_Imputation_Methods_Part_2')
    A = []
    for feature in data_NaN:
        N = data_NaN[feature].isna().sum()
#        print (feature, N)
#        print ()
        D = data_Ground_Truth[feature] != data_RF[feature]
        d = D.sum()
        E = data_Ground_Truth[feature] != data_Mode[feature]
        e = E.sum()
        F = data_Ground_Truth[feature] != data_IVEware[feature]
        f = F.sum()
        G = data_RF[feature] != data_Mode[feature]
        g = G.sum()
        H = data_RF[feature] != data_IVEware[feature]
        h = H.sum()
        I = data_Mode[feature] != data_IVEware[feature]
        i = I.sum()
        print (feature, N, d, e, f, g, h, i)
        print (
            feature, 
            data_Ground_Truth.dtypes[feature],
            data_NaN.dtypes[feature],
            data_RF.dtypes[feature],
            data_Mode.dtypes[feature],
            data_IVEware.dtypes[feature],
        )
        A.append([
            feature, N, 
            d, int(d/N*100), 
            e, int(e/N*100), 
            f, int(f/N*100),
            g, int(g/N*100),
            h, int(h/N*100),
            i, int(i/N*100),
        ])
#        print (D[:10])
        print ()
    print ()
    
    A = sorted(A, key=lambda x:x[3])
    B = pd.DataFrame(
        A, 
        columns=[
            'Feature', 'nNaN', 
            'nRF Incorrect', 'pRF Incorrect', 
            'nMode Incorrect', 'pMode Incorrect', 
            'nIVEware Incorrect', 'pIVEware Incorrect',
            'RF and Mode Different', 'RF v/s Mode %',
            'RF and IVEware Different', 'RF v/s IVEware %',
            'Mode and IVEware Different', 'Mode v/s IVEware %'
        ]
    )
    display(B)
    a = sum([x[1] for x in A])
    b = sum([x[2] for x in A])
    c = sum([x[4] for x in A])
    d = sum([x[6] for x in A])
    e = round(b/a*100,2)
    f = round(c/a*100,2)
    g = round(d/a*100,2)

    # Per-feature tallies: on how many features does the first method have
    # fewer, equal, or more incorrect values than the second?
    RF_less_Mode = sum([x[2] < x[4] for x in A])
    RF_equal_Mode = sum([x[2] == x[4] for x in A])
    RF_greater_Mode = sum([x[2] > x[4] for x in A])

    RF_less_IVEware = sum([x[2] < x[6] for x in A])
    RF_equal_IVEware = sum([x[2] == x[6] for x in A])
    RF_greater_IVEware = sum([x[2] > x[6] for x in A])

    Mode_less_IVEware = sum([x[4] < x[6] for x in A])
    Mode_equal_IVEware = sum([x[4] == x[6] for x in A])
    Mode_greater_IVEware = sum([x[4] > x[6] for x in A])

    print ()
    print ('Error RF = ', e)
    print ('Error Mode = ', f)
    print ('Error IVEware = ', g)
    print ('Compare RF to Mode: ', RF_less_Mode, RF_equal_Mode, RF_greater_Mode)
    print ('Compare RF to IVEware: ', RF_less_IVEware, RF_equal_IVEware, RF_greater_IVEware)
    print ('Compare Mode to IVEware: ', Mode_less_IVEware, Mode_equal_IVEware, Mode_greater_IVEware)
    print ()
    print ('Number of NaN in data_NaN: ', data_NaN.isna().sum().sum())
    print ('RF Different from Mode: ', sum([x[8] for x in A]))
    print ('RF Different from IVEware: ', sum([x[10] for x in A]))
    print ('Mode Different from IVEware: ', sum([x[12] for x in A]))
        
    display(Audio(sound_file, autoplay=True))
    
    
        

In [None]:
data_Ground_Truth, data_NaN, data_RF, data_Mode = Compare_Imputation_Methods_Part_1()

## Now do IVEware Imputation:  IVE_12_22_22.xml

In [None]:
data_IVEware = pd.read_csv('../../Big_Files/data_IVEware.csv')
data_IVEware.drop(columns='Unnamed: 0', inplace=True)

print ('data_Ground_Truth', data_Ground_Truth.shape)
display(data_Ground_Truth.head(10))
print ('data_NaN', data_NaN.shape)
display(data_NaN.head(10))
print ('data_RF', data_RF.shape)
display(data_RF.head(10))
print ('data_Mode', data_Mode.shape)
display(data_Mode.head(10))
print ('data_IVEware', data_IVEware.shape)
display(data_IVEware.head(10))


In [None]:
Compare_Imputation_Methods_Part_2(
    data_Ground_Truth, data_NaN, data_RF, data_Mode, data_IVEware
)

In [9]:
def Main():
    data = Get_Data()
    
#    data_Imputed = Impute_Full(data)
    data_Imputed = Impute_Round_Robin(data)
    data_Imputed.to_csv('../../Big_Files/CRSS_Imputed_All_12_22_22.csv', index=False)
#    display(data_Imputed.head(50))
    
    Check(data, data_Imputed)
    display(Audio(sound_file, autoplay=True))
    return 0
Main()

Get_Data
data.shape =  (619027, 107)
Drop Imputed Columns
data.shape =  (619027, 82)

Impute()


Unnamed: 0,CASENUM,HOUR,INT_HWY,LGT_COND,MONTH,PEDS,PERMVIT,REL_ROAD,RELJCT2,SCH_BUS,URBANICITY,VE_TOTAL,WEATHER,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,NUM_INJ,PSU,PJ,MAN_COLL,HARM_EV,TYP_INT,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,ACC_TYPE,BODY_TYP,BUS_USE,CARGO_BT,DR_PRES,EMER_USE,FIRE_EXP,HAZ_CNO,HAZ_INV,HAZ_PLAC,HAZ_REL,HIT_RUN,IMPACT1,J_KNIFE,M_HARM,MAK_MOD,MAKE,MAX_VSEV,MODEL,NUM_INJV,NUMOCCS,P_CRASH1,P_CRASH2,PCRASH4,PCRASH5,ROLLOVER,SPEC_USE,SPEEDREL,TOW_VEH,TOWED,VALIGN,VEH_ALCH,VPROFILE,VSPD_LIM,VSURCOND,VTCONT_F,VTRAFCON,VTRAFWAY,AGE,AIR_BAG,ALC_RES,ALC_STATUS,EJECTION,HOSPITAL,INJ_SEV,PER_NO,PER_TYP,REST_MIS,REST_USE,SEAT_POS,SEX,VEH_AGE
0,201600014311,2,0,3,0,0,1,1,1,0,2,2,1,1,0,2,0,0,0,4,4,4,1,1.0,2016,4,2,0,0,1,4,1.0,1,0,1,1,1,1,0,0,1,0,3,1,1,2,0.0,2.0,1,3.0,3.0,5,5,1.0,2.0,1,1,1,0,2,1,1.0,1,7,1,,,,3.0,1.0,,,1,0,3.0,1,2,1,1.0,3,1.0,3.0
1,201600014311,2,0,3,0,0,1,1,1,0,2,2,1,1,0,2,0,0,0,4,4,4,1,1.0,2016,4,2,0,0,2,3,2.0,1,0,1,1,1,1,0,0,1,0,0,1,1,4,4.0,2.0,4,3.0,3.0,1,4,1.0,3.0,1,1,1,0,2,1,1.0,1,7,1,,,,2.0,1.0,,,1,0,3.0,1,2,1,1.0,3,1.0,0.0
2,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,1,1,5.0,1,0,1,1,1,1,0,0,1,0,1,1,1,3,8.0,1.0,4,1.0,3.0,2,1,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,,,1,0,1.0,1,2,1,1.0,3,1.0,4.0
3,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,,,1,0,1.0,1,2,1,1.0,3,0.0,2.0
4,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,0.0,1.0,1,0,1.0,2,1,1,1.0,0,1.0,2.0
5,201600014315,4,0,1,0,0,2,1,0,0,2,2,1,1,0,2,0,0,4,4,4,0,1,,2016,4,2,1,0,2,0,0.0,1,0,1,1,1,1,0,0,1,0,1,1,1,1,6.0,1.0,3,0.0,5.0,1,0,1.0,3.0,1,1,1,0,0,1,1.0,2,7,1,,,,1.0,0.0,0.0,1.0,1,0,1.0,3,1,1,1.0,4,1.0,2.0
6,201600014316,0,0,1,0,0,0,0,1,0,2,1,1,1,0,1,0,0,1,4,4,1,3,1.0,2016,4,2,3,0,1,0,1.0,1,0,1,1,1,1,0,0,1,0,2,1,0,0,0.0,0.0,1,1.0,3.0,1,0,0.0,1.0,0,1,1,0,0,1,1.0,1,1,1,,,2.0,2.0,1.0,,,0,5,0.0,1,2,1,0.0,3,0.0,3.0
7,201600014335,5,0,1,0,0,1,1,0,0,2,2,0,1,0,2,0,0,0,4,4,0,1,,2016,4,9,0,0,1,0,,1,0,1,1,1,1,0,0,1,1,1,1,1,4,,,4,,,1,2,,,1,1,1,0,2,1,,2,5,3,,,,,,0.0,1.0,1,0,,1,2,1,,3,,
8,201600014335,5,0,1,0,0,1,1,0,0,2,2,0,1,0,2,0,0,0,4,4,0,1,,2016,4,9,0,0,2,0,1.0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,6.0,2.0,0,3.0,3.0,1,0,1.0,3.0,1,1,1,0,2,1,1.0,2,5,3,,,,2.0,1.0,0.0,1.0,1,0,3.0,1,2,1,1.0,3,0.0,2.0
9,201600014586,1,0,3,0,1,0,1,2,0,2,1,1,1,0,1,0,1,1,0,2,1,2,1.0,2016,4,2,2,0,1,4,1.0,1,0,1,1,1,1,0,0,1,0,1,1,2,2,4.0,2.0,2,3.0,3.0,2,5,1.0,3.0,1,1,0,0,2,1,1.0,1,2,1,1.0,1.0,2.0,3.0,,0.0,1.0,1,0,3.0,1,2,1,1.0,3,1.0,4.0




Complete[]


[['CASENUM', 0],
 ['INT_HWY', 0],
 ['MONTH', 0],
 ['PEDS', 0],
 ['PERMVIT', 0],
 ['REL_ROAD', 0],
 ['SCH_BUS', 0],
 ['URBANICITY', 0],
 ['VE_TOTAL', 0],
 ['DAY_WEEK', 0],
 ['WRK_ZONE', 0],
 ['VE_FORMS', 0],
 ['PVH_INVL', 0],
 ['PERNOTMVIT', 0],
 ['PSU', 0],
 ['PJ', 0],
 ['YEAR', 0],
 ['REGION', 0],
 ['ALCOHOL', 0],
 ['MAX_SEV', 0],
 ['RELJCT1', 0],
 ['VEH_NO', 0],
 ['FIRE_EXP', 0],
 ['HAZ_INV', 0],
 ['J_KNIFE', 0],
 ['MAK_MOD', 0],
 ['MODEL', 0],
 ['ROLLOVER', 0],
 ['HOSPITAL', 0],
 ['PER_NO', 0],
 ['PER_TYP', 0],
 ['REST_MIS', 0]]


Missing[]


[['DR_PRES', 20],
 ['HIT_RUN', 29],
 ['HAZ_PLAC', 31],
 ['HAZ_REL', 48],
 ['HAZ_CNO', 105],
 ['HARM_EV', 217],
 ['M_HARM', 231],
 ['TOW_VEH', 1043],
 ['HOUR', 1684],
 ['MAN_COLL', 2677],
 ['PCRASH5', 2928],
 ['ACC_TYPE', 3717],
 ['LGT_COND', 3814],
 ['P_CRASH2', 4492],
 ['EMER_USE', 5320],
 ['NUM_INJ', 5621],
 ['BUS_USE', 6690],
 ['SEAT_POS', 7820],
 ['P_CRASH1', 9963],
 ['CARGO_BT', 11255],
 ['SPEEDREL', 11578],
 ['SPEC_USE', 11955],
 ['IMPACT1', 13288],
 ['MAKE', 13328],
 ['VSURCOND', 14471],
 ['VEH_AGE', 19218],
 ['MAX_VSEV', 19513],
 ['NUM_INJV', 19513],
 ['INJ_SEV', 21474],
 ['BODY_TYP', 21859],
 ['NUMOCCS', 22142],
 ['SEX', 25579],
 ['WEATHER', 28669],
 ['PCRASH4', 30069],
 ['TOWED', 35788],
 ['EJECTION', 35963],
 ['AGE', 39525],
 ['VTRAFCON', 39795],
 ['VTCONT_F', 39981],
 ['VALIGN', 42739],
 ['RELJCT2', 43534],
 ['REST_USE', 55764],
 ['AIR_BAG', 62123],
 ['TYP_INT', 67861],
 ['VSPD_LIM', 83155],
 ['VPROFILE', 84506],
 ['ALC_RES', 106366],
 ['ALC_STATUS', 106366],
 ['VEH_ALCH', 


Make data_Mode

DR_PRES 20 1
HIT_RUN 29 0
HAZ_PLAC 31 0
HAZ_REL 48 1
HAZ_CNO 105 1
HARM_EV 217 1
M_HARM 231 1
TOW_VEH 1043 0
HOUR 1684 3
MAN_COLL 2677 3
PCRASH5 2928 3
ACC_TYPE 3717 2
LGT_COND 3814 3
P_CRASH2 4492 5
EMER_USE 5320 1
NUM_INJ 5621 0
BUS_USE 6690 1
SEAT_POS 7820 3
P_CRASH1 9963 1
CARGO_BT 11255 0
SPEEDREL 11578 1
SPEC_USE 11955 1
IMPACT1 13288 1
MAKE 13328 8
VSURCOND 14471 1
VEH_AGE 19218 0.0
MAX_VSEV 19513 2
NUM_INJV 19513 3
INJ_SEV 21474 3
BODY_TYP 21859 1
NUMOCCS 22142 3
SEX 25579 1
WEATHER 28669 1
PCRASH4 30069 1
TOWED 35788 2
EJECTION 35963 1
AGE 39525 2
VTRAFCON 39795 1
VTCONT_F 39981 1
VALIGN 42739 1
RELJCT2 43534 1
REST_USE 55764 1
AIR_BAG 62123 1
TYP_INT 67861 1
VSPD_LIM 83155 2
VPROFILE 84506 1
ALC_RES 106366 0
ALC_STATUS 106366 1
VEH_ALCH 107580 1
VTRAFWAY 111189 0
data_Mode


Unnamed: 0,CASENUM,INT_HWY,MONTH,PEDS,PERMVIT,REL_ROAD,SCH_BUS,URBANICITY,VE_TOTAL,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,PSU,PJ,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,FIRE_EXP,HAZ_INV,J_KNIFE,MAK_MOD,MODEL,ROLLOVER,HOSPITAL,PER_NO,PER_TYP,REST_MIS,DR_PRES,HIT_RUN,HAZ_PLAC,HAZ_REL,HAZ_CNO,HARM_EV,M_HARM,TOW_VEH,HOUR,MAN_COLL,PCRASH5,ACC_TYPE,LGT_COND,P_CRASH2,EMER_USE,NUM_INJ,BUS_USE,SEAT_POS,P_CRASH1,CARGO_BT,SPEEDREL,SPEC_USE,IMPACT1,MAKE,VSURCOND,VEH_AGE,MAX_VSEV,NUM_INJV,INJ_SEV,BODY_TYP,NUMOCCS,SEX,WEATHER,PCRASH4,TOWED,EJECTION,AGE,VTRAFCON,VTCONT_F,VALIGN,RELJCT2,REST_USE,AIR_BAG,TYP_INT,VSPD_LIM,VPROFILE,ALC_RES,ALC_STATUS,VEH_ALCH,VTRAFWAY
0,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,1,1,0,1,2,1,1,0,1,2,1,1,0,0,1,1,1,1,0,2,4,2,4,3,5,1,0,1,3,5,0,1,1,3,0,1,3.0,2,3,3,1,3,1,1,1,2,1,3,1,1,1,1,1,1,1,7,1,0,1,1,0
1,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,2,1,0,1,4,4,1,0,1,2,1,1,0,0,1,1,1,1,0,2,4,3,3,3,4,1,0,1,3,1,0,1,1,0,4,1,0.0,2,3,3,2,3,1,1,1,2,1,2,1,1,1,1,1,1,1,7,1,0,1,1,0
2,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,1,1,0,1,3,4,1,0,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,1,1,1,4,1,3,2,0,1,1,1,8,1,4.0,1,1,1,5,3,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
3,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,1,2,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,3,1,0,1,1,1,6,1,2.0,1,0,1,0,5,0,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
4,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,2,1,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,0,1,0,1,1,1,6,1,2.0,1,0,1,0,5,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
5,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,3,1,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,4,1,0,1,1,1,6,1,2.0,1,0,1,0,5,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
6,201600014316,0,0,0,0,0,0,2,1,1,0,1,0,0,4,4,2016,4,2,3,0,1,1,0,1,0,1,0,5,1,2,1,1,0,0,1,1,3,0,0,0,1,1,0,1,0,1,1,1,3,1,0,1,1,2,0,1,3.0,0,1,0,1,3,0,1,0,0,0,2,1,1,1,1,0,1,1,1,1,0,1,1,2
7,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,1,1,0,1,4,4,1,0,1,2,1,1,1,0,1,1,1,1,0,5,0,3,0,1,2,1,0,1,3,1,0,1,1,1,8,3,0.0,2,3,3,1,3,1,0,1,2,1,2,1,1,1,0,1,1,1,5,2,0,1,1,0
8,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,2,1,0,1,0,0,1,0,1,2,1,1,0,0,1,1,1,1,0,5,0,3,0,1,0,1,0,1,3,1,0,1,1,1,6,3,2.0,2,3,3,1,3,0,0,1,2,1,2,1,1,1,0,1,1,1,5,2,0,1,1,0
9,201600014586,0,0,1,0,1,0,2,1,1,0,1,0,1,0,2,2016,4,2,2,0,1,1,0,1,2,2,1,0,1,2,1,1,0,0,1,1,2,2,0,1,1,3,4,3,5,1,1,1,3,2,0,0,1,1,4,1,4.0,2,3,3,1,3,1,1,1,2,1,3,1,1,1,2,1,1,1,2,1,0,1,1,2



Make starting point for data_Imputed
data_Imputed


Unnamed: 0,CASENUM,INT_HWY,MONTH,PEDS,PERMVIT,REL_ROAD,SCH_BUS,URBANICITY,VE_TOTAL,DAY_WEEK,WRK_ZONE,VE_FORMS,PVH_INVL,PERNOTMVIT,PSU,PJ,YEAR,REGION,ALCOHOL,MAX_SEV,RELJCT1,VEH_NO,FIRE_EXP,HAZ_INV,J_KNIFE,MAK_MOD,MODEL,ROLLOVER,HOSPITAL,PER_NO,PER_TYP,REST_MIS,DR_PRES,HIT_RUN,HAZ_PLAC,HAZ_REL,HAZ_CNO,HARM_EV,M_HARM,TOW_VEH,HOUR,MAN_COLL,PCRASH5,ACC_TYPE,LGT_COND,P_CRASH2,EMER_USE,NUM_INJ,BUS_USE,SEAT_POS,P_CRASH1,CARGO_BT,SPEEDREL,SPEC_USE,IMPACT1,MAKE,VSURCOND,VEH_AGE,MAX_VSEV,NUM_INJV,INJ_SEV,BODY_TYP,NUMOCCS,SEX,WEATHER,PCRASH4,TOWED,EJECTION,AGE,VTRAFCON,VTCONT_F,VALIGN,RELJCT2,REST_USE,AIR_BAG,TYP_INT,VSPD_LIM,VPROFILE,ALC_RES,ALC_STATUS,VEH_ALCH,VTRAFWAY
0,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,1,1,0,1,2,1,1,0,1,2,1,1,0,0,1,1,1,1,0,2,4,2,4,3,5,1,0,1,3,5,0,1,1,3,0,1,3.0,2,3,3,1,3,1,1,1,2,1,3,1,1,1,1,1,1,1,7,1,0,1,1,0
1,201600014311,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,0,0,2,1,0,1,4,4,1,0,1,2,1,1,0,0,1,1,1,1,0,2,4,3,3,3,4,1,0,1,3,1,0,1,1,0,4,1,0.0,2,3,3,2,3,1,1,1,2,1,2,1,1,1,1,1,1,1,7,1,0,1,1,0
2,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,1,1,0,1,3,4,1,0,1,2,1,1,0,0,1,1,1,1,0,4,0,3,1,1,1,1,4,1,3,2,0,1,1,1,8,1,4.0,1,1,1,5,3,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
3,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,1,2,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,3,1,0,1,1,1,6,1,2.0,1,0,1,0,5,0,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
4,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,2,1,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,0,1,0,1,1,1,6,1,2.0,1,0,1,0,5,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
5,201600014315,0,0,0,2,1,0,2,2,1,0,2,0,0,4,4,2016,4,2,1,0,2,1,0,1,1,3,1,0,3,1,1,1,0,0,1,1,1,1,0,4,0,3,0,1,0,1,4,1,4,1,0,1,1,1,6,1,2.0,1,0,1,0,5,1,1,1,0,1,1,1,1,1,0,1,0,1,7,2,0,1,1,0
6,201600014316,0,0,0,0,0,0,2,1,1,0,1,0,0,4,4,2016,4,2,3,0,1,1,0,1,0,1,0,5,1,2,1,1,0,0,1,1,3,0,0,0,1,1,0,1,0,1,1,1,3,1,0,1,1,2,0,1,3.0,0,1,0,1,3,0,1,0,0,0,2,1,1,1,1,0,1,1,1,1,0,1,1,2
7,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,1,1,0,1,4,4,1,0,1,2,1,1,1,0,1,1,1,1,0,5,0,3,0,1,2,1,0,1,3,1,0,1,1,1,8,3,0.0,2,3,3,1,3,1,0,1,2,1,2,1,1,1,0,1,1,1,5,2,0,1,1,0
8,201600014335,0,0,0,1,1,0,2,2,1,0,2,0,0,4,4,2016,4,9,0,0,2,1,0,1,0,0,1,0,1,2,1,1,0,0,1,1,1,1,0,5,0,3,0,1,0,1,0,1,3,1,0,1,1,1,6,3,2.0,2,3,3,1,3,0,0,1,2,1,2,1,1,1,0,1,1,1,5,2,0,1,1,0
9,201600014586,0,0,1,0,1,0,2,1,1,0,1,0,1,0,2,2016,4,2,2,0,1,1,0,1,2,2,1,0,1,2,1,1,0,0,1,1,2,2,0,1,1,3,4,3,5,1,1,1,3,2,0,0,1,1,4,1,4.0,2,3,3,1,3,1,1,1,2,1,3,1,1,1,2,1,1,1,2,1,0,1,1,2



Start Loop

['DR_PRES', 20]
20


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,618873,618893,20
1,0.0,134,134,0
2,,20,0,0



['HIT_RUN', 29]
29


array(['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0'], dtype=object)

(619027, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,586963,586992,29
1,1.0,32035,32035,0
2,,29,0,0



['HAZ_PLAC', 31]
31


array(['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '1'], dtype=object)

(619027, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,618782,618812,30
1,1.0,214,215,1
2,,31,0,0



['HAZ_REL', 48]
48


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,618782,618830,48
1,2.0,156,156,0
2,0.0,41,41,0
3,,48,0,0



['HAZ_CNO', 105]
105


array(['1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1', '1',
       '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,618782,618887,105
1,2.0,133,133,0
2,0.0,7,7,0
3,,105,0,0



['HARM_EV', 217]
217


array(['3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1', '3', '3',
       '3', '3', '1', '1', '3', '3', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '1', '3', '3',
       '3', '3', '3', '3', '3', '1', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '1', '3', '3', '1', '1', '1', '3', '3', '3',
       '3', '3', '3', '1', '3', '1', '3', '3', '3', '1', '1', '3', '3',
       '1', '3', '1', '1', '3', '1', '3', '1', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1',
       '1', '3', '3', '1', '1', '3', '3', '1', '3', '3', '3', '3', '3',
       '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1', '1', '1',
       '1', '1', '3', '3', '3', '3', '3', '3', '3', '3', '1', '1', '1',
       '3', '3', '3', '3', '3', '3', '3', '1', '3', '1', '3', '3', '1',
       '1', '1', '1', '3', '1', '3', '3', '3', '3', '1', '1', '1

(619027, 82)

['1', '3', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,504392,504461,69
1,3.0,51187,51335,148
2,2.0,52012,52012,0
3,0.0,11219,11219,0
4,,217,0,0



['M_HARM', 231]
231


array(['0', '0', '0', '0', '0', '0', '1', '1', '0', '0', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0',
       '1', '1', '1', '0', '0', '0', '0', '0', '0', '1', '0', '1', '0',
       '0', '0', '1', '1', '0', '0', '1', '0', '1', '1', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0',
       '1', '1', '1', '1', '1', '1', '0', '0', '1', '1', '0', '0', '1',
       '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '1',
       '1', '1', '1', '1', '1', '0', '0', '0', '0', '0', '0', '0', '0',
       '1', '1', '1', '0', '0', '0', '0', '0', '0', '0', '1', '0', '1',
       '0', '0', '1', '1', '1', '1', '0', '1', '0', '0', '0', '0', '1',
       '1', '1', '1', '1', '1', '0', '1', '1', '1', '1', '0', '0

(619027, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,501517,501604,87
1,0.0,62275,62419,144
2,2.0,55004,55004,0
3,,231,0,0



['TOW_VEH', 1043]
1043


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,602193,603236,1043
1,1.0,15791,15791,0
2,,1043,0,0



['HOUR', 1684]
1684


array(['3', '2', '2', ..., '3', '3', '2'], dtype=object)

(619027, 82)

['2', '4', '0', '5', '1', '3', '6']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,150696,150963,267
1,4.0,73637,73637,0
2,0.0,22512,22512,0
3,5.0,59087,59088,1
4,1.0,106547,106547,0
5,3.0,164271,165476,1205
6,6.0,40593,40804,211
7,,1684,0,0



['MAN_COLL', 2677]
2677


array(['1', '1', '1', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['4', '0', '1', '3', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,4.0,73402,73402,0
1,0.0,23800,23800,0
2,1.0,114521,114613,92
3,3.0,227013,228665,1652
4,2.0,177614,178547,933
5,,2677,0,0



['PCRASH5', 2928]
2928


array(['1', '3', '3', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['2', '3', '1', '4', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,66116,66116,0
1,3.0,472322,475142,2820
2,1.0,66664,66772,108
3,4.0,9782,9782,0
4,0.0,1215,1215,0
5,,2928,0,0



['ACC_TYPE', 3717]
3717


array(['0', '4', '4', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['4', '3', '1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,4.0,134507,135540,1033
1,3.0,137956,138691,735
2,1.0,85650,86022,372
3,0.0,118949,120370,1421
4,2.0,138248,138404,156
5,,3717,0,0



['LGT_COND', 3814]
3814


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['3', '1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,441683,445496,3813
1,1.0,108464,108465,1
2,2.0,14716,14716,0
3,0.0,50350,50350,0
4,,3814,0,0



['P_CRASH2', 4492]
4492


array(['0', '0', '0', ..., '5', '5', '0'], dtype=object)

(619027, 82)

['5', '4', '1', '0', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,5.0,123122,124817,1695
1,4.0,75942,75942,0
2,1.0,100941,101245,304
3,0.0,122557,123637,1080
4,2.0,71215,71215,0
5,3.0,120758,122171,1413
6,,4492,0,0



['EMER_USE', 5320]
5320


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,611771,617091,5320
1,0.0,1186,1186,0
2,2.0,750,750,0
3,,5320,0,0



['NUM_INJ', 5621]
5621


array(['0', '0', '0', ..., '0', '1', '0'], dtype=object)

(619027, 82)

['0', '4', '1', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,286580,291576,4996
1,4.0,28200,28200,0
2,1.0,191273,191898,625
3,2.0,75486,75486,0
4,3.0,31867,31867,0
5,,5621,0,0



['BUS_USE', 6690]
6690


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,609511,616201,6690
1,2.0,2699,2699,0
2,0.0,127,127,0
3,,6690,0,0



['SEAT_POS', 7820]
7820


array(['3', '3', '3', ..., '3', '4', '3'], dtype=object)

(619027, 82)

['3', '0', '4', '2', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,456233,463461,7228
1,0.0,4617,4617,0
2,4.0,40265,40401,136
3,2.0,25433,25433,0
4,1.0,84659,85115,456
5,,7820,0,0



['P_CRASH1', 9963]
9963


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['5', '1', '2', '4', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,5.0,62674,62674,0
1,1.0,307073,317007,9934
2,2.0,62808,62808,0
3,4.0,94273,94302,29
4,0.0,41797,41797,0
5,3.0,40439,40439,0
6,,9963,0,0



['CARGO_BT', 11255]
11255


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,585969,597216,11247
1,1.0,21803,21811,8
2,,11255,0,0



['SPEEDREL', 11578]
11578


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,570294,581872,11578
1,0.0,37155,37155,0
2,,11578,0,0



['SPEC_USE', 11955]
11955


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,599758,611713,11955
1,2.0,4262,4262,0
2,0.0,3052,3052,0
3,,11955,0,0



['IMPACT1', 13288]
13288


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['3', '0', '1', '2', '4', '5']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,87388,87388,0
1,0.0,57966,57966,0
2,1.0,256647,269620,12973
3,2.0,44423,44423,0
4,4.0,135680,135995,315
5,5.0,23635,23635,0
6,,13288,0,0



['MAKE', 13328]
13328


array(['8', '8', '8', ..., '8', '8', '8'], dtype=object)

(619027, 82)

['0', '4', '8', '6', '7', '2', '1', '3', '5']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,120860,121560,700
1,4.0,71217,71217,0
2,8.0,122439,135067,12628
3,6.0,81273,81273,0
4,7.0,24997,24997,0
5,2.0,76578,76578,0
6,1.0,86857,86857,0
7,3.0,12328,12328,0
8,5.0,9150,9150,0
9,,13328,0,0



['VSURCOND', 14471]
14471


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '3', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,493663,508134,14471
1,3.0,27839,27839,0
2,2.0,81404,81404,0
3,0.0,1650,1650,0
4,,14471,0,0



['VEH_AGE', 19218]
19218


array(['0.0', '0.0', '0.0', ..., '0.0', '0.0', '0.0'], dtype=object)

(619027, 82)

['3.0', '0.0', '4.0', '2.0', '1.0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,90266,90266,0
1,0.0,180116,199282,19166
2,4.0,41959,41959,0
3,2.0,132252,132304,52
4,1.0,155216,155216,0
5,,19218,0,0



['MAX_VSEV', 19513]
19513


array(['2', '2', '2', ..., '2', '2', '2'], dtype=object)

(619027, 82)

['2', '1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,392571,412084,19513
1,1.0,104043,104043,0
2,0.0,102900,102900,0
3,,19513,0,0



['NUM_INJV', 19513]
19513


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['3', '1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,392587,412100,19513
1,1.0,144920,144920,0
2,0.0,61948,61948,0
3,2.0,59,59,0
4,,19513,0,0



['INJ_SEV', 21474]
21474


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['3', '1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,427555,448904,21349
1,1.0,87001,87004,3
2,0.0,82997,83119,122
3,,21474,0,0



['BODY_TYP', 21859]
21859


array(['5', '5', '5', ..., '1', '5', '5'], dtype=object)

(619027, 82)

['1', '2', '5', '0', '3', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,217572,221331,3759
1,2.0,59376,59376,0
2,5.0,134955,153055,18100
3,0.0,46961,46961,0
4,3.0,92952,92952,0
5,4.0,45352,45352,0
6,,21859,0,0



['NUMOCCS', 22142]
22142


array(['3', '3', '3', ..., '3', '3', '3'], dtype=object)

(619027, 82)

['3', '5', '6', '1', '2', '0', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,326882,348270,21388
1,5.0,62203,62203,0
2,6.0,51666,51666,0
3,1.0,147112,147866,754
4,2.0,6974,6974,0
5,0.0,1565,1565,0
6,4.0,483,483,0
7,,22142,0,0



['SEX', 25579]
25579


array(['1', '1', '1', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,322987,345479,22492
1,0.0,270461,273548,3087
2,,25579,0,0



['WEATHER', 28669]
28669


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0', '2', '3', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,432490,461159,28669
1,0.0,2787,2787,0
2,2.0,54953,54953,0
3,3.0,89628,89628,0
4,4.0,10500,10500,0
5,,28669,0,0



['PCRASH4', 30069]
30069


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,566103,596172,30069
1,0.0,22855,22855,0
2,,30069,0,0



['TOWED', 35788]
35788


array(['2', '2', '2', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['2', '0', '3', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,341947,375169,33222
1,0.0,192534,195100,2566
2,3.0,29525,29525,0
3,1.0,19233,19233,0
4,,35788,0,0



['EJECTION', 35963]
35963


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,564298,600261,35963
1,0.0,18766,18766,0
2,,35963,0,0



['AGE', 39525]
39525


array(['2', '2', '2', ..., '2', '2', '2'], dtype=object)

(619027, 82)

['3', '2', '1', '0', '4']


Unnamed: 0,Value,Original,Imputed,Difference
0,3.0,111949,111949,0
1,2.0,355678,395203,39525
2,1.0,42134,42134,0
3,0.0,47951,47951,0
4,4.0,21790,21790,0
5,,39525,0,0



['VTRAFCON', 39795]
39795


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '3', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,360589,400366,39777
1,2.0,144621,144639,18
2,3.0,62181,62181,0
3,0.0,11841,11841,0
4,,39795,0,0



['VTCONT_F', 39981]
39981


array(['1', '1', '1', ..., '3', '1', '1'], dtype=object)

(619027, 82)

['1', '3', '4', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,360589,400462,39873
1,3.0,217411,217519,108
2,4.0,413,413,0
3,0.0,598,598,0
4,2.0,35,35,0
5,,39981,0,0



['VALIGN', 42739]
42739


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,512374,555113,42739
1,2.0,15593,15593,0
2,0.0,48321,48321,0
3,,42739,0,0



['RELJCT2', 43534]
43534


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0', '2', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,211368,251423,40055
1,0.0,166005,168063,2058
2,2.0,52868,52868,0
3,3.0,145252,146673,1421
4,,43534,0,0



['REST_USE', 55764]
55764


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0', '2']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,484005,539720,55715
1,0.0,41923,41972,49
2,2.0,37335,37335,0
3,,55764,0,0



['AIR_BAG', 62123]
62123


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,445939,508062,62123
1,0.0,110965,110965,0
2,,62123,0,0



['TYP_INT', 67861]
67861


array(['1', '1', '1', ..., '2', '2', '2'], dtype=object)

(619027, 82)

['1', '2', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,315373,343683,28310
1,2.0,165072,204623,39551
2,0.0,66083,66083,0
3,3.0,4638,4638,0
4,,67861,0,0



['VSPD_LIM', 83155]
83155


array(['2', '2', '7', ..., '2', '7', '2'], dtype=object)

(619027, 82)

['7', '1', '5', '2', '4', '0', '3']


Unnamed: 0,Value,Original,Imputed,Difference
0,7.0,109175,134613,25438
1,1.0,78944,81817,2873
2,5.0,102842,102842,0
3,2.0,115321,170165,54844
4,4.0,60440,60440,0
5,0.0,58578,58578,0
6,3.0,10572,10572,0
7,,83155,0,0



['VPROFILE', 84506]
84506


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '2', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,443798,528304,84506
1,2.0,58774,58774,0
2,0.0,31949,31949,0
3,,84506,0,0



['ALC_RES', 106366]
106366


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['0', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,0.0,510076,616442,106366
1,1.0,2585,2585,0
2,,106366,0,0



['ALC_STATUS', 106366]
106366


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,502521,608887,106366
1,0.0,10140,10140,0
2,,106366,0,0



['VEH_ALCH', 107580]
107580


array(['1', '1', '1', ..., '1', '1', '1'], dtype=object)

(619027, 82)

['1', '0']


Unnamed: 0,Value,Original,Imputed,Difference
0,1.0,495982,603562,107580
1,0.0,15465,15465,0
2,,107580,0,0



['VTRAFWAY', 111189]
111189


array(['0', '0', '0', ..., '0', '0', '0'], dtype=object)

(619027, 82)

['2', '0', '4', '3', '1']


Unnamed: 0,Value,Original,Imputed,Difference
0,2.0,29236,29236,0
1,0.0,229494,336731,107237
2,4.0,43374,43374,0
3,3.0,118518,122470,3952
4,1.0,87216,87216,0
5,,111189,0,0
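
The per-feature tables above list, for each category, the count before imputation (`Original`), the count after (`Imputed`), and the change. The code that produced them is not shown here, so the following is only a minimal sketch of how such a table could be built with pandas. The function name `compare_counts` and the toy data are hypothetical. Unlike the output above, this sketch reports a negative difference in the missing-value row, because it subtracts the counts directly.

```python
import numpy as np
import pandas as pd


def compare_counts(original: pd.Series, imputed: pd.Series) -> pd.DataFrame:
    """Tabulate per-value counts before and after imputation (hypothetical helper)."""
    # Collect every non-missing category seen in either column, in order of appearance.
    values = pd.unique(pd.concat([original, imputed]).dropna())
    rows = [
        (v, int((original == v).sum()), int((imputed == v).sum()))
        for v in values
    ]
    # Missing values need isna(); an equality test against NaN never matches.
    rows.append((np.nan, int(original.isna().sum()), int(imputed.isna().sum())))
    table = pd.DataFrame(rows, columns=["Value", "Original", "Imputed"])
    table["Difference"] = table["Imputed"] - table["Original"]
    return table


# Toy column with two missing entries, imputed here by the mode ("0").
original = pd.Series(["0", "0", "1", np.nan, "0", np.nan], dtype=object)
imputed = original.fillna("0")
result = compare_counts(original, imputed)
print(result)
```

In this toy case, both missing entries go to the mode, which is the same pattern seen in features such as `WEATHER` and `AIR_BAG` above.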









[[201600014311, 2, 2],
 [201600014315, 4, 4],
 [201600014316, 1, 1],
 [201600014335, 2, 2],
 [201600014586, 1, 1],
 [201600014593, 5, 5],
 [201600014603, 4, 4],
 [201600014610, 2, 2],
 [201600014622, 3, 3],
 [201600014624, 1, 1],
 [201600015222, 2, 2],
 [201600015227, 4, 4],
 [201600015251, 1, 1],
 [201600015256, 2, 2],
 [201600015257, 1, 1],
 [201600015268, 1, 1],
 [201600015305, 3, 3],
 [201600015805, 3, 3],
 [201600015883, 3, 3],
 [201600015924, 3, 3],
 [201600015934, 2, 2],
 [201600015940, 1, 1],
 [201600015944, 2, 2],
 [201600015948, 1, 1],
 [201600015958, 1, 1],
 [201600015960, 2, 2],
 [201600015967, 1, 1],
 [201600016011, 3, 3],
 [201600016014, 2, 2],
 [201600016016, 4, 4],
 [201600016018, 5, 5],
 [201600016026, 1, 1],
 [201600016027, 2, 2],
 [201600016163, 3, 3],
 [201600016164, 2, 2],
 [201600016166, 2, 2],
 [201600016167, 2, 2],
 [201600016169, 1, 1],
 [201600016171, 3, 3],
 [201600016175, 3, 3],
 [201600016176, 3, 3],
 [201600016184, 1, 1],
 [201600016195, 2, 2],
 [201600016


['2', '4', '0', '5', '1', '3', '6', nan]


[['2', 150696, 150963],
 ['4', 73637, 73637],
 ['0', 22512, 22512],
 ['5', 59087, 59088],
 ['1', 106547, 106547],
 ['3', 164271, 165476],
 ['6', 40593, 40804],
 [nan, 0, 0]]
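
The `[nan, 0, 0]` row above reports zero missing values before imputation, although the earlier `HOUR` table lists 1684. The counting code is not shown, so this is an assumption: one plausible cause is counting with an equality test, which never matches `NaN`. The sketch below, on hypothetical toy data, contrasts that with `isna()` and `value_counts(dropna=False)`.

```python
import numpy as np
import pandas as pd

# Toy column with two missing entries (hypothetical data).
col = pd.Series(["2", "3", np.nan, "2", np.nan], dtype=object)

# NaN is not equal to anything, including itself, so this count is always 0.
count_by_equality = int((col == np.nan).sum())

# isna() counts the missing entries directly.
count_by_isna = int(col.isna().sum())

# value_counts(dropna=False) keeps NaN as its own category.
counts = col.value_counts(dropna=False)

print(count_by_equality, count_by_isna)
print(counts)
```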


[0, 1]


[[0, 555566, 555566], [1, 63461, 63461]]


['3', '1', '2', nan, '0']


[['3', 441683, 445496],
 ['1', 108464, 108465],
 ['2', 14716, 14716],
 [nan, 0, 0],
 ['0', 50350, 50350]]


[0, 1, 2]


[[0, 194101, 194101], [1, 208215, 208215], [2, 216711, 216711]]


[0, 1, 2]


[[0, 591304, 591304], [1, 26801, 26801], [2, 922, 922]]


[1, 2, 0]


[[1, 200622, 200622], [2, 348822, 348822], [0, 69583, 69583]]


[1, 0, 2]


[[1, 549305, 549305], [0, 60920, 60920], [2, 8802, 8802]]


['1', '0', '2', '3', nan]


[['1', 211368, 251423],
 ['0', 166005, 168063],
 ['2', 52868, 52868],
 ['3', 145252, 146673],
 [nan, 0, 0]]


[0, 1]


[[0, 615928, 615928], [1, 3099, 3099]]


[2, 1]


[[2, 137109, 137109], [1, 481918, 481918]]


[2, 1, 4, 3]


[[2, 431474, 431474], [1, 95724, 95724], [4, 23147, 23147], [3, 68682, 68682]]


['1', '0', nan, '2', '3', '4']


[['1', 432490, 461159],
 ['0', 2787, 2787],
 [nan, 0, 0],
 ['2', 54953, 54953],
 ['3', 89628, 89628],
 ['4', 10500, 10500]]


[1, 0]


[[1, 465260, 465260], [0, 153767, 153767]]


[0, 1, 2, 3]


[[0, 606925, 606925], [1, 11193, 11193], [2, 750, 750], [3, 159, 159]]


[2, 1, 4, 3]


[[2, 423804, 423804],
 [1, 107796, 107796],
 [4, 21733, 21733],
 [3, 65694, 65694]]


[0, 1]


[[0, 603745, 603745], [1, 15282, 15282]]


[0, 1]


[[0, 589234, 589234], [1, 29793, 29793]]


['0', '4', '1', nan, '2', '3']


[['0', 286580, 291576],
 ['4', 28200, 28200],
 ['1', 191273, 191898],
 [nan, 0, 0],
 ['2', 75486, 75486],
 ['3', 31867, 31867]]


[4, 0, 2, 1, 3]


[[4, 132084, 132084],
 [0, 118700, 118700],
 [2, 118444, 118444],
 [1, 115378, 115378],
 [3, 134421, 134421]]


[4, 2, 0, 1, 3]


[[4, 126842, 126842],
 [2, 124293, 124293],
 [0, 120046, 120046],
 [1, 124066, 124066],
 [3, 123780, 123780]]


['4', '0', '1', '3', '2', nan]


[['4', 73402, 73402],
 ['0', 23800, 23800],
 ['1', 114521, 114613],
 ['3', 227013, 228665],
 ['2', 177614, 178547],
 [nan, 0, 0]]


['1', '3', '2', '0', nan]


[['1', 504392, 504461],
 ['3', 51187, 51335],
 ['2', 52012, 52012],
 ['0', 11219, 11219],
 [nan, 0, 0]]


['1', nan, '2', '0', '3']


[['1', 315373, 343683],
 [nan, 0, 0],
 ['2', 165072, 204623],
 ['0', 66083, 66083],
 ['3', 4638, 4638]]


[2016, 2017, 2018, 2019, 2020]


[[2016, 113405, 113405],
 [2017, 133408, 133408],
 [2018, 115774, 115774],
 [2019, 129980, 129980],
 [2020, 126460, 126460]]


[4, 2, 3, 1]


[[4, 102472, 102472],
 [2, 109989, 109989],
 [3, 336845, 336845],
 [1, 69721, 69721]]


[2, 9, 1, 8]


[[2, 461409, 461409], [9, 132204, 132204], [1, 25343, 25343], [8, 71, 71]]


[0, 1, 3, 2, 9, 4, 5, 6]


[[0, 286569, 286569],
 [1, 155312, 155312],
 [3, 64571, 64571],
 [2, 92607, 92607],
 [9, 5621, 5621],
 [4, 11673, 11673],
 [5, 2663, 2663],
 [6, 11, 11]]


[0, 1, 8, 9]


[[0, 437984, 437984], [1, 19460, 19460], [8, 161319, 161319], [9, 264, 264]]


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]


[[1, 336769, 336769],
 [2, 247932, 247932],
 [3, 27495, 27495],
 [4, 5202, 5202],
 [5, 1107, 1107],
 [6, 299, 299],
 [7, 107, 107],
 [8, 57, 57],
 [9, 27, 27],
 [10, 9, 9],
 [11, 6, 6],
 [12, 4, 4],
 [13, 6, 6],
 [14, 2, 2],
 [15, 5, 5]]


['4', '3', '1', '0', '2', nan]


[['4', 134507, 135540],
 ['3', 137956, 138691],
 ['1', 85650, 86022],
 ['0', 118949, 120370],
 ['2', 138248, 138404],
 [nan, 0, 0]]


['1', '2', '5', '0', nan, '3', '4']


[['1', 217572, 221331],
 ['2', 59376, 59376],
 ['5', 134955, 153055],
 ['0', 46961, 46961],
 [nan, 0, 0],
 ['3', 92952, 92952],
 ['4', 45352, 45352]]


['1', '2', nan, '0']


[['1', 609511, 616201], ['2', 2699, 2699], [nan, 0, 0], ['0', 127, 127]]


['0', '1', nan]


[['0', 585969, 597216], ['1', 21803, 21811], [nan, 0, 0]]


['1', '0', nan]


[['1', 618873, 618893], ['0', 134, 134], [nan, 0, 0]]


['1', nan, '0', '2']


[['1', 611771, 617091], [nan, 0, 0], ['0', 1186, 1186], ['2', 750, 750]]


[1, 0]


[[1, 617742, 617742], [0, 1285, 1285]]


['1', nan, '2', '0']


[['1', 618782, 618887], [nan, 0, 0], ['2', 133, 133], ['0', 7, 7]]


[0, 1]


[[0, 618782, 618782], [1, 245, 245]]


['0', '1', nan]


[['0', 618782, 618812], ['1', 214, 215], [nan, 0, 0]]


['1', '2', '0', nan]


[['1', 618782, 618830], ['2', 156, 156], ['0', 41, 41], [nan, 0, 0]]


['0', '1', nan]


[['0', 586963, 586992], ['1', 32035, 32035], [nan, 0, 0]]


['3', '0', '1', '2', '4', '5', nan]


[['3', 87388, 87388],
 ['0', 57966, 57966],
 ['1', 256647, 269620],
 ['2', 44423, 44423],
 ['4', 135680, 135995],
 ['5', 23635, 23635],
 [nan, 0, 0]]


[1, 2, 0]


[[1, 603256, 603256], [2, 15472, 15472], [0, 299, 299]]


['1', '0', '2', nan]


[['1', 501517, 501604], ['0', 62275, 62419], ['2', 55004, 55004], [nan, 0, 0]]


[2, 4, 3, 1, 0]


[[2, 124713, 124713],
 [4, 124337, 124337],
 [3, 123879, 123879],
 [1, 122924, 122924],
 [0, 123174, 123174]]


['0', '4', '8', '6', nan, '7', '2', '1', '3', '5']


[['0', 120860, 121560],
 ['4', 71217, 71217],
 ['8', 122439, 135067],
 ['6', 81273, 81273],
 [nan, 0, 0],
 ['7', 24997, 24997],
 ['2', 76578, 76578],
 ['1', 86857, 86857],
 ['3', 12328, 12328],
 ['5', 9150, 9150]]


['2', '1', '0', nan]


[['2', 392571, 412084],
 ['1', 104043, 104043],
 ['0', 102900, 102900],
 [nan, 0, 0]]


[1, 4, 3, 0, 2]


[[1, 145569, 145569],
 [4, 136958, 136958],
 [3, 122451, 122451],
 [0, 101920, 101920],
 [2, 112129, 112129]]


['3', '1', '0', nan, '2']


[['3', 392587, 412100],
 ['1', 144920, 144920],
 ['0', 61948, 61948],
 [nan, 0, 0],
 ['2', 59, 59]]


['3', '5', nan, '6', '1', '2', '0', '4']


[['3', 326882, 348270],
 ['5', 62203, 62203],
 [nan, 0, 0],
 ['6', 51666, 51666],
 ['1', 147112, 147866],
 ['2', 6974, 6974],
 ['0', 1565, 1565],
 ['4', 483, 483]]


['5', '1', '2', '4', nan, '0', '3']


[['5', 62674, 62674],
 ['1', 307073, 317007],
 ['2', 62808, 62808],
 ['4', 94273, 94302],
 [nan, 0, 0],
 ['0', 41797, 41797],
 ['3', 40439, 40439]]


['5', '4', '1', '0', '2', '3', nan]


[['5', 123122, 124817],
 ['4', 75942, 75942],
 ['1', 100941, 101245],
 ['0', 122557, 123637],
 ['2', 71215, 71215],
 ['3', 120758, 122171],
 [nan, 0, 0]]


['1', '0', nan]


[['1', 566103, 596172], ['0', 22855, 22855], [nan, 0, 0]]


['2', '3', '1', nan, '4', '0']


[['2', 66116, 66116],
 ['3', 472322, 475142],
 ['1', 66664, 66772],
 [nan, 0, 0],
 ['4', 9782, 9782],
 ['0', 1215, 1215]]


[1, 0]


[[1, 600759, 600759], [0, 18268, 18268]]


['1', '2', nan, '0']


[['1', 599758, 611713], ['2', 4262, 4262], [nan, 0, 0], ['0', 3052, 3052]]


['1', '0', nan]


[['1', 570294, 581872], ['0', 37155, 37155], [nan, 0, 0]]


['0', '1', nan]


[['0', 602193, 603236], ['1', 15791, 15791], [nan, 0, 0]]


['2', '0', nan, '3', '1']


[['2', 341947, 375169],
 ['0', 192534, 195100],
 [nan, 0, 0],
 ['3', 29525, 29525],
 ['1', 19233, 19233]]


['1', '2', nan, '0']


[['1', 512374, 555113], ['2', 15593, 15593], [nan, 0, 0], ['0', 48321, 48321]]


['1', nan, '0']


[['1', 495982, 603562], [nan, 0, 0], ['0', 15465, 15465]]


['1', '2', '0', nan]


[['1', 443798, 528304], ['2', 58774, 58774], ['0', 31949, 31949], [nan, 0, 0]]


['7', '1', '5', '2', '4', '0', nan, '3']


[['7', 109175, 134613],
 ['1', 78944, 81817],
 ['5', 102842, 102842],
 ['2', 115321, 170165],
 ['4', 60440, 60440],
 ['0', 58578, 58578],
 [nan, 0, 0],
 ['3', 10572, 10572]]


['1', '3', '2', nan, '0']


[['1', 493663, 508134],
 ['3', 27839, 27839],
 ['2', 81404, 81404],
 [nan, 0, 0],
 ['0', 1650, 1650]]


[nan, '1', '3', '4', '0', '2']


[[nan, 0, 0],
 ['1', 360589, 400462],
 ['3', 217411, 217519],
 ['4', 413, 413],
 ['0', 598, 598],
 ['2', 35, 35]]


[nan, '1', '2', '3', '0']


[[nan, 0, 0],
 ['1', 360589, 400366],
 ['2', 144621, 144639],
 ['3', 62181, 62181],
 ['0', 11841, 11841]]


[nan, '2', '0', '4', '3', '1']


[[nan, 0, 0],
 ['2', 29236, 29236],
 ['0', 229494, 336731],
 ['4', 43374, 43374],
 ['3', 118518, 122470],
 ['1', 87216, 87216]]


['3', '2', '1', nan, '0', '4']


[['3', 111949, 111949],
 ['2', 355678, 395203],
 ['1', 42134, 42134],
 [nan, 0, 0],
 ['0', 47951, 47951],
 ['4', 21790, 21790]]


['1', '0', nan]


[['1', 445939, 508062], ['0', 110965, 110965], [nan, 0, 0]]


[nan, '0', '1']


[[nan, 0, 0], ['0', 510076, 616442], ['1', 2585, 2585]]


[nan, '1', '0']


[[nan, 0, 0], ['1', 502521, 608887], ['0', 10140, 10140]]


['1', '0', nan]


[['1', 564298, 600261], ['0', 18766, 18766], [nan, 0, 0]]


[0, 5, 8, 6, 3, 9, 2, 1, 4]


[[0, 513792, 513792],
 [5, 51980, 51980],
 [8, 11280, 11280],
 [6, 3744, 3744],
 [3, 26817, 26817],
 [9, 1024, 1024],
 [2, 551, 551],
 [1, 2368, 2368],
 [4, 7471, 7471]]


['3', '1', '0', nan]


[['3', 427555, 448904], ['1', 87001, 87004], ['0', 82997, 83119], [nan, 0, 0]]


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75]


[[1, 456454, 456454],
 [2, 108263, 108263],
 [3, 34168, 34168],
 [4, 13352, 13352],
 [5, 4499, 4499],
 [6, 1303, 1303],
 [7, 485, 485],
 [8, 201, 201],
 [9, 86, 86],
 [10, 49, 49],
 [11, 40, 40],
 [12, 24, 24],
 [13, 14, 14],
 [14, 11, 11],
 [15, 7, 7],
 [16, 2, 2],
 [17, 2, 2],
 [18, 2, 2],
 [19, 2, 2],
 [20, 2, 2],
 [21, 2, 2],
 [22, 2, 2],
 [23, 2, 2],
 [24, 2, 2],
 [25, 2, 2],
 [26, 2, 2],
 [27, 1, 1],
 [28, 1, 1],
 [29, 1, 1],
 [30, 1, 1],
 [31, 1, 1],
 [32, 1, 1],
 [33, 1, 1],
 [34, 1, 1],
 [35, 1, 1],
 [36, 1, 1],
 [37, 1, 1],
 [38, 1, 1],
 [39, 1, 1],
 [40, 1, 1],
 [41, 1, 1],
 [42, 1, 1],
 [43, 1, 1],
 [44, 1, 1],
 [45, 1, 1],
 [46, 1, 1],
 [47, 1, 1],
 [48, 1, 1],
 [49, 1, 1],
 [50, 1, 1],
 [51, 1, 1],
 [52, 1, 1],
 [53, 1, 1],
 [54, 1, 1],
 [55, 1, 1],
 [56, 1, 1],
 [57, 1, 1],
 [58, 1, 1],
 [59, 1, 1],
 [60, 1, 1],
 [61, 1, 1],
 [62, 1, 1],
 [63, 1, 1],
 [64, 1, 1],
 [65, 1, 1],
 [66, 1, 1],
 [67, 1, 1],
 [68, 1, 1],
 [69, 1, 1],
 [70, 1, 1],
 [71, 1, 1],
 [72, 1, 1],
 [73,


[2, 1, 0]


[[2, 456292, 456292], [1, 162609, 162609], [0, 126, 126]]


[1, 0]


[[1, 574413, 574413], [0, 44614, 44614]]


['1', '0', nan, '2']


[['1', 484005, 539720], ['0', 41923, 41972], [nan, 0, 0], ['2', 37335, 37335]]


['3', '0', '4', '2', '1', nan]


[['3', 456233, 463461],
 ['0', 4617, 4617],
 ['4', 40265, 40401],
 ['2', 25433, 25433],
 ['1', 84659, 85115],
 [nan, 0, 0]]


['1', '0', nan]


[['1', 322987, 345479], ['0', 270461, 273548], [nan, 0, 0]]


['3.0', '0.0', '4.0', '2.0', nan, '1.0']


[['3.0', 90266, 90266],
 ['0.0', 180116, 199282],
 ['4.0', 41959, 41959],
 ['2.0', 132252, 132304],
 [nan, 0, 0],
 ['1.0', 155216, 155216]]




0