# ZerOne Recruitment Assignment
## By Omid Kayhani - Submitted on Jul.11, 2020
This Jupyter notebook solves the problems and presents the results step by step.

In [2]:
# Importing the required packages
import pandas as pd
import numpy as np
import researchpy as rpy
from scipy.stats import linregress

## Problem 1: The playing card game
The problem can be solved in two ways: analytically or experimentally. The analytical approach did not seem straightforward, so the experimental (Monte Carlo) approach is used here. The deck is shuffled a number of times (10,000 by default), the cards are drawn in order, and the points for each game are calculated. When the number of experiments is large enough, the sample statistics approximate those of the infinite number of games that an analytical solution describes.

Let us first define a function that simulates games for a given number of cards and suits in the deck and returns a Pandas dataframe comprising all the games played.

In [3]:
def GamesGen(N, M, NoE=int(1e4)):
    """
    This is a function that generates a pandas dataframe that logs the games played over the designated number of
    experiments with the last column indicating the points acquired in that attempt.
    
    The function accepts the following arguments:
    N = number of cards in the deck
    M = number of suits in the deck
    NoE = number of experiments. Defaults to 10000 if not specified.
    
    The code will raise an exception if the number of cards (N) is not divisible by the number of suits (M)
    """
    rem = N % M
    if rem != 0:
        raise Exception("Cannot split the cards among the suits uniformly. Please indicate another combo.")
    
    # Number of suit members
    nm = int(N/M)
    
    # Number of draws (equal to the number of cards in the deck)
    draws = pd.DataFrame(columns=range(N+1)).drop([0],axis=1)
    
    # making the deck of cards
    deck = np.empty([1,draws.shape[1]], dtype=int)
#     deck = pd.Series([])
    for i in range(M):
        x = np.empty([1,nm],dtype=int)[0,:]
        x.fill(i+1)
        deck[0,np.array(range(nm))+i*nm] = x
    
    # Making a dataframe of all games played
    for i in range(NoE):
        experiment = np.random.permutation(deck.reshape(deck.size,1)).reshape(1,deck.size)
        ex_df = pd.DataFrame(experiment, columns=draws.columns)
        draws = pd.concat([draws, ex_df], axis=0, ignore_index=True)
        
    # Incidents of comparing the drawn card's suit with the previous one's
    incidents = pd.DataFrame(index=draws.index)
    for i in range(N-1):
        x = draws.iloc[:,i+1] - draws.iloc[:,i]
        x = pd.DataFrame(x, columns=[str(i+1)])
        incidents = pd.concat([incidents,x],axis=1)
        
    # Games played and their acquired points
    games = draws
    games['P'] = (incidents[incidents.columns] == 0).sum(axis=1)
    
    return games
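
The same experiment can also be sketched without growing a dataframe one game at a time. The block below is a minimal vectorized variant, not the function used for the results in this notebook; the name `simulate_points` and its `seed` argument are assumptions introduced for illustration. It draws one random permutation per game and counts adjacent cards that share a suit.

```python
import numpy as np

def simulate_points(N, M, n_games=10_000, seed=0):
    """Return an array with the points scored in each of n_games shuffled decks."""
    if N % M != 0:
        raise ValueError("N must be divisible by M")
    rng = np.random.default_rng(seed)
    # Ordered deck: suit labels 1..M, each repeated N // M times
    deck = np.repeat(np.arange(1, M + 1), N // M)
    # Sorting a row of uniform random numbers yields one random permutation per game
    order = np.argsort(rng.random((n_games, N)), axis=1)
    games = deck[order]
    # A point is scored whenever a card has the same suit as the previous card
    return (games[:, 1:] == games[:, :-1]).sum(axis=1)

points = simulate_points(26, 2)
print(points.mean(), points.std(ddof=1))
```

Because all games are held in a single array, this avoids the repeated `pd.concat` calls, which become slow as the number of experiments grows.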

### Questions 1, 2, and 5
We can now use the function for M=2 and N=26 and calculate the mean of the points acquired as well as their standard deviation. The conditional probability can also be computed at this point.

In [4]:
df1 = GamesGen(26,2)

In [5]:
df1.head()

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,...,18,19,20,21,22,23,24,25,26,P
0,1,2,1,1,1,1,2,2,1,2,...,1,2,1,1,2,2,2,2,1,13
1,2,1,2,2,1,2,2,2,2,2,...,2,1,1,1,1,1,1,2,1,12
2,2,2,1,2,1,1,2,2,1,1,...,1,2,1,1,2,1,2,1,1,10
3,1,2,2,2,1,1,2,1,2,1,...,1,1,2,1,1,2,2,2,2,12
4,1,2,1,1,1,1,2,2,2,2,...,2,2,1,2,1,2,2,1,2,12


In [6]:
print('The mean of points to get for N=26 and M=2 is', df1.P.mean(axis=0))
print('The standard deviation of points to get for N=26 and M=2 is', df1.P.std(axis=0))
print('The conditional probability of P>12 given P>6 is',round(df1[df1.P>12].shape[0]/df1[df1.P>6].shape[0]*100, 2))

The mean of points to get for N=26 and M=2 is 12.0257
The standard deviation of points to get for N=26 and M=2 is 2.511029676647326
The conditional probability of P>12 given P>6 is 42.48
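
Note that the last value printed above is a percentage. As a small illustration of how the conditional probability is estimated, the sketch below restricts the sample to the games satisfying the condition and takes the share of those that also exceed the higher threshold. The helper `conditional_prob` and the sample array are made up for this example.

```python
import numpy as np

def conditional_prob(points, a, b):
    """Estimate P(points > a | points > b) from an array of simulated points."""
    points = np.asarray(points)
    given = points > b                 # games that satisfy the condition
    return (points[given] > a).mean()  # share of those that also exceed a

sample = np.array([5, 7, 10, 13, 14, 12])
print(conditional_prob(sample, 12, 6))
```

This is equivalent to the count ratio used in the cell above, since every game with P>12 also has P>6.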


### Questions 3, 4, and 6
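This deck has twice as many cards and twice as many suits as the previous one, so each suit still has 13 cards. By linearity of expectation, each of the N-1 adjacent pairs matches with probability (N/M-1)/(N-1), which gives an expected N/M-1 = 12 points for both decks. We can therefore infer that the mean value of the points would be the same.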
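
As a cross-check on the simulated means (not on the standard deviations or the conditional probabilities), the expected number of points can be computed exactly. This is a small sketch; the function name `expected_points` is an assumption made for illustration.

```python
from fractions import Fraction

def expected_points(N, M):
    """Exact mean number of points, by linearity of expectation."""
    per_suit = N // M
    # After fixing one card, per_suit - 1 of the remaining N - 1 cards share its suit
    p_match = Fraction(per_suit - 1, N - 1)
    # A shuffled deck has N - 1 adjacent pairs, each matching with probability p_match
    return (N - 1) * p_match

print(expected_points(26, 2), expected_points(52, 4))
```

Both decks have 13 cards per suit, so both expectations equal 12, in line with the simulated means reported in this notebook.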

In [7]:
df2 = GamesGen(52,4)

In [8]:
df2.head()

Unnamed: 0,1,2,3,4,5,6,7,8,9,10,...,44,45,46,47,48,49,50,51,52,P
0,4,4,4,4,4,2,4,2,3,2,...,2,3,2,3,1,4,1,1,1,13
1,3,4,2,3,1,4,3,3,4,4,...,1,3,1,1,2,1,1,4,3,15
2,4,4,1,4,2,4,3,3,2,2,...,2,3,2,3,4,2,4,1,3,8
3,2,1,2,3,1,1,2,1,2,2,...,2,4,1,2,4,3,3,3,4,9
4,2,3,3,3,2,4,4,1,1,1,...,4,4,2,3,4,1,2,2,2,16


In [9]:
print('Frequency of different points acquired throughout the attempts:')
df2.P.value_counts()

Frequency of different points acquired throughout the attempts:


12    1324
11    1228
13    1208
10    1162
14     981
9      836
15     792
8      576
16     549
17     335
7      325
18     212
6      179
19     104
5       68
20      46
21      23
4       22
3       11
22      11
23       4
24       2
28       1
1        1
Name: P, dtype: int64

In [10]:
print('The mean of points to get for N=52 and M=4 is', df2.P.mean(axis=0))
print('The standard deviation of points to get for N=52 and M=4 is', df2.P.std(axis=0))
print('The conditional probability of P>12 given P>6 is',round(df2[df2.P>12].shape[0]/df2[df2.P>6].shape[0]*100, 2))

The mean of points to get for N=52 and M=4 is 12.0361
The standard deviation of points to get for N=52 and M=4 is 3.0334002277468444
The conditional probability of P>12 given P>6 is 43.91


## Problem 2: Traffic stops
The datasets for this problem are the logs of stopped cars in the two states of Montana (MT) and Vermont (VT). They include information about the profiles of the stopped drivers and the corresponding outcome of each stop, such as an arrest or a warning. Let us first load the datasets.

In [11]:
# Reading the datasets
mt = pd.read_csv('MT-clean.csv', low_memory=False)
vt = pd.read_csv('VT-clean.csv',low_memory=False)

### Meet and greet data
Let us first examine the datasets a little.

In [12]:
# The size of the dataset for Montana stops
mt.shape

(825118, 33)

In [13]:
# The size of the dataset for Vermont stops
vt.shape

(283285, 23)

In [14]:
# The attributes logged regarding each stop for Montana state
print(mt.columns)

# A preview of the Montana stops dataset
mt.head()

Index(['id', 'state', 'stop_date', 'stop_time', 'location_raw', 'county_name',
       'county_fips', 'fine_grained_location', 'police_department',
       'driver_gender', 'driver_age_raw', 'driver_age', 'driver_race_raw',
       'driver_race', 'violation_raw', 'violation', 'search_conducted',
       'search_type_raw', 'search_type', 'contraband_found', 'stop_outcome',
       'is_arrested', 'lat', 'lon', 'ethnicity', 'city', 'out_of_state',
       'vehicle_year', 'vehicle_make', 'vehicle_model', 'vehicle_style',
       'search_reason', 'stop_outcome_raw'],
      dtype='object')


Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,lon,ethnicity,city,out_of_state,vehicle_year,vehicle_make,vehicle_model,vehicle_style,search_reason,stop_outcome_raw
0,MT-2009-00001,MT,2009-01-01,02:10,CASCADE,Cascade County,30013.0,US 89 N MM10 (SB),,F,...,-111.802932,N,,False,1994,FORD,EXPLORER,SPORT UTILITY,,"TRAFFIC CITATION,WARNING"
1,MT-2009-00002,MT,2009-01-02,11:34,MISSOULA,Missoula County,30063.0,HWY 93 SO AND ANNS LANE S/B,,M,...,-114.081142,N,,False,1996,GMC,TK,TRUCK,,"INFFRACTION ARREST,WARNING"
2,MT-2009-00003,MT,2009-01-03,11:36,MISSOULA,Missoula County,30063.0,P007 HWY 93 MM 77 N/B,,M,...,-114.073505,N,,False,1999,GMC,YUKON,SPORT UTILITY,,INFFRACTION ARREST
3,MT-2009-00004,MT,2009-01-04,10:33,MISSOULA,Missoula County,30063.0,P007 HWY 93 MM 81 S/B,,F,...,-114.079027,,,False,2002,HOND,CR-V,SPORT UTILITY,,INFFRACTION ARREST
4,MT-2009-00005,MT,2009-01-04,10:46,MISSOULA,Missoula County,30063.0,P007 HWY 93 MM 81 N/B,,M,...,-114.07915,,,False,1992,TOYT,TERCEL,SEDAN,,INFFRACTION ARREST


In [15]:
# The attributes logged regarding each stop for Vermont state
print(vt.columns)

# A preview of the Vermont stops dataset
vt.head()

Index(['id', 'state', 'stop_date', 'stop_time', 'location_raw', 'county_name',
       'county_fips', 'fine_grained_location', 'police_department',
       'driver_gender', 'driver_age_raw', 'driver_age', 'driver_race_raw',
       'driver_race', 'violation_raw', 'violation', 'search_conducted',
       'search_type_raw', 'search_type', 'contraband_found', 'stop_outcome',
       'is_arrested', 'officer_id'],
      dtype='object')


Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,driver_race,violation_raw,violation,search_conducted,search_type_raw,search_type,contraband_found,stop_outcome,is_arrested,officer_id
0,VT-2010-00001,VT,2010-07-01,00:10,East Montpelier,Washington County,50023.0,COUNTY RD,MIDDLESEX VSP,M,...,White,Moving Violation,Moving violation,False,No Search Conducted,,False,Citation,False,-1562157000.0
1,VT-2010-00002,VT,2010-07-01,00:10,,,,COUNTY RD; Fitch Road,MIDDLESEX VSP,F,...,White,Externally Generated Stop,Other,False,No Search Conducted,,False,Arrest for Violation,True,-1562157000.0
2,VT-2010-00003,VT,2010-07-01,00:10,,,,COUNTY RD; Fitch Road,MIDDLESEX VSP,F,...,White,Externally Generated Stop,Other,False,No Search Conducted,,False,Arrest for Violation,True,-1562157000.0
3,VT-2010-00004,VT,2010-07-01,00:11,Whiting,Addison County,50001.0,N MAIN ST,NEW HAVEN VSP,F,...,White,Moving Violation,Moving violation,False,No Search Conducted,,False,Arrest for Violation,True,-312684400.0
4,VT-2010-00005,VT,2010-07-01,00:35,Hardwick,Caledonia County,50005.0,i91 nb mm 62,ROYALTON VSP,M,...,White,Moving Violation,Moving violation,False,No Search Conducted,,False,Written Warning,False,922566100.0


In [16]:
print('Missing values for different attributes of MT stops:')
mt.isnull().sum()

Missing values for different attributes of MT stops:


id                            0
state                         0
stop_date                    11
stop_time                    11
location_raw                  4
county_name                4056
county_fips                4056
fine_grained_location      3741
police_department        825118
driver_gender               119
driver_age_raw                0
driver_age                 3480
driver_race_raw             106
driver_race                2739
violation_raw                73
violation                    73
search_conducted              0
search_type_raw          822092
search_type              822092
contraband_found           3026
stop_outcome                 53
is_arrested                   0
lat                         436
lon                         436
ethnicity                    98
city                     549630
out_of_state               4699
vehicle_year               6610
vehicle_make               4268
vehicle_model             23805
vehicle_style             65362
search_r

In [17]:
print('Missing values for different attributes of VT stops:')
vt.isnull().sum()

Missing values for different attributes of VT stops:


id                            0
state                         0
stop_date                     0
stop_time                     0
location_raw                694
county_name                 705
county_fips                 705
fine_grained_location       347
police_department             0
driver_gender              1712
driver_age_raw             1171
driver_age                 1286
driver_race_raw            3984
driver_race                4817
violation_raw              2178
violation                  2178
search_conducted              0
search_type_raw            2240
search_type              279866
contraband_found             34
stop_outcome               2325
is_arrested                   0
officer_id                   12
dtype: int64

There are missing values present for various attributes of the datasets. We will handle them as needed for each of the questions below.

### Question 1
Proportion of the male stopped drivers in MT.

In [18]:
mt.driver_gender.value_counts()

M    556934
F    268065
Name: driver_gender, dtype: int64

In [19]:
mt_male = mt[mt.driver_gender=='M']
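
The cell above only selects the male drivers. A minimal sketch of the proportion itself follows, using the counts printed by `value_counts()` above. It assumes that stops with a missing gender are excluded from the denominator, which is a choice made for this example.

```python
# Counts copied from the value_counts() output above
male, female = 556934, 268065
known = male + female          # stops with a recorded gender
male_share = male / known
print(round(male_share, 4))

# pandas equivalent on the full frame (missing values are dropped by default):
# mt.driver_gender.value_counts(normalize=True)['M']
```

Dividing by `mt.shape[0]` instead would count the stops with a missing gender as non-male and give a marginally smaller value.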

### Question 2
'OOS' stands for 'Out of State'. Some stops led to arrests, while the others did not. Let us first get a glimpse of these for MT.

In [20]:
mt.out_of_state.value_counts()

False    616778
True     203641
Name: out_of_state, dtype: int64

In [21]:
mt.is_arrested.value_counts()

False    807923
True      17195
Name: is_arrested, dtype: int64

Now, we want to see whether a plate from outside Montana is more likely to result in an arrest after a stop. Because both features are categorical, a chi-square test of association is appropriate; below we run both the likelihood-ratio (G) test and Pearson's chi-square test.

In [22]:
# Conducting the likelihood ratio chi-square test
crosstab, LR = rpy.crosstab(mt['out_of_state'], mt['is_arrested'], test= "g-test")

LR

| | G-test | results |
|---|---|---|
| 0 | Log-likelihood ratio ( 1.0) = | 125.0526 |
| 1 | p-value = | 0.0 |
| 2 | Cramer's phi = | 0.0123 |


In [23]:
# Conducting the Pearson's chi-square test of independence
crosstab, chi2, expected = rpy.crosstab(mt['out_of_state'], mt['is_arrested'], test= "chi-square", expected_freqs= True)

chi2

| | Chi-square test | results |
|---|---|---|
| 0 | Pearson Chi-square ( 1.0) = | 128.9325 |
| 1 | p-value = | 0.0 |
| 2 | Cramer's phi = | 0.0125 |


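The p-value is reported as 0.0, i.e. it is below the displayed precision, so the null hypothesis of independence is rejected and the two attributes are associated. Note, however, that Cramér's phi is only about 0.012, so the association is very weak.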
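
To make the Pearson statistic less of a black box, it can be recomputed by hand. The sketch below uses the observed counts from the contingency table that `rpy.crosstab` returns (displayed further down in this notebook) and assumes no continuity correction is applied.

```python
import numpy as np

# Observed counts: rows = out_of_state (False, True), columns = is_arrested (False, True)
observed = np.array([[604588, 12190],
                     [198773, 4868]], dtype=float)

n = observed.sum()
# Expected counts under independence: row total * column total / grand total
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
chi2_stat = ((observed - expected) ** 2 / expected).sum()
# For a 2x2 table, Cramer's phi reduces to sqrt(chi2 / n)
phi = np.sqrt(chi2_stat / n)
print(round(chi2_stat, 2), round(phi, 4))
```

With more than 800,000 stops, even a small difference in arrest rates produces a large statistic, which is why the effect size is worth reading alongside the p-value.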

Now, I must say that the term 'factor increase' is a little vague to me. I would first think of it as a percent change, but the attributes are categorical (this could be resolved by counting arrests periodically), and we are comparing two different attributes with each other. Therefore, let us assume it is defined as the ratio of the number of OOS plates for which there was an arrest to the number of OOS plates for which no arrest occurred. Using the contingency table derived a while ago, the factor increase would be as follows:

In [24]:
crosstab

| out_of_state | is_arrested = False | is_arrested = True | All |
|---|---|---|---|
| False | 604588 | 12190 | 616778 |
| True | 198773 | 4868 | 203641 |
| All | 803361 | 17058 | 820419 |


In [25]:
print('Factor increase of arrests from OOS plates:', round(crosstab.iloc[1,1]/crosstab.iloc[1,0],2))

Factor increase of arrests from OOS plates: 0.02
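
An alternative reading of 'factor increase', offered here only as an assumption about what the question may intend, is the arrest rate for OOS plates divided by the arrest rate for in-state plates. A sketch with the counts from the contingency table above:

```python
# Counts copied from the contingency table above
oos_arrests, oos_total = 4868, 203641
in_state_arrests, in_state_total = 12190, 616778

oos_rate = oos_arrests / oos_total                  # share of OOS stops ending in arrest
in_state_rate = in_state_arrests / in_state_total   # same share for in-state stops
relative_risk = oos_rate / in_state_rate
print(round(relative_risk, 2))
```

Under this reading, a value above 1 means that OOS stops end in arrest more often than in-state stops.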


### Question 3
The stops in MT whose raw violation mentions speeding are selected below; their share of all stops is the requested proportion.

In [26]:
mt_speed = mt[mt.violation_raw.str.contains('SPEED',regex=True)==True]
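
The cell above only filters the rows. A minimal sketch of the proportion follows, on a tiny made-up sample standing in for `mt.violation_raw`. Treating missing violations as non-speeding (`na=False`) and ignoring case (`case=False`) are assumptions of this example; the original filter is case-sensitive.

```python
import pandas as pd

# Tiny made-up sample standing in for mt.violation_raw
violation_raw = pd.Series(['SPEED OVER LEGAL', 'SEAT BELT', None, 'speeding', 'DUI'])

# na=False counts missing entries as "not speeding"; case=False also matches lower case
is_speeding = violation_raw.str.contains('SPEED', case=False, na=False)
speed_share = is_speeding.mean()   # mean of a boolean Series is the share of True
print(speed_share)
```

On the real data, the same expression applied to `mt.violation_raw` would give the share over all 825,118 stops.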

### Question 4
Let us calculate how much more likely a stop is to involve Driving Under the Influence (DUI) in MT than in VT.

In [27]:
print('Number of stops due to DUI in MT:')
print(mt.violation.str.contains('DUI',regex=True).value_counts())
print('Number of stops due to DUI in VT:')
print(vt.violation.str.contains('DUI',regex=True).value_counts())

Number of stops due to DUI in MT:
False    816131
True       8914
Name: violation, dtype: int64
Number of stops due to DUI in VT:
False    280358
True        749
Name: violation, dtype: int64


With a similar understanding of the term 'factor increase' as before, we calculate the ratio of the DUI odds (DUI-related stops over all other stops) in MT to the same odds in VT.

In [28]:
print('The factor increase of DUI-related stops in MT over VT', round((8914/816131)/(749/280358),2))

The factor increase of DUI-related stops in MT over VT 4.09
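
The figure above is a ratio of odds. For comparison, the sketch below also computes the ratio of proportions (DUI stops over all stops with a known violation), using the counts printed earlier. Which of the two the question intends is an open assumption.

```python
# Counts copied from the value_counts() output above
mt_dui, mt_other = 8914, 816131
vt_dui, vt_other = 749, 280358

# Ratio of odds (DUI : non-DUI), which is what the cell above computes
odds_ratio = (mt_dui / mt_other) / (vt_dui / vt_other)
# Ratio of proportions (DUI share of all stops with a known violation)
rate_ratio = (mt_dui / (mt_dui + mt_other)) / (vt_dui / (vt_dui + vt_other))
print(round(odds_ratio, 2), round(rate_ratio, 2))
```

Because DUI stops are rare in both states, the two ratios are close to each other.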


### Question 5
Let us first get the stops in 2020.

In [29]:
type(mt.stop_date[0])

str

In [30]:
mt_2020 = mt[mt.stop_date.str.contains('2020',regex=True)==True]

mt_2020.head()

Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,lon,ethnicity,city,out_of_state,vehicle_year,vehicle_make,vehicle_model,vehicle_style,search_reason,stop_outcome_raw


The latest stops recorded in the dataset occurred in 2016 (as mentioned in the README file of the data repo), so the filter above returns an empty dataframe. We can solve the same question for 2016 for the sake of demonstration.

In [31]:
mt_2016 = mt[mt.stop_date.str.contains('2016',regex=True)==True]

mt_2016.head()

Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,lon,ethnicity,city,out_of_state,vehicle_year,vehicle_make,vehicle_model,vehicle_style,search_reason,stop_outcome_raw
723010,MT-2016-000001,MT,2016-01-01,00:02,CASCADE,Cascade County,30013.0,700 BLOCK OF 57TH ST SOUTH,,M,...,-111.215375,N,GREAT FALLS,True,2015,NISSAN,ROGUE,SPORT UTILITY,,WARNING
723011,MT-2016-000002,MT,2016-01-01,00:03,FERGUS,Fergus County,30027.0,MM1 US 191,,F,...,-109.428566,N,NOT IN CITY LIMITS,False,2009,SUBARU (SUBA),FOR,HATCHBACK,,WARNING
723012,MT-2016-000003,MT,2016-01-01,00:10,YELLOWSTONE,Yellowstone County,30111.0,"STATE ST AND SUGAR AVE, BLGS",,F,...,-108.494791,N,,False,1996,MERCURY (MERC),MARQUIS,SEDAN,,"TRAFFIC CITATION,WARNING"
723013,MT-2016-000004,MT,2016-01-01,00:10,LEWIS AND CLARK,Lewis And Clark County,30049.0,N MONTANA AVE SB NEAR TOWNSEND AVE.,,M,...,-112.020485,N,HELENA,False,2010,SUBAR,FORESTER,SPORT UTILITY,,WARNING
723014,MT-2016-000005,MT,2016-01-01,00:11,RAVALLI,Ravalli County,30081.0,US 93 SB MM 53,,M,...,-114.151441,N,,False,2014,HONDA (HOND),ODYSSEY,VAN,,"TRAFFIC CITATION,WARNING"


In [32]:
mt_2016[mt_2016.vehicle_year.notnull()].vehicle_year.astype('int32')

ValueError: invalid literal for int() with base 10: 'UNK'

In [33]:
pd.DataFrame(mt_2016.vehicle_year.value_counts()).loc['UNK']

vehicle_year    195
Name: UNK, dtype: int64

In [34]:
mt_2016[mt_2016.vehicle_year.notnull()].vehicle_year[mt_2016.vehicle_year!='UNK'].astype('int32')

723010    2015
723011    2009
723012    1996
723013    2010
723014    2014
          ... 
825102    1995
825103    1999
825104    1993
825105    1996
825106    1997
Name: vehicle_year, Length: 101297, dtype: int32

In [35]:
print('The average of stopped vehicle manufacture year for 2016:',mt_2016[mt_2016.vehicle_year.notnull()].vehicle_year[mt_2016.vehicle_year!='UNK'].astype('int32').mean(axis=0))

The average of stopped vehicle manufacture year for 2016: 2005.8721284934402
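
The two-step filter (drop nulls, then drop 'UNK') can also be written as a single conversion. The sketch below uses `pd.to_numeric` with `errors='coerce'` on a made-up sample standing in for `mt_2016.vehicle_year`; it assumes that every non-numeric entry should be discarded, not only 'UNK'.

```python
import pandas as pd

# Made-up sample standing in for mt_2016.vehicle_year
vehicle_year = pd.Series(['2015', 'UNK', None, '1996'])

# errors='coerce' turns anything non-numeric (such as 'UNK') into NaN
years = pd.to_numeric(vehicle_year, errors='coerce').dropna().astype('int32')
mean_year = years.mean()
print(mean_year)
```

This keeps the cleaning logic in one place, so the cleaned column can be reused for both the mean and the regression.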


To fit a regression, we take x to be the vehicle's year of manufacture and y to be the number of stops recorded for vehicles of that year.

In [36]:
xy = pd.DataFrame(mt_2016[mt_2016.vehicle_year.notnull()].vehicle_year[mt_2016.vehicle_year!='UNK'].astype('int32').value_counts())
xy = xy.reset_index().rename(columns={'index':'vehicle_year', 'vehicle_year':'stops'})
x = xy.vehicle_year
y = xy.stops

slope, intercept, r_value, p_value, std_err = linregress(x, y)
print('The p-value for the regression is:', p_value)

The p-value for the regression is: 1.2627895092894874e-13
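
The column names produced by `value_counts().reset_index()` differ between pandas versions, so the `rename` step above may not behave the same everywhere. Below is a sketch of a way to build the same (year, stops) table that should not depend on those default names; the sample data is made up.

```python
import pandas as pd

# Made-up model years standing in for the cleaned 2016 vehicle_year column
years = pd.Series([2010, 2010, 2012, 2015, 2015, 2015], name='vehicle_year')

xy = (years.value_counts()
           .rename_axis('vehicle_year')      # name the index explicitly
           .reset_index(name='stops')        # name the counts column explicitly
           .sort_values('vehicle_year', ignore_index=True))
print(xy)
```

The resulting `xy.vehicle_year` and `xy.stops` columns can then be passed to `linregress` as in the cell above.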


### Question 5
The question is a little vague, but we can interpret it as the difference between the maximum and the minimum number of daily stops in each state, which is implemented as follows:
* First, we calculate the number of daily stops in each dataset by taking a value count of 'stop_date' and storing it in another dataframe.
* Then, we subtract the min value from the max value for each of the dataframes.


In [37]:
daily_stops_mt = pd.DataFrame(mt.stop_date.value_counts()).sort_values(by=['stop_date'], ascending=False)
daily_stops_vt = pd.DataFrame(vt.stop_date.value_counts()).sort_values(by=['stop_date'], ascending=False)
print('The difference between maximum daily stops and minimum daily stops in MT', max(daily_stops_mt.stop_date)-min(daily_stops_mt.stop_date))
print('The difference between maximum daily stops and minimum daily stops in VT', max(daily_stops_vt.stop_date)-min(daily_stops_vt.stop_date))

The difference between maximum daily stops and minimum daily stops in MT 950
The difference between maximum daily stops and minimum daily stops in VT 488
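
The two steps described above can be sketched more compactly by working on the counts Series directly. The sample dates below are made up and stand in for `mt.stop_date`.

```python
import pandas as pd

# Made-up stop dates standing in for mt.stop_date
stop_date = pd.Series(['2016-01-01', '2016-01-01', '2016-01-01', '2016-01-02', None])

daily = stop_date.value_counts()     # stops per day; missing dates are dropped
spread = daily.max() - daily.min()   # max daily stops minus min daily stops
print(spread)
```

Days without any recorded stop do not appear in the counts, so the minimum here is taken over days with at least one stop.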


### Question 6
One could use a dataset that maps the FIPS codes of Montana counties to their areas. The information, however, can be easily accessed on the Wikipedia page below:

https://en.wikipedia.org/wiki/List_of_counties_in_Montana

The answer can be found by scraping the table at the link above.

In [38]:
stop_counties = pd.DataFrame(mt.county_name.value_counts()).reset_index().drop(columns=['county_name']).rename(columns={"index": "County"})

In [39]:
stop_counties.head()

Unnamed: 0,County
0,Flathead County
1,Gallatin County
2,Yellowstone County
3,Cascade County
4,Missoula County


In [40]:
stop_counties['Area'] = np.zeros(stop_counties.shape[0])
stop_counties.head()

Unnamed: 0,County,Area
0,Flathead County,0.0
1,Gallatin County,0.0
2,Yellowstone County,0.0
3,Cascade County,0.0
4,Missoula County,0.0


In [41]:
page_url = 'https://en.wikipedia.org/wiki/List_of_counties_in_Montana'

table = pd.read_html(page_url, attrs={'class':'wikitable'})

mt_c = table[0][['County','Area']]
print(mt_c.shape)
mt_c.head()

(56, 2)


Unnamed: 0,County,Area
0,Beaverhead County,"5,543 sq mi(14,356 km2)"
1,Big Horn County,"4,995 sq mi(12,937 km2)"
2,Blaine County,"4,226 sq mi(10,945 km2)"
3,Broadwater County,"1,192 sq mi(3,087 km2)"
4,Carbon County,"2,048 sq mi(5,304 km2)"


In [42]:
stop_counties['Area'] = stop_counties['Area'].astype(object)  # the column will hold strings
for i in range(stop_counties.shape[0]):  # include the last row as well
    match = mt_c.Area[mt_c.County == stop_counties.County[i]]
    if not match.empty:
        stop_counties.loc[i, 'Area'] = match.iloc[0]  # .loc avoids chained assignment


In [43]:
stop_counties.head()

Unnamed: 0,County,Area
0,Flathead County,"5,099 sq mi(13,206 km2)"
1,Gallatin County,"2,507 sq mi(6,493 km2)"
2,Yellowstone County,"2,635 sq mi(6,825 km2)"
3,Cascade County,"2,698 sq mi(6,988 km2)"
4,Missoula County,"2,598 sq mi(6,729 km2)"
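
The row-by-row lookup can also be expressed as a join. The sketch below uses small made-up stand-ins for `stop_counties` and `mt_c` (the `_demo` names and the 'Nowhere County' row are hypothetical) and assumes that county names match exactly between the two tables.

```python
import pandas as pd

# Made-up stand-ins for stop_counties and the scraped mt_c table
stop_counties_demo = pd.DataFrame(
    {'County': ['Flathead County', 'Gallatin County', 'Nowhere County']})
mt_c_demo = pd.DataFrame({
    'County': ['Flathead County', 'Gallatin County'],
    'Area': ['5,099 sq mi(13,206 km2)', '2,507 sq mi(6,493 km2)'],
})

# A left join keeps every stop county; names without a match get NaN in Area
merged = stop_counties_demo.merge(mt_c_demo, on='County', how='left')
matched = merged.dropna(subset=['Area'])
print(matched)
```

With a join, unmatched counties show up as missing values, so no placeholder zeros are needed.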


In [44]:
stop_counties.drop(index=stop_counties.index[(stop_counties.Area==0)==True], inplace=True)

In [45]:
stop_counties.sort_values(by=['Area'], ascending=False).head(10)

Unnamed: 0,County,Area
47,Treasure County,"979 sq mi(2,536 km2)"
45,Wibaux County,"889 sq mi(2,302 km2)"
13,Deer Lodge County,"737 sq mi(1,909 km2)"
11,Silver Bow County,"718 sq mi(1,860 km2)"
33,Beaverhead County,"5,543 sq mi(14,356 km2)"
40,Phillips County,"5,140 sq mi(13,313 km2)"
0,Flathead County,"5,099 sq mi(13,206 km2)"
18,Rosebud County,"5,012 sq mi(12,981 km2)"
21,Big Horn County,"4,995 sq mi(12,937 km2)"
29,Valley County,"4,921 sq mi(12,745 km2)"


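Note that `Area` holds strings, so the sort above is lexicographic and its first rows are not the largest counties. Comparing the numeric values, among the Montana counties in which some traffic stop has been recorded, **Beaverhead County** (5,543 sq mi) is **the largest**.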
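
To sort by area reliably, the square-mile figure has to be parsed into a number first. The sketch below does this for a few rows copied from the table above; the column name `area_sq_mi` is an assumption introduced for this example.

```python
import pandas as pd

# A few rows copied from the table above
areas = pd.DataFrame({
    'County': ['Treasure County', 'Beaverhead County', 'Phillips County'],
    'Area': ['979 sq mi(2,536 km2)', '5,543 sq mi(14,356 km2)', '5,140 sq mi(13,313 km2)'],
})

# Take the leading number before "sq", drop thousands separators, convert to int
sq_mi = areas.Area.str.extract(r'^([\d,]+)\s*sq', expand=False)
areas['area_sq_mi'] = sq_mi.str.replace(',', '', regex=False).astype(int)
largest = areas.sort_values('area_sq_mi', ascending=False).iloc[0]
print(largest.County, largest.area_sq_mi)
```

Applied to the full `stop_counties` table, the same parsing would put the counties in true order of size.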