In [75]:
import pandas as pd

neiss_df = pd.read_csv('NEISS2014.csv')
bp_df = pd.read_csv('BodyParts.csv')

# Add bodyParts data to NEISS dataframe
main_df = pd.merge(neiss_df, bp_df, left_on='body_part', right_on='Code')

# Count how often each body part occurs (value_counts sorts in descending order)
# and name the columns explicitly so the result does not depend on the pandas version
bp_freq_df = main_df['BodyPart'].value_counts().rename_axis('BodyPart').reset_index(name='Frequency')

# Print out BodyParts Frequency dataframe
print(bp_freq_df)


          BodyPart  Frequency
0             Head       9891
1             Face       5786
2           Finger       5783
3     Trunk, lower       5717
4     Trunk, upper       3868
5            Ankle       3781
6             Knee       3616
7             Hand       3369
8             Foot       3090
9         Shoulder       2675
10      Arm, lower       2561
11      Leg, lower       2239
12           Wrist       2116
13           Elbow       1612
14    >50% of body       1422
15             Toe       1280
16           Mouth       1254
17            Neck       1080
18         Eyeball        847
19             Ear        782
20      Leg, upper        756
21      Arm, upper        745
22        Internal        549
23    Not Recorded        390
24    Pubic region        286
25  25-50% of body          4


---
We now have a dataframe that lists each body part alongside its frequency, sorted from most to least frequent.

---
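`pd.merge` performs an inner join by default, so any injury record whose `body_part` code has no entry in `BodyParts.csv` would silently drop out of the counts above. A minimal sketch of one way to check for that, using small made-up frames in place of the real CSV files (the codes and the `toy_*` names are illustrative assumptions, not values taken from the dataset):

```python
import pandas as pd

# Toy stand-ins for NEISS2014.csv and BodyParts.csv (hypothetical values).
toy_neiss = pd.DataFrame({'body_part': [75, 76, 75, 99]})
toy_bp = pd.DataFrame({'Code': [75, 76], 'BodyPart': ['Head', 'Face']})

# A left join keeps every injury record; indicator=True adds a '_merge'
# column that records whether a matching code was found.
checked = pd.merge(toy_neiss, toy_bp, left_on='body_part', right_on='Code',
                   how='left', indicator=True)

# Records whose code has no entry in the lookup table.
unmatched = checked[checked['_merge'] == 'left_only']
print(len(unmatched))
```

If the same check on the real data reports zero unmatched rows, the inner join used earlier loses nothing.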

In [76]:
# Get the three body parts most frequently represented in the dataset
bp_freq_df.head(3)

   BodyPart  Frequency
0      Head       9891
1      Face       5786
2    Finger       5783


In [77]:
# Get the three body parts least frequently represented in the dataset
bp_freq_df.tail(3)

          BodyPart  Frequency
23    Not Recorded        390
24    Pubic region        286
25  25-50% of body          4


### Important note

As we can see, **Not Recorded** and **25-50% of body** are among the three least frequent values. Neither is an actual body part, though: the first marks a missing value and the second describes how much of the body was affected. We can filter both out of the dataframe like so:

In [78]:
# Drop the rows whose label is not an actual body part
bp_freq_df = bp_freq_df[~bp_freq_df['BodyPart'].isin(['Not Recorded', '25-50% of body'])]
bp_freq_df.tail(3)

        BodyPart  Frequency
21    Arm, upper        745
22      Internal        549
24  Pubic region        286
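
Note that the filtered dataframe keeps its original row labels, which is why the last row is labelled 24 rather than 23. A minimal sketch of this behaviour and of renumbering the rows, rebuilt from the last four rows of the frequency table (the names `toy_freq`, `excluded`, `filtered` and `renumbered` are introduced here for illustration only):

```python
import pandas as pd

# The last four rows of the frequency table, with their original labels.
toy_freq = pd.DataFrame(
    {
        'BodyPart': ['Internal', 'Not Recorded', 'Pubic region', '25-50% of body'],
        'Frequency': [549, 390, 286, 4],
    },
    index=[22, 23, 24, 25],
)

# Boolean filtering removes rows but leaves the surviving labels untouched.
excluded = ['Not Recorded', '25-50% of body']
filtered = toy_freq[~toy_freq['BodyPart'].isin(excluded)]
print(filtered.index.tolist())    # [22, 24]

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
renumbered = filtered.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1]
```

Renumbering is optional here, since `head` and `tail` select rows by position rather than by label.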


# Results

The three body parts most frequently represented in this dataset are **Head**, **Face** and **Finger**:

BodyPart | Frequency
--- | ---
Head | 9891
Face | 5786
Finger | 5783



---

The three body parts least frequently represented in this dataset, excluding **Not Recorded** and **25-50% of body**, are **Arm, upper**, **Internal** and **Pubic region**:

BodyPart | Frequency
--- | ---
Arm, upper | 745
Internal | 549
Pubic region | 286

