# Headache
Three variables record headache information for each patient.

*HA_verb* is "Headache at time of ED evaluation?"

*HASeverity* is "Severity of headache"

*HAStart* is "When did the headache start?"

In [1]:
# call the 02-data-cleaning.ipynb notebook to bring the pecarn_tbi dataframe and the cleaned dataframe into the environment
%cd -q ../notebooks
%run ./02-data-cleaning.ipynb
%cd -q -

START: 00-load-raw-data.ipynb
  PECARN TBI data read from c:\Jan\Capstone\data/TBI PUD 10-08-2013.csv into "pecarn_tbi" dataframe
START: 01-data-cleaning.ipynb
  Dropping AgeInMonth
  Renaming AgeinYears to Age
  Dropping AgeTwoPlus
  Dropping EmplType
  Dropping AgeInMonth
  The cleaned dataset is now available in a dataframe named "data"


In [2]:
df = pecarn_tbi[pecarn_tbi['High_impact_InjSev'] != 1]

ha = df[['HA_verb', 'HASeverity', 'HAStart']]

## Missing Data
There are quite a few missing (NaN) values across all three headache variables.

In [3]:
ha.isna().sum()

HA_verb        565
HASeverity     850
HAStart       1114
dtype: int64

## HA_verb
*HA_verb* is coded No (0), Yes (1), Pre-verbal/Non-verbal (91), or NaN (missing).

Pre-verbal is marked if the patient is too young to speak.  Non-verbal is marked if the patient is intubated or otherwise unable to give an understandable verbal response.  Pre-verbal and non-verbal were determined by the physician.
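
As a minimal sketch of how these codes could be turned into readable labels, the block below maps a toy series standing in for `ha['HA_verb']`. It assumes the Pre-verbal/Non-verbal code is 91, as it appears in the `value_counts()` output for this column. The `HA_VERB_LABELS` name and the "Missing" label are illustrative choices, not part of the dataset.

```python
import numpy as np
import pandas as pd

# Toy stand-in for ha['HA_verb']; the values are made up for illustration.
ha_verb = pd.Series([0, 1, 91, np.nan, 1], name="HA_verb")

# Hypothetical code-to-label mapping (assumes 91 = Pre-verbal/Non-verbal).
HA_VERB_LABELS = {0: "No", 1: "Yes", 91: "Pre-verbal/Non-verbal"}

# Series.map returns NaN for values with no key (including NaN itself),
# so missing values are labelled explicitly afterwards.
ha_verb_label = ha_verb.map(HA_VERB_LABELS).fillna("Missing")

print(ha_verb_label.value_counts())
```

Keeping the labels in a separate series leaves the original numeric codes untouched for later modelling.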

Sanity check: when *HA_verb* is NaN, *HASeverity* and *HAStart* should both be 92.

In [4]:
ha[ha['HA_verb'].isna()].head()

PatNum,HA_verb,HASeverity,HAStart
3,,92,92
193,,92,92
428,,92,92
444,,92,92
526,,92,92


In [5]:
ha[ha['HA_verb'].isna() & ((ha['HASeverity'] != 92) | (ha['HAStart'] != 92))].count()

HA_verb       0
HASeverity    0
HAStart       0
dtype: int64

In [6]:
for col in ha:
    print(ha[col].value_counts())

0     12993
91    11709
1     10610
Name: HA_verb, dtype: int64
92    25267
2      4778
1      4264
3       718
Name: HASeverity, dtype: int64
92    25267
2      8783
3       507
4       140
1        66
Name: HAStart, dtype: int64


## HASeverity

In [7]:
ha[ha['HASeverity'].isna()].head()

PatNum,HA_verb,HASeverity,HAStart
29,1,,
61,1,,2.0
195,1,,2.0
349,1,,
505,1,,2.0


So the only information that *HA_verb* adds beyond the other two variables is which records are Pre-verbal/Non-verbal.
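
One way to examine how the codes of *HA_verb* and *HASeverity* line up is a cross-tabulation that keeps missing values visible. The sketch below uses a small made-up dataframe rather than the real `ha` data, and the `as_label` helper is a hypothetical convenience, so the counts it prints say nothing about the actual dataset.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for ha[['HA_verb', 'HASeverity']]; values are made up.
toy = pd.DataFrame({
    "HA_verb":    [0, 91, np.nan, 1, 1, 1],
    "HASeverity": [92, 92, 92, 1, 2, np.nan],
})

def as_label(series):
    """Hypothetical helper: turn codes into strings so NaN gets its own category."""
    return series.map(lambda v: "NaN" if pd.isna(v) else str(int(v)))

# Rows are HA_verb codes, columns are HASeverity codes, cells are record counts.
table = pd.crosstab(as_label(toy["HA_verb"]), as_label(toy["HASeverity"]))
print(table)
```

On the real data, the same call with `ha` in place of `toy` would show which *HA_verb* codes fall under each *HASeverity* code.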

One option is to encode Pre-verbal/Non-verbal into *HASeverity* or *HAStart*.
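
A minimal sketch of that option, assuming the Pre-verbal/Non-verbal code in *HA_verb* is 91 (the value seen in the `value_counts()` output) and that the same code is reused in *HASeverity*. The toy dataframe and the `PRE_NON_VERBAL` constant are illustrative, and reusing 91 is only one possible choice.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the real ha dataframe; values are made up.
toy = pd.DataFrame({
    "HA_verb":    [0, 91, np.nan, 1, 1],
    "HASeverity": [92, 92, 92, 2, np.nan],
})

# Assumption: 91 marks Pre-verbal/Non-verbal and is reused as the new code.
PRE_NON_VERBAL = 91

encoded = toy.copy()
# Series.mask replaces values where the condition is True and keeps the rest.
is_pre_non_verbal = encoded["HA_verb"] == PRE_NON_VERBAL
encoded["HASeverity"] = encoded["HASeverity"].mask(is_pre_non_verbal, PRE_NON_VERBAL)

print(encoded)
```

The same pattern would apply to *HAStart*. Working on a copy keeps the original columns available for comparison.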