# Exploring the conditions and encounters tables to obtain other interesting *features*

We read the .csv files of conditions and encounters, already filtered for the patients of interest, along with the files containing the basic *features* defined in the notebook "definition_and_basic_features.ipynb". The date columns are also converted to *datetime*.

In [209]:
import pandas as pd
import numpy as np

# Scenario 01

conditions01_i = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/interest/conditions01_i.csv')
conditions01_i['START'] = pd.to_datetime(conditions01_i['START'])
conditions01_i['STOP'] = pd.to_datetime(conditions01_i['STOP'])

encounters01_i = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/interest/encounters01_i.csv')
encounters01_i['START'] = pd.to_datetime(encounters01_i['START'])
encounters01_i['STOP'] = pd.to_datetime(encounters01_i['STOP'])

scenario01 = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/features/basic_features_01.csv')
scenario01['BIRTHDATE'] = pd.to_datetime(scenario01['BIRTHDATE'])

# Scenario 02

conditions02_i = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/interest/conditions02_i.csv')
conditions02_i['START'] = pd.to_datetime(conditions02_i['START'])
conditions02_i['STOP'] = pd.to_datetime(conditions02_i['STOP'])

encounters02_i = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/interest/encounters02_i.csv')
encounters02_i['START'] = pd.to_datetime(encounters02_i['START'])
encounters02_i['STOP'] = pd.to_datetime(encounters02_i['STOP'])

scenario02 = pd.read_csv('https://github.com/caiops/P2_IA368X/raw/main/data/interim/features/basic_features_02.csv')
scenario02['BIRTHDATE'] = pd.to_datetime(scenario02['BIRTHDATE'])
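
The loading cell above repeats the same `read_csv` and `to_datetime` steps for every file. A minimal sketch of a helper that bundles both steps, assuming only the column names used above; `load_with_dates` is a hypothetical name, and the inline CSV is illustrative data standing in for the real files:

```python
import io
import pandas as pd

def load_with_dates(source, date_cols):
    """Read a CSV and convert the listed columns to datetime."""
    df = pd.read_csv(source)
    for col in date_cols:
        df[col] = pd.to_datetime(df[col])
    return df

# Tiny inline sample standing in for a conditions file (illustrative data only)
sample_csv = io.StringIO(
    "START,STOP,PATIENT,CODE,DESCRIPTION\n"
    "2012-07-12,2012-07-12,p1,409089005,Febrile neutropenia (disorder)\n"
    "2013-01-02,,p2,1,Second degree burn\n"
)
conditions_sample = load_with_dates(sample_csv, ['START', 'STOP'])
print(conditions_sample.dtypes)
```

With the real data, the same helper could be called with each URL and its list of date columns, for example `['START', 'STOP']` for conditions and `['BIRTHDATE']` for the scenario tables.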

In [210]:
# Getting the lists of patients
patients01 = scenario01['PATIENT'].tolist()
patients02 = scenario02['PATIENT'].tolist()
# Getting the lists of patients of interest who did or did not survive FN
death01 = scenario01[scenario01['DEATH_FN'] == 1]['PATIENT'].tolist()
death02 = scenario02[scenario02['DEATH_FN'] == 1]['PATIENT'].tolist()
survive01 = scenario01[scenario01['DEATH_FN'] == 0]['PATIENT'].tolist()
survive02 = scenario02[scenario02['DEATH_FN'] == 0]['PATIENT'].tolist()

## Conditions

Exploring the conditions that the patients of interest presented on or before the date febrile neutropenia (FN) started.

In [211]:
# Getting the patients' FN records (SNOMED code 409089005, "Febrile neutropenia (disorder)")
nf01 = conditions01_i.loc[conditions01_i['CODE'] == 409089005]
nf02 = conditions02_i.loc[conditions02_i['CODE'] == 409089005]

In [212]:
# Getting the conditions that the patients of interest presented on or before the day FN started
conditions01_i_b = pd.DataFrame(columns=['PATIENT', 'START', 'DESCRIPTION'])
for p in patients01:
  date = nf01.loc[nf01['PATIENT'] == p, 'START'].values[0]
  conditions01_i_b = pd.concat([conditions01_i_b, conditions01_i.query('PATIENT == @p & START <= @date')[['PATIENT', 'START', 'DESCRIPTION']]], ignore_index=True)

display(conditions01_i_b)

conditions02_i_b = pd.DataFrame(columns=['PATIENT', 'START', 'DESCRIPTION'])
for p in patients02:
  date = nf02.loc[nf02['PATIENT'] == p, 'START'].values[0]
  conditions02_i_b = pd.concat([conditions02_i_b, conditions02_i.query('PATIENT == @p & START <= @date')[['PATIENT', 'START', 'DESCRIPTION']]], ignore_index=True)

display(conditions02_i_b)

Unnamed: 0,PATIENT,START,DESCRIPTION
0,4288f90b-4774-c329-3176-c1482e824c04,2012-07-12,Acute myeloid leukemia disease (disorder)
1,4288f90b-4774-c329-3176-c1482e824c04,2012-07-12,Febrile neutropenia (disorder)
2,f03f50be-20b1-3eae-2ed1-bb478bceb320,2013-01-02,Second degree burn
3,f03f50be-20b1-3eae-2ed1-bb478bceb320,2017-10-28,Viral sinusitis (disorder)
4,f03f50be-20b1-3eae-2ed1-bb478bceb320,2018-07-08,Acute myeloid leukemia disease (disorder)
...,...,...,...
730,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1944-04-18,Viral sinusitis (disorder)
731,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1946-04-12,Streptococcal sore throat (disorder)
732,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1947-06-01,Viral sinusitis (disorder)
733,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1948-05-19,Acute myeloid leukemia disease (disorder)


Unnamed: 0,PATIENT,START,DESCRIPTION
0,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-02-08,Viral sinusitis (disorder)
1,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-04-11,Acute allergic reaction
2,bb37561b-ba65-7c47-db5b-0641bca883b4,2014-10-25,Otitis media
3,bb37561b-ba65-7c47-db5b-0641bca883b4,2014-12-19,Fracture of ankle
4,bb37561b-ba65-7c47-db5b-0641bca883b4,2015-02-07,Acute myeloid leukemia disease (disorder)
...,...,...,...
609,2e3402a9-45ff-3558-8450-375f17107ca0,2020-05-28,Fracture subluxation of wrist
610,2e3402a9-45ff-3558-8450-375f17107ca0,2021-06-12,Fracture of forearm
611,2e3402a9-45ff-3558-8450-375f17107ca0,2021-06-24,Acute myeloid leukemia disease (disorder)
612,2e3402a9-45ff-3558-8450-375f17107ca0,2021-06-24,Febrile neutropenia (disorder)
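
The per-patient loop above can also be written as a single merge followed by a filter. This is a sketch under stated assumptions rather than the notebook's method: `conditions_up_to_fn` is a hypothetical helper, the toy table is invented for illustration, and the sketch takes the earliest FN start per patient, which matches `.values[0]` only when each patient has one FN record or the records are sorted by date.

```python
import pandas as pd

FN_CODE = 409089005  # code used above for "Febrile neutropenia (disorder)"

def conditions_up_to_fn(conditions, fn_code=FN_CODE):
    """Keep each patient's conditions that started on or before their earliest FN record."""
    fn_start = (
        conditions.loc[conditions['CODE'] == fn_code]
        .groupby('PATIENT')['START'].min()
        .rename('FN_START')
        .reset_index()
    )
    # Attach the FN start date to every condition row of the same patient
    merged = conditions.merge(fn_start, on='PATIENT', how='inner')
    kept = merged.loc[merged['START'] <= merged['FN_START']]
    return kept[['PATIENT', 'START', 'DESCRIPTION']].reset_index(drop=True)

# Toy data (not taken from the study files)
toy = pd.DataFrame({
    'PATIENT': ['a', 'a', 'a', 'b', 'b'],
    'START': pd.to_datetime(['2012-01-01', '2012-07-12', '2013-01-01',
                             '2018-07-08', '2018-07-08']),
    'CODE': [1, FN_CODE, 2, 3, FN_CODE],
    'DESCRIPTION': ['Earlier condition', 'Febrile neutropenia (disorder)',
                    'Later condition', 'Same-day condition',
                    'Febrile neutropenia (disorder)'],
})
before_fn = conditions_up_to_fn(toy)
print(before_fn)
```

Besides being shorter, this avoids growing a DataFrame with `pd.concat` inside a loop, which copies the accumulated rows on every iteration.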


In [213]:
# Checking how many times each condition appears in the datasets
cd01 = conditions01_i_b.query('PATIENT in @death01')['DESCRIPTION'].value_counts()
for d in cd01.index:
  print(d, cd01[d])
print()
cs01 = conditions01_i_b.query('PATIENT in @survive01')['DESCRIPTION'].value_counts()
for d in cs01.index:
  print(d, cs01[d])
print()
cd02 = conditions02_i_b.query('PATIENT in @death02')['DESCRIPTION'].value_counts()
for d in cd02.index:
  print(d, cd02[d])
print()
cs02 = conditions02_i_b.query('PATIENT in @survive02')['DESCRIPTION'].value_counts()
for d in cs02.index:
  print(d, cs02[d])

Febrile neutropenia (disorder) 26
Acute myeloid leukemia  disease (disorder) 25
Viral sinusitis (disorder) 21
Otitis media 18
Acute viral pharyngitis (disorder) 11
Bacteremia (finding) 10
Acute bronchitis (disorder) 10
Streptococcal sore throat (disorder) 3
Concussion with no loss of consciousness 2
Risk activity involvement (finding) 2
Sprain of ankle 2
Whiplash injury to neck 1
Perennial allergic rhinitis 1
History of appendectomy 1
Appendicitis 1
Rupture of appendix 1
Acute allergic reaction 1
Chronic sinusitis (disorder) 1
Sinusitis (disorder) 1
Stress (finding) 1
Tear of meniscus of knee 1
Normal pregnancy 1
Laceration of thigh 1
Second degree burn 1
Injury of tendon of the rotator cuff of shoulder 1
Fracture of forearm 1
Received higher education (finding) 1
Unemployed (finding) 1
Social isolation (finding) 1
Sprain of wrist 1

Acute myeloid leukemia  disease (disorder) 113
Febrile neutropenia (disorder) 113
Viral sinusitis (disorder) 44
Acute viral pharyngitis (disorder) 29
Bact

The condition "bacteremia" caught our attention because it appears a reasonable number of times, and proportionally more often among the patients who died with FN.
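
One way to make a "proportionally more often" comparison explicit is to compute, for each outcome group, the share of patients who have the condition. The sketch below uses invented toy data, and `condition_rate_by_outcome` is a hypothetical helper; it only illustrates the calculation and does not reproduce the study's numbers.

```python
import pandas as pd

def condition_rate_by_outcome(scenario, conditions, description):
    """Share of patients with `description`, split by the DEATH_FN flag."""
    patients_with_cond = conditions.loc[
        conditions['DESCRIPTION'] == description, 'PATIENT'
    ]
    has_cond = scenario['PATIENT'].isin(patients_with_cond)
    # Mean of a boolean Series is the fraction of True values in each group
    return has_cond.groupby(scenario['DEATH_FN']).mean()

# Toy data (not taken from the study files)
scenario_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c', 'd', 'e', 'f'],
    'DEATH_FN': [1, 1, 0, 0, 0, 0],
})
conditions_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c', 'a'],
    'DESCRIPTION': ['Bacteremia (finding)', 'Bacteremia (finding)',
                    'Bacteremia (finding)', 'Otitis media'],
})
rates = condition_rate_by_outcome(scenario_toy, conditions_toy, 'Bacteremia (finding)')
print(rates)
```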

In [214]:
# Checking how many times each patient had bacteremia
print(conditions01_i_b.query("DESCRIPTION == 'Bacteremia (finding)'")[['PATIENT', 'DESCRIPTION']].value_counts())
print()
print(conditions02_i_b.query("DESCRIPTION == 'Bacteremia (finding)'")[['PATIENT', 'DESCRIPTION']].value_counts())

PATIENT                               DESCRIPTION         
009c30e1-5fa4-f50c-dfa5-bf3092116bbb  Bacteremia (finding)    1
98342e81-23da-dae8-5506-8da094c7d29e  Bacteremia (finding)    1
ac3634cb-cfcd-6cf8-26c2-3e6436a0504a  Bacteremia (finding)    1
adf63106-46a4-e19c-349a-f0a3eec09e33  Bacteremia (finding)    1
b60922aa-a536-e6a8-9e4b-d523d17b6929  Bacteremia (finding)    1
b7c67b14-bc45-951f-4be5-a7942bfd533a  Bacteremia (finding)    1
bee38fdb-f140-2c23-fe2e-50e0adc00c7b  Bacteremia (finding)    1
c022272d-ecd2-92b6-32f4-3769ef3514f9  Bacteremia (finding)    1
c9176ee2-d94c-6930-d132-0120cf1eff92  Bacteremia (finding)    1
c9e0aadd-a5b9-e1d2-eb34-2d9bdb872a73  Bacteremia (finding)    1
ccecb758-0a9b-9670-23e6-9e405f69a690  Bacteremia (finding)    1
d5c0cc48-5f8a-4533-325c-25a9c3284185  Bacteremia (finding)    1
dd5328d0-4c53-d310-8411-2d3a7fa5ed58  Bacteremia (finding)    1
dd902384-4955-83d4-4b52-7909f0f91e51  Bacteremia (finding)    1
eabfa666-29b5-64f3-105a-e8ea7653ae70  Bactere

In [215]:
# Checking whether all of them had bacteremia on the same day as FN

# Bacteremia rows and the corresponding patient lists
bacteremia01 = conditions01_i_b.query("DESCRIPTION == 'Bacteremia (finding)'")
bacteremia02 = conditions02_i_b.query("DESCRIPTION == 'Bacteremia (finding)'")
bacteremia01_pl = bacteremia01['PATIENT'].tolist()
bacteremia02_pl = bacteremia02['PATIENT'].tolist()

print('Number of patients with bacteremia (scenario 01):')
print(len(bacteremia01_pl))
print('\nNumber of patients with bacteremia (scenario 02):')
print(len(bacteremia02_pl))

# Counting how many of these patients had the condition on the day FN started, and printing those who did not
cont = 0
print('\nPatients who did not have bacteremia on the day FN started (scenario 01):')
for p in bacteremia01_pl:
  date = nf01.loc[nf01['PATIENT'] == p, 'START'].values[0]
  if bacteremia01[bacteremia01['PATIENT'] == p]['START'].values[0] == date:
    cont += 1
  else:
    print(p)

print('\nNumber of patients who had bacteremia on the day FN started (scenario 01):')
print(cont)

cont = 0
print('\nPatients who did not have bacteremia on the day FN started (scenario 02):')
for p in bacteremia02_pl:
  date = nf02.loc[nf02['PATIENT'] == p, 'START'].values[0]
  if bacteremia02[bacteremia02['PATIENT'] == p]['START'].values[0] == date:
    cont += 1
  else:
    print(p)

print('\nNumber of patients who had bacteremia on the day FN started (scenario 02):')
print(cont)

Number of patients with bacteremia (scenario 01):
37

Number of patients with bacteremia (scenario 02):
40

Patients who did not have bacteremia on the day FN started (scenario 01):

Number of patients who had bacteremia on the day FN started (scenario 01):
37

Patients who did not have bacteremia on the day FN started (scenario 02):

Number of patients who had bacteremia on the day FN started (scenario 02):
40


In [216]:
# Adding the presence or absence of bacteremia to the tables for Orange:
scenario01 = scenario01.merge(bacteremia01[['PATIENT', 'DESCRIPTION']], on='PATIENT', how='left')

# After the left merge, DESCRIPTION is NaN for patients without a bacteremia record
scenario01['BACTEREMIA'] = scenario01['DESCRIPTION'].notna().astype(int)
scenario01 = scenario01.drop(['DESCRIPTION'], axis=1)

display(scenario01)

scenario02 = scenario02.merge(bacteremia02[['PATIENT', 'DESCRIPTION']], on='PATIENT', how='left')

scenario02['BACTEREMIA'] = scenario02['DESCRIPTION'].notna().astype(int)
scenario02 = scenario02.drop(['DESCRIPTION'], axis=1)

display(scenario02)

Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA
0,4288f90b-4774-c329-3176-c1482e824c04,2010-07-13,white,nonhispanic,M,0,2,0
1,f03f50be-20b1-3eae-2ed1-bb478bceb320,2002-07-12,white,nonhispanic,M,0,16,1
2,11089781-c268-6838-642e-2c2c9edbb694,2011-07-17,white,nonhispanic,F,0,9,0
3,d7b9725d-889f-d178-cf41-dbf8b373eda9,2010-10-18,black,nonhispanic,F,0,6,0
4,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,2008-04-09,asian,nonhispanic,M,0,12,0
...,...,...,...,...,...,...,...,...
134,7d87bdff-2df1-8162-80e4-2e406304515d,2011-06-17,white,nonhispanic,M,0,2,0
135,d5c0cc48-5f8a-4533-325c-25a9c3284185,2017-10-05,other,nonhispanic,F,0,3,1
136,a4cabcbc-8282-5599-6b69-01e28e69c045,2004-09-24,black,nonhispanic,M,0,16,0
137,eca28495-500f-ab7e-356b-176f31382569,2017-03-03,white,nonhispanic,F,0,3,0


Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA
0,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-02-08,white,nonhispanic,M,0,3,0
1,678fc07c-1cb1-acc6-3553-d848200626e9,2006-02-11,white,nonhispanic,F,0,16,0
2,04efa71e-b8ed-980b-94a1-cd25e94b6015,2018-04-17,other,nonhispanic,F,0,4,0
3,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,1997-02-18,white,nonhispanic,F,0,21,1
4,8e9d1dd0-085f-a629-3ea6-9033f330b383,1996-03-07,white,nonhispanic,F,0,20,1
...,...,...,...,...,...,...,...,...
112,84c9e441-fa88-33c8-5d87-91b863998d26,1944-08-04,white,nonhispanic,F,1,2,1
113,7837ca92-1dc3-b3ef-a7f9-4207e439775c,1996-01-13,white,nonhispanic,M,0,17,0
114,3f3ad6c2-337b-9d48-47b7-b2197a0a0500,1969-07-18,white,nonhispanic,F,1,2,1
115,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,2006-08-23,white,nonhispanic,M,0,9,0
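
A left merge adds one output row per matching record, so a patient with two bacteremia records would appear twice in the scenario table. A sketch of an alternative that builds the same 0/1 flag without changing the number of rows, using invented toy data in which patient `b` deliberately has two records:

```python
import pandas as pd

# Toy data (not taken from the study files)
scenario_toy = pd.DataFrame({'PATIENT': ['a', 'b', 'c']})
bacteremia_toy = pd.DataFrame({
    'PATIENT': ['b', 'b'],
    'DESCRIPTION': ['Bacteremia (finding)', 'Bacteremia (finding)'],
})

# 1 if the patient appears in the bacteremia table at least once, else 0
scenario_toy['BACTEREMIA'] = (
    scenario_toy['PATIENT'].isin(bacteremia_toy['PATIENT']).astype(int)
)
print(scenario_toy)
```

The row count stays at three here even though `b` has two bacteremia records.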


We explored other conditions that appeared in large numbers: "Viral sinusitis (disorder)", "Otitis media", "Acute viral pharyngitis (disorder)" and "Acute bronchitis (disorder)".

We checked whether these conditions occurred within the 30 days before FN started.

In [217]:
def cond_30days(cond):
  # Selecting the specific condition
  c01 = conditions01_i_b.query("DESCRIPTION == @cond")
  c02 = conditions02_i_b.query("DESCRIPTION == @cond")
  # Getting the unique patients with this condition
  c01_pl = c01['PATIENT'].value_counts().index
  c02_pl = c02['PATIENT'].value_counts().index

  print('Number of patients with the condition (scenario 01):')
  print(len(c01_pl))
  print('\nNumber of patients with the condition (scenario 02):')
  print(len(c02_pl))

  # Keeping only the occurrences within the 30 days before FN (both ends inclusive)
  print('\nPatients with the condition within the 30 days before FN (scenario 01):')
  for p in c01_pl:
    date = nf01.loc[nf01['PATIENT'] == p, 'START'].values[0]
    date_i = date - np.timedelta64(30,'D')
    hits = c01.query('PATIENT == @p & START >= @date_i & START <= @date')
    if not hits.empty:
      display(hits)

  print('\nPatients with the condition within the 30 days before FN (scenario 02):')
  for p in c02_pl:
    date = nf02.loc[nf02['PATIENT'] == p, 'START'].values[0]
    date_i = date - np.timedelta64(30,'D')
    hits = c02.query('PATIENT == @p & START >= @date_i & START <= @date')
    if not hits.empty:
      display(hits)

In [218]:
print('Viral sinusitis (disorder)\n')
cond_30days('Viral sinusitis (disorder)')
print('\nOtitis media\n')
cond_30days('Otitis media')
print('\nAcute viral pharyngitis (disorder)\n')
cond_30days('Acute viral pharyngitis (disorder)')
print('\nAcute bronchitis (disorder)\n')
cond_30days('Acute bronchitis (disorder)')

Viral sinusitis (disorder)

Number of patients with the condition (scenario 01):
42

Number of patients with the condition (scenario 02):
40

Patients with the condition within the 30 days before FN (scenario 01):


Unnamed: 0,PATIENT,START,DESCRIPTION
458,11d7ecb8-1e4b-ff44-ff40-80755767512c,2013-09-07,Viral sinusitis (disorder)



Patients with the condition within the 30 days before FN (scenario 02):


Unnamed: 0,PATIENT,START,DESCRIPTION
427,9d96f2d8-333d-8d8d-19ba-10ed334c06a5,2013-07-20,Viral sinusitis (disorder)


Unnamed: 0,PATIENT,START,DESCRIPTION
36,8e9d1dd0-085f-a629-3ea6-9033f330b383,2016-02-21,Viral sinusitis (disorder)


Unnamed: 0,PATIENT,START,DESCRIPTION
242,cf9f4af6-2430-13d6-6f3f-605714b8a6ab,2020-09-30,Viral sinusitis (disorder)



Otitis media

Number of patients with the condition (scenario 01):
36

Number of patients with the condition (scenario 02):
29

Patients with the condition within the 30 days before FN (scenario 01):


Unnamed: 0,PATIENT,START,DESCRIPTION
263,59541c40-a53e-37c0-9703-d88038b6eced,1932-03-26,Otitis media



Patients with the condition within the 30 days before FN (scenario 02):

Acute viral pharyngitis (disorder)

Number of patients with the condition (scenario 01):
35

Number of patients with the condition (scenario 02):
29

Patients with the condition within the 30 days before FN (scenario 01):


Unnamed: 0,PATIENT,START,DESCRIPTION
152,af0d6b3e-9631-827d-6651-7565821260e2,2016-01-02,Acute viral pharyngitis (disorder)



Patients with the condition within the 30 days before FN (scenario 02):

Acute bronchitis (disorder)

Number of patients with the condition (scenario 01):
31

Number of patients with the condition (scenario 02):
16

Patients with the condition within the 30 days before FN (scenario 01):


Unnamed: 0,PATIENT,START,DESCRIPTION
357,4b115fe6-e241-6571-e959-e9340b298d75,1961-04-22,Acute bronchitis (disorder)


Unnamed: 0,PATIENT,START,DESCRIPTION
142,e8ee0f85-558b-0c10-9ef8-c7a498983aed,2018-02-20,Acute bronchitis (disorder)



Patients with the condition within the 30 days before FN (scenario 02):


Unnamed: 0,PATIENT,START,DESCRIPTION
463,3edd7198-2494-cc95-d844-ab3a3adb562f,2017-08-09,Acute bronchitis (disorder)


Since these conditions occurred very few times within the 30 days before FN, we discarded the idea of using them as possible *features*.
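
The 30-day window check can also be expressed without a per-patient loop, by merging each condition row with the patient's FN start date and filtering on the difference. This is a sketch on invented toy data; `within_window` and the `FN_START` column are hypothetical names, and both ends of the window are inclusive, matching the `>=` and `<=` comparisons used in the notebook.

```python
import pandas as pd

def within_window(conditions, fn_start, description, days=30):
    """Rows of `description` that started in the `days` days up to and including FN start."""
    cond = conditions.loc[conditions['DESCRIPTION'] == description]
    merged = cond.merge(fn_start, on='PATIENT', how='inner')
    delta = merged['FN_START'] - merged['START']
    mask = (delta >= pd.Timedelta(0)) & (delta <= pd.Timedelta(days=days))
    return merged.loc[mask, ['PATIENT', 'START', 'DESCRIPTION']]

# Toy data (not taken from the study files)
conditions_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c'],
    'START': pd.to_datetime(['2020-01-01', '2019-01-01', '2020-02-01']),
    'DESCRIPTION': ['Otitis media'] * 3,
})
fn_start_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c'],
    'FN_START': pd.to_datetime(['2020-01-20', '2020-01-01', '2020-03-02']),
})
recent = within_window(conditions_toy, fn_start_toy, 'Otitis media')
print(recent)
```

In the toy data, patient `c` sits exactly 30 days before FN and is kept, while patient `b` falls a full year earlier and is dropped.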

## Encounters

Exploring the encounters that the patients of interest had on or before the date FN started.

In [219]:
# Converting the encounter start values from date/time to date only, so that they can be compared with the FN start dates below
encounters01_i['START'] = encounters01_i['START'].dt.date
encounters01_i['START'] = pd.to_datetime(encounters01_i['START'])

encounters02_i['START'] = encounters02_i['START'].dt.date
encounters02_i['START'] = pd.to_datetime(encounters02_i['START'])
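
To illustrate what the date-only conversion does, the sketch below truncates timestamps to midnight and drops any timezone, so the result can be compared with date-only values. The ISO strings with a trailing `Z` are an assumption about how the raw encounter timestamps look, not something shown in this notebook.

```python
import pandas as pd

# Illustrative timestamps; the trailing "Z" (UTC) format is an assumption
raw = pd.Series(['2012-06-19T10:20:30Z', '2012-07-12T23:59:59Z'])
stamps = pd.to_datetime(raw, utc=True)  # timezone-aware (UTC)

# Drop the timezone, then set the time of day to 00:00:00
dates = stamps.dt.tz_localize(None).dt.normalize()
print(dates)
```

On such input this gives the same values as the `.dt.date` followed by `pd.to_datetime` round trip, while staying in the `datetime64` dtype throughout.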

In [220]:
# Getting the encounters of the patients of interest on or before the day FN started
encounters01_i_b = pd.DataFrame(columns=['PATIENT', 'START', 'ENCOUNTERCLASS', 'DESCRIPTION', 'REASONDESCRIPTION'])
for p in patients01:
  date = nf01.loc[nf01['PATIENT'] == p, 'START'].values[0]
  encounters01_i_b = pd.concat([encounters01_i_b, encounters01_i.query('PATIENT == @p & START <= @date')[['PATIENT', 'START', 'ENCOUNTERCLASS', 'DESCRIPTION', 'REASONDESCRIPTION']]], ignore_index=True)

display(encounters01_i_b)

encounters02_i_b = pd.DataFrame(columns=['PATIENT', 'START', 'ENCOUNTERCLASS', 'DESCRIPTION', 'REASONDESCRIPTION'])
for p in patients02:
  date = nf02.loc[nf02['PATIENT'] == p, 'START'].values[0]
  encounters02_i_b = pd.concat([encounters02_i_b, encounters02_i.query('PATIENT == @p & START <= @date')[['PATIENT', 'START', 'ENCOUNTERCLASS', 'DESCRIPTION', 'REASONDESCRIPTION']]], ignore_index=True)

display(encounters02_i_b)

Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
0,4288f90b-4774-c329-3176-c1482e824c04,2012-06-19,wellness,Well child visit (procedure),
1,4288f90b-4774-c329-3176-c1482e824c04,2012-07-12,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
2,f03f50be-20b1-3eae-2ed1-bb478bceb320,2003-10-07,ambulatory,Encounter for problem,
3,f03f50be-20b1-3eae-2ed1-bb478bceb320,2003-10-23,ambulatory,Encounter for problem,
4,f03f50be-20b1-3eae-2ed1-bb478bceb320,2012-07-20,wellness,Well child visit (procedure),
...,...,...,...,...,...
1874,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1946-04-12,ambulatory,Encounter for symptom,Streptococcal sore throat (disorder)
1875,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1946-06-18,wellness,Well child visit (procedure),
1876,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1947-06-01,ambulatory,Encounter for symptom,Viral sinusitis (disorder)
1877,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1947-06-24,wellness,Well child visit (procedure),


Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
0,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-05-16,wellness,Well child visit (procedure),
1,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-07-18,wellness,Well child visit (procedure),
2,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-10-17,wellness,Well child visit (procedure),
3,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-01-16,wellness,Well child visit (procedure),
4,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-02-08,ambulatory,Encounter for symptom,Viral sinusitis (disorder)
...,...,...,...,...,...
1676,2e3402a9-45ff-3558-8450-375f17107ca0,2020-05-29,emergency,Emergency room admission (procedure),
1677,2e3402a9-45ff-3558-8450-375f17107ca0,2020-07-27,ambulatory,Encounter for 'check-up',Fracture subluxation of wrist
1678,2e3402a9-45ff-3558-8450-375f17107ca0,2021-06-05,wellness,Well child visit (procedure),
1679,2e3402a9-45ff-3558-8450-375f17107ca0,2021-06-13,emergency,Emergency room admission (procedure),


Checking the encounter descriptions and reasons, by class.

Scenario 01:

In [221]:
#Ambulatory
encounters01_i_b.query("ENCOUNTERCLASS == 'ambulatory'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                                                            REASONDESCRIPTION                                  
Encounter for symptom                                                  Viral sinusitis (disorder)                             68
Prenatal visit                                                         Normal pregnancy                                       46
Encounter for symptom                                                  Acute viral pharyngitis (disorder)                     40
                                                                       Acute bronchitis (disorder)                            37
                                                                       Streptococcal sore throat (disorder)                   16
Encounter for problem                                                  Child attention deficit disorder                       15
Hypertension follow-up encounter                                       Hypertension                    

In [222]:
#Emergency
encounters01_i_b.query("ENCOUNTERCLASS == 'emergency'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                              REASONDESCRIPTION
Obstetric emergency hospital admission   Normal pregnancy     5
Emergency Room Admission                 Appendicitis         2
                                         Seizure disorder     2
Emergency hospital admission for asthma  Childhood asthma     2
dtype: int64

In [223]:
#Inpatient
encounters01_i_b.query("ENCOUNTERCLASS == 'inpatient'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                        REASONDESCRIPTION                               
Encounter for problem (procedure)  Acute myeloid leukemia  disease (disorder)          120
Encounter Inpatient                Appendicitis                                          2
Encounter for problem (procedure)  Anemia (disorder)                                     1
Non-urgent orthopedic admission    Injury of tendon of the rotator cuff of shoulder      1
dtype: int64

In [224]:
#Wellness
encounters01_i_b.query("ENCOUNTERCLASS == 'wellness'")[['DESCRIPTION']].value_counts()

DESCRIPTION                               
Well child visit (procedure)                  962
General examination of patient (procedure)     25
dtype: int64

In [225]:
#Urgentcare
encounters01_i_b.query("ENCOUNTERCLASS == 'urgentcare'")[['DESCRIPTION']].value_counts()

DESCRIPTION                   
Urgent care clinic (procedure)    13
dtype: int64

Scenario 02:

In [226]:
#Ambulatory
encounters02_i_b.query("ENCOUNTERCLASS == 'ambulatory'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                                                            REASONDESCRIPTION                                  
Encounter for symptom                                                  Viral sinusitis (disorder)                             56
                                                                       Acute viral pharyngitis (disorder)                     35
Asthma follow-up                                                       Childhood asthma                                       24
Prenatal visit                                                         Normal pregnancy                                       24
Encounter for symptom                                                  Acute bronchitis (disorder)                            21
                                                                       Sinusitis (disorder)                                   14
                                                                       Streptococcal sore throat (disor

In [227]:
#Emergency
encounters02_i_b.query("ENCOUNTERCLASS == 'emergency'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                              REASONDESCRIPTION
Emergency hospital admission for asthma  Childhood asthma     4
Obstetric emergency hospital admission   Normal pregnancy     3
Emergency Room Admission                 Seizure disorder     2
dtype: int64

In [228]:
#Inpatient
encounters02_i_b.query("ENCOUNTERCLASS == 'inpatient'")[['DESCRIPTION', 'REASONDESCRIPTION']].value_counts()

DESCRIPTION                        REASONDESCRIPTION                           
Encounter for problem (procedure)  Acute myeloid leukemia  disease (disorder)      112
Encounter Inpatient                Chronic intractable migraine without aura         3
                                   Impacted molars                                   2
                                   Chronic pain                                      1
Non-urgent orthopedic admission    Injury of medial collateral ligament of knee      1
dtype: int64

In [229]:
#Wellness
encounters02_i_b.query("ENCOUNTERCLASS == 'wellness'")[['DESCRIPTION']].value_counts()

DESCRIPTION                               
Well child visit (procedure)                  832
General examination of patient (procedure)     19
dtype: int64

In [230]:
#Urgentcare
encounters02_i_b.query("ENCOUNTERCLASS == 'urgentcare'")[['DESCRIPTION']].value_counts()

DESCRIPTION                   
Urgent care clinic (procedure)    11
dtype: int64

We investigated the "inpatient" encounters whose reason was "Acute myeloid leukemia disease (disorder)", comparing them with the condition records for FN.

Scenario 01:

In [231]:
inpatient01 = encounters01_i_b.query("ENCOUNTERCLASS == 'inpatient' & REASONDESCRIPTION == 'Acute myeloid leukemia  disease (disorder)'")
display(inpatient01)
display(nf01)

Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
1,4288f90b-4774-c329-3176-c1482e824c04,2012-07-12,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
15,f03f50be-20b1-3eae-2ed1-bb478bceb320,2018-07-08,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
34,11089781-c268-6838-642e-2c2c9edbb694,2020-07-14,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
46,d7b9725d-889f-d178-cf41-dbf8b373eda9,2016-10-16,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
70,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,2020-04-06,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
...,...,...,...,...,...
1818,c022272d-ecd2-92b6-32f4-3769ef3514f9,2013-02-16,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1823,7d87bdff-2df1-8162-80e4-2e406304515d,2013-06-16,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1836,d5c0cc48-5f8a-4533-325c-25a9c3284185,2020-10-04,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1859,eca28495-500f-ab7e-356b-176f31382569,2020-03-02,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)


Unnamed: 0,START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION
1,2012-07-12,2012-07-12,4288f90b-4774-c329-3176-c1482e824c04,7478a188-6e9b-6e3b-6f6c-b4f761647917,409089005,Febrile neutropenia (disorder)
11,2018-07-08,2018-07-08,f03f50be-20b1-3eae-2ed1-bb478bceb320,3952605e-1f86-c0ca-2cd4-526690fa035e,409089005,Febrile neutropenia (disorder)
26,2020-07-14,2020-07-14,11089781-c268-6838-642e-2c2c9edbb694,23f44790-ea99-7c28-af8d-9cdd2ac28a79,409089005,Febrile neutropenia (disorder)
31,2016-10-16,2016-10-16,d7b9725d-889f-d178-cf41-dbf8b373eda9,302a121f-18bc-ce22-5d53-d300b676318a,409089005,Febrile neutropenia (disorder)
38,2020-04-06,2020-04-06,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,340d5009-b1c2-6b4e-b723-61e729c0c49f,409089005,Febrile neutropenia (disorder)
...,...,...,...,...,...,...
1356,2013-06-16,2013-06-16,7d87bdff-2df1-8162-80e4-2e406304515d,28aabf74-2e03-26ef-d05c-6620789247a8,409089005,Febrile neutropenia (disorder)
1371,2020-10-04,2020-10-04,d5c0cc48-5f8a-4533-325c-25a9c3284185,159e09f4-e928-9b74-046a-c7a05e56e404,409089005,Febrile neutropenia (disorder)
1377,2020-09-20,2020-09-20,a4cabcbc-8282-5599-6b69-01e28e69c045,10cd460d-c8fe-415e-53c8-d4474b6a3b2e,409089005,Febrile neutropenia (disorder)
1381,2020-03-02,2020-03-02,eca28495-500f-ab7e-356b-176f31382569,5989068e-8c3b-61a3-acb8-49773e26e4bc,409089005,Febrile neutropenia (disorder)


Scenario 02:

In [232]:
inpatient02 = encounters02_i_b.query("ENCOUNTERCLASS == 'inpatient' & REASONDESCRIPTION == 'Acute myeloid leukemia  disease (disorder)'")
display(inpatient02)
display(nf02)

Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
16,bb37561b-ba65-7c47-db5b-0641bca883b4,2015-02-07,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
30,678fc07c-1cb1-acc6-3553-d848200626e9,2022-02-07,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
46,04efa71e-b8ed-980b-94a1-cd25e94b6015,2022-04-16,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
77,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,2018-02-13,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
109,fe4a9a3a-206c-048a-94ff-5c15a2ce745a,2017-11-09,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
...,...,...,...,...,...
1591,0b299455-7c54-03f3-2812-0d6509d5b286,2016-02-11,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1602,84c9e441-fa88-33c8-5d87-91b863998d26,1946-08-04,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1625,7837ca92-1dc3-b3ef-a7f9-4207e439775c,2013-01-08,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)
1635,3f3ad6c2-337b-9d48-47b7-b2197a0a0500,1971-07-18,inpatient,Encounter for problem (procedure),Acute myeloid leukemia disease (disorder)


Unnamed: 0,START,STOP,PATIENT,ENCOUNTER,CODE,DESCRIPTION
5,2015-02-07,2015-02-07,bb37561b-ba65-7c47-db5b-0641bca883b4,2628ed4a-c240-0c28-0462-84c39c3f1dad,409089005,Febrile neutropenia (disorder)
20,2022-02-07,2022-02-07,678fc07c-1cb1-acc6-3553-d848200626e9,398930ec-d19a-7ac2-5524-673994779f88,409089005,Febrile neutropenia (disorder)
24,2022-04-16,2022-04-16,04efa71e-b8ed-980b-94a1-cd25e94b6015,324892c0-ec71-bce5-2128-77987ec564be,409089005,Febrile neutropenia (disorder)
40,2018-02-13,2018-02-13,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,8c7ab1db-1c49-1274-e079-6fcdde5c3419,409089005,Febrile neutropenia (disorder)
52,2016-03-02,2016-03-02,8e9d1dd0-085f-a629-3ea6-9033f330b383,d1dd8300-d657-0c65-c8d7-444e8da4339c,409089005,Febrile neutropenia (disorder)
...,...,...,...,...,...,...
1093,1946-08-04,NaT,84c9e441-fa88-33c8-5d87-91b863998d26,94679abe-8fc5-27ac-9494-c36aa765f6eb,409089005,Febrile neutropenia (disorder)
1101,2013-01-08,2013-01-08,7837ca92-1dc3-b3ef-a7f9-4207e439775c,3e67f888-861d-6103-c4fe-9eef191b06b1,409089005,Febrile neutropenia (disorder)
1121,1971-07-18,NaT,3f3ad6c2-337b-9d48-47b7-b2197a0a0500,ae91d7a2-204f-62c3-eecc-a13353a190ba,409089005,Febrile neutropenia (disorder)
1128,2015-08-21,2015-08-21,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,dacf6281-6ef4-52f6-de6a-dd7c8271a4a0,409089005,Febrile neutropenia (disorder)


From the tables above, the patients were, in general, admitted on the same day they presented FN. We therefore did not use the admission data for the *features*.
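
The visual comparison of the two tables could be backed by a direct count. A sketch, assuming tables shaped like the inpatient and FN tables (one `PATIENT` and one `START` column each); `same_day_share` is a hypothetical helper and the data below are invented:

```python
import pandas as pd

def same_day_share(inpatient, fn):
    """Fraction of (admission, FN) pairs per patient whose start dates coincide."""
    merged = inpatient[['PATIENT', 'START']].merge(
        fn[['PATIENT', 'START']], on='PATIENT', suffixes=('_ADMISSION', '_FN')
    )
    same_day = merged['START_ADMISSION'] == merged['START_FN']
    return same_day.mean()

# Toy data (not taken from the study files)
inpatient_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c'],
    'START': pd.to_datetime(['2012-07-12', '2018-07-08', '2020-01-01']),
})
fn_toy = pd.DataFrame({
    'PATIENT': ['a', 'b', 'c'],
    'START': pd.to_datetime(['2012-07-12', '2018-07-08', '2020-01-05']),
})
share = same_day_share(inpatient_toy, fn_toy)
print(f'{share:.2f}')
```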

### Wellness

We decided to investigate how many times each patient had a "wellness" encounter.

Scenario 01:

In [233]:
# Number of "wellness" encounters per patient
wellness01 = encounters01_i_b.query("ENCOUNTERCLASS == 'wellness'")['PATIENT'].value_counts().reset_index()
wellness01.columns = ['PATIENT', 'WELLNESS_CONT']
wellness01

Unnamed: 0,PATIENT,WELLNESS_CONT
0,cfe7cd0f-57de-4f43-d870-221bba80520b,17
1,6ada49d6-cce2-faea-781d-5f4c282c49b4,17
2,94933579-3900-8b7d-8c05-e473910e791a,16
3,8a9517bb-bd45-da96-f962-668c905ceecb,16
4,2b7f1733-b575-269e-39d0-a96df81c410d,15
...,...,...
131,ae5f15a1-427f-ed8c-f258-f25cfb0fd85a,1
132,adf63106-46a4-e19c-349a-f0a3eec09e33,1
133,1cee6029-ca41-7786-fc4f-04aea9778336,1
134,2dfb6d26-2f71-ecc5-6b9d-ea20dc1ae490,1
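
The per-patient count in the cell above is repeated later for other encounter classes and for both scenarios. As a rough sketch only, the step could be factored into a helper. The name `count_encounters`, its parameters, and the toy data below are hypothetical and not part of the original notebook, which calls `value_counts()` directly.

```python
import pandas as pd

def count_encounters(encounters, encounter_class, count_col, description=None):
    """Count encounters of one class per patient (hypothetical helper)."""
    mask = encounters['ENCOUNTERCLASS'] == encounter_class
    if description is not None:
        # Optionally restrict to a single encounter description
        mask = mask & (encounters['DESCRIPTION'] == description)
    counts = (
        encounters.loc[mask]
        .groupby('PATIENT')
        .size()
        .rename(count_col)
        .reset_index()
    )
    # Largest counts first, as value_counts() would return them
    return counts.sort_values(count_col, ascending=False, ignore_index=True)

# Toy encounters table (made-up patients and rows, for illustration only)
encounters = pd.DataFrame({
    'PATIENT': ['p1', 'p1', 'p2', 'p2', 'p2', 'p3'],
    'ENCOUNTERCLASS': ['wellness', 'ambulatory', 'wellness',
                       'wellness', 'ambulatory', 'ambulatory'],
    'DESCRIPTION': ['Well child visit', 'Encounter for symptom', 'Well child visit',
                    'Well child visit', 'Encounter for problem', 'Encounter for symptom'],
})

wellness = count_encounters(encounters, 'wellness', 'WELLNESS_CONT')
symptom = count_encounters(encounters, 'ambulatory', 'AMBULATORY_SYMPTOM_CONT',
                           description='Encounter for symptom')
print(wellness)
print(symptom)
```

Patients with no matching encounter are simply absent from the result, exactly as with `value_counts()`, so the zero-filling after the merge is still needed.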


In [234]:
#Adding whether the patient died with FN and the age at which they had FN
wellness01 = wellness01.merge(scenario01[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
wellness01

Unnamed: 0,PATIENT,WELLNESS_CONT,DEATH_FN,AGE_FN_YEARS
0,cfe7cd0f-57de-4f43-d870-221bba80520b,17,0,19
1,6ada49d6-cce2-faea-781d-5f4c282c49b4,17,1,9
2,94933579-3900-8b7d-8c05-e473910e791a,16,0,19
3,8a9517bb-bd45-da96-f962-668c905ceecb,16,1,7
4,2b7f1733-b575-269e-39d0-a96df81c410d,15,0,18
...,...,...,...,...
131,ae5f15a1-427f-ed8c-f258-f25cfb0fd85a,1,0,0
132,adf63106-46a4-e19c-349a-f0a3eec09e33,1,1,0
133,1cee6029-ca41-7786-fc4f-04aea9778336,1,0,0
134,2dfb6d26-2f71-ecc5-6b9d-ea20dc1ae490,1,0,12


In [235]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
wellness01_m = wellness01.loc[wellness01['WELLNESS_CONT'] >= 8]
print('Number of patients with at least 8 "wellness" encounters:')
print(len(wellness01_m))
print('Number of deaths among patients with at least 8 "wellness" encounters:')
print(wellness01_m['DEATH_FN'].sum())
wellness01_l = wellness01.loc[wellness01['WELLNESS_CONT'] < 8]
print('\nNumber of patients with fewer than 8 "wellness" encounters:')
print(len(wellness01_l))
print('Number of deaths among patients with fewer than 8 "wellness" encounters:')
print(wellness01_l['DEATH_FN'].sum())

Number of patients with at least 8 "wellness" encounters:
68
Number of deaths among patients with at least 8 "wellness" encounters:
21

Number of patients with fewer than 8 "wellness" encounters:
68
Number of deaths among patients with fewer than 8 "wellness" encounters:
5


In [236]:
#Basic descriptive statistics of the ages at which the patients had FN
print('Descriptive statistics of the ages of patients with at least 8 encounters:')
print(wellness01_m['AGE_FN_YEARS'].describe())
print('\nDescriptive statistics of the ages of patients with fewer than 8 encounters:')
print(wellness01_l['AGE_FN_YEARS'].describe())

Descriptive statistics of the ages of patients with at least 8 encounters:
count    68.000000
mean      6.602941
std       5.707130
min       2.000000
25%       2.000000
50%       4.000000
75%       9.000000
max      21.000000
Name: AGE_FN_YEARS, dtype: float64

Descriptive statistics of the ages of patients with fewer than 8 encounters:
count    68.000000
mean      9.705882
std       7.062744
min       0.000000
25%       3.000000
50%      10.000000
75%      16.000000
max      21.000000
Name: AGE_FN_YEARS, dtype: float64


Scenario 02:

In [237]:
#Number of "wellness" encounters per patient
wellness02 = encounters02_i_b.query("ENCOUNTERCLASS == 'wellness'")['PATIENT'].value_counts().reset_index()
wellness02.columns = ['PATIENT', 'WELLNESS_CONT']
wellness02

Unnamed: 0,PATIENT,WELLNESS_CONT
0,d8ffc486-d61a-8dcf-fd09-19b12534eef9,17
1,c8c17f8f-5d4d-5f59-0f25-87ae32e6e16a,17
2,8178457c-4eba-11f7-14a1-21b06e02c761,17
3,c3a6a691-c2b9-a81e-3b0e-46362af0d52e,16
4,1748fa10-1d91-69e5-bc0f-2b4041712e11,16
...,...,...
110,88d26be4-93fc-0254-98a2-4cc3f8259220,1
111,4907e10f-4a61-2ea0-598a-633faa36da2c,1
112,a1462169-9a8a-2090-f8ba-5a156f097e35,1
113,243d3ef4-0fb2-edb0-441f-45b5701bf2db,1


In [238]:
#Adding whether the patient died with FN and the age at which they had FN
wellness02 = wellness02.merge(scenario02[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
wellness02

Unnamed: 0,PATIENT,WELLNESS_CONT,DEATH_FN,AGE_FN_YEARS
0,d8ffc486-d61a-8dcf-fd09-19b12534eef9,17,0,18
1,c8c17f8f-5d4d-5f59-0f25-87ae32e6e16a,17,0,9
2,8178457c-4eba-11f7-14a1-21b06e02c761,17,0,8
3,c3a6a691-c2b9-a81e-3b0e-46362af0d52e,16,1,7
4,1748fa10-1d91-69e5-bc0f-2b4041712e11,16,1,7
...,...,...,...,...
110,88d26be4-93fc-0254-98a2-4cc3f8259220,1,0,7
111,4907e10f-4a61-2ea0-598a-633faa36da2c,1,0,5
112,a1462169-9a8a-2090-f8ba-5a156f097e35,1,0,3
113,243d3ef4-0fb2-edb0-441f-45b5701bf2db,1,0,0


In [239]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
wellness02_m = wellness02.loc[wellness02['WELLNESS_CONT'] >= 8]
print('Number of patients with at least 8 "wellness" encounters:')
print(len(wellness02_m))
print('Number of deaths among patients with at least 8 "wellness" encounters:')
print(wellness02_m['DEATH_FN'].sum())
wellness02_l = wellness02.loc[wellness02['WELLNESS_CONT'] < 8]
print('\nNumber of patients with fewer than 8 "wellness" encounters:')
print(len(wellness02_l))
print('Number of deaths among patients with fewer than 8 "wellness" encounters:')
print(wellness02_l['DEATH_FN'].sum())

Number of patients with at least 8 "wellness" encounters:
59
Number of deaths among patients with at least 8 "wellness" encounters:
11

Number of patients with fewer than 8 "wellness" encounters:
56
Number of deaths among patients with fewer than 8 "wellness" encounters:
2


In [240]:
#Basic descriptive statistics of the ages at which the patients had FN
print('Descriptive statistics of the ages of patients with at least 8 encounters:')
print(wellness02_m['AGE_FN_YEARS'].describe())
print('\nDescriptive statistics of the ages of patients with fewer than 8 encounters:')
print(wellness02_l['AGE_FN_YEARS'].describe())

Descriptive statistics of the ages of patients with at least 8 encounters:
count    59.000000
mean      7.152542
std       5.851064
min       2.000000
25%       2.500000
50%       5.000000
75%      10.000000
max      20.000000
Name: AGE_FN_YEARS, dtype: float64

Descriptive statistics of the ages of patients with fewer than 8 encounters:
count    56.000000
mean      8.607143
std       6.735735
min       0.000000
25%       2.000000
50%       8.000000
75%      13.250000
max      21.000000
Name: AGE_FN_YEARS, dtype: float64
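
The comparisons above rely on raw counts. One way to gauge whether the difference in deaths between the two groups could plausibly be due to chance is Fisher's exact test on the 2x2 table of deaths by group. The sketch below is an addition, not part of the original analysis: it uses only the counts printed above for both scenarios, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(dead_a, total_a, dead_b, total_b):
    """Two-sided Fisher's exact test p-value for deaths in two groups."""
    total_dead = dead_a + dead_b
    denominator = comb(total_a + total_b, total_dead)

    def prob(x):
        # Hypergeometric probability that x of the deaths fall in group A
        return comb(total_a, x) * comb(total_b, total_dead - x) / denominator

    lowest = max(0, total_dead - total_b)
    highest = min(total_a, total_dead)
    p_observed = prob(dead_a)
    # Sum the probabilities of every table at least as unlikely as the observed one
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Deaths / patients with >= 8 versus < 8 "wellness" encounters, as printed above
p_scenario01 = fisher_exact_two_sided(21, 68, 5, 68)
p_scenario02 = fisher_exact_two_sided(11, 59, 2, 56)
print(f'Scenario 01: p = {p_scenario01:.4f}')
print(f'Scenario 02: p = {p_scenario02:.4f}')
```

A small p-value would only indicate an association. The age statistics above show that the two groups also differ in age at FN, which a test like this does not account for.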


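Based on these analyses, we decided to add the count of "wellness" encounters as a *feature* for building the model: among patients with at least 8 such encounters, 21 of 68 died with FN in scenario 01 and 11 of 59 in scenario 02, against 5 of 68 and 2 of 56 among patients with fewer encounters.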

In [241]:
#Adding the count of "wellness" encounters to the tables for Orange:
scenario01 = scenario01.merge(wellness01[['PATIENT', 'WELLNESS_CONT']], on='PATIENT', how='left')
#Filling in zero for patients who had no encounters of this type
scenario01['WELLNESS_CONT'] = scenario01['WELLNESS_CONT'].fillna(0)

display(scenario01)

scenario02 = scenario02.merge(wellness02[['PATIENT', 'WELLNESS_CONT']], on='PATIENT', how='left')
#Filling in zero for patients who had no encounters of this type
scenario02['WELLNESS_CONT'] = scenario02['WELLNESS_CONT'].fillna(0)

display(scenario02)

Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA,WELLNESS_CONT
0,4288f90b-4774-c329-3176-c1482e824c04,2010-07-13,white,nonhispanic,M,0,2,0,1.0
1,f03f50be-20b1-3eae-2ed1-bb478bceb320,2002-07-12,white,nonhispanic,M,0,16,1,6.0
2,11089781-c268-6838-642e-2c2c9edbb694,2011-07-17,white,nonhispanic,F,0,9,0,12.0
3,d7b9725d-889f-d178-cf41-dbf8b373eda9,2010-10-18,black,nonhispanic,F,0,6,0,7.0
4,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,2008-04-09,asian,nonhispanic,M,0,12,0,7.0
...,...,...,...,...,...,...,...,...,...
134,7d87bdff-2df1-8162-80e4-2e406304515d,2011-06-17,white,nonhispanic,M,0,2,0,4.0
135,d5c0cc48-5f8a-4533-325c-25a9c3284185,2017-10-05,other,nonhispanic,F,0,3,1,11.0
136,a4cabcbc-8282-5599-6b69-01e28e69c045,2004-09-24,black,nonhispanic,M,0,16,0,8.0
137,eca28495-500f-ab7e-356b-176f31382569,2017-03-03,white,nonhispanic,F,0,3,0,11.0


Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA,WELLNESS_CONT
0,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-02-08,white,nonhispanic,M,0,3,0,9.0
1,678fc07c-1cb1-acc6-3553-d848200626e9,2006-02-11,white,nonhispanic,F,0,16,0,9.0
2,04efa71e-b8ed-980b-94a1-cd25e94b6015,2018-04-17,other,nonhispanic,F,0,4,0,13.0
3,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,1997-02-18,white,nonhispanic,F,0,21,1,4.0
4,8e9d1dd0-085f-a629-3ea6-9033f330b383,1996-03-07,white,nonhispanic,F,0,20,1,3.0
...,...,...,...,...,...,...,...,...,...
112,84c9e441-fa88-33c8-5d87-91b863998d26,1944-08-04,white,nonhispanic,F,1,2,1,9.0
113,7837ca92-1dc3-b3ef-a7f9-4207e439775c,1996-01-13,white,nonhispanic,M,0,17,0,15.0
114,3f3ad6c2-337b-9d48-47b7-b2197a0a0500,1969-07-18,white,nonhispanic,F,1,2,1,9.0
115,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,2006-08-23,white,nonhispanic,M,0,9,0,8.0
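
After the left merge, patients with no encounter of the given type get a missing value, which the cell above replaces with zero. Because the column held missing values at some point, it stays floating-point (hence `1.0`, `6.0`, ...). A minimal sketch of how the count could be kept as an integer, and of how the merge could be guarded against accidental row duplication, assuming one row per patient in both tables (the toy tables below are hypothetical):

```python
import pandas as pd

# Toy stand-ins for the scenario and count tables (made-up values)
scenario = pd.DataFrame({'PATIENT': ['p1', 'p2', 'p3'], 'DEATH_FN': [0, 1, 0]})
counts = pd.DataFrame({'PATIENT': ['p1', 'p3'], 'WELLNESS_CONT': [4, 9]})

# validate='one_to_one' raises a MergeError if a patient appears twice on either side
merged = scenario.merge(counts, on='PATIENT', how='left', validate='one_to_one')

# Patients absent from the count table had no encounters: fill with 0, then restore integers
merged['WELLNESS_CONT'] = merged['WELLNESS_CONT'].fillna(0).astype(int)
print(merged)
```

Whether integer or float counts are preferable depends on how the downstream tool reads the column, so this is a matter of choice rather than a correction.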


### Ambulatory

Finally, we investigated "ambulatory" encounters.

Scenario 01:

In [242]:
encounters01_i_b.query("ENCOUNTERCLASS == 'ambulatory'")

Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
2,f03f50be-20b1-3eae-2ed1-bb478bceb320,2003-10-07,ambulatory,Encounter for problem,
3,f03f50be-20b1-3eae-2ed1-bb478bceb320,2003-10-23,ambulatory,Encounter for problem,
7,f03f50be-20b1-3eae-2ed1-bb478bceb320,2013-02-03,ambulatory,Encounter for 'check-up',Second degree burn
10,f03f50be-20b1-3eae-2ed1-bb478bceb320,2015-07-17,ambulatory,Allergic disorder initial assessment,
14,f03f50be-20b1-3eae-2ed1-bb478bceb320,2017-10-28,ambulatory,Encounter for symptom,Viral sinusitis (disorder)
...,...,...,...,...,...
1868,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1943-05-03,ambulatory,Encounter for symptom,Acute viral pharyngitis (disorder)
1870,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1943-09-12,ambulatory,Encounter for symptom,Streptococcal sore throat (disorder)
1871,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1944-04-18,ambulatory,Encounter for symptom,Viral sinusitis (disorder)
1874,c815ffa4-2917-2c09-1569-e90d5a89eeb2,1946-04-12,ambulatory,Encounter for symptom,Streptococcal sore throat (disorder)


In [243]:
encounters01_i_b.query("ENCOUNTERCLASS == 'ambulatory'")[['DESCRIPTION','REASONDESCRIPTION']].value_counts()

DESCRIPTION                                                            REASONDESCRIPTION                                  
Encounter for symptom                                                  Viral sinusitis (disorder)                             68
Prenatal visit                                                         Normal pregnancy                                       46
Encounter for symptom                                                  Acute viral pharyngitis (disorder)                     40
                                                                       Acute bronchitis (disorder)                            37
                                                                       Streptococcal sore throat (disorder)                   16
Encounter for problem                                                  Child attention deficit disorder                       15
Hypertension follow-up encounter                                       Hypertension                    

In [244]:
#Number of "ambulatory" encounters per patient
ambulatory01 = encounters01_i_b.query("ENCOUNTERCLASS == 'ambulatory'")['PATIENT'].value_counts().reset_index()
ambulatory01.columns = ['PATIENT', 'AMBULATORY_CONT']
ambulatory01

Unnamed: 0,PATIENT,AMBULATORY_CONT
0,a259d333-6bb2-30f5-ecb9-d5ff168905b0,68
1,94933579-3900-8b7d-8c05-e473910e791a,56
2,98342e81-23da-dae8-5506-8da094c7d29e,33
3,160a53cf-63b1-790d-5e95-307e88c1dc9c,18
4,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,16
...,...,...
96,3e2371ed-a630-0958-84b2-c136090b8a75,1
97,840526e5-34c6-8620-f27e-f7639bf1c09f,1
98,c8d9f38a-fe45-0b77-3814-6e01a9c59702,1
99,0bf53c97-99af-af4a-9c2a-e80366de43ee,1


In [245]:
#Adding whether the patient died with FN and the age at which they had FN
ambulatory01 = ambulatory01.merge(scenario01[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
ambulatory01

Unnamed: 0,PATIENT,AMBULATORY_CONT,DEATH_FN,AGE_FN_YEARS
0,a259d333-6bb2-30f5-ecb9-d5ff168905b0,68,0,21
1,94933579-3900-8b7d-8c05-e473910e791a,56,0,19
2,98342e81-23da-dae8-5506-8da094c7d29e,33,0,21
3,160a53cf-63b1-790d-5e95-307e88c1dc9c,18,1,21
4,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,16,0,12
...,...,...,...,...
96,3e2371ed-a630-0958-84b2-c136090b8a75,1,0,2
97,840526e5-34c6-8620-f27e-f7639bf1c09f,1,0,4
98,c8d9f38a-fe45-0b77-3814-6e01a9c59702,1,0,1
99,0bf53c97-99af-af4a-9c2a-e80366de43ee,1,0,2


In [246]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
threshold = 3
ambulatory01_m = ambulatory01.loc[ambulatory01['AMBULATORY_CONT'] >= threshold]
print(f'Number of patients with at least {threshold} "ambulatory" encounters:')
print(len(ambulatory01_m))
print(f'Number of deaths among patients with at least {threshold} "ambulatory" encounters:')
print(ambulatory01_m['DEATH_FN'].sum())
ambulatory01_l = ambulatory01.loc[ambulatory01['AMBULATORY_CONT'] < threshold]
print(f'\nNumber of patients with fewer than {threshold} "ambulatory" encounters:')
print(len(ambulatory01_l))
print(f'Number of deaths among patients with fewer than {threshold} "ambulatory" encounters:')
print(ambulatory01_l['DEATH_FN'].sum())

Number of patients with at least 3 "ambulatory" encounters:
45
Number of deaths among patients with at least 3 "ambulatory" encounters:
8

Number of patients with fewer than 3 "ambulatory" encounters:
56
Number of deaths among patients with fewer than 3 "ambulatory" encounters:
10


Scenario 02:

In [247]:
encounters02_i_b.query("ENCOUNTERCLASS == 'ambulatory'")

Unnamed: 0,PATIENT,START,ENCOUNTERCLASS,DESCRIPTION,REASONDESCRIPTION
4,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-02-08,ambulatory,Encounter for symptom,Viral sinusitis (disorder)
7,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-04-12,ambulatory,Encounter for problem,
8,bb37561b-ba65-7c47-db5b-0641bca883b4,2013-04-29,ambulatory,Encounter for problem,
15,bb37561b-ba65-7c47-db5b-0641bca883b4,2015-01-22,ambulatory,Encounter for 'check-up',Fracture of ankle
17,678fc07c-1cb1-acc6-3553-d848200626e9,2012-07-10,ambulatory,Encounter for symptom,Sinusitis (disorder)
...,...,...,...,...,...
1653,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,2014-04-30,ambulatory,Encounter for symptom,Acute viral pharyngitis (disorder)
1655,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,2014-06-18,ambulatory,Asthma follow-up,Childhood asthma
1661,2e3402a9-45ff-3558-8450-375f17107ca0,2017-02-22,ambulatory,Encounter for symptom,Acute viral pharyngitis (disorder)
1663,2e3402a9-45ff-3558-8450-375f17107ca0,2017-05-09,ambulatory,Encounter for symptom,Acute viral pharyngitis (disorder)


In [248]:
encounters02_i_b.query("ENCOUNTERCLASS == 'ambulatory'")[['DESCRIPTION','REASONDESCRIPTION']].value_counts()

DESCRIPTION                                                            REASONDESCRIPTION                                  
Encounter for symptom                                                  Viral sinusitis (disorder)                             56
                                                                       Acute viral pharyngitis (disorder)                     35
Asthma follow-up                                                       Childhood asthma                                       24
Prenatal visit                                                         Normal pregnancy                                       24
Encounter for symptom                                                  Acute bronchitis (disorder)                            21
                                                                       Sinusitis (disorder)                                   14
                                                                       Streptococcal sore throat (disor

In [249]:
#Number of "ambulatory" encounters per patient
ambulatory02 = encounters02_i_b.query("ENCOUNTERCLASS == 'ambulatory'")['PATIENT'].value_counts().reset_index()
ambulatory02.columns = ['PATIENT', 'AMBULATORY_CONT']
ambulatory02

Unnamed: 0,PATIENT,AMBULATORY_CONT
0,cd1762bb-df22-3c40-7ff3-6a4b118a4985,68
1,c5afd70c-1466-5035-af33-d51418577f7a,45
2,03edc846-5e5d-00d6-e7a5-7e687f084b40,44
3,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,21
4,d8ffc486-d61a-8dcf-fd09-19b12534eef9,19
...,...,...
77,e1b08886-e93d-ff5b-5c80-dacbbd8375c0,1
78,6b17b083-39c0-f107-e905-ee89ad02d0ed,1
79,9fcdfd12-39a9-a5a9-cbe8-6b3f394663ec,1
80,7191d0ae-428f-da64-469a-82e9fb352db6,1


In [250]:
#Adding whether the patient died with FN and the age at which they had FN
ambulatory02 = ambulatory02.merge(scenario02[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
ambulatory02

Unnamed: 0,PATIENT,AMBULATORY_CONT,DEATH_FN,AGE_FN_YEARS
0,cd1762bb-df22-3c40-7ff3-6a4b118a4985,68,0,19
1,c5afd70c-1466-5035-af33-d51418577f7a,45,0,19
2,03edc846-5e5d-00d6-e7a5-7e687f084b40,44,0,18
3,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,21,0,21
4,d8ffc486-d61a-8dcf-fd09-19b12534eef9,19,0,18
...,...,...,...,...
77,e1b08886-e93d-ff5b-5c80-dacbbd8375c0,1,0,5
78,6b17b083-39c0-f107-e905-ee89ad02d0ed,1,0,6
79,9fcdfd12-39a9-a5a9-cbe8-6b3f394663ec,1,0,12
80,7191d0ae-428f-da64-469a-82e9fb352db6,1,0,1


In [251]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
threshold = 3
ambulatory02_m = ambulatory02.loc[ambulatory02['AMBULATORY_CONT'] >= threshold]
print(f'Number of patients with at least {threshold} "ambulatory" encounters:')
print(len(ambulatory02_m))
print(f'Number of deaths among patients with at least {threshold} "ambulatory" encounters:')
print(ambulatory02_m['DEATH_FN'].sum())
ambulatory02_l = ambulatory02.loc[ambulatory02['AMBULATORY_CONT'] < threshold]
print(f'\nNumber of patients with fewer than {threshold} "ambulatory" encounters:')
print(len(ambulatory02_l))
print(f'Number of deaths among patients with fewer than {threshold} "ambulatory" encounters:')
print(ambulatory02_l['DEATH_FN'].sum())

Number of patients with at least 3 "ambulatory" encounters:
42
Number of deaths among patients with at least 3 "ambulatory" encounters:
2

Number of patients with fewer than 3 "ambulatory" encounters:
40
Number of deaths among patients with fewer than 3 "ambulatory" encounters:
6
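
The cutoff of 3 used in the cells above is only one possible choice. A minimal sketch of how the same comparison could be tabulated for several cutoffs at once, assuming a table shaped like `ambulatory01` with a count column and `DEATH_FN` (the function name `deaths_by_cutoff` and the toy data are hypothetical, not part of the original notebook):

```python
import pandas as pd

def deaths_by_cutoff(table, count_col, cutoffs, death_col='DEATH_FN'):
    """For each cutoff, count patients and deaths at/above and below it."""
    rows = []
    for cutoff in cutoffs:
        at_least = table[table[count_col] >= cutoff]
        below = table[table[count_col] < cutoff]
        rows.append({
            'cutoff': cutoff,
            'n_at_least': len(at_least),
            'deaths_at_least': int(at_least[death_col].sum()),
            'n_below': len(below),
            'deaths_below': int(below[death_col].sum()),
        })
    return pd.DataFrame(rows)

# Toy table (made-up counts and outcomes, for illustration only)
toy = pd.DataFrame({
    'AMBULATORY_CONT': [1, 1, 2, 3, 5, 8],
    'DEATH_FN': [0, 1, 0, 0, 1, 0],
})
summary = deaths_by_cutoff(toy, 'AMBULATORY_CONT', cutoffs=[2, 3, 5])
print(summary)
```

Each row of the result corresponds to one run of the cell above with a different `threshold`.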


Even with different cutoff values, there does not appear to be a relationship between the number of ambulatory encounters and the number of patients who died with FN, at least not a direct one.

Since many of these encounters do not appear to be related to the cancer diagnosis or to FN (for example, encounters for fractures, burns, or pregnancy), we focused on symptom-related ambulatory encounters, which were also the most common.

Scenario 01:

In [252]:
#Number of "ambulatory" encounters with description "Encounter for symptom" per patient
ambulatory01_symptom = encounters01_i_b.query("ENCOUNTERCLASS == 'ambulatory' & DESCRIPTION == 'Encounter for symptom'")['PATIENT'].value_counts().reset_index()
ambulatory01_symptom.columns = ['PATIENT', 'AMBULATORY_SYMPTOM_CONT']
ambulatory01_symptom

Unnamed: 0,PATIENT,AMBULATORY_SYMPTOM_CONT
0,c815ffa4-2917-2c09-1569-e90d5a89eeb2,7
1,ccecb758-0a9b-9670-23e6-9e405f69a690,7
2,eabfa666-29b5-64f3-105a-e8ea7653ae70,7
3,6ada49d6-cce2-faea-781d-5f4c282c49b4,5
4,3075db0f-918e-3d24-0782-5b359d23e357,5
...,...,...
85,659eac70-eee6-e516-e203-51ad92734512,1
86,8b35b4c6-aa09-d002-619b-ac63d58d4603,1
87,b54ad3df-4750-0941-f1ae-8b7236f8f6dc,1
88,a08eb05d-46f4-c6cf-feeb-b0ac56f61458,1


In [253]:
#Adding whether the patient died with FN and the age at which they had FN
ambulatory01_symptom = ambulatory01_symptom.merge(scenario01[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
ambulatory01_symptom

Unnamed: 0,PATIENT,AMBULATORY_SYMPTOM_CONT,DEATH_FN,AGE_FN_YEARS
0,c815ffa4-2917-2c09-1569-e90d5a89eeb2,7,1,15
1,ccecb758-0a9b-9670-23e6-9e405f69a690,7,1,12
2,eabfa666-29b5-64f3-105a-e8ea7653ae70,7,1,13
3,6ada49d6-cce2-faea-781d-5f4c282c49b4,5,1,9
4,3075db0f-918e-3d24-0782-5b359d23e357,5,1,12
...,...,...,...,...
85,659eac70-eee6-e516-e203-51ad92734512,1,0,4
86,8b35b4c6-aa09-d002-619b-ac63d58d4603,1,0,21
87,b54ad3df-4750-0941-f1ae-8b7236f8f6dc,1,0,14
88,a08eb05d-46f4-c6cf-feeb-b0ac56f61458,1,0,20


In [254]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
threshold = 3
ambulatory01_symptom_m = ambulatory01_symptom.loc[ambulatory01_symptom['AMBULATORY_SYMPTOM_CONT'] >= threshold]
print(f'Number of patients with at least {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(len(ambulatory01_symptom_m))
print(f'Number of deaths among patients with at least {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(ambulatory01_symptom_m['DEATH_FN'].sum())
ambulatory01_symptom_l = ambulatory01_symptom.loc[ambulatory01_symptom['AMBULATORY_SYMPTOM_CONT'] < threshold]
print(f'\nNumber of patients with fewer than {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(len(ambulatory01_symptom_l))
print(f'Number of deaths among patients with fewer than {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(ambulatory01_symptom_l['DEATH_FN'].sum())

Number of patients with at least 3 "ambulatory" encounters with description "Encounter for symptom":
26
Number of deaths among patients with at least 3 "ambulatory" encounters with description "Encounter for symptom":
7

Number of patients with fewer than 3 "ambulatory" encounters with description "Encounter for symptom":
64
Number of deaths among patients with fewer than 3 "ambulatory" encounters with description "Encounter for symptom":
9


Scenario 02:

In [255]:
#Number of "ambulatory" encounters with description "Encounter for symptom" per patient
ambulatory02_symptom = encounters02_i_b.query("ENCOUNTERCLASS == 'ambulatory' & DESCRIPTION == 'Encounter for symptom'")['PATIENT'].value_counts().reset_index()
ambulatory02_symptom.columns = ['PATIENT', 'AMBULATORY_SYMPTOM_CONT']
ambulatory02_symptom

Unnamed: 0,PATIENT,AMBULATORY_SYMPTOM_CONT
0,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,10
1,8178457c-4eba-11f7-14a1-21b06e02c761,10
2,3edd7198-2494-cc95-d844-ab3a3adb562f,6
3,fe9e8ecf-79e1-545d-a28b-26274781bd97,5
4,c8c17f8f-5d4d-5f59-0f25-87ae32e6e16a,4
...,...,...
68,0ffcd85c-f4c2-de73-7f70-10169af7fa24,1
69,774dd3ba-4ad9-d7b5-6cc1-1fd279c4a2ba,1
70,e3dc2754-0b44-08f0-daa3-25dcaef3709d,1
71,2a372f10-22e0-8773-72e1-108775c5aee9,1


In [256]:
#Adding whether the patient died with FN and the age at which they had FN
ambulatory02_symptom = ambulatory02_symptom.merge(scenario02[['PATIENT', 'DEATH_FN', 'AGE_FN_YEARS']], on='PATIENT', how='left')
ambulatory02_symptom

Unnamed: 0,PATIENT,AMBULATORY_SYMPTOM_CONT,DEATH_FN,AGE_FN_YEARS
0,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,10,0,21
1,8178457c-4eba-11f7-14a1-21b06e02c761,10,0,8
2,3edd7198-2494-cc95-d844-ab3a3adb562f,6,0,19
3,fe9e8ecf-79e1-545d-a28b-26274781bd97,5,0,12
4,c8c17f8f-5d4d-5f59-0f25-87ae32e6e16a,4,0,9
...,...,...,...,...
68,0ffcd85c-f4c2-de73-7f70-10169af7fa24,1,0,19
69,774dd3ba-4ad9-d7b5-6cc1-1fd279c4a2ba,1,0,2
70,e3dc2754-0b44-08f0-daa3-25dcaef3709d,1,0,19
71,2a372f10-22e0-8773-72e1-108775c5aee9,1,0,3


In [257]:
#Simple check of whether the number of deaths differs between patients with more and fewer encounters of this type
threshold = 3
ambulatory02_symptom_m = ambulatory02_symptom.loc[ambulatory02_symptom['AMBULATORY_SYMPTOM_CONT'] >= threshold]
print(f'Number of patients with at least {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(len(ambulatory02_symptom_m))
print(f'Number of deaths among patients with at least {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(ambulatory02_symptom_m['DEATH_FN'].sum())
ambulatory02_symptom_l = ambulatory02_symptom.loc[ambulatory02_symptom['AMBULATORY_SYMPTOM_CONT'] < threshold]
print(f'\nNumber of patients with fewer than {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(len(ambulatory02_symptom_l))
print(f'Number of deaths among patients with fewer than {threshold} "ambulatory" encounters with description "Encounter for symptom":')
print(ambulatory02_symptom_l['DEATH_FN'].sum())

Number of patients with at least 3 "ambulatory" encounters with description "Encounter for symptom":
19
Number of deaths among patients with at least 3 "ambulatory" encounters with description "Encounter for symptom":
2

Number of patients with fewer than 3 "ambulatory" encounters with description "Encounter for symptom":
54
Number of deaths among patients with fewer than 3 "ambulatory" encounters with description "Encounter for symptom":
5
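
Because the two groups differ in size, the raw death counts are easier to compare as proportions. The sketch below, an addition that uses only the counts printed above for the symptom-related ambulatory encounters, computes the death proportion in each group and their ratio; the function name `death_proportions` is hypothetical.

```python
def death_proportions(deaths_more, n_more, deaths_fewer, n_fewer):
    """Return the death proportion in each group and the ratio between them."""
    prop_more = deaths_more / n_more
    prop_fewer = deaths_fewer / n_fewer
    return prop_more, prop_fewer, prop_more / prop_fewer

# (deaths, patients) with >= 3 and with < 3 symptom-related ambulatory encounters
counts = {
    'Scenario 01': (7, 26, 9, 64),
    'Scenario 02': (2, 19, 5, 54),
}
for name, group_counts in counts.items():
    prop_more, prop_fewer, ratio = death_proportions(*group_counts)
    print(f'{name}: {prop_more:.1%} vs {prop_fewer:.1%} (ratio {ratio:.2f})')
```

With so few deaths in each group, these ratios are rough indications rather than stable estimates.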


In scenario 01, the proportion of patients who died with FN was higher among those with more symptom-related ambulatory encounters (7 of 26 patients with at least 3 such encounters, against 9 of 64 with fewer). The same relationship was not observed in scenario 02 (2 of 19 against 5 of 54). Even so, we chose to add the count of symptom-related ambulatory encounters to the *features* used in the model.

In [258]:
#Adding the count of "ambulatory" encounters with description "Encounter for symptom" to the tables for Orange:
scenario01 = scenario01.merge(ambulatory01_symptom[['PATIENT', 'AMBULATORY_SYMPTOM_CONT']], on='PATIENT', how='left')
#Filling in zero for patients who had no encounters of this type
scenario01['AMBULATORY_SYMPTOM_CONT'] = scenario01['AMBULATORY_SYMPTOM_CONT'].fillna(0)

display(scenario01)

scenario02 = scenario02.merge(ambulatory02_symptom[['PATIENT', 'AMBULATORY_SYMPTOM_CONT']], on='PATIENT', how='left')
#Filling in zero for patients who had no encounters of this type
scenario02['AMBULATORY_SYMPTOM_CONT'] = scenario02['AMBULATORY_SYMPTOM_CONT'].fillna(0)

display(scenario02)

Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA,WELLNESS_CONT,AMBULATORY_SYMPTOM_CONT
0,4288f90b-4774-c329-3176-c1482e824c04,2010-07-13,white,nonhispanic,M,0,2,0,1.0,0.0
1,f03f50be-20b1-3eae-2ed1-bb478bceb320,2002-07-12,white,nonhispanic,M,0,16,1,6.0,1.0
2,11089781-c268-6838-642e-2c2c9edbb694,2011-07-17,white,nonhispanic,F,0,9,0,12.0,2.0
3,d7b9725d-889f-d178-cf41-dbf8b373eda9,2010-10-18,black,nonhispanic,F,0,6,0,7.0,1.0
4,ed2bb6aa-1f3c-b72b-46a6-54f05cac7da7,2008-04-09,asian,nonhispanic,M,0,12,0,7.0,2.0
...,...,...,...,...,...,...,...,...,...,...
134,7d87bdff-2df1-8162-80e4-2e406304515d,2011-06-17,white,nonhispanic,M,0,2,0,4.0,0.0
135,d5c0cc48-5f8a-4533-325c-25a9c3284185,2017-10-05,other,nonhispanic,F,0,3,1,11.0,1.0
136,a4cabcbc-8282-5599-6b69-01e28e69c045,2004-09-24,black,nonhispanic,M,0,16,0,8.0,0.0
137,eca28495-500f-ab7e-356b-176f31382569,2017-03-03,white,nonhispanic,F,0,3,0,11.0,0.0


Unnamed: 0,PATIENT,BIRTHDATE,RACE,ETHNICITY,GENDER,DEATH_FN,AGE_FN_YEARS,BACTEREMIA,WELLNESS_CONT,AMBULATORY_SYMPTOM_CONT
0,bb37561b-ba65-7c47-db5b-0641bca883b4,2012-02-08,white,nonhispanic,M,0,3,0,9.0,1.0
1,678fc07c-1cb1-acc6-3553-d848200626e9,2006-02-11,white,nonhispanic,F,0,16,0,9.0,2.0
2,04efa71e-b8ed-980b-94a1-cd25e94b6015,2018-04-17,other,nonhispanic,F,0,4,0,13.0,2.0
3,1e466cbc-5018-4c1a-b132-cf8a4a4d87cf,1997-02-18,white,nonhispanic,F,0,21,1,4.0,10.0
4,8e9d1dd0-085f-a629-3ea6-9033f330b383,1996-03-07,white,nonhispanic,F,0,20,1,3.0,1.0
...,...,...,...,...,...,...,...,...,...,...
112,84c9e441-fa88-33c8-5d87-91b863998d26,1944-08-04,white,nonhispanic,F,1,2,1,9.0,1.0
113,7837ca92-1dc3-b3ef-a7f9-4207e439775c,1996-01-13,white,nonhispanic,M,0,17,0,15.0,2.0
114,3f3ad6c2-337b-9d48-47b7-b2197a0a0500,1969-07-18,white,nonhispanic,F,1,2,1,9.0,0.0
115,32d8410a-cc4b-4d8b-601d-3ad2b8ef912b,2006-08-23,white,nonhispanic,M,0,9,0,8.0,4.0


We saved these tables with the initial *features* in .csv format.

In [259]:
scenario01.to_csv('features_conditions_encounters_01.csv', index = False)
scenario02.to_csv('features_conditions_encounters_02.csv', index = False)
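
One point to keep in mind when these files are read back: CSV stores dates as text, so a column such as `BIRTHDATE` has to be converted to *datetime* again on load, as was done at the start of this notebook. A small sketch of the round trip, using a temporary file and a toy table (the file name and values are hypothetical):

```python
import os
import tempfile

import pandas as pd

# Toy table with a datetime column (made-up rows, for illustration only)
table = pd.DataFrame({
    'PATIENT': ['p1', 'p2'],
    'BIRTHDATE': pd.to_datetime(['2010-07-13', '2002-07-12']),
    'WELLNESS_CONT': [1.0, 6.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'features_example.csv')
    table.to_csv(path, index=False)
    # Without parse_dates the column is read back as text
    plain = pd.read_csv(path)
    # With parse_dates the column is converted to datetime while loading
    parsed = pd.read_csv(path, parse_dates=['BIRTHDATE'])

print(plain['BIRTHDATE'].dtype, parsed['BIRTHDATE'].dtype)
```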

Finally, we also saved the filtered condition tables containing only the FN data.

In [260]:
nf01.to_csv('nf01.csv', index = False)
nf02.to_csv('nf02.csv', index = False)