---

_You are currently looking at **version 1.1** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-text-mining/resources/d9pwm) course resource._

---

# Assignment 1

In this assignment, you'll be working with messy medical data and using regex to extract relevant information from the data.

Each line of the `dates.txt` file corresponds to a medical note. Each note has a date that needs to be extracted, but each date is encoded in one of many formats.

The goal of this assignment is to correctly identify all of the different date variants encoded in this dataset and to properly normalize and sort the dates. 

Here is a list of some of the variants you might encounter in this dataset:
* 04/20/2009; 04/20/09; 4/20/09; 4/3/09
* Mar-20-2009; Mar 20, 2009; March 20, 2009;  Mar. 20, 2009; Mar 20 2009;
* 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
* Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
* Feb 2009; Sep 2009; Oct 2010
* 6/2008; 12/2009
* 2009; 2010
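
As a rough illustration of how the variants listed above might be covered, the sketch below joins a few simplified regular expressions and checks them against the listed examples. The names `MONTH`, `PATTERNS`, `DATE_RE`, and `variants` are made up for this illustration, and the patterns are assumptions that are stricter than the raw notes may require.

```python
import re

# Month name or abbreviation, optionally followed by a period (e.g. "Mar.").
MONTH = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?'

# Ordered from most specific to most general.
PATTERNS = [
    r'\d{1,2}[/-]\d{1,2}[/-]\d{2,4}',                   # 04/20/2009, 4/3/09
    MONTH + r'[- ]\d{1,2}(?:st|nd|rd|th)?,?[- ]\d{4}',  # Mar-20-2009, Mar 20th, 2009
    r'\d{1,2} ' + MONTH + r',? \d{4}',                  # 20 Mar. 2009, 20 March, 2009
    MONTH + r' \d{4}',                                  # Feb 2009
    r'\d{1,2}/\d{4}',                                   # 6/2008
    r'\d{4}',                                           # 2009
]
DATE_RE = re.compile('|'.join(PATTERNS))

variants = [
    '04/20/2009', '04/20/09', '4/20/09', '4/3/09',
    'Mar-20-2009', 'Mar 20, 2009', 'March 20, 2009', 'Mar. 20, 2009', 'Mar 20 2009',
    '20 Mar 2009', '20 March 2009', '20 Mar. 2009', '20 March, 2009',
    'Mar 20th, 2009', 'Mar 21st, 2009', 'Mar 22nd, 2009',
    'Feb 2009', 'Sep 2009', 'Oct 2010',
    '6/2008', '12/2009',
    '2009', '2010',
]

# Every listed variant should be matched in full by one of the patterns.
unmatched = [v for v in variants if DATE_RE.fullmatch(v) is None]
print(unmatched)
```

Real notes contain typos and surrounding text, so patterns like these would still need loosening and testing against the actual file.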

Once you have extracted these date patterns from the text, the next step is to sort them in ascending chronological order according to the following rules:
* Assume all dates in xx/xx/xx format are mm/dd/yy
* Assume all dates where year is encoded in only two digits are years from the 1900's (e.g. 1/5/89 is January 5th, 1989)
* If the day is missing (e.g. 9/2009), assume it is the first day of the month (e.g. September 1, 2009).
* If the month is missing (e.g. 2010), assume it is the first of January of that year (e.g. January 1, 2010).
* Watch out for potential typos as this is a raw, real-life derived dataset.
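
The defaulting rules above can be sketched for the all-numeric formats using only the standard library. The helper name `normalize_numeric_date` is hypothetical, and the sketch assumes its input is already a clean `mm/dd/yy`, `mm/yyyy`, or `yyyy` string.

```python
import re
from datetime import date

def normalize_numeric_date(text):
    """Apply the assignment's defaults to an all-numeric date string."""
    parts = [int(p) for p in re.split(r'[/-]', text)]
    if len(parts) == 3:
        # xx/xx/xx is read as month/day/year.
        month, day, year = parts
    elif len(parts) == 2:
        # Missing day: assume the first of the month.
        month, year = parts
        day = 1
    else:
        # Missing month and day: assume January 1.
        (year,) = parts
        month, day = 1, 1
    if year < 100:
        # Two-digit years are taken to be in the 1900s.
        year += 1900
    return date(year, month, day)

print(normalize_numeric_date('1/5/89'))   # 1989-01-05
print(normalize_numeric_date('9/2009'))   # 2009-09-01
print(normalize_numeric_date('2010'))     # 2010-01-01
```

Dates that spell out the month would need a separate lookup from month name to number, which this sketch leaves out.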

With these rules in mind, find the correct date in each note and return a pandas Series in chronological order of the original Series' indices.

For example if the original series was this:

    0    1999
    1    2010
    2    1978
    3    2015
    4    1985

Your function should return this:

    0    2
    1    4
    2    0
    3    1
    4    3
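
The expected output is the list of original positions arranged by ascending date. A minimal plain-Python sketch of that idea, using the example values above (the variable names are illustrative only):

```python
years = [1999, 2010, 1978, 2015, 1985]

# Sort the positions 0..n-1 by the value stored at each position.
order = sorted(range(len(years)), key=lambda i: years[i])
print(order)  # [2, 4, 0, 1, 3]
```

With a pandas Series, the same ordering can be read from the index after sorting the values.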

Your score will be calculated using [Kendall's tau](https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient), a correlation measure for ordinal data.
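
To give a feel for the measure, here is a small sketch of Kendall's tau in its simplest form (tau-a): the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs. The assignment does not say which variant the grader uses or how it handles ties, so the helper `kendall_tau_a` is only an illustrative assumption.

```python
from itertools import combinations

def kendall_tau_a(a, b):
    """Kendall's tau-a for two equal-length rankings (ties count as neither)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        # A pair is concordant when both rankings order it the same way.
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs

print(kendall_tau_a([1, 2, 3], [1, 2, 3]))  # 1.0
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))  # -1.0
```

A score near 1 means the submitted order agrees with the reference order on almost every pair, so a few misplaced dates lower the score only slightly.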

*This function should return a Series of length 500 and dtype int.*

In [255]:
import pandas as pd
import re

doc = []
with open('dates.txt') as file:
    for line in file:
        doc.append(line)

df = pd.Series(doc)
df.head(5)

0         03/25/93 Total time of visit (in minutes):\n
1                       6/18/85 Primary Care Doctor:\n
2    sshe plans to move as of 7/8/71 In-Home Servic...
3                7 on 9/27/75 Audit C Score Current:\n
4    2/6/96 sleep studyPain Treatment Pain Level (N...
dtype: object

In [324]:
def date_sorter():
    # Reads the module-level `df` Series built in the cell above.

    # DONE 04/20/2009; 04/20/09; 4/20/09; 4/3/09
    regex1 = r'(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})'
    # DONE Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
    # DONE Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
    regex2 = r'((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[- ,.] ?\d?\d(?:st|nd|rd|th)?[- ,.] ?\d{2,4})'
    # DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
    regex3 = r'((?:(?:\d?\d(?:st|nd|rd|th)?)[- ,.]? ?)(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
    # DONE Feb 2009; Sep 2009; Oct 2010
    regex4 = r'((?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
    # DONE 6/2008; 12/2009
    regex5 = r'((?:\d{1,2})[/](?:\d{4}))'
    # DONE 1998
    regex6 = r'(([12]\d{3}))'

    # At each position the alternation tries the patterns left to right,
    # so the more specific formats are listed before the more general ones.
    complete = f'({regex1}|{regex2}|{regex3}|{regex4}|{regex5}|{regex6})'

    # Column 0 is the outermost group, i.e. the whole matched date string.
    parsed_dates = df.str.extractall(complete).loc[:, 0]

    # Strip whitespace, periods and commas, and any stray "- " separator.
    parsed_dates = (parsed_dates.str.strip()
                    .str.replace(r'[.,]', '', regex=True)
                    .str.replace(r'- ', '', regex=True))

    # Correct misspelled month names found in the data.
    typo_fixes = {'Decemeber 1978': 'December 1978', 'Janaury 1993': 'January 1993'}
    parsed_dates = parsed_dates.replace(typo_fixes)

    # Note 321 reads '2June, 1999'; set its date by hand.
    parsed_dates.loc[321] = 'June 1999'

    # Keep the first match of each note (match level 0) and parse it.
    finale = pd.to_datetime(parsed_dates.loc[:, 0])

    return finale.sort_values().reset_index()['index']

In [325]:
date_sorter()

0        9
1       84
2        2
3       53
4       28
5      474
6      153
7       13
8      129
9       98
10     111
11     225
12      31
13     171
14     191
15     486
16     335
17     415
18      36
19     405
20     323
21     422
22     375
23     380
24     345
25      57
26     481
27     436
28     104
29     299
      ... 
470    220
471    208
472    243
473    139
474    320
475    383
476    244
477    286
478    480
479    431
480    279
481    198
482    381
483    463
484    366
485    439
486    255
487    401
488    475
489    257
490    152
491    235
492    464
493    253
494    427
495    231
496    141
497    186
498    161
499    413
Name: index, Length: 500, dtype: int64

In [317]:
parsed_dates.shape

(501,)

In [323]:
finale.shape

(500,)

In [3]:
# DONE 04/20/2009; 04/20/09; 4/20/09; 4/3/09
# These dates are mm/dd/yy, so the first group is the month and the second is the day.
df.str.extractall(r'(?P<date>(?P<month>\d?\d)[/-](?P<day>\d?\d)[/-](?P<year>\d{2,4}))')

             date month day year
  match                         
0 0      03/25/93    03  25   93
1 0       6/18/85     6  18   85
2 0        7/8/71     7   8   71
3 0       9/27/75     9  27   75
4 0        2/6/96     2   6   96
5 0       7/06/79     7  06   79
6 0       5/18/78     5  18   78
7 0      10/24/89    10  24   89
8 0        3/7/86     3   7   86
9 0       4/10/71     4  10   71


In [191]:
len(res)

34

In [4]:
# DONE Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
# 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
# DONE Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
#Feb 2009; Sep 2009; Oct 2010
pattern0 = '(?P<date>(?P<month>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?P<day>\d?\d(?:st|nd|rd|th)?)[- ,.] ?(?P<year>\d{0,3}\d))'
df.str.extractall(pattern0, re.I)

                         date      month day  year
    match                                         
194 0          April 11, 1990      April  11  1990
195 0            May 30, 2001        May  30  2001
196 0            Feb 18, 1994        Feb  18  1994
197 0       February 18, 1981   February  18  1981
198 0       October. 11, 2013    October  11  2013
199 0             Jan 24 1986        Jan  24  1986
200 0           July 26, 1978       July  26  1978
201 0       December 23, 1999   December  23  1999
202 0            May 15, 1989        May  15  1989
203 0      September 06, 1995  September   6  1995


In [185]:
df.iloc[354]

'Deviated septum, 3/1993 Activities of Daily Living (ADL) Bathing: Independent\n'

In [5]:
# DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
# DONE Feb 2009; Sep 2009; Oct 2010
pattern1 = '(?P<date>(?:(?P<day>\d?\d(?:st|nd|rd|th)?)?[- ,.]? ?)?(?P<month>(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?P<year>\d{0,3}\d))'
df.str.extractall(pattern1)

                   date day month  year
    match                              
125 0       24 Jan 2001  24   Jan  2001
126 0       10 Sep 2004  10   Sep  2004
127 0       26 May 1982  26   May  1982
128 0      28 June 2002  28  June  2002
129 0       06 May 1972  06   May  1972
130 0       25 Oct 1987  25   Oct  1987
131 0       14 Oct 1996  14   Oct  1996
132 0       30 Nov 2007  30   Nov  2007
133 0      28 June 1994  28  June  1994
134 0       14 Jan 1981  14   Jan  1981


In [6]:
# 6/2008; 12/2009
# 2009; 2010
# The number before the slash is the month, so the group is named accordingly.
df.str.extractall(r'(?P<date>(?:^| )(?P<month>\d{1,2})[/](?P<year>\d{4}))')

              date month  year
    match                     
343 0       6/1998     6  1998
344 0       6/2005     6  2005
345 0      10/1973    10  1973
346 0       9/2005     9  2005
347 0      03/1980    03  1980
348 0      12/2005    12  2005
349 0       5/1987     5  1987
350 0       5/2004     5  2004
351 0       8/1974     8  1974
352 0       3/1986     3  1986


In [136]:
df.iloc[195]

'MRI May 30, 2001 empty sella but no problems with endocrine functionPertinent Medical Review of Systems Constitutional:\n'

In [88]:
df.str.extractall(r'(?P<date>(?:^| )(?P<year>[12]\d{3})(?:\.| |,|$))')

            date  year
    match             
125 0      2001.  2001
126 0       2004  2004
127 0       1982  1982
128 0       2002  2002
129 0       1972  1972
130 0       1987  1987
131 0       1996  1996
132 0       2007  2007
133 0       1994  1994
134 0       1981  1981


In [295]:
# DONE 04/20/2009; 04/20/09; 4/20/09; 4/3/09
regex1 = '(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})'
# DONE Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
# DONE Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
regex2 = '((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[- ,.] ?\d?\d(?:st|nd|rd|th)?[- ,.] ?\d{4})'
# DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
regex3 = '((?:(?:\d?\d(?:st|nd|rd|th)?)[- ,.]? ?)(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
# DONE Feb 2009; Sep 2009; Oct 2010
regex4 = '((?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
# DONE 6/2008; 12/2009
regex5 = '((?:^| )(?:\d{1,2})[/](?:\d{4}))'
# DONE 1998
regex6 = r'((?:^| )([12]\d{3})(?:\.| |,|$))'

complete = f'({regex1}|{regex2}|{regex3}|{regex4}|{regex5}|{regex6})'

parsed_dates = df.str.extractall(complete)

dict = {'Decemeber':'December', 'Janaury':'January'}

parsed_dates.iloc[:, 0].replace(dict, inplace=True)

parsed_dates = parsed_dates.loc[:, 0].str.strip().str.replace(r'[\.,]', '').str.replace(r'(- )', '')

parsed_dates = parsed_dates.replace('Decemeber 1978', 'December 1978').replace('Janaury 1993', 'January 1993').replace('2June, 1999', 'June, 1999').replace(r'[.]', '')

parsed_dates[321] = 'June 1999'

finale = pd.to_datetime(parsed_dates.loc[:, 0])



In [303]:
finale.reset_index()[(finale.reset_index()['index'] -finale.reset_index()['index'] -1) != -1]

Empty DataFrame
Columns: [index, 0]
Index: []


In [231]:
parsed_dates = parsed_dates.replace('Decemeber 1978', 'December 1978').replace('Janaury 1993', 'January 1993').replace('2June, 1999', 'June, 1999').replace(r'[.]', '')


In [279]:
# DONE 04/20/2009; 04/20/09; 4/20/09; 4/3/09
regex10 = '(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})'
# DONE Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
# DONE Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
regex20 = '((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[- ,.] ?\d?\d(?:st|nd|rd|th)?[- ,.] ?\d{4})'
# DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
regex30 = '((?:(?:\d?\d(?:st|nd|rd|th)?)[- ,.]? ?)(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
# DONE Feb 2009; Sep 2009; Oct 2010
regex40 = '((?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
# DONE 6/2008; 12/2009
regex50 = '((?:^| )(?:\d{1,2})[/](?:\d{4}))'
# DONE 1998
regex6 = r'((?:^| )([12]\d{3})(?:\.| |,|$))'

In [287]:
df.str.extractall(regex4).shape, df.str.extractall(regex40).shape

((218, 1), (218, 1))

In [281]:

complete = f'({regex30}|{regex40})'

df.str.extractall(complete).shape

(218, 3)

In [278]:
parsed_dates[194]

match
0    April 11
1        1990
Name: 0, dtype: object
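
One way a single date can end up split like this is when a more general pattern (month plus number) is tried before a more specific one (month, day, and year). The sketch below reproduces that effect with the standard `re` module. The patterns are simplified stand-ins rather than the exact expressions used in this notebook, and the variable names are made up for the illustration.

```python
import re

MONTH = r'(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*'
month_day_year = MONTH + r'[- ,.] ?\d{1,2}(?:st|nd|rd|th)?[- ,.] ?\d{4}'
month_year = MONTH + r'[- ,.] ?\d{1,4}'

text = 'April 11, 1990 CPT Code: 90791: No medical services'

# General pattern first: it succeeds early and swallows only "April 11".
general_first = re.compile(f'{month_year}|{month_day_year}')
# Specific pattern first: the full date is captured.
specific_first = re.compile(f'{month_day_year}|{month_year}')

print(general_first.search(text).group())   # April 11
print(specific_first.search(text).group())  # April 11, 1990
```

Listing alternatives from most to least specific is one way to avoid the truncated match.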

In [228]:
df[321]

'2June, 1999 Audit C Score Current:\n'

In [232]:
parsed_dates[321]

match
0    2 June 1999
Name: 0, dtype: object

In [192]:
parsed_dates

     match
0    0          03/25/93
1    0           6/18/85
2    0            7/8/71
3    0           9/27/75
4    0            2/6/96
5    0           7/06/79
6    0           5/18/78
7    0          10/24/89
8    0            3/7/86
9    0           4/10/71
10   0           5/11/85
11   0           4/09/75
12   0           8/01/98
13   0           1/26/72
14   0         5/24/1990
15   0         1/25/2011
16   0           4/12/82
17   0        10/13/1976
18   0           4/24/98
19   0           5/21/77
20   0           7/21/98
21   0          10/21/79
22   0           3/03/90
23   0           2/11/76
24   0        07/25/1984
25   0           4-13-82
26   0           9/22/89
27   0           9/02/76
28   0           9/12/71
29   0          10/24/86
                 ...    
460  0              2012
461  0              1991
463  0              2014
464  0              2016
465  0              1976
467  0              2011
468  0              1997
469  0              2003
471  0        

In [240]:
parsed_dates.to_csv('test.csv')

In [234]:
pd.to_datetime(parsed_dates.iloc[460:, ])

460   1994-01-01
461   2004-12-01
462   2003-03-01
463   1991-07-01
464   1982-07-01
465   1984-01-01
466   2000-01-01
467   2001-01-01
468   1982-01-01
469   1998-01-01
470   2012-01-01
471   1991-01-01
472   2014-01-01
473   2016-01-01
474   1976-01-01
475   2011-01-01
476   1997-01-01
477   2003-01-01
478   1999-01-01
479   1972-01-01
480   2015-01-01
481   1989-01-01
482   1994-01-01
483   1996-01-01
484   2013-01-01
485   1995-01-01
486   2004-01-01
487   1987-01-01
488   1973-01-01
489   1992-01-01
490   1977-01-01
491   1985-01-01
492   2007-01-01
493   2009-01-01
494   1986-01-01
495   2002-01-01
496   1979-01-01
497   2008-01-01
498   2005-01-01
499   1980-01-01
Name: 0, dtype: datetime64[ns]

In [288]:
# DONE 04/20/2009; 04/20/09; 4/20/09; 4/3/09 # (\.|,|\(|\)|$| )
regex1 = "(\d{1,2}[/-](\d{1,2})[/-](\d{2}(\d{2})?))"
ref1 = df.str.extractall(regex1)
ref10 = ref1[ref1.iloc[:, 1].astype('int') <= 31]
dt1 = pd.to_datetime(ref10.loc[:, 0])

In [289]:
# DONE Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
# DONE Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
regex3 = '((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*[- ,.] ?\d?\d(?:st|nd|rd|th)?[- ,.] ?\d{2,4})'
ref3 = df.str.extractall(regex3)
dt3 = pd.to_datetime(ref3.iloc[:, 0])



In [188]:
df[72] = df[72].replace('4.9/36/308', '')

In [180]:
parsed_dates.drop([72][0]).loc[69:75, ]

    match
69  0        11/3/1985
70  0          7/04/82
71  0          4-13-89
73  0          4/12/74
74  0         09/19/81
75  0           9/6/79
Name: 0, dtype: object

In [290]:
# DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
# DONE Feb 2009; Sep 2009; Oct 2010
#regex20 = '((?:(?:\d?\d(?:st|nd|rd|th)?)?[- ,.]? ?)?(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
regex2 = '(?P<date>(?:(?:\d?\d(?:st|nd|rd|th)?)[- ,.]? ?)?(?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)(?:[\. ,])? ?(?:\d{2,4}))'

ref2 = df.str.extractall(regex2)
ref2 = ref2.replace('Decemeber 1978', 'December 1978').replace('Janaury 1993', 'January 1993').replace('2June, 1999', 'June, 1999').replace(r'[.]', '')
#ref2.drop([321], inplace=True)
ref2 = ref2.apply(lambda x: x.str.replace(r'[,.]', ''))
#dt2 = pd.to_datetime(ref2.loc[:, 'date'])
#

In [266]:
ref2

                   date
    match              
125 0       24 Jan 2001
126 0       10 Sep 2004
127 0       26 May 1982
128 0      28 June 2002
129 0       06 May 1972
130 0       25 Oct 1987
131 0       14 Oct 1996
132 0       30 Nov 2007
133 0      28 June 1994
134 0       14 Jan 1981


In [145]:
df.loc[192]

'06 May 1993 CPT Code: 90792: With medical services\n'

In [148]:
df.loc[193]

's 22 year old single Caucasian/Latino woman, unemployed Cook recent college graduate, living along with pet rabbit, with long history of depression with SA but never hospitalized, as well as obesity, hypothyroidism, and PCO, referred by new NMH PCP Dr. Evelyn Julian for urgent evaluation and treatment till first visit with colleague, Dr.  Inez Burns, on 18 Jan 1995.\n'

In [147]:
df.loc[194]

'April 11, 1990 CPT Code: 90791: No medical services\n'

In [155]:
df.loc[199]

'.Came back to US on Jan 24 1986, saw Dr. Quackenbush at Beaufort Memorial Hospital.  Checked VPA level and found it to be therapeutic and confirmed BPAD dx.  Also, has a general physician exam and found to be in good general health, except for being slightly overwt.\n'

In [291]:
# DONE 6/2008; 12/2009
regex4 = '((?:^| )(?:\d{1,2})[/](?:\d{4}))'
ref4 = df.str.extractall(regex4)
dt4 = pd.to_datetime(ref4.loc[:, 0])

In [292]:
# DONE 1998
regex5 = r'((?:^| )([12]\d{3})(?:\.| |,|$))'
ref5 = df.str.extractall(regex5)
dt5 = pd.to_datetime(ref5.loc[455:, 0])

In [293]:
frames = [dt1, dt2, dt3, dt4, dt5]

In [107]:
ttg = pd.concat(frames).reset_index()

In [113]:
ttg[(ttg['level_0'] - ttg['level_0'] -1) == -1]

   level_0  match          0
0        0      0 1993-03-25
1        1      0 1985-06-18
2        2      0 1971-07-08
3        3      0 1975-09-27
4        4      0 1996-02-06
5        5      0 1979-07-06
6        6      0 1978-05-18
7        7      0 1989-10-24
8        8      0 1986-03-07
9        9      0 1971-04-10


In [4]:
# DONE 20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
# DONE Feb 2009; Sep 2009; Oct 2010
regex4 = '((?:(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*)[- ,.] ?(?:\d{0,3}\d))'
ref4 = df.str.extractall(regex4)
ref4 = ref4.replace('Decemeber 1978', 'December 1978').replace('Janaury 1993', 'January 1993')

In [97]:
pd.DataFrame(dt2).reset_index()['level_0'][(pd.DataFrame(dt2).reset_index()['level_0'] - pd.DataFrame(dt2).reset_index()['level_0'] - 1) == -1]

0      125
1      126
2      127
3      128
4      129
5      130
6      131
7      132
8      133
9      134
10     135
11     136
12     137
13     138
14     139
15     140
16     141
17     142
18     143
19     144
20     145
21     146
22     147
23     148
24     149
25     150
26     151
27     152
28     153
29     154
      ... 
154    313
155    314
156    315
157    316
158    317
159    318
160    319
161    320
162    321
163    322
164    323
165    324
166    325
167    326
168    327
169    328
170    329
171    330
172    331
173    332
174    333
175    334
176    335
177    336
178    337
179    338
180    339
181    340
182    341
183    342
Name: level_0, Length: 184, dtype: int64

In [294]:
leg = 0
for i in frames:
    leg += len(i)
print(leg)

485


In [93]:
len(dt1)

125

In [127]:
df.str.extractall(regex20).shape, df.str.extractall(regex2).shape

((218, 1), (218, 1))

193    s 22 year old single Caucasian/Latino woman, u...
194    April 11, 1990 CPT Code: 90791: No medical ser...
195    MRI May 30, 2001 empty sella but no problems w...
196    .Feb 18, 1994: made a phone call to Mom and Mo...
197    Brother died February 18, 1981 Parental/Caregi...
198    none; but currently has appt with new HJH PCP ...
199    .Came back to US on Jan 24 1986, saw Dr. Quack...
200    July 26, 1978 Total time of visit (in minutes):\n
201    father was depressed inpatient at DFC December...
202                   May 15, 1989 SOS-10 Total Score:\n
203    September 06, 1995 Total time of visit (in min...
204    Mar. 10, 1976 CPT Code: 90791: No medical serv...
205                    .Got back to U.S. Jan 27, 1983.\n
206    Queen Hamilton in Bonita Springs courthouse.  ...
207    r August 12 2004 - diagnosed with Parkinson's ...
208                            September 01, 2012 Age:\n
209    July 25, 1983 Total time of visit (in minutes):\n
210    August 11, 1989 Total ti

In [134]:
df

0           03/25/93 Total time of visit (in minutes):\n
1                         6/18/85 Primary Care Doctor:\n
2      sshe plans to move as of 7/8/71 In-Home Servic...
3                  7 on 9/27/75 Audit C Score Current:\n
4      2/6/96 sleep studyPain Treatment Pain Level (N...
5                      .Per 7/06/79 Movement D/O note:\n
6      4, 5/18/78 Patient's thoughts about current su...
7      10/24/89 CPT Code: 90801 - Psychiatric Diagnos...
8                           3/7/86 SOS-10 Total Score:\n
9               (4/10/71)Score-1Audit C Score Current:\n
10     (5/11/85) Crt-1.96, BUN-26; AST/ALT-16/22; WBC...
11                         4/09/75 SOS-10 Total Score:\n
12     8/01/98 Communication with referring physician...
13     1/26/72 Communication with referring physician...
14     5/24/1990 CPT Code: 90792: With medical servic...
15     1/25/2011 CPT Code: 90792: With medical servic...
16           4/12/82 Total time of visit (in minutes):\n
17          1; 10/13/1976 Audit