A family book is a table that lists the names of everyone in the city, their basic information, and their direct parent.

Assume that no two people share the same name, that each person has at most one parent listed in the family book, and that one parent can have multiple children.

You are a doctor who sees many patients every week. Your hospital keeps a record of past patients and their medical information. By referring to both the family book and the past records, you would like to profile your upcoming patients by looking up the existing medical record of their closest ancestor (parent, parent of parent, and so on), in case there is a possibility of a hereditary disease.


Example of family tree:
```
                A (has record)
                |
                B 
               / \ 
 (has record) C   D (has record)
             / \ 
            E   F
                 \ 
                  G (upcoming patient)
```
For upcoming patient G, even though A, C and D (ancestors/relatives) have medical records, we are only interested in the closest ancestor with a record, which is C.

Additionally, although F is G's direct parent, there are no past medical records for F. Therefore, we need to keep searching G's ancestry until we find the closest ancestor with a medical record, which is C.
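The search described above can be sketched in plain Python on the example tree. This is only an illustration: the dictionary `parent_of`, the set `has_record`, and the function name are assumptions made for this sketch, not part of the provided files.

```python
# Child -> direct parent, transcribed from the example tree above.
parent_of = {"B": "A", "C": "B", "D": "B", "E": "C", "F": "C", "G": "F"}

# People who have a past medical record in the example.
has_record = {"A", "C", "D"}


def closest_ancestor_with_record(person):
    """Walk up the parent chain and return the first ancestor with a record, or None."""
    seen = set()  # guards against accidental cycles in the data
    current = parent_of.get(person)
    while current is not None and current not in seen:
        if current in has_record:
            return current
        seen.add(current)
        current = parent_of.get(current)
    return None


print(closest_ancestor_with_record("G"))  # C
print(closest_ancestor_with_record("A"))  # None (A has no parent listed)
```

D also has a record but is never visited for G, because the walk only follows the parent chain G, F, C.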

**Download Files**

In [1]:
import requests

url = 'https://drive.google.com/uc?export=download&id=19YZ4Bsj5nKn2PlZ54DlfHyTyz8IiwnXT'
req = requests.get(url)
with open('./family_book.csv', 'wb') as f:
    f.write(req.content)

url = 'https://drive.google.com/uc?export=download&id=1-26M72xLOQM8J90YEw-yIAL10JYuFCpr'
req = requests.get(url)
with open('./past_medical_record.csv', 'wb') as f:
    f.write(req.content)

url = 'https://drive.google.com/uc?export=download&id=1--9lQVCZc1VOt589ods1QyTkbtYXbsJe'
req = requests.get(url)
with open('./upcoming_patients.csv', 'wb') as f:
    f.write(req.content)

In [2]:
import pandas as pd

In [3]:
family_book = pd.read_csv('./family_book.csv')

In [4]:
family_book.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   FirstName        10000 non-null  object
 1   LastName         10000 non-null  object
 2   ParentFirstName  6911 non-null   object
 3   ParentLastName   6911 non-null   object
dtypes: object(4)
memory usage: 312.6+ KB


In [5]:
family_book.head()

Unnamed: 0,FirstName,LastName,ParentFirstName,ParentLastName
0,Morgana,Chris,,
1,Andree,Alica,,
2,Rochelle,Peh,,
3,Tanitansy,Bergwall,,
4,Rosana,Blain,,


In [6]:
past_medical_record = pd.read_csv('./past_medical_record.csv')

In [7]:
past_medical_record.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   FirstName     5000 non-null   object
 1   LastName      5000 non-null   object
 2   Height        5000 non-null   int64 
 3   Weight        5000 non-null   int64 
 4   Occupation    5000 non-null   object
 5   Diabetic      5000 non-null   object
 6   HeartDisease  5000 non-null   object
 7   Smoking       5000 non-null   object
 8   DrinkAlcohol  5000 non-null   object
dtypes: int64(2), object(7)
memory usage: 351.7+ KB


In [8]:
past_medical_record.head()

Unnamed: 0,FirstName,LastName,Height,Weight,Occupation,Diabetic,HeartDisease,Smoking,DrinkAlcohol
0,Audre,Edbert,153,107,Coil Winder,Yes,No,Yes,No
1,Faye,Nora,151,60,Animator,No,No,No,Yes
2,Rozalie,Valonia,170,89,Tender,No,No,Stopped,Yes
3,Alvina,Camey,167,67,Food and Tobacco Roasting,No,No,Stopped,Yes
4,Madel,Edbert,160,61,Precious Stone and Metal Worker,No,Yes,Yes,Yes


In [9]:
upcoming_patients = pd.read_csv('./upcoming_patients.csv')

In [10]:
upcoming_patients.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   FirstName  5000 non-null   object
 1   LastName   5000 non-null   object
dtypes: object(2)
memory usage: 78.2+ KB


In [11]:
upcoming_patients.head()

Unnamed: 0,FirstName,LastName
0,Morgana,Chris
1,Andree,Alica
2,Tanitansy,Bergwall
3,Rosana,Blain
4,Rheba,Doersten


**Expected Output**

A table with the names of all upcoming patients, together with the name and medical record of each patient's closest ancestor that has a medical record (these columns are prefixed "Parent" in the table). Where no medical record is found anywhere in an upcoming patient's ancestry, the "Parent" columns are simply filled with "None".

Example:

| FirstName | LastName | ParentFirstName | ParentLastName | ParentHeight | ParentWeight | ParentOccupation | ParentDiabetic | ParentHeartDisease | ParentSmoking | ParentDrinkAlcohol |
|-----------|----------|-----------------|----------------|--------------|--------------|------------------|----------------|--------------------|---------------|--------------------|
| Justin    | Bieber   | Margaret        | Bieber         | 180          | 70           | Office Worker    | Yes            | Yes                | No            | Stopped            |
| Rebecca   | Black    | None            | None           | None         | None         | None             | None           | None               | None          | None               |
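As a rough sketch of how a table of this shape could be assembled, the snippet below rebuilds the two example rows from toy data. The `closest` mapping stands in for the result of the ancestor search and is hypothetical, as are the variable names; only the column names come from the exercise.

```python
import pandas as pd

# Toy stand-ins for the real tables (values copied from the example above).
patients = pd.DataFrame({"FirstName": ["Justin", "Rebecca"],
                         "LastName": ["Bieber", "Black"]})
records = pd.DataFrame({
    "FirstName": ["Margaret"], "LastName": ["Bieber"],
    "Height": [180], "Weight": [70], "Occupation": ["Office Worker"],
    "Diabetic": ["Yes"], "HeartDisease": ["Yes"],
    "Smoking": ["No"], "DrinkAlcohol": ["Stopped"],
})

# Hypothetical search result: patient -> closest ancestor with a record (or None).
closest = {("Justin", "Bieber"): ("Margaret", "Bieber"),
           ("Rebecca", "Black"): None}

# (FirstName, LastName) -> dict of the remaining record columns.
record_by_name = records.set_index(["FirstName", "LastName"]).to_dict("index")
record_columns = [c for c in records.columns if c not in ("FirstName", "LastName")]

rows = []
for first, last in zip(patients["FirstName"], patients["LastName"]):
    ancestor = closest[(first, last)]
    record = record_by_name.get(ancestor) if ancestor else None
    row = {
        "FirstName": first,
        "LastName": last,
        "ParentFirstName": ancestor[0] if ancestor else "None",
        "ParentLastName": ancestor[1] if ancestor else "None",
    }
    for col in record_columns:
        # Prefix each medical column with "Parent"; use "None" when nothing was found.
        row["Parent" + col] = record[col] if record else "None"
    rows.append(row)

expected = pd.DataFrame(rows)
print(expected)
```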



In [12]:
# Display all rows
upcoming_patients

Unnamed: 0,FirstName,LastName
0,Morgana,Chris
1,Andree,Alica
2,Tanitansy,Bergwall
3,Rosana,Blain
4,Rheba,Doersten
...,...,...
4995,Elysia,Beattie
4996,Yolande,Pool
4997,Cherish,Beauvais
4998,Mellisa,Giorgio


In [14]:
# Clone upcoming patients table
clone = upcoming_patients.copy()
clone

Unnamed: 0,FirstName,LastName
0,Morgana,Chris
1,Andree,Alica
2,Tanitansy,Bergwall
3,Rosana,Blain
4,Rheba,Doersten
...,...,...
4995,Elysia,Beattie
4996,Yolande,Pool
4997,Cherish,Beauvais
4998,Mellisa,Giorgio


In [20]:
# Find the first and last name of each row in "results" in the family book table
clone
        # Assign the matched row to temp_row

        # Append the matched result to the list



Unnamed: 0,self,other
0,Chris,Edbert
1,Alica,Nora
2,Bergwall,Valonia
3,Blain,Camey
4,Doersten,Edbert
...,...,...
4995,Beattie,Hilar
4996,Pool,Viviene
4997,Beauvais,Hardner
4998,Giorgio,Hilar


In [None]:
# Update the list and create new columns


Unnamed: 0,FirstName,LastName,ParentFirstName,ParentLastName
0,Morgana,Chris,,
1,Andree,Alica,,
2,Tanitansy,Bergwall,,
3,Rosana,Blain,,
4,Rheba,Doersten,,
5,Page,Tabshey,,
6,Elise,Kirsten,,
7,Kass,Naamann,,
8,Carrissa,Mada,,
9,Valencia,Dearman,,


In [None]:
# Find the rows in the past medical record table that match ParentFirstName and ParentLastName in the "results" table


        # Assign the matched row to temp_row

        # Append the matched result to the list

        # Append NaN to the list if there are no matching rows
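One possible way to do this matching without a row-by-row loop is a left merge. The sketch below uses made-up toy rows; only the column names and the names `results` and `past_medical_record` come from the notebook, and the approach is a suggestion rather than the method the outputs above were produced with.

```python
import pandas as pd

# Toy data; the people here are invented for illustration.
results = pd.DataFrame({
    "FirstName": ["Ann", "Bob"], "LastName": ["Lee", "Ray"],
    "ParentFirstName": ["Cara", None], "ParentLastName": ["Lee", None],
})
past_medical_record = pd.DataFrame({
    "FirstName": ["Cara"], "LastName": ["Lee"],
    "Height": [165], "Weight": [58],
})

# Prefix every record column with "Parent" so the join keys line up
# (FirstName -> ParentFirstName, Height -> ParentHeight, ...).
parent_records = past_medical_record.add_prefix("Parent")

# A left merge keeps every patient; parents without a record get NaN.
merged = results.merge(
    parent_records,
    on=["ParentFirstName", "ParentLastName"],
    how="left",
)
print(merged)
```

Because the merge is a left join on unique names, the row count stays equal to `len(results)`, which is the same check the notebook performs with `len(ParentHeight)`.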


In [None]:
# Check Length
len(ParentHeight)

5000

In [None]:
# Update the list and create new columns


Unnamed: 0,FirstName,LastName,ParentFirstName,ParentLastName,ParentHeight,ParentWeight,ParentOccupation,ParentDiabetic,ParentHeartDisease,ParentSmoking,ParentDrinkAlcohol
0,Morgana,Chris,,,,,,,,,
1,Andree,Alica,,,,,,,,,
2,Tanitansy,Bergwall,,,,,,,,,
3,Rosana,Blain,,,,,,,,,
4,Rheba,Doersten,,,,,,,,,
5,Page,Tabshey,,,,,,,,,
6,Elise,Kirsten,,,,,,,,,
7,Kass,Naamann,,,,,,,,,
8,Carrissa,Mada,,,,,,,,,
9,Valencia,Dearman,,,,,,,,,


In [None]:
# Run a loop to find the closest ancestor within 5 generations



# Loop for 5 generations


        # If ParentHeight is NaN, look one level up for the next closest ancestor

            # Find ParentFirstName in the family book

                # Found the matching details; assign them to temp_row

                # If this temp_row has a medical record, assign the data to the list

                    # Found the matching details; assign them to temp_row2

                    # Append the matched results to the list
                # Append the results to the list (parent found but no medical record)
          # Append the results to the list (top node, NaN in parents)
          # Append the results to the list (existing results remain unchanged)


results.head(150)   

Unnamed: 0,FirstName,LastName,ParentFirstName,ParentLastName,ParentHeight,ParentWeight,ParentOccupation,ParentDiabetic,ParentHeartDisease,ParentSmoking,ParentDrinkAlcohol
0,Morgana,Chris,,,,,,,,,
1,Andree,Alica,,,,,,,,,
2,Tanitansy,Bergwall,,,,,,,,,
3,Rosana,Blain,,,,,,,,,
4,Rheba,Doersten,,,,,,,,,
5,Page,Tabshey,,,,,,,,,
6,Elise,Kirsten,,,,,,,,,
7,Kass,Naamann,,,,,,,,,
8,Carrissa,Mada,,,,,,,,,
9,Valencia,Dearman,,,,,,,,,
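As an alternative to a fixed number of passes, the walk can continue until the top of the tree is reached, so that no ancestor is missed when a chain is longer than five generations. The function below is a minimal sketch under that assumption; its name and the toy tables are hypothetical, and only the column names follow the exercise files.

```python
import pandas as pd


def find_closest_ancestors(patients, family_book, records):
    """Return patients with the name of their closest ancestor that has a record."""
    # (first, last) -> (parent first, parent last); rows without a parent are skipped.
    known = family_book.dropna(subset=["ParentFirstName", "ParentLastName"])
    parent_of = dict(zip(
        zip(known["FirstName"], known["LastName"]),
        zip(known["ParentFirstName"], known["ParentLastName"]),
    ))
    with_record = set(zip(records["FirstName"], records["LastName"]))

    found = []
    for person in zip(patients["FirstName"], patients["LastName"]):
        seen = set()  # stops the walk if the data contains a cycle
        current = parent_of.get(person)
        while current is not None and current not in seen and current not in with_record:
            seen.add(current)
            current = parent_of.get(current)
        found.append(current if current in with_record else None)

    out = patients[["FirstName", "LastName"]].copy()
    out["ParentFirstName"] = [f[0] if f else None for f in found]
    out["ParentLastName"] = [f[1] if f else None for f in found]
    return out


# Toy version of the example tree; everyone shares the made-up last name "X".
family_book = pd.DataFrame({
    "FirstName": list("ABCDEFG"), "LastName": ["X"] * 7,
    "ParentFirstName": [None, "A", "B", "B", "C", "C", "F"],
    "ParentLastName": [None, "X", "X", "X", "X", "X", "X"],
})
records = pd.DataFrame({"FirstName": ["A", "C", "D"], "LastName": ["X"] * 3})
patients = pd.DataFrame({"FirstName": ["G", "A"], "LastName": ["X", "X"]})

out = find_closest_ancestors(patients, family_book, records)
print(out)
```

The medical columns can then be attached to `out` by joining on `ParentFirstName` and `ParentLastName`.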


In [None]:
# Export to CSV File
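A minimal sketch of the export step, assuming the final table is held in a DataFrame named `results`. The output file name is an assumption, and the toy rows are invented. `na_rep="None"` writes missing values as the text "None", matching the expected output described earlier.

```python
import pandas as pd

# Toy final table; in the notebook this would be the full results DataFrame.
results = pd.DataFrame({
    "FirstName": ["Ann", "Bob"],
    "LastName": ["Lee", "Ray"],
    "ParentFirstName": ["Cara", None],
    "ParentHeight": [165, None],
})

# index=False drops the row index; na_rep controls how missing values are written.
results.to_csv("closest_ancestor_records.csv", index=False, na_rep="None")
```

When reading the file back with `pd.read_csv`, pass `keep_default_na=False` if the literal "None" strings should be kept rather than parsed as missing values.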
