# Question 1

Suppose we have some input data describing a graph of relationships between parents and children over multiple generations. The data is formatted as a list of (parent, child) pairs, where each individual is assigned a unique integer identifier.

For example, in this diagram, 3 is a child of 1 and 2, and 5 is a child of 4:

```
1   2    4
 \ /   / | \
  3   5  8  9
   \ / \     \
    6   7    11
```

Sample input/output (pseudodata):

```
parent_child_pairs = [
    (1, 3), (2, 3), (3, 6), (5, 6),
    (5, 7), (4, 5), (4, 8), (4, 9), (9, 11)
]
```

Write a function that takes this data as input and returns two collections: one containing all individuals with zero known parents, and one containing all individuals with exactly one known parent.


Output may be in any order:

```
find_nodes_with_zero_and_one_parents(parent_child_pairs) => [
  [1, 2, 4],       // Individuals with zero parents
  [5, 7, 8, 9, 11] // Individuals with exactly one parent
]
```

n: number of pairs in the input


In [30]:
# Solve q1
def find_nodes_with_zero_and_one(pairs):

    # Map each individual to the list of their known parents
    count_dict = {}
    for p, c in pairs:
        # Record p as a parent of c (appends in place, no list copy)
        count_dict.setdefault(c, []).append(p)
        # Make sure p has an entry even if it never appears as a child
        count_dict.setdefault(p, [])

    zero_nodes = [node for node, parents in count_dict.items() if len(parents) == 0]
    one_nodes = [node for node, parents in count_dict.items() if len(parents) == 1]

    return zero_nodes, one_nodes

In [33]:
parent_child_pairs = [
    (1, 3), (2, 3), (3, 6), (5, 6),
    (5, 7), (4, 5), (4, 8), (4, 9), (9, 11)
]

%timeit find_nodes_with_zero_and_one(parent_child_pairs)



4.56 µs ± 155 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
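
As a cross-check on the solution above, the same result can be sketched by counting how often each individual appears as a child. This is an alternative formulation, not the approach used in the cell above, and the name `zero_and_one_parents_counter` is hypothetical. The results are sorted only to make the output easy to compare with the expected sample.

```python
from collections import Counter

def zero_and_one_parents_counter(pairs):
    # Hypothetical alternative: count how often each individual appears as a child
    parent_counts = Counter(child for _, child in pairs)
    # Everyone mentioned anywhere in the input, as parent or as child
    individuals = {person for pair in pairs for person in pair}

    # Counter returns 0 for individuals that never appear as a child
    zero = sorted(p for p in individuals if parent_counts[p] == 0)
    one = sorted(p for p in individuals if parent_counts[p] == 1)
    return zero, one

parent_child_pairs = [
    (1, 3), (2, 3), (3, 6), (5, 6),
    (5, 7), (4, 5), (4, 8), (4, 9), (9, 11)
]

print(zero_and_one_parents_counter(parent_child_pairs))
# ([1, 2, 4], [5, 7, 8, 9, 11])
```

Counting and collecting individuals each take one pass over the n pairs; the sorting step is optional, since the problem allows any order.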


# Question 2

Suppose we have some input data describing a graph of relationships between parents and children over multiple generations. The data is formatted as a list of (parent, child) pairs, where each individual is assigned a unique integer identifier.

For example, in this diagram, 6 and 8 share the common ancestors 4 and 14.

```
         14  13
         |   |
1   2    4   12
 \ /   / | \ /
  3   5  8  9
   \ / \     \
    6   7     11
```

```
parent_child_pairs_1 = [
    (1, 3), (2, 3), (3, 6), (5, 6), (5, 7), (4, 5),
    (4, 8), (4, 9), (9, 11), (14, 4), (13, 12), (12, 9)
]
```

Write a function that takes the graph, as well as two of the individuals in our dataset, as its inputs and returns true if and only if they share at least one ancestor.

Sample input and output:

```
has_common_ancestor(parent_child_pairs_1, 3, 8) => false
has_common_ancestor(parent_child_pairs_1, 5, 8) => true
has_common_ancestor(parent_child_pairs_1, 6, 8) => true
has_common_ancestor(parent_child_pairs_1, 6, 9) => true
has_common_ancestor(parent_child_pairs_1, 1, 3) => false
has_common_ancestor(parent_child_pairs_1, 3, 1) => false
has_common_ancestor(parent_child_pairs_1, 7, 11) => true
has_common_ancestor(parent_child_pairs_1, 6, 5) => true
has_common_ancestor(parent_child_pairs_1, 5, 6) => true
```

Additional example: In this diagram, 4 and 12 have a common ancestor of 11.

```
        11
       /  \
      10   12
     /  \
1   2    5
 \ /    / \
  3    6   7
   \        \
    4        8
```

```
parent_child_pairs_2 = [
    (11, 10), (11, 12), (2, 3), (10, 2), (10, 5), 
    (1, 3), (3, 4), (5, 6), (5, 7), (7, 8),
]
```

```
has_common_ancestor(parent_child_pairs_2, 4, 12) => true
has_common_ancestor(parent_child_pairs_2, 1, 6) => false
has_common_ancestor(parent_child_pairs_2, 1, 12) => false
```

n: number of pairs in the input


In [56]:
def create_ancestory_set(cp_dict, start_node):
    # Collect every ancestor of start_node by following child -> parent links
    ancestors = set()
    stack = [start_node]

    while stack:
        node = stack.pop()
        # .get() tolerates individuals that do not appear in the data
        for parent in cp_dict.get(node, []):
            if parent not in ancestors:
                ancestors.add(parent)
                stack.append(parent)

    return ancestors

def has_common_ancestor(pairs, a, b):
    # C - P dictionary: maps each individual to the list of their known parents
    cp_dict = {}
    for p, c in pairs:
        cp_dict.setdefault(c, []).append(p)
        cp_dict.setdefault(p, [])

    # a and b share an ancestor if and only if their ancestor sets overlap
    return not create_ancestory_set(cp_dict, a).isdisjoint(create_ancestory_set(cp_dict, b))

In [58]:
parent_child_pairs_1 = [
    (1, 3), (2, 3), (3, 6), (5, 6), (5, 7), (4, 5),
    (4, 8), (4, 9), (9, 11), (14, 4), (13, 12), (12, 9)
]

has_common_ancestor(parent_child_pairs_1, 3, 8)

False
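
The cell above checks only one of the samples. A self-contained sketch that runs every sample from the problem statement could look like the following. The names `ancestors_of` and `share_ancestor` are hypothetical, and the implementation is a compact restatement for checking purposes rather than the code from the cells above.

```python
def ancestors_of(parents, start):
    # Follow child -> parent links and collect everyone reachable from start
    found = set()
    stack = [start]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def share_ancestor(pairs, a, b):
    # Map each child to the list of its known parents
    parents = {}
    for p, c in pairs:
        parents.setdefault(c, []).append(p)
    return not ancestors_of(parents, a).isdisjoint(ancestors_of(parents, b))

pairs_1 = [
    (1, 3), (2, 3), (3, 6), (5, 6), (5, 7), (4, 5),
    (4, 8), (4, 9), (9, 11), (14, 4), (13, 12), (12, 9)
]
pairs_2 = [
    (11, 10), (11, 12), (2, 3), (10, 2), (10, 5),
    (1, 3), (3, 4), (5, 6), (5, 7), (7, 8),
]

# Expected results copied from the problem statement
expected_1 = {
    (3, 8): False, (5, 8): True, (6, 8): True, (6, 9): True, (1, 3): False,
    (3, 1): False, (7, 11): True, (6, 5): True, (5, 6): True,
}
expected_2 = {(4, 12): True, (1, 6): False, (1, 12): False}

for pairs, expected in ((pairs_1, expected_1), (pairs_2, expected_2)):
    for (a, b), want in expected.items():
        assert share_ancestor(pairs, a, b) == want, (a, b)
print("all samples pass")
```

Note that an individual is not counted as their own ancestor here, which is why the pair (1, 3) yields false even though 1 is a parent of 3.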

# Question 3


Suppose we have some input data describing a graph of relationships between parents and children over multiple generations. The data is formatted as a list of (parent, child) pairs, where each individual is assigned a unique integer identifier.

For example, in this diagram, the earliest ancestor of 6 is 14, and the earliest ancestor of 15 is 2. 

```
         14
         |
  2      4
  |    / | \
  3   5  8  9
 / \ / \     \
15  6   7    11
```

Write a function that, for a given individual in our dataset, returns their earliest known ancestor -- the one at the farthest distance from the input individual. If there is more than one ancestor tied for "earliest", return any one of them. If the input individual has no parents, the function should return null (or -1).

Sample input and output:

```
parent_child_pairs_3 = [
    (2, 3), (3, 15), (3, 6), (5, 6), (5, 7),
    (4, 5), (4, 8), (4, 9), (9, 11), (14, 4),
]

find_earliest_ancestor(parent_child_pairs_3, 8) => 14
find_earliest_ancestor(parent_child_pairs_3, 7) => 14
find_earliest_ancestor(parent_child_pairs_3, 6) => 14
find_earliest_ancestor(parent_child_pairs_3, 15) => 2
find_earliest_ancestor(parent_child_pairs_3, 14) => null or -1
find_earliest_ancestor(parent_child_pairs_3, 11) => 14
```

Additional example:

```
  14
  |
  2      4    1
  |    / | \ /
  3   5  8  9
 / \ / \     \
15  6   7    11

parent_child_pairs_4 = [
    (2, 3), (3, 15), (3, 6), (5, 6), (5, 7),
    (4, 5), (4, 8), (4, 9), (9, 11), (14, 2), (1, 9)
]

find_earliest_ancestor(parent_child_pairs_4, 8) => 4
find_earliest_ancestor(parent_child_pairs_4, 7) => 4
find_earliest_ancestor(parent_child_pairs_4, 6) => 14
find_earliest_ancestor(parent_child_pairs_4, 15) => 14
find_earliest_ancestor(parent_child_pairs_4, 14) => null or -1
find_earliest_ancestor(parent_child_pairs_4, 11) => 4 or 1
```

n: number of pairs in the input


In [92]:
# Solve q3
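def create_ancestory_set(cp_dict, start_node):
    # Despite the name, this returns a dict that maps each ancestor of
    # start_node to its greatest distance (in generations) from start_node.
    visited = {}
    stack = [(start_node, 0)]

    while stack:
        node, depth = stack.pop()
        for parent in cp_dict.get(node, []):
            # Re-visit a parent whenever a longer path to it is found, so the
            # recorded distance does not depend on the traversal order.
            if depth + 1 > visited.get(parent, 0):
                visited[parent] = depth + 1
                stack.append((parent, depth + 1))
    return visited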
    
def earliest_ancestor(pairs, a):
    # C - P dictionary: maps each individual to the list of their known parents
    cp_dict = {}
    for p, c in pairs:
        cp_dict.setdefault(c, []).append(p)
        cp_dict.setdefault(p, [])

    visited = create_ancestory_set(cp_dict, a)

    # No known parents: return None, as the problem statement requires
    if not visited:
        return None

    # Pick the ancestor with the largest recorded distance (ties: first found)
    return max(visited, key=visited.get)

In [95]:
parent_child_pairs_3 = [
    (2, 3), (3, 15), (3, 6), (5, 6), (5, 7),
    (4, 5), (4, 8), (4, 9), (9, 11), (14, 4),
]

earliest_ancestor(parent_child_pairs_3, 15)

2
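
Another way to read "farthest distance" is to climb one generation at a time and remember the last non-empty generation. The following is a minimal sketch of that idea, assuming the input contains no cycles; the name `earliest_ancestor_by_levels` is hypothetical, and `min` is used only to break ties deterministically.

```python
def earliest_ancestor_by_levels(pairs, person):
    # Map each child to the set of its known parents
    parents = {}
    for p, c in pairs:
        parents.setdefault(c, set()).add(p)

    # frontier holds the ancestors reachable in exactly k steps, for k = 1, 2, ...
    frontier = parents.get(person, set())
    last = None
    while frontier:
        last = frontier
        frontier = {gp for node in frontier for gp in parents.get(node, ())}

    # last is the farthest non-empty generation, or None if person has no parents
    return min(last) if last else None

parent_child_pairs_3 = [
    (2, 3), (3, 15), (3, 6), (5, 6), (5, 7),
    (4, 5), (4, 8), (4, 9), (9, 11), (14, 4),
]
parent_child_pairs_4 = [
    (2, 3), (3, 15), (3, 6), (5, 6), (5, 7),
    (4, 5), (4, 8), (4, 9), (9, 11), (14, 2), (1, 9)
]

# Expected results copied from the problem statement
for person, want in [(8, 14), (7, 14), (6, 14), (15, 2), (14, None), (11, 14)]:
    assert earliest_ancestor_by_levels(parent_child_pairs_3, person) == want
for person, want in [(8, 4), (7, 4), (6, 14), (15, 14), (14, None)]:
    assert earliest_ancestor_by_levels(parent_child_pairs_4, person) == want
assert earliest_ancestor_by_levels(parent_child_pairs_4, 11) in (4, 1)
print("all samples pass")
```

Because an ancestor reachable by paths of different lengths reappears in later generations, the last non-empty frontier contains only individuals with no known parents.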