# The Immune System Challenge

A group of immunologists wants to explore the complement system in the human body. They have access to a supercomputer and plan to iteratively simulate different complement system configurations.

In [0]:
compliment_system = ['C3b', 'C3a', 'Bb', 'Ba', 'C4b', 'C4a', 'C2b', 'C2a', 'D', 'P', 'C1q', 'C1r', 'C1s', 'MBL', 'MASP-1', 'MASP-2', 'C5b', 'C5a', 'C6', 'C7', 'C8', 'C9', 'C1INH', 'MCP', 'DAF', 'H', 'C4bp', 'CD59', 'CR1', 'CR2', 'CR3', 'CR4']

compression1 = {
    'detectors': [1,2],
    'responders': [4,22],
    'activators': [5,23,31],
    'enablers': [1,9,10],
    'catalyzer': [12,29],
    'upgraders': [23,24,25],
    'chains': [27],
    'trappers': [13],
    'finishers': [16]
}

The following code has been written for this purpose:

In [0]:
def expand(array, pair):
    """
      Take a list of dictionaries and return an expanded list based on a [key, value] pair.
      :param list array: list of all dictionaries such as [{'detectors': 'C3a', ...}, {'detectors': 'Bb', ...}]
      :param list pair: the pair such as ['responders', [4,22]]
      :returns output: an expanded list
      :rtype list:
    """
    output = []
    for candidate in array: # go into the previously expanded list
        for option_index in pair[1]: # go into each item of the value (the list of indexes)
            candidate[pair[0]] = compliment_system[option_index] # change the list into a single value
            output.append(candidate.copy()) # put it in the output
    return output # so out1 can keep expanding

def process(in1):
    """
      Go over the items in a dictionary and call expand for each to individualize the values.
      :param dict in1: dictionary that maps each key to a list of indexes into compliment_system
      :returns out1: individualized items of the dictionary.
      :rtype list:
    """
    out1 = [in1] # make an output value which is a list
    for k,v in in1.items(): # [('detectors', [1,2])...]
        out1 = expand(out1, [k,v]) # calls expand for every [key, value] pair
    return out1
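
The nested loops in `expand` amount to a Cartesian product over the option lists. As a minimal sketch, assuming the same data as above, the same list can be built with `itertools.product`. The name `process_product` is hypothetical and not part of the original code; unlike `process`, it does not modify the dictionary it receives.

```python
from itertools import product

compliment_system = ['C3b', 'C3a', 'Bb', 'Ba', 'C4b', 'C4a', 'C2b', 'C2a', 'D', 'P',
                     'C1q', 'C1r', 'C1s', 'MBL', 'MASP-1', 'MASP-2', 'C5b', 'C5a',
                     'C6', 'C7', 'C8', 'C9', 'C1INH', 'MCP', 'DAF', 'H', 'C4bp',
                     'CD59', 'CR1', 'CR2', 'CR3', 'CR4']

compression1 = {
    'detectors': [1, 2], 'responders': [4, 22], 'activators': [5, 23, 31],
    'enablers': [1, 9, 10], 'catalyzer': [12, 29], 'upgraders': [23, 24, 25],
    'chains': [27], 'trappers': [13], 'finishers': [16],
}

def process_product(config, names):
    """Return every individualized dictionary for `config` as a list (hypothetical helper)."""
    keys = list(config)
    # One list of component names per key, in the same key order as `config`.
    choices = [[names[i] for i in config[key]] for key in keys]
    # product() varies the last key fastest, like the nested loops in expand().
    return [dict(zip(keys, combo)) for combo in product(*choices)]

results = process_product(compression1, compliment_system)
print(len(results))  # 216
print(results[0])
```

This still materializes the whole list, so it has the same memory cost as `process`; it only makes the structure of the computation explicit.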

In [14]:
process(compression1.copy()) # this is what the output looks like

[{'activators': 'C4a',
  'catalyzer': 'C1s',
  'chains': 'CD59',
  'detectors': 'C3a',
  'enablers': 'C3a',
  'finishers': 'C5b',
  'responders': 'C4b',
  'trappers': 'MBL',
  'upgraders': 'MCP'},
 {'activators': 'C4a',
  'catalyzer': 'C1s',
  'chains': 'CD59',
  'detectors': 'C3a',
  'enablers': 'C3a',
  'finishers': 'C5b',
  'responders': 'C4b',
  'trappers': 'MBL',
  'upgraders': 'DAF'},
 {'activators': 'C4a',
  'catalyzer': 'C1s',
  'chains': 'CD59',
  'detectors': 'C3a',
  'enablers': 'C3a',
  'finishers': 'C5b',
  'responders': 'C4b',
  'trappers': 'MBL',
  'upgraders': 'H'},
 {'activators': 'C4a',
  'catalyzer': 'CR2',
  'chains': 'CD59',
  'detectors': 'C3a',
  'enablers': 'C3a',
  'finishers': 'C5b',
  'responders': 'C4b',
  'trappers': 'MBL',
  'upgraders': 'MCP'},
 {'activators': 'C4a',
  'catalyzer': 'CR2',
  'chains': 'CD59',
  'detectors': 'C3a',
  'enablers': 'C3a',
  'finishers': 'C5b',
  'responders': 'C4b',
  'trappers': 'MBL',
  'upgraders': 'DAF'},
 {'activators': '

The scientists want to perform a more comprehensive test, but the size of the test soon proves to be an issue. The first time they try `compression2`, they quickly run out of RAM.

Even after buying lots of RAM and significantly reducing the size of the lists, a failure can still occur while the individualized files are being sent to the supercomputer. They have no way of finding out which run was the last successful one so that they can resume from there.

In [0]:
compression2 = {
    'detectors': [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'responders': [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'activators': [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'enablers': [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'catalyzer': [0 ,1 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'upgraders': [0 ,1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'chains': [0 ,1 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'trappers': [0 ,1 ,2 ,3 ,4 ,5 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31],
    'finishers': [0 ,1 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20 ,21 ,22 ,23 ,24 ,25 ,26 ,27 ,28 ,29 ,30, 31]
}

In [0]:
len(process(compression1.copy())) #this is pretty fast because we only have 216 values....

216

In [0]:
# However, the command below will exhaust the available RAM if uncommented
# done = process(compression2.copy())
32**9 # upper bound on the number of combinations; impossible to do this in one go
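
The number of individualized dictionaries is the product of the list lengths, so it can be computed without building anything. The sketch below assumes Python 3.8+ for `math.prod`; `count_combinations` and `without` are hypothetical helpers, and `compression2` is rebuilt here by listing the indexes that each of its lists leaves out.

```python
from math import prod

def count_combinations(config):
    """Number of dictionaries process() would return, without building them."""
    return prod(len(options) for options in config.values())

def without(*missing):
    """All indexes 0..31 except the ones listed (hypothetical helper)."""
    return [i for i in range(32) if i not in missing]

compression1 = {
    'detectors': [1, 2], 'responders': [4, 22], 'activators': [5, 23, 31],
    'enablers': [1, 9, 10], 'catalyzer': [12, 29], 'upgraders': [23, 24, 25],
    'chains': [27], 'trappers': [13], 'finishers': [16],
}

# Same contents as compression2, written as "everything except ...".
compression2 = {
    'detectors': without(9, 19, 20), 'responders': without(),
    'activators': without(23), 'enablers': without(),
    'catalyzer': without(2, 3, 4), 'upgraders': without(),
    'chains': without(2, 13, 21), 'trappers': without(6),
    'finishers': without(2),
}

print(count_combinations(compression1))  # 216
print(count_combinations(compression2))  # 23808334200832
print(32**9)                             # 35184372088832
```

The exact count for `compression2` is somewhat below `32**9`, but at roughly 2.4e13 dictionaries it is still far too large to hold in memory at once.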

**Homework: Create a function called `bigDataProcess` that makes the same output as `process` but does two things:**


1.   It returns only a small batch of the total results each time you call it. For example, it returns the first 1000 (or any RAM-friendly number) upon request, then the next 1000 on the following request.
2.   It can tell you which batch or individualized dictionary was processed last, in case there was a failure.

Make sure you use a docstring of your choice to explain your code. Use a debugger in case of any issues.



In [0]:
k_b=''
v_b=[]
count=0
def expand(array, pair,k_b,v_b,count):
    output = []
    for candidate in array: # go into the previously expanded list
        for option_index in pair[1]: # go into each item of the value (the list of indexes)
            if count<100: # stop when the count reaches the maximum
                candidate[pair[0]] = compliment_system[option_index] # change the list into a single value
                output.append(candidate.copy()) # put it in the output
                count+=1
                #print (count)
                k_b=''
                v_b=0
            else:
                k_b=pair[0] # once the maximum is reached, remember the key being expanded
                v_b=pair[1] # and its list of indexes
                #print ("else active count is %s"%count)
    return output,k_b,v_b,count # so out1 can keep expanding

def bigDataProcess(in1,count,k_b,v_b):
    """
    Take the input dictionary; the other arguments are "memory" used to return another batch if the function is run again.
    k_b is the key where the previous run stopped.
    v_b is the list of indexes attached to that key.
    A variable that keeps track of the option_index is still missing from the code.
    """

    out1 = [in1] # make an output value which is a list
    for k,v in in1.items(): # [('detectors', [1,2])...]
        # Intended: resume from the stored key if k_b is not empty and the maximum count has not been reached.
        # Note: `&` binds more tightly than `>`, so this condition is evaluated as len(k_b) > (0 & (count<100)), i.e. len(k_b) > 0.
        if len(k_b)>0 & (count<100):
            out1,k_b,v_b,count = expand(out1, [k_b,v_b],k_b,v_b,count)
        elif count<100:
            out1,k_b,v_b,count = expand(out1, [k,v],k_b,v_b,count) # calls expand for every [key, value] pair
    #print (count,k_b,v_b)
    #print (len(out1))
    return out1,k_b,v_b

In [4]:
out1,k_b,v_b=bigDataProcess(compression2.copy(),count,k_b,v_b)
print (out1)

[]


In [5]:
k_b

'responders'
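
The attempt above returns an empty list because the intermediate lists are cut off before every key has been expanded. One possible alternative, offered as a sketch and not as the expected answer, is to number the combinations from 0 upward and decode each number directly into its dictionary, the way digits are read off in a mixed-radix number. The checkpoint is then a single integer. The parameters `names`, `start` and `batch_size`, the returned `next_start`, and the helper `combination_at` are all assumptions introduced for this illustration; `math.prod` needs Python 3.8+.

```python
from math import prod

compliment_system = ['C3b', 'C3a', 'Bb', 'Ba', 'C4b', 'C4a', 'C2b', 'C2a', 'D', 'P',
                     'C1q', 'C1r', 'C1s', 'MBL', 'MASP-1', 'MASP-2', 'C5b', 'C5a',
                     'C6', 'C7', 'C8', 'C9', 'C1INH', 'MCP', 'DAF', 'H', 'C4bp',
                     'CD59', 'CR1', 'CR2', 'CR3', 'CR4']

compression1 = {
    'detectors': [1, 2], 'responders': [4, 22], 'activators': [5, 23, 31],
    'enablers': [1, 9, 10], 'catalyzer': [12, 29], 'upgraders': [23, 24, 25],
    'chains': [27], 'trappers': [13], 'finishers': [16],
}

def combination_at(config, names, index):
    """Decode a flat index into one individualized dictionary (hypothetical helper)."""
    picked = {}
    # Peel off one "digit" per key, last key first, because process()
    # varies the last key fastest.
    for key in reversed(list(config)):
        options = config[key]
        index, position = divmod(index, len(options))
        picked[key] = names[options[position]]
    return {key: picked[key] for key in config}  # restore the original key order

def bigDataProcess(config, names, start=0, batch_size=1000):
    """
      Return one batch of individualized dictionaries plus a checkpoint.
      :param dict config: maps each key to a list of indexes into `names`
      :param list names: component names, e.g. compliment_system
      :param int start: flat index of the first combination to produce
      :param int batch_size: maximum number of dictionaries to return
      :returns: (batch, next_start); pass next_start as `start` to continue
      :rtype tuple:
    """
    total = prod(len(options) for options in config.values())
    if not 0 <= start <= total:
        raise ValueError("start must be between 0 and %d" % total)
    stop = min(start + batch_size, total)
    batch = [combination_at(config, names, i) for i in range(start, stop)]
    return batch, stop

checkpoint = 0      # persist this value to be able to resume after a failure
batch_sizes = []
while True:
    batch, next_start = bigDataProcess(compression1, compliment_system,
                                       checkpoint, batch_size=100)
    if not batch:
        break
    # Send `batch` to the supercomputer here; advance only after it succeeds.
    batch_sizes.append(len(batch))
    checkpoint = next_start

print(batch_sizes)  # [100, 100, 16]
print(checkpoint)   # 216
```

Because each call builds only `batch_size` dictionaries, memory use does not depend on the total number of combinations, so the same call pattern should also apply to `compression2`. If a transfer fails, the last stored `checkpoint` identifies the first dictionary that still has to be sent.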