# Parallel Computation
### Segment 1 of 4

Parallel computation allows us to process multiple data elements simultaneously, which can speed up our processing or let us tackle bigger problems with more data.

<i>Lesson Developer: </i>
<ul>
    <li>
    <i>Eric Shook eshook@umn.edu</i>
    </li>
</ul>


In [None]:
# This code cell starts the necessary setup for Hour of CI lesson notebooks.
# First, it enables users to hide and unhide code by producing a 'Toggle raw code' button below.
# Second, it imports the hourofci package, which is necessary for lessons and interactive Jupyter Widgets.
# Third, it helps hide/control other aspects of Jupyter Notebooks to improve the user experience
# This is an initialization cell
# It is not displayed because the Slide Type is 'Skip'

from IPython.display import HTML, IFrame, Javascript, display
from ipywidgets import interactive, Textarea, HBox, Text
import ipywidgets as widgets
from ipywidgets import Layout

import getpass # This library allows us to get the username (User agent string)

# import package for hourofci project
import sys
sys.path.append('../../supplementary') # relative path (may change depending on the location of the lesson notebook)
# sys.path.append('supplementary')
import hourofci
try:
    import os
    os.chdir('supplementary')
except OSError:
    # If the 'supplementary' directory is not here, stay in the current directory
    pass

# load javascript to initialize/hide cells, get user agent string, and hide output indicator
# hide code by introducing a toggle button "Toggle raw code"
HTML(''' 
    <script type="text/javascript" src=\"../../supplementary/js/custom.js\"></script>
    
    <style>
        .output_prompt{opacity:0;}
    </style>
    
    <input id="toggle_code" type="button" value="Toggle raw code">
''')

## Thank you for helping our study


<a href="#/slide-1-0" class="navigate-right" style="background-color:blue;color:white;padding:8px;margin:2px;font-weight:bold;">Continue with the lesson</a>

Throughout this lesson you will see reminders, like the one below, to ensure that all participants understand that they are in a voluntary research study.

### Reminder

<font size="+1">

By continuing with this lesson you are granting your permission to take part in this research study for the Hour of Cyberinfrastructure: Developing Cyber Literacy for GIScience project. In this study, you will be learning about cyberinfrastructure and related concepts using a web-based platform that will take approximately one hour per lesson. Participation in this study is voluntary.

Participants in this research must be 18 years or older. If you are under the age of 18 then please exit this webpage or navigate to another website such as the Hour of Code at https://hourofcode.com, which is designed for K-12 students.

If you are not interested in participating please exit the browser or navigate to this website: http://www.umn.edu. Your participation is voluntary and you are free to stop the lesson at any time.

For the full description please navigate to this website: <a href="gateway-1.ipynb">Gateway Lesson Research Study Permission</a>.

</font>

### Purpose of Parallel Computation
Almost every desktop computer, laptop computer, cellphone, and server has more than one processing **core** used to make sense of data. Processors like these are called **multi-core processors.** These cores can help you find the nearest "Hip Po Coffee" location, estimate tomorrow's weather forecast, or calculate where stones thrown by a catapult will land. We use lots of terms for making sense of data: processing data, filtering data, querying data, analyzing data, data munging, and visualizing data are just a few examples.

What if we have lots and lots of data? So much data that it takes one processing core hours, days, weeks, or even years to filter, query, analyze, or visualize it? Parallel computation is a special type of computation that enables multiple calculations to be performed simultaneously using multiple processing cores.

Before we get into parallel computation, let's have a little race.

### The race is on! How fast can you plant a field?

Click on each cell to plant one seed. Let's see how fast you can do it.

<!-- {{IFrame("supplementary/field-manual.html", width="600", height="350")}} -->

In [None]:
%%html
<iframe src="supplementary/field-manual.html" width="600" height="350" allowfullscreen></iframe>


### Meet Sam.

Sam is grumpy and works alone. In fact, he will only do his work when no one else is in the field.

<!-- <img src="https://rawcdn.githack.com/coopbri/hci-binder/0a0f8e02e3a3e7429f881b69634200d671c0f560/notebooks/parallel-computation/supplementary/farmer.svg" width="150" height="150"> -->

<img src="supplementary/farmer.svg" width="150" height="150">

### Sam's turn to plant the field

Let's see how fast Sam can plant the field.

In [None]:

%%html
<iframe src="supplementary/field-one.html" width="600" height="350" allowfullscreen></iframe>


How quickly can Sam plant the entire field if he has all the seeds?

In [None]:
widget0 = widgets.BoundedIntText(
    value=0,
    min=0,
    max=100,
    step=1,
    description='Sam\'s fastest time (seconds):', style={'description_width': 'initial'},
    layout = Layout(width='40%'),
)

def out():
    return print('Submitted!')
display(widget0)

hourofci.SubmitBtn2(widget0, out)

What is the fundamental limitation in the planting time? Check all that apply.

In [None]:
check1 = widgets.Checkbox(
    value=False,
    description='Sam\'s speed',
    disabled=False
)
check2 = widgets.Checkbox(
    value=False,
    description='The number of spots to plant',
    disabled=False
)
check3 = widgets.Checkbox(
    value=False,
    description='Sam\'s shoe color',
    disabled=False
)

# Submit button
button1 = widgets.Button(
    description = 'Submit',
    disabled = False,
    button_style = '',
    icon = 'check'
)

# Output
output1 = widgets.Output()

display(check1, check2, check3, button1, output1)

# Output function
def out(b):
    with output1:
        output1.clear_output()
        if (check1.value and check2.value and not check3.value):
            print("Correct!")
        else:
            print("Not quite!")

# Handle click event
button1.on_click(out)

Sam's speed gives us an example of **processing speed**.<br> <br>
How fast can a core process data? This is usually measured in clock cycles per second (or Hertz/Hz). One Hertz is equal to one cycle per second. This represents the [Clock Rate](https://en.wikipedia.org/wiki/Clock_rate) of a processor. Today, most processors run between 1.8 and 3.0 Gigahertz/GHz. <br><br>
This means that processors have a clock speed between 1,800,000,000 and 3,000,000,000 clock cycles per second! The clock rate of a processor can tell you roughly how fast it can process data.
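
If you want to check the arithmetic yourself, here is a minimal sketch of the conversion. The function name `ghz_to_cycles_per_second` is made up for this example and is not part of the lesson's code.

```python
# Convert a clock rate in gigahertz (GHz) to clock cycles per second (Hz).
HZ_PER_GHZ = 1_000_000_000  # 1 GHz is one billion cycles per second


def ghz_to_cycles_per_second(ghz):
    """Return the number of clock cycles per second for a rate given in GHz."""
    return round(ghz * HZ_PER_GHZ)


for rate in (1.8, 3.0):
    print(f"{rate} GHz = {ghz_to_cycles_per_second(rate):,} cycles per second")
```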


The number of spots to plant gives us an example of **data to process**. How much data do we need to crunch before it is all done? <br><br>
Sam's shoe color is just a stylish choice. It may help Sam feel cool, but it won't get the field planted any faster.



How did you compare?


In [None]:
o1='faster'
o2='slower'
o3='about the same speed'

widget1 = widgets.RadioButtons(
    options = [o1, o2, o3],
    description = 'Compared to Sam planting the field, I was: ', style={'description_width': 'initial'},
    layout = Layout(width='100%'),
    value = None
)
def out():
    print("Successfully submitted!")

display(widget1)

hourofci.SubmitBtn2(widget1, out)

Sam is an example of what is known as **serial computation**. He will only work alone and does his tasks one step at a time. When anyone learns how to program they learn how to write a program for serial computation. The next step in programming is **parallel computation.**
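
Here is a rough sketch of how Sam works, written as a serial program. The field size of 20 spots and the name `plant_serially` are assumptions made for illustration. The loop visits one spot at a time, so the number of steps always equals the number of spots.

```python
# Serial computation: one worker handles every spot, one step at a time.
def plant_serially(num_spots):
    """Plant each spot in order and return the field and the steps taken."""
    field = [False] * num_spots      # False means "not planted yet"
    steps = 0
    for spot in range(num_spots):    # visit the spots one after another
        field[spot] = True           # plant a seed in this spot
        steps += 1
    return field, steps


field, steps = plant_serially(20)    # 20 spots is an assumed field size
print(f"Planted {sum(field)} spots in {steps} steps")
```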

### Meet Parker and Patricia. 

Parker and Patricia love to share, and they will work in the field at the same time. 
How quickly can Parker and Patricia plant the entire field if they split all the seeds evenly?



In [None]:
%%html
<iframe src="supplementary/field-two.html" width="650" height="450" allowfullscreen></iframe>


When working together, what is the fastest time in which Parker and Patricia can plant the field?


In [None]:
# TODO: Style the widget better, there's no easy way unfortunately
# Add end values, add units ('seconds') to current value, etc.
widget1 = widgets.IntSlider(
    value=1,
    min=0,
    max=20,
    step=1,
    description='Fastest time (0 - 20 seconds):', style={'description_width': 'initial'},
    layout = Layout(width='60%'),
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)


def out():
    return print('Submitted!')
display(widget1)

hourofci.SubmitBtn2(widget1, out)


What if Parker and Patricia change how they divide the field? <br>
Move the black bar to adjust how many seeds they will each plant. Use the dropdown menu to adjust their planting speed. Try different combinations to see how fast or slow Parker and Patricia can plant a field in parallel.

In [None]:
%%html
<iframe src="supplementary/field-two.html" width="650" height="450" allowfullscreen></iframe>


What is the slowest time possible if you adjust the speed and division of work for Parker and Patricia?

In [None]:
# TODO: Style the widget better, there's no easy way unfortunately
# Add end values, add units ('seconds') to current value, etc.
widget2 = widgets.IntSlider(
    value=1,
    min=0,
    max=20,
    step=1,
    description='Slowest time (0 - 20 seconds):', style={'description_width': 'initial'},
    layout = Layout(width='60%'),
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)

def out():
    return print('Submitted!')
display(widget2)

hourofci.SubmitBtn2(widget2, out)

What is the difference between the fastest and slowest time? Write 1-2 sentences describing what affects the times when Parker and Patricia are planting in parallel.

In [None]:

tarea1 = Textarea(
            value='',
            placeholder='Type your answer here',
            description='',
            disabled=False,
            layout=Layout( height='100px', min_height='100px', width='500px')
            )



def out():
    return print('Submitted!')
display(tarea1)

hourofci.SubmitBtn2(tarea1, out)


## Speed!
Getting the fastest time in this scenario requires two things. First, both Parker and Patricia need to move as quickly as possible, which means they are going the same speed: fast. Second, they need to have the same amount of work.

The first requirement gives us an example of **processing speed** that we learned about earlier.

The second requirement gives us an example of **load balancing.** Load balancing ensures that each core processes its fair share of the data, which makes parallel computation faster. As we saw with Parker and Patricia, when both plant their portions of the field simultaneously, the entire field is planted faster. If only Patricia plants the field and Parker does not help, it takes longer because she is doing all of the work. This is poor load balancing. The same is true in parallel computation: if we have two or more cores capable of processing data and we use only one, the other cores do nothing to speed up the computation.
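
To see why balance matters, here is a small sketch of the planting time for two workers. It assumes a field of 20 seeds and a speed of one seed per second for each worker. These numbers and the function names are made up for illustration and are not taken from the interactive field. Because both workers plant at the same time, the field is finished only when the slower of the two finishes.

```python
def planting_time(seeds, seeds_per_second):
    """Seconds one worker needs to plant their share of the seeds."""
    return seeds / seeds_per_second


def parallel_planting_time(parker_seeds, patricia_seeds,
                           parker_speed=1.0, patricia_speed=1.0):
    """The field is done when the last worker finishes, so take the maximum."""
    return max(planting_time(parker_seeds, parker_speed),
               planting_time(patricia_seeds, patricia_speed))


TOTAL_SEEDS = 20  # assumed field size

for patricia_seeds in (10, 15, 20):
    parker_seeds = TOTAL_SEEDS - patricia_seeds
    seconds = parallel_planting_time(parker_seeds, patricia_seeds)
    print(f"Parker: {parker_seeds:2d}, Patricia: {patricia_seeds:2d} "
          f"-> {seconds:.1f} seconds")
```

In this sketch the even split finishes in 10 seconds, while giving Patricia all 20 seeds takes 20 seconds, the same as one worker planting alone.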

## Let's take a step back
Before we keep moving, let's take a step back to look at what you learned.
* Serial computation - one task at a time versus
* Parallel computation - multiple tasks at a time
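
As a hedged sketch of the parallel case, the example below lets two workers plant their halves of a field at the same time using Python's standard `concurrent.futures` module. The field size, the `plant` function, and the short pause per seed are assumptions for illustration. Threads that pause are only a stand-in for real cores here. In standard CPython builds, heavy number crunching usually needs separate processes to run truly in parallel.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SECONDS_PER_SEED = 0.01  # assumed planting time, kept short for the demo


def plant(worker, spots):
    """Plant every spot in this worker's share and report what was planted."""
    planted = []
    for spot in spots:
        time.sleep(SECONDS_PER_SEED)  # pretend planting takes a moment
        planted.append(spot)
    return worker, planted


field = list(range(20))  # 20 spots is an assumed field size
shares = {"Parker": field[:10], "Patricia": field[10:]}

# Two workers run at the same time, each on their own half of the field.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(plant, name, spots) for name, spots in shares.items()]
    results = dict(future.result() for future in futures)

for name, planted in results.items():
    print(f"{name} planted {len(planted)} spots")
```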




## Let's take a step back
What affects parallel computation (parallelism)?
<ul>
    <li>
Processing speed - how quickly can you finish each step in the task
    </li> 
    <li>
Data to process - how much 'work' do you have to do
    </li>
    <li>
Load balancing - how 'fairly' is the workload distributed across the workers
    </li>
    <li>
Coordination - how much time must be spent coordinating, when no parallel work is happening
    </li>
</ul>
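
One rough way to see how these factors interact is a toy estimate. The sketch below assumes the work is perfectly balanced and that every extra worker adds a fixed coordination cost. Both are simplifying assumptions made for illustration, not a rule of parallel computing. The function name and all of the numbers are made up.

```python
def estimated_time(total_seeds, workers,
                   seeds_per_second=1.0, coordination_seconds=2.0):
    """Balanced planting time plus an assumed fixed cost per extra worker."""
    planting = total_seeds / (workers * seeds_per_second)
    coordination = coordination_seconds * (workers - 1)
    return planting + coordination


for workers in (1, 2, 4, 8):
    print(f"{workers} worker(s): {estimated_time(20, workers):.1f} seconds")
```

With these made-up numbers, four workers are the fastest, and eight workers are slower than two because coordination outweighs the time saved on planting.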




## Where else do we see this?
Cooking!
* We see parallelism everywhere including cooking.
* If you have ever asked a friend to cut vegetables while you mixed up a few ingredients, then congratulations, you have worked in parallel.
* Another form of this is called multi-tasking, where one person completes multiple tasks simultaneously.
* If you have turned on the oven to preheat, then chopped vegetables while listening to music and chewing gum at the same time, then congratulations, you have worked in parallel by multi-tasking.


## Different levels of parallelism
There are different levels of parallelism in parallel computing.
* We will delve into this in more detail later in the lesson. For now, remember that preheating the oven while listening to music and chewing gum is an example of parallelism, and so is three people working together in a kitchen to cook a meal. But the task coordination and the amount of parallelism are different.


## Concept check
How might parallelism be different from one person multitasking during cooking compared to multiple people working together to cook a meal?


In [None]:
tarea2 = Textarea(
            value='',
            placeholder='Type your answer here',
            description='',
            disabled=False,
            layout=Layout( height='100px', min_height='100px', width='500px')
            )

def out():
    print("Interesting answer, let's compare answers on the next slide")
display(tarea2)
hourofci.SubmitBtn2(tarea2, out)

## Concept check
How might parallelism be different from one person multitasking during cooking compared to multiple people working together to cook a meal? Here are a few differences:
* A single person can coordinate tasks in their head without having to verbally communicate, which can save time.
* Multiple people can run into each other in the kitchen.
* Multiple people can disagree about how and when tasks should be completed.
* Multiple people can (usually) accomplish more than a single person, which means the food will be ready earlier or more food can be prepared.
* People may work at different speeds, so tasks must be balanced to match.


<b>Continue the journey: </b><br><br>


<font size="+1"><a style="background-color:blue;color:white;padding:12px;margin:10px;font-weight:bold;" href="pc-3.ipynb">Click here to go to the next notebook.</a></font>