# High Throughput Computing

OSG Services like the OSPool are designed to support an approach to computing called high throughput computing (HTC). What is this approach, and would it work for you?

## Solving Computing Problems

How do you solve a big computational problem? For example, one that involves:
* Lots of data
* Thousands of parameters
* Many grid points 

![problem](images/02-00-problem.png)

To solve a big problem, use more computers.

![solution](images/02-10-solution.png)

There are different ways to do this: 
* High throughput computing
* High performance computing

We're going to focus on high throughput computing. 

## High Throughput Computing

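In high throughput computing, a big problem is broken into sub-problems that are separate and self-contained, each with its own input and output. Because the sub-problems don't depend on one another, they can run on many computers at the same time. This approach scales well and doesn't require any special programming.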

![diagram](images/02-20-htc-diagram.png)
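To get a rough feel for why this scales, here is a minimal back-of-the-envelope sketch in Python. It assumes an idealized case: every task takes the same time and there is no queue wait, data transfer, or interruption. The function name `ideal_wall_time_hours` and the numbers are hypothetical, chosen only for illustration.

```python
import math


def ideal_wall_time_hours(n_tasks, hours_per_task, n_slots):
    """Idealized wall-clock time for independent, equal-length tasks.

    Assumption: tasks run in "waves", one task per slot per wave,
    with no scheduling overhead, data transfer, or interruptions.
    """
    if n_tasks < 0 or hours_per_task < 0 or n_slots < 1:
        raise ValueError("need n_tasks >= 0, hours_per_task >= 0, n_slots >= 1")
    waves = math.ceil(n_tasks / n_slots)  # how many rounds of tasks are needed
    return waves * hours_per_task


if __name__ == "__main__":
    # Hypothetical workload: 1000 independent tasks of 1 hour each.
    for slots in (1, 10, 100):
        hours = ideal_wall_time_hours(1000, 1.0, slots)
        print(f"{slots:>3} slots -> {hours} hours")
```

Real workloads will take longer than this estimate, but the trend holds: because the tasks are independent, adding computers shortens the total time without changing the code for any single task.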

One of our favorite HTC examples is baking the world’s largest/longest cake:

![cake](images/02-25-cake.avif)

In computational terms, this means solving a big problem (the world’s longest cake) by executing many small, self-contained tasks (individual cakes) and joining the results.
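The split, run, and join pattern can be sketched in a few lines of Python. This is only an illustration on a single machine: a thread pool stands in for "many computers", and the names `bake_piece` and `bake_long_cake` are hypothetical. On the OSPool, each task would instead be submitted as its own job.

```python
from concurrent.futures import ThreadPoolExecutor


def bake_piece(piece_id):
    """One self-contained task: its own input (piece_id), its own output."""
    return f"cake-{piece_id:03d}"


def bake_long_cake(n_pieces, n_workers=4):
    """Run every task independently, then join the outputs in order."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() returns results in input order, even if tasks finish out of order.
        pieces = list(pool.map(bake_piece, range(n_pieces)))
    return pieces


if __name__ == "__main__":
    cake = bake_long_cake(10)
    print(len(cake), "pieces, starting with:", " + ".join(cake[:3]))
```

Note that `bake_piece` never looks at any other piece. That independence is what lets the tasks run anywhere, in any order.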

### HTC on the Open Science Pool 

The OSPool is a good fit for HTC workloads that can be distributed and open: 

* Jobs are short or resumable
    * Because the OSPool backfills other resources, interruptions are possible.
* Jobs have a laptop-sized resource profile
    * Best for numerous jobs that each use one (or a few) CPUs and <16GB of memory.
* Individual jobs need/produce less than 20GB of data
    * Because OSPool resources are distributed across the US (and the world), moving more than ~20GB of data per job can be prohibitive.
* Jobs' software and data can be ‘open’
    * Open-source software (no restrictive licensing), unrestricted data (no HIPAA, etc.)
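As a quick self-check, the guidelines above can be written down as a small Python function. This is a hypothetical sketch, not an official OSPool tool: the function name is made up, and `MAX_CPUS = 4` is an assumption standing in for "one (or few)". The memory and data limits come from the list above.

```python
MAX_CPUS = 4        # assumption: a stand-in for "one (or few) CPUs"
MAX_MEMORY_GB = 16  # from the guideline: <16GB memory per job
MAX_DATA_GB = 20    # from the guideline: <20GB of data per job


def ospool_concerns(cpus, memory_gb, data_gb, short_or_resumable, open_software_and_data):
    """Return a list of reasons a job may be a poor fit (empty list = looks fine)."""
    concerns = []
    if not short_or_resumable:
        concerns.append("long, non-resumable jobs may be interrupted")
    if cpus > MAX_CPUS:
        concerns.append(f"needs {cpus} CPUs (more than a few)")
    if memory_gb >= MAX_MEMORY_GB:
        concerns.append(f"needs {memory_gb}GB memory (guideline is <{MAX_MEMORY_GB}GB)")
    if data_gb >= MAX_DATA_GB:
        concerns.append(f"moves {data_gb}GB of data (guideline is <{MAX_DATA_GB}GB)")
    if not open_software_and_data:
        concerns.append("restricted software or data")
    return concerns


if __name__ == "__main__":
    # Hypothetical job: 1 CPU, 4GB memory, 2GB data, short, fully open.
    print(ospool_concerns(1, 4, 2, True, True) or "no concerns")
```

A non-empty result does not rule out HTC. It only suggests the workload may run better on a local, dedicated HTC system.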

What workloads are good for the OSPool*?

![table](images/02-30-table.png)

The “less-so, but maybe” column covers workloads that could still be HTC workloads, but that would run more effectively on a local, dedicated HTC system than on the OSPool.

### HTC Examples

Could your research be run in an HTC way? Here are a few examples of research domains that have "HTC" problems:

![htc-research](images/02-40-htc-research.png)

If you want to learn more about how researchers have taken advantage of an HTC approach, check out the following talks:

* Researcher talks at 2020 OSG All Hands Meeting
    * https://indico.fnal.gov/event/22127/timetable/#20200831.detailed 
* Researcher talks at 2019 OSG User School
    * https://opensciencegrid.org/user-school-2019/#materials/
* Researcher talks at 2019 OSG All Hands Meeting
    * https://indico.cern.ch/event/759388/timetable/#20190319.detailed 

---

[Next Lesson: Using the OSPool](03-Using-the-OSPool.ipynb)