`Asynchronous programming`, or `async`, is all about concurrency: it enables multiple activities to make progress over the same period of time. The most important distinction you need to know about `async` is the difference between `parallelism` and `concurrency`.

`Parallelism` is about two or more things happening at the **same time**. `Concurrency` is about two or more things being handled over the same period of time.

Put another way, pick a single point in time, say second 0. If two things are both running at that instant, they're running in `parallel`. If one thing runs while the other is stopped, and they take turns running, we have `concurrency`.

On your computer, most things are running concurrently, not in parallel. The switching is just so fast that it makes you think everything is running at the same time.

No matter how powerful your CPU is, it only has a finite number of cores. In the simplified model used here, each core can perform one and only one instruction at any given point in time. How true that is depends on how precise you want to be: if one given point in time means one clock cycle, then each core can only do one of the several steps needed to finish an instruction, since an instruction usually requires more than one step. It's easier to understand `parallelism` and `concurrency` by going down to the clock cycle level.

Let's take an example with a very simple instruction: adding two numbers. We simplify it even further by saying that it takes 3 clock cycles to complete: load the first number, load the second number, then add the two numbers together.

|Clock cycle|One-core CPU|Two-core CPU|
|-|-|-|
|1|Load first number|Load first number|
|2|Load second number|Load second number|
|3|Add two numbers|Add two numbers|

Well, for this simple instruction, it's basically the same on both CPUs.
Let's make it a bit more complex by doing two computations at the same time: (a + b) and (c + d).

|Clock cycle|One-core CPU|Two-core CPU|
|-|-|-|
|1|Load a|Core 1: load a, core 2: load c|
|2|Load b|Core 1: load b, core 2: load d|
|3|Compute a + b|Core 1: compute a + b, core 2: compute c + d|
|4|Load c|Already finished|
|5|Load d|Already finished|
|6|Compute c + d|Already finished|

We can see that the one-core CPU doesn't have any choice but to run the steps sequentially. However, there's a different order in which it can run them:

|Clock cycle|One-core CPU|Two-core CPU|
|-|-|-|
|1|Load a|Core 1: load a, core 2: load c|
|2|Load c|Core 1: load b, core 2: load d|
|3|Load d|Core 1: compute a + b, core 2: compute c + d|
|4|Load b|Already finished|
|5|Compute a + b|Already finished|
|6|Compute c + d|Already finished|

Here the steps of (a + b) were paused in the middle to run steps of (c + d). This assumes that the CPU has enough registers to hold that many values at the same time (in this example we need 4 registers, or probably 6 if we count 2 more to hold the results).
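
The interleaving above can be imitated in a few lines of Python. This is only a sketch of the idea, not how a CPU actually works: each addition is written as a generator whose `yield` marks the end of one "clock cycle", and a tiny round-robin loop plays the role of the single core. The names `add`, `run_on_one_core` and `registers` are made up for this illustration.

```python
from collections import deque


def add(name_x, name_y, registers):
    """One addition split into three steps; each `yield` ends one 'clock cycle'."""
    x = registers[name_x]
    yield f"load {name_x}"
    y = registers[name_y]
    yield f"load {name_y}"
    registers[f"{name_x}+{name_y}"] = x + y
    yield f"compute {name_x} + {name_y}"


def run_on_one_core(tasks):
    """Round-robin: run one step of a task, then switch to the next task."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # run exactly one step
            queue.append(task)        # then send the task to the back of the line
        except StopIteration:
            pass                      # this task has no steps left
    return trace


registers = {"a": 1, "b": 2, "c": 3, "d": 4}
trace = run_on_one_core([add("a", "b", registers), add("c", "d", registers)])
for cycle, step in enumerate(trace, start=1):
    print(cycle, step)
```

The steps come out as load a, load c, load b, load d, compute a + b, compute c + d. That is a slightly different order from the table, but the point is the same: only one step runs per cycle, both computations make progress, and both results are correct.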

Hopefully, at this point, you clearly understand the difference between `parallelism` and `concurrency`. This may seem too simple, but it is roughly how things physically work in your CPU, and hence in all of your programs. Once you understand this, you are well on your way to understanding `async` and how it differs from `threading` and `subprocesses`.

Simply put, `subprocesses` give you true `parallelism`, while `threading` and `async` give you `concurrency`.

`subprocesses` can leverage all of your CPU cores: the work is split up and the parts run in parallel.<br>
`threading` and `async`, in the simplified model used in this lesson, use one core to run multiple tasks, so the core is shared across tasks. When one runs, the others have to stop and wait.
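
As a small, hedged illustration of the structural difference, the sketch below starts one thread and one subprocess and compares process IDs. It only shows that a thread lives inside the current process while a subprocess is a separate one. Whether separate processes really end up on different cores is decided by the OS scheduler, which this example does not measure.

```python
import os
import subprocess
import sys
import threading

parent_pid = os.getpid()

# A thread lives inside the current process, so it reports the same PID.
thread_pids = []
worker = threading.Thread(target=lambda: thread_pids.append(os.getpid()))
worker.start()
worker.join()

# A subprocess is a separate OS process with its own PID.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True,
    text=True,
    check=True,
)
child_pid = int(result.stdout.strip())

print("parent:", parent_pid, "thread:", thread_pids[0], "child:", child_pid)
```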

So, what is the difference between `threading` and `async`? Generally speaking, they're not that different. However, there's a style of `async` called `cooperative async`, and throughout this lesson `async` refers to that style.<br>
First things first: we know that `threading` and `async` are both forms of `concurrency`. They're about starting and stopping tasks so that, at any given time, only one task runs. The difference is in how the starting and stopping is controlled: when exactly does a task stop and give control (the resource) to other tasks?

In `threading`, the underlying Operating System (OS) is usually the one that controls this process. Threads that are created and controlled by the OS are called `native threads`. Your program can also control this process itself by implementing its own policy for how threads are stopped and started. Threads of this type are called `green threads`.

Usually, whoever controls a bunch of threads doesn't know the nature of the work those threads carry out. Therefore, the threads are stopped and started in a fairly arbitrary manner. Some OSes define a `priority level`: a thread with a higher `priority` is allowed to use the resource until it completes, and only then do lower-priority threads run. However, threads with the same `priority level` are still coordinated in an arbitrary manner.

By arbitrary, I mean a thread can be stopped **at any time** and forced to give control to other threads. If the thread is stopped in the middle of a critical process that updates the state of the system, it leaves the system in an invalid state. This can also lead to integrity issues when two or more threads access the same resource. This is known as a `race condition`.

Take a simple example: we have a process that increases a counter by one every time something happens. The steps are as below:
1. get current counter
2. increase the counter
3. store the counter back
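
If you are working in Python, you can see these three steps hiding behind a single line of code. The sketch below uses the standard `dis` module to list the bytecode instructions behind `counter += 1`. Bytecode is a Python-level notion rather than what the CPU executes, and the exact instruction names vary between Python versions, so treat the output as an illustration only. The function name `increment` is made up for this example.

```python
import dis

counter = 0


def increment():
    global counter
    counter += 1  # looks like one step, but compiles to several instructions


# Collect the instruction names without running the function.
ops = [instruction.opname for instruction in dis.get_instructions(increment)]
print(ops)
```

In the printed list, an instruction that loads the global `counter` comes first, an add follows, and an instruction that stores the result back comes last. A thread can be stopped between any two of them.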

There are no issues if these three steps are always carried out together in that exact order. A `race condition` happens when two or more threads try to do the same thing to our `counter`. Compare a lucky ordering with an unlucky one:

|Order|Steps|value of counter|
|-|-|-|
|1|Thread 1 get current counter|0|
|2|Thread 1 increase counter|1|
|3|Thread 1 store counter|1|
|4|Thread 2 get current counter|1|
|5|Thread 2 increase counter|2|
|6|Thread 2 store counter|2|

Final counter value is 2: both increments are counted.

|Order|Steps|value of counter|
|-|-|-|
|1|Thread 1 get current counter|0|
|2|Thread 2 get current counter|0|
|3|Thread 1 increase counter|1|
|4|Thread 1 store counter|1|
|5|Thread 2 increase counter|1|
|6|Thread 2 store counter|1|

Final counter value is 1: thread 2 worked from a stale value, so one increment is lost.
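
A real race depends on timing, so it does not show up on every run. To make the unlucky ordering reproducible, the sketch below forces it with a `threading.Barrier`: neither thread may store its result until both have read the counter. The barrier is there only for demonstration, and the names `state`, `both_have_read` and `unsafe_increment` are made up for this example.

```python
import threading

state = {"counter": 0}
both_have_read = threading.Barrier(2)


def unsafe_increment():
    current = state["counter"]      # step 1: get current counter
    both_have_read.wait()           # force the unlucky ordering: both threads read first
    state["counter"] = current + 1  # steps 2 and 3: increase, then store back


threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(state["counter"])  # 1, not 2: one increment was lost
```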

This simple example shows how easy it is to mess up a naive `threading` application. Of course, there are many solutions to this race condition. The most popular is to apply a `lock` to a critical resource such as the counter in the example above. The lock makes sure that only one thread can access the resource at a time, so the resource is updated consistently. However, there are also downsides to using a `lock`. The two most common issues are:
1. Forgetting to lock a particular resource, which can mess up the whole system.
2. Locking too much, which slows your application down, because every other thread needs to wait for one thread to finish using a resource. Careless use of multiple locks can also lead to `deadlock`, where thread A holds a lock on resource RA and needs another resource RB, which is locked by thread B, which in turn needs RA (held by thread A) to complete its job and release RB.
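
Here is a minimal sketch of the locking approach in Python, assuming the shared resource is a simple counter. The names `state`, `counter_lock` and `safe_increment` are made up for this illustration. Every read-modify-write happens while holding the lock, so no update can be lost.

```python
import threading

state = {"counter": 0}
counter_lock = threading.Lock()


def safe_increment(times):
    for _ in range(times):
        # Only one thread at a time can be inside this block.
        with counter_lock:
            state["counter"] += 1


threads = [threading.Thread(target=safe_increment, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(state["counter"])  # 40000
```

Notice that the first downside is already visible here: nothing forces other code to take `counter_lock` before touching `state`, so the protection only works if every piece of code remembers to do it.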

There are other issues such as `starvation` and `greedy`, but that's a topic for a different time.


The point is that `threading` comes with many issues, and these issues are due to a lack of coordination between threads. That's exactly why we have `async`. Remember that `async` here refers to `cooperative async`, and the key word is `cooperative`. Imagine that, somehow, we can ensure that a thread or task doesn't hold on to any lock or leave the system in an invalid state (i.e. in the middle of an update or mutation) before it yields control to a different task. It's like saying:<br>
"I have finished my job, you can do yours."<br>
or<br>
"I have finished a critical part of my job. You can do yours, and don't worry about violating my state or being unable to access something you need for your job."

Seems cool, right? But wait a minute: if we let one task finish and then start another one, isn't that just sequential? And if only one task can actually run at a time, why would we want to stop it halfway?

I'm glad you asked. It's about using a resource efficiently, and the resource in this case is the CPU.

When a task runs, it doesn't necessarily need the CPU 100% of the time. There are `IO operations` such as reading and writing files on disk, or making a network call and then waiting for the response. These operations don't require CPU power. So, if a task is given the CPU but ends up waiting for `IO`, it's wasting CPU time. The whole concept of `async` is about making sure the CPU spends as much of its time as possible doing useful work.
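
To make the cooperative idea concrete, here is a small sketch using Python's `asyncio`, which is one implementation of cooperative async. A task gives up the CPU only where it writes `await`, which is typically the point where it would wait for IO. Here `asyncio.sleep(0)` stands in for that wait, and the names `state` and `worker` are made up for this example. Because nothing can interrupt the code between two `await`s, the counter needs no lock.

```python
import asyncio

state = {"counter": 0}


async def worker(times):
    for _ in range(times):
        # Read-modify-write with no `await` in between: no other task can
        # run here, so the update cannot be interrupted halfway.
        current = state["counter"]
        state["counter"] = current + 1
        # The only place this task gives up the CPU. The zero-length sleep
        # stands in for waiting on IO.
        await asyncio.sleep(0)


async def main():
    await asyncio.gather(*(worker(1_000) for _ in range(4)))


asyncio.run(main())
print(state["counter"])  # 4000
```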

Another simple illustration will help:
```
Time   1|||||||||2|||||||||3|||||||||4|||||||||5|||||||||6|||||||||7|||||||||8

Scenario 1: task 2 never gets the CPU while task 1 waits for IO
Task 1 |----------------|=====|-----------|xxxxxxxxxxxx|------|finished
Task 2 |================|-----|========================|======|-----------|finished

Scenario 2: task 2 gets the CPU while task 1 waits for IO
Task 1 |----------------|=====|-----------|xxxxxxxxxxxx|------|finished
Task 2 |================|-----|===========|-----------|finished

---- time that task uses CPU
==== time that task is waiting for CPU (CPU is being used by another task)
xxxx dead time for IO
```
In the first scenario (the top pair of rows), task 2 has to wait even though task 1 is waiting for IO and not using the CPU.
In the second scenario (the bottom pair of rows), task 2 is given the CPU as soon as task 1 begins waiting for IO. Thus, in scenario 2, the overall time for the two tasks is shorter.
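
The same effect can be measured with a rough sketch in Python's `asyncio`. Two tasks each "wait for IO" for 0.2 seconds, with `asyncio.sleep` standing in for a disk or network wait. Running them one after another corresponds to scenario 1, and letting the waits overlap corresponds to scenario 2. The function names and the 0.2 second duration are arbitrary choices for this illustration, and the tasks here do no real CPU work.

```python
import asyncio
import time


async def task(name, io_seconds):
    await asyncio.sleep(io_seconds)  # stands in for waiting on disk or network
    return name


async def one_after_another():
    # Scenario 1: the second task starts only after the first has finished.
    await task("task 1", 0.2)
    await task("task 2", 0.2)


async def overlapping():
    # Scenario 2: while one task waits for IO, the other gets to run.
    await asyncio.gather(task("task 1", 0.2), task("task 2", 0.2))


def timed(coroutine_function):
    start = time.perf_counter()
    asyncio.run(coroutine_function())
    return time.perf_counter() - start


sequential_time = timed(one_after_another)
overlapped_time = timed(overlapping)
print(f"one after another: {sequential_time:.2f}s, overlapping: {overlapped_time:.2f}s")
```

On a typical machine the first number is close to 0.4 seconds and the second close to 0.2 seconds, because the two waits happen during the same stretch of time.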

# Summary
We were introduced to the concept of `async`. The key difference between `parallelism` and `concurrency` gives us the basis to further explore `threading`, `subprocesses` and `cooperative async`. By highlighting the differences between these concepts, we saw how they relate to each other and, ultimately, learned the purpose of writing `async` programs: to leverage CPU time to handle as much work as possible in a given amount of time.