<p style="font-size: 30px; font-weight: bold;">Comments about the Exercises</p>

# Exercise

## Task

* It is important to understand the difference between copy-constructible, copy-assignable, move-constructible, and move-assignable types.
* To express these restrictions, the class is declared as a template, and each constrained method gets its own template parameters that state the characteristics required of the type. The following is an example:

```cpp
#include <memory>
#include <mutex>
#include <stack>
#include <type_traits>
#include <utility>

template<typename T>
class threadsafe_stack {
private:
    // Each element is stored wrapped in a std::unique_ptr<T>,
    // so the underlying stack holds pointers rather than T itself.
    std::stack<std::unique_ptr<T>> data{};
    // mutable lets const member functions lock the mutex.
    mutable std::mutex m{};
public:
    threadsafe_stack() = default;

    // Pushes one element using copy semantics.
    // This overload only exists if T is nothrow copy-constructible
    // and nothrow copy-assignable.
    template<typename D = T, std::enable_if_t<std::is_same_v<D, T> &&
                                              std::is_nothrow_copy_constructible_v<D> &&
                                              std::is_nothrow_copy_assignable_v<D>,
                                             bool> = true>
    void push(const T &value) {
        std::lock_guard<std::mutex> lock(m);

        auto node = std::make_unique<T>(value);
        data.push(std::move(node));
    }
};
```

* Here the `template` on the method checks that the type is copy-constructible and copy-assignable without throwing. `enable_if_t` removes the overload when the condition fails, and the extra parameter `D` together with `is_same_v<D, T>` is needed because the condition must depend on a template parameter of the method itself.
* Every `push` or `pop` must lock the stack with the mutex, as in the example above.
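The same pattern can be applied to a move overload and to a `pop`. The following is only a minimal sketch under assumptions: the names `locked_stack`, `try_pop`, and `stack_demo` are hypothetical and do not come from the exercise, and returning `nullptr` for an empty stack is just one possible design.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <stack>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical name; a reduced variant of the class shown above.
template <typename T>
class locked_stack {
    std::stack<std::unique_ptr<T>> data{};
    mutable std::mutex m{};

public:
    // Move overload: only exists if T is nothrow move-constructible.
    template <typename D = T,
              std::enable_if_t<std::is_same_v<D, T> &&
                                   std::is_nothrow_move_constructible_v<D>,
                               bool> = true>
    void push(T&& value) {
        // Allocate outside the critical section to keep the lock short.
        auto node = std::make_unique<T>(std::move(value));
        std::lock_guard<std::mutex> lock(m);
        data.push(std::move(node));
    }

    // Assumption: an empty stack is reported with nullptr instead of throwing.
    std::unique_ptr<T> try_pop() {
        std::lock_guard<std::mutex> lock(m);
        if (data.empty()) {
            return nullptr;
        }
        auto top = std::move(data.top());
        data.pop();
        return top;
    }

    bool empty() const {
        std::lock_guard<std::mutex> lock(m);  // possible because m is mutable
        return data.empty();
    }
};

// Usage sketch: the last element pushed is the first one popped.
inline void stack_demo() {
    locked_stack<std::string> s;
    s.push(std::string("a"));
    s.push(std::string("b"));
    assert(*s.try_pop() == "b");
}
```

Because the elements are already held by `std::unique_ptr`, handing one out in `try_pop` only moves a pointer while the lock is held.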

## Task

* Start with the first variable and work out what must *happen-before* what for the given condition to hold.
* Do the same for all the other variables to obtain the restrictions on the sequential order.
* Do not **forget** the *sequenced-before* relation: element 1.1 must always happen before 1.2.

# Exercise

* Go through the different code samples in the repo to see the difference in syntax.
* Condition variables are prone to spurious and lost wakeups, so relying only on `notify_all` can lead to problems when they are used as waiting points. The wait should also check a predicate.
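A common way to guard against both problems is to pair the condition variable with a flag that is checked under the mutex. The following is a minimal sketch, not code from the repo; `ready_flag` and `run_handoff` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a one-shot "ready" signal.
struct ready_flag {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;  // the real condition; the notification is only a hint

    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            ready = true;
        }
        cv.notify_all();
    }

    void wait() {
        std::unique_lock<std::mutex> lock(m);
        // The predicate is rechecked after every wakeup, so a spurious wakeup
        // goes back to sleep. If signal() already ran, the predicate is true
        // at once and the thread never blocks, so the wakeup is not lost.
        cv.wait(lock, [this] { return ready; });
        assert(ready);
    }
};

int run_handoff() {
    ready_flag flag;
    int payload = 0;
    int seen = -1;

    payload = 41;
    flag.signal();  // deliberately notify before any thread is waiting

    std::thread consumer([&] {
        flag.wait();
        seen = payload + 1;
    });
    consumer.join();
    return seen;
}
```

In this sketch the notification is sent before the consumer exists, which is exactly the lost-wakeup situation. The consumer still proceeds because it waits on the flag and not on the notification itself.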

# Exercise

## Task

The first point is quite straightforward. The idea is to use the `fetch_add` method of the atomic variable to add each value to `sharedValue`. It does not matter when a value was added or which thread added it; the only thing that matters is that no thread ever sees a half-applied update.

* If we use separate `load` and `store` operations instead of `fetch_add`, we can see strange behavior. This is not a **data race** but a **race condition**: between the two operations, the other thread can also perform a `load`, so one of the additions is lost. It is not a **data race** because an operation on an atomic variable can never be observed half done.
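The difference between the two variants can be sketched as follows. This is an illustration under assumptions, not the exercise code: the function names and the thread and iteration counts are made up, and only `sharedValue` is taken from the task.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each thread adds 1 to the shared counter per_thread times.
int add_with_fetch_add(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<int> sharedValue{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                // One indivisible read-modify-write: no update can be lost.
                sharedValue.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return sharedValue.load();
}

// Same work, but the addition is split into a load and a store.
int add_with_load_store(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<int> sharedValue{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                // Race condition: another thread may store between these two
                // operations, and that update is then overwritten.
                int old = sharedValue.load();
                sharedValue.store(old + 1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return sharedValue.load();
}
```

The first function always returns `threads * per_thread`. The second one has no undefined behavior, but its result can be smaller and can change from run to run.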

## Task

### Inter-thread happens before

Between threads, evaluation A inter-thread happens-before evaluation B if any of the following is true:

* A synchronizes-with B
* A is dependency-ordered before B
* A synchronizes-with some evaluation X, and X is sequenced-before B
* A is sequenced-before some evaluation X, and X inter-thread happens-before B
* A inter-thread happens-before some evaluation X, and X inter-thread happens-before B

### Happens-before

Regardless of threads, evaluation A happens-before evaluation B if any of the following is true:

* A is sequenced-before B
* A inter-thread happens-before B

If one evaluation modifies a memory location, and the other reads or modifies the same memory location, and if at least one of the evaluations is not an atomic operation, the behavior of the program is undefined (the program has a data race) unless there exists a happens-before relationship between these two evaluations.
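As a small illustration of this rule, the sketch below passes a plain `int` between two threads. The names `pass_message`, `payload`, and `ready` are assumptions made for this example. The non-atomic accesses do not race only because the release store and the acquire load that reads it create a happens-before relationship.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int pass_message(int value) {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int seen = -1;

    std::thread producer([&] {
        payload = value;                               // sequenced-before the store
        ready.store(true, std::memory_order_release);  // publishes payload
    });

    std::thread consumer([&] {
        // Spin until the acquire load reads the value written by the release
        // store. Only then does the store synchronize-with this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // safe: the write to payload happens-before this read
    });

    producer.join();
    consumer.join();
    assert(seen == value);  // guaranteed by the release/acquire pair
    return seen;
}
```

If both operations on `ready` were relaxed, the write and the read of `payload` would have no happens-before relationship, and the program would contain a data race.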

## Task

During this week's exercise session, we discussed this issue with Fabian: why the `synchronizes-with` relationship that I show in the diagram is not enough to guarantee that thread 1 always reads **true**. I will try to summarize his explanation here, because in that meeting I realized that I had not understood how Load/Store operations interact with memory ordering.

- Store/Load operations do not guarantee that other threads *always read* the last value written to the variable. A `synchronizes-with` relationship is only established when the load actually reads the value written by the store, and because of caching the load may still see **false**. I used to think that if the Load and Store operations occurred in the order shown in my time diagram, the load would always read the last value in the variable. That is not the case: even if the "Sacred timeline" shows that order of events, Store/Load operations alone are not enough. That is why the author said: "the problem is that the load *might not see the store*".

- RMW operations have "superpowers". They are guaranteed to read the **last value** in the variable. RMW operations are more expensive, because they always work on the latest data that was written (with at least `acq_rel`, of course).

In my example execution of the code from Listing 3, thread 1 is not guaranteed to always see the **true** written by thread 0. The intended synchronizes-with is there, but the Load in thread 1 could also see the value from the constructor, which is why a synchronizes-with built only on Store/Load operations is not enough. All of this is in the author's explanation, but it is poorly explained.

### Solution

```cpp
void lock(int threadID)
{
    int me = threadID;      
    int he = 1 - me;        
    interested[me].store(true, std::memory_order_relaxed);
    turn.exchange(1 - me, std::memory_order_acq_rel);
    while (interested[he].load(std::memory_order_acquire) && turn.load(std::memory_order_relaxed) == he)
        continue; //spin
}
```

The main reason why this is enough to fix the mutex based on atomic variables is the following:

* This time the `turn` variable is updated with an RMW operation, which reads the last value in the variable and creates a *happens-before* relationship with the operation that wrote it. This makes all the previous operations of the other thread visible. In particular, the `load` of `interested[he]` in the loop receives the last value written.
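To try the solution out, the `lock` function can be wrapped in a small class and used to protect a plain counter. This is only a sketch: the class name `peterson_lock`, the `unlock` with a release store, the `yield` in the spin loop, and the helper `run_two_threads` are assumptions added for this example and are not part of the solution above.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class peterson_lock {
    std::atomic<bool> interested[2]{{false}, {false}};
    std::atomic<int> turn{0};

public:
    void lock(int threadID) {
        assert(threadID == 0 || threadID == 1);
        int me = threadID;
        int he = 1 - me;
        interested[me].store(true, std::memory_order_relaxed);
        // RMW: reads the latest value of turn and synchronizes with its writer.
        turn.exchange(he, std::memory_order_acq_rel);
        while (interested[he].load(std::memory_order_acquire) &&
               turn.load(std::memory_order_relaxed) == he) {
            std::this_thread::yield();  // spin
        }
    }

    // Assumed counterpart of lock(): publishes the critical section.
    void unlock(int threadID) {
        interested[threadID].store(false, std::memory_order_release);
    }
};

// Two threads increment a non-atomic counter under the lock.
int run_two_threads(int iterations) {
    peterson_lock mutex;
    int counter = 0;  // plain int, protected only by the lock
    auto work = [&](int id) {
        for (int i = 0; i < iterations; ++i) {
            mutex.lock(id);
            ++counter;
            mutex.unlock(id);
        }
    };
    std::thread a(work, 0);
    std::thread b(work, 1);
    a.join();
    b.join();
    return counter;
}
```

If mutual exclusion holds, no increment is lost and the function returns `2 * iterations`.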

# Exercise

The idea behind the fourth exercise is to use parallel programming to increase the performance of a program. The code itself is not important, only the idea behind how to make it faster. The code has two important parts, which contain the operations that we can parallelize.

* `all_pairs_shortest_paths`: This function has two nested for loops. The idea is to check whether any variables need to communicate information to other threads. In this case there are none. An iteration of the outer loop writes the local distances from its node into `all_distances`, the return variable of the function. Those values are not read by any other thread, so a simple `for_each` with a parallel execution policy is enough to parallelize the loop without race conditions or data races.
* `calculate_largest_smallest_path`: This function shares memory between the iterations of the for loops. Therefore it needs a lock-based or lock-free strategy to avoid data races. Three approaches were shown in the solution:
    * lock-based: This is the easiest version; it only locks the data when it needs to be changed. The solution creates a vector of mutexes so that each position has its own lock and the whole data set is not blocked unnecessarily.
    * lock-free: Here the solution showed two possible approaches: `std::atomic` and `std::atomic_ref`. Both work on the same principles, but `std::atomic` requires a vector in which every element is an atomic variable, which costs more resources than creating an atomic reference only when necessary. Therefore the overhead of `std::atomic_ref` is lower than that of creating the whole array of atomic items.
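The lock-based idea with one mutex per position can be sketched as follows. The function `slot_maxima` and the way the candidate values are generated are hypothetical stand-ins for the real computation in `calculate_largest_smallest_path`, which is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Every worker proposes candidate values for the slots of a shared vector,
// and each slot keeps the largest value it has seen.
std::vector<int> slot_maxima(std::size_t slots, int workers, int per_worker) {
    assert(slots > 0 && workers > 0 && per_worker >= 0);
    std::vector<int> result(slots, 0);
    std::vector<std::mutex> locks(slots);  // one mutex per slot

    auto work = [&](int id) {
        for (int i = 0; i < per_worker; ++i) {
            std::size_t slot = static_cast<std::size_t>(i) % slots;
            int candidate = id * per_worker + i;  // made-up workload
            // Lock only the slot being updated; other slots stay available.
            std::lock_guard<std::mutex> guard(locks[slot]);
            result[slot] = std::max(result[slot], candidate);
        }
    };

    std::vector<std::thread> pool;
    for (int id = 0; id < workers; ++id) {
        pool.emplace_back(work, id);
    }
    for (auto& t : pool) {
        t.join();
    }
    return result;
}
```

Threads that update different slots never wait for each other. The price is one mutex per element, which is part of why the lock-free variants are attractive.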

# Exercise

## Task

<img src="img/multiverse_example_2.jpg" alt="Multiverse example with FIFO" width="80%" style="margin: 0 auto;">

* The example in the image is sequentially consistent because there is a possible single-threaded execution that reproduces this behavior. Because of that, the two conditions are fulfilled.
* However, it is not **linearizable**: given the time intervals in which the operations execute, the correctness of the implementation cannot be maintained. The operation `enq(x)` has already finished before the operation `enq(y)` begins. Because of that, the result `deq(y)` should not be possible; the correct result is `x`.

<img src="img/ex5_task1.png" alt="Multiverse example with FIFO" width="80%" style="margin: 0 auto;">

* For this exercise the reasoning is the same. We can find a single-threaded execution that delivers the same result as this history, so the execution is *sequentially consistent*. To be linearizable, each operation must take effect between its invocation and its return and deliver a correct result. The following execution fulfills this condition:

    * C_r.write(2), B_r.write(1), A_r.read(1), B_r.read(1)

This confirms that the system is **linearizable**.

## Task

* I should always work through the code with two threads to find the critical points where the information could be wrong. In this case, after tracing an execution with two threads, I did not find any error, but only because I did not stop at the critical points. As mentioned in the solution:
    * The implementation is not linearizable. Suppose the queue is empty. Thread A calls enq(x) and thread B calls enq(y) simultaneously. Further suppose that thread A completes the loop at line 11 first, and a slot is reserved by thread A. However, for some reason thread A is stuck between line 13 and line 14. In the meantime, thread B finishes the call to enq(y) and starts another call to deq(). Since x has not been written to its slot by A, this deq() will throw at line 24. This execution history violates the sequential specification of FIFO: since y has been enqueued, the call to deq() should return y, not throw an exception.

# Exercise

* The implementation of exercise 6 is similar to the implementation of lock-free data structures in lecture slides 05. The idea is always to protect variables that cannot be atomic. Here the `data` array is not an atomic variable, so we need a strategy that prevents conflicting concurrent accesses to it. This is done with the following strategy:
    * Use the `writers` while loop to allow only one thread at a time into the array. Other threads have to wait for access. However, this also means that two threads that want to update different positions of the array cannot do so concurrently.
    * Use the `history` atomic variable to prevent a read and a write from happening at the same time and causing a data race. The atomic variable is used with `fetch_add` and a *happens-before* relationship, so that a read and a write from two different threads cannot overlap and produce an undefined value during the read.
    
* The last point is really straightforward: it uses the `while` loop and the `compare_exchange_weak` strategy seen in the lecture to choose the maximum value. The only difference is that the first version uses the RMW approach with `compare_exchange_weak` and the second one uses the read-and-conditional-store instruction.
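The `while` loop with `compare_exchange_weak` for keeping a maximum can be sketched like this. The helper names `atomic_max` and `max_across_threads` and the generated values are assumptions for illustration, not the exercise code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Raise target to candidate if candidate is larger.
void atomic_max(std::atomic<int>& target, int candidate) {
    int current = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak writes the value it found into
    // current, so the condition is rechecked against fresh data. The loop
    // ends once the stored value is already >= candidate or the swap succeeds.
    while (current < candidate &&
           !target.compare_exchange_weak(current, candidate,
                                         std::memory_order_relaxed)) {
    }
}

int max_across_threads(int workers, int per_worker) {
    assert(workers > 0 && per_worker > 0);
    std::atomic<int> maximum{-1};
    std::vector<std::thread> pool;
    for (int id = 0; id < workers; ++id) {
        pool.emplace_back([&maximum, id, per_worker] {
            for (int i = 0; i < per_worker; ++i) {
                atomic_max(maximum, id * per_worker + i);  // made-up values
            }
        });
    }
    for (auto& t : pool) {
        t.join();
    }
    return maximum.load();
}
```

The weak variant may fail spuriously, which is harmless here because the loop simply retries with the reloaded value.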