## Critical Section
A critical section is a region of code that must be executed by only one thread at a time, usually because it accesses a shared resource such as shared data, a network connection or a hardware device. While one thread is inside the critical section, all other threads are locked out. When that thread leaves, one of the waiting threads can enter. This is called the locking protocol.

## Mutex
A mutex is a mutual exclusion object used to implement the locking protocol. It has two states, 'locked' and 'unlocked'. If the mutex is unlocked, a thread can lock it and enter the critical section. If the mutex is locked, no other thread can enter until it is unlocked. A thread locks the mutex when it enters the critical section and unlocks it when it leaves. These rules ensure that only one thread can be in the critical section at a given time. Unlocking a mutex also publishes the changes made to the shared data, so the next thread that locks the mutex sees the new values.  
The C++ standard library provides the std::mutex class for this; we use objects of this class to synchronize threads. The mutex object must be visible to all the thread functions that need synchronization.
```
#include <iostream>
#include <mutex>
#include <string>
#include <thread>

std::mutex task_mutex;

void print(std::string str)
{
    for(int i = 0; i < 5; i++)
    {
        //Lock the mutex before the critical section
        task_mutex.lock();
        
        std::cout << str[0] << str[1] << str[2] << std::endl;
        
        //Unlock the mutex after the critical section
        task_mutex.unlock();
        
    }
}

int main()
{
    std::thread thr1(print, "abc");
    std::thread thr2(print, "def");
    std::thread thr3(print, "xyz");
    
    thr1.join();
    thr2.join();
    thr3.join();
}
```
With the mutex the output is not scrambled. lock() is a blocking call: the thread is blocked until it gets the mutex. There is also a try_lock() member function which returns immediately, returning true if it locked the mutex and false otherwise. Its main use case is a loop that keeps trying to get the mutex and does something else between attempts.
```
using namespace std::chrono_literals; //Needed for the 100ms literal

//Keep trying to get the lock
while(!task_mutex.try_lock())
{
    
    //Could not get the mutex, try later
    std::this_thread::sleep_for(100ms);
}

//Finally got the mutex, can execute the critical section
```

## Internally Synchronized Class
C++ standard library containers are not internally synchronized (thread safe). If a container is modified while other threads access it, the callers must synchronize externally by locking a mutex before calling its member functions. We can provide internal synchronization for our own types by holding a std::mutex as a data member and locking/unlocking it whenever the class's data is accessed. Here the class, not its caller, takes responsibility for preventing the data race.
```
class sync_vector
{
public:
    void push_back(const int& i)
    {
        m_mut.lock();
        m_vec.push_back(i);
        m_mut.unlock();
    }
    
private:
    std::mutex m_mut;
    std::vector<int> m_vec;
};
```

## Lock Guard
If an exception is thrown in a critical section, then the mutex will be left locked.
```
try
{
    task_mutex.lock();
    
    //Critical section throws an exception
    
    task_mutex.unlock(); //Never gets called
}
catch(std::exception &e)
{
}
```
That is why we normally do not call lock() and unlock() on the mutex directly. C++ provides wrapper classes for mutexes. These classes use the RAII idiom to manage a resource, in this case the lock on the std::mutex. We create the wrapper object on the stack; when it goes out of scope its destructor is called and the mutex is unlocked, even when an exception is thrown.  
The first one is std::lock_guard, a very basic wrapper with only a constructor and a destructor. std::lock_guard is a class template whose template parameter is the type of the mutex, because C++ has several mutex types.
```
try
{
    std::lock_guard<std::mutex> lck_guard(task_mutex);
    
    //Critical section that might throw an exception
    
    //When lck_guard goes out of scope, task_mutex is unlocked 
}
catch(std::exception &e)
{
}
```
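As a sketch of how the two previous sections fit together, the sync_vector class from earlier could hold its lock through std::lock_guard instead of calling lock()/unlock() by hand. The size() member and the fill_from_two_threads() helper are hypothetical additions, used only to exercise the class.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class sync_vector
{
public:
    void push_back(const int& i)
    {
        //The guard unlocks m_mut even if push_back throws (e.g. std::bad_alloc)
        std::lock_guard<std::mutex> lck_guard(m_mut);
        m_vec.push_back(i);
    }

    //Hypothetical extra member, not in the original class
    std::size_t size() const
    {
        std::lock_guard<std::mutex> lck_guard(m_mut);
        return m_vec.size();
    }

private:
    mutable std::mutex m_mut;    //mutable so that const members can lock it
    std::vector<int> m_vec;
};

//Hypothetical helper: two threads push 1000 elements each
std::size_t fill_from_two_threads()
{
    sync_vector vec;
    auto worker = [&vec] {
        for(int i = 0; i < 1000; i++)
        {
            vec.push_back(i);
        }
    };

    std::thread thr1(worker);
    std::thread thr2(worker);
    thr1.join();
    thr2.join();

    return vec.size();
}
```

Because every access goes through the guarded members, callers do not need a mutex of their own, and fill_from_two_threads() should always report 2000 elements.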

## Unique Lock
std::unique_lock has the same basic features as std::lock_guard, plus an unlock() member function. We can call unlock() explicitly instead of waiting for the destructor.
```
try
{
    std::unique_lock<std::mutex> uniq_lock(task_mutex);
    
    //Critical section that might throw an exception
    
    uniq_lock.unlock();
    
    //do something which is not critical
}
catch(std::exception &e)
{
}
```
The std::unique_lock constructor offers more options through its second argument. With std::try_to_lock the constructor calls the mutex's try_lock() and returns immediately; use the owns_lock() member to check whether the mutex was locked. With std::defer_lock the constructor does not lock the mutex, and we have to call lock() explicitly later. With std::adopt_lock the constructor assumes the calling thread has already locked the mutex, which avoids locking it a second time. std::unique_lock is a move-only type, like std::unique_ptr.  
For the basic case prefer lock_guard, as it is smaller and faster; use unique_lock when any of the above functionality is needed.
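The three constructor options can be compared side by side. This is a minimal sketch; demo_mutex and the three function names are hypothetical and exist only for illustration.

```cpp
#include <mutex>

std::mutex demo_mutex;   //Hypothetical mutex used only in this sketch

//std::try_to_lock: the constructor does not block
bool try_enter()
{
    std::unique_lock<std::mutex> lck(demo_mutex, std::try_to_lock);
    if(!lck.owns_lock())
    {
        return false;    //Mutex not acquired, the caller can do something else
    }
    //Critical section would go here
    return true;         //lck unlocks the mutex in its destructor
}

//std::defer_lock: the constructor leaves the mutex unlocked
bool deferred_enter()
{
    std::unique_lock<std::mutex> lck(demo_mutex, std::defer_lock);
    bool owned_before = lck.owns_lock();   //false, nothing locked yet
    lck.lock();                            //Lock explicitly
    bool owned_after = lck.owns_lock();    //true
    return !owned_before && owned_after;
}

//std::adopt_lock: the mutex is already locked by this thread
bool adopt_enter()
{
    demo_mutex.lock();
    std::unique_lock<std::mutex> lck(demo_mutex, std::adopt_lock);
    return lck.owns_lock();                //true, and lck will unlock it
}
```

In all three functions the destructor of lck releases the mutex if it is owned at that point, so no explicit unlock() is needed.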

## Timed Mutex
std::timed_mutex is similar to std::mutex, but with two extra member functions. try_lock_for() keeps trying to lock the mutex for a specified duration. try_lock_until() keeps trying to lock the mutex until a specified time point. Both return true if they get the mutex and false otherwise.
```
std::timed_mutex the_mutex;

void task1()
{
    the_mutex.lock();
    std::this_thread::sleep_for(5s);
    the_mutex.unlock();
}

void task2()
{
    std::this_thread::sleep_for(500ms);
    
    //Try for 1 second to lock the mutex
    while(!the_mutex.try_lock_for(1s))
    {
        //Try again on the next iteration
    }
    
    //The mutex is locked now, execute the critical section
    
    the_mutex.unlock();
}

int main()
{
    std::thread thr1(task1);
    std::thread thr2(task2);
    
    thr1.join();
    thr2.join();
}
```
As with std::mutex, any of the wrapper classes can be used with std::timed_mutex; this is why wrappers like std::lock_guard and std::unique_lock take the mutex type as a template parameter. std::unique_lock also has try_lock_for() and try_lock_until() member functions, but calling them when the template parameter is std::mutex gives a compilation error.
```
std::timed_mutex the_mutex;

void task1()
{
    //Use lock_guard, no need to explicitly call unlock
    std::lock_guard<std::timed_mutex> lck_guard(the_mutex);
    std::this_thread::sleep_for(5s);
}

void task2()
{
    std::this_thread::sleep_for(500ms);
    
    //std::defer_lock, will not lock the mutex in the constructor
    std::unique_lock<std::timed_mutex> uniq_lck(the_mutex, std::defer_lock);
    
    //Try for 1 second to lock the mutex
    while(!uniq_lck.try_lock_for(1s))
    {
        //Try again on the next iteration
    }
    
    //The mutex is locked now, execute the critical section
}

int main()
{
    std::thread thr1(task1);
    std::thread thr2(task2);
    
    thr1.join();
    thr2.join();
}
```
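The examples above use try_lock_for(). A deadline-based variant with try_lock_until() might look like the sketch below; the names door, enter_before_deadline and enter_while_busy are hypothetical.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

std::timed_mutex door;   //Hypothetical mutex used only in this sketch

//Tries to lock the mutex until a fixed point in time is reached
bool enter_before_deadline(std::chrono::milliseconds timeout)
{
    auto deadline = std::chrono::steady_clock::now() + timeout;
    if(!door.try_lock_until(deadline))
    {
        return false;    //Deadline passed without getting the mutex
    }
    //Critical section would go here
    door.unlock();
    return true;
}

//Holds the mutex on the calling thread while a second thread tries to enter
bool enter_while_busy()
{
    door.lock();
    bool entered = true;
    std::thread thr([&entered] {
        entered = enter_before_deadline(std::chrono::milliseconds(50));
    });
    thr.join();
    door.unlock();
    return entered;      //false, the mutex was held for the whole attempt
}
```

A deadline is convenient when several blocking steps must share one overall time budget, since the same time point can be passed to each of them.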
try_lock_for() and try_lock_until() may return later than requested due to thread scheduling delays.  
The same thread must not call lock() on a std::mutex twice without an unlock() in between; the behaviour is undefined. C++ also has std::recursive_mutex, whose lock() can be called repeatedly by the same thread without calling unlock(). Each lock() call must eventually be matched by an unlock() call, otherwise other threads can never enter the critical section. Needing std::recursive_mutex is normally a sign of bad design.
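For completeness, here is a small sketch of what std::recursive_mutex permits. The names rec_mutex, depth_reached and descend are hypothetical.

```cpp
#include <mutex>

std::recursive_mutex rec_mutex;   //Hypothetical mutex used only in this sketch
int depth_reached = 0;

//Each level of recursion locks rec_mutex again on the same thread
void descend(int level)
{
    std::lock_guard<std::recursive_mutex> lck_guard(rec_mutex);
    depth_reached = level;
    if(level < 3)
    {
        descend(level + 1);   //Would be undefined behaviour with std::mutex
    }
}   //Each lock_guard destructor performs the matching unlock()
```

Calling descend(1) locks the mutex three times and unlocks it three times. The same effect can usually be had with a plain std::mutex by splitting the function into a locking wrapper and a non-locking helper, which is why the recursive variant is rarely needed.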

## Multiple Reader, Single Writer
In cases like a financial data feed for infrequently traded stocks, or audio/video buffers in multimedia players, many clients access the data (read) but updates (writes) are only occasional. Most of the time only readers want access, and concurrent reads need no locking. Locking is needed only in the less likely case where at least one writer wants access. With std::mutex all threads are synchronized even when it is not required (the all-readers case), and this loss of concurrency reduces performance.
```
std::mutex mut;
//Shared variable
int x = 0;

void write()
{
    std::lock_guard<std::mutex> lck_guard(mut);
    ++x;
}

void read()
{
    std::lock_guard<std::mutex> lck_guard(mut);
    std::cout << x << std::endl;
    std::this_thread::sleep_for(100ms);
}

int main()
{
    std::vector<std::thread> threads;
    for(int i = 0; i < 20; i++)
    {
        //Thread is moved into the vector
        threads.push_back(std::thread(read));
    }
    
    threads.push_back(std::thread(write));
    threads.push_back(std::thread(write));
    
    for(int i = 0; i < 20; i++)
    {
        //Thread is moved into the vector
        threads.push_back(std::thread(read));
    }
    
    for(auto& thr : threads)
    {
        thr.join();
    }
}
```
To solve this problem we need selective locking, where threads are blocked only while one of them is writing. Such a lock is called a read-write lock.

## Shared Mutex
C++17 implements the read-write lock as std::shared_mutex. It can be locked in two ways. With an exclusive lock, no other thread may acquire any lock (like a lock on the other mutex types). With a shared lock, other threads may also acquire a shared lock and execute their critical sections concurrently. To get an exclusive lock use std::lock_guard\<std::shared_mutex\> or std::unique_lock\<std::shared_mutex\>. To get a shared lock use std::shared_lock\<std::shared_mutex\>. A thread gets an exclusive lock only if no other thread holds a shared or exclusive lock; otherwise it has to wait for all other threads to unlock. A thread gets a shared lock if no other thread holds an exclusive lock.
```
std::shared_mutex shmut;
//Shared variable
int x = 0;

void write()
{
    std::lock_guard<std::shared_mutex> lck_guard(shmut);
    ++x;
}

void read()
{
    std::shared_lock<std::shared_mutex> shared_lck(shmut);
    std::cout << x << std::endl;
    std::this_thread::sleep_for(100ms);
}

int main()
{
    std::vector<std::thread> threads;
    for(int i = 0; i < 20; i++)
    {
        //Thread is moved into the vector
        threads.push_back(std::thread(read));
    }
    
    threads.push_back(std::thread(write));
    threads.push_back(std::thread(write));
    
    for(int i = 0; i < 20; i++)
    {
        //Thread is moved into the vector
        threads.push_back(std::thread(read));
    }
    
    for(auto& thr : threads)
    {
        thr.join();
    }
}
```
std::shared_mutex also has member functions if you want to work with it directly: lock(), try_lock() and unlock() for exclusive locking, and lock_shared(), try_lock_shared() and unlock_shared() for shared locking. std::shared_mutex uses more memory and is slower than std::mutex, but it suits situations where there are many reader threads and a small number of writer threads, where the extra concurrency gives a performance boost.
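A minimal sketch of the direct member functions follows, assuming a C++17 compiler. The names table_mutex, table_value, read_value, write_value and run_demo are hypothetical. Calling the members directly is not exception safe, so the wrapper classes remain the better choice in real code.

```cpp
#include <shared_mutex>
#include <thread>
#include <vector>

std::shared_mutex table_mutex;   //Hypothetical mutex used only in this sketch
int table_value = 0;             //Shared variable

int read_value()
{
    table_mutex.lock_shared();     //Other readers may hold the lock too
    int copy = table_value;
    table_mutex.unlock_shared();
    return copy;
}

void write_value(int v)
{
    table_mutex.lock();            //Exclusive, waits for all readers and writers
    table_value = v;
    table_mutex.unlock();
}

//Runs four reader threads and one writer thread, then reads the final value
int run_demo()
{
    std::vector<std::thread> threads;
    for(int i = 0; i < 4; i++)
    {
        threads.emplace_back([] {
            for(int n = 0; n < 100; n++)
            {
                read_value();
            }
        });
    }
    threads.emplace_back([] { write_value(42); });

    for(auto& thr : threads)
    {
        thr.join();
    }
    return read_value();
}
```

Note that a thread must not call lock_shared() while it already holds the same std::shared_mutex in either mode; that is undefined behaviour, just like locking a std::mutex twice.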

## Shared Static Data Initialization
Shared data can be a global variable, a static variable at namespace scope, a static class member or a static local variable. The first three are initialized when the program starts, before main() is called, so there is no data race as only one thread is running at that point. A static local variable is initialized when the function is first called during program execution, so two or more threads may try to run the constructor concurrently, which can result in a data race. Before C++11 the only way to solve this was to use a mutex, but that meant locking the mutex every time the program passed through the declaration, not just during initialization. C++11 resolves this: initialization of a static local variable is internally synchronized. This covers only the initialization; subsequent modifications follow the usual rules for shared data and have to be protected. Below is a C++ singleton implementation before and after C++11.
```
class singleton
{
public:
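    //Before C++ 11
    //Named differently from the version below, functions cannot be overloaded on return type alone
    static singleton* get_singleton_locked()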
    {
        mut.lock();
        if(single == nullptr)
        {
            single = new singleton();
        }
        mut.unlock();
        return single;
    }
    
    //In C++ 11 we can take advantage of thread-safe initialization of static local variables
    //We can also move this function outside the singleton class too
    static singleton& get_singleton()
    {
        static singleton single;
        return single;
    }
    
    //Delete copy and move operators here
    
    //Class functionality here

private:
    //Private constructor here to make the object singleton
    
    static singleton* single;
    static std::mutex mut;
};
```

## Thread-local Data
Use the thread_local keyword to declare them; each thread then has its own copy of the object. These variables can be global variables, static data members of a class or local variables in a function. Global and static-member thread-local variables are constructed at or before their first use in each thread, local ones when control first reaches their declaration in that thread. All are destroyed when the thread completes execution.
```
#include <iostream>
#include <random>
#include <thread>

//Thread local random number engine
//One variable for each thread
thread_local std::mt19937 mt;

void func()
{
    std::uniform_real_distribution<double> dist(0, 1);
    
    for(int i = 0; i < 10; i++)
    {
        std::cout << dist(mt) << ",";
    }
}

int main()
{
    std::thread thr1(func);
    thr1.join();
    
    std::thread thr2(func);
    thr2.join();
}

```
Both threads generate the same sequence of random numbers: each thread has its own separate engine object, and both engines start from the same default seed.
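The per-thread copies are easier to observe with a plain counter. In this sketch call_count, count_calls and count_on_new_thread are hypothetical names.

```cpp
#include <thread>

thread_local int call_count = 0;   //One independent copy per thread

//Increments the calling thread's own counter n times and returns it
int count_calls(int n)
{
    for(int i = 0; i < n; i++)
    {
        ++call_count;
    }
    return call_count;
}

//Runs count_calls on a fresh thread and returns what that thread saw
int count_on_new_thread(int n)
{
    int result = 0;
    std::thread thr([&result, n] { result = count_calls(n); });
    thr.join();
    return result;
}
```

Every new thread starts from zero, and increments made on one thread are never visible on another, so no mutex is needed for call_count.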

## Lazy initialization
Lazy initialization is another concept that needs care in multi-threaded code. With lazy initialization a variable is initialized only when it is first used; it is used for objects that are expensive to construct.
```
//Lazy initialization(single-threaded)
class test
{
public:
    void func() { /*...*/ }
};

test *ptest = nullptr;    //Variable to be lazily initialized

void process()
{
    if(!ptest)                    //First time variable has been used
    {
        ptest = new test();       //Initialize it
    }
    ptest->func();                //Use it
}
```
The initialization of ptest is not thread safe here. We could protect it with a mutex, but that is inefficient because the mutex is locked on every call, not just the first time when it is needed. One option is to use a shared lock. Another is to check the variable ptest twice, which is known as double-checked locking.
```
//Lazy initialization
class test
{
public:
    void func() { /*...*/ }
};

std::mutex mut;
test *ptest = nullptr;            //Variable to be lazily initialized

void process()
{
    if(!ptest)                    //First time variable has been used
    {
        std::lock_guard lck_guard(mut);
        if(!ptest)
        {
            ptest = new test();       //Initialize it
        }
    }
    ptest->func();                //Use it
}
```
The second check is needed because, while one thread is initializing inside the lock, another thread may already have passed the first check and be waiting at the lock statement. This implementation can still have a problem in C++, because new is a three-step operation (allocate the required memory, construct the object in it, assign the pointer to ptest), the first check reads ptest without any synchronization, and the pointer may become visible to other threads before the object is constructed. Say one thread is initializing ptest and the pointer has been stored but the object is not yet constructed; if another thread arrives at this point its ptest check passes, so it calls func() on an uninitialized object. One way to solve this is std::call_once, which guarantees that a given function is called only once. Only one thread executes the function, and any other thread that reaches the same call waits until it has finished. std::call_once is thread safe.
```
//Lazy initialization
class test
{
public:
    void func() { /*...*/ }
};

test *ptest = nullptr;            //Variable to be lazily initialized
std::once_flag ptest_flag;

void process()
{
    //Pass a callable object which performs the initialization
    std::call_once(ptest_flag, [](){
        ptest = new test();
        });
    ptest->func();                //Use it
}
```
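Since C++11 makes the initialization of static local variables thread safe, the same lazy initialization can also be sketched without a flag or a pointer. The names heavy_object, constructions and get_object are hypothetical; the counter exists only to show that the constructor runs once.

```cpp
#include <atomic>

std::atomic<int> constructions{0};   //Hypothetical counter, only to observe the behaviour

class heavy_object
{
public:
    heavy_object() { ++constructions; }
    void func() { /*...*/ }
};

heavy_object& get_object()
{
    //Constructed the first time control passes through this line
    //If several threads arrive together, one constructs and the others wait
    static heavy_object instance;
    return instance;
}

void process()
{
    get_object().func();             //Use it
}
```

Only the construction is protected this way. If func() modifies the object, those modifications still need their own synchronization.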

## Deadlock
A thread is deadlocked when it cannot run because it is waiting for something that will never happen. Normally this occurs when two or more threads are waiting on each other.
```
std::mutex mut1;
std::mutex mut2;

void func_a()
{
    std::lock_guard lck_guard1(mut1);
    std::this_thread::sleep_for(50ms);
    std::lock_guard lck_guard2(mut2);
    std::this_thread::sleep_for(50ms);
}

void func_b()
{
    std::lock_guard lck_guard1(mut2);
    std::this_thread::sleep_for(50ms);
    std::lock_guard lck_guard2(mut1);
    std::this_thread::sleep_for(50ms);
}

int main()
{
    std::thread thr_a(func_a);
    std::thread thr_b(func_b);
    
    thr_a.join();
    thr_b.join();
}
```
One way to avoid deadlock is for both threads to acquire the locks in the same order. This might not be feasible in large programs, so we need better approaches.  
One such solution is to lock multiple mutexes in a single operation. C++17 has std::scoped_lock for this; it is similar to lock_guard except that it can lock more than one mutex at the same time.
```
std::mutex mut1;
std::mutex mut2;

void func_a()
{
    std::scoped_lock scoped_lck(mut1, mut2);
    std::this_thread::sleep_for(50ms);
}

void func_b()
{
    //Argument order does not matter, scoped_lock avoids the deadlock
    std::scoped_lock scoped_lck(mut2, mut1);
    std::this_thread::sleep_for(50ms);
}

int main()
{
    std::thread thr_a(func_a);
    std::thread thr_b(func_b);
    
    thr_a.join();
    thr_b.join();
}
```
The free functions std::lock and std::try_lock can also lock multiple mutexes in one call.  
Sometimes it is not feasible to acquire multiple locks simultaneously. A common technique then is to impose an ordering, for example a thread cannot lock a mutex unless it has already locked a mutex with a lower number. This is known as a hierarchical mutex.  
Some ways to avoid deadlock: avoid waiting for a thread while holding a lock, avoid nested locks, and if you need multiple locks acquire them in a single operation.  
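As a sketch of the std::lock route, the two usual idioms are shown below: deferred unique_locks passed to std::lock, and plain mutexes locked by std::lock and then adopted by lock_guards. The account-style names (mut_a, mut_b, balance_a, balance_b and the functions) are hypothetical.

```cpp
#include <mutex>
#include <thread>

std::mutex mut_a;
std::mutex mut_b;
int balance_a = 100;   //Protected by mut_a
int balance_b = 0;     //Protected by mut_b

void transfer_a_to_b(int amount)
{
    //Create the wrappers without locking, then lock both in one operation
    std::unique_lock<std::mutex> lck_a(mut_a, std::defer_lock);
    std::unique_lock<std::mutex> lck_b(mut_b, std::defer_lock);
    std::lock(lck_a, lck_b);
    balance_a -= amount;
    balance_b += amount;
}

void transfer_b_to_a(int amount)
{
    //Lock both mutexes first (opposite order on purpose), then adopt them
    std::lock(mut_b, mut_a);
    std::lock_guard<std::mutex> lck_b(mut_b, std::adopt_lock);
    std::lock_guard<std::mutex> lck_a(mut_a, std::adopt_lock);
    balance_b -= amount;
    balance_a += amount;
}

//Runs both transfers concurrently and returns the combined balance
int run_transfers()
{
    std::thread thr1([] { for(int i = 0; i < 1000; i++) transfer_a_to_b(1); });
    std::thread thr2([] { for(int i = 0; i < 1000; i++) transfer_b_to_a(1); });
    thr1.join();
    thr2.join();
    return balance_a + balance_b;
}
```

The two functions name the mutexes in opposite orders, which would risk a deadlock with nested lock_guards, but std::lock acquires them with a deadlock avoidance algorithm.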

## Livelock
In a livelock the program also cannot make progress, but the threads are still active: they are stuck in a loop, running without doing anything useful.
```
std::mutex mut1;
std::mutex mut2;

void func_a()
{
    std::this_thread::sleep_for(10ms);
    bool locked = false;
    
    while(!locked)
    {
        std::lock_guard lck_guard1(mut1);
        std::this_thread::sleep_for(2s);
        locked = mut2.try_lock();
    }
    
    if(locked)
    {
        std::cout << "Thread A has locked both mutexes" << std::endl;
    }
}

void func_b()
{
    bool locked = false;
    
    while(!locked)
    {
        std::lock_guard lck_guard2(mut2);
        std::this_thread::sleep_for(2s);
        locked = mut1.try_lock();
    }
    
    if(locked)
    {
        std::cout << "Thread B has locked both mutexes" << std::endl;
    }
}
```
To avoid livelock, do the deadlock avoidance in the right way as described in the deadlock section, using std::scoped_lock or std::lock.  
Threads can also have priorities, which C++ does not support directly. We have to use the native implementation through std::thread's native_handle() member function. A high priority thread runs more often, while a low priority thread is interrupted more often.  
Resource starvation happens when a thread cannot get the resources it needs to run (e.g. deadlock/livelock), when a lack of system resources prevents a thread from starting (e.g. system memory exhausted), or when low priority threads are starved of processor time.

## Conclusion
Always hold the lock for the shortest possible time, otherwise performance suffers. The recommendation for reading shared data is to lock, make a copy of the shared data, unlock and then process the copy. The recommendation for writing shared data is to lock, make a copy of the shared data, unlock, process the copy, then lock again, update the shared data from the copy and unlock. When making data structures thread safe, neither lock more elements than necessary nor make the locking too fine-grained; we have to find a middle path. For example, if you lock an entire linked list, other threads are blocked even when accessing unrelated elements. On the other hand, if the locking is too fine-grained, like locking individual elements, operations such as insert and delete might have a data race, as they affect neighbouring elements. Locking and unlocking are relatively slow operations; depending on the platform and workload, synchronization with other primitives such as semaphores can be faster.
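The copy-then-process recommendation can be sketched as follows. The names samples, samples_mutex, sum_samples and double_samples are hypothetical, and the write path assumes that no other writer changes the data between the two locked regions; if that can happen, its update would be overwritten.

```cpp
#include <mutex>
#include <numeric>
#include <utility>
#include <vector>

std::mutex samples_mutex;
std::vector<int> samples{1, 2, 3, 4};   //Shared data

//Read: lock, copy, unlock, then process the copy
int sum_samples()
{
    std::vector<int> local_copy;
    {
        std::lock_guard<std::mutex> lck_guard(samples_mutex);
        local_copy = samples;
    }   //Mutex released here, before the processing starts
    return std::accumulate(local_copy.begin(), local_copy.end(), 0);
}

//Write: lock, copy, unlock, process, lock again, update, unlock
void double_samples()
{
    std::vector<int> local_copy;
    {
        std::lock_guard<std::mutex> lck_guard(samples_mutex);
        local_copy = samples;
    }

    for(int& value : local_copy)
    {
        value *= 2;                       //Processing done without the lock
    }

    {
        //Assumes no other writer modified samples in the meantime
        std::lock_guard<std::mutex> lck_guard(samples_mutex);
        samples = std::move(local_copy);
    }
}
```

The extra braces limit the lifetime of each lock_guard, so the mutex is held only for the copy and for the final update.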