### Synchronization and Thread Safety

Uncontrolled reading and writing of shared variables leads to unpredictable outcomes.  Reading and writing without synchronization leads to __race conditions__ in which computations produce different results depending on the order in which the operating system schedules the threads.

<img src="https://i.stack.imgur.com/m7HYo.png" width=512 title="Race Condition" />

The problem arises because `x = x + 1` is actually multiple operations.  Each thread:

```
reads value of x from memory
updates local copy of x
writes local copy to memory
```

Race conditions lead to bugs when one of the outcomes is undesirable.  It is the job of the programmer to __explicitly order__ operations so that bugs do not arise.
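The three steps listed above can be replayed by hand to see how an update gets lost.  The following is a minimal sketch, not part of the original notebook: the class `LostUpdateReplay` and its method `replay` are hypothetical names, and the two "threads" are simulated inside a single thread so that the unlucky interleaving happens every time.

```java
// Hypothetical demo: replays one bad interleaving of x = x + 1
// for two threads, step by step, in a single thread.
class LostUpdateReplay
{
  static int x = 0;   // stands in for the shared variable

  static int replay ()
  {
    x = 0;
    int localA = x;        // thread A reads x from memory (0)
    int localB = x;        // thread B reads x from memory (0), before A writes
    localA = localA + 1;   // A updates its local copy
    localB = localB + 1;   // B updates its local copy
    x = localA;            // A writes 1 to memory
    x = localB;            // B writes 1 to memory, overwriting A's update
    return x;
  }

  public static void main ( String[] args )
  {
    // Two increments ran, but only one survived.
    System.out.println("x after two increments = " + replay());
  }
}
```

With real threads the scheduler picks the interleaving, so the same code sometimes yields 2 and sometimes 1.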

__Thread safety__ is the idea that functions can be called from multiple threads concurrently and will produce correct results.  When you synchronize your code, it should be thread safe.

We will look at several constructs for synchronization in Java:
  * `synchronized` blocks
  * the `volatile` keyword
  * `Atomic` variables 

The most frequent and useful operation is to place a `synchronized` block around racing operations.  This synchronization creates a __critical section__ of code.

In [1]:
class SynchronizedWorks implements Runnable
{
  // Create variables for testing.
  static int sharedvar = 0;
  static int sharedsynchvar = 0;

  public void run ()
  { 
    for ( int i=0; i<10000000; i++ )
    {
      sharedvar++;
      synchronized(SynchronizedWorks.class){sharedsynchvar++;}
    }
  }
}

int numthreads = 4;
Thread[] threads = new Thread[numthreads];

// create and start thread objects
for ( int i=0; i<numthreads; i++ )
{
    threads[i] = new Thread ( new SynchronizedWorks() );
    threads[i].start();
}

// Await the completion of all threads
for ( int i=0; i<numthreads; i++ )
{
    threads[i].join();
}

System.out.println("Shared variable = " + SynchronizedWorks.sharedvar);
System.out.println("Shared synchronized variable = " + SynchronizedWorks.sharedsynchvar);

Shared variable = 39686495
Shared synchronized variable = 40000000


### synchronized

* A Java `synchronized` block:
  * Has only one thread executing the block at a time (per lock object).
  * Is reconciled with memory at the start and end of the block.
  * This notion of block is the same as in OpenMP: single entry point, single exit point.
      
These guarantees ensure that all of the operations of a thread are completed and all changes are written to a shared coherent memory before any other thread executes the block.  This is good enough to make the operation __atomic__.

_Def'n_ __atomic__: a sequence of operations is executed by a processor as an indivisible unit that cannot be interrupted.

This is a lame and controversial definition (see https://en.wikipedia.org/wiki/Linearizability), but is adequate for us.  The notion is that all the operations happen as an "atom" that cannot be divided. 

While we're criticizing definitions, _synchronized_ is a terrible word.  It means _v. To make two or more events happen at exactly the same time or at the same rate._  This is not the way it's used in CS.
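To make the "indivisible unit" idea concrete, here is a minimal sketch in which two variables must change together.  The class `PairedCounters` and its members are hypothetical names chosen for illustration.  One thread moves units from `from` to `to`, while another checks that the total never changes; because both the update and the check synchronize on the same class object, the checker should never observe the half-finished state.

```java
// Hypothetical demo: a two-step update made atomic with synchronized.
class PairedCounters
{
  static int from = 1000;
  static int to = 0;
  static boolean sawBadTotal = false;

  // Both statements execute as one critical section.
  static void move ()
  {
    synchronized (PairedCounters.class)
    {
      from--;
      to++;
    }
  }

  // The check takes the same lock, so it cannot run between from-- and to++.
  static void check ()
  {
    synchronized (PairedCounters.class)
    {
      if ( from + to != 1000 ) sawBadTotal = true;
    }
  }

  static boolean run ()
  {
    from = 1000; to = 0; sawBadTotal = false;
    Thread mover   = new Thread( () -> { for ( int i=0; i<1000; i++ ) move(); } );
    Thread checker = new Thread( () -> { for ( int i=0; i<1000; i++ ) check(); } );
    mover.start();
    checker.start();
    try
    {
      mover.join();
      checker.join();
    }
    catch ( InterruptedException e )
    {
      throw new IllegalStateException(e);
    }
    // True when the invariant held throughout and all units were moved.
    return !sawBadTotal && from == 0 && to == 1000;
  }
}
```

If `check` did not take the lock, it could read `from` after the decrement and `to` before the increment and report a total of 999.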


### volatile

Java implements the declaration specifier `volatile`, which it inherited from C/C++.  Every access to a variable declared `volatile` will
* read the variable from memory
* write the variable to memory
* ensure that each individual read or write is atomic

This seems like it might be good enough... but alas.  The problem is that even though each individual read and write is atomic, a combination of them is not: `sharedvolvar++` is a read followed by a write, and another thread can slip in between the two.

I point this out because many programmers have read about `volatile` and thought it would get the job done.  `volatile` is an important building block for concurrency control, but not useful to application programmers directly.

In [2]:
class VolatileWorks implements Runnable
{
  static int sharedvar = 0;
  static volatile int sharedvolvar = 0;

  public void run ()
  {
    for ( int i=0; i<10000000; i++ )
    {
      sharedvar++;
      sharedvolvar++;
    }
  }
}

int numthreads = 4;
Thread[] threads = new Thread[numthreads];

// create and start thread objects
for ( int i=0; i<numthreads; i++ )
{
    threads[i] = new Thread ( new VolatileWorks() );
    threads[i].start();
}

// Await the completion of all threads
for ( int i=0; i<numthreads; i++ )
{
    threads[i].join();
}

System.out.println("Shared variable = " + VolatileWorks.sharedvar);
System.out.println("Shared volatile variable = " + VolatileWorks.sharedvolvar);

Shared variable = 12470527
Shared volatile variable = 12859733
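The output above shows updates being lost even with `volatile`.  What `volatile` does give is visibility: a write by one thread is seen by later reads in other threads.  The sketch below illustrates that building-block role with a flag that is written once and never incremented.  The class `VolatileFlag` and the five-second timeout are assumptions made for this illustration.

```java
// Hypothetical demo: volatile used only for visibility of a single write.
class VolatileFlag
{
  static volatile boolean done = false;

  static boolean run ()
  {
    done = false;
    // The waiter spins until it observes the write to done.
    Thread waiter = new Thread( () -> { while ( !done ) { /* spin */ } } );
    waiter.start();

    done = true;   // a single write; no read-modify-write involved

    try
    {
      waiter.join(5000);   // wait up to 5 seconds for the waiter to exit
    }
    catch ( InterruptedException e )
    {
      throw new IllegalStateException(e);
    }
    return !waiter.isAlive();   // true when the waiter saw the flag
  }
}
```

This works because the new value does not depend on the old one.  As soon as an update reads the variable in order to compute what to write, as `++` does, `volatile` alone is no longer enough.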


### java.util.concurrent.atomic

Java also provides "atomic" classes that wrap several basic types (for example `AtomicInteger`, `AtomicLong`, and `AtomicBoolean`).  They make basic operations on these variables atomic through member functions.  They do so with a "lock-free, thread safe encapsulation of fundamental types."  All operations are of the read/modify/write type that we will discuss in the concurrency lecture.  That doesn't mean much now.  But they guarantee atomicity.

In [2]:
import java.util.concurrent.atomic.AtomicInteger;

class AtomicWorks implements Runnable
{
  static int sharedvar = 0;
  static AtomicInteger sharedatomint = new AtomicInteger();

  public void run ()
  {
    for ( int i=0; i<10000000; i++ )
    {
      sharedvar++;
      sharedatomint.incrementAndGet();
    }
  }
}

int numthreads = 4;
Thread[] threads = new Thread[numthreads];

// create and start thread objects
for ( int i=0; i<numthreads; i++ )
{
    threads[i] = new Thread ( new AtomicWorks() );
    threads[i].start();
}

// Await the completion of all threads
for ( int i=0; i<numthreads; i++ )
{
    threads[i].join();
}

System.out.println("Shared variable = " + AtomicWorks.sharedvar);
System.out.println("Shared atomic variable = " + AtomicWorks.sharedatomint);

Shared variable = 15572713
Shared atomic variable = 40000000
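`incrementAndGet` is one of several read/modify/write methods on `AtomicInteger`.  When the update you need is not built in, a common pattern is a retry loop around `compareAndSet`, which writes the new value only if the variable still holds the value that was read.  The following is a minimal sketch of that pattern; the class `AtomicMax` and the method `propose` are hypothetical names, and tracking a running maximum is just an example update.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: a lock-free "keep the largest value seen" update.
class AtomicMax
{
  static AtomicInteger max = new AtomicInteger(0);

  static void propose ( int candidate )
  {
    while ( true )
    {
      int seen = max.get();                 // read
      if ( candidate <= seen ) return;      // nothing to update
      // write only if nobody changed max since we read it; otherwise retry
      if ( max.compareAndSet(seen, candidate) ) return;
    }
  }

  static int run ()
  {
    max.set(0);
    int numthreads = 4;
    Thread[] threads = new Thread[numthreads];
    for ( int t=0; t<numthreads; t++ )
    {
      final int id = t;
      // thread id proposes the values id*1000 ... id*1000 + 999
      threads[t] = new Thread( () -> {
        for ( int i=0; i<1000; i++ ) propose(id*1000 + i);
      } );
      threads[t].start();
    }
    try
    {
      for ( Thread th : threads ) th.join();
    }
    catch ( InterruptedException e )
    {
      throw new IllegalStateException(e);
    }
    return max.get();   // largest proposed value: 3*1000 + 999 = 3999
  }
}
```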


### On Performance

The relative performance of these constructs depends on many things.  Here are some guidelines.
* Atomics are best for single operations.
  * they are implemented with single-instruction hardware support
  * they have to perform reads/writes to memory on every operation
* Synchronized blocks are more flexible and better for multiple operations.
  * they are implemented with locking
  * compiler can cache values during block execution
  * memory is written only on block exit

Typical practice is to use synchronized blocks and not worry too much.  But remember, __all code in a synchronized block is running serially__, regardless of how many threads/cores/etc. your system has.  Minimize code in synchronized blocks.
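One way to keep critical sections small is to let each thread accumulate into a local variable and enter the synchronized block only once, to merge its result.  The sketch below assumes a workload (summing the integers 1 through 1,000,000 in each thread) chosen purely for illustration; `LocalThenMerge` and `runAll` are hypothetical names.

```java
// Hypothetical demo: do the work outside the lock, merge inside it.
class LocalThenMerge implements Runnable
{
  static long total = 0;

  public void run ()
  {
    // Runs in parallel: local is private to this thread.
    long local = 0;
    for ( int i=1; i<=1000000; i++ )
    {
      local += i;
    }
    // Runs serially: one short critical section per thread.
    synchronized (LocalThenMerge.class) { total += local; }
  }

  static long runAll ()
  {
    total = 0;
    int numthreads = 4;
    Thread[] threads = new Thread[numthreads];
    for ( int i=0; i<numthreads; i++ )
    {
      threads[i] = new Thread ( new LocalThenMerge() );
      threads[i].start();
    }
    try
    {
      for ( Thread th : threads ) th.join();
    }
    catch ( InterruptedException e )
    {
      throw new IllegalStateException(e);
    }
    return total;   // 4 * (1 + 2 + ... + 1000000) = 2000002000000
  }
}
```

Compared with synchronizing inside the loop, each thread here takes the lock once instead of a million times.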

## The `synchronized` bug

A `synchronized` block locks only one specific object (or class).  A frequent mistake is to synchronize on an object (such as `this`) and assume that this synchronizes across all objects of the class.  This type of error is quite difficult to find and debug.

It can be right to synchronize on an object, and I have many examples.  But our examples do the following:
  * map an object to a thread
  * synchronize on the class to ensure that only one thread accesses shared data
  
This approach is good parallel design and is robust.

In [3]:
class SynchronizedBug implements Runnable
{
  static int sharedvar = 0;
  static int sharedsynchvar = 0;

  public void run ()
  { 
    for ( int i=0; i<10000000; i++ )
    {
      sharedvar++;
      // NOTE synchronizing on object not class
      synchronized(this){sharedsynchvar++;}
    }
  }
}

int numthreads = 4;
Thread[] threads = new Thread[numthreads];

// create and start thread objects
for ( int i=0; i<numthreads; i++ )
{
    threads[i] = new Thread ( new SynchronizedBug() );
    threads[i].start();
}

// Await the completion of all threads
for ( int i=0; i<numthreads; i++ )
{
    threads[i].join();
}

System.out.println("Shared variable = " + SynchronizedBug.sharedvar);
System.out.println("Shared synchronized variable = " + SynchronizedBug.sharedsynchvar);

Shared variable = 24820557
Shared synchronized variable = 20730602
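In the run above each thread owns its own `SynchronizedBug` object, so `synchronized(this)` takes four different locks and nothing is excluded.  One possible repair, sketched here, is to synchronize on a single object shared by every instance.  The class `SynchronizedFixed` and the field `LOCK` are hypothetical names, and the loop count is reduced to keep the sketch quick; synchronizing on the class, as in the first example, would work just as well.

```java
// Hypothetical demo: every instance synchronizes on the same static lock.
class SynchronizedFixed implements Runnable
{
  static int sharedsynchvar = 0;
  // One lock object shared by all instances of the class.
  private static final Object LOCK = new Object();

  public void run ()
  {
    for ( int i=0; i<100000; i++ )
    {
      synchronized (LOCK) { sharedsynchvar++; }
    }
  }

  static int runAll ()
  {
    sharedsynchvar = 0;
    int numthreads = 4;
    Thread[] threads = new Thread[numthreads];
    for ( int i=0; i<numthreads; i++ )
    {
      threads[i] = new Thread ( new SynchronizedFixed() );
      threads[i].start();
    }
    try
    {
      for ( Thread th : threads ) th.join();
    }
    catch ( InterruptedException e )
    {
      throw new IllegalStateException(e);
    }
    return sharedsynchvar;   // 4 threads * 100000 increments = 400000
  }
}
```

Another option is to pass one shared `Runnable` instance to all four threads, in which case `this` refers to the same object in every thread.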
