## What is a distributed system

- A collection of autonomous computing elements that appears to its users as a single coherent system; a software system in which components located on networked computers communicate and coordinate their actions by passing messages
    - Leslie Lamport (2013 Turing Award recipient): a system in which the failure of a computer you didn't even know existed can render your own computer unusable (e.g., a DNS outage)
    - Maarten van Steen: easy to disassemble but hard to put back together



## Why build distributed systems

1. Resource sharing saves money
    - Office staff share a printer and a file server
    

2. Integrating multiple systems into one can simplify business processes
    - A payroll system talks to an accounting system


3. A centralized system may not be powerful or dependable enough to solve a problem
    - Google search does not fit on a single server


4. In some scenarios the users are distributed around the world
    - Social network

## Middleware

- In order to support heterogeneous computers and networks while offering a **single-system view**, distributed systems are often organized as middleware: *a layer of software that separates applications from the underlying platforms*

<img src="img/Snip20200511_1.png" width=80%/>

### Common Middleware Services

- Communication
    - e.g., add a job to a remote queue
- Transactions
    - e.g., access two independent services atomically
- Service composition
    - e.g., Google Maps enhanced with weather forecasts
- Reliability
    - e.g., replicated state machine
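The replicated state machine idea behind the reliability service can be sketched in a few lines. This is a minimal illustration, not a production design: the class `KeyValueReplica` and the helper `replicate` are hypothetical names, and the sketch assumes all replicas already agree on the order of the log (in practice a consensus protocol such as Paxos or Raft provides that agreement).

```python
class KeyValueReplica:
    """A deterministic state machine: state changes only through apply()."""

    def __init__(self):
        self.state = {}

    def apply(self, command):
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")


def replicate(log, replicas):
    """Apply every command in the log, in the same order, to every replica."""
    for command in log:
        for replica in replicas:
            replica.apply(command)


# Assumption: the log order has already been agreed on by all replicas.
log = [("set", "x", 1), ("set", "y", 2), ("delete", "x", None)]
replicas = [KeyValueReplica() for _ in range(3)]
replicate(log, replicas)
```

Because each replica starts from the same state and applies the same deterministic commands in the same order, all replicas end in the same state, so any one of them can serve requests if another fails.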

## Goals of Distributed Systems

- Supporting resource sharing
- Making distribution transparent
    - Users should not be aware of the distributed essence of the system
    - Achieving 100% transparency is impossible
- Being open
- Being scalable
    - Increase performance by horizontal scaling

### Supporting Resource Sharing

- Peripheral devices (e.g., printers, video cameras)
- Storage facilities (e.g., file server)
- Enterprise data (e.g., contact info, payroll)
- Web pages (e.g., web search)
- CPUs (e.g., supercomputer)

### Making Distribution Transparent

- Distributed systems, particularly middleware systems, attempt to provide **distribution transparency**, that is, to hide the fact that processes and resources are physically distributed

|Transparency|Description|
|:---|:---|
|**Access**|Hide differences in data representation and how a resource is accessed (e.g., copper wire cables vs optical cables)|
|**Location**|Hide where a resource is located (e.g., server location)|
|**Migration**|Hide that a resource may move to another location (e.g., DNS for dynamic IPs)|
|**Relocation**|Hide that a resource may be moved to another location while in use|
|**Replication**|Hide that a resource is replicated (e.g., uploading files to S3)|
|**Concurrency**|Hide that a resource may be shared by several competing users (e.g., multiple users sharing a single cloud drive)|
|**Failure**|Hide the failure and recovery of a resource (e.g., resume download from a different server when one server fails)|
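As a rough illustration of failure transparency (and, incidentally, location and replication transparency), the sketch below hides a failed replica from the caller by silently trying the next one. The names `make_replica`, `fetch_with_failover`, and `ReplicaDown` are hypothetical, and the replicas are plain in-process functions standing in for remote servers.

```python
class ReplicaDown(Exception):
    """Raised by a simulated replica that cannot serve requests."""


def make_replica(name, healthy):
    """Build a stand-in for a remote server (an assumption for this sketch)."""
    def fetch(key):
        if not healthy:
            raise ReplicaDown(name)
        return f"{key} served by {name}"
    return fetch


def fetch_with_failover(key, replicas):
    """Try each replica in turn; the caller never sees individual failures."""
    last_error = None
    for fetch in replicas:
        try:
            return fetch(key)
        except ReplicaDown as error:
            last_error = error  # remember the failure and try the next replica
    raise RuntimeError("all replicas failed") from last_error


replicas = [
    make_replica("server-a", healthy=False),
    make_replica("server-b", healthy=True),
]
result = fetch_with_failover("report.pdf", replicas)
```

The caller asks for `report.pdf` without knowing which server holds it, how many copies exist, or that the first server was down. Only when every replica fails does the failure become visible, which is one reason full transparency cannot be guaranteed.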

### Being Open

- An **open distributed system** offers components that can be easily used by or integrated into other systems. Its key properties are:
    - Interoperability
        - Defining clear interfaces
        - Middleware provides an RPC interface for multiple communicating parties
    - Composability
        - Defining modules that can be reused in multiple components
    - Extensibility
    - Separation of policy from mechanism
        - Configurable parameters for changing a system's behavior
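Separation of policy from mechanism can be illustrated with a small cache whose eviction decision is passed in as a parameter. This is only a sketch: `Cache`, `evict_oldest`, and `evict_largest` are hypothetical names chosen for the example.

```python
class Cache:
    """Mechanism: stores entries and evicts one when full.

    Which entry to evict is the policy, supplied as a function.
    """

    def __init__(self, capacity, choose_victim):
        self.capacity = capacity
        self.choose_victim = choose_victim
        self.entries = {}  # dicts preserve insertion order

    def put(self, key, value):
        if key not in self.entries and len(self.entries) >= self.capacity:
            victim = self.choose_victim(self.entries)
            del self.entries[victim]
        self.entries[key] = value

    def get(self, key):
        return self.entries.get(key)


def evict_oldest(entries):
    """Policy 1: evict the entry that was inserted first."""
    return next(iter(entries))


def evict_largest(entries):
    """Policy 2: evict the entry with the longest value."""
    return max(entries, key=lambda key: len(entries[key]))


fifo_cache = Cache(capacity=2, choose_victim=evict_oldest)
size_cache = Cache(capacity=2, choose_victim=evict_largest)
for cache in (fifo_cache, size_cache):
    cache.put("a", "x")
    cache.put("b", "yyyy")
    cache.put("c", "z")
```

Both caches run the same storage code, yet `fifo_cache` drops `"a"` while `size_cache` drops `"b"`. Changing behavior requires only a different policy function, not a change to the mechanism.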

### Being Scalable

- **Scalability** is a system's ability to expand along three axes
    - Size (e.g., adding users and resources)
    - Geography (e.g., users on different continents)
    - Administration (e.g., multiple independent admins)


- Design concepts that tend to limit scalability
    - Centralized service (e.g., a single server for all users)
    - Centralized data (e.g., a single online telephone book)
    - Centralized algorithms (e.g., routing based on complete information)


- Scaling techniques
    - Hiding communication latencies, e.g., validating a web form at the client rather than at the server
    - Replication, e.g., using distributed memory cache to speed up web application
    - Partitioning, e.g., the original DNS name space was divided into zones
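Partitioning can be sketched with a key-value store that spreads keys over several partitions. Note that this example uses hash partitioning, which is a different scheme from the hierarchical zones of DNS; the names `partition_for` and `PartitionedStore` are hypothetical, and each partition is a local dictionary standing in for a separate server.

```python
import hashlib


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (same result on every run)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedStore:
    """Each dict stands in for one server holding part of the data."""

    def __init__(self, num_partitions):
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition(self, key):
        return self.partitions[partition_for(key, len(self.partitions))]

    def put(self, key, value):
        self._partition(key)[key] = value

    def get(self, key):
        return self._partition(key).get(key)


store = PartitionedStore(num_partitions=4)
for i in range(100):
    store.put(f"user-{i}", i)
```

Every request touches exactly one partition, so adding partitions spreads both data and load. A drawback of this simple modulo scheme is that changing the number of partitions remaps most keys; consistent hashing is a common way to reduce that cost.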


## Fallacies/Pitfalls of Networked and Distributed Computing

1. The network is reliable
2. The network is secure
3. The network is homogeneous
4. The topology does not change
5. The latency is zero
6. The bandwidth is infinite
7. Transport cost is zero
8. There is one administrator
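Code that takes the first fallacy seriously usually wraps remote calls in some retry logic. The sketch below shows one common pattern, retries with exponential backoff; `call_with_retries`, `TransientNetworkError`, and `FlakyService` are hypothetical names, and the flaky service simulates packet loss instead of using a real network.

```python
import time


class TransientNetworkError(Exception):
    """Stands in for a timeout or dropped connection."""


def call_with_retries(operation, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff on transient errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientNetworkError:
            if attempt == max_attempts:
                raise  # give up: the caller must handle the failure
            sleep(base_delay * 2 ** (attempt - 1))  # wait 1x, 2x, 4x, ... base_delay


class FlakyService:
    """Simulated remote call that fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TransientNetworkError("simulated packet loss")
        return "ok"


service = FlakyService(failures_before_success=2)
response = call_with_retries(service)
```

Retrying is only safe when the operation is idempotent, since a request may have been executed even though its reply was lost.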


## Types of Distributed Systems

- Websites and web services 
- High performance computing (HPC)
- Cluster computing (e.g., Hadoop, Spark)
- Cloud and grid computing
- Transaction processing
- Enterprise application integration (EAI)
- Distributed pervasive systems / Internet of things (IoT)
- Sensor networks

## High Performance Computing

<img src="img/Snip20200513_1.png" width=80%/>

## Shared Memory vs Message Passing Paradigms

#### Shared Memory Paradigm

- Threads communicate by accessing shared variables (easier to program, but requires a shared variable abstraction)
- Used heavily for solving CPU-intensive problems
- Tend to be called "parallel computing"

#### Message Passing Paradigm

- Processes communicate by sending and receiving messages over a network (more scalable, but the programmer must handle messages explicitly)
- Used heavily for resource sharing and coordination
- Tend to be called "distributed computing"
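The difference between the two paradigms can be seen by computing the same sum both ways. This is a simplified sketch: the function names are hypothetical, and the message passing version uses threads with an in-process `queue.Queue` as a stand-in for processes exchanging messages over a network.

```python
import queue
import threading


def shared_memory_sum(numbers, num_threads=4):
    """Threads update one shared variable, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        for n in chunk:
            with lock:  # without the lock, concurrent updates could be lost
                total += n

    threads = [
        threading.Thread(target=worker, args=(numbers[i::num_threads],))
        for i in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total


def message_passing_sum(numbers, num_workers=4):
    """Workers share nothing; each sends its partial result as a message."""
    results = queue.Queue()

    def worker(chunk):
        results.put(sum(chunk))  # the only communication is this message

    threads = [
        threading.Thread(target=worker, args=(numbers[i::num_workers],))
        for i in range(num_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results.get() for _ in range(num_workers))


numbers = list(range(1000))
shared_total = shared_memory_sum(numbers)
message_total = message_passing_sum(numbers)
```

The shared memory version is shorter to reason about locally but depends on every thread seeing the same variable. The message passing version makes communication explicit, which is what allows the workers to be moved to different machines.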



## Cluster Computing Systems

- One of the most common types of message-passing systems
- Cluster computing frameworks distribute CPU or I/O intensive jobs across multiple servers

<img src="img/Snip20200513_2.png" width=80%/>
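The way a cluster computing framework splits a job, runs the pieces on workers, and merges the results can be approximated with a word count. This sketch does not use the Hadoop or Spark APIs; `count_words` and `distributed_word_count` are hypothetical names, and threads in a local pool stand in for servers in a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    return Counter(word for line in chunk for word in line.lower().split())


def distributed_word_count(lines, num_workers=3):
    """Split the input, count each chunk on a worker, then merge the counts."""
    chunks = [lines[i::num_workers] for i in range(num_workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for partial in pool.map(count_words, chunks):
            total.update(partial)  # reduce step: merge partial counts
    return total


lines = ["a b a", "b c", "A"]
word_counts = distributed_word_count(lines)
```

Real frameworks add what this sketch leaves out: moving data between machines, scheduling work near the data, and rerunning tasks when a server fails.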



## Cloud and Grid Computing

- Grid computing, also called commodity computing

<img src="img/Snip20200513_3.png" width=80%/>




## Transaction Processing Systems

- Distributed transactions are coordinated by a **transaction processing (TP) monitor**
- To ensure atomicity, implement an atomic commitment protocol (e.g., two-phase commit), which can be a feature of the TP monitor

<img src="img/Snip20200513_4.png" width=80%/>
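Two-phase commit is one well-known atomic commitment protocol, and its core logic fits in a short sketch. The names `Participant` and `two_phase_commit` are hypothetical, the coordinator function plays the role of the TP monitor, and the sketch deliberately omits timeouts, durable logging, and coordinator failure, all of which a real implementation has to handle.

```python
class Participant:
    """A simulated resource manager taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        """Phase 1: vote yes (True) or no (False)."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant votes yes."""
    # Phase 1 (voting): a list, not a generator, so every participant is asked.
    votes = [participant.prepare() for participant in participants]

    # Phase 2 (decision): the same outcome is sent to all participants.
    if all(votes):
        for participant in participants:
            participant.commit()
        return "committed"
    for participant in participants:
        participant.abort()
    return "aborted"


good_run = [Participant("accounts"), Participant("inventory")]
bad_run = [Participant("accounts"), Participant("inventory", can_commit=False)]
good_outcome = two_phase_commit(good_run)
bad_outcome = two_phase_commit(bad_run)
```

In the second run a single "no" vote forces every participant to abort, which is exactly the all-or-nothing behavior that atomicity requires.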

## Enterprise Application Integration

- Middleware is often used as a communication facilitator

<img src="img/Snip20200513_5.png" width=80%/>

## Sensor Networks

- Rely heavily on **in-network data processing** to reduce communication costs
- Example: activity tracking in smartwatches

<img src="img/Snip20200513_6.png" width=80%/>
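The saving from in-network data processing can be made concrete with a small routing tree in which each node forwards a partial aggregate instead of raw readings. This is a sketch under simplifying assumptions: `SensorNode` and the helper functions are hypothetical names, the readings are made up for illustration, and one message is counted per hop.

```python
class SensorNode:
    """A sensor with one reading and child nodes that route through it."""

    def __init__(self, reading, children=()):
        self.reading = reading
        self.children = list(children)


def aggregate(node):
    """Each node combines its children's (sum, count) with its own reading."""
    total, count = node.reading, 1
    for child in node.children:
        child_total, child_count = aggregate(child)
        total += child_total
        count += child_count
    return total, count


def messages_with_aggregation(node):
    """Every non-root node sends exactly one partial aggregate to its parent."""
    return sum(1 + messages_with_aggregation(child) for child in node.children)


def messages_without_aggregation(node, depth=0):
    """Every raw reading is relayed hop by hop, so it costs `depth` messages."""
    return depth + sum(
        messages_without_aggregation(child, depth + 1) for child in node.children
    )


# Illustrative topology: the root (base station) has two children,
# and one of them relays for two further sensors.
leaf_c = SensorNode(24.0)
leaf_d = SensorNode(26.0)
node_a = SensorNode(22.0, children=[leaf_c, leaf_d])
node_b = SensorNode(18.0)
root = SensorNode(20.0, children=[node_a, node_b])

total, count = aggregate(root)
mean_reading = total / count
```

For this tree the mean is computed with 4 messages instead of 6, and the gap widens as the tree gets deeper, since aggregation costs one message per node while raw forwarding costs one message per node per hop.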