# System Design - Study Notes
---

# CAP Theorem
---

In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:

 * __Consistency__: Every read receives the most recent write or an error.
 * __Availability__: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
 * __Partition tolerance__: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

When a network partition occurs, we must decide whether to:

 * Cancel the operation and thus decrease the availability but ensure consistency.
 * Proceed with the operation and thus provide availability but risk inconsistency.

The CAP theorem implies that in the presence of a network partition, one has to choose between consistency and availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.
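The choice described above can be sketched with a toy replica. This is a minimal illustration, not a real replication protocol: the `Replica` class, its `mode` values, and `PartitionError` are hypothetical names introduced here.

```python
class PartitionError(Exception):
    """Raised when a CP replica refuses to answer during a partition."""


class Replica:
    """Toy replica holding one value; mode is "CP" or "AP" (hypothetical labels)."""

    def __init__(self, mode):
        self.mode = mode
        self.value = "v1"          # last value this replica has seen
        self.partitioned = False   # True when it cannot reach its peers

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: fail rather than risk returning stale data.
            raise PartitionError("cannot confirm latest write")
        # Availability first (or no partition): answer with the local copy.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
cp.partitioned = ap.partitioned = True  # a newer write elsewhere never arrives

print(ap.read())  # the AP replica still answers, possibly with stale data
try:
    cp.read()
except PartitionError as exc:
    print("CP replica refused:", exc)
```

Once the partition heals (`partitioned = False`), both replicas answer again, which matches Brewer's point that the trade-off applies only while a partition lasts.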

Eric Brewer argues that the often-used "two out of three" concept can be somewhat misleading because system designers only need to sacrifice consistency or availability in the presence of partitions, and that in many systems partitions are rare.

# Database Normalization
---

Database normalization is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as part of his relational model.

Normalization entails organizing the columns (attributes) and tables (relations) of a database to ensure that their dependencies are properly enforced by database integrity constraints. It is accomplished by applying some formal rules either by a process of synthesis (creating a new database design) or decomposition (improving an existing database design).

![Database Normal Forms](images/normal_forms.png "Database Normal Forms")
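As a rough sketch of decomposition, the example below stores each customer fact once and lets orders refer to it by key, using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their columns are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Each customer fact is stored once; orders refer to it by key.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        item        TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO customers VALUES (1, 'ada@example.com')")
conn.executemany(
    "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
    [(1, "keyboard"), (1, "monitor")],
)

# Changing the email touches one row, so the two orders cannot disagree.
conn.execute("UPDATE customers SET email = 'ada@example.org' WHERE customer_id = 1")

rows = conn.execute("""
    SELECT o.item, c.email
    FROM orders AS o JOIN customers AS c USING (customer_id)
    ORDER BY o.order_id
""").fetchall()
print(rows)

# The integrity constraint rejects an order for a customer that does not exist.
try:
    conn.execute("INSERT INTO orders (customer_id, item) VALUES (99, 'mouse')")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Had the email been copied into every order row, the update would have needed to touch each copy, which is the redundancy normalization aims to remove.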

# HATEOAS (Hypertext As The Engine Of Application State)
---

HATEOAS stands for Hypertext As The Engine Of Application State. It means that a client finds its way through the API by following the hypertext links the server includes in each response, rather than by hard-coding URLs.

An example:

~~~
GET /account/12345 HTTP/1.1

HTTP/1.1 200 OK
<?xml version="1.0"?>
<account>
    <account_number>12345</account_number>
    <balance currency="usd">100.00</balance>
    <link rel="deposit" href="/account/12345/deposit" />
    <link rel="withdraw" href="/account/12345/withdraw" />
    <link rel="transfer" href="/account/12345/transfer" />
    <link rel="close" href="/account/12345/close" />
</account>
~~~

Apart from the fact that we have 100 US dollars in our account, we can see four options: deposit more money, withdraw money, transfer money to another account, or close our account. The `link` tags tell us the URLs needed for each of these actions. Now suppose we did not have 100 USD in the bank, but were in the red instead:

~~~
GET /account/12345 HTTP/1.1

HTTP/1.1 200 OK
<?xml version="1.0"?>
<account>
    <account_number>12345</account_number>
    <balance currency="usd">-25.00</balance>
    <link rel="deposit" href="/account/12345/deposit" />
</account>
~~~

Now we are 25 dollars in the red. Notice that we have lost most of our options, and only depositing money is still valid. As long as we are in the red, we cannot close the account, nor transfer or withdraw any money from it. The hypertext itself tells us what is allowed and what is not: that is HATEOAS.
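On the server side, the two responses above could come from logic along these lines. This is a sketch only: `account_links` is a hypothetical helper, and the rule that a balance of exactly zero keeps all four options is an assumption, since the example only covers a positive and a negative balance.

```python
from decimal import Decimal


def account_links(account_number, balance):
    """Return the rel -> href links valid for the account's current state."""
    base = f"/account/{account_number}"
    links = {"deposit": f"{base}/deposit"}  # depositing is always allowed
    if balance >= 0:  # assumption: zero counts as "not in the red"
        # Only an account that is not overdrawn may move money out or be closed.
        for rel in ("withdraw", "transfer", "close"):
            links[rel] = f"{base}/{rel}"
    return links


print(account_links(12345, Decimal("100.00")))
print(account_links(12345, Decimal("-25.00")))
```

The client never needs to know the overdraft rule; it only checks which `rel` values are present in the response.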

# OSI (Open Systems Interconnection) Model
---

![OSI Model](images/osi2.png "OSI Model")

![OSI Model](images/osi.png "OSI Model")

# OWASP Top 10 Web Application Security Risks
---

1. __A1:2017 - Injection__:

 Injection flaws, such as SQL, NoSQL, OS, and LDAP injection, occur when untrusted data is sent to an interpreter as part of a command or query. The attacker’s hostile data can trick the interpreter into executing unintended commands or accessing data without proper authorization.

2. __A2:2017 - Broken Authentication__:

 Application functions related to authentication and session management are often implemented incorrectly, allowing attackers to compromise passwords, keys, or session tokens, or to exploit other implementation flaws to assume other users’ identities temporarily or permanently.

3. __A3:2017 - Sensitive Data Exposure__:

 Many web applications and APIs do not properly protect sensitive data, such as financial, healthcare, and PII. Attackers may steal or modify such weakly protected data to conduct credit card fraud, identity theft, or other crimes. Sensitive data may be compromised without extra protection, such as encryption at rest or in transit, and requires special precautions when exchanged with the browser.

4. __A4:2017 - XML External Entities (XXE)__:

 Many older or poorly configured XML processors evaluate external entity references within XML documents. External entities can be used to disclose internal files using the file URI handler, internal file shares, internal port scanning, remote code execution, and denial of service attacks.

5. __A5:2017 - Broken Access Control__:

 Restrictions on what authenticated users are allowed to do are often not properly enforced. Attackers can exploit these flaws to access unauthorized functionality and/or data, such as accessing other users’ accounts, viewing sensitive files, modifying other users’ data, or changing access rights.

6. __A6:2017 - Security Misconfiguration__:

 Security misconfiguration is the most commonly seen issue. It is usually the result of insecure default configurations, incomplete or ad hoc configurations, open cloud storage, misconfigured HTTP headers, and verbose error messages containing sensitive information. Not only must all operating systems, frameworks, libraries, and applications be securely configured, but they must also be patched and upgraded in a timely fashion.

7. __A7:2017 - Cross-Site Scripting (XSS)__:

 XSS flaws occur whenever an application includes untrusted data in a new web page without proper validation or escaping, or updates an existing web page with user-supplied data using a browser API that can create HTML or JavaScript. XSS allows attackers to execute scripts in the victim’s browser which can hijack user sessions, deface web sites, or redirect the user to malicious sites.

8. __A8:2017 - Insecure Deserialization__:

 Insecure deserialization often leads to remote code execution. Even if deserialization flaws do not result in remote code execution, they can be used to perform attacks, including replay attacks, injection attacks, and privilege escalation attacks.

9. __A9:2017 - Using Components with Known Vulnerabilities__:

 Components, such as libraries, frameworks, and other software modules, run with the same privileges as the application. If a vulnerable component is exploited, such an attack can facilitate serious data loss or server takeover. Applications and APIs using components with known vulnerabilities may undermine application defenses and enable various attacks and impacts.

10. __A10:2017 - Insufficient Logging & Monitoring__:

 Insufficient logging and monitoring, coupled with missing or ineffective integration with incident response, allows attackers to further attack systems, maintain persistence, pivot to more systems, and tamper, extract, or destroy data. Most breach studies show time to detect a breach is over 200 days, typically detected by external parties rather than internal processes or monitoring.
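Two of the risks above, injection (A1) and XSS (A7), share the same root cause: untrusted data is treated as code. The sketch below contrasts a string-built SQL query with a parameterized one, then escapes untrusted text before placing it in HTML. It uses only Python's standard `sqlite3` and `html` modules; the `users` table and the sample inputs are made up for illustration.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

hostile = "nobody' OR '1'='1"

# Vulnerable: the input is spliced into the SQL text and changes the query's meaning.
unsafe_sql = f"SELECT secret FROM users WHERE name = '{hostile}'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a bound parameter is always treated as data, never as SQL.
safe = conn.execute("SELECT secret FROM users WHERE name = ?", (hostile,)).fetchall()

print(len(leaked), len(safe))  # every secret leaks in the first case, none in the second

# XSS (A7): escape untrusted text before placing it in HTML element content.
comment = "<script>alert(1)</script>"
print(f"<p>{html.escape(comment)}</p>")
```

Escaping with `html.escape` covers element content and quoted attribute values only; other contexts, such as inline JavaScript or URLs, need their own encoding rules.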

# Requirements Gathering Checklist
---

Regarding the system to be designed:

 * Who is going to use it?
 * How are they going to use it?
 * How many users are there?
 * What does the system do?
 * What are the inputs and outputs of the system?
 * How much data do we expect to handle?
 * How many requests per second do we expect?
 * What is the expected read to write ratio?
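The answers to these questions feed a back-of-the-envelope estimate. The sketch below shows the arithmetic; the `estimate` function and every input figure are hypothetical placeholders, not recommendations.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate(daily_users, requests_per_user, reads_per_write, bytes_per_write):
    """Rough daily averages only; peak traffic can be much higher."""
    total_rps = daily_users * requests_per_user / SECONDS_PER_DAY
    # With a read:write ratio of R:1, one request in (R + 1) is a write.
    write_rps = total_rps / (reads_per_write + 1)
    read_rps = total_rps - write_rps
    storage_per_day = write_rps * SECONDS_PER_DAY * bytes_per_write
    return {
        "total_rps": total_rps,
        "read_rps": read_rps,
        "write_rps": write_rps,
        "storage_bytes_per_day": storage_per_day,
    }


# Hypothetical inputs: 1M daily users, 20 requests each, 9:1 reads to writes, 1 kB per write.
est = estimate(
    daily_users=1_000_000,
    requests_per_user=20,
    reads_per_write=9,
    bytes_per_write=1_000,
)
for name, value in est.items():
    print(f"{name}: {value:,.0f}")
```

With these inputs the average load comes to roughly 230 requests per second and about 2 GB of new data per day, which is the kind of order-of-magnitude figure the checklist is meant to produce.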

# SOLID Principles
---

The following five concepts make up the SOLID principles:

 * __Single Responsibility__
 
 A class should only have one responsibility. Furthermore, it should only have one reason to change.
 
 * __Open/Closed__

 Classes should be open for extension but closed for modification. In doing so, we stop ourselves from modifying existing code and causing potential new bugs.

 * __Liskov Substitution__

 If class B is a subtype of class A, we should be able to use an object of B wherever an object of A is expected without disrupting the behavior of our program.

 * __Interface Segregation__

 Larger interfaces should be split into smaller ones. By doing so, we can ensure that implementing classes only need to be concerned about the methods that are of interest to them.

 * __Dependency Inversion__

 The principle of dependency inversion refers to the decoupling of software modules. This way, instead of high-level modules depending on low-level modules, both will depend on abstractions.
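Dependency inversion is perhaps the easiest of the five to show in a few lines. In the sketch below, all class names (`MessageSender`, `EmailSender`, `InMemorySender`, `SignupService`) are hypothetical; the high-level service depends only on an abstraction, so either implementation can be passed in.

```python
from typing import Protocol


class MessageSender(Protocol):
    """Abstraction that both the high-level and the low-level code depend on."""

    def send(self, to: str, body: str) -> None: ...


class EmailSender:
    """Low-level detail; a real one would talk to a mail server."""

    def send(self, to: str, body: str) -> None:
        print(f"email to {to}: {body}")


class InMemorySender:
    """Another implementation that records messages, handy in tests."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class SignupService:
    """High-level policy: knows only the MessageSender abstraction."""

    def __init__(self, sender: MessageSender) -> None:
        self._sender = sender

    def register(self, email: str) -> None:
        self._sender.send(email, "Welcome!")


SignupService(EmailSender()).register("ada@example.com")

fake = InMemorySender()
SignupService(fake).register("bob@example.com")
print(fake.sent)
```

The same shape touches two other principles: a new sender can be added without modifying `SignupService` (Open/Closed), and any sender can stand in for another without breaking it (Liskov Substitution).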

# SQL vs NoSQL
---

![SQL vs No-SQL](images/sql_nosql.png "SQL vs No-SQL")

## Reasons for SQL

 * Structured data
 * Strict schema
 * Relational data
 * Need for complex joins
 * Transactions
 * Clear patterns for scaling
 * More established: developers, community, code, tools, etc.
 * Lookups by index are very fast

## Reasons for NoSQL

 * Semi-structured data
 * Dynamic or flexible schema
 * Non-relational data
 * No need for complex joins
 * Store many TB (or PB) of data
 * Very data intensive workload
 * Very high throughput (IOPS)

## Sample data well-suited for NoSQL

 * Rapid ingest of clickstream and log data
 * Leaderboard or scoring data
 * Temporary data, such as a shopping cart
 * Frequently accessed ('hot') tables
 * Metadata/lookup tables