# IT Security
## CIA Triad
In computer security, CIA stands for confidentiality, integrity, and availability. The **CIA triad** is a guiding model for designing information security policies. 
* **Confidentiality** means keeping things hidden by limiting access to our data. 
* **Integrity** means keeping our data accurate and untampered with. The data that we send or receive should remain the same throughout its entire journey. 
* **Availability** means that information is readily accessible to the people who are authorized to use it. This includes being prepared for the case where our data is lost or our system is down. 

## Security Threats
#### RISK
>**Risk** is the possibility of suffering a loss in the event of an attack on the system. 

#### VULNERABILITY
>A **vulnerability** is a flaw in the system that could be exploited to compromise the system. There's a special type of vulnerability called a **0-day vulnerability** or **zero-day**, which is unknown to the software developer or vendor but is known to an attacker. The name refers to the amount of time the software vendor has had to react to and fix the vulnerability: zero days. 

#### EXPLOIT
>An **exploit** is software that is used to take advantage of a security bug or vulnerability. Attackers write exploits for the vulnerabilities they find in software in order to cause harm to the system. 

#### THREAT
>**Threat** is the possibility of danger that could exploit a vulnerability. 

#### HACKER
>A **hacker** in the security world is someone who attempts to break into or exploit a system. There are two common types of hackers: **black hat hackers**, who try to get into systems to do something malicious, and **white hat hackers**, who attempt to find weaknesses in a system but also alert the owners of those systems so that they can fix them before someone else does something malicious. 

#### ATTACK
>**Attack** is an actual attempt at causing harm to a system. 

## Malware and its types
**Malware** is malicious software that can be used to obtain our sensitive information or to delete or modify files. Basically, it can be used for any and all unwanted purposes. The most common types of malware are viruses, worms, adware, spyware, Trojans, rootkits, backdoors, and botnets. 

#### VIRUS
>**Viruses** are the best-known type of malware. A virus attaches itself to some sort of executable code, like a program. When the program runs, it touches many files, each of which is now susceptible to being infected with the virus. So, the virus replicates itself onto these files, does the malicious work it's intended to do, and repeats this over and over until it spreads as far as it can. 

#### WORM
>**Worms** are similar to viruses except that instead of having to attach themselves onto something to spread, worms can live on their own and spread through channels like the network. 

#### ADWARE
>**Adware** is one of the most visible forms of malware. It is software that displays advertisements and collects data. 

#### TROJAN
>A **Trojan** is malware that disguises itself as one thing but does something else. A computer Trojan has to be accepted by the user, meaning the program has to be executed by the user. No one would willingly install malware on their machine, which is why Trojans entice us to install them by disguising themselves as other software. 

#### SPYWARE
>**Spyware** is the type of malware that's meant to spy on us by monitoring our computer screens, key presses, webcams, and then reporting or streaming all of this information to another party. A keylogger is a common type of spyware that's used to record every keystroke. 

#### RANSOMWARE
>**Ransomware** is a type of attack that holds our data or system hostage until we pay some sort of ransom. 

#### BOTS
>Some malware can use someone else's machine to perform a task that is centrally controlled by the attacker. These compromised machines are known as **bots**. A collection of bots forms a network of devices that we call a botnet. 

#### BOTNETS
>**Botnets** are designed to use the power of Internet-connected machines to perform some distributed function, for example mining Bitcoin. So instead of having one computer run computations, attackers can have a thousand computers running computations and raking in more and more Bitcoin. 

#### BACKDOOR
>A **backdoor** is a way to get into a system when the other methods of getting in aren't allowed. It's a secret entryway for attackers. Backdoors are most commonly installed after an attacker has gained access to our system and wants to maintain that access. 

#### ROOTKIT
>A **rootkit** is a kit for root, meaning a collection of software or tools that an admin would use. It allows admin level modification to an operating system. A rootkit can be hard to detect because it can hide itself from the system using the system itself. The rootkit can be running lots of malicious processes, but at the same time those processes wouldn't show up in task manager because it can hide its own presence. 

#### LOGIC BOMB
>A **logic bomb** is a type of malware that's intentionally installed and then lies dormant. Once a certain event or time triggers it, it runs the malicious program. 

## Network Attacks
Some common types of network attacks are:

#### DNS CACHE POISONING ATTACK
>A **DNS Cache Poisoning attack** works by tricking a DNS server into accepting a fake DNS record that will point us to a compromised DNS server. The compromised server then feeds us fake addresses when we try to access legitimate websites. Not only that, DNS Cache Poisoning can spread to other networks too. If other DNS servers are getting their DNS information from a compromised server, they'll serve those bad DNS entries to other hosts. 

#### MAN IN THE MIDDLE ATTACK
>A **man-in-the-middle attack** is an attack that places the attacker in the middle of two hosts that think they're communicating directly with each other. The attacker will monitor the information going to and from these hosts, and potentially modify it in transit. A common man-in-the-middle attack is session hijacking, also called cookie hijacking. 

#### ROGUE ACCESS POINT ATTACK
>A **rogue access point attack** or **rogue AP attack** involves an access point that is installed on the network without the network administrator's knowledge. For example, in corporate environments, someone may plug a router into their corporate network to create a simple wireless network. This can be pretty dangerous, and could grant unauthorized access to an otherwise secure network. 

#### EVIL TWIN ATTACK
>An **evil twin attack** is similar to the rogue AP attack but has a small, important difference. The premise of an evil twin attack is to get us to connect to a network that looks identical to ours. This identical network is our network's evil twin and is controlled by the attacker. Once we connect to it, the attacker will be able to monitor our traffic. 

## DoS Attacks
A **Denial-of-Service**, or **DoS attack**, is an attack that tries to prevent access to a service for legitimate users by overwhelming the network or server. DoS attacks try to take up the resources of a service and prevent real users from accessing it. One example is the **Ping of Death** or **POD**. POD works by sending a malformed ping to a computer. The ping is larger in size than what the internet protocol was made to handle, so it results in a buffer overflow. This can cause the system to crash and potentially allow the execution of malicious code. 

Another example is a **ping flood**, which sends tons of ping packets to a system. More specifically, it sends ICMP echo requests, since a ping expects an equal number of ICMP echo replies. If a computer can't keep up with this, then it's prone to being overwhelmed and taken down. 

Similar to a ping flood is a **SYN flood**. Remember that to make a TCP connection, a client sends a SYN packet to a server it wants to connect to. Next, the server sends back a SYN-ACK message, then the client sends an ACK message. In a SYN flood, the server is bombarded with SYN packets. The server sends back SYN-ACK packets, but the attacker never sends the ACK messages. This means that each connection stays open and takes up the server's resources, so other users will be unable to connect to the server. Since the TCP connection is half-open, we also refer to SYN floods as **half-open attacks**.

A DoS attack using multiple systems is called a **distributed denial-of-service attack** or DDoS. **DDoS attacks** need a large volume of systems to carry out an attack, and they're usually carried out with the help of botnets. In that scenario, attackers can gain access to large volumes of machines to perform an attack. 

## Client-Side Attacks
A common security exploit that can occur in software development and runs rampant on the web is the possibility for an attacker to inject malicious code. These are called **injection attacks**. Injection attacks can be mitigated with good software development principles, like validating input and sanitizing data. 

**Cross-site scripting**, or **XSS attacks**, are a type of injection attack where the attacker can insert malicious code and target the user of the service. XSS attacks are a common method to achieve session hijacking. It can be as simple as embedding a malicious script in a website, which the user unknowingly executes in their browser. The script could then do malicious things like steal a victim's cookies and gain access to their login on a website. 
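
As a minimal sketch of the "sanitizing data" mitigation for XSS, the snippet below escapes user-supplied text before placing it into HTML. The `render_comment` helper and the sample payload are hypothetical, and real applications would normally rely on their template engine's escaping instead of hand-rolled string building.

```python
import html


def render_comment(comment: str) -> str:
    """Wrap user-supplied text in a paragraph, escaping HTML metacharacters.

    html.escape turns characters such as < and > into entities, so the
    browser displays the text instead of interpreting it as markup.
    """
    return "<p>" + html.escape(comment) + "</p>"


# Hypothetical payload an attacker might submit as a "comment".
malicious = "<script>steal(document.cookie)</script>"
print(render_comment(malicious))
```

The escaped output contains no `<script>` tag, so the injected text is shown to the user rather than executed.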

Another type of injection attack is a **SQL injection attack**. Unlike an XSS attack, which targets a user, a SQL injection attack targets the entire website if the website is using a SQL database. Attackers can potentially run SQL commands that allow them to delete website data, copy it, and run other malicious commands. 
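
The difference between a vulnerable query and a mitigated one can be sketched with Python's built-in `sqlite3` module. The `users` table, its contents, and the helper names are made up for illustration. The unsafe version pastes input into the SQL text, while the safe version passes it as a bound parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input becomes part of the SQL statement itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(conn, name):
    # Mitigated: the value is bound as a parameter, never parsed as SQL.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = make_db()
payload = "nobody' OR '1'='1"
print(lookup_unsafe(conn, payload))  # every row leaks
print(lookup_safe(conn, payload))    # no rows match the literal string
```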

## Password Attacks
**Password attacks** use software like password crackers that try to guess our password. A common password attack is a **brute force attack**, which continuously tries different combinations of characters until it gets access. Since this attack requires testing a lot of password combinations, it usually takes a while. CAPTCHAs are used to distinguish a real human from a machine. Without a CAPTCHA, an automated system could just keep trying to log into our account until it found the right password. A CAPTCHA makes these automated attempts much harder to carry out. Another type of password attack is a **dictionary attack**, which tries out words that are commonly used in passwords, like apple, monkey, or football. 
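
The two strategies can be compared with a small sketch. The `check` callback stands in for whatever login or hash comparison an attacker is testing against. It, the tiny alphabet, and the word list are assumptions chosen so the example finishes instantly.

```python
import itertools


def brute_force(check, alphabet, max_len):
    """Try every combination up to max_len; return (password, attempts)."""
    attempts = 0
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            attempts += 1
            guess = "".join(combo)
            if check(guess):
                return guess, attempts
    return None, attempts


def dictionary_attack(check, words):
    """Try only words from a list; return (password, attempts)."""
    for attempts, word in enumerate(words, start=1):
        if check(word):
            return word, attempts
    return None, len(words)


secret = "cab"  # hypothetical password


def check(guess):
    return guess == secret


print(brute_force(check, "abc", 3))
print(dictionary_attack(check, ["apple", "monkey", "cab"]))
```

With an alphabet of size `n` and length `k`, brute force may need up to `n ** k` guesses for that length alone, which is why longer passwords and larger character sets slow it down so much.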

## Deceptive Attacks
**Social engineering** is an attack method that relies heavily on interactions with humans instead of computers. Social engineering is a kind of con game where attackers use deceptive techniques to gain access to personal information. They then try to have a user execute something, and basically scam a victim into doing that thing. 

A popular type of social engineering attack is a **phishing attack**. Phishing usually occurs when a malicious email is sent to a victim disguised as something legitimate. One common phishing attack is an email saying our bank account has been compromised, which then gives us a link to click on to reset our password. Another variation of phishing is **spear phishing**. Both phishing schemes have the same end goals, but spear phishing specifically targets an individual or group. 

Another popular social engineering attack is **email spoofing**. Spoofing is when a source masquerades as something else. In email spoofing, we receive an email with a misleading sender address. 

One attack happens through actual physical contact, called **baiting**, which is used to entice a victim to do something. For example, an attacker could just leave a USB drive somewhere in hopes that someone out there will plug it into their machine to see what's on it. 

Another popular attack that can occur offline is called **tailgating**, which is essentially gaining access into a restricted area or building by following a real employee in. 


## Cryptography
The practice of coding and hiding messages from third parties in order to secure them is called **cryptography**. The study of this practice is referred to as **cryptology**. The opposite, looking for hidden messages or trying to decipher coded messages, is referred to as **cryptanalysis**. 

**Encryption** is the act of taking a message (or **plaintext**) and applying an operation to it (called a **cipher**) which returns a garbled output or unreadable message (called **ciphertext**). The reverse process, taking the garbled output and transforming it back into the readable plain text is called **decryption**. A cipher is actually made up of two components, the encryption algorithm and the key. The **encryption algorithm** is the underlying logic or process that's used to convert the plaintext into ciphertext. These algorithms are usually very complex mathematical operations. The other crucial component of a cipher is the **key**, which introduces something unique into our cipher. A **cryptosystem** is a collection of algorithms for key generation and encryption and decryption operations.

**Frequency analysis** is the practice of studying the frequency with which letters appear in ciphertext. The premise behind this type of analysis is that in written languages, certain letters appear more frequently than others, and some letters are more commonly grouped together than others. 
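
A minimal sketch of the counting step is shown below. The `letter_frequencies` helper and the sample ciphertext are illustrative assumptions. A real analysis would compare the counts against known letter frequencies for the suspected language.

```python
from collections import Counter


def letter_frequencies(ciphertext: str):
    """Return (letter, count) pairs, most frequent first, ignoring non-letters."""
    letters = [ch for ch in ciphertext.lower() if ch.isalpha()]
    return Counter(letters).most_common()


# Hypothetical ciphertext produced by some substitution cipher.
sample = "Wkh txlfn eurzq ira mxpsv ryhu wkh odcb grj"
for letter, count in letter_frequencies(sample)[:3]:
    print(letter, count)
```

If one ciphertext letter dominates, it is a reasonable first guess that it stands for a very common plaintext letter such as "e" in English.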

**Steganography** is a related practice but distinctly different from cryptography. It's the practice of hiding information from observers, but not encoding it. Think of writing a message using invisible ink, i.e. the message is in plaintext and no decoding is necessary to read the text but the text is hidden from sight. The ink is invisible and must be made visible using a mechanism known to the recipient. Modern steganographic techniques include embedding messages and even files into other files like images or videos. 

## Security Principles
**Security through obscurity** is a general concept which basically means that we're safe from attackers if no one knows what encryption algorithms or general security practices were used to secure data. In other words, by keeping the algorithm secret, our messages are supposedly secured from a snooping third party. However, security through obscurity isn't something that we should rely on for securing communication or systems. The system should remain secure even if our adversary knows exactly what kind of encryption systems we're employing, as long as our keys remain secure. 

**Kerckhoffs's principle** states that a cryptosystem should remain secure even if everything about the system is publicly known except for the key. The principle, sometimes referred to as Kerckhoffs's axiom or law, forms the basis of open security and security by design, and contrasts directly with the deprecated security through obscurity model.

**Shannon's maxim** restates Kerckhoffs's principle as "the enemy knows the system". According to this principle, one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them.

## Symmetric Cryptography
**Symmetric-key algorithms** are called symmetric because they use the same key to encrypt and decrypt messages. A **substitution cipher** is an encryption mechanism that replaces parts of our plaintext with ciphertext. In this case, the key is the mapping of characters between plaintext and ciphertext. Without knowing which letters get replaced with which, the message is hard to read. A well-known example of a substitution cipher is the **Caesar cipher**, where we replace characters in the alphabet with others, usually by shifting or rotating the alphabet by a set number of characters. The size of the offset is the key. Another popular example is **ROT-13**, where the alphabet is rotated 13 places. ROT-13 is really just a Caesar cipher that uses a key of 13. Since thirteen is exactly half of the alphabet, applying ROT-13 twice returns the original text. 
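
The shifting described above can be sketched in a few lines. The `caesar` function name is an assumption, and the sketch only rotates ASCII letters, leaving everything else untouched.

```python
import codecs


def caesar(text: str, key: int) -> str:
    """Rotate each ASCII letter by `key` positions; a negative key decrypts."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            out.append(ch)  # spaces, digits, and punctuation pass through
    return "".join(out)


ciphertext = caesar("Hello, World", 3)
print(ciphertext)              # Khoor, Zruog
print(caesar(ciphertext, -3))  # Hello, World

# ROT-13 is the key=13 special case, so encrypting twice undoes itself.
assert caesar(caesar("Hello", 13), 13) == "Hello"
assert caesar("Hello", 13) == codecs.encode("Hello", "rot_13")
```

Because there are only 25 useful keys, a Caesar cipher can be broken by simply trying every shift.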

### Stream and Block Ciphers
There are two more categories that symmetric key ciphers can be placed into: block ciphers and stream ciphers. A **stream cipher** takes a stream of input and encrypts the stream one character or one digit at a time, outputting one encrypted character or digit at a time, i.e. there is a one-to-one relationship between data in and encrypted data out. A **block cipher** takes data in, places it into a bucket or block of data that's a fixed size, then encodes that entire block as one unit. If the data to be encrypted isn't big enough to fill the block, the extra space will be padded to ensure the plaintext fits into the blocks evenly. 
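
The padding step for block ciphers can be illustrated as follows. The text does not name a padding scheme, so this sketch assumes the widely used PKCS#7 style (each padding byte stores the padding length) and a 16-byte block. The `unpad` helper skips the validation a real implementation would need.

```python
def pad(data: bytes, block_size: int = 16) -> bytes:
    """Append n bytes of value n so the length is a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def unpad(padded: bytes) -> bytes:
    """Remove the padding added by pad(). No validation in this sketch."""
    return padded[: -padded[-1]]


def split_blocks(data: bytes, block_size: int = 16):
    """Cut padded data into the fixed-size blocks a block cipher works on."""
    return [data[i : i + block_size] for i in range(0, len(data), block_size)]


message = b"attack at dawn"          # 14 bytes, not a full block
padded = pad(message)
print(len(padded), split_blocks(padded))
print(unpad(padded) == message)
```

Note that data which already fills a block exactly still gets a full block of padding, so the receiver can always tell where the padding starts.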

Stream ciphers are faster and less complex to implement, but they can be less secure than block ciphers. If key generation and handling isn't done properly, and the same key is used to encrypt data two or more times, it's possible to break the cipher and recover the plaintext. 
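
Why key reuse is dangerous can be shown with a toy XOR-based stream cipher. The random `keystream` here stands in for the output of a real stream cipher, and the two messages are made up. The point is only that XORing two ciphertexts made with the same keystream cancels the key.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = os.urandom(16)           # reused for both messages (the mistake)
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The keystream cancels out: c1 XOR c2 equals p1 XOR p2.
leak = xor_bytes(c1, c2)
print(leak == xor_bytes(p1, p2))

# An eavesdropper who knows or guesses p1 recovers p2 without the key.
recovered = xor_bytes(leak, p1)
print(recovered)
```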

### Initialization vector
An **initialization vector**, or **IV**, is used to avoid key reuse. A bit of random data is integrated into the encryption key, and the resulting combined key is then used to encrypt the data. The idea is that if we have one shared master key, we derive a one-time encryption key from the master key and the IV, and use it only once. In order for the encrypted message to be decoded, the IV must be sent in plaintext along with the encrypted message. 
A good example of this can be seen when inspecting the 802.11 frame of a WEP encrypted wireless packet. 

![Symmetric Cryptography Example](imgs/sym_cryp_ex.png)

The IV is included in plaintext right before the encrypted data payload. 

### Symmetric-key algorithms
**DES** or **Data Encryption Standard** is a symmetric block cipher that uses a 64-bit key and operates on blocks 64 bits in size. Though the key is technically 64 bits in length, 8 bits are used only for parity checking, a simple form of error checking. This means that the effective key length for DES is only 56 bits.

The key is the unique piece that protects our data, and the symmetric key must be kept secret to ensure the confidentiality of the data being protected. The **key size**, defined in bits, is the total number of bits of data that comprise the encryption key. It determines the upper limit for the total number of possible keys for a given encryption algorithm, and it is very important in cryptography since it essentially defines the maximum potential strength of the system. 
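
The relationship between key size and the number of possible keys is simple arithmetic: an n-bit key allows 2^n keys. The sketch below makes that concrete. The guessing rate is a purely hypothetical figure chosen for illustration, not a claim about any real hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def possible_keys(bits: int) -> int:
    """Upper limit on the number of distinct keys for a given key size."""
    return 2 ** bits


def years_to_try_all(bits: int, guesses_per_second: float) -> float:
    """Time to exhaust the whole key space at a given (assumed) guessing rate."""
    return possible_keys(bits) / guesses_per_second / SECONDS_PER_YEAR


RATE = 1e9  # hypothetical attacker speed: one billion keys per second
for bits in (56, 128, 256):
    print(bits, possible_keys(bits), years_to_try_all(bits, RATE))
```

Each extra bit doubles the key space, so going from the 56-bit effective DES key to a 128-bit AES key multiplies the work by 2^72.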

**AES** or **Advanced Encryption Standard** is a symmetric block cipher similar to DES that uses 128-bit blocks, twice the size of DES blocks, and supports key lengths of 128-bit, 192-bit, or 256-bit. 

**RC4** or **Rivest Cipher 4** is a symmetric stream cipher that gained widespread adoption because of its simplicity and speed. RC4 supports key sizes from 40 bits to 2,048 bits. The weaknesses of RC4 aren't due to brute-force attacks; rather, the cipher itself has inherent weaknesses and vulnerabilities. RC4 was used in a bunch of popular encryption protocols, like WEP for wireless encryption, and WPA, the successor to WEP. It was also supported in SSL and TLS until 2015, when RC4 was dropped in all versions of TLS because of these weaknesses. For this reason, most major web browsers have dropped support for RC4 entirely, along with all versions of SSL, and use TLS instead. A preferred secure configuration is **TLS 1.2 with AES GCM**, a specific mode of operation for the AES block cipher that essentially turns it into a stream cipher. 

**GCM**, or **Galois/Counter Mode**, works by taking a randomized seed value, incrementing it, and encrypting each resulting value, which creates sequentially numbered blocks of keystream. These blocks are then combined with the plaintext to produce the ciphertext. GCM also computes an authentication tag so that tampering with the ciphertext can be detected. GCM is popular because its security is based on AES encryption, it performs well, and it can be run in parallel with great efficiency. 
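
The counter idea behind GCM can be sketched with the standard library only. This is not GCM and not AES: SHA-256 is used here purely as a stand-in for the block cipher, there is no authentication tag, and the function names are hypothetical. It only shows how encrypting an incrementing counter yields independent keystream blocks that are XORed with the data.

```python
import hashlib
import os

BLOCK = 32  # bytes produced per counter value by the stand-in function


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for AES_encrypt(key, nonce || counter); NOT real AES.
    return hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR data with keystream from successive counters."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        block = keystream_block(key, nonce, i // BLOCK)
        chunk = data[i : i + BLOCK]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


key = os.urandom(16)
nonce = os.urandom(12)   # the randomized seed value, fresh for every message
message = b"counter mode turns a block primitive into a stream cipher"
ciphertext = ctr_xor(key, nonce, message)
print(ctr_xor(key, nonce, ciphertext) == message)
```

Each keystream block depends only on the key, the nonce, and its own counter value, which is why the blocks can be computed in parallel.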

Symmetric-key algorithms are relatively easy to implement and maintain because the same shared secret key is used for both encryption and decryption, so there's only one key that needs to be maintained and kept secure. Symmetric algorithms are also very fast and efficient at encrypting and decrypting large batches of data. A familiar example of a shared secret is the Wi-Fi password at home.

## Asymmetric Cryptography
In an **asymmetric cryptosystem**, or **public key cipher**, different keys are used to encrypt and decrypt. Consider an example where two people need to communicate securely using asymmetric encryption. First, each must generate a **private key**, from which a **public key** is derived. The strength of the asymmetric encryption system comes from the computational difficulty of figuring out the corresponding private key given a public key. After generating their private and public key pairs, they exchange public keys, keeping their private keys secret. This allows them to begin exchanging secure messages. The first person uses the second person's public key to encrypt a message and sends the ciphertext. The second person receives the message and uses their own private key to decrypt and read it. Because of the relationship between private and public keys, only the second person's private key can decrypt the message, as it was encrypted using their public key. 

**Public key signatures** or **digital signatures** are used to validate and trust communication when messages are exchanged. This ensures that the message was not modified or tampered with, thus verifying the message's origin and authenticity by combining the message, the digital signature, and the public key. This is an important component of the asymmetric cryptosystem. 

The three concepts that an asymmetric cryptosystem grants us are confidentiality, authenticity, and non-repudiation. Confidentiality is granted through the encryption-decryption mechanism. Authenticity is granted by the digital signature mechanism, as the message can be authenticated or verified that it wasn't tampered with. Non-repudiation means that the author of the message isn't able to dispute the origin of the message. In other words, this allows us to ensure that the message came from the person claiming to be the author. 

Asymmetric encryption allows secure communication over an untrusted channel, whereas with symmetric encryption we need some way to securely communicate the shared secret or key to the other party. While asymmetric encryption works really well in untrusted environments, it's also computationally more expensive and complex. On the other hand, symmetric encryption algorithms are faster and more efficient at encrypting large amounts of data. In fact, many secure communication schemes take advantage of the relative benefits of both encryption types by using both, for different purposes. For example, an asymmetric encryption algorithm is chosen as the key exchange mechanism, and once the shared secret has been received, the data is sent quickly, efficiently, and securely using a symmetric encryption cipher.

**MACs** or **Message Authentication Codes** are a bit of information that allows authentication of a received message, ensuring that the message came from the alleged sender and not a third party masquerading as them. It also ensures that the message wasn't modified in some way in order to provide data integrity. This sounds super similar to digital signatures using public key cryptography, however it differs slightly since the secret key that's used to generate the MAC is the same one that's used to verify it. In this sense, it's similar to symmetric encryption system and the secret key must be agreed upon by all communicating parties beforehand or shared in some secure way. 

One popular and secure type of MAC is called **HMAC**, or **Keyed-Hash Message Authentication Code**. HMAC uses a cryptographic hash function along with a secret key to generate a MAC. Any cryptographic hash function can be used, like MD5 or SHA, and the strength or security of the MAC depends on the underlying security of the cryptographic hash function used. The MAC is sent alongside the message that's being checked. The receiver verifies the MAC by performing the same operation on the received message, then comparing the computed MAC with the one received with the message. If the MACs are the same, then the message is authenticated. 
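
The generate-then-compare flow described above maps directly onto Python's standard `hmac` module. The helper names, the key, and the messages are illustrative, and SHA-256 is an assumed choice of hash function.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), received_mac)


shared_key = b"agreed-upon secret"      # hypothetical pre-shared key
message = b"transfer 10 coins to bob"
tag = make_mac(shared_key, message)

print(verify_mac(shared_key, message, tag))                       # True
print(verify_mac(shared_key, b"transfer 99 coins to eve", tag))   # False
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading bytes matched.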

There are also MACs based on symmetric encryption ciphers like DES or AES, which are called **CMACs** or **Cipher-Based Message Authentication Codes**. The process is similar to HMAC, but instead of using a hashing function to produce a digest, a symmetric cipher with a shared key is used to encrypt the message, and the resulting output is used as the MAC. A specific and popular example, though slightly different from CMAC, is **CBC-MAC** or **Cipher Block Chaining Message Authentication Code**. CBC-MAC is a mechanism for building MACs using block ciphers. It works by taking a message and encrypting it using a block cipher operating in CBC mode. **CBC mode** is an operating mode for block ciphers that incorporates the previously encrypted block's ciphertext into the next block's plaintext. So, it builds a chain of encrypted blocks that requires the full, unmodified chain to decrypt. This chain of interdependently encrypted blocks means that any modification to the plaintext will result in a different final output at the end of the chain, ensuring message integrity.

### Asymmetric Encryption Algorithms
**RSA** was one of the first practical asymmetric cryptography systems. It specifies mechanisms for the generation and distribution of keys, along with encryption and decryption operations using these keys. The key generation process depends on choosing two unique, random, and usually very large prime numbers. **DSA** or **Digital Signature Algorithm** is another example of an asymmetric system, though it's used for signing and verifying data rather than encrypting it. Similar to RSA, the specification covers the key generation process along with signing and verifying data using the key pairs. The security of this system depends on choosing a random seed value that's incorporated into the signing process. If this value is leaked, or if it can be inferred because it isn't truly random, then it's possible for an attacker to recover the private key. 
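
The RSA key generation, encryption, and signing steps can be sketched with deliberately tiny numbers. The primes 61 and 53 and the exponent 17 are toy values chosen so the arithmetic is easy to follow. Real RSA uses primes hundreds of digits long plus padding schemes that this sketch omits, and signs a hash of the message rather than a raw integer.

```python
# Toy key generation (insecure sizes, for illustration only).
p, q = 61, 53
n = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(ciphertext, d, n)


def sign(digest: int) -> int:
    """Signing applies the private key to a (toy) message digest."""
    return pow(digest, d, n)


def verify(digest: int, signature: int) -> bool:
    """Anyone can check a signature with the public key."""
    return pow(signature, e, n) == digest


m = 65
print(decrypt(encrypt(m)) == m)
print(verify(123, sign(123)), verify(124, sign(123)))
```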

A popular key exchange algorithm is **DH** or **Diffie-Hellman**. For instance, assume that two people need to communicate over an unsecured channel, and they agree on a starting number that is a very large random integer. This number should be different for every session and doesn't need to be secret. Next, each person chooses another randomized large number, but this one is kept secret. Then, they each combine the shared number with their respective secret number and send the resulting mix to each other. Next, each person combines their secret number with the combined value they received in the previous step. The result is a new value that's the same on both sides, without disclosing enough information for any potential eavesdroppers to figure out the shared secret. This algorithm was designed solely for key exchange, though there have been efforts to adapt it for encryption purposes. It's even been used as part of a PKI system or Public Key Infrastructure system. 
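
The exchange can be sketched using modular exponentiation as the "combine" step. In this classic form the public starting values are a prime modulus `p` and a generator `g`. The tiny values 23 and 5 and the helper names are assumptions for illustration, and real deployments use standardized parameters that are thousands of bits long.

```python
import secrets


def dh_public(g: int, secret: int, p: int) -> int:
    """Mix the public starting values with our private number."""
    return pow(g, secret, p)


def dh_shared(other_public: int, secret: int, p: int) -> int:
    """Mix the value received from the other side with our private number."""
    return pow(other_public, secret, p)


p, g = 23, 5                          # toy public parameters
a = secrets.randbelow(p - 2) + 1      # first person's secret, in [1, p-2]
b = secrets.randbelow(p - 2) + 1      # second person's secret

A = dh_public(g, a, p)                # sent over the open channel
B = dh_public(g, b, p)                # sent over the open channel

# Both sides arrive at the same value without ever sending a or b.
print(dh_shared(B, a, p) == dh_shared(A, b, p))
```

An eavesdropper sees `p`, `g`, `A`, and `B`, but recovering `a` or `b` from them is the discrete logarithm problem, which is believed to be hard for properly sized parameters.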

**Elliptic curve cryptography** or **ECC** is a public key encryption system that uses the algebraic structure of elliptic curves over finite fields to generate secure keys. Traditional public key systems rely on the difficulty of factoring the product of large prime numbers, whereas ECC makes use of elliptic curves, which are composed of a set of coordinates that satisfy an equation. Elliptic curves have a couple of interesting and unique properties. One is horizontal symmetry, which means that any point on the curve can be mirrored along the x axis and still be on the same curve. On top of this, any non-vertical line will intersect the curve in three places at most. It's this last property that allows elliptic curves to be used in encryption. The benefit of elliptic curve based encryption systems is that they are able to achieve security similar to traditional public key systems with smaller key sizes. So, for example, a 256-bit elliptic curve key would be comparable to a 3,072-bit RSA key. This is really beneficial since it reduces the amount of data that needs to be stored and transmitted when dealing with keys. Both Diffie-Hellman and DSA have elliptic curve variants, referred to as **ECDH** and **ECDSA**, respectively. 

## Hashing
**Hashing**, or a hash function, is a type of function or operation that takes in an arbitrary data input and maps it to an output of a fixed size, called a **hash** or a **digest**. The output size is usually specified in bits and is often included in the hashing function's name. This means that no matter how much data is fed into a hash function, the resulting output will always be the same size. Ideally the output should also be unique to the input, such that two different inputs never yield the same output. Hash functions have a large number of applications in computing in general, and are typically used to uniquely identify data. Hashing can also be used to identify duplicate data sets in databases or archives, to speed up searching of tables, or to remove duplicate data to save space. 

Cryptographic hash functions are used for various applications like authentication, message integrity, fingerprinting, data corruption detection, and digital signatures. Cryptographic hashing is distinctly different from encryption because cryptographic hash functions should be one-directional. The ideal cryptographic hash function should be deterministic, meaning that the same input value should always return the same hash value. The function should be quick to compute and be efficient. It should be infeasible to reverse the function and recover the plaintext from the hash digest. A small change in the input should result in a large change in the output, so that there is no correlation between the change in the input and the resulting change in the output. Finally, the function should not allow for **hash collisions**, meaning two different inputs mapping to the same output. Cryptographic hash functions are very similar to symmetric key block ciphers in that they operate on blocks of data. In fact, many popular hash functions are actually based on modified block ciphers. 
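
Three of these properties (fixed output size, determinism, and a small input change producing a large output change) can be observed with Python's `hashlib`. SHA-256 is an assumed choice here, and the helper names are illustrative.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of the input."""
    return hashlib.sha256(data).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Fixed size: one byte in or a million bytes in, the digest is 256 bits.
print(len(sha256(b"a")) * 8, len(sha256(b"a" * 1_000_000)) * 8)

# Deterministic: the same input always gives the same digest.
print(sha256(b"hello") == sha256(b"hello"))

# Changing one character flips roughly half of the output bits.
print(differing_bits(sha256(b"hello"), sha256(b"hellp")))
```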

### Hashing Algorithms
**MD5** is a popular and widely used cryptographic hashing function which operates on 512-bit blocks and generates 128-bit hash digests. **SHA-1** is part of the secure hash algorithm suite of functions. It operates on 512-bit blocks and generates 160-bit hash digests. SHA-1 has been used in popular protocols like TLS/SSL, PGP, SSH, and IPsec, and is also used in version control systems like Git, which uses hashes to identify revisions and ensure data integrity by detecting corruption or tampering.

A **MIC** or **message integrity check** is essentially a hash digest of the message. In other words, it can be considered a checksum for the message, ensuring that the contents of the message weren't modified in transit. But this is distinctly different from a MAC, as it doesn't use secret keys, which means the message isn't authenticated. There's nothing stopping an attacker from altering the message, recomputing the checksum, and modifying the MIC attached to the message. MICs can protect against accidental corruption or loss, but not against tampering or malicious actions.

**Md5sum** is a hashing program that calculates and verifies 128-bit MD5 hashes. As with all hashing algorithms, theoretically, there's an unlimited number of files that will have any given MD5 hash. Md5sum is used to verify the integrity of files. Similarly, **shasum** is a hashing program that calculates and verifies SHA hashes. It's also commonly used to verify the integrity of files.
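
What tools like these do internally can be approximated in a few lines: read the file in chunks, feed each chunk to the hash, and compare the final hex digest with a published value. This is a sketch of the idea, not the actual implementation of either program. The helper names are hypothetical, and SHA-256 is an assumed default.

```python
import hashlib
import os
import tempfile


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file incrementally so large files need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with a published checksum (case-insensitive)."""
    return file_digest(path, algorithm) == expected_hex.lower()


# Usage with a throwaway file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example file contents\n")
    demo_path = tmp.name

published = file_digest(demo_path)
print(verify_file(demo_path, published))
os.remove(demo_path)
```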

In the authentication scenario, passwords shouldn't be stored in plain text; instead, we must store a hash of each password. To protect against brute force attacks, the password should be run through the hashing function multiple times, which significantly increases the computation required for each password guess. **Rainbow tables** are used by bad actors to help speed up the process of recovering passwords from stolen password hashes. A rainbow table is essentially a pre-computed table of candidate password values and their corresponding hashes. The idea behind rainbow table attacks is to trade disk space for computation time by pre-computing the hashes and storing them in a table. An attacker can determine the corresponding password for a given hash by just looking up the hash in their rainbow table. This is unlike a brute force attack, where the hash is computed for each guess attempt. 
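
The lookup idea can be sketched with a plain pre-computed table. This is a simplified illustration: real rainbow tables use chains of hashes to compress storage, whereas this sketch stores every pair directly. The wordlist, file names, and the choice of unsalted SHA-256 are assumptions made for the example.

```shell
workdir=$(mktemp -d)
table="$workdir/table.txt"

# Pre-compute "hash password" pairs once for a tiny, hypothetical wordlist.
for pw in 123456 password letmein qwerty dragon; do
    printf '%s %s\n' "$(printf '%s' "$pw" | sha256sum | cut -d' ' -f1)" "$pw"
done > "$table"

# A stolen, unsalted password hash (computed here only to set up the demo).
stolen=$(printf '%s' 'letmein' | sha256sum | cut -d' ' -f1)

# Recovering the password is now a table lookup, not a fresh hash per guess.
recovered=$(awk -v h="$stolen" '$1 == h { print $2 }' "$table")
echo "recovered password: $recovered"
```

This should print `recovered password: letmein`.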

A **password salt** is additional randomized data that's added into the hashing function to generate a hash that's unique to the password and salt combination. A randomly chosen large salt is concatenated, or tacked onto the end of the password. The combination of salt and password is then run through the hashing function to generate a hash, which is then stored alongside the salt. What this means for an attacker is that they'd have to compute a rainbow table for each possible salt value. If a large salt is used, the computational and storage requirements to generate useful rainbow tables become almost infeasible. For example, early Unix systems used a 12-bit salt, which amounts to a total of 4,096 possible salts. Modern systems like Linux, BSD and Solaris use a 128-bit salt.
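
Salting and repeated hashing can be combined as in the sketch below. It assumes `sha256sum`, `od`, and `/dev/urandom` are available; the `stretch` helper, the iteration count of 100, and the sample password are hypothetical choices for illustration. Production systems normally rely on a purpose-built password hashing function rather than a hand-rolled loop like this one.

```shell
# stretch PASSWORD SALT: hash password+salt, then re-hash the result repeatedly.
stretch() {
    digest=$(printf '%s%s' "$1" "$2" | sha256sum | cut -d' ' -f1)
    i=0
    while [ "$i" -lt 100 ]; do
        digest=$(printf '%s%s' "$digest" "$2" | sha256sum | cut -d' ' -f1)
        i=$((i + 1))
    done
    printf '%s\n' "$digest"
}

password='correct horse'

# Two accounts with the same password get different random 128-bit salts.
salt_a=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
salt_b=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

hash_a=$(stretch "$password" "$salt_a")
hash_b=$(stretch "$password" "$salt_b")
echo "account A: $salt_a $hash_a"
echo "account B: $salt_b $hash_b"

# Login check: recompute from the stored salt and the supplied password.
check_a=$(stretch "$password" "$salt_a")
[ "$check_a" = "$hash_a" ] && echo 'account A: password accepted'
```

Because the salts differ, the two stored hashes differ even though the passwords are identical, so a single pre-computed table cannot cover both accounts.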


## Cryptography Applications
### Public Key Infrastructure
**PKI** or **Public Key Infrastructure** is a system that defines the creation, storage and distribution of digital certificates. A **digital certificate** is a file that proves that an entity owns a certain public key. A certificate contains information about the public key, the entity it belongs to and a digital signature from another party that has verified this information. If the signature is valid and we trust the entity that signed the certificate, then we can trust the public key to be used to securely communicate with the entity that owns it. The entity that's responsible for storing, issuing, and signing certificates is referred to as a **CA**, or **Certificate Authority**. It's a crucial component of the PKI system. There's also an **RA**, or **Registration Authority**, that's responsible for verifying the identities of any entities requesting certificates to be signed and stored with the CA. This role is usually lumped together with the CA. A central repository is needed to securely store and index keys, and a certificate management system of some sort makes it easier to manage access to stored certificates and the issuance of certificates. 

There are a few different types of certificates that have different applications or uses. An **SSL or TLS server certificate** is a certificate that a web server presents to a client as part of the initial secure setup of an SSL/TLS connection. The client (usually a web browser) verifies that the subject of the certificate matches the host name of the server the client is trying to connect to. The client will also verify that the certificate is signed by a certificate authority that the client trusts. It's possible for a certificate to be valid for multiple host names. In some cases, a wildcard certificate can be issued where the leftmost part of the host name is replaced with an asterisk, denoting validity for all host names within a domain. 

It's also possible for a server to use what's called a **self-signed certificate**. This is a certificate signed by the same entity that it identifies, rather than by a separate CA. This would basically be signing our own public key using our private key.

Another certificate type is an **SSL or TLS client certificate**. This is an optional component of SSL/TLS connections and is less commonly seen than server certificates. As the name implies, these are certificates that are bound to clients and are used to authenticate the client to the server, allowing access control to an SSL/TLS service. These are different from server certificates in that client certificates usually aren't issued by a public CA. Usually the service operator would have their own internal CA which issues and manages client certificates for their service. 

There are also **code signing certificates** which are used for signing executable programs. This allows users of these signed applications to verify the signatures and ensure that the application was not tampered with. It also lets them verify that the application came from the software author and is not a malicious twin. 

PKI is very much dependent on trust relationships between entities, and on building a network or chain of trust. This chain of trust has to start somewhere, and it starts with the **Root Certificate Authority**. Root certificates are self-signed because they are the start of the chain of trust, so there's no higher authority that can sign on their behalf. The Root Certificate Authority can use the self-signed certificate and the associated private key to begin signing other public keys and issuing certificates. It builds a sort of tree structure with the root private key at the top of the structure.

If the root CA signs a certificate and sets a field in the certificate called CA to true (part of the Basic Constraints extension), this marks the certificate as an **intermediary or subordinate CA**. What this means is that the entity that this certificate was issued to can now sign other certificates, and this CA is trusted in the same way as the root CA because the root vouches for it. An intermediary CA can also sign other intermediary CAs. This extension of trust from one root CA to intermediaries can begin to build a chain. A certificate that has no authority as a CA is referred to as an **End Entity or Leaf Certificate**. Similar to a leaf on a tree, it's the end of the tree structure and can be considered the opposite of the roots.

In order to bootstrap this chain of trust, we have to trust a root CA certificate, otherwise the whole chain is untrusted. This is done by distributing root CA certificates via alternative channels. Each major OS vendor ships a large number of trusted root CA certificates with their OS, and they typically have their own programs to facilitate distribution of root CA certificates.
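
A two-level chain of trust can be reproduced locally with the `openssl` command-line tool. This is a minimal sketch, assuming OpenSSL with its default configuration file is installed; the subject names, file names, and 30-day validity are hypothetical, and a real CA would add intermediaries, extensions, and much stricter key handling.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# 1. Root CA: a key pair plus a self-signed certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj '/CN=Demo Root CA' -keyout root.key -out root.crt 2>/dev/null

# 2. End entity: a key pair plus a certificate signing request (CSR).
openssl req -new -newkey rsa:2048 -nodes \
    -subj '/CN=www.example.test' -keyout leaf.key -out leaf.csr 2>/dev/null

# 3. The root CA signs the CSR with its private key, issuing the leaf certificate.
openssl x509 -req -in leaf.csr -CA root.crt -CAkey root.key \
    -CAcreateserial -days 30 -out leaf.crt 2>/dev/null

# 4. A client that trusts root.crt can now validate the leaf certificate.
verify_result=$(openssl verify -CAfile root.crt leaf.crt)
echo "$verify_result"
```

The last command should print `leaf.crt: OK`. Without `-CAfile root.crt`, the same check is expected to fail, because the demo root is not among the roots the system already trusts.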

The **X.509 standard** defines the format of digital certificates. It also defines a **certificate revocation list** or **CRL**, which is a means to distribute a list of certificates that are no longer valid. The fields defined in an X.509 certificate are:
* *Version* represents what version of the X.509 standard the certificate adheres to. 
* *Serial number* defines a unique identifier for the certificate, assigned by the CA, which allows the CA to manage and identify individual certificates. 
* *Certificate Signature Algorithm* indicates what public key algorithm is used for the public key and what hashing algorithm is used to sign the certificate. 
* *Issuer Name* contains information about the authority that signed the certificate. 
* *Validity* contains two subfields, Not Before and Not After, which define the dates between which the certificate is valid. 
* *Subject* contains identifying information about the entity the certificate was issued to. 
* *Subject Public Key Info* contains two subfields that define the algorithm of the public key along with the public key itself. 
* *Certificate Signature Algorithm* appears a second time, outside the signed portion of the certificate. It must match the earlier Certificate Signature Algorithm field. 
* *Certificate Signature Value* is the digital signature data itself. 

There are also certificate fingerprints, which aren't actually fields in the certificate itself, but are computed by clients when validating or inspecting certificates. These are just hash digests of the whole certificate. 
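
Several of these fields can be inspected with `openssl x509`. The sketch below first generates a throwaway self-signed certificate so that it is self-contained; the subject name and file names are hypothetical, and the exact output formatting varies between OpenSSL versions.

```shell
workdir=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
    -subj '/CN=demo.example.test' \
    -keyout "$workdir/demo.key" -out "$workdir/demo.crt" 2>/dev/null

# Print selected fields: serial number, issuer, subject and validity dates.
openssl x509 -in "$workdir/demo.crt" -noout -serial -issuer -subject -dates

# The fingerprint is not stored in the certificate; it is computed on demand.
fingerprint=$(openssl x509 -in "$workdir/demo.crt" -noout -fingerprint -sha256)
echo "$fingerprint"

# For a self-signed certificate the issuer and the subject are the same entity.
subject=$(openssl x509 -in "$workdir/demo.crt" -noout -subject)
issuer=$(openssl x509 -in "$workdir/demo.crt" -noout -issuer)
[ "${subject#subject=}" = "${issuer#issuer=}" ] && echo 'issuer and subject match'
```

Running `openssl x509 -in demo.crt -noout -text` instead prints every field, including both copies of the signature algorithm.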

An alternative to the centralized PKI model of establishing trust and binding identities is the Web of Trust. A **Web of Trust** is where individuals, instead of certificate authorities, sign other individuals' public keys. Before an individual signs a key, they should first verify the person's identity through an agreed upon mechanism. Once they determine the person is who they claim to be, signing their public key is basically vouching for this person. This process would be reciprocal, meaning both parties would sign each other's keys. Usually people who are interested in establishing a web of trust will organize what are called **Key Signing Parties**, where participants perform the same verification and signing. At the end of the party, everyone's public key should have been signed by every other participant, establishing a web of trust. In the future, when one of the participants in the initial key signing party establishes trust with a new member, the web of trust extends to include this new member and other individuals they also trust. This allows separate webs of trust to be bridged by individuals and allows the network of trust to grow.

### Securing Network Traffic
**HTTPS** is the secure version of HTTP, the Hypertext Transfer Protocol. It can also be called **HTTP over SSL or TLS** since it essentially encapsulates the HTTP traffic in an encrypted, secured channel utilizing SSL or TLS. TLS itself is independent of HTTPS; it's a generic protocol that permits secure communications and authentication over a network. TLS is also used to secure other communications aside from web browsing, like VoIP calls such as Skype or Hangouts, email, instant messaging, and even Wi-Fi network security. 

TLS grants us three things. One, a secure communication line, which means data being transmitted is protected from potential eavesdroppers. Two, the ability to authenticate both parties communicating, though typically, only the server is authenticated by the client. And three, the integrity of communications, meaning there are checks to ensure that messages aren't lost or altered in transit. TLS essentially provides a secure channel for an application to communicate with a service, but there must be a mechanism to establish this channel initially. This is what's referred to as a **TLS handshake**. 

The handshake process kicks off with a client establishing a connection with a TLS-enabled service and sending a message referred to in the protocol as ClientHello. This includes information about the client, like the version of TLS that the client supports, a list of cipher suites that it supports, and maybe some additional TLS options. The server then responds with a ServerHello message, in which it selects the highest protocol version in common with the client, and chooses a cipher suite from the list to use. It also transmits its digital certificate and a final ServerHelloDone message. The client will then validate the certificate that the server sent over to ensure that it's trusted and that it's for the appropriate host name. Assuming the certificate checks out, the client then sends a ClientKeyExchange message. This carries the client's part of the key exchange mechanism defined by the chosen cipher suite, which securely establishes a shared secret with the server. That secret will be used with a symmetric encryption cipher to encrypt all further communications. The client also sends a ChangeCipherSpec message indicating that it's switching to secure communications now that it has all the information needed to begin communicating over the secure channel. This is followed by an encrypted Finished message, which also serves to verify that the handshake completed successfully. This sequence describes the handshake used up to TLS 1.2; TLS 1.3 shortens it.

![tls_handshake.png](imgs/tls_handshake.png)

The server replies with a ChangeCipherSpec and an encrypted Finished message once the shared secret is received. Once complete, application data can begin to flow over the now secured channel. The session key is the shared symmetric encryption key used in TLS sessions to encrypt data being sent back and forth. When this key is exchanged under the protection of the server's public-private key pair, a compromise of the private key gives an attacker the potential to decode all previously transmitted messages that were encoded using keys protected by this private key.

To defend against this, there's a concept of **forward secrecy**. This is a property of a cryptographic system so that even in the event that the private key is compromised, the session keys are still safe. 

**SSH**, or **secure shell**, is a secure network protocol that uses encryption to allow access to a network service over unsecured networks. This protocol is very flexible and has provisions for allowing arbitrary ports, and the traffic over those ports, to be tunneled over the encrypted channel. It was originally designed as a secure replacement for the Telnet protocol and other unsecured remote login shell protocols like rlogin or rexec. It's very important that remote login and shell protocols use encryption, otherwise these services will be transmitting usernames and passwords, along with keystrokes and terminal output, in plain text. 

SSH uses public key cryptography to authenticate the remote machine that the client is connecting to, and has provisions to allow user authentication via client certificates. The SSH protocol is very flexible and modular, and supports a wide variety of key exchange mechanisms like Diffie-Hellman, along with a variety of symmetric encryption ciphers. It also supports a variety of authentication methods, including custom ones that we can write. When using public key authentication, a key pair is generated by the user who wants to authenticate. They must then distribute the public key to all systems that they want to authenticate to using the key pair. When authenticating, SSH will ensure that the public key being presented matches the private key, which should never leave the user's possession.
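
Key exchange mechanisms in the Diffie-Hellman family, which SSH supports and which are also the usual basis for forward secrecy in TLS, let two parties derive the same secret while only exchanging public values. The sketch below uses X25519, an elliptic-curve variant, and assumes OpenSSL 1.1.0 or later; the "client" and "server" file names are hypothetical and no real network traffic is involved.

```shell
workdir=$(mktemp -d)

# Each side generates a fresh (ephemeral) key pair for this session only.
openssl genpkey -algorithm X25519 -out "$workdir/client.key"
openssl genpkey -algorithm X25519 -out "$workdir/server.key"

# Only the public halves would be sent over the network.
openssl pkey -in "$workdir/client.key" -pubout -out "$workdir/client.pub"
openssl pkey -in "$workdir/server.key" -pubout -out "$workdir/server.pub"

# Each side combines its own private key with the peer's public key.
openssl pkeyutl -derive -inkey "$workdir/client.key" \
    -peerkey "$workdir/server.pub" -out "$workdir/client.secret"
openssl pkeyutl -derive -inkey "$workdir/server.key" \
    -peerkey "$workdir/client.pub" -out "$workdir/server.secret"

# Both sides arrive at the same shared secret without ever transmitting it.
if cmp -s "$workdir/client.secret" "$workdir/server.secret"; then
    agreement='match'
else
    agreement='mismatch'
fi
echo "shared secrets: $agreement"
```

If the ephemeral private keys are discarded after the session, a later compromise of a long-term private key does not reveal this shared secret, which is the idea behind forward secrecy.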

**PGP** or **Pretty Good Privacy** is an encryption application that allows authentication of data along with privacy from third parties, relying upon asymmetric encryption to achieve this. It's most commonly used for encrypted email communication, but it's also available as a full disk encryption solution or for encrypting arbitrary files, documents, or folders. PGP is widely regarded as very secure, with no known mechanisms to break the encryption via cryptographic or computational means. Originally, PGP used the RSA algorithm, but that was eventually replaced with DSA to avoid issues with licensing.

**VPN** or **Virtual Private Network** is a mechanism that allows us to remotely connect a host or network to an internal private network, passing the data over a public channel, like the Internet. We can think of this as a sort of encrypted tunnel through which all of our remote system's network traffic flows, transparently channeling our packets via the tunnel to the remote private network. A VPN can also be point-to-point, where two gateways are connected via a VPN, i.e. bridging two private networks through an encrypted tunnel. There are a number of VPN solutions using different approaches and protocols with differing benefits and tradeoffs. 

**IPsec**, or **Internet Protocol Security**, is a VPN protocol that was designed in conjunction with IPv6. IPsec works by encrypting an IP packet and encapsulating the encrypted packet inside an IPsec packet. This encrypted packet then gets routed to the VPN endpoint, where the packet is de-encapsulated, decrypted, and sent to the final destination. IPsec supports two modes of operation, transport mode and tunnel mode. When **transport mode** is used, only the payload of the IP packet is encrypted, leaving the IP headers untouched. Header values are hashed and verified, along with the transport and application layers. This would prevent the use of anything that would modify these values, like NAT or PAT. In **tunnel mode**, the entire IP packet, header, payload, and all, is encrypted and encapsulated inside a new IP packet with new headers. 

While not a VPN solution itself, **L2TP**, or **Layer 2 Tunneling Protocol**, is typically used to support VPNs. A common implementation of L2TP is in conjunction with IPsec when data confidentiality is needed, since L2TP doesn't provide encryption itself. It's a simple tunneling protocol that allows encapsulation of different protocols or traffic over a network that may not support the type of traffic being sent. L2TP can also just segregate and manage the traffic. ISPs will use L2TP to deliver network access to a customer's endpoint, for example. The combination of L2TP and IPsec is referred to as **L2TP/IPsec** and was officially standardized in IETF RFC 3193. The establishment of an L2TP/IPsec connection works by first negotiating an IPsec security association, which negotiates the details of the secure connection, including key exchange, if used. This negotiation can be authenticated using shared secrets, public keys, or a number of other mechanisms. Next, secure communication is established using **Encapsulating Security Payload**. It's a part of the IPsec suite of protocols, which encapsulates IP packets, providing confidentiality, integrity, and authentication of the packets. Once secure encapsulation has been established, negotiation and establishment of the L2TP tunnel can proceed. L2TP packets are now encapsulated by IPsec, protecting information about the private internal network. An important distinction to make in this setup is the difference between the tunnel and the secure channel. The tunnel is provided by L2TP, which permits the passing of unmodified packets from one network to another. The secure channel, on the other hand, is provided by IPsec, which provides confidentiality, integrity, and authentication of data being passed. 

**OpenSSL** is a commercial-grade utility toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols. It's also a general-purpose cryptography library. SSL/TLS is also used in some VPN implementations to secure network traffic, as opposed to individual sessions or connections. An example of this is **OpenVPN**, which uses the OpenSSL library to handle key exchange and encryption of data, along with control channels. This also enables OpenVPN to make use of all the ciphers implemented by the OpenSSL library. The authentication methods supported are pre-shared secrets, certificates, and username and password. Certificate-based authentication would be the most secure option, but it requires more support and management overhead since every client must have a certificate. Username and password authentication can be used in conjunction with certificate authentication, providing additional layers of security. It should be called out that OpenVPN doesn't implement username and password authentication directly; it uses modules to plug into authentication systems. OpenVPN can operate over either TCP or UDP, typically over port 1194. It supports pushing network configuration options from the server to a client, and it supports two interfaces for networking: it can rely on either a Layer 3 IP tunnel or a Layer 2 Ethernet tap. The Ethernet tap is more flexible, allowing it to carry a wider range of traffic. From the security perspective, OpenVPN supports up to 256-bit encryption through the OpenSSL library. It also runs in user space, limiting the seriousness of potential vulnerabilities that might be present. 

### Cryptographic Hardware
**Trusted Platform Module** or **TPM** is a dedicated crypto processor that's typically integrated into the hardware of a computer. A TPM offers secure generation of keys, random number generation, remote attestation, and data binding and sealing. A TPM has a unique secret RSA key burned into the hardware at the time of manufacture, which allows a TPM to perform things like hardware authentication. This can detect unauthorized hardware changes to a system. **Remote attestation** is the idea of a system authenticating its software and hardware configuration to a remote system. This enables the remote system to determine the integrity of the attesting system. This can be done using a TPM by generating a secure hash of the system configuration, using the unique RSA key embedded in the TPM itself. 

Another use of this secret hardware-backed encryption key is **data binding and sealing**. It involves using the secret key to derive a unique key that's then used for encryption of data. Basically, this binds encrypted data to the TPM and, by extension, to the system the TPM is installed in, since only the keys stored in hardware in the TPM will be able to decrypt the data. Data sealing is similar to binding since data is encrypted using the hardware-backed encryption key. But in order for the data to be decrypted, the TPM must be in a specified state. TPM is a standard with several revisions that can be implemented as a discrete hardware chip, integrated into another chip in a system, implemented in firmware or software, or virtualized in a hypervisor. The most secure implementation is the discrete chip, since these chip packages also incorporate physical tamper resistance to prevent physical attacks on the chip. 

Mobile devices have something similar referred to as a **secure element**. Similar to a TPM, it's a tamper-resistant chip often embedded in the microprocessor or integrated into the mainboard of a mobile device. It supplies secure storage of cryptographic keys and provides a secure environment for applications. An evolution of secure elements is the **Trusted Execution Environment** or **TEE**, which takes the concept a bit further. It provides a full-blown isolated execution environment that runs alongside the main OS. This provides isolation of the applications from the main OS and other applications installed there. It also isolates secure processes from each other when running in the TEE. 

TPMs have received criticism around trusting the manufacturer. Since the secret key is burned into the hardware at the time of manufacture, the manufacturer would have access to this key at that time. It is possible for the manufacturer to store the keys, which could then be used to duplicate a TPM and break the security the module is supposed to provide. 

TPMs are most commonly used to ensure platform integrity, preventing unauthorized changes to the system either in software or hardware, and for full disk encryption, utilizing the TPM to protect the entire contents of the disk. **Full Disk Encryption** or **FDE** is the practice of encrypting the entire drive in the system and not just sensitive files in the system. This allows us to protect the entire contents of the disk from data theft or tampering. There are a number of options for implementing FDE, like the commercial product PGP, BitLocker from Microsoft, which integrates very well with TPMs, FileVault 2 from Apple, and the open source software dm-crypt, which provides encryption for Linux systems. 

An FDE configuration will have one partition or logical partition that holds the data to be encrypted, typically the root volume, where the OS is installed. But in order for the volume to be booted, it must first be unlocked at boot time. Because the volume is encrypted, the BIOS can't access data on this volume for boot purposes. This is why FDE configurations will have a small unencrypted boot partition that contains elements like the kernel, the bootloader and an initrd. At boot time, these elements are loaded and then prompt the user to enter a passphrase to unlock the disk and continue the boot process. FDE can also incorporate the TPM, utilizing the TPM encryption keys to protect the disk. It can also use the TPM's platform integrity checks to prevent unlocking of the disk if the system configuration is changed. This protects against attacks like hardware tampering, and disk theft or cloning. 

The selection of random numbers is a very important concept in encryption, because if our number selection process isn't truly random, there can be some kind of pattern that an adversary can discover through close observation and analysis of encrypted messages over time. Something that isn't truly random is referred to as **pseudo-random**. It's for this reason that operating systems maintain what's referred to as an **entropy pool**. This is essentially a source of random data to help seed random number generators. There are also dedicated random number generators and pseudo-random number generators that can be incorporated into a security appliance or server to ensure that truly random numbers are chosen when generating cryptographic keys.
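
On Linux and other Unix-like systems, the kernel's random number generator, which is seeded from the entropy pool, is commonly read through `/dev/urandom`. The sketch below draws two 128-bit values from it; the variable names are hypothetical.

```shell
# Read 16 bytes (128 bits) from the kernel's random source and hex-encode them.
key1=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
key2=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')

echo "key 1: $key1"
echo "key 2: $key2"
```

Each run should print two different 32-character hex strings. The command `openssl rand -hex 16` produces a value of the same size using OpenSSL's own generator.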

## ACTIVITY: Create/inspect key pair, encrypt/decrypt and sign/verify using OpenSSL on a Linux server

#### STEP 1: Generating a private key
```shell
openssl genrsa -out private_key.pem 2048
```
This command creates a 2048-bit RSA key, called "private_key.pem". The name of the key is specified after the "-out" flag, and typically ends in ".pem". The number of bits is specified with the last argument. To view our new private key, use "cat" to print it to the screen, just like any other file:
```shell
cat private_key.pem
```
The contents of the private key file should look like a large jumble of random characters.

#### STEP 2: Generating a public key
```shell
openssl rsa -in private_key.pem -outform PEM -pubout -out public_key.pem
```
For viewing the public key file:
```shell
cat public_key.pem
```
It should look like a bunch of random characters, like the private key, but different and slightly shorter.

#### STEP 3: Creating a text file with some information
```shell
echo 'This is a secret message, for authorized parties only' > secret.txt
```
It will create a new text file called "secret.txt" which just contains the text, "This is a secret message, for authorized parties only".

#### STEP 4: Encrypt the file using our public key
```shell
openssl rsautl -encrypt -pubin -inkey public_key.pem -in secret.txt -out secret.enc
```
This creates the file "secret.enc", which is an encrypted version of "secret.txt". For displaying the encrypted file:
```shell
cat secret.enc
```
On viewing the contents of the encrypted file, the output is garbled.

#### STEP 5: Decrypt the file using our private key
```shell
openssl rsautl -decrypt -inkey private_key.pem -in secret.enc
```
This will print the contents of the decrypted file to the screen, which should match the contents of "secret.txt".
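
In OpenSSL 3.x the `rsautl` command is deprecated in favor of `pkeyutl`. The following sketch repeats steps 1 to 5 with `pkeyutl` and checks the round trip automatically. It runs in a temporary directory so that it does not overwrite the files created above; the `secret.dec` file name is an addition for this example.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the key pair and the message from the earlier steps.
openssl genrsa -out private_key.pem 2048 2>/dev/null
openssl rsa -in private_key.pem -pubout -out public_key.pem 2>/dev/null
echo 'This is a secret message, for authorized parties only' > secret.txt

# Encrypt with the public key, then decrypt with the private key.
openssl pkeyutl -encrypt -pubin -inkey public_key.pem -in secret.txt -out secret.enc
openssl pkeyutl -decrypt -inkey private_key.pem -in secret.enc -out secret.dec

# The round trip should reproduce the original file byte for byte.
if cmp -s secret.txt secret.dec; then roundtrip='ok'; else roundtrip='failed'; fi
echo "round trip: $roundtrip"
```

RSA can only encrypt inputs smaller than the key size, which is enough for this short message. Larger files are normally encrypted with a symmetric cipher whose key is then protected with RSA.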

#### STEP 6: Create a signed hash digest of the message
```shell
openssl dgst -sha256 -sign private_key.pem -out secret.txt.sha256 secret.txt
```
This creates a file called "secret.txt.sha256", which contains a digital signature: the SHA-256 hash digest of our secret text file, signed using our private key.

#### STEP 7: Verifying the file
```shell
openssl dgst -sha256 -verify public_key.pem -signature secret.txt.sha256 secret.txt
```
This should print `Verified OK`, indicating that the verification was successful and the file hasn't been modified by a malicious third party.
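
To see what a failed verification looks like, the sketch below repeats the signing step in a temporary directory, modifies the file, and verifies it again with the old signature. The exact failure message differs between OpenSSL versions, so the script only reports what the tool printed.

```shell
workdir=$(mktemp -d) && cd "$workdir"

# Recreate the key pair, the message and its signature.
openssl genrsa -out private_key.pem 2048 2>/dev/null
openssl rsa -in private_key.pem -pubout -out public_key.pem 2>/dev/null
echo 'This is a secret message, for authorized parties only' > secret.txt
openssl dgst -sha256 -sign private_key.pem -out secret.txt.sha256 secret.txt

# Verifying the untouched file succeeds.
good=$(openssl dgst -sha256 -verify public_key.pem \
    -signature secret.txt.sha256 secret.txt 2>&1) || true

# Append one line, then verify again using the old signature.
echo 'tampered' >> secret.txt
bad=$(openssl dgst -sha256 -verify public_key.pem \
    -signature secret.txt.sha256 secret.txt 2>&1) || true

echo "before tampering: $good"
echo "after tampering:  $bad"
```

The first line should report `Verified OK`, while the second should report a verification failure.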

## ACTIVITY: Hands on with Hashing


#### STEP 1: Creating a text file with some information
```shell
echo 'This is some text in a file, just so we have some data' > file.txt
```
It will create a new text file called "file.txt" which just contains the text, "This is some text in a file, just so we have some data".

#### STEP 2: Generate the MD5 sum for the file and store it
```shell
md5sum file.txt > file.txt.md5
```
This creates the MD5 hash, and saves it to a new file. For printing its contents to the screen:
```shell
cat file.txt.md5
```
This should print the hash to the terminal.

#### STEP 3: Verifying MD5 hash
```shell
md5sum -c file.txt.md5
```
This should print `file.txt: OK`, which indicates that the hash is valid.

#### STEP 4: Verifying an invalid file
Create a copy of the original 'file.txt' and then generate a new md5sum for the new file:
```shell
cp file.txt badfile.txt
md5sum badfile.txt > badfile.txt.md5
```
Note that the resulting hash is identical to the hash for our original file.txt despite the filenames being different. This is because hashing only looks at the data, not the metadata of the file. This can be verified as:
```shell
cat badfile.txt.md5
cat file.txt.md5
```
Next, we will modify 'badfile.txt' by appending a space character to the end of the file.
```shell
nano badfile.txt
```
Save the file above and try verifying the hash again.
```shell
md5sum -c badfile.txt.md5
```
A message such as `badfile.txt: FAILED` will be displayed, which shows that the verification wasn't successful.

To see how different the hash of the edited file is, generate a new hash and inspect it:
```shell
md5sum badfile.txt > new.badfile.txt.md5
cat new.badfile.txt.md5
```
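
The same experiment can be run without an interactive editor. This sketch repeats step 4 in a temporary directory and appends the space with `printf`; the `status` variable is an addition for this example.

```shell
workdir=$(mktemp -d) && cd "$workdir"
echo 'This is some text in a file, just so we have some data' > file.txt
cp file.txt badfile.txt
md5sum badfile.txt > badfile.txt.md5

# Append a single space to the end of the file without opening an editor.
printf ' ' >> badfile.txt

# md5sum -c exits with a non-zero status when a checksum does not match.
if md5sum -c badfile.txt.md5 >/dev/null 2>&1; then
    status='OK'
else
    status='FAILED'
fi
echo "badfile.txt: $status"
```

This should print `badfile.txt: FAILED`, showing that even a one-byte change is detected.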

#### STEP 5: Create the SHA1 sum and save it to a file
```shell
shasum file.txt > file.txt.sha1
```
This creates the SHA1 hash, and saves it to a new file. View it by printing it to the screen:
```shell
cat file.txt.sha1
```
This should print the hash to the terminal.

#### STEP 6: Verifying SHA1 hash
```shell
shasum -c file.txt.sha1
```
This should print `file.txt: OK`, which indicates that the hash is valid.

#### STEP 7: Create the SHA256 sum and save it to a file
```shell
shasum -a 256 file.txt > file.txt.sha256
```
The **-a** flag specifies the algorithm to use, and defaults to SHA1 if nothing is specified. This creates the SHA256 hash, and saves it to a new file. View it by printing it to the screen:
```shell
cat file.txt.sha256
```
This should print the hash to the terminal. Part of SHA256's increased security comes from its longer digest, which makes collisions much harder to find. We can see that the hash in this file is much longer than the one in the SHA1 file.

#### STEP 8: Verifying SHA256 hash
```shell
shasum -c file.txt.sha256
```
This should print `file.txt: OK`, which indicates that the hash is valid.
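
The digest sizes of the three algorithms used in this activity can be compared directly. This sketch assumes the GNU coreutils tools `md5sum`, `sha1sum` and `sha256sum` are installed (it uses them instead of `shasum` so that each algorithm has its own command); the input string is arbitrary.

```shell
for tool in md5sum sha1sum sha256sum; do
    digest=$(printf '%s' 'same input' | "$tool" | cut -d' ' -f1)
    # Each hexadecimal character encodes 4 bits of the digest.
    echo "$tool: ${#digest} hex chars = $(( ${#digest} * 4 )) bits"
done
```

The output should report 128 bits for MD5, 160 bits for SHA1 and 256 bits for SHA256.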

## AAA Security
The three A's of security are **authentication**, **authorization**, and **accounting**. Authentication is about verifying the identity of a user, while authorization describes what the user account does or doesn't have access to. A user may successfully authenticate to a system by presenting valid credentials, but if the username they authenticated as isn't also authorized to access the system in question, they'll be denied access. Accounting means keeping records of what resources and services users accessed and what they did while using them. 

### Authentication Best Practices
Consider the example of accessing an email account using our username and password. **Identification** is the idea of describing an entity uniquely; for example, our email address is our identity when logging into our email. **Authentication** is the process of proving our claim to that identity by supplying the password associated with it. Once authenticated with our email address and password, our identity is permitted to access our own email inbox, but not anyone else's inbox. This is **authorization**. These two concepts are usually distinguished from each other in the security world with the terms **authn** for **authentication** and **authz** for **authorization**. 

The basic authentication in the form of username and password is referred to as **single-factor authentication**. 

**Multifactor authentication** is a system where users are authenticated by presenting multiple pieces of information or objects. The many factors that comprise a multifactor authentication system can be categorized into three types:
* Something you know (something like a password, or a PIN for our bank or ATM card)
* Something you have (a physical token, like our ATM or bank card)
* Something you are (biometric data, like a fingerprint or iris scan)

Ideally, a multifactor system will incorporate at least two of these factors. The premise behind multifactor authentication is that an attacker would find it much more difficult to steal or clone multiple factors of authentication, assuming different types are used. 

**Physical tokens** can take a few different forms. Common ones include a USB device with a secret token on it, a standalone device which generates a token, or even a simple key used with a traditional lock. A commonly used kind of physical token generates a short-lived token, typically a number that's entered along with a username and password. This number is commonly called a **One-Time-Password** or **OTP** since it's a short-lived and constantly changing value. An example of this is the **RSA SecurID token**. It's a small, battery-powered device with an LCD display that shows a One-Time-Password that's rotated periodically. This is a **time-based token**, sometimes called a **TOTP**, and operates by having a secret seed or randomly generated value on the token that's registered with the authentication server. The seed value is used in conjunction with the current time to generate a One-Time-Password. The scheme requires the clocks of the authenticator token and the authentication server to be relatively synchronized. This is usually achieved by using the Network Time Protocol or NTP. 

There are also **counter-based tokens**, which use a secret seed value along with a secret counter value that's incremented every time a one-time password is generated on the device. The value is then incremented on the server upon successful authentication. This is more secure than time-based tokens for two reasons. First, the attacker would need to recover both the seed value and the counter value. Second, the counter value also increments whenever the token is used. So, a cloned token would only be useful for a short period of time before the counter value changes too much and the cloned token becomes unsynchronized from the real token and the server. These token generators can either be physical, dedicated devices, or they can be an app installed on a smartphone that performs the same function. 
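
A counter-based token can be sketched the same way, following the HMAC construction from RFC 4226. The `CounterTokenServer` class and its `look_ahead` window (which lets the server catch up when the user generated a few codes without submitting them) are hypothetical names chosen for this illustration.

```python
import hashlib
import hmac
import struct


def hotp(seed: bytes, counter: int, digits: int = 6) -> str:
    """One-time password derived from a shared seed and a counter value."""
    digest = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


class CounterTokenServer:
    """Tracks the next expected counter value for one registered token."""

    def __init__(self, seed: bytes, look_ahead: int = 3):
        self.seed = seed
        self.counter = 0
        self.look_ahead = look_ahead

    def verify(self, submitted: str) -> bool:
        for candidate in range(self.counter, self.counter + self.look_ahead + 1):
            if hmac.compare_digest(hotp(self.seed, candidate), submitted):
                self.counter = candidate + 1  # move past the value that was just used
                return True
        return False
```

Because the server advances its counter after every success, a previously used code (or a clone that has fallen behind) stops working.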

Another very common method for handling multifactor authentication today is the delivery of one-time password tokens using **SMS**. The problem with relying on SMS to transmit an additional authentication factor is that we're dependent on the security processes of the mobile carrier. SMS isn't encrypted, nor is it private. 

The other category of multifactor authentication is **biometrics**. Biometric authentication is the process of using unique physiological characteristics of an individual to identify them. By confirming the biometric signature, the individual is authenticated. A very common use of this in mobile devices is fingerprint scanners to unlock phones. This works by registering our fingerprints first, using an optical sensor that captures images of the unique pattern of our fingerprint. Much like how passwords should never be stored in plain text, biometric data used for authentication should never be stored directly. This is even more important for biometric data. Unlike passwords, biometrics are an inherent part of who someone is, so there are privacy implications to theft or leaks of biometric data. Unlike passwords, biometric characteristics are also extremely difficult to change in the event that they are compromised. So, instead of storing the fingerprint data directly, the data is run through a hashing algorithm and the resulting unique hash is stored. One advantage of biometric authentication over knowledge or token-based systems is that it identifies an individual more reliably, since biometric features aren't usually shareable. Other biometric systems use features like iris scans, facial recognition, and even voice. 

An evolution of physical tokens is **U2F** or **Universal Second Factor**. U2F incorporates a challenge-response mechanism along with public key cryptography to implement a more secure and more convenient second-factor authentication solution. U2F tokens are referred to as **security keys** and are available from a range of manufacturers. Security keys are essentially small embedded cryptoprocessors that have secure storage of asymmetric keys and additional slots to run embedded code.

The first step is registration, since the security key must be registered with a site or service. At registration time, the security key generates a private-public key pair unique to that site and submits the public key to the site for registration. It also binds the identity of the site to the key pair. Unique key pairs for each site are used for privacy reasons. If a site is compromised, this prevents cross-referencing registered public keys and discovering commonalities between sites based on registration data.

Once registered with the site, the next time we authenticate, we'll be prompted for our username and password as usual. But afterwards, we'll be prompted to tap our security key. Physically tapping the security key is a small check for user presence, to ensure malware can't authenticate on our behalf without our knowledge. This tap unlocks the private key stored in the security key, which is then used to authenticate. The authentication happens as a challenge-response process, which protects against replay attacks: an eavesdropper can't reuse an authentication session later, because the challenge and resulting response are different for every session. The site generates a challenge, essentially some randomized data, and sends it to the client that's attempting to authenticate. The client then selects the private key matching the site, uses this key to sign the challenge, and sends the signed data back. The site can now verify the signature using the public key that was registered earlier. If the signature checks out, the user is authenticated.

From a security perspective, this is a much more secure design than OTPs, because the authentication flow is protected from phishing attacks, given the interactive nature of the process. While U2F doesn't directly protect against man-in-the-middle attacks, the authentication should take place over a secure TLS connection, which would provide protection from this type of attack. Security keys are also resistant to cloning or forgery, because they have unique, embedded secrets on them and are protected from tampering. From the convenience perspective, this is a much nicer authentication flow compared to OTPs, since the user doesn't have to manually transcribe a string of numbers into the authentication dialog. All they have to do is tap their security key. 
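
The challenge-response exchange can be illustrated with a deliberately simplified sketch. It uses textbook RSA over two fixed, publicly known primes so that it runs with only the standard library; this is an assumption made for brevity and is not secure. Real security keys keep a separate hardware-protected key pair per site, and the function names here (`key_sign`, `site_verify`) are hypothetical.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. NOT secure, illustration only.
P, Q, E = 2**127 - 1, 2**89 - 1, 65537
N = P * Q                              # public modulus, registered with the site
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, never leaves the "key"


def _digest(origin: str, challenge: bytes) -> int:
    # The site identity is mixed into the signed data, so a response produced
    # for a look-alike site does not verify on the real one.
    raw = hashlib.sha256(origin.encode() + challenge).digest()
    return int.from_bytes(raw, "big") % N


def key_sign(origin: str, challenge: bytes) -> int:
    """Security key side: sign the site's challenge with the private key."""
    return pow(_digest(origin, challenge), D, N)


def site_verify(origin: str, challenge: bytes, signature: int) -> bool:
    """Site side: check the response against the registered public key."""
    return pow(signature, E, N) == _digest(origin, challenge)
```

A signature captured for one challenge fails verification against any later challenge, which is the replay protection described above.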

### Client Certificates
Certificates are public keys that are signed by a certificate authority or CA as a sign of trust. **Client certificates** operate very similarly to server certificates but are presented by clients and allow servers to authenticate and verify clients. It's common for VPN systems or enterprise Wi-Fi setups to use client certificates for authentication. In order to issue client certificates, an organization must set up and maintain CA infrastructure to issue and sign certificates. Part of **certificate authentication** also involves the client authenticating the server, giving us mutual authentication. This is a positive since the client can verify that it's talking to the real authentication server and not an impersonator. In this case, all clients that are using certificate authentication would also need to have the CA certificate in their certificate trust store. This establishes trust with the CA and allows the client to verify it's talking to the real server when trying to authenticate. 

Certificate authentication is like presenting identification at the airport. We show our ID or our certificate to prove who we are. The ID is checked to see if it was issued by an authority that is trusted by the verifier. The expiration date on our ID is also checked to ensure it's still valid. The same thing applies to certificate authentication, although certificates have two dates that need to be verified: not valid before, and not valid after. **Not valid before** checks whether the certificate is valid yet, since it's possible to have certificates issued for future use. **Not valid after** is a straightforward expiration date, after which the certificate is no longer valid. Airport authorities also have a list of specific IDs that are flagged. If our ID is on that list, then we'll be rejected for air travel. Similarly, the certificate will be checked against a **certificate revocation list** or **CRL**. This is a signed list published by the CA which defines certificates that have been explicitly revoked. One last step that's performed as part of the authentication server's verification process is proof of possession of the corresponding private key, since the certificate is a signed public key. Without proof of possession, there's nothing stopping an attacker from copying the certificate, since it's not considered secret, and pretending to be the owner. To avoid this, possession of the private key is verified through a **challenge-response mechanism**. This is where the server requests a randomized bit of data to be signed using the private key corresponding to the public key presented for authentication. This is similar to how the airport checks the photo on our ID to make sure we look like the person in the photo and aren't impersonating them.
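
The issuer, date, and revocation checks can be summarized in a short sketch. The `Certificate` class is a hypothetical, heavily simplified stand-in for a real X.509 certificate, and the sketch deliberately leaves out signature verification and the proof-of-possession challenge.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Certificate:  # hypothetical simplified certificate
    serial: int
    issuer: str
    not_before: datetime
    not_after: datetime


def check_certificate(cert: Certificate, trusted_issuers: set,
                      revoked_serials: set, now: datetime) -> str:
    """Return "ok" or the first reason the certificate is rejected."""
    if cert.issuer not in trusted_issuers:
        return "untrusted issuer"   # not issued by a CA in our trust store
    if now < cert.not_before:
        return "not yet valid"      # issued for future use
    if now > cert.not_after:
        return "expired"
    if cert.serial in revoked_serials:
        return "revoked"            # listed on the CA's CRL
    return "ok"
```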

### LDAP
**LDAP**, or **Lightweight Directory Access Protocol**, is an open industry-standard protocol for accessing and maintaining directory services, something similar to a phone or email directory. It's most commonly used as a backend for authentication of accounts. The LDAP specification describes the data structure of the directory itself and defines functions for interacting with the service, like performing lookups and modifying data. We can think of a directory like a database, but with more details or attributes describing entities within the database. The structure of an LDAP directory is a sort of **tree layout** and is optimized for retrieval of data more so than writing. Think of a phone book, which is used for looking up data far more often than for modifying it. 

Directories can be hosted across lots of different LDAP servers to facilitate more rapid lookups, and are kept in sync through replication of the directory. The data stored in a directory entry is similar to an address book: an entry for a particular user will contain information pertaining to that user account, like their first and last name, phone number, desk location, email address, login shell, and other such data. Along with object attributes, the location of an entry within the overall data structure conveys information about the object through its relationships to other objects. Because LDAP uses a tree structure called a **Directory Information Tree**, objects will have one parent and can have one or more children that belong to the parent object. We can also think about this like a file system with a root directory and folders under that. The folder an object belongs to provides information about that object because of its relationship to the parent object. In LDAP language, we call these folders **Organizational Units**, or **OUs**. They let us group related objects into units like people or groups to distinguish between individual user accounts and groups that accounts can belong to. This tree structure also allows for inheritance and nesting of objects, where attributes or properties of a parent object can be inherited by children further down the tree. Now, since it is possible for entries in the directory to share attributes, there must be a unique identifier for each entry. We call this the **Distinguished Name**, or **DN**. Coming back to our file system analogy, we can think of a DN as a full path to a file as opposed to a file name. This is because we can have multiple files with the same file name across a file system, but the fully qualified path to the file describes one unique file. 
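
The file-path analogy can be made concrete with a tiny in-memory directory keyed by DN. The entries and attribute values below are made up for illustration, and the helpers ignore DN escaping rules (for example, commas inside a value), so this is only a sketch of the idea.

```python
# Hypothetical directory: each key is a full DN, read from the entry up to the root.
directory = {
    "dc=example,dc=com": {},
    "ou=people,dc=example,dc=com": {},
    "ou=groups,dc=example,dc=com": {},
    "uid=jdoe,ou=people,dc=example,dc=com": {"cn": "Jane Doe", "loginShell": "/bin/bash"},
    "cn=jdoe,ou=groups,dc=example,dc=com": {"description": "personal group"},
}


def parent_dn(dn: str) -> str:
    """Drop the leftmost component, like moving up one folder in a path."""
    return dn.split(",", 1)[1] if "," in dn else ""


def children_of(dn: str) -> list:
    """All entries whose immediate parent is the given DN."""
    return sorted(entry for entry in directory if parent_dn(entry) == dn)
```

Both `jdoe` entries share a short name, but their DNs differ because they sit under different OUs, just as two files with the same name can live in different folders.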

Some of the more common operations that can be called by a client to interact with an LDAP server are:
* **Bind**, which is how clients authenticate to the server
* **StartTLS**, which permits a client to communicate using LDAP v3 over TLS
* **Search**, for performing lookups and retrieval of records
* **Add/delete/modify**, which are various operations to write data to the directory
* **Unbind**, which closes the connection to the LDAP server

There are many implementations of LDAP servers, like Active Directory from Microsoft and the open source OpenLDAP.

### RADIUS
**RADIUS**, or **Remote Authentication Dial-In User Service**, is a protocol that provides AAA services for users on a network. It's a very common protocol used to manage access to internal networks, Wi-Fi networks, email services, and VPN services. Originally designed to transport authentication information for remote dial-up users, it has evolved to carry a wide variety of standard authentication protocols like **EAP** or **Extensible Authentication Protocol**. Clients who want to authenticate to a RADIUS backend server don't directly interact with it. Instead, when a client wants to access a resource that's protected, the client presents authentication credentials to a NAS or Network Access Server, which relays the credentials to the RADIUS server. The RADIUS server then verifies the credentials using a configured authentication scheme. RADIUS servers can verify user authentication information stored in a flat file or can plug into external sources like SQL databases, LDAP, Kerberos, or Active Directory. Once the RADIUS server has evaluated the user authentication request, it replies with one of three messages: Access-Reject, Access-Challenge, or Access-Accept.
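
The relay pattern and the three possible replies can be sketched as follows. This is a toy model, not the RADIUS wire protocol: the user store, the fixed second-factor codes, and the function names are all assumptions for illustration.

```python
import hashlib
import hmac
from typing import Optional


def _hash(password: str) -> str:
    # Unsalted SHA-256 keeps the sketch short; a real store would use a salted, slow hash.
    return hashlib.sha256(password.encode()).hexdigest()


# Hypothetical flat-file style user store. "otp" is a fixed stand-in for a second factor.
USERS = {
    "alice": {"password_hash": _hash("wonderland"), "otp": None},
    "bob": {"password_hash": _hash("builder"), "otp": "123456"},
}


def radius_server(username: str, password: str, otp: Optional[str] = None) -> str:
    user = USERS.get(username)
    if user is None or not hmac.compare_digest(user["password_hash"], _hash(password)):
        return "Access-Reject"
    if user["otp"] is not None:
        if otp is None:
            return "Access-Challenge"  # ask the client for additional information
        if not hmac.compare_digest(user["otp"], otp):
            return "Access-Reject"
    return "Access-Accept"


def network_access_server(username: str, password: str, otp: Optional[str] = None) -> bool:
    """The NAS makes no decision itself; it relays credentials and enforces the reply."""
    return radius_server(username, password, otp) == "Access-Accept"
```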

### Kerberos
**Kerberos** is a network authentication protocol that uses **tickets** to allow entities to prove their identity over potentially insecure channels and to provide mutual authentication. It also uses symmetric encryption to protect protocol messages from eavesdropping and replay attacks. Kerberos supports AES encryption and implements checksums to ensure data integrity and confidentiality. When joined to a Windows domain, Windows 2000 and newer versions will use Kerberos as the default authentication protocol. Microsoft also implemented their own Kerberos service with some modifications to the open protocol, like the addition of the RC4 stream cipher. Tickets are a sort of token that proves our identity. They can be used for authenticating to services protected using Kerberos, in other words, services within the Kerberos realm. The authentication tickets let users authenticate to services without requiring username and password authentication for every service individually. A ticket will expire after some time, but there are provisions for automatic, transparent renewal of the ticket. 

First, a user that wants to authenticate enters their username and password on their client machine. Their Kerberos client software then takes the password and generates a symmetric encryption key from it. Next, the client sends a plain text message to the Kerberos **AS** or **Authentication Server**, which includes the user ID of the authenticating user. Neither the password nor the secret key derived from the password is transmitted. The AS uses the user ID to check if there is an account in the authentication database, like an Active Directory server. If so, the AS generates the same secret key from the hashed password stored in the Key Distribution Center server. The AS then uses the secret key to encrypt and send a message containing the **client/TGS session key**. This is a secret key used for encrypting communications with the **Ticket Granting Service** or **TGS**, which is already known by the Authentication Server. The AS also sends a second message with a **Ticket Granting Ticket** or **TGT**, which is encrypted using the TGS secret key. The Ticket Granting Ticket has information like the client ID, ticket validity period, and the client/TGS session key. The client decrypts the first message using the secret key derived from the user's password, which yields the client/TGS session key. The second message, the Ticket Granting Ticket, stays encrypted with the TGS secret key; the client can't read it and simply keeps it to present later. Now the client has enough information to authenticate with the Ticket Granting Service.

Since the client has authenticated and received a valid Ticket Granting Ticket, it can use the Ticket Granting Ticket to request access to services within the Kerberos realm. This is done by sending a message to the Ticket Granting Service with the encrypted Ticket Granting Ticket received from the AS earlier, along with the service name or ID the client is requesting access to. The client also sends a message containing an authenticator, which has the client ID and a time stamp and is encrypted with the client/TGS session key from the AS. The Ticket Granting Service decrypts the Ticket Granting Ticket using the TGS secret key, which provides it with the client/TGS session key. It then uses that key to decrypt the authenticator message. Next, it checks the client IDs in these two messages to ensure they match. If they do, it sends two messages back to the client. The first one contains the client-to-server ticket, which is comprised of the client ID, client address, validity period, and the client-server session key, encrypted using the service's secret key. The second message contains the client-server session key itself and is encrypted using the client/TGS session key.

Finally, the client has enough information to authenticate itself to the **service server** or **SS**. The client sends two messages to the SS. The first message is the encrypted client-to-server ticket received from the Ticket Granting Service. The second is a new authenticator with the client ID and time stamp, encrypted using the client-server session key. The SS decrypts the first message using its secret key, which provides it with the client-server session key. The key is then used to decrypt the second message, and the SS compares the client ID in the authenticator to the one included in the client-to-server ticket. If these IDs match, the SS sends a message containing the time stamp from the client-supplied authenticator, encrypted using the client-server session key. The client then decrypts this message and checks that the time stamp is correct, which authenticates the server. If this all succeeds, the server grants the client access to the requested service. 
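
The three exchanges (AS, TGS, service server) can be traced end to end in a compact simulation. Everything here is a teaching sketch under stated assumptions: `seal`/`unseal` are a toy cipher (SHA-256 keystream XOR plus an HMAC tag) standing in for real symmetric encryption, the user, password, and service names are invented, and details like client addresses and clock-skew checks are omitted.

```python
import hashlib
import hmac
import json
import os
import time


def _keystream(key: bytes, n: int) -> bytes:
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def seal(key: bytes, payload: dict) -> bytes:
    """Toy stand-in for symmetric encryption. NOT secure."""
    plain = json.dumps(payload).encode()
    body = bytes(a ^ b for a, b in zip(plain, _keystream(key, len(plain))))
    return hmac.new(key, body, hashlib.sha256).digest() + body


def unseal(key: bytes, blob: bytes) -> dict:
    tag, body = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(key, body, hashlib.sha256).digest()):
        raise ValueError("wrong key or tampered message")
    return json.loads(bytes(a ^ b for a, b in zip(body, _keystream(key, len(body)))))


def key_from_password(password: str) -> bytes:
    return hashlib.sha256(password.encode()).digest()


# Secrets known to the KDC (hypothetical values).
USER_KEYS = {"alice": key_from_password("correct horse")}
TGS_KEY = os.urandom(32)
SERVICE_KEYS = {"fileserver": os.urandom(32)}


def as_exchange(user_id: str):
    session = os.urandom(32).hex()  # client/TGS session key
    msg = seal(USER_KEYS[user_id], {"tgs_session_key": session})
    tgt = seal(TGS_KEY, {"client": user_id, "tgs_session_key": session,
                         "expires": int(time.time()) + 3600})
    return msg, tgt


def tgs_exchange(tgt: bytes, authenticator: bytes, service: str):
    ticket = unseal(TGS_KEY, tgt)
    session = bytes.fromhex(ticket["tgs_session_key"])
    auth = unseal(session, authenticator)
    if auth["client"] != ticket["client"] or ticket["expires"] < time.time():
        raise PermissionError("client mismatch or expired TGT")
    cs_key = os.urandom(32).hex()  # client-server session key
    service_ticket = seal(SERVICE_KEYS[service], {"client": ticket["client"], "cs_key": cs_key})
    return service_ticket, seal(session, {"cs_key": cs_key})


def service_exchange(service: str, service_ticket: bytes, authenticator: bytes) -> bytes:
    ticket = unseal(SERVICE_KEYS[service], service_ticket)
    cs_key = bytes.fromhex(ticket["cs_key"])
    auth = unseal(cs_key, authenticator)
    if auth["client"] != ticket["client"]:
        raise PermissionError("client mismatch")
    return seal(cs_key, {"timestamp": auth["timestamp"]})  # proves the server knows the key


def login_and_access(user: str, password: str, service: str) -> bool:
    client_key = key_from_password(password)  # the password itself is never sent
    msg, tgt = as_exchange(user)
    tgs_session = bytes.fromhex(unseal(client_key, msg)["tgs_session_key"])
    auth1 = seal(tgs_session, {"client": user, "timestamp": int(time.time())})
    service_ticket, key_msg = tgs_exchange(tgt, auth1, service)
    cs_key = bytes.fromhex(unseal(tgs_session, key_msg)["cs_key"])
    stamp = int(time.time())
    reply = service_exchange(service, service_ticket, seal(cs_key, {"client": user, "timestamp": stamp}))
    return unseal(cs_key, reply)["timestamp"] == stamp  # mutual authentication check
```

With the right password, `login_and_access("alice", "correct horse", "fileserver")` completes all three exchanges; with a wrong password the client cannot open the first AS message, so the flow stops there.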

Kerberos has received some criticism because it's a single monolithic service. This creates a single point of failure. If the Kerberos service goes down, new users won't be able to authenticate and log in. Aside from the availability issues, if the central Kerberos server is compromised, the attacker would be able to impersonate any user by generating valid Kerberos tickets for that user's account. Kerberos enforces strict time requirements, requiring the client and server clocks to be relatively closely synchronized; otherwise, authentication will fail. This is usually accomplished by using NTP to keep both parties synchronized using an NTP server. The trust model of Kerberos is also problematic, since it requires clients and services to have an established trust in the Kerberos server in order to authenticate using Kerberos. This means it's not possible for users to authenticate using Kerberos from unknown or untrusted clients. So things like BYOD, or Bring Your Own Device, and cloud computing are incompatible, or at least very challenging to implement securely, with Kerberos authentication.

### TACACS+
**TACACS+**, pronounced **TACACS plus**, stands for **Terminal Access Controller Access-Control System Plus**. It's a Cisco-developed AAA protocol, whereas **XTACACS** or **Extended TACACS** is a Cisco proprietary extension on top of TACACS. TACACS+ is primarily used for device administration authentication, authorization, and accounting, as opposed to RADIUS, which is mostly used for network access AAA. TACACS+ is mainly used as an authentication system for network infrastructure devices, which tend to be high-value targets for attackers. 

### Single Sign-On
**Single Sign-On** or **SSO** is an authentication concept that allows users to authenticate once to be granted access to a lot of different services and applications. Since reauthentication for each service isn't needed, users don't need multiple sets of usernames and passwords across a mix of applications and services. SSO is accomplished by authenticating to a central authentication server, like an LDAP server. This then provides a cookie, or token that can be used to get access to applications configured to use SSO. 

Kerberos is actually a good example of an SSO authentication service. The user would authenticate against the Kerberos service once, which would then grant them a ticket granting ticket. This can then be presented to the ticket granting service in place of traditional credentials. So, the user can enter credentials once and gain access to a variety of services. 

SSO is really convenient. It allows users to have one set of credentials that grants access to lots of services, making it less likely that passwords will be written down or stored insecurely. This should also reduce the overhead for password assistance support and remove time spent re-authenticating throughout the workday. On the downside, an attacker that manages to compromise an account has a lot more access under an SSO scheme, since the user's credentials grant access to all applications and services that account is permitted to access. This can be mitigated by using multifactor authentication in conjunction with an SSO scheme. But SSO also opens a new channel of attack: theft of SSO session cookies or tokens. Instead of targeting credentials, attackers can try to steal the SSO tokens directly, which will permit wide access, if only for a short amount of time. Stealing these tokens also lets an attacker dodge multifactor authentication protections, since the session token permits access without requiring full authentication until the token expires. 
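
To see why a stolen token is so valuable, consider a minimal sketch of a signed SSO token. The signing key, token layout, and function names are assumptions for illustration and do not reflect any specific SSO product.

```python
import base64
import hashlib
import hmac
import json

SSO_KEY = b"demo-signing-key"  # hypothetical secret held by the central auth server


def issue_token(user: str, now: int, lifetime: int = 3600) -> bytes:
    """Issued once, after the user has fully authenticated (password, MFA, ...)."""
    body = base64.urlsafe_b64encode(json.dumps({"user": user, "exp": now + lifetime}).encode())
    tag = hmac.new(SSO_KEY, body, hashlib.sha256).hexdigest().encode()
    return body + b"." + tag


def verify_token(token: bytes, now: int):
    """Any participating service accepts the token alone, with no password prompt."""
    body, _, tag = token.rpartition(b".")
    expected = hmac.new(SSO_KEY, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        return None  # forged or modified token
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["user"] if now < claims["exp"] else None
```

`verify_token` never asks for a password or a second factor, so whoever holds a valid token is treated as the user until `exp` passes. Short lifetimes limit the damage of token theft.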

An example of an SSO system is **OpenID**, a decentralized authentication system. This is an open standard that allows participating sites, known as **Relying Parties**, to authenticate users through a third-party authentication service. This lets sites permit authentication without having their own authentication infrastructure, which can be tricky to implement and maintain. It also lets users access the site without creating a new account, simplifying access management across a wide variety of sites. Instead, a user just needs to already have an account with an identity provider. To ask for authentication, a relying party first looks up the OpenID provider, then establishes a shared secret with the provider if one doesn't already exist. The shared secret is used to validate the OpenID provider's messages. Then, the user is redirected or asked to authenticate in a new window through the identity provider's login flow. Once authenticated, the user is prompted to confirm whether they trust the relying party. Once confirmed, credentials are relayed to the relying party, typically in the form of a token rather than actual user credentials, which indicates the user is now authenticated to the service. 

### Access Control
In Kerberos, the user authenticates and receives a ticket-granting ticket. This can then be used to request access to a specific service by sending a request to the ticket-granting service. This is when authorization comes into play, since the ticket-granting service will decide whether or not the user in question is permitted to access the service being requested. If they're not permitted or authorized to access the service, the request will be denied by the ticket-granting service. If the user is authorized, the ticket-granting service returns a ticket, which authorizes the user to access the service. One very popular open standard for authorization and access delegation is OAuth, used by companies like Google, Facebook, and Microsoft. 

**OAuth** is an open standard that allows users to grant third-party websites and applications access to their information without sharing account credentials. This can be thought of as a form of access delegation, because access to the user's account is being delegated to the third party. This is accomplished by prompting the user to confirm that they agree to permit the third party access to certain information about their account. Typically, this prompt will specifically list which pieces of information or access are being requested. Once confirmed, the identity provider will supply the third party with a token that gives them access to the user's information. This token can then be used by the third party to access data or services offered by the identity provider directly on behalf of the user. OAuth is commonly used to grant third-party applications access to APIs offered by large internet companies like Google, Microsoft, and Facebook. 
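
The delegation idea, a token that carries only the access the user agreed to, can be sketched with a toy identity provider. The class, method names, and stored data below are hypothetical and are not the OAuth protocol's actual endpoints or message formats.

```python
import secrets


class IdentityProvider:
    """Toy provider that hands out scope-limited tokens instead of credentials."""

    def __init__(self):
        self._grants = {}  # token -> (user, granted scopes)
        self._data = {"alice": {"email": "alice@example.com", "contacts": ["bob"]}}

    def authorize(self, user: str, requested_scopes: set, user_consents: bool):
        # The consent prompt lists exactly what the third party is asking for.
        if not user_consents:
            return None
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, frozenset(requested_scopes))
        return token

    def read(self, token: str, scope: str):
        user, scopes = self._grants.get(token, (None, frozenset()))
        if scope not in scopes:
            raise PermissionError(f"token does not grant access to {scope!r}")
        return self._data[user][scope]
```

The third party only ever holds the token, never the password, and the token opens just the scopes listed in the consent prompt.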

It's important that users pay attention to what third party is requesting access and what exactly they're granting access to. OAuth permissions can be used in phishing style attacks to gain access to accounts without requiring credentials to be compromised. This works by sending phishing emails to potential victims that look like legitimate OAuth authorization requests, which ask the user to grant access to some aspects of their account through OAuth. Once the user grants access, the attacker has access to the account through the OAuth authorization token. 

It's important to distinguish between OAuth and OpenID. OAuth is specifically an authorization system and **OpenID** is an authentication system, though they're usually used together. OpenID Connect is an authentication layer built on top of OAuth 2.0, designed to improve upon OpenID and build better integration with OAuth authorizations.

Since TACACS+ is a full AAA system, it also handles authorization along with authentication. This is done once a user is authenticated, by allowing or disallowing access for the user account to run certain commands or access certain devices. This lets us grant admin access to users who administer devices while still allowing less privileged access to other users when necessary. 

Here's an example. Since our networking teams are responsible for configuring and maintaining our network switches, routers, and other infrastructure, we'd give them admin access to our network equipment. Meanwhile, we can give limited read-only access to our support team, since they don't need to be able to make changes to switch configurations in their jobs. Read-only access is enough for them to troubleshoot problems. The rest of the user accounts would have no access at all and wouldn't be permitted to connect to the networking infrastructure. More sophisticated or configurable AAA systems may even allow further refinement of authorization down to the command level. This gives us much more flexibility in how access is granted to specific users or groups in our organization.

RADIUS also allows us to authorize network access. For example, we may want to permit some users to have Wi-Fi and VPN access, while others may not need this. When they authenticate to the RADIUS server, if the authentication succeeds, the RADIUS server returns configuration information to the network access server. This includes authorizations, which specify what network services the user is permitted to access.
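
The three access tiers from the networking example (admin, read-only, none) can be expressed as a small command-level check. The role names, user names, and the convention that read-only means `show` commands are assumptions made for this sketch.

```python
# Hypothetical role definitions: None means every command is allowed.
ROLE_COMMANDS = {
    "network-admin": None,
    "support": {"show"},  # read-only troubleshooting commands
}

# Hypothetical role assignments. Users missing from this table get no access.
USER_ROLES = {"nina": "network-admin", "sam": "support"}


def is_authorized(user: str, command: str) -> bool:
    """Decide whether an already authenticated user may run a device command."""
    role = USER_ROLES.get(user)
    if role is None:
        return False  # no role, no access to network devices
    allowed = ROLE_COMMANDS[role]
    return allowed is None or command.split()[0] in allowed
```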

### Access Control List
An **access control list**, or **ACL**, is a way of defining permissions or authorizations for objects. The most common case we may encounter deals with file system permissions. A file system would have an ACL, which is a table or database with a list of entries specifying access rights for individuals or groups for various objects on the file system, like folders, files, or programs. These individual access permissions per object are called **Access Control Entries**, and they make up the ACL. Individual entries can define permissions controlling whether or not a user or group can read, write, or execute objects. ACLs are also used extensively in network security, applying access controls to routers, switches, and firewalls. Network ACLs are used for restricting and controlling access to hosts or services running on hosts within our network. Network ACLs can be defined for incoming and outgoing traffic. They can also be used to restrict external access to systems and limit outgoing traffic to enforce policies or to prevent unauthorized outbound data transfers.
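
A file system ACL can be modeled as a list of access control entries per object. The paths, users, and groups here are invented for illustration, and the model only supports allow entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessControlEntry:
    principal: str          # a user name or a group name
    permissions: frozenset  # subset of {"read", "write", "execute"}


# Hypothetical ACLs: one list of entries per file system object.
ACLS = {
    "/srv/reports/q1.txt": [
        AccessControlEntry("alice", frozenset({"read", "write"})),
        AccessControlEntry("staff", frozenset({"read"})),
    ],
}


def is_allowed(path: str, user: str, groups: set, permission: str) -> bool:
    """True if any entry for the user or one of their groups grants the permission."""
    principals = {user, *groups}
    return any(
        ace.principal in principals and permission in ace.permissions
        for ace in ACLS.get(path, [])
    )
```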

### Tracking Usage and Access
Accounting means keeping records of what resources and services our users access or what they did when they were using our systems. A critical component of this is **auditing**, which involves reviewing these records to ensure that nothing is out of the ordinary. For example, a TACACS+ server would be more concerned with keeping track of user authentication, what systems they authenticated to, and what commands they ran during their session. This is because TACACS+ is a device access AAA system that manages who has access to our network devices and what they do on them. 

Cisco's AAA system supports accounting of individual commands executed, connections to and from network devices, commands executed in privileged mode, and network services and system details like configuration reloads or reboots. 

RADIUS will track details like session duration, client location, and bandwidth or other resources used during the session. This is because RADIUS is a network access AAA system, so it tracks details about network access and usage. RADIUS accounting kicks off with the Network Access Server sending an accounting request packet to the accounting server that contains an event record to be logged. This starts the accounting session on the server. The server replies with an accounting response indicating that the message was received. The NAS will continue sending periodic accounting messages with statistics of the session until an accounting stop packet ends the session. RADIUS accounting can be used for billing purposes by ISPs because it records the length of a session and the amount of data sent and received by the user. This data can also be used to enforce data or time quotas, limiting the duration of sessions or restricting the amount of data that can be sent or received. But this accounting information isn't detailed and won't contain specifics of what exactly the user did during the session. Information like websites visited or what protocols were used isn't recorded. 
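
The start, periodic update, and stop sequence can be sketched with a small in-memory accounting server. The record fields are simplified assumptions; only the status names (Start, Interim-Update, Stop) mirror the terms RADIUS accounting uses.

```python
class AccountingServer:
    """Toy accounting server that keeps per-session usage totals."""

    def __init__(self):
        self.sessions = {}

    def handle(self, record: dict) -> str:
        sid = record["session_id"]
        if record["status"] == "Start":
            self.sessions[sid] = {"user": record["user"], "start": record["time"],
                                  "seconds": 0, "bytes": 0, "open": True}
        else:  # "Interim-Update" or "Stop" carry the latest session statistics
            session = self.sessions[sid]
            session["seconds"] = record["time"] - session["start"]
            session["bytes"] = record["bytes_in"] + record["bytes_out"]
            if record["status"] == "Stop":
                session["open"] = False
        return "Accounting-Response"  # acknowledges that the record was received
```

The totals are enough for billing or quota enforcement, but nothing in the records says which sites or protocols the user accessed.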

## Securing Networks
### Network Hardening Best Practices
**Network hardening** is the process of securing a network by reducing its potential vulnerabilities through configuration changes and other specific steps. Networks are much safer when we disable access to network services that aren't needed and enforce access restrictions. 

**Implicit deny** is a network security concept where anything not explicitly permitted or allowed should be denied. This is different from blocking all traffic, since an implicit deny configuration will still let traffic pass that we've defined as allowed using ACL configurations. This can usually be configured on a firewall, which makes it easier to build secure firewall rules. Instead of requiring us to specifically block all traffic we don't want, we can just create rules for traffic that we need to go through. We can think of this as **whitelisting**, as opposed to **blacklisting**. This is slightly less convenient, since a new rule must be defined before a new service will work, but it's a much more secure configuration. 
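
Implicit deny boils down to a rule list with a default of "no". In this sketch the allow rules (SSH from an internal range, HTTPS from anywhere) are example values, not a recommended policy.

```python
import ipaddress

# Hypothetical allow rules: (permitted source network, destination port).
ALLOW_RULES = [
    ("10.0.0.0/8", 22),   # SSH, internal clients only
    ("0.0.0.0/0", 443),   # HTTPS from anywhere
]


def is_permitted(src_ip: str, dst_port: int) -> bool:
    """Traffic passes only if some rule explicitly allows it."""
    src = ipaddress.ip_address(src_ip)
    for network, port in ALLOW_RULES:
        if src in ipaddress.ip_network(network) and dst_port == port:
            return True
    return False  # implicit deny: nothing matched, so the traffic is dropped
```

Adding a new service means adding a rule; forgetting to do so fails closed rather than open.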

Another very important component of network security is **monitoring and analyzing traffic** on our network. It lets us establish a baseline of what our typical network traffic looks like. This is key because in order to know what unusual or potential attack traffic looks like, we need to know what normal traffic looks like. We can do this through network traffic monitoring and logs analysis. 

**Analyzing logs** is the practice of collecting logs from different network devices, and sometimes client devices, on our network, then performing an automated analysis on them. This will highlight potential intrusions, signs of malware infections, or atypical behavior. We'd want to analyze things like firewall logs, authentication server logs, and application logs. Analysis of logs involves looking for specific log messages of interest. With firewall logs, for example, attempted connections to an internal service from an untrusted source address may be worth investigating. Connections from the internal network to known address ranges of botnet command and control servers could mean there's a compromised machine on the network. 

**Logs analysis systems** are configured using user-defined rules to match interesting or atypical log entries. These can then be surfaced through an alerting system to let security engineers investigate the alert. Part of this alerting process would also involve categorizing the alert based on the rule matched. We'd also need to assign a priority to facilitate this investigation and to permit better searching or filtering. Alerts could take the form of sending an email or an SMS with information and a link to the event that was detected. 
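
As a rough illustration of rule-based matching, here is a small Python sketch. The rule names, regular expressions, and log line formats are all made up for the example; a real logs analysis system would have its own rule language.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """A user-defined detection rule (hypothetical structure)."""
    name: str
    pattern: str    # regular expression applied to each log line
    category: str   # used to categorize the resulting alert
    priority: int   # 1 is most urgent


RULES = [
    Rule("ssh-auth-failure",
         r"Failed password for (?P<user>\S+) from (?P<src>\S+)",
         "authentication", 2),
    Rule("fw-deny-internal", r"DENY .* dst=10\.", "firewall", 1),
]


def match_alerts(log_lines, rules=RULES):
    """Return alerts for every (line, rule) match, most urgent first."""
    alerts = []
    for line in log_lines:
        for rule in rules:
            if re.search(rule.pattern, line):
                alerts.append({
                    "rule": rule.name,
                    "category": rule.category,
                    "priority": rule.priority,
                    "line": line,
                })
    return sorted(alerts, key=lambda alert: alert["priority"])
```

Each alert carries the category and priority of the rule that produced it, which is what makes later searching and filtering practical.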

**Normalizing logged data** is an important step, since logs from different devices and systems may not be formatted in a common way. We might need to convert log components into a common format to make analysis easier for analysts and rule-based detection systems. This also makes correlation analysis easier. **Correlation analysis** is the process of taking log data from different systems and matching events across the systems. So, if we see a suspicious connection to our authentication server coming from a suspect source address in the firewall logs, we might want to correlate that logged connection with the log data of the authentication server. That would show us any authentication attempts made by the suspicious client. This type of logs analysis is also super important in investigating and recreating the events that happened once a compromise is detected. This is usually called a **post-fail analysis**, since it's investigating how a compromise happened after the breach is detected. 
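
The following Python sketch shows the idea of normalizing two differently formatted logs and then correlating them by source address. Both log formats and all field names are invented for the example.

```python
from datetime import datetime


def normalize_firewall(line):
    """Parse a made-up firewall format:
    '2024-01-05T10:00:00 src=203.0.113.7 dst=10.0.0.20 action=ALLOW'
    """
    timestamp, *fields = line.split()
    values = dict(field.split("=", 1) for field in fields)
    return {
        "time": datetime.fromisoformat(timestamp),
        "source": "firewall",
        "src_ip": values["src"],
        "event": values["action"],
    }


def normalize_auth(line):
    """Parse a made-up authentication server format:
    '05/01/2024 10:00:07|203.0.113.7|alice|FAIL'
    """
    timestamp, ip, user, result = line.split("|")
    return {
        "time": datetime.strptime(timestamp, "%d/%m/%Y %H:%M:%S"),
        "source": "auth",
        "src_ip": ip,
        "event": f"login {result} for {user}",
    }


def correlate(events, suspect_ip):
    """Build a timeline of every normalized event from one source address."""
    matching = (event for event in events if event["src_ip"] == suspect_ip)
    return sorted(matching, key=lambda event: event["time"])
```

Once both logs share the same keys and timestamp type, correlation is just a filter and a sort, which is why normalization comes first.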

**Detailed logging and analysis of logs** would allow for detailed reconstruction of the events that led to the compromise. It could also help determine the extent and severity of the compromise. Detailed logging would also be able to show if further systems were compromised after the initial breach. It would also tell us whether or not any data was stolen, and if it was, what that data was. One popular and powerful logs analysis system is Splunk, a very flexible and extensible log aggregation and search system. Splunk can grab log data from a wide variety of systems and in a large number of formats. It can also be configured to generate alerts, and allows for powerful visualization of activity based on logged data. 

**Flood guards** provide protection against DoS, or denial of service, attacks. This works by identifying common flood attack types like SYN floods or UDP floods. It then triggers alerts once a configurable threshold of traffic is reached. There's another threshold called the **activation threshold**. When this one is reached, it triggers a pre-configured action. This will typically block the identified attack traffic for a specific amount of time. This is usually a feature on enterprise grade routers or firewalls, though it's a general security concept. A common open source flood guard protection tool is **Fail2ban**. It watches for signs of an attack on a system, and blocks further attempts from a suspected attack address. Fail2ban is a popular tool for smaller scale organizations. This flood guard protection can also be described as a form of intrusion prevention system. 
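
The two-threshold behavior can be sketched in Python as below. This is only a toy model with assumed names and numbers (`FloodGuard`, a per-source sliding window), not how any particular router or Fail2ban implements it.

```python
from collections import defaultdict, deque


class FloodGuard:
    """Toy flood guard: alert at one threshold, block at a higher one."""

    def __init__(self, alert_threshold, activation_threshold,
                 window_seconds, block_seconds):
        self.alert_threshold = alert_threshold
        self.activation_threshold = activation_threshold
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.events = defaultdict(deque)   # source -> recent packet times
        self.blocked_until = {}            # source -> time the block expires

    def observe(self, src_ip, now):
        """Record one packet and return 'ok', 'alert', or 'blocked'."""
        if self.blocked_until.get(src_ip, 0) > now:
            return "blocked"
        recent = self.events[src_ip]
        recent.append(now)
        # Forget packets that fell out of the sliding window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.activation_threshold:
            # Pre-configured action: block the source for a fixed time.
            self.blocked_until[src_ip] = now + self.block_seconds
            recent.clear()
            return "blocked"
        if len(recent) >= self.alert_threshold:
            return "alert"
        return "ok"
```

For example, with `FloodGuard(alert_threshold=3, activation_threshold=5, window_seconds=10, block_seconds=60)`, a source that sends one packet per second starts raising alerts on its third packet and is blocked on its fifth.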

**Network separation** or **network segmentation** is a good security principle as it permits more flexible management of the network, and provides some security benefits. This is the concept of using VLANs to create virtual networks for different device classes or types. Think of it as creating dedicated virtual networks for our employees to use, but also having separate networks for our printers to connect to. The idea here is that the printers won't need access to the same network resources that employees do. It probably doesn't make sense to have the printers on the employee network. We might be wondering how employees are supposed to print if the printers are on a different network. It's actually one of the benefits of network separation, since we can control and monitor the flow of traffic between networks more easily. To give employees access to printers, we'd configure routing between the two networks on our routers. We'd also implement network ACLs that permit the appropriate traffic.

### Network Hardware Hardening
**DHCP** is the protocol where devices on a network are assigned critical configuration information for communicating on the network. If an attacker can manage to deploy a rogue DHCP server on our network, they could hand out DHCP leases with whatever information they want. This includes setting a gateway address or DNS server that's actually a machine within their control. This gives them access to our traffic and opens the door for future attacks. We call this type of attack a **rogue DHCP server attack**. To protect against it, enterprise switches offer a feature called **DHCP snooping**. A switch that has DHCP snooping will monitor DHCP traffic being sent across it. It will also track IP assignments and map them to hosts connected to switch ports. This basically builds a map of assigned IP addresses to physical switch ports. This information can also be used to protect against **IP spoofing** and **ARP poisoning attacks**. 

DHCP snooping also makes us designate either a trusted DHCP server IP, if the switch is operating as a DHCP helper and forwarding DHCP requests to the server, or enable DHCP snooping trust on the uplink port, where legitimate DHCP responses would come from. Now any DHCP responses coming from either an untrusted IP address or from a downlink switch port would be detected as untrusted and discarded by the switch. 
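
A minimal Python model of the trusted-port variant, with made-up port names and a hypothetical `DhcpSnoopingSwitch` class: DHCP server replies are only accepted from trusted ports, and accepted leases are recorded against the client's switch port.

```python
class DhcpSnoopingSwitch:
    """Toy model of DHCP snooping on a single switch."""

    def __init__(self, trusted_ports):
        self.trusted_ports = set(trusted_ports)
        self.bindings = {}   # client port -> (assigned IP, client MAC)

    def handle_server_reply(self, ingress_port, client_port,
                            client_mac, offered_ip):
        """Process a DHCP server reply arriving on ingress_port.

        Returns True if the reply is forwarded, False if it is discarded.
        """
        if ingress_port not in self.trusted_ports:
            # Replies from downlink ports indicate a rogue DHCP server.
            return False
        # Record the assignment so other features can check against it.
        self.bindings[client_port] = (offered_ip, client_mac)
        return True
```

The `bindings` table is the map of assigned IP addresses to physical switch ports that the text describes.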

ARP allows for a layer 2 man-in-the-middle attack because of the unauthenticated nature of ARP. It allows an attacker to forge an ARP response, advertising its MAC address as the physical address matching a victim's IP address. This type of ARP response is called a **gratuitous ARP response**, since it's effectively answering a query that no one made. When this happens, all of the clients on the local network segment would cache this ARP entry. Because of the forged ARP entry, they send frames intended for the victim's IP address to the attacker's machine instead. The attacker could enable IP forwarding, which would let them transparently monitor traffic intended for the victim. They could also manipulate or modify data. **Dynamic ARP inspection** or **DAI** is another feature on enterprise switches that prevents this type of attack. It requires the use of DHCP snooping to establish a trusted binding of IP addresses to switch ports. DAI will detect these forged gratuitous ARP packets and drop them. It does this because it has a table from DHCP snooping that has the authoritative IP address assignments per port. DAI also enforces rate limiting of ARP packets per port to prevent ARP scanning. An attacker is likely to ARP scan before attempting the ARP attack. 
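
The check DAI performs can be sketched in Python as follows. The `ArpInspector` class, the per-second counter, and the example bindings are assumptions made for illustration; real switches implement this in hardware and firmware.

```python
class ArpInspector:
    """Toy dynamic ARP inspection using a DHCP snooping binding table."""

    def __init__(self, bindings, max_arp_per_second):
        self.bindings = bindings            # port -> (IP, MAC)
        self.max_arp_per_second = max_arp_per_second
        self.counts = {}                    # (port, whole second) -> packets

    def allow(self, port, sender_ip, sender_mac, now):
        """Return True to forward an ARP packet, False to drop it."""
        key = (port, int(now))
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] > self.max_arp_per_second:
            return False   # rate limit exceeded, possible ARP scan
        # The claimed IP/MAC pair must match what DHCP assigned to this port.
        return self.bindings.get(port) == (sender_ip, sender_mac)
```

A host on `port7` that advertises the IP address bound to `port3` fails the binding comparison, so its forged response never reaches the other clients' ARP caches.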

To prevent IP spoofing attacks, **IP source guard** or **IPSG** can be enabled on enterprise switches along with DHCP snooping. IPSG works by using the DHCP snooping table to dynamically create ACLs for each switch port. This drops packets that don't match the IP address for the port based on the DHCP snooping table. Now, if we really want to lock down our network, we can implement **802.1X**. This is the IEEE standard for encapsulating **EAP** or **Extensible Authentication Protocol** traffic over IEEE 802 networks. This is also called **EAP over LAN** or **EAPOL**. It was originally designed for Ethernet, but support was added for other network types like Wi-Fi and fiber networks. 

When a client wants to authenticate to a network using 802.1X, there are three parties involved. The client device is what we call the **supplicant**. The term is sometimes also used to refer to the software running on the client machine that handles the authentication process for the user. The open source Linux utility wpa_supplicant is one of those. The supplicant communicates with the **authenticator**, which acts as a sort of gatekeeper for the network. It requires clients to successfully authenticate to the network before they're allowed to communicate with the network. This is usually an enterprise switch or an access point in the case of wireless networks. It's important to call out that while the supplicant communicates with the authenticator, it's not actually the authenticator that makes the authentication decision. The authenticator acts like a go-between and forwards the authentication request to the authentication server. That's where the actual credential verification and authentication occurs. The authentication server is usually a RADIUS server. 

**EAP-TLS** is an authentication type supported by EAP that uses TLS to provide mutual authentication of both the client and the authenticating server. This is considered one of the more secure configurations for wireless security. **HTTPS** is a combination of the **hypertext transfer protocol**, **HTTP**, with **SSL/TLS** cryptographic protocols. When TLS is implemented for HTTPS traffic, it specifies a client's certificate as an optional factor of authentication. Similarly, most EAP-TLS implementations require client-side certificates. Authentication can be certificate-based, which requires a client to present a valid certificate that's signed by the authenticating CA, or a client can use a certificate in conjunction with a username, password, and even a second factor of authentication, like a one-time password. The security of EAP-TLS stems from the inherent security that the TLS protocol and PKI provide. That also means that the pitfalls are the same when it comes to properly managing PKI elements. We have to safeguard private keys appropriately and ensure distribution of the CA certificate to client devices to allow verification of the server side. An even more secure configuration for EAP-TLS would be to bind the client-side certificates to the client platforms using TPMs. This would prevent theft of the certificates from client machines. When we combine this with FDE, even theft of a computer wouldn't lead to compromise of the network. 

### Network Software Hardening
Network software hardening includes implementation of things like firewalls, proxies, and VPNs that play an important role in securing networks and their traffic for our organization. **Firewalls** are critical to securing a network and can be deployed as dedicated network infrastructure devices, which regulate the flow of traffic for a whole network. They can also be host-based, as software that runs on a client system and provides protection for that one host only. A **host-based firewall** provides protection for mobile devices such as a laptop that could be used in an untrusted, potentially malicious environment like an airport Wi-Fi hotspot. Host-based firewalls are also useful for protecting other hosts from being compromised by a compromised device on the internal network. That's something a network-based firewall may not be able to help defend against. 

**VPNs** are also recommended to provide secure access to internal resources for mobile or roaming users. They are commonly used to provide secure remote access and to link two networks securely. Let's say we have two offices located in buildings that are on opposite sides of town. We want to create one unified network that would let users in each location seamlessly connect to devices and services in either location. We could use a site-to-site VPN to link these two offices. To the people in the offices, everything would just work. They'd be able to connect to a service hosted in the other office without any specific configuration. Using a **VPN tunnel**, all traffic between the two offices can be secured using encryption. This lets the two remote networks join each other seamlessly. This way, clients on one network can access devices on the other without requiring them to individually connect to a VPN service. Usually, the same infrastructure can be used to allow remote access VPN services for individual clients that require access to internal resources while out of the office. 

**Proxies** can be really useful to protect client devices and their traffic. They also provide secure remote access without using a VPN. A standard web proxy can be configured for client devices. This allows web traffic to be proxied through a proxy server that we control for lots of purposes. This configuration can be used for logging web requests of client devices. These logs can be used for traffic analysis and forensic investigation. The proxy server can be configured to block content that might be malicious, dangerous, or just against company policy. A **reverse proxy** can be configured to allow secure remote access to web-based services without requiring a VPN. By configuring a reverse proxy at the edge of our network, connection requests to services inside the network coming from outside are intercepted by the reverse proxy. They are then forwarded on to the internal service with the reverse proxy acting as a relay. This bridges communications between the remote client outside the network and the internal service. This proxy setup can be secured even more by requiring the use of client TLS certificates, along with username and password authentication. Specific ACLs can also be configured on the reverse proxy to restrict access even more. Lots of popular proxy solutions support a reverse proxy configuration, like HAProxy, Nginx, and even the Apache Web Server. 

### WEP Encryption
The first security protocol introduced for Wi-Fi networks was **WEP** or **Wired Equivalent Privacy**. It was part of the original 802.11 standard introduced back in 1997. WEP was intended to provide privacy on par with the wired network. That means the information passed over the network should be protected from third parties eavesdropping. This was an important consideration when designing the wireless specification. Unlike wired networks, packets could be intercepted by anyone with physical proximity to the access point or client station. Without some form of encryption to protect the packets, wireless traffic would be readable by anyone nearby who wants to listen. WEP was proven to be seriously bad at providing confidentiality or security for wireless networks and was deprecated in 2004 in favor of more secure systems. 

WEP used the RC4 symmetric stream cipher for encryption. It used either a 40-bit or 104-bit shared key from which the encryption key for individual packets was derived. The actual encryption key for each packet was computed by taking the user-supplied shared key and joining it with a 24-bit initialization vector, or IV for short. The IV is a randomized bit of data meant to avoid reusing the same encryption key between packets. Since these bits of data are concatenated or joined, a 40-bit shared key scheme uses a 64-bit key for encryption and the 104-bit scheme uses a 128-bit key. Later, 128-bit encryption became available for use. The shared key was entered as either 10 hexadecimal characters for 40-bit WEP, or 26 hex characters for 104-bit WEP, with each hex character representing 4 bits. The key could also be specified by supplying 5 or 13 ASCII characters, each ASCII character representing 8 bits. But this actually reduces the available keyspace to only valid ASCII characters instead of all possible hex values. Since this is a component of the actual key, the shared key must be exactly as many characters as appropriate for the encryption scheme. 
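
The key-size arithmetic above can be checked with a few lines of Python. The function names are made up for the example, the IV-first ordering in `per_packet_key` follows how WEP is usually described (the text only says the two values are joined), and the figure of 95 printable ASCII characters is an assumption used to illustrate the keyspace reduction.

```python
IV_BITS = 24


def wep_key_bits(shared_key_hex):
    """Return (shared key bits, per-packet key bits) for a hex-entered key."""
    shared_bits = len(shared_key_hex) * 4      # each hex character is 4 bits
    return shared_bits, shared_bits + IV_BITS


def per_packet_key(iv, shared_key):
    """Join a 3-byte IV with the shared key (IV first is an assumption)."""
    if len(iv) * 8 != IV_BITS:
        raise ValueError("the IV must be exactly 24 bits")
    return iv + shared_key


def ascii_keyspace(num_chars, printable=95):
    """Number of keys reachable when the key is typed as printable ASCII."""
    return printable ** num_chars


hex_40 = wep_key_bits("0123456789")                    # 10 hex characters
hex_104 = wep_key_bits("0123456789abcdef0123456789")   # 26 hex characters
reduction = 2 ** 40 / ascii_keyspace(5)   # how much smaller the ASCII keyspace is
```

Under these assumptions, a 5-character ASCII key draws from `95 ** 5` possibilities rather than the `2 ** 40` values a 10-character hex key can take.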

WEP authentication originally supported two different modes, **Open System authentication** and **Shared Key authentication**. The open system mode didn't require clients to supply credentials. Instead, they were allowed to authenticate and associate with the **access point** or **AP**. But the access point would begin communicating with the client by encrypting data frames with the pre-shared WEP key. If the client didn't have the key or had an incorrect key, it wouldn't be able to decrypt the frames coming from the access point. It also wouldn't be able to communicate back to the AP. Shared key authentication worked by requiring clients to authenticate through a four-step challenge-response process. This basically has the AP asking the client to prove that they have the correct key. The client sends an authentication request to the AP. The AP replies with a cleartext challenge, a bit of randomized data that the client is supposed to encrypt using the shared WEP key. The client replies to the AP with the resulting ciphertext from encrypting this challenge text. The AP verifies this by decrypting the response and checking it against the plaintext challenge. If they match, a positive response is sent back. However, in this scheme, we're transmitting both the plaintext and the ciphertext in a way that exposes both of these messages to potential eavesdroppers. This opens the possibility for the encryption key to be recovered by the attacker.
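
To see why exposing both the challenge and its ciphertext is a problem with a stream cipher, here is a simplified Python sketch. It implements textbook RC4 and leaves out parts of real WEP framing, such as the integrity check value; the key, IV, and challenge values are arbitrary. It only shows that an eavesdropper can recover the keystream for that IV by XORing the two messages, without ever seeing the key.

```python
def rc4_keystream(key, length):
    """Generate `length` bytes of RC4 keystream for the given key."""
    state = list(range(256))
    j = 0
    for i in range(256):                      # key scheduling
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    i = j = 0
    out = bytearray()
    for _ in range(length):                   # keystream generation
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(state[(state[i] + state[j]) % 256])
    return bytes(out)


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


shared_key = bytes.fromhex("0123456789")      # arbitrary 40-bit key
iv = b"\x01\x02\x03"                          # sent in the clear
challenge = bytes(range(16))                  # AP's cleartext challenge

# Client side: encrypt the challenge with RC4 keyed by IV + shared key.
response = xor_bytes(challenge, rc4_keystream(iv + shared_key, len(challenge)))

# Eavesdropper side: both messages were on the air, so XOR them together.
recovered_keystream = xor_bytes(challenge, response)
```

Here `recovered_keystream` equals the keystream the client used, even though the eavesdropper never touched `shared_key`.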

A general concept in security and encryption is to never send the plaintext and ciphertext together, so that attackers can't work out the key used for encryption. But WEP's true weakness wasn't related to the authentication schemes. Its use of the RC4 stream cipher and how the IVs were used to generate encryption keys led to WEP's ultimate downfall. The primary purpose of an IV is to introduce more random elements into the encryption key to avoid reusing the same one. When using a stream cipher like RC4, it's super important that an encryption key doesn't get reused. Reuse would allow an attacker to compare two messages encrypted using the same key and recover information. But the encryption key in WEP is just made up of the shared key, which doesn't change frequently, with 24 bits of randomized data, the IV, tacked onto it. This results in only a 24-bit pool from which unique encryption keys will be pulled and used. It's also important to call out that the IV is transmitted in plaintext. If it were encrypted, the receiver would not be able to decrypt the packet. This means an attacker just has to keep track of IVs and watch for repeated ones. The actual attack that lets an attacker recover the WEP key relies on weaknesses in some IVs and how the RC4 cipher generates a keystream used for encrypting the data payloads. This lets the attacker reconstruct this keystream using packets encrypted using the weak IVs. There are some open source tools that demonstrate this attack in action, like Aircrack-ng or AirSnort. They can recover a WEP key in a matter of minutes, which is why WEP can't be considered secure. 
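
Two small Python calculations make the IV problem concrete. The first estimates how quickly IVs repeat, assuming they are picked uniformly at random (an assumption; implementations differed). The second shows what keystream reuse leaks, using a stand-in keystream and made-up messages.

```python
import math

IV_SPACE = 2 ** 24   # number of distinct 24-bit IVs


def iv_collision_probability(packets, space=IV_SPACE):
    """Birthday approximation for at least one repeated IV."""
    return 1.0 - math.exp(-packets * (packets - 1) / (2 * space))


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


# Two packets encrypted with the same keystream (same IV and shared key).
keystream = bytes(range(40, 56))      # stand-in for 16 bytes of RC4 output
plaintext_1 = b"transfer $100 ok"
plaintext_2 = b"meet me at 9 pm!"
ciphertext_1 = xor_bytes(plaintext_1, keystream)
ciphertext_2 = xor_bytes(plaintext_2, keystream)

# The keystream cancels out: the attacker learns plaintext_1 XOR plaintext_2.
leak = xor_bytes(ciphertext_1, ciphertext_2)
```

With random IVs, `iv_collision_probability(5000)` is already above one half, so repeats show up after only a few thousand packets on a busy network.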

### WPA/WPA2
The replacement for WEP from the Wi-Fi Alliance was **WPA** or **Wi-Fi Protected Access**. WPA was designed as a short-term replacement that would be compatible with older WEP-enabled hardware with a simple firmware update. This helped with user adoption because it didn't require the purchase of new Wi-Fi hardware. To address the shortcomings of WEP security, a new security protocol was introduced called **TKIP** or the **Temporal Key Integrity Protocol**. TKIP implemented three new features that made it more secure than WEP. First, a more secure key derivation method was used to more securely incorporate the IV into the per-packet encryption key. Second, a sequence counter was implemented to prevent replay attacks by rejecting out-of-order packets. Third, a 64-bit MIC or Message Integrity Check was introduced to prevent forging, tampering, or corruption of packets. TKIP still uses the RC4 cipher as the underlying encryption algorithm. But it addressed the key generation weaknesses of WEP by using a **key mixing function** to generate unique encryption keys per packet. It also utilizes 256-bit long keys. This key mixing function incorporates the Wi-Fi passphrase with the IV, which is different from the simplistic concatenation of the shared key and IV used in WEP. 

Under WPA, the pre-shared key is the Wi-Fi password we share with people when they come over and want to use our wireless network. This is not directly used to encrypt traffic. It's used as a factor to derive the encryption key. The passphrase is fed into the **PBKDF2** or **Password-Based Key Derivation Function 2**, along with the Wi-Fi network's SSID as a salt. This is then run through the HMAC-SHA1 function 4096 times to generate a unique encryption key. The SSID salt is incorporated to help defend against rainbow table attacks. The 4096 rounds of HMAC-SHA1 increase the computational power required for a brute force attack. The pre-shared key can be entered using two different methods. A 64-character hexadecimal value can be entered, in which case the value is used directly as the key: 64 hexadecimal characters times four bits is 256 bits. The other option is to use the PBKDF2 function, but only if entering ASCII characters as a passphrase. If that's the case, the passphrase can be anywhere from eight to 63 characters long. 
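
The derivation described above maps directly onto Python's standard library. This is a minimal sketch: `derive_pmk` is a name chosen for the example, and the passphrase and SSID in the usage line are placeholders.

```python
import hashlib


def derive_pmk(passphrase, ssid):
    """Derive the 256-bit key from an ASCII passphrase and the SSID salt."""
    if not 8 <= len(passphrase) <= 63:
        raise ValueError("passphrase must be 8 to 63 characters long")
    return hashlib.pbkdf2_hmac(
        "sha1",                        # PBKDF2 built on HMAC-SHA1
        passphrase.encode("ascii"),
        ssid.encode("utf-8"),          # the SSID acts as the salt
        4096,                          # iteration count
        dklen=32,                      # 32 bytes = 256 bits
    )


example_key = derive_pmk("correct horse battery", "HomeNet")
```

Because the SSID is the salt, the same passphrase on a network with a different SSID produces a completely different key.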

WPA2 improved WPA security even more by implementing **CCMP** or **Counter Mode CBC-MAC Protocol**. It's based on the AES cipher, finally getting away from the insecure RC4 cipher. The key derivation process didn't change from WPA, and the pre-shared key requirements are the same. Counter with CBC-MAC is a particular mode of operation for block ciphers. It allows for authenticated encryption, meaning data is kept confidential and is authenticated. This is accomplished using an authenticate-then-encrypt mechanism. The CBC-MAC digest is computed first. Then, the resulting authentication code is encrypted along with the message using a block cipher. We're using AES in this case, operating in counter mode. This turns a block cipher into a stream cipher by using a random seed value along with an incrementing counter to create a keystream to encrypt data with. 

The **Four-Way Handshake** is a process that authenticates clients to the network by generating the temporary encryption key that will be used to encrypt data for this client. This process is made up of four exchanges of data between the client and AP. It's designed to allow an AP to confirm that the client has the correct pairwise master key, or pre-shared key in a WPA-PSK setup, without disclosing the PMK. The **PMK** is a long-lived key and might not change for a long time. So an **encryption key** is derived from the PMK that's used for actual encryption and decryption of traffic between a client and AP. This key is called the **Pairwise Transient Key** or **PTK**. The PTK is generated using the PMK, AP nonce, client nonce, AP MAC address, and client MAC address. They're all concatenated together and run through a function. The AP and client nonces are just random bits of data generated by each party and exchanged. The MAC addresses of each party would be known through the packet headers already, and both parties should already have the correct PMK. With this information, the PTK can be generated. This is different for every client to allow for confidentiality between clients. The PTK is actually made up of five individual keys, each with its own purpose. Two keys are used for encryption and confirmation of EAPoL packets, the encapsulating protocol that carries these messages. Two keys are used for sending and receiving message integrity codes. And finally, there's a temporal key, which is actually used to encrypt data. 
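
A hedged Python sketch of the PTK derivation follows. The text only says the inputs are concatenated and run through a function; the HMAC-SHA1 based expansion, the label string, the min/max ordering, and the key names and sizes used here follow the TKIP-style layout as commonly documented for 802.11i, and should be read as assumptions rather than a reference implementation.

```python
import hashlib
import hmac


def prf(key, label, data, num_bytes):
    """Expand key material by chaining HMAC-SHA1 over a running counter."""
    output = b""
    counter = 0
    while len(output) < num_bytes:
        message = label + b"\x00" + data + bytes([counter])
        output += hmac.new(key, message, hashlib.sha1).digest()
        counter += 1
    return output[:num_bytes]


def derive_ptk(pmk, ap_mac, client_mac, ap_nonce, client_nonce):
    """Derive a 512-bit PTK and split it into its five component keys."""
    # Sorting the pairs lets both sides build the identical byte string.
    data = (min(ap_mac, client_mac) + max(ap_mac, client_mac)
            + min(ap_nonce, client_nonce) + max(ap_nonce, client_nonce))
    ptk = prf(pmk, b"Pairwise key expansion", data, 64)
    return {
        "eapol_confirmation_key": ptk[0:16],
        "eapol_encryption_key": ptk[16:32],
        "temporal_key": ptk[32:48],     # encrypts the actual data
        "mic_tx_key": ptk[48:56],
        "mic_rx_key": ptk[56:64],
    }
```

Since fresh nonces go into every handshake, each client session ends up with its own temporal key even though every client shares the same PMK in a pre-shared key setup.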

The AP will also transmit the **GTK** or **Groupwise Transient Key**, which is used to encrypt multicast or broadcast traffic. It's sent encrypted using the EAPoL encryption key contained in the PTK. Since this type of traffic must be readable by all clients connected to an AP, this GTK is shared between all clients. It's updated and retransmitted periodically, and when a client disassociates from the AP. 

The four messages exchanged in order are:
* the AP sends a nonce to the client
* the Client then sends its nonce to the AP
* the AP sends the GTK
* the Client replies with an Ack confirming successful negotiation. 

The WPA and WPA2 standards also introduced 802.1X authentication to Wi-Fi networks. It's usually called **WPA2-Enterprise**. The non-802.1X configurations are called either **WPA2-Personal** or **WPA2-PSK**, since they use a pre-shared key to authenticate clients. The only thing different is that the AP acts as the authenticator in this case. The back-end RADIUS server is still the authentication server, and the PMK is generated using components of the EAP method chosen. While not a security feature directly, **WPS** or **Wi-Fi Protected Setup** is a convenience feature designed to make it easier for clients to join a WPA-PSK protected network. WPS provides several different methods that allow our wireless client to securely join a wireless network without having to directly enter the pre-shared key. This facilitates the use of very long and secure passphrases without making it unnecessarily complicated. 

WPS simplifies entering a 63-character passphrase to use our Wi-Fi by allowing for secure exchange of the SSID and pre-shared key. This is done after authenticating or exchanging data using one of the four supported methods. WPS supports PIN entry authentication, NFC or USB for out-of-band exchange of the network details, or push-button authentication. The **push-button** is typically a small button somewhere on the home router with two arrows pointing counter-clockwise. The push-button mechanism works by requiring a button to be pressed on both the AP side and the client side. This requires physical proximity and gives a short window of time in which the client can authenticate with a button press of its own. The **NFC** and **USB methods** just provide a different channel to transmit the details to join the network. The **PIN authentication** mechanism supports two modes. In one mode, the client generates a PIN which is then entered into the AP. In the other mode, the AP has a PIN, typically hard-coded into the firmware, which is entered into the client. It's the second mode that is vulnerable to an online brute force attack. The PIN authentication method uses PINs that are eight digits long, but the last digit is a checksum that's computed from the first seven digits. This makes the total number of possible PINs 10 to the seventh power, or around 10 million possibilities. But the PIN is authenticated by the AP in halves. This means the client will send the first four digits to the AP, wait for a positive or negative response, and then send the second half of the PIN if the first half was correct. This harms the security, as we're actually reducing the total possible valid PINs even more and making it even easier to guess what the correct PIN is. The first half of the PIN, being four digits, has 10,000 possibilities. The second half, only three digits because of the checksum value, has a maximum of only 1,000 possibilities. 
This means the correct PIN can be guessed in a maximum of 11,000 tries. Without any rate limiting, an attacker could recover the PIN and the pre-shared key in less than four hours. In response to this, the Wi-Fi Alliance revised the requirements for the WPS specification, introducing a **lockout period** of one minute after three incorrect PIN attempts. This increases the maximum time to guess the PIN from four hours to less than three days. Because the PIN is an unchanging element that's part of the AP configuration, a network compromised using this attack stays exposed. Even if we detected unauthorized wireless clients on our network and changed our Wi-Fi password, the attacker could just reuse the already recovered WPS PIN to get the new password. 
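
The numbers in this section can be reproduced with a short Python calculation. The lockout estimate ignores the time each individual attempt takes, which is a simplifying assumption, and the function name is made up for the example.

```python
FULL_PIN_SPACE = 10 ** 7          # seven free digits plus one checksum digit
FIRST_HALF_SPACE = 10 ** 4        # first four digits, checked on their own
SECOND_HALF_SPACE = 10 ** 3       # three free digits plus the checksum
MAX_TRIES = FIRST_HALF_SPACE + SECOND_HALF_SPACE


def worst_case_days_with_lockout(max_tries=MAX_TRIES, tries_per_lockout=3,
                                 lockout_seconds=60):
    """Worst-case guessing time in days when only lockouts cost time."""
    lockouts = max_tries // tries_per_lockout
    return lockouts * lockout_seconds / 86400


days = worst_case_days_with_lockout()
speedup_from_halving = FULL_PIN_SPACE / MAX_TRIES
```

Checking the PIN in halves shrinks the search from 10,000,000 candidates to 11,000, and the one-minute lockout stretches the worst case to roughly two and a half days, consistent with the "less than three days" figure above.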

WPA2 is a really robust security protocol. It's built using best-in-class mechanisms to prevent attacks and ensure the confidentiality of the data it's protecting. Even so, it's susceptible to some forms of attack. The four-way authentication handshake is actually susceptible to an offline brute force attack. If an attacker can manage to capture the four-way handshake process, just four packets, they can begin guessing the pre-shared key or PMK. They can take the nonces and MAC addresses from the four-way handshake packets and compute PTKs. Since the message authentication code secret keys are included as part of the PTK, the correct PMK guess would yield a PTK that successfully validates a MIC. This is a brute force or dictionary-based attack, so it's dependent on the quality of the password guesses. It does require a fair amount of computational power, and the bulk of the computational requirements lie in calculating the PMK from the passphrase guesses and SSID values. This requires 4096 iterations of a hashing function, which can be massively accelerated through the use of GPU-accelerated computation and cloud computing resources. Because the bulk of the computation involves computing the PMK by incorporating the password guesses with the SSIDs, it's possible to pre-compute PMKs in bulk for common SSIDs and password combinations. This reduces the computational requirements to deriving the PTK from the unique session elements. These pre-computed sets are referred to as rainbow tables, and exactly this has been done. Rainbow tables are available for download for the top 1000 most commonly seen SSIDs and 1 million passwords.
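
To show why pre-computation only pays off for common SSIDs, here is a tiny Python sketch of a pre-computed PMK table. The SSID and passphrase lists are placeholders and the function names are made up; the point is only that the table is keyed by the SSID as well as the passphrase.

```python
import hashlib
from itertools import product


def pmk(passphrase, ssid):
    """PBKDF2-HMAC-SHA1, 4096 iterations, SSID as salt, 256-bit output."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(),
                               4096, 32)


def precompute(ssids, passphrases):
    """Do the expensive PMK work once for every (SSID, passphrase) pair."""
    return {(ssid, word): pmk(word, ssid)
            for ssid, word in product(ssids, passphrases)}


table = precompute(["linksys", "default"], ["password", "12345678"])

# A common SSID with a weak passphrase is already in the table...
common_hit = ("linksys", "password") in table
# ...but the same weak passphrase behind an unusual SSID is not.
unique_miss = ("Quokka-7f3a", "password") in table
```

A table like this has to be rebuilt from scratch for every new SSID, which is why an uncommon network name forces the attacker back to doing the full computation themselves.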

### Wireless Hardening
In an ideal world, we'd all be protecting our wireless networks using 802.1X with EAP-TLS. It offers arguably the best security available, assuming proper and secure handling of the PKI aspects of it. But this option also requires a ton of added complexity and overhead. This is because it requires the use of a RADIUS server and an additional authentication back-end at a minimum. If EAP-TLS is implemented, then all the public key infrastructure components will also be necessary. This adds even more complexity and management overhead. Not only do we have to securely deploy PKI on the back-end for certificate management, but a system must be in place to sign the clients' certificates. We also have to distribute them to each client that would be authenticating to the network. This is usually more overhead than many companies are willing to take on, because of the security versus convenience trade-off involved. If 802.1X is too complicated for a company, the next best alternative would be **WPA2 with AES/CCMP mode**. But to protect against brute force or rainbow table attacks, we should take some steps to raise the computational bar. A long and complex passphrase that wouldn't be found in a dictionary would increase the amount of time and resources an attacker would need to break the passphrase. Changing the SSID to something uncommon and unique would also make a rainbow table attack less likely. It would require an attacker to do the computations themselves, increasing the time and resources required to pull off an attack. When using a long and complex Wi-Fi password, we might be tempted to use WPS to join clients to the network, but this might not be a good idea from a security perspective. In practice, we won't see WPS enabled in an enterprise environment, because it's a consumer-oriented technology. Either way, we should make sure that WPS isn't enabled on our APs. 
We should also verify that this feature is actually disabled using a tool like **Wash**, which scans for and enumerates APs that have WPS enabled. This independent verification is recommended, since some router manufacturers don't allow us to disable it. In some cases, disabling the feature through the management console doesn't actually disable the feature. 

### Network Monitoring
In order to monitor what type of traffic is on our network, we need a mechanism to capture packets from network traffic for analysis and potential logging. **Packet Sniffing** or **Packet Capture** is the process of intercepting network packets in their entirety for analysis. By default, network interfaces and the networking software stack on an OS are going to behave like a well-mannered interface. They will only accept and process packets that are addressed to the specific interface address, usually identified by a MAC address. If a packet with a different destination address is encountered, the interface will just drop the packet. But if we wanted to capture all packets that an interface is able to see, like when we're monitoring all network traffic on a network segment, this behavior would be a pain for us. To override this, we can place the interface into what's called **Promiscuous Mode**. This is a special mode for Ethernet network interfaces that basically says, "Give me all the packets." Instead of only accepting and handling packets destined for its address, it will now accept and process any packet that it sees. This is much more useful for network analysis or monitoring purposes. Admin or root privileges are needed to place an interface into promiscuous mode and to begin capturing packets. 
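
The difference between the default behavior and promiscuous mode comes down to one filtering decision, sketched below in Python. The function is a simplified model: it also accepts broadcast frames, which is an addition not discussed in the text, and it ignores multicast entirely.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def accepts_frame(interface_mac, dest_mac, promiscuous=False):
    """Decide whether an interface passes a received frame up the stack."""
    if promiscuous:
        return True   # "give me all the packets"
    # Well-mannered default: only frames addressed to us, or broadcast.
    return dest_mac.lower() in (interface_mac.lower(), BROADCAST)
```

Note that this only governs frames the interface actually receives; it does nothing for traffic that never reaches the interface in the first place.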

Another super important thing to consider when we perform packet captures is whether we have access to the traffic we'd like to capture and monitor. If we wanted to analyze all traffic between hosts connected to a switch, and our machine is also connected to a port on that switch, we'd only be able to capture traffic from our host or traffic destined for our host, because the switch forwards each frame only to the port where its destination lives. If the packets aren't going to be sent to our interface in the first place, Promiscuous Mode won't help us see them. But if our machine was inserted between the uplink port of the switch and the uplink device further upstream, we'd have access to all packets in and out of that local network segment. Enterprise managed switches usually have a feature called **Port Mirroring**, which helps with this type of scenario. Port Mirroring allows the switch to take all packets from a specified port, port range, or entire VLAN and mirror them to a specified switch port. This lets us gain access to all packets passing through a switch in a more convenient and secure way. Another, less advanced option is to insert a hub into the topology, with the device or devices we'd like to monitor and our monitoring machine both connected to the hub. **Hubs** are a quick and dirty way of getting packets mirrored to our capture interface. They obviously have drawbacks though, like reduced throughput and the potential for introducing collisions.

If we capture packets from a wireless network, the process is slightly different. Promiscuous Mode applied to a wireless device would allow the wireless client to receive and process packets destined for other clients on the network it's associated with. But if we wanted to capture and analyze all wireless traffic that we're able to receive in the immediate area, we can place our wireless interface into a mode called **monitor mode**. Monitor mode allows us to scan across channels to see all wireless traffic being sent by APs and clients. It doesn't matter what networks the traffic is intended for, and the capturing device doesn't need to be associated with or connected to any wireless network. To capture wireless traffic, all we need is an interface placed into monitor mode. Just like enabling promiscuous mode, this can be done with a simple command, but usually the tools used for wireless packet captures can handle enabling and disabling the mode for us. We need to be near enough to the AP and client to receive a signal, and then we can begin capturing traffic right out of the air. There are a number of open source wireless capture and monitoring utilities, like Aircrack-ng and Kismet. It's important to call out that if a wireless network is encrypted, we can still capture the packets, but we won't be able to decode the traffic payloads without knowing the password for the wireless network.

### Wireshark and tcpdump
**Traffic analysis** is an important part of network security, just like log analysis. Traffic analysis is done using packet captures and packet analysis. **Traffic** on a network is basically a flow of packets. Being able to capture and inspect those packets is important for understanding what type of traffic is flowing on the networks we'd like to protect. 

**Tcpdump** is a super popular, lightweight command-line based utility that we can use to capture and analyze packets. Tcpdump uses the open source **libpcap library**, a very popular packet capture library that's used in a lot of packet capture and analysis tools. Tcpdump also supports writing packet captures to a file for later analysis, sharing, or replaying traffic. It also supports reading packet captures back from a file. Tcpdump's default operating mode is to provide a brief packet analysis. It converts key information from layers three and up into human readable formats. Then it prints information about each packet to standard out, or directly into your terminal. It does things like converting the source and destination IP addresses into the dotted quad format we're most used to. And it shows the port numbers being used by the communications.

Tcpdump allows us to inspect the payload size (in bytes) of packets directly. By default, tcpdump will attempt to resolve host addresses to hostnames. It'll also replace port numbers with the names of services commonly associated with those ports. We can override this behavior with the **-n flag**. It's also possible to view the actual raw data that makes up the packet. This is shown as hexadecimal digits with the **-x flag**, or with the **-X flag** if we want both the hex and an ASCII interpretation of the data. Packets are just collections of data, or groupings of ones and zeros. What they represent depends on the values of this data and where they appear in the data stream. The view tcpdump gives us lets us see the data that fits into the various fields that make up the headers for each layer in a packet.
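To make the "fields at fixed positions" idea concrete, here's a minimal sketch that decodes a 20-byte IPv4 header from raw hex, similar to what we'd read off the hex view. The header bytes are hand-built for illustration (the checksum is left as a zero placeholder), and the function name is hypothetical.

```python
import socket
import struct

# A hand-built IPv4 header standing in for bytes read from a hex dump.
raw = bytes.fromhex(
    "45000054" "1c464000" "40060000" "c0a80002" "08080808"
)


def parse_ipv4_header(data: bytes) -> dict:
    """Decode the fixed 20-byte portion of an IPv4 header."""
    (ver_ihl, _tos, total_len, ident, flags_frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,              # high nibble
        "header_len": (ver_ihl & 0x0F) * 4,   # low nibble, in 32-bit words
        "total_len": total_len,
        "id": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "src": socket.inet_ntoa(src),         # dotted quad
        "dst": socket.inet_ntoa(dst),
    }


if __name__ == "__main__":
    for field, value in parse_ipv4_header(raw).items():
        print(f"{field:>14}: {value}")
```

The same bytes mean different things depending on where they sit: `0x40` in the ninth byte is a time-to-live of 64, while the last four bytes are the destination address.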

**Wireshark** is another packet capture and analysis tool that we can use, but it's way more powerful when it comes to application and packet analysis, compared to tcpdump. It's a graphical utility that also uses the libpcap library for capture and interpretation of packets. But it's way more extensible when it comes to protocol and application analysis. While tcpdump can do basic analysis of some types of traffic, like DNS queries and answers, Wireshark can do way more. Wireshark can decode encrypted payloads if the encryption key is known. It can identify and extract data payloads from file transfers through protocols like SMB or HTTP. Wireshark's understanding of application level protocols even extends to its filter strings. This allows filter rules like finding HTTP requests with specific strings in the URL, which would look like `http.request.uri matches "q=wireshark"`. That filter string would locate packets in our capture that contain a URL request that has the specified string within it. In this case, it would match a query parameter from a URL searching for Wireshark. While this could be done using tcpdump, it's much easier using Wireshark.

The Wireshark interface is divided into thirds. The list of packets is up top, followed by the layered representation of the packet selected in the list. Lastly, the hex and ASCII representation of the selected packet is at the bottom. The packet list view is color coded to distinguish between different types of traffic in the capture. The color coding is user configurable; the defaults are green for TCP packets, light blue for UDP traffic, and dark blue for DNS traffic. Black highlights problematic TCP packets, like out-of-order or repeated packets. Above the packet list pane is a display filter box, which allows complex filtering of the packets shown. This is different from capture filters, which follow the libpcap filter syntax, just like tcpdump. Wireshark's deep understanding of protocols allows filtering by protocol, along with protocol-specific fields. 

Not only does Wireshark have very handy protocol handling and filtering, it also understands and can follow TCP streams or sessions. This lets us quickly reassemble and view both sides of a TCP session, so we can easily view the full two-way exchange of information between parties. Another neat feature of Wireshark is its ability to decode WPA and WEP encrypted wireless packets, if the passphrase is known. It's also able to view Bluetooth traffic with the right hardware, along with USB traffic and other protocols like Zigbee. It also supports file carving, or extracting the data payloads of files transferred over unencrypted protocols, like HTTP file transfers or FTP. And it's able to extract audio streams from unencrypted VoIP traffic.

### Intrusion Detection/Prevention Systems
**Intrusion Detection and Prevention Systems** or **IDS/IPS** operate by monitoring network traffic and analyzing it. They look for matching behavior or characteristics that would indicate malicious traffic. The difference between an IDS and an IPS is that an IDS is only a detection system. It won't take action to block or prevent an attack when one is detected; it will only log an alert. But an IPS can adjust firewall rules on the fly to block or drop the malicious traffic when it's detected. IDS and IPS systems can either be **host based** or **network based**. In the case of a **Network Intrusion Detection System** or **NIDS**, the detection system would be deployed somewhere on a network where it can monitor traffic for a network segment or subnet. A host based intrusion detection system would be software deployed on a host that monitors traffic to and from that host only. It may also monitor system files for unauthorized changes. NIDS systems resemble firewalls in a lot of ways. But a firewall is designed to prevent intrusions by blocking potentially malicious traffic coming from outside, and to enforce ACLs between networks. NIDS systems are meant to detect and alert on potential malicious activity coming from within the network. Plus, firewalls only have visibility of traffic flowing between the networks they're set up to protect. They generally wouldn't have visibility of traffic between hosts inside the network. So, the location of the NIDS must be considered carefully when we deploy a system. It needs to be located in the network topology in a way that gives it access to the traffic we'd like to monitor. A good way to get access to network traffic is using the port mirroring functionality found in many enterprise switches. This allows all packets on a port, port range, or entire VLAN to be mirrored to another port, where the NIDS host would be connected. 

With this configuration, our NIDS machine would be able to see all packets flowing in and out of hosts on the switch segment. This lets us monitor host to host communications, and traffic from hosts to external networks, like the internet. The NIDS host would analyze this traffic by enabling promiscuous mode on the analysis port. This is the network interface that's connected to the mirror port on our switch, so it can see all packets being passed and perform an analysis on the traffic. Since this interface is used for receiving mirrored packets from the network we'd like to monitor, a NIDS host must have at least two network interfaces. One is for monitoring and analysis, and a separate one is for connecting to our network for management and administrative purposes. Some popular NIDS or NIPS systems are Snort, Suricata, and Bro NIDS (now known as Zeek). 

Placement of a **NIPS** or **Network Intrusion Prevention System** would differ from that of a NIDS. This is because a prevention system is able to take action against suspected malicious traffic. In order for a NIPS device to block or drop traffic from a detected threat, it must be placed in line with the traffic being monitored. This means that the traffic being monitored must pass through the NIPS device. If that wasn't the case, the NIPS host wouldn't be able to take action on suspected traffic. Think of it this way: a NIDS device is a passive observer that only watches the traffic and sends an alert if it sees something. This is unlike a NIPS device, which not only monitors traffic, but can take action on the traffic it's monitoring, usually by blocking or dropping it. The detection of threats or malicious traffic is usually handled through signature based detection, similar to how antivirus software detects malware. **Signatures** are unique characteristics of known malicious traffic. They might be specific sequences of packets, or packets with certain values encoded in a specific header field. This lets Intrusion Detection and Prevention Systems easily and quickly recognize known bad traffic from sources like botnets, worms, and other common attack vectors on the internet. But similar to antivirus, less common or targeted attacks might not be detected by a signature based system, since there might not be signatures developed for these cases. So, it's also possible to create custom rules to match traffic that might be considered suspicious, but not necessarily malicious. This would allow investigators to look into the traffic in more detail to determine how bad it really is. If the traffic is found to be malicious, a signature can be developed from the traffic and incorporated into the system. 
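As a rough illustration of signature based detection, the sketch below checks packet payloads against a small list of byte patterns, optionally tied to a destination port. The signatures, names, and ports are invented for this example; real systems like Snort and Suricata use a much richer rule language.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Signature:
    name: str
    pattern: bytes                  # byte sequence seen in known bad traffic
    dst_port: Optional[int] = None  # None means "any port"


# Hypothetical signatures, for illustration only.
SIGNATURES = [
    Signature("example-bot-checkin", b"BOTNET-HELLO", dst_port=6667),
    Signature("example-path-traversal", b"../../etc/passwd"),
]


def match_signatures(dst_port: int, payload: bytes,
                     signatures: List[Signature] = SIGNATURES) -> List[str]:
    """Return the names of all signatures that the packet matches."""
    hits = []
    for sig in signatures:
        if sig.dst_port is not None and sig.dst_port != dst_port:
            continue  # signature is tied to a different port
        if sig.pattern in payload:
            hits.append(sig.name)
    return hits


if __name__ == "__main__":
    print(match_signatures(80, b"GET /../../etc/passwd HTTP/1.1"))
    print(match_signatures(443, b"ordinary traffic"))
```

A NIDS would log and alert on any hits, while a NIPS sitting in line could also drop the packet. Traffic that matches no signature passes unnoticed, which is the gap that custom rules are meant to narrow.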

What happens when a NIDS detects something malicious is configurable, but usually the NIDS would log the detection event along with a full packet capture of the malicious traffic. An alert would also usually be triggered to notify the investigating team to look into the detected traffic. Depending on the severity of the event, the alert may just email a group or create a ticket to follow up on, or it might page someone in the middle of the night if the event is determined to be really high severity and urgent. These alerts would usually also include reference information linking to a known vulnerability, or some more information about the nature of the alert, to help the investigator look into the event. 


## ACTIVITY: Introducing tcpdump
Tcpdump is the premier network analysis tool for information security and networking professionals. Tcpdump will help us display network traffic in a way that’s easier to analyze and troubleshoot. 

#### Basic Usage
Tcpdump requires root or administrator privileges in order to capture traffic, so every capture command must begin with **sudo**. At a minimum, we must specify an interface to listen on with the **-i flag**. We may want to check what the primary network interface name is using `ip link`. In this case, we'll be using the interface **eth0** for all the examples.

To use tcpdump to start listening for any packets on the interface, enter the command below:
```bash
sudo tcpdump -i eth0
```
This will output some basic information about packets it sees directly to standard out. This command will fill our terminal with a constant stream of text as new packets are read. It won't stop until we press Ctrl+C. Once tcpdump exits, it prints a summary of the capture performed, showing the number of packets captured, filtered, or dropped.

By default, tcpdump will perform some basic protocol analysis. To enable more detailed analysis, use the **-v flag** to enable more verbose output. By default, tcpdump will also attempt to perform reverse DNS lookups to resolve IP addresses to hostnames, as well as replace port numbers with commonly associated service names. We can disable this behavior using the **-n flag**. It's recommended that we use this flag to avoid generating additional traffic from the DNS lookups, and to speed up the analysis. To try this out, enter this command:
```bash
sudo tcpdump -i eth0 -vn
```
Without the verbose flag, tcpdump only gives us:
* The layer 3 protocol, source and destination addresses, and ports
* TCP details, like flags, sequence and ack numbers, window size, and options

With the verbose flag, we also get all the details of the IP header, like time-to-live, IP ID number, IP options, and IP flags.

#### Filtering
Tcpdump supports a powerful language for filtering packets, so we can capture only traffic that we care about or want to analyze. The filter rules go at the very end of the command, after all other flags have been specified. We'll use filtering to only capture DNS traffic to a specific DNS server. Then, we'll generate some DNS traffic, so we can demonstrate tcpdump's ability to interpret DNS queries and responses.

```bash
sudo tcpdump -i eth0 -vn host 8.8.8.8 and port 53
```
It'll run until we stop it using Ctrl+C like the previous command, but we shouldn't see any output yet. **Host 8.8.8.8** specifies that we only want packets where the source or destination IP address matches what we specify (in this case 8.8.8.8). If we only want traffic in one direction, we could also add a **direction qualifier**, like dst or src (for the destination and source IP addresses, respectively). However, leaving out the direction qualifier will match traffic in either direction. Next, the **port 53** portion means we only want to see packets where the source or destination port matches what we specify (in this case, DNS). These two filter statements are joined together with the logical operator "and". This means that both halves of the filter statement must be true for a packet to be captured by our filter.
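The logic of `host 8.8.8.8 and port 53` can be written out as a plain predicate. This is only a model of what the filter means, assuming a simplified packet record; tcpdump itself hands the expression to libpcap, which compiles it into a BPF program. The `PacketInfo` type and function name are hypothetical.

```python
from typing import NamedTuple


class PacketInfo(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


def matches_host_and_port(pkt: PacketInfo, host: str = "8.8.8.8", port: int = 53) -> bool:
    """Model of the filter `host <host> and port <port>`."""
    # No direction qualifier: either end of the conversation may match.
    host_match = host in (pkt.src_ip, pkt.dst_ip)
    port_match = port in (pkt.src_port, pkt.dst_port)
    return host_match and port_match


if __name__ == "__main__":
    query = PacketInfo("192.168.0.2", "8.8.8.8", 51234, 53)
    reply = PacketInfo("8.8.8.8", "192.168.0.2", 53, 51234)
    web = PacketInfo("192.168.0.2", "8.8.8.8", 51240, 443)
    for pkt in (query, reply, web):
        print(pkt, matches_host_and_port(pkt))
```

Both the query and the reply match, because neither half of the filter cares about direction. The third packet goes to the right host but the wrong port, so the "and" rejects it.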

Now, open a second terminal and run this command:
```bash
dig @8.8.8.8 A example.com
```
This uses the **dig utility** to query a specific DNS server (in this case 8.8.8.8), asking it for the **A record** for the specified domain (in this case "example.com"). Back in the original terminal, we should now see two captured packets, as our filter rules should filter out any other traffic:
* The first one is the DNS query, which is our question (from the second terminal) going to the server. Note that, in this case, the traffic is UDP. Tcpdump's analysis of the DNS query begins right after the UDP checksum field. It starts with the DNS ID number, followed by some DNS flags, then the query type (in this case A? which means we're asking for an A record). Next is the domain name we're interested in (example.com).
* The second packet is the response from the server, which includes the same DNS ID from the original query, followed by the original query. After this is the answer to the query, which contains the IP address associated with the domain name.

#### Saving Captured Packets
```bash
sudo tcpdump -i eth0 port 80 -w http.pcap
```
This starts a capture on our eth0 interface that filters for only HTTP traffic by specifying port 80. The **-w flag** indicates that we want to write the captured packets to a file named http.pcap. Like the other captures, this will run until we force it to stop with Ctrl+C. 

Once that's running, switch back to our second terminal, where we'll generate some HTTP traffic that'll be captured in the original terminal. Don't stop the capture we started with the previous command just yet. (If we have, we can restart it now.) In the second terminal window, execute this command to generate some traffic:
```bash
curl example.com
```
This command fetches the HTML from example.com and prints it to our screen. Once that's done, close the second terminal window and return to the original terminal where the capture is running. Stop the capture with Ctrl+C. It should return a summary of the number of packets captured.

Using ls, we can see that a binary file containing the packets we just captured, called http.pcap, will also have been created. Don't try to print the contents of this file to the screen; since it's a binary file, it'll display as a bunch of garbled text that we won't be able to read. Somewhere in that file, there's information about the packets created when we pulled down the html from example.com. We can read from this file using tcpdump now, using this command:
```bash
tcpdump -r http.pcap -nv
```
Note that we don't need to use sudo to read packets from a file. Also note that tcpdump writes full packets to the file, not just the text-based analysis that it prints to the screen when it's operating normally. For example, somewhere in the output we should see the HTML that was returned as the body of the response to our original request in the other terminal.
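The reason the capture file looks garbled is that it's a binary format with fixed-size headers. As a sketch, the code below decodes the 24-byte global header of a classic pcap file. To stay self-contained, it parses a sample header built in memory rather than a real capture; the sample values are illustrative, and only the little-endian, microsecond-resolution variant is handled.

```python
import struct

# Classic pcap global header: magic, version major/minor, timezone offset,
# timestamp accuracy, snapshot length, link-layer type (24 bytes total).
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
PCAP_MAGIC = 0xA1B2C3D4


def parse_pcap_global_header(data: bytes) -> dict:
    """Decode the first 24 bytes of a classic pcap file."""
    magic, major, minor, _tz, _sigfigs, snaplen, linktype = \
        PCAP_GLOBAL_HEADER.unpack(data[:24])
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian, microsecond-resolution pcap header")
    return {"version": (major, minor), "snaplen": snaplen, "linktype": linktype}


# Sample header built for illustration (link type 1 is Ethernet).
sample = PCAP_GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, 262144, 1)

if __name__ == "__main__":
    print(parse_pcap_global_header(sample))
    # To try it on the capture from this activity, something like:
    #   with open("http.pcap", "rb") as f:
    #       print(parse_pcap_global_header(f.read(24)))
```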

## Defense in depth
The **defense in depth** concept is all about risk mitigation and implementing layers of security. Defense in depth means having multiple overlapping systems of defense to protect IT systems. This ensures some amount of redundancy for defensive measures. It also helps avoid a catastrophic compromise in the event that a single system fails, or a vulnerability is discovered in one system. Think of this as having multiple lines of defense. If an attacker manages to bypass our firewall, we're still protected by strong authentication systems within the network. This would require an attacker to find more vulnerabilities in more systems before real damage can occur. 

### Disabling Unnecessary Components
The special class of vulnerabilities called zero-day vulnerabilities are unique since they're unknown until they're exploited in the wild. The potential for these unknown flaws is something we should think about when looking to secure our company's systems and networks. Even though it's an unknown risk, it can still be handled by taking measures to restrict and control access to systems. Our end goal overall is risk reduction. Two important terms to know when talking about security risks are attack vectors and attack surfaces. An **attack vector** is a method or mechanism by which an attacker or malware gains access to a network or system. Some attack vectors are email attachments, network protocols or services, network interfaces, and user input. These are different approaches or paths that an attacker could use to compromise a system if they're able to exploit it. An **Attack Surface** is the sum of all the different attack vectors in a given system. Think of this as the combination of all possible ways an attacker could interact with our system, regardless of known vulnerabilities. It's not possible to know of all vulnerabilities in the system. So, make sure to think of all avenues that an outside actor could interact with our systems as a potential Attack Surface. The main takeaway here is to keep our Attack Surfaces as small as possible. This reduces the chances of an attacker discovering an unknown flaw and compromising our systems. 

To reduce Attack Surfaces, we need **simplification of our systems and services**. The less complex something is, the less likely it is to have undetected flaws. So, make sure to disable any extra services or protocols. If they're not totally necessary, then get them out of there. Every additional service that's operating represents additional Attack Surface that could have an undiscovered vulnerability. That vulnerability could be exploited and lead to compromise. This concept also applies to access and ACLs. Only allow access when totally necessary. So, for example, it's probably not necessary for employees to be able to access printers directly from outside of the local network. We can just adjust firewall rules to prevent that type of access. 

Another way to keep things simple is to **reduce our software deployments**. Instead of having five different software solutions to accomplish five separate tasks, replace them with one unified solution, if we can. That one solution should require less complex code, which reduces the number of potential vulnerabilities. We should also make sure to **disable unnecessary or unused components of software and systems deployed**. By disabling features not in use, we're reducing the attack surface even more. We're not only reducing the number of ways an attacker can get in, but we're also minimizing the amount of code that's active. It's important to take this approach at every level of the systems and networks under our administration. It might seem obvious to take these measures on critical networking infrastructure and servers, but it's just as important to do this for the desktop and laptop platforms that our employees use. 

Lots of consumer operating systems ship with a bunch of default services and software enabled right out of the box that we probably won't be using in an enterprise network or environment. The same goes for network devices. For example, Telnet access for a managed switch has no business being enabled in a real-world environment. We should disable it immediately if we find it on the device. Any vendor-specific API access should also be disabled if we don't plan on using these services or tools. They might seem harmless, especially if we've set up strong firewall rules and network ACLs, but each enabled service is still attack surface that could be reached if another layer of defense fails. 

### Host-Based Firewall
**Host-based firewalls** are important to creating multiple layers of security. They protect individual hosts from being compromised when they're used in untrusted and potentially malicious environments. They also protect individual hosts from potentially compromised peers inside a trusted network. Our network-based firewall has a duty to protect our internal network by filtering traffic in and out of it, while the host-based firewall on each individual host protects that one machine. Like our network-based firewall, we'd still want to start with an implicit deny rule. Then, we'd selectively enable the specific services and ports that will be used. This lets us start with a secure default and then only permit traffic that we know and trust. We can think of this as starting with a perfectly secure firewall configuration and then poking holes in it for the specific traffic we require. This may look very different from our network firewall configuration since it's unlikely that our employees would need remote SSH access to their laptops, for example. 
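The implicit deny approach can be sketched as a rule check that only returns "allow" when an explicit rule matches. The rules, subnets, and names below are hypothetical and describe an imaginary internal server; this is a model of the decision, not a real firewall configuration.

```python
import ipaddress
from typing import List, NamedTuple


class AllowRule(NamedTuple):
    source: str    # permitted source network, in CIDR notation
    port: int
    protocol: str


# Hypothetical rules for an internal web server.
ALLOW_RULES = [
    AllowRule("10.0.0.0/8", 443, "tcp"),   # HTTPS from the internal network
    AllowRule("10.8.0.0/24", 22, "tcp"),   # SSH from an admin subnet only
]


def is_allowed(src_ip: str, port: int, protocol: str,
               rules: List[AllowRule] = ALLOW_RULES) -> bool:
    """Return True only if an explicit allow rule matches."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if (addr in ipaddress.ip_network(rule.source)
                and port == rule.port
                and protocol == rule.protocol):
            return True
    return False  # implicit deny: nothing matched, so the traffic is dropped


if __name__ == "__main__":
    print(is_allowed("10.1.2.3", 443, "tcp"))     # internal HTTPS
    print(is_allowed("203.0.113.9", 443, "tcp"))  # outside source
    print(is_allowed("10.1.2.3", 22, "tcp"))      # SSH from the wrong subnet
```

Each rule is a hole poked in an otherwise closed configuration, so forgetting a rule fails safe by blocking traffic rather than exposing a service.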

A host-based firewall plays a big part in reducing what's accessible to an outside attacker. It provides flexibility while only permitting connections to selected services on a given host from specific networks or IP ranges. This ability to restrict connections from certain origins is usually used to implement a highly secure host or network. From there, access to critical or sensitive systems or infrastructure is permitted. These are called **Bastion hosts or networks**, and are specifically hardened and minimized to reduce what's permitted to run on them. Bastion hosts are usually exposed to the internet, so we should pay special attention to hardening and locking them down to reduce the chances of compromise. But they can also be used as a sort of gateway or access portal into more sensitive services like core authentication servers or domain controllers. This would let us implement more secure authentication mechanisms and ACLs on the Bastion hosts without making it inconvenient for our entire company. Monitoring and logging can be prioritized for these hosts more easily. Typically, these hosts or networks would also have severely limited network connectivity. It's usually just to the secure zone that they're designed to protect and not much else. Applications that are allowed to be installed and run on these hosts would also be restricted to those that are strictly necessary, since these machines have one specific purpose. 

Part of the host-based firewall rules will likely also provide ACLs that allow access from the VPN subnet. It's good practice to keep the network that VPN clients connect to separate, using both subnetting and VLANs. This gives us more flexibility to enforce security on these VPN clients. It also lets us build additional layers of defense. While a VPN host should be protected using other means, it's still a host that's operating in a potentially malicious environment. This host is then initiating a remote connection into our trusted internal network. These hosts represent another potential vector of attack and compromise. Our ability to separately monitor traffic coming and going from them is super useful. 

### Logging and Auditing
A critical part of any security architecture is **logging** and **alerting**. We need visibility into the security systems in place to see what kind of traffic they're seeing. We also need to have that visibility into the logs of all the infrastructure devices and equipment that we manage. But it's not enough to just have logs; we also need ways to safeguard logs and make them easy to analyze and review. All systems and services running on hosts will create logs of some kind, with different levels of detail. It depends on what's being logged, and what events the system is configured to log. So, an authentication server would log every authentication attempt, whether it's successful or not. A firewall would log traffic that matches rules, with details like source and destination addresses and the ports being used. All this logged information gives us details about the traffic and activity that's happening on our network and systems. This can be used to detect compromise or attempts to attack the system. When there are a large number of systems located around our network, each with their own log format, it can be challenging to make meaningful sense of all this data. This is where **security information and event management systems** or **SIEMs** come in. A SIEM can be thought of as a centralized log server that does centralized logging for security administration purposes. A SIEM system gets logs from a bunch of other systems. It consolidates the logs from all these different places and stores them in one centralized location. This makes handling logs a lot easier. 

**Log normalization** is the process of taking log data in different formats and converting it into a standardized format that's consistent with a defined log structure. For example, log entries from our firewall may have a timestamp using a year, month, and day format, while logs from our client machines may use a day, month, year format. To normalize this data, we choose one standard date format, then we define what the fields are for the log types that need to be converted. When logs are received from these machines, the log entries are converted into the standard that we defined, and stored by the logging server. This lets us analyze and compare log data between different log types and systems in a much easier fashion.

We also have to decide how much to log. If we log too much info, it's difficult to analyze the data and find useful information. Plus, storage requirements for saving logs become expensive very quickly. But if we log too little, then the information won't provide any useful insights into our systems and network. The right balance will vary depending on the unique characteristics of the systems being monitored, and the type of activity on the network. No matter what events are logged, all of them should have information that will help us understand what happened and reconstruct the events. There are lots of important fields to capture in log entries, like the timestamp, the event or error code, the service or application being logged, the user or system account associated with the event, and the devices involved in the event. Timestamps are super important to understanding when an event occurred. Fields like source and destination addresses will tell us who was talking to whom. For application logs, we can grab useful information from the logged in user associated with the event, and from what client they used.

On top of the analysis assistance it provides, a centralized log server also has security benefits. By maintaining logs on a dedicated system, it's easier to secure the system from attack. Logs are usually targeted by attackers after a breach, so that they can cover their tracks. By having critical systems send logs to a remote logging server that's locked down, the details of a breach should still be logged. A forensics team will be able to reconstruct the events that led to the compromise. Once logs are centralized and standardized, we can write automated alerting based on rules. Maybe we'll want to define an alert rule for repeated unsuccessful attempts to authenticate to a critical authentication server. Lots of SIEM solutions also offer handy dashboards to help analysts visualize this data, potentially providing more insight. **Log retention** or log storage requirements will vary based on the number of systems being logged, the amount of detail logged, and the rate at which logs are created. How long we want or need to keep logs around will also really influence the storage requirements for a log server. Some examples of logging servers and SIEM solutions are the open source rsyslog, Splunk Enterprise Security, IBM Security QRadar, and RSA Security Analytics. 
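Normalization and rule-based alerting can be sketched together in a few lines. The example below assumes two made-up source formats (year-month-day for the firewall, day/month/year for clients), converts both to ISO 8601, and then flags users with repeated authentication failures. All field names, event names, and the threshold are hypothetical.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List

# Hypothetical timestamp formats, one per log source.
SOURCE_FORMATS = {
    "firewall": "%Y-%m-%d %H:%M:%S",
    "client": "%d/%m/%Y %H:%M:%S",
}


def normalize(source: str, raw_timestamp: str, user: str, event: str) -> Dict[str, str]:
    """Convert one raw entry into the standard structure."""
    parsed = datetime.strptime(raw_timestamp, SOURCE_FORMATS[source])
    return {
        "timestamp": parsed.isoformat(),  # one standard date format
        "source": source,
        "user": user,
        "event": event,
    }


def users_to_alert(entries: List[Dict[str, str]], threshold: int = 3) -> List[str]:
    """Return users with at least `threshold` failed authentications."""
    failures = Counter(e["user"] for e in entries if e["event"] == "auth_failure")
    return sorted(user for user, count in failures.items() if count >= threshold)


raw_logs = [
    ("client", "05/03/2024 09:15:00", "alice", "auth_failure"),
    ("client", "05/03/2024 09:15:07", "alice", "auth_failure"),
    ("client", "05/03/2024 09:15:12", "alice", "auth_failure"),
    ("firewall", "2024-03-05 09:16:00", "bob", "auth_success"),
]
normalized = [normalize(*row) for row in raw_logs]

if __name__ == "__main__":
    for entry in normalized:
        print(entry)
    print("alert on:", users_to_alert(normalized))
```

Because every entry ends up with the same fields and date format, the alert rule doesn't need to know which system produced a given log line.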

### Antimalware Protection
Antimalware measures play a super important role in keeping various attacks off our systems and helping to protect our users. **Antivirus software** has been around for a really long time, but some security experts question the value it can provide to a company, especially since more sophisticated malware and attacks have appeared in recent years. Antivirus software is signature based. This means that it has a database of signatures that identify known malware, like the unique file hash of a malicious binary or a file associated with an infection. Or it could be the network traffic characteristics that malware uses to communicate with a command and control server. Antivirus software will monitor and analyze things like new files being created or modified on the system in order to watch for any behavior that matches a known malware signature. If it detects activity that matches a signature, depending on the signature type, it will attempt to block the malware from harming the system. But some signatures might only be able to detect the malware after the infection has occurred. In that case, it may attempt to quarantine the infected files. If that's not possible, it will just log the detection event and send an alert. 

At a high level, this is how all antivirus products work. There are two issues with antivirus software, though. The first is that it depends on antivirus signatures distributed by the antivirus software vendor. The second is that it depends on the antivirus vendor discovering new malware and writing new signatures for newly discovered threats. Until the vendor is able to write new signatures and publish and disseminate them, our antivirus software can't protect us from these emerging threats. On top of that, antivirus, which is designed to protect systems, actually represents an additional attack surface that attackers can exploit. Even so, antivirus software protects against the most common attacks out there on the internet, and it's an easy solution to provide that protection. It lets us remove the background noise and focus on the more important targeted or specific threats.

While antivirus operates on a blacklist model, checking against a list of known bad things and blocking what gets matched, there's a class of antimalware software that does the opposite. **Binary whitelisting** software operates off a whitelist. It's a list of known good and trusted software, and only things that are on the list are permitted to run. Everything else is blocked. We can think of this as applying the implicit deny ACL rule to software execution. By default, everything is blocked. Only things explicitly allowed to execute are able to. This typically only applies to executable binaries, not arbitrary files like PDF documents or text files. This would naturally defend against any unknown threats, but at the cost of convenience. 

Think about how frequently we download and install new software on our machine. Now imagine if we had to get approval before we could download and install any new software. Now, imagine that every system update had to be whitelisted before it could be applied. Obviously, not trusting anything by default wouldn't be very sustainable. It's for this reason that binary whitelisting software can trust software using a couple of different mechanisms. The first is the unique cryptographic hash of a binary, which identifies that exact binary. This is used to whitelist individual executables. The other trust mechanism is a **software-signing certificate**. Software signing, or code signing, is the same idea as signing with public key certificates, but applied to software. A software vendor can cryptographically sign the binaries they distribute using a private key. The signature can be verified at execution time using the public key embedded in the certificate and by verifying the trust chain of that public key. If the signed hash matches and the public key is trusted, then we can verify that the software came from someone with the software vendor's code signing private key. Binary whitelisting systems can be configured to trust specific vendors' code signing certificates. They permit all binaries signed with that certificate to run. This is helpful for automatically trusting content like system updates, along with software in common use that comes from reputable and trusted vendors. However, each new code signing certificate that's trusted represents an increase in attack surface. An attacker can compromise the code signing certificate of a software vendor that our company trusts and use that to sign malware that targets our company. That would bypass any binary whitelisting defenses in place.
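
The two trust mechanisms and the implicit deny default can be sketched as a single decision function. This is only an illustration: `ALLOWED_SHA256`, `TRUSTED_SIGNERS`, and `may_execute` are hypothetical names, and the sketch assumes the signature itself has already been cryptographically verified by some other component, so only the signer's name is passed in.

```python
import hashlib
from typing import Optional

# Hypothetical policy data; a real product would manage these centrally.
ALLOWED_SHA256 = {hashlib.sha256(b"approved-tool-v1").hexdigest()}
TRUSTED_SIGNERS = {"Example Vendor Code Signing"}

def may_execute(binary: bytes, verified_signer: Optional[str] = None) -> bool:
    """Implicit deny: allow only whitelisted hashes or trusted, verified signers.

    `verified_signer` is assumed to be the certificate subject of a signature
    that was ALREADY verified elsewhere; this sketch does no cryptography.
    """
    if hashlib.sha256(binary).hexdigest() in ALLOWED_SHA256:
        return True  # individually whitelisted executable
    if verified_signer is not None and verified_signer in TRUSTED_SIGNERS:
        return True  # signed by a vendor whose certificate we trust
    return False     # everything else is blocked by default

print(may_execute(b"approved-tool-v1"))
print(may_execute(b"system-update", "Example Vendor Code Signing"))
print(may_execute(b"random-download"))
```

The second branch is also where the attack surface grows: anything signed with a trusted certificate is allowed, including malware signed with a stolen key.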

### Disk Encryption
**Full-disk encryption** or **FDE** is an important factor in a defense-in-depth security model. It provides protection from some physical forms of attack. Systems with their entire hard drives encrypted are resilient against data theft. They'll prevent an attacker from stealing potentially confidential information from a hard drive that's been stolen or lost. Without also knowing the encryption password or having access to the encryption key, the data on the hard drive is just meaningless gibberish. This is a very important security mechanism to deploy for mobile devices like laptops, cell phones, and tablets, but it's recommended for desktops and servers too, since disk encryption not only provides confidentiality but also integrity. This means that an attacker with physical access to a system can't replace system files with malicious ones or install malware. Having the disk fully encrypted protects from data theft and unauthorized tampering even if an attacker has physical access to the disk. But in order for a system with an FDE setup to boot, there are some critical files that must be accessible. They need to be available before the primary disk can be unlocked and the boot process can continue. Because of this, all FDE setups have an unencrypted partition on the disk, which holds these critical boot files. Examples include things like the **kernel** and **bootloader**, which are critical to the operating system. These files are actually vulnerable to being replaced with modified, potentially malicious files by an attacker with physical access. While it's possible to compromise a machine this way, it would take a sophisticated and determined attacker to do it. There's also protection against this attack in the form of the secure boot protocol, which is part of the UEFI specification. **Secure boot** uses public key cryptography to secure these unencrypted elements of the boot process. It does this by integrating code signing and verification of the boot files. 
Initially, secure boot is configured with what's called a **platform key**, which is the public key corresponding to the private key used to sign the boot files. This platform key is written to firmware and is used at boot time to verify the signature of the boot files. Only files correctly signed and trusted will be allowed to execute. This way, secure boot protects against physical tampering with the unencrypted boot partition. There are first-party full-disk encryption solutions from Microsoft and Apple called BitLocker and FileVault 2 respectively. There are also a bunch of third-party and open source solutions. On Linux, the dm-crypt package is super popular. There are also solutions from PGP, TrueCrypt, VeraCrypt, and lots of others. 

Full-disk encryption schemes rely on a secret key for the actual encryption and decryption operations. They typically password-protect access to this key. In some cases, the password is used to derive a user key, which is then used to encrypt the master key. If the password needs to be changed, the user key can be swapped out without requiring a full decryption and re-encryption of the data being protected. That would only be necessary if the master encryption key itself needed to be changed. Password-protecting the key works by requiring the user to enter a passphrase to unlock the encryption key. It can then be used to access the protected contents on the disk. In many cases, this might be the same as the user account password to keep things simple and to reduce the number of passwords to memorize. When we implement a full-disk encryption solution at scale, it's super important to think about how to handle cases where passwords are forgotten. This is another convenience tradeoff when using FDE. If the passphrase is forgotten, then the contents of the disk aren't recoverable. This is why lots of enterprise disk encryption solutions have a key escrow functionality. **Key escrow** allows the encryption key to be securely stored for later retrieval by an authorized party. So if someone forgets the passphrase to unlock the encrypted disk on their laptop, the systems administrators are able to retrieve the escrowed key or recovery passphrase to unlock the disk. It's usually a separate key or passphrase that can unlock the disk in addition to the user-defined one. This allows for recovery if a password is forgotten. The **recovery key** is used to unlock the disk and boot the system fully. We should compare full-disk encryption against file-based encryption. That's where only some files or folders are encrypted and not the entire disk. This is usually implemented as home directory encryption. It serves a slightly different purpose compared to FDE. 
**Home directory or file-based encryption** only guarantees confidentiality and integrity of files protected by encryption. These setups usually don't encrypt system files because there are often compromises between security and usability. When the whole disk isn't encrypted, it's possible to remotely reboot a machine without being locked out. If we reboot a full-disk encrypted machine, the disk unlock password must be entered before the machine finishes booting and is reachable over the network again. So while file-based encryption is a little more convenient, it's less protected against physical attacks. An attacker can modify or replace core system files and compromise the machine to gain access to the encrypted data. 
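
The password-protected key scheme described above for full-disk encryption, where a passphrase-derived user key wraps a master key, can be sketched as follows. This is a toy under stated assumptions: XOR stands in for a real key-wrapping cipher purely to keep the example in the standard library, and the names `derive_user_key`, `change_passphrase`, and the iteration count are choices made for this illustration rather than how any particular FDE product works.

```python
import hashlib
import secrets

def derive_user_key(passphrase: str, salt: bytes) -> bytes:
    # PBKDF2 stretches the passphrase into a 32-byte user key.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), salt, 100_000)

def xor(a: bytes, b: bytes) -> bytes:
    # Toy stand-in for a real key-wrapping cipher; NOT suitable for production.
    return bytes(x ^ y for x, y in zip(a, b))

# The master key is what actually encrypts the disk contents.
master_key = secrets.token_bytes(32)
salt = secrets.token_bytes(16)

# Only the wrapped master key is stored; the passphrase unlocks it.
wrapped = xor(master_key, derive_user_key("old passphrase", salt))

# Key escrow: a second wrapped copy under a separate recovery key.
recovery_key = secrets.token_bytes(32)
escrow_wrapped = xor(master_key, recovery_key)

def change_passphrase(wrapped_key: bytes, old: str, new: str, salt: bytes) -> bytes:
    """Re-wrap the same master key under a new passphrase."""
    unwrapped = xor(wrapped_key, derive_user_key(old, salt))
    return xor(unwrapped, derive_user_key(new, salt))

rewrapped = change_passphrase(wrapped, "old passphrase", "new passphrase", salt)
print(xor(rewrapped, derive_user_key("new passphrase", salt)) == master_key)
```

Changing the passphrase only rewrites the small wrapped key, not the disk contents, which is why a full re-encryption is unnecessary. The escrowed copy still unlocks the same master key after the change.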

### Software Patch Management
While some attacks abuse software features that are exposed, a lot of attacks depend on exploiting bugs in software. This triggers obscure and unintended behavior, which can lead to a compromise of the system running the vulnerable software. These types of vulnerabilities can be fixed through **software patches and updates**, which correct the bugs that the attackers exploit. Software updates don't just improve software products by adding new features and improving performance and stability, they also address security vulnerabilities. There are some software bugs that are present in the core functionality of the software in question. This means that the vulnerability can't be mitigated by disabling the vulnerable service. 

An example of this was the Heartbleed vulnerability, a bug in the open source TLS library OpenSSL. It was discovered and widely publicized in April of 2014. The bug showed up in how the library handled TLS heartbeat messages. They're special messages that allow one party in a TLS session to signal to the other party that they'd like the session to be kept alive. This works by sending a TLS heartbeat request message, a packet that has a text string and the length of the string. The receiving end is supposed to reply with the same text string in its response. So, if the heartbeat request message contains the text "I'm still alive" and the length of 15, the receiving end would reply back with the same text, "I'm still alive." But the bug in the OpenSSL library was that the replying side would build its reply according to the length value in the received packet. This was based on the specified length of the string as it's defined in the packet, not on the actual length of the string. The value was not verified. This meant that an attacker could send a malformed heartbeat request message with a much larger length specified than the actual length of the text. The reply would contain the original text message but would also include bits of memory from the replying system. So, an attacker could send a malformed heartbeat request message containing the text "I'm still alive", but with a length of 500. Because the length value wasn't verified, the response back would be "I'm still alive", followed by the next 485 characters in memory. In this way, it was possible for an attacker to read up to 64 kilobytes of a target's memory. This memory was likely used before by the OpenSSL library, so it might contain sensitive information regarding other TLS sessions. This bug meant that it was feasible for an attacker to recover the private keys used to protect TLS sessions. This would allow them to decrypt TLS-protected sessions and recover details like login credentials. 
This is a great example of a mistake in the code leading to a very high profile software vulnerability. It could only be fixed through a software update or by switching to a different TLS library entirely. While the heartbeat functionality is enabled by default, it's possible to disable it in the OpenSSL library, but it wasn't a simple argument to pass to an application. Disabling this functionality required compiling the library with a flag that disables heartbeats. This was also a library widely used by both server applications and client applications. This means that it may not be possible to replace the OpenSSL library with a customized version or a different library. 
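
The Heartbleed over-read described above can be imitated with a small Python simulation. This is not OpenSSL's code: the function `heartbeat_reply`, the `SECRET` bytes, and the way "memory" is modeled as one byte string are assumptions made for this example. Python slices are bounds-checked, so adjacent memory is modeled explicitly to show the effect.

```python
# Simulated process memory: unrelated sensitive data sits right after the payload.
SECRET = b"-----PRIVATE KEY MATERIAL-----"

def heartbeat_reply(payload: bytes, claimed_length: int, validate: bool) -> bytes:
    """Build a heartbeat reply, with or without checking the claimed length."""
    memory = payload + SECRET  # the payload followed by whatever is next in memory
    if validate and claimed_length > len(payload):
        return b""  # patched behavior: silently discard the malformed request
    # Vulnerable behavior: trust the attacker-supplied length.
    return memory[:claimed_length]

honest = heartbeat_reply(b"I'm still alive", 15, validate=False)
leaked = heartbeat_reply(b"I'm still alive", 500, validate=False)
fixed = heartbeat_reply(b"I'm still alive", 500, validate=True)

print(honest)
print(SECRET in leaked)
print(fixed)
```

The only difference between the vulnerable and patched paths is one comparison between the claimed length and the actual payload length, which is why such a small mistake could have such a large impact.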

It's important to have a good system in place for checking for, distributing, and verifying software updates across software deployments. This is a complex problem when considering a large organization with many machines to manage that run a variety of software products. This is where management tools can help make this task more approachable for us. Solutions like Microsoft's SCCM or Puppet Labs' Puppet infrastructure tools allow administrators to get an overview of what software is installed across their fleet of many systems. This lets a security team analyze what specific software and versions are installed, to better understand the risk of vulnerable software in the fleet. When updates are released and pushed to the fleet, these reporting tools can help make sure that the updates have been applied. SCCM even has the ability to force install updates after a specified deadline has passed. Patching isn't just necessary for application software, but also for the operating systems and firmware that run on infrastructure devices. Every device has code running on it that might have software bugs that could lead to security vulnerabilities, from routers and switches to phones and even printers. Operating system vendors usually push security-related patches pretty quickly when an issue is discovered. They'll usually release security fixes out of cycle from typical OS upgrades to ensure a timely fix because of the security implications. But for embedded devices like networking equipment or printers, this might not be typical. Critical infrastructure devices should be approached carefully when we apply updates. There's always the risk that a software update will introduce a new bug that affects the functionality of a device, or that the update process itself will go wrong and cause an outage. 
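
As a small illustration of the fleet reporting idea above, the following sketch lists hosts still running a version older than the first patched release. The inventory format, host names, package name `examplelib`, and the helper names are all hypothetical; the sketch also assumes purely numeric dotted version strings, which real package versions don't always follow.

```python
def parse_version(text: str) -> tuple:
    # Assumes purely numeric dotted versions such as "1.0.7".
    return tuple(int(part) for part in text.split("."))

def unpatched_hosts(inventory: dict, package: str, minimum_fixed: str) -> list:
    """List hosts whose installed `package` version is older than `minimum_fixed`."""
    floor = parse_version(minimum_fixed)
    return sorted(
        host
        for host, packages in inventory.items()
        if package in packages and parse_version(packages[package]) < floor
    )

# Hypothetical inventory, as a management tool might report it.
inventory = {
    "web-01": {"examplelib": "1.0.1"},
    "web-02": {"examplelib": "1.0.7"},
    "db-01": {"otherpkg": "3.2.0"},
}

print(unpatched_hosts(inventory, "examplelib", "1.0.7"))
```

Comparing versions as tuples of integers rather than as strings avoids mistakes like treating "1.10.0" as older than "1.9.9".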

### Application Policies
Application software can represent a pretty large attack surface. This is especially true when it comes to a large fleet of systems used throughout an organization. So, it's important to have some kind of **application policies** in place. These policies serve two purposes. Not only do they define boundaries of what applications are permitted or not, but they also help educate folks on how to use software more securely. We've seen the risks that software can pose because of security vulnerabilities. It makes sense to have a policy around applying software updates in a timely way. A common recommendation, or even a requirement, is to only support or require the latest version of a piece of software, as these updates will often fix issues that someone may be encountering. This should be clearly called out in a policy. It's generally a good idea to disallow risky classes of software by policy. Things like file sharing software and piracy-related software tend to be closely associated with malware infections. 

If we want to employ a binary whitelisting solution, it's also important to define a policy around what type of software can be whitelisted. These policies usually require some kind of business use case or justification to avoid a lot of one-off personal software requests. Another class of software that we might want to have policies defined for is **browser extensions or add-ons**. Since a lot of workflows live exclusively within the web browser now, extensions represent a potential vector for malware that often gets overlooked. Extensions that require full access to websites visited can be risky, since the extension developer has the power to modify pages visited. Some extensions may even send user input to a remote server. This could potentially leak confidential information. Clearly defining classifications of risky extensions and add-ons will help protect our systems and provide guidance to our users. But policies are usually not enough to arm users with the information they need to make informed choices. Their decisions can impact the security of our organization, and that's where education and training come into play.

## Creating a Company Culture for Security
### Security Goals
In order to design a security architecture, we need to define exactly what we'd like it to accomplish, and this will depend on what our company thinks is most important. It will probably have a way it wants different data to be handled and stored. We also need to know if our company has any legal requirements when it comes to security. If our company handles credit card payments then, depending on local laws, we have to follow the **PCI DSS** or **Payment Card Industry Data Security Standard**. PCI DSS is broken into six broad objectives, each with some requirements. 
* The first objective is to build and maintain a secure network and systems. This includes the requirements to install and maintain a firewall configuration to protect cardholder data and to not use vendor-supplied defaults for system passwords and other security parameters. It provides more specific guidance around what a firewall configuration should control. For example, a secure firewall configuration should restrict connections between untrusted networks and any systems in the cardholder data environment. 

* The second objective is to protect cardholder data. In this objective, the first requirement is to protect stored cardholder data. The second is to encrypt the transmission of cardholder data across open public networks. The requirements give us specific guidelines on how to get this done. The specifics of these requirements help clarify some of the points, like what constitutes an open network. They also recommend using strong cryptography and offer some examples. But not all requirements are technical in nature. Take the requirement to protect stored cardholder data, for example. It includes requirements for data retention policies to make sure that sensitive payment information isn't stored beyond the time it's required. Once payment is authorized, authentication data shouldn't be needed anymore and it should be securely deleted. This highlights the fact that good security defenses aren't just technical in nature. They are also procedural and policy-based.

* The third objective is to maintain a vulnerability management program. The first requirement is to protect all systems against malware and regularly update antivirus software or programs. The second is to develop and maintain secure systems and applications. The detailed implementation procedures within these requirements cover things like ensuring all systems have antivirus software installed and making sure this software is kept up to date. They also require that scans are run regularly and logs are maintained. There are also requirements for ensuring systems and software are protected against known vulnerabilities by applying security patches within one month of the release of a security patch. Use of third-party security vulnerability databases is also listed to help identify known vulnerabilities within managed systems. 

* The fourth objective is to implement strong access control measures. This objective has three requirements. The first is to restrict access to cardholder data by business need-to-know. The second is to identify and authenticate access to system components. And the third is to restrict physical access to cardholder data. This highlights the importance of good access control measures along with good data access policies. The first requirement, restricting access to data by business need-to-know, means that any sensitive data should be governed by data access policies to make sure that customer data isn't misused. Part of the second requirement is to enforce password authentication for system access and two-factor authentication for remote access. That's the minimum requirement. Another important piece highlighted by the PCI DSS requirements is access control for physical access. This is a critical security aspect to keep in mind, since we need to protect systems and data from both physical theft and virtual attacks. 

* The fifth objective is to regularly monitor and test networks. The first requirement is to track and monitor all access to network resources and cardholder data. The second is to regularly test security systems and processes. The requirement for network monitoring and testing is another essential part of a good security plan. This refers to things like setting up and configuring intrusion detection systems and conducting vulnerability scans of the network. Testing defenses is another super important part of this as just having the systems in place isn't enough. It's really helpful to test defense systems regularly to make sure that they provide the protection that we want. It also ensures that the alerting systems are functional. 

* The sixth and final objective is to maintain an information security policy. It only has one requirement, to maintain a policy that addresses information security for all personnel. This requirement addresses why we need to have well-established security policies. They help govern and regulate user behavior when it comes to information security. It's important to call out that this requirement mentions that the policy should be for all personnel. The responsibility for information security isn't only on the security teams. Every member of an organization is responsible for information security. Well-designed security policies address the most common questions or use cases that users would have, based on the specific details of the organization. Everyone who uses systems on our organization's network is able to get around security. They might not mean to, but they can reduce the overall security with their actions and practices. That's why well-thought-out security policies also need to be easy to find and easy to read. 

### Measuring and Assessing Risk
Security is all about determining risks or exposure, understanding the likelihood of attacks, and designing defenses around these risks to minimize the impact of an attack. **Security risk assessment** starts with threat modeling. First, we identify likely threats to our systems, then we assign them priorities that correspond to severity and probability. We do this by brainstorming from the perspective of an outside attacker, putting ourselves in a hacker's shoes. It helps to start by figuring out what high-value targets an attacker may want to go after. From there, we can start to look at possible attack vectors that could be used to gain access to high-value assets. High-value data usually includes account information, like usernames and passwords. Typically, any kind of user data is considered high value, especially if payment processing is involved. 

Another part of risk measurement is understanding what vulnerabilities are on our systems and network. One way to find these out is to perform regular vulnerability scanning. There are lots of open source and commercial solutions that we can use, which can be configured to perform scheduled, automated scans of designated systems or networks to look for vulnerabilities. Then they generate a report. Some of these tools are Nessus, OpenVAS, and Qualys. **Vulnerability scanners** are services that run on a system within our control and conduct periodic scans of configured networks. The service first conducts scans to find and discover hosts on the network. Once hosts are found, either through a ping sweep or port scanning, more detailed scans are run against the discovered hosts. A **port scan** of either common ports or all possible valid ports is conducted against discovered hosts to determine what services are listening. These services are then probed to try to discover more info about the type of service and what version is listening on the relevant port. This information can then be checked against databases of known vulnerabilities. If a vulnerable version of a service is discovered, the scanner will add it to its report. Once the scan is finished, the discovered vulnerabilities and hosts are compiled in a report. That way, analysts can quickly and easily see where the problem areas are on the network. Found vulnerabilities are prioritized according to severity and other categorization. Severity takes into account a number of things, like how likely the vulnerability is to be exploited. It also considers the type of access the vulnerability would provide to an attacker and whether or not it can be exploited remotely. Vulnerabilities in the report will have links to detailed and disclosed information about the vulnerability. In some cases, the report will also have recommendations on how to get rid of it. 
Vulnerability scanners will detect lots of things, ranging from misconfigured services that represent potential risks to the presence of backdoors on systems. Vulnerability scanning can only detect known and disclosed vulnerabilities and insecure configurations. That's why it's important for us to have an automated vulnerability scan conducted regularly. We'll also need to keep the vulnerability database up to date, to make sure new vulnerabilities are detected quickly. But vulnerability scanning isn't the only way to put our defenses to the test. 
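
The port scan step described above can be sketched with a simple TCP connect scan using Python's standard `socket` module. The function name `scan_ports` and the default timeout are choices made for this example; real scanners use faster and more varied techniques, and a scan like this should only be pointed at systems we're authorized to test.

```python
import socket

def scan_ports(host: str, ports, timeout: float = 0.5) -> list:
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            # A completed TCP handshake means something is listening on the port.
            with socket.create_connection((host, port), timeout=timeout):
                open_ports.append(port)
        except OSError:
            pass  # closed, filtered, or unreachable
    return open_ports
```

For example, `scan_ports("127.0.0.1", [22, 80, 443])` would return whichever of those ports have a listening service on the local machine. A vulnerability scanner would then probe each open port to identify the service and its version.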

Conducting regular **penetration tests** is also really encouraged to test our defenses even more. These tests will also ensure detection and alerting systems are working properly. Penetration testing is the practice of attempting to break into a system or network to verify the defenses in place. Think of this as playing the role of a bad guy, for educational purposes. This exercise isn't designed to see if we have the acting chops. It's intended to make us think like an attacker and use the same tools and techniques they would use. This way, we can test our systems to make sure they protect us like they're supposed to. The results of the penetration testing reports will also show us where weak points or blind spots exist. These tests help improve defenses and guide future security projects. They can be conducted by members of our in-house security team. If our internal team doesn't have the resources for this exercise, we can hire a third-party company that offers penetration testing as a service. That would help give us more perspectives on our defense systems, and we'll get a more comprehensive test this way.

### Privacy Policy
When we're supporting systems that handle customer data, it's super important to protect it from unauthorized and inappropriate access. It's not just to defend against external threats, it also protects that data against misuse by employees. This type of behavior would fall under our company's privacy policies. **Privacy policies** oversee the access and use of sensitive data. They also define what appropriate and authorized use is, and what provisions or restrictions are in place when it comes to how the data is used. People might not consider the security implications of their actions, so both privacy and data access policies are important for guiding and informing people on how to maintain security while handling sensitive data. Having defined and well-established privacy policies is an important part of good privacy practices. But we also need a way to enforce these policies. Periodic audits of cases where sensitive data was accessed can get us there. This is enabled by our logging and monitoring systems. Auditing data access logs is super important. It helps us ensure that sensitive data is only accessed by people who are authorized to access it, and that they use it for the right reasons. It's good practice to apply the principle of least privilege here, by not allowing access to this type of data by default. We should require anyone that needs access to first make an access request with a justification for getting the data. But it can't just be a vague or generic request for access. Requesters should be required to specify what data they need access to. Usually, this type of request would also have a time limit that should be called out in the request. That way, we can ensure that data access is only permitted for legitimate business reasons, which reduces the likelihood of inappropriate data access or usage. By logging each data access request and the actual data access, we can also correlate requests with usage. 
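
Correlating access requests with actual data access can be sketched as a simple audit function. The record shapes used here, `(user, dataset, start, end)` for approved requests and `(user, dataset, when)` for access events, are assumptions made for this example, as are the names; real audit tooling would read these from the logging system.

```python
from datetime import datetime

def unmatched_accesses(approved_requests, accesses):
    """Return access events that no approved, time-limited request covers."""
    flagged = []
    for user, dataset, when in accesses:
        covered = any(
            r_user == user and r_dataset == dataset and start <= when <= end
            for r_user, r_dataset, start, end in approved_requests
        )
        if not covered:
            flagged.append((user, dataset, when))
    return flagged

# Hypothetical records: one approved request with a one-week time limit.
approved_requests = [
    ("alice", "billing", datetime(2024, 3, 1), datetime(2024, 3, 8)),
]
accesses = [
    ("alice", "billing", datetime(2024, 3, 2)),   # covered by the request
    ("alice", "billing", datetime(2024, 3, 20)),  # after the time limit
    ("bob", "billing", datetime(2024, 3, 2)),     # no request at all
]

for event in unmatched_accesses(approved_requests, accesses):
    print("needs investigation:", event)
```

Because each request names a specific dataset and time window, both an expired request and a missing request show up as uncovered access.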

Any access that doesn't have a corresponding request should be flagged as a high-priority potential breach that needs to be investigated as soon as possible. Company policies act as our guidelines and informational resources on how, and how not, to access and handle data. They're equally important here. Policies will range from sensitive data handling to public communications. Data handling policies should cover the details of how different data is classified. If something is considered sensitive or confidential, we probably have stipulations that this data shouldn't be stored on media that's easily lost or stolen, like USB sticks or portable hard drives. These devices are also commonly used without any encryption at all.

### User Habits
Improving our users' habits and actions starts with having clear and reasonable security policies. We also have to make sure that our users are diligent about maintaining security. Leaks and disclosures can be avoided by understanding what employees need to do to accomplish their jobs. If an employee needs to share a confidential file with an external partner and it's too big to email, they may want to upload it to a third-party file sharing website that they have a personal account with. This is risky business. We should never upload confidential information onto a third-party service that hasn't been evaluated by our company. If sharing big files with external parties is common behavior for our employees, it's best to find a solution that meets the needs of our users and the security guidelines. By providing a sanctioned and approved mechanism for this file sharing activity, users are less likely to expose the organization to unnecessary risk. 

Users don't like to memorize long, complicated passwords, but this is super important to keeping our company safe. If we require 20-character passwords that have to be changed every three months, our users will almost definitely write them down. This compromises the security that our complex password policy is supposed to provide. It's important to understand what threats password policies are supposed to protect against. That way, we can try to find a better balance between security and usability. A long and complex password requirement is designed to protect against brute-force attacks, either against authentication systems or against a stolen database of hashed passwords. Since direct brute-force attacks against authentication infrastructure should be easily detected and blocked by intrusion prevention systems, they can be considered pretty low risk. The theft of a password database would be a super serious breach, but we do have lots of additional layers of security in place to prevent a critical compromise like that from happening in the first place. So the two attacks that complex passwords are primarily designed to protect against are fairly low risk. That means we can relax the password requirements a bit and not ask for overly long passwords. 

We can even adjust the mandatory password rotation time period. Password reuse is another common user behavior. People don't want a bunch of passwords to memorize, so lots of users find it easier to use the same password for both their personal email account and their work account. But this undermines the security of their work password. If an online service is compromised and the password database is leaked, they're in trouble. The passwords in that database will find their way into password files used for cracking passwords and brute-force attacks. Once a password isn't a secret, it shouldn't be used anymore. The chances of a bad actor being able to use the password are too high. That's why it's important to make sure employees use new and unique passwords, and don't reuse them from other services. It's also important to have a password change system check against old passwords. This will prevent users from changing their password back to a previously used, potentially compromised password. 
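
A password change system that checks against old passwords can be sketched like this. The sketch assumes the history stores only salted PBKDF2 digests, never plaintext passwords; the function names, the iteration count, and the in-memory `history` list are assumptions made for this example.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, salt: bytes) -> bytes:
    # Salted, stretched digest; the plaintext password is never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def remember(password: str, history: list) -> None:
    """Append a (salt, digest) pair for a password that is now in use."""
    salt = secrets.token_bytes(16)
    history.append((salt, hash_password(password, salt)))

def is_reused(candidate: str, history: list) -> bool:
    """Check a proposed password against every previously used one."""
    return any(
        hmac.compare_digest(hash_password(candidate, salt), digest)
        for salt, digest in history
    )

history = []
remember("correct horse", history)
print(is_reused("correct horse", history))
print(is_reused("battery staple", history))
```

Because each old password has its own salt, the candidate has to be re-hashed once per history entry, which is an acceptable cost for the small histories a change system typically keeps.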

A much greater risk in the workplace that users should be educated on is credential theft from phishing emails. Phishing emails are pretty effective. They take advantage of people's inclination to open emails without looking at them too closely. If an email that seems authentic actually leads to a fake login page, users may blindly enter their credentials into the fake site and disclose them to an attacker. While having two factor authentication helps protect against this type of attack, OTP-based two factor solutions would still provide usable credentials to an attacker. Plus, the attacker still has the password, which is really not good even in a two factor environment. If someone entered their password into a phishing site, or even suspects they did, it's important to change their password as soon as possible. Our organization should try to detect these types of password disclosures using tools like Password Alert. This is a Chrome extension from Google that can detect when we enter our password into a site that's not a Google page. Being able to detect when a password is entered into a potentially untrustworthy site lets an organization detect potential phishing compromises. But we can also combat phishing attacks with good spam filtering combined with good user education. 
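
The general idea of detecting a password typed into an untrusted site can be sketched as below. This is not how Password Alert is actually implemented; it's a simplified illustration under stated assumptions. The `PasswordDisclosureDetector` class, the `TRUSTED_HOSTS` set, and the example hostnames are all hypothetical.

```python
import hashlib
import hmac
import os
from urllib.parse import urlparse

# Assumption: the only hosts where typing the work password is expected.
TRUSTED_HOSTS = {"login.example.com"}

def fingerprint(secret: str, salt: bytes) -> bytes:
    """Salted hash, so the detector never stores the password itself."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)

class PasswordDisclosureDetector:
    def __init__(self, password: str):
        self.salt = os.urandom(16)
        self.digest = fingerprint(password, self.salt)

    def should_alert(self, url: str, typed_text: str) -> bool:
        """True when the protected password is typed on an untrusted host."""
        host = urlparse(url).hostname or ""
        if host in TRUSTED_HOSTS:
            return False
        # Exact host matching matters: look-alike hosts must not pass.
        return hmac.compare_digest(fingerprint(typed_text, self.salt),
                                   self.digest)
```

With this sketch, typing the work password at `https://login.example.com/signin` raises no alert, while typing it at a look-alike such as `https://login.example.com.evil.test/signin` does. An alert like this is the signal to change the password right away.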

### Third-Party Security
Sometimes, we need to rely on **third party solutions**, or service providers, because we might not be able to do everything in-house. In some cases, we'll have to trust that third party with a lot of potentially sensitive data or access. When we contract services from a third party, we're trusting them to protect our data and any credentials involved. If they have subpar security, we're undermining our security defenses by potentially opening a new avenue of attack. It's important to hire trustworthy and reputable vendors whenever we can. We also need to manage the engagements in a controlled way. This involves conducting a vendor risk review or security assessment. In typical vendor security assessments, we ask vendors to complete a questionnaire that covers different aspects of their security policies, procedures, and defenses. The questionnaire is designed to determine whether or not they've implemented good security designs in their organization. 

For software services or hardware vendors, we might also ask to test the software or hardware. That way, we can evaluate it for potential security vulnerabilities or concerns before deciding to contract their services. It's important to understand how well protected our business partners are before deciding to work with them. If they have poor security practices, our organization's security could be at risk. If we contract services from a company that will be handling data on our behalf, the security of our data is in the hands of this third party. It's important to understand how safe our data will be with them. Sometimes, vendors will perform tasks for us, so they'll have access to our network and systems. In these cases, it's also important to understand how well secured the third party is. A compromise of their infrastructure could lead to a breach of our systems. 

While the questionnaire model is a quick way to assess a third party, it's not ideal. It depends on self-reporting of practices, which is pretty unreliable. Without a way to verify or prove what's stated in the questionnaire, we have to trust that the company is answering honestly. While we'd hope that a company we're doing business with would be honest, it's best to verify. If we can, we should ask for a third party security assessment report. Some of the information on the questionnaire can be verified, like third party security audit results and penetration testing reports. In the case of third party software, we might be able to conduct some basic vulnerability assessments and tests to ensure the product has some reasonable security. There are lots of companies that will evaluate vendors for us for a price. But Google has made its **vendor security assessment questionnaires** available for free. They're a great starting point for designing our own vendor security assessment questionnaire, or we can just use them as is. 

If the third party service involves the installation of any infrastructure equipment on site, we should pay close attention to how they're doing it. We have to make sure this equipment is managed in a way that doesn't negatively affect overall security. Let's say the vendor company requires remote access to the infrastructure device to perform maintenance. If that's the case, we should adjust firewall rules to restrict this access. That way, we'll make sure that it can't be used as an entry point into our network. Additional monitoring would also be recommended for this third party device, since it represents a new potential attack surface in our network. We can also run in-depth vulnerability assessments and penetration testing of the hardware to make sure there aren't any obvious vulnerabilities in the product. 
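
The intent of restricting vendor remote access can be expressed as a small allowlist check. The sketch below is not a real firewall configuration. The vendor address range, the device address, and the maintenance port are made-up placeholders taken from documentation address ranges, and the `is_allowed` function is hypothetical.

```python
import ipaddress

# Assumptions for illustration only.
VENDOR_NETWORKS = [ipaddress.ip_network("203.0.113.0/28")]  # vendor's source range
DEVICE_ADDRESS = ipaddress.ip_address("192.0.2.10")         # the vendor's device
MAINTENANCE_PORT = 22                                       # assumed maintenance port

def is_allowed(src: str, dst: str, dst_port: int) -> bool:
    """Allow vendor traffic only to the one device and the one port.

    Anything else from the vendor, including attempts to reach other
    hosts on our network, is denied by default.
    """
    source = ipaddress.ip_address(src)
    destination = ipaddress.ip_address(dst)
    from_vendor = any(source in network for network in VENDOR_NETWORKS)
    return (from_vendor
            and destination == DEVICE_ADDRESS
            and dst_port == MAINTENANCE_PORT)
```

The rule is deliberately narrow: it names the source range, the single destination, and the single port. A connection from the vendor to any other internal address fails the check, which is what keeps the device from becoming an entry point into the rest of the network.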
