# CMP 5006 - Information Security 


## Cryptographic Security Metrics:
### Key Metrics for Evaluating Cryptographic Techniques


### Alejandro Proano, PhD. 

##  Limitations of Classical Techniques

- Vulnerable to:
  * Frequency Analysis
  * Pattern Recognition
  * Brute Force Attacks

- Modern Cryptography Advances:
  * Complex Mathematical Algorithms
  * Large Key Spaces
  * Computational Complexity


## What is Cryptanalysis?
- **Definition**: The art and science of breaking cryptographic systems
- **Primary Goals**:
  - Understand cipher weaknesses
  - Recover original message without knowing the key
  - Exploit systematic vulnerabilities


## Cryptanalysis Techniques
- Frequency Analysis
- Pattern Recognition
- Statistical Methods
- Known Plaintext Attacks



##  Frequency Analysis Fundamentals
### Letter Frequency in English
```
Most Common Letters:
E - 12.7%
T - 9.1%
A - 8.2%
O - 7.5%
I - 7.0%
N - 6.7%
```

### Typical Letter Distribution
- **Vowels**: E, A, I, O, U
- **Common Consonants**: T, N, S, R, H


## Substitution Cipher Cryptanalysis
### Frequency Analysis Method
1. Count letter frequencies in ciphertext
2. Compare with standard English distribution
3. Map most frequent ciphertext letters to most frequent plaintext letters


In [6]:
def decrypt_caesar_cipher(encrypted_text, key):
    """Shift each ASCII letter back by `key` positions; leave other characters unchanged."""
    decrypted_text = ""
    for letter in encrypted_text:
        if letter.isascii() and letter.isalpha():
            # Use the base that matches the letter's case so lowercase input also decrypts.
            base = ord('A') if letter.isupper() else ord('a')
            decrypted_text += chr((ord(letter) - base - key) % 26 + base)
        else:
            decrypted_text += letter

    return decrypted_text

def count_letters(text):
    """Count how often each letter occurs, ignoring case."""
    letter_count = {}
    for letter in text:
        if letter.isalpha():
            key = letter.lower()
            letter_count[key] = letter_count.get(key, 0) + 1
    return letter_count

In [1]:
### Example:

ciphertext = """
Wkxi iokbc vkdob, kc ro pkmon dro psbsxq caekn, Myvyxov Kebovskxy Leoxnsk 
gkc dy bowowlob drkd nscdkxd kpdobxyyx grox rsc pkdrob dyyu rsw dy nscmyfob smo. 
Kd drkd dswo Wkmyxny gkc k fsvvkqo yp dgoxdi knylo ryecoc, lesvd yx dro lkxu yp 
k bsfob yp mvokb gkdob drkd bkx kvyxq k lon yp zyvscron cdyxoc, grsmr gobo grsdo
kxn oxybwyec, vsuo zborscdybsm oqqc. Dro gybvn gkc cy bomoxd drkd wkxi drsxqc 
vkmuon xkwoc, kxn sx ybnob dy sxnsmkdo drow sd gkc xomocckbi dy zysxd. 
Ofobi iokb nebsxq dro wyxdr yp Wkbmr k pkwsvi yp bkqqon qizcsoc gyevn cod ez 
drosb doxdc xokb dro fsvvkqo, kxn gsdr k qbokd ezbykb yp zszoc kxn uoddvonbewc 
droi gyevn nsczvki xog sxfoxdsyxc. Psbcd droi lbyeqrd dro wkqxod
"""

In [2]:
from collections import Counter
tokens = list(ciphertext.lower().replace(' ','').replace('\n',''))

In [3]:
Counter(tokens)

Counter({'o': 67,
         'd': 54,
         'k': 53,
         'y': 40,
         'x': 38,
         'b': 36,
         's': 36,
         'c': 32,
         'r': 30,
         'n': 23,
         'v': 20,
         'w': 15,
         'g': 15,
         'q': 14,
         'p': 13,
         'm': 13,
         'e': 13,
         'i': 12,
         'z': 9,
         ',': 7,
         'l': 7,
         'f': 6,
         'u': 5,
         '.': 4,
         'a': 1})

In [4]:
alph = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
# itoc: index -> character, ctoi: character -> index
itoc = {ix: ch for ix, ch in enumerate(alph)}
ctoi = {ch: ix for ix, ch in itoc.items()}

In [5]:
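# In English, the most frequent letter is E.
# In the ciphertext, the most frequent letter is o, so assume plaintext e -> ciphertext o.
# a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
# The key is the distance between the two letters: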
ctoi['o'] - ctoi['e']

10

In [7]:
print(decrypt_caesar_cipher(ciphertext.upper(),10))


MANY YEARS LATER, AS HE FACED THE FIRING SQUAD, COLONEL AURELIANO BUENDIA 
WAS TO REMEMBER THAT DISTANT AFTERNOON WHEN HIS FATHER TOOK HIM TO DISCOVER ICE. 
AT THAT TIME MACONDO WAS A VILLAGE OF TWENTY ADOBE HOUSES, BUILT ON THE BANK OF 
A RIVER OF CLEAR WATER THAT RAN ALONG A BED OF POLISHED STONES, WHICH WERE WHITE
AND ENORMOUS, LIKE PREHISTORIC EGGS. THE WORLD WAS SO RECENT THAT MANY THINGS 
LACKED NAMES, AND IN ORDER TO INDICATE THEM IT WAS NECESSARY TO POINT. 
EVERY YEAR DURING THE MONTH OF MARCH A FAMILY OF RAGGED GYPSIES WOULD SET UP 
THEIR TENTS NEAR THE VILLAGE, AND WITH A GREAT UPROAR OF PIPES AND KETTLEDRUMS 
THEY WOULD DISPLAY NEW INVENTIONS. FIRST THEY BROUGHT THE MAGNET
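
The manual steps above (count the letters, assume the most frequent ciphertext letter stands for E, read off the shift) can be sketched as a single function. `guess_caesar_key` and `shift_text` are hypothetical helpers written for this illustration, and the sample sentence is made up so that E clearly dominates. On short or unusual texts the most frequent letter may not be E, so the guess should be checked by reading the decrypted output.

```python
from collections import Counter
import string

def guess_caesar_key(text, most_common_plain="e"):
    """Guess a Caesar shift, assuming the top ciphertext letter encrypts `most_common_plain`."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    top_letter, _ = Counter(letters).most_common(1)[0]
    return (ord(top_letter) - ord(most_common_plain)) % 26

def shift_text(text, key):
    """Shift ASCII letters forward by `key`; a negative key shifts them back."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

# Made-up sample text, encrypted with shift 10 and then attacked.
plaintext_demo = "meet me near the green trees between seven and eleven"
ciphertext_demo = shift_text(plaintext_demo, 10)

recovered_key = guess_caesar_key(ciphertext_demo)
print(recovered_key)
print(shift_text(ciphertext_demo, -recovered_key))
```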



## 1. Computational Security Metrics

## Key Space
- **Definition:** Total number of possible keys
- **Importance:** Larger key space makes brute-force attacks more difficult
- **Calculation:** $|K| = 2^{b}$ for $b$-bit keys
- **Example:** 
  - 128-bit key = $2^{128}$ possible combinations
  - 256-bit key = $2^{256}$ possible combinations


In [11]:
import math

# log10 of the key-space size: the number of keys is about 10 raised to this value
print(math.log10(2**128))
print(math.log10(2**256))


38.53183944498959
77.06367888997919


In [22]:
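# DES used 56-bit keys
# Rough exhaustive-search estimate. Assumed reading of the constants:
# 10**(-9) s per key trial, 2000 machines in parallel, 3600 s per hour.

a = 2**56  # size of the DES key space

print(a * 10**(-9) / (2000*3600) )  # hours needed to try every key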

10.007999171934436
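
The same estimate can be repeated for other key lengths. The sketch below is one way to do it: `brute_force_hours` is a hypothetical helper, and the default trial rate of $10^9$ keys per second and the 2000 parallel machines are assumptions chosen to match the constants in the cell above, not measured figures.

```python
def brute_force_hours(key_bits, keys_per_second=1e9, machines=2000):
    """Hours needed to try every key of `key_bits` bits under the assumed trial rate."""
    total_keys = 2 ** key_bits
    seconds = total_keys / (keys_per_second * machines)
    return seconds / 3600

for bits in (56, 128, 256):
    print(bits, f"{brute_force_hours(bits):.3e} hours")
```

Each extra key bit doubles the result, which is why moving from 56 to 128 bits pushes exhaustive search far out of reach under these assumptions.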


## Time Complexity
- **Measures:** Computational effort required to break the encryption
- **Common Classifications:**
  - Polynomial time
  - Exponential time
  - Sub-exponential time
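
To get a feel for how differently these classes grow, the sketch below compares three illustrative cost functions: $n^3$ (polynomial), $2^{\sqrt{n}}$ (sub-exponential) and $2^n$ (exponential). The functions are assumptions picked for this example, not the cost of any specific attack, and each one returns $\log_2$ of the operation count so that large values stay printable.

```python
import math

def log2_polynomial(n, degree=3):
    """log2 of n**degree operations."""
    return degree * math.log2(n)

def log2_sub_exponential(n):
    """log2 of 2**sqrt(n) operations."""
    return math.sqrt(n)

def log2_exponential(n):
    """log2 of 2**n operations."""
    return float(n)

for n in (128, 1024, 4096):
    print(n,
          round(log2_polynomial(n), 1),
          round(log2_sub_exponential(n), 1),
          log2_exponential(n))
```

For small inputs the polynomial can be the largest of the three. The ordering polynomial < sub-exponential < exponential only settles in as $n$ grows.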


##  2. Information Theoretic Metrics


## Entropy
- **Definition:** Measure of unpredictability or randomness
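- **Calculation:** Shannon entropy formula, $H(X) = -\sum_{x} \Pr[x] \log_2 \Pr[x]$
- **Interpretation:** Higher entropy in keys and ciphertext means less predictability, which indicates stronger security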
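
As a minimal sketch of the Shannon entropy calculation, the function below estimates entropy in bits per symbol from the symbol frequencies of a string. `shannon_entropy` is a hypothetical helper, and estimating probabilities from raw counts is a simplifying assumption that only makes sense for reasonably long inputs.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Empirical Shannon entropy of `text`, in bits per symbol."""
    counts = Counter(text)
    total = sum(counts.values())
    # sum of p * log2(1/p), which equals -sum of p * log2(p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

print(shannon_entropy("aaaaaaaa"))  # a single repeated symbol: no uncertainty
print(shannon_entropy("abababab"))  # two equally likely symbols
print(shannon_entropy("abcdefgh"))  # eight equally likely symbols
```

The three calls give 0, 1 and 3 bits per symbol, matching $\log_2$ of the number of equally likely symbols in each case.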


### Key Equivocation

Key equivocation measures how much uncertainty about the key remains once the ciphertext is known.

> $H(K | C) = H(K) + H(P) - H(C)$
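
The identity can be checked numerically on a small example. The toy cryptosystem below (two plaintexts, three keys, four ciphertexts, with the probabilities shown) is an assumption made up for illustration. The sketch computes $H(K|C)$ once through the formula and once directly as $H(K, C) - H(C)$, assuming key and plaintext are chosen independently.

```python
import math
from collections import defaultdict

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# Toy cryptosystem (illustrative assumption).
pr_plain = {"a": 0.25, "b": 0.75}
pr_key = {"K1": 0.5, "K2": 0.25, "K3": 0.25}
encrypt = {
    ("K1", "a"): 1, ("K1", "b"): 2,
    ("K2", "a"): 2, ("K2", "b"): 3,
    ("K3", "a"): 3, ("K3", "b"): 4,
}

pr_cipher = defaultdict(float)      # Pr[C = y]
pr_key_cipher = defaultdict(float)  # Pr[K = k, C = y]
for (k, x), y in encrypt.items():
    p = pr_key[k] * pr_plain[x]     # key and plaintext are independent
    pr_cipher[y] += p
    pr_key_cipher[(k, y)] += p

via_formula = entropy(pr_key) + entropy(pr_plain) - entropy(pr_cipher)
direct = entropy(pr_key_cipher) - entropy(pr_cipher)
print(round(via_formula, 4), round(direct, 4))
```

Both values come out at about 0.46 bits, below $H(K) = 1.5$ bits: seeing the ciphertext removes part of the uncertainty about the key, but not all of it.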

## Spurious Keys

A cryptanalyst trying to determine the key used to produce a ciphertext can usually rule out many keys and narrow the candidates down to a smaller set. One key in that set is the true key **k**; the other keys in the set are called **spurious keys**.

## Redundancy

Suppose $L$ is a natural language. The entropy of $L$ is defined to be the quantity

> $H_L = \lim_{n \rightarrow \infty} \frac{H({\bf P}^n)}{n}$

where ${\bf P}^n$ is the random variable whose probability distribution is that of all $n$-grams of plaintext.


The redundancy of $L$ is defined to be

> $R_L = 1 - \frac{H_L}{\log_2 |P|}$

$H_L$ measures the entropy per letter of the language $L$. A random language would have entropy $\log_2 |P|$. So the quantity $R_L$ measures the fraction of **excess characters**, which we think of as redundancy.
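
As a worked instance of the redundancy formula, the sketch below plugs in a 26-letter alphabet and $H_L = 1.25$ bits per letter. The value 1.25 is an assumption, taken as a midpoint of commonly quoted empirical estimates for English (roughly 1.0 to 1.5 bits per letter), and `redundancy` is a hypothetical helper.

```python
import math

def redundancy(h_l, alphabet_size):
    """R_L = 1 - H_L / log2|P| for a language with per-letter entropy `h_l`."""
    return 1 - h_l / math.log2(alphabet_size)

print(round(redundancy(1.25, 26), 3))           # assumed estimate for English
print(round(redundancy(math.log2(26), 26), 3))  # a random 26-letter language
```

Under this assumption English is roughly 73% redundant, which is often rounded to $R_L \approx 0.75$. A random language has no redundancy at all.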


## Theorem

Suppose there is a cryptosystem where $|C| = |P|$ and keys are chosen equiprobably. Let $R_L$ denote the redundancy of the underlying language. Then, given a string of ciphertext of length $n$, where $n$ is sufficiently large, the expected number of spurious keys, $\bar{s}_n$, satisfies

> $\bar{s}_n \geq \frac{|K|}{|P|^{nR_L}} - 1$


The term $|K| / |P|^{nR_L}$ shrinks exponentially as $n$ increases, so this lower bound falls to 0 quickly.
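
To see how fast the bound falls, the sketch below evaluates it for a substitution cipher over a 26-letter alphabet, where $|K| = 26!$. The redundancy $R_L = 0.75$ is an assumed round value for English, and `spurious_keys_lower_bound` is a hypothetical helper.

```python
import math

def spurious_keys_lower_bound(num_keys, alphabet_size, n, r_l):
    """Evaluate |K| / |P|**(n * R_L) - 1 for ciphertext length n."""
    return num_keys / alphabet_size ** (n * r_l) - 1

num_keys = math.factorial(26)  # substitution cipher: one key per permutation
for n in (5, 10, 20, 25, 30):
    print(n, f"{spurious_keys_lower_bound(num_keys, 26, n, 0.75):.3e}")
```

A negative value simply means the bound no longer guarantees any spurious keys at that ciphertext length.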

## Unicity Distance

The unicity distance of a cryptosystem is the value of $n$, denoted by $n_0$, at which the expected number of spurious keys becomes zero; i.e., the average amount of ciphertext an opponent needs to uniquely compute the key, given enough computing time.


- **Measures:** Amount of ciphertext required to uniquely determine the key
- **Indicates theoretical breakability of a cryptosystem**
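
Setting the lower bound on the expected number of spurious keys to zero and solving for $n$ gives the estimate $n_0 \approx \frac{\log_2 |K|}{R_L \log_2 |P|}$. The sketch below evaluates it for two classical ciphers over a 26-letter alphabet. $R_L = 0.75$ is an assumed round value for English, and `unicity_distance` is a hypothetical helper.

```python
import math

def unicity_distance(num_keys, alphabet_size, r_l):
    """Estimate n_0 = log2|K| / (R_L * log2|P|)."""
    return math.log2(num_keys) / (r_l * math.log2(alphabet_size))

shift_n0 = unicity_distance(26, 26, 0.75)                         # shift cipher, 26 keys
substitution_n0 = unicity_distance(math.factorial(26), 26, 0.75)  # substitution cipher, 26! keys

print(round(shift_n0, 1))
print(round(substitution_n0, 1))
```

Under these assumptions a substitution cipher needs about 25 ciphertext letters before the key is, on average, uniquely determined, while a shift cipher needs only one or two.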


##  3. Quantitative Security Metrics

## Perfect Secrecy

A cryptosystem has perfect secrecy if 

> $\Pr[x | y] = \Pr[x]$

> $\forall x \in P, y \in C$

That is, the a posteriori probability that the plaintext is $x$, given that the ciphertext $y$ is observed, is identical to the a priori probability that the plaintext is $x$.

### Example

Shifter cryptosystem
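
One way to make this example concrete is to check the definition numerically for a shift cipher that encrypts a single letter with a key chosen uniformly from the 26 possible shifts. The sketch below applies Bayes' rule with exact fractions. The non-uniform plaintext distribution is made up for the test, and `posterior` is a hypothetical helper. The property holds only when each key encrypts one letter; reusing the same shift for a whole message, as in the Caesar example earlier, does not give perfect secrecy.

```python
from fractions import Fraction

ALPHABET_SIZE = 26

def posterior(prior, y):
    """Pr[x | y] for a one-letter shift cipher whose key is uniform over 0..25."""
    key_prob = Fraction(1, ALPHABET_SIZE)
    joint = {x: Fraction(0) for x in prior}
    for x, p_x in prior.items():
        for key in range(ALPHABET_SIZE):
            if (x + key) % ALPHABET_SIZE == y:
                joint[x] += p_x * key_prob  # Pr[x] * Pr[key], chosen independently
    pr_y = sum(joint.values())
    return {x: joint[x] / pr_y for x in prior}

# Made-up, deliberately non-uniform plaintext distribution over letters 0..25.
weights = [i + 1 for i in range(ALPHABET_SIZE)]
prior = {x: Fraction(w, sum(weights)) for x, w in enumerate(weights)}

perfect = all(posterior(prior, y) == prior for y in range(ALPHABET_SIZE))
print(perfect)
```

Because exactly one key maps a given plaintext letter to a given ciphertext letter, every ciphertext is equally likely for every plaintext, and observing $y$ leaves the distribution of $x$ unchanged.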