# Classical Cipher Cryptanalysis Report

This document presents the full cryptanalysis and decryption of four ciphertexts provided in the `177-Student` folder.
The analysis is structured according to the assignment requirements:

1.  **Statistical Analysis**: Generation of frequency tables (1-gram, 2-gram, 3-gram) and calculation of the Index of Coincidence (IC).
2.  **Cipher Identification**: Classification based on statistical metrics.
3.  **Decryption**: Algorithm implementation and key recovery.

**Alphabet Configuration**:
Per the assignment guidelines, the alphabet consists of 29 characters:
`ABCDEFGHIJKLMNOPQRSTUVWXYZ,.-`

In [1]:
import os
import numpy as np
from collections import Counter

# Global Configuration
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.-"
MOD = len(ALPHABET)  # 29
BASE_PATH = "177-Student"  # Folder containing input files

def read_ciphertext(filename):
    path = os.path.join(BASE_PATH, filename)
    with open(path, "r", encoding="utf-8") as f:
        text = f.read().strip()
    # Filter strictly for the defined alphabet
    return "".join(c for c in text if c in ALPHABET)

## 1. Statistical Analysis

I perform a frequency analysis for single characters, digrams, and trigrams, and calculate the Index of Coincidence (IC). To keep this document readable, the 15 most frequent entries of each table are written to one file per cipher in the `177-Student/analysis_results` folder, and only the IC is printed here.

In [2]:
def index_of_coincidence(text):
    N = len(text)
    if N <= 1: return 0
    freq = Counter(text)
    return sum(f * (f - 1) for f in freq.values()) / (N * (N - 1))

def format_table(counter, total, limit=15):
    lines = []
    for i, (k, v) in enumerate(counter.most_common()):
        if i >= limit: break
        perc = v / total * 100
        lines.append(f"{k:>4} : {v:>6} ({perc:5.2f}%)")
    return "\n".join(lines)

def ngrams(text, n):
    return [text[i:i+n] for i in range(len(text) - n + 1)]

def analyze_cipher_to_file(cipher_id):
    output_dir = os.path.join(BASE_PATH, "analysis_results")
    os.makedirs(output_dir, exist_ok=True)
    output_file = os.path.join(output_dir, f"analysis_{cipher_id}.txt")
    
    text = read_ciphertext(f"{cipher_id}.txt")
    
    mono_c, mono_t = Counter(text), len(text)
    di_c, di_t = Counter(ngrams(text, 2)), len(ngrams(text, 2))
    tri_c, tri_t = Counter(ngrams(text, 3)), len(ngrams(text, 3))
    ic = index_of_coincidence(text)

    with open(output_file, "w", encoding="utf-8") as f:
        f.write(f"Statistical Analysis – Cipher {cipher_id}\n")
        f.write("=" * 40 + "\n\n")
        f.write(f"Length: {len(text)}\n")
        f.write(f"IC: {ic:.5f}\n\n")
        f.write("1-grams\n" + "-"*20 + "\n" + format_table(mono_c, mono_t) + "\n\n")
        f.write("2-grams\n" + "-"*20 + "\n" + format_table(di_c, di_t) + "\n\n")
        f.write("3-grams\n" + "-"*20 + "\n" + format_table(tri_c, tri_t) + "\n")
    
    print(f"Analysis for Cipher {cipher_id} saved (IC={ic:.4f}).")

# Run Analysis
for i in range(4):
    analyze_cipher_to_file(i)

Analysis for Cipher 0 saved (IC=0.0622).
Analysis for Cipher 1 saved (IC=0.0632).
Analysis for Cipher 2 saved (IC=0.0371).
Analysis for Cipher 3 saved (IC=0.0395).

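The IC values printed above are easier to read against two reference points: uniformly random text over 29 symbols has an expected IC of 1/29 ≈ 0.0345, while English text is commonly quoted at about 0.066. The sketch below is only a rough aid. The helper `classify_ic`, its labels, and the midpoint threshold are assumptions introduced here, and the 0.066 figure is the usual 26-letter value rather than one measured for this 29-symbol alphabet.

```python
# Reference points for interpreting the measured IC values.
# ENGLISH_IC is the commonly quoted figure for 26-letter English text; using it
# for this 29-symbol alphabet is an approximation, not a measured value.
MOD = 29
UNIFORM_IC = 1 / MOD          # expected IC of uniformly random text
ENGLISH_IC = 0.066

def classify_ic(ic, threshold=None):
    """Rough split (hypothetical helper): English-like vs. flattened distribution."""
    if threshold is None:
        threshold = (UNIFORM_IC + ENGLISH_IC) / 2   # midpoint, an arbitrary choice
    return "monoalphabetic-like" if ic >= threshold else "flattened"

measured = {0: 0.0622, 1: 0.0632, 2: 0.0371, 3: 0.0395}  # values printed above
for cipher_id, ic in measured.items():
    print(cipher_id, f"{ic:.4f}", classify_ic(ic))
```

Ciphers 0 and 1 land on the English-like side, which is consistent with a substitution or shift cipher. Ciphers 2 and 3 sit near the uniform value, which points to a polygraphic or polyalphabetic scheme.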

## 2. Decryption

Based on the statistical analysis (IC and frequency distribution), I identified and solved the four ciphers as described below.

### Cipher 0: Monoalphabetic Substitution
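**Identification**: High IC (~0.062), close to that of English. The single-character frequency profile has the shape of English, but the peaks fall on different symbols.
**Method**: Frequency analysis, then matching frequent ciphertext trigrams against common English ones ("THE", "ING", "AND") to fix individual letters.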

In [7]:
cipher0 = read_ciphertext("0.txt")

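# Key reconstructed from frequency analysis (cipher symbol -> plaintext symbol).
# The map is partial: 'F', 'H' and 'O' have no entry and would pass through unchanged.
# Known loose ends, visible in the output below:
#   - 'D' maps to the placeholder 'A', which appears where sentences end,
#     so it most likely stands for '.'.
#   - 'P' and 'V' both map to 'w'; one of them appears to stand for ','.
key_map_0 = {
    'A': 'c', 'B': 'x', 'C': 'e', 'D': 'A', 'E': 'a',
    'G': 'b', 'I': 'h', 'J': 'g', 'K': 'n', 'L': 's',
    'M': 'p', 'N': 'o', 'P': 'w', 'Q': 'l', 'R': 'k',
    'S': 'd', 'T': 'i', 'U': 't', 'V': 'w', 'W': 'u',
    'X': 'z', 'Y': 'y', 'Z': 'm',
    '.': 'v', ',': 'r', '-': 'f'
}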

plain0 = "".join(key_map_0.get(c, c) for c in cipher0)
print(f"--- Cipher 0 Decrypted  ---\n{plain0}")

--- Cipher 0 Decrypted  ---
evenwhiletheyhadbeensayingcommonplacethingssusanhadbeenconsciousoftheexcitementofintimacywwhichseemednotonlytolaybaresomethinginherwbutinthetreesandtheskywandtheprogressofhisspeechwhichseemedinevitablewaspositivelypainfultoherwfornohumanbeinghadevercomesoclosetoherbeforeAshewasstruckmotionlessashisspeechwentonwandherheartgavegreatseparateleapsatthelastwordsAshesatwithherfingerscurledroundastonewlookingstraightinfrontofherdownthemountainovertheplainAsothenwithadactuallyhappenedtoherwaproposalofarthurlookedroundatherhisfacewasoddlytwistedAshewasdrawingherbreathwithsuchdifficultythatshecouldhardlyanswerAyoumighthaveknownAheseizedherinhisarmsagainandagainandagaintheyclaspedeachotherwmurmuringinarticulatelyAwellwsighedarthurwsinkingbackonthegroundwthatsthemostwonderfulthingthatseverhappenedtomeAhelookedasifheweretryingtoputthingsseeninadreambesiderealthingsAtherewasalongsilenceAitsthemostperfectthingintheworldwsusanstatedwverygentlyandwithgreatconvictionAitwasnol

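A substitution key should be a bijection, so a quick consistency check on a partially recovered key can reveal remaining gaps. The following is a minimal sketch, not part of the original analysis. The helper name `check_key_map` and the toy key are hypothetical.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.-"

def check_key_map(key_map, alphabet=ALPHABET):
    """Report problems in a partial substitution key (hypothetical helper).

    Returns (unmapped, duplicated): cipher symbols with no entry, and
    plaintext targets that more than one cipher symbol maps to.
    """
    unmapped = [c for c in alphabet if c not in key_map]
    target_counts = Counter(key_map.values())
    duplicated = {
        target: sorted(c for c, p in key_map.items() if p == target)
        for target, n in target_counts.items() if n > 1
    }
    return unmapped, duplicated

# Toy example over a 4-symbol alphabet: 'C' is unmapped, 'x' is used twice.
unmapped, duplicated = check_key_map({"A": "x", "B": "y", "D": "x"}, alphabet="ABCD")
print(unmapped)     # ['C']
print(duplicated)   # {'x': ['A', 'D']}
```

Applied to `key_map_0`, a check of this kind would report `F`, `H` and `O` as unmapped and `w` as a target shared by `P` and `V`.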
### Cipher 1: Caesar Cipher
**Identification**: IC (~0.063) is close to that of standard English (~0.066). The frequency distribution has the same shape as English but is shifted by a constant offset.
**Method**: Brute force over all 29 possible shifts. **Shift 19** yields coherent English.

In [8]:
cipher1 = read_ciphertext("1.txt")
SHIFT = 19

def caesar_decrypt(text, shift):
    return "".join(ALPHABET[(ALPHABET.index(c) - shift) % MOD] for c in text)

plain1 = caesar_decrypt(cipher1, SHIFT)
print(f"--- Cipher 1 Decrypted (Shift {SHIFT}) ---\n{plain1}")

--- Cipher 1 Decrypted (Shift 19) ---
THEUSUALEFFECTOFTAKINGAWAYALLDESIREFORCOMMUNICATIONBYMAKINGTHEIRWORDSSOUNDTHINANDSMALLAND,AFTERWALKINGROUNDTHEDECKTHREEORFOURTIMES,THEYCLUSTEREDTOGETHER,YAWNINGDEEPLY,ANDLOOKINGATTHESAMESPOTOFDEEPGLOOMONTHEBANKS.MURMURINGVERYLOWINTHERHYTHMICALTONEOFONEOPPRESSEDBYTHEAIR,MRS.FLUSHINGBEGANTOWONDERWHERETHEYWERETOSLEEP,FORTHEYCOULDNOTSLEEPDOWNSTAIRS,THEYCOULDNOTSLEEPINADOGHOLESMELLINGOFOIL,THEYCOULDNOTSLEEPONDECK,THEYCOULDNOTSLEEP--SHEYAWNEDPROFOUNDLY.ITWASASHELENHADFORESEENTHEQUESTIONOFNAKEDNESSHADRISENALREADY,ALTHOUGHTHEYWEREHALFASLEEP,ANDALMOSTINVISIBLETOEACHOTHER.WITHST.JOHNSHELPSHESTRETCHEDANAWNING,ANDPERSUADEDMRS.FLUSHINGTHATSHECOULDTAKEOFFHERCLOTHESBEHINDTHIS,ANDTHATNOONEWOULDNOTICEIFBYCHANCESOMEPARTOFHERWHICHHADBEENCONCEALEDFORFORTY-FIVEYEARSWASLAIDBARETOTHEHUMANEYE.MATTRESSESWERETHROWNDOWN,RUGSPROVIDED,ANDTHETHREEWOMENLAYNEAREACHOTHERINTHESOFTOPENAIR.THEGENTLEMEN,HAVINGSMOKEDACERTAINNUMBEROFCIGARETTES,DROPPEDTHEGLOWINGENDSINTOTHERIVER,ANDLOOKED

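The brute-force search over shifts can be automated with a simple scoring function instead of reading all 29 candidates by eye. The sketch below is one possible approach, not the procedure used for this report. The trigram-count score, the function names, and the sample sentence are assumptions made for illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ,.-"
MOD = len(ALPHABET)

def caesar_shift(text, shift):
    """Shift every symbol forward by `shift` positions (negative = decrypt)."""
    return "".join(ALPHABET[(ALPHABET.index(c) + shift) % MOD] for c in text)

def score(text, common=("THE", "AND", "ING")):
    """Crude English score: number of occurrences of a few common trigrams."""
    return sum(text.count(t) for t in common)

def brute_force_caesar(ciphertext):
    """Try all 29 shifts and return (best_shift, plaintext)."""
    best = max(range(MOD), key=lambda s: score(caesar_shift(ciphertext, -s)))
    return best, caesar_shift(ciphertext, -best)

# Round trip on a short sample encrypted with shift 19.
sample = "THEYCOULDNOTSLEEPONDECK,ANDTHEYWERELOOKINGATTHEBANKS."
shift, plain = brute_force_caesar(caesar_shift(sample, 19))
print(shift, plain)
```

A trigram count is a very coarse score. On short or unusual texts, a chi-squared comparison against English letter frequencies would likely be more robust.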
### Cipher 2: Hill Cipher (2x2)
**Identification**: Low IC (~0.037), close to the uniform value of 1/29 ≈ 0.034, and a digram distribution that is flatter than English.
**Method**: Linear algebra modulo 29. Decryption multiplies each ciphertext pair by the inverse of the key matrix, which was determined to be `[[15, 16], [19, 1]]`.

In [10]:
cipher2 = read_ciphertext("2.txt")

def decrypt_hill(text):
    # Inverse Key Matrix found during analysis
    D = np.array([[15, 16], [19, 1]])
    
    cipher_indices = [ALPHABET.index(c) for c in text]
    plain_chars = []
    
    # Process symbols in pairs; a trailing unpaired symbol (odd length) is dropped
    for i in range(0, len(cipher_indices)-1, 2):
        vec = np.array([[cipher_indices[i]], [cipher_indices[i+1]]])
        # P = D * C mod 29
        dec_vec = np.dot(D, vec) % MOD
        plain_chars.append(ALPHABET[dec_vec[0][0]])
        plain_chars.append(ALPHABET[dec_vec[1][0]])
    
    return "".join(plain_chars)

plain2 = decrypt_hill(cipher2)
print(f"--- Cipher 2 Decrypted ---\n{plain2}")

--- Cipher 2 Decrypted ---
TODRIFTPASTEACHOTHERINSILENCE.IMNOTAPRODIGY.IFINDITVERYDIFFICULTTOSAYWHATIMEAN--SHEOBSERVEDATLENGTH.ITSAMATTEROFTEMPERAMENT,IBELIEVE,MISSALLANHELPEDHER.THEREARESOMEPEOPLEWHOHAVENODIFFICULTYFORMYSELFIFINDTHEREAREAGREATMANYTHINGSISIMPLYCANNOTSAY.BUTTHENICONSIDERMYSELFVERYSLOW.ONEOFMYCOLLEAGUESNOW,KNOWSWHETHERSHELIKESYOUORNOT--LETMESEE,HOWDOESSHEDOIT--BYTHEWAYYOUSAYGOOD-MORNINGATBREAKFAST.ITISSOMETIMESAMATTEROFYEARSBEFOREICANMAKEUPMYMIND.BUTMOSTYOUNGPEOPLESEEMTOFINDITEASYOHNO,SAIDRACHEL.ITSHARDMISSALLANLOOKEDATRACHELQUIETLY,SAYINGNOTHINGSHESUSPECTEDTHATTHEREWEREDIFFICULTIESOFSOMEKIND.THENSHEPUTHERHANDTOTHEBACKOFHERHEAD,ANDDISCOVEREDTHATONEOFTHEGREYCOILSOFHAIRHADCOMEIMUSTASKYOUTOBESOKINDASTOEXCUSEME,SHESAID,RISING,IFIDOMYHAIR.IHAVENEVERYETFOUNDASATISFACTORYTYPEOFHAIRPIN.IMUSTCHANGEMYDRESS,TOO,FORTHEMATTEROFTHATANDISHOULDBEPARTICULARLYGLADOFYOURASSISTANCE,BECAUSETHEREISATIRESOMESETOFHOOKSWHICHICANFASTENFORMYSELF,BUTITTAKESFROMTENTOFIFTEENMINUTESWHEREASWITHYOURHELP

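The matrix inversion over mod 29 mentioned above can be written out with the adjugate formula for 2x2 matrices. This is a minimal sketch, assuming Python 3.8+ for `pow(x, -1, mod)`. The helper name `inverse_2x2_mod` is introduced here, and `K` is simply the matrix implied by inverting the stated decryption matrix.

```python
MOD = 29

def inverse_2x2_mod(matrix, mod=MOD):
    """Invert a 2x2 integer matrix modulo `mod` via the adjugate formula."""
    (a, b), (c, d) = matrix
    det = (a * d - b * c) % mod
    det_inv = pow(det, -1, mod)   # raises ValueError if det is not invertible
    return [
        [(d * det_inv) % mod, (-b * det_inv) % mod],
        [(-c * det_inv) % mod, (a * det_inv) % mod],
    ]

D = [[15, 16], [19, 1]]          # decryption matrix stated in this report
K = inverse_2x2_mod(D)           # encryption matrix implied by D
print(K)                         # [[1, 13], [10, 15]]
print(inverse_2x2_mod(K))        # [[15, 16], [19, 1]]
```

Because 29 is prime, every matrix whose determinant is non-zero modulo 29 is invertible. Here det(D) = 15·1 − 16·19 = −289 ≡ 1 (mod 29), so the inverse is just the adjugate reduced modulo 29.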
### Cipher 3: Vigenère Cipher
**Identification**: Low IC (~0.040), typical of polyalphabetic ciphers. Kasiski/Friedman tests suggested a key length of 5.
**Method**: Split the ciphertext into 5 interleaved columns and analyzed each one as a Caesar cipher. Recovered key: **CVSUI**.

In [9]:
cipher3 = read_ciphertext("3.txt")
KEY_VIGENERE = "CVSUI"

def vigenere_decrypt(text, key):
    res = []
    key_indices = [ALPHABET.index(k) for k in key]
    text_indices = [ALPHABET.index(c) for c in text]
    
    for i, t_idx in enumerate(text_indices):
        k_idx = key_indices[i % len(key)]
        p_idx = (t_idx - k_idx) % MOD
        res.append(ALPHABET[p_idx])
    return "".join(res)

plain3 = vigenere_decrypt(cipher3, KEY_VIGENERE)
print(f"--- Cipher 3 Decrypted (Key: {KEY_VIGENERE}) ---\n{plain3}")

--- Cipher 3 Decrypted (Key: CVSUI) ---
CANT.THINKOFTHESUNSETSANDTHEMOONRISES--IBELIEVETHECOLOURSARETHEREAREWILDPEACOCKS,RACHELHAZARDED.ANDMARVELLOUSCREATURESINTHEWATER,HELENASSERTED.ONEMIGHTDISCOVERANEWREPTILE,RACHELCONTINUED.THERESCERTAINTOBEAREVOLUTION,IMTOLD,HELENURGED.THEEFFECTOFTHESESUBTERFUGESWASALITTLEDASHEDBYRIDLEY,WHO,AFTERREGARDINGPEPPERFORSOMEMOMENTS,SIGHEDALOUD,POORFELLOWANDINWARDLYSPECULATEDUPONTHEUNKINDNESSOFWOMEN.HESTAYED,HOWEVER,INAPPARENTCONTENTMENTFORSIXDAYS,PLAYINGWITHAMICROSCOPEANDANOTEBOOKINONEOFTHEMANYSPARSELYFURNISHEDSITTING-ROOMS,BUTONTHEEVENINGOFTHESEVENTHDAY,ASTHEYSATATDINNER,HEAPPEAREDMORERESTLESSTHANUSUAL.THEDINNER-TABLEWASSETBETWEENTWOLONGWINDOWSWHICHWERELEFTUNCURTAINEDBYHELENSORDERS.DARKNESSFELLASSHARPLYASAKNIFEINTHISCLIMATE,ANDTHETOWNTHENSPRANGOUTINCIRCLESANDLINESOFBRIGHTDOTSBENEATHTHEM.BUILDINGSWHICHNEVERSHOWEDBYDAYSHOWEDBYNIGHT,ANDTHESEAFLOWEDRIGHTOVERTHELANDJUDGINGBYTHEMOVINGLIGHTSOFTHESTEAMERS.THESIGHTFULFILLEDTHESAMEPURPOSEASANORCHESTRAINALONDONREST
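
The key-length step can be illustrated with a column-wise IC test, a common variant of the Friedman approach: for the correct key length, each interleaved column is a plain Caesar cipher, so its IC rises toward the English value. The sketch below is an assumption about how such a test might look, not the exact procedure used for this report. The function names are hypothetical, and the toy input is artificial so that the result is easy to verify by hand.

```python
from collections import Counter

def index_of_coincidence(text):
    n = len(text)
    if n <= 1:
        return 0.0
    freq = Counter(text)
    return sum(f * (f - 1) for f in freq.values()) / (n * (n - 1))

def average_column_ic(text, key_len):
    """Mean IC of the key_len interleaved columns text[i::key_len]."""
    columns = [text[i::key_len] for i in range(key_len)]
    return sum(index_of_coincidence(col) for col in columns) / key_len

def guess_key_length(text, max_len=8):
    """Return the candidate length with the highest average column IC."""
    return max(range(1, max_len + 1), key=lambda k: average_column_ic(text, k))

# Artificial input with period 5: every 5th symbol is identical,
# so the columns for key_len=5 each have IC 1.0.
toy = "ABCDE" * 20
print(guess_key_length(toy))   # 5
```

On real ciphertext, multiples of the true key length also score high, so the smallest length with a clearly elevated average is usually preferred.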