
# **📘 Enrichment & Processing**


# **Phase Overview**

**Goal**
Transform raw, unfiltered OSINT data collected in Phase 1 into *clean, enriched, contextualized, correlated intelligence* that is ready for analysis and threat modeling.


This phase is like *washing, chopping, seasoning, and organizing ingredients before cooking a proper meal*. Raw data becomes structured intelligence.

---

# **1. What Happened in This Phase**

---

We used:


## **HaveIBeenPwned (HIBP) for Breach Verification**

### **What Was Done**

* NHS emails extracted from the collection phase were checked against public breach databases.
* Identified whether staff accounts appeared in major historical breaches:

  * Onliner Spambot (2017)
  * Intelimost (2019)
  * Other known credential dumps
* Checked if any **plain-text passwords** were exposed.

### **Output**

* A list of breached NHS emails + breach names.
* A risk rating for each email:

  * **High risk** = appeared in breaches + password exposed
  * **Medium** = email exposed but password hashed
  * **Low** = no breach hits
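
The rating rules above can be written as a small shell function. This is a minimal sketch: the function name `risk_rating` and its two arguments (breach count, plus a yes/no flag for plain-text password exposure) are illustrative assumptions, not output of HIBP itself.

```bash
# Hypothetical helper: map HIBP findings to the High/Medium/Low scale above.
# $1 = number of breaches the email appears in
# $2 = "yes" if a plain-text password was exposed, anything else otherwise
risk_rating() {
  if [ "$1" -eq 0 ]; then
    echo "Low"
  elif [ "$2" = "yes" ]; then
    echo "High"
  else
    echo "Medium"
  fi
}

risk_rating 2 yes   # appears in breaches, password exposed
risk_rating 1 no    # email exposed, password hashed
risk_rating 0 no    # no breach hits
```

Keeping the rule in one function means every email in the list is rated the same way.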


## **Shodan.io for Infrastructure & Exposure Analysis**

### **What Was Done**

* Queried exposed IP addresses discovered during reconnaissance.
* Enumerated:

  * Open ports
  * Running services
  * Software versions
  * Certificate information

### **Ports & Risks Identified**

| Port      | Service | Risk     | Why It Matters                                        |
| --------- | ------- | -------- | ----------------------------------------------------- |
| 21        | FTP     | **High** | Plain-text credentials; attacker can upload webshells |
| 3306      | MySQL   | **High** | Publicly exposed DB → direct data theft               |
| 22        | SSH     | Medium   | Outdated OpenSSH 7.4; exposed to brute-force attempts |
| 8443/8880 | Plesk   | Medium   | If compromised → full server control                  |

### **Key Finding Example**

**Server: 217.199.160.41**
Exposed services:

* FTP
* MySQL
* Plesk
* Outdated SSH


## **Certificate Transparency (crt.sh) for Subdomain & Infrastructure Mapping**

### **What Was Done**

* Queried **all certificates** for `%.nhs.uk`
* Extracted SANs (Subject Alternative Names)

  * Revealed *hidden subdomains*
* Identified:

  * Expired certificates
  * Wildcard certs
  * Legacy systems
  * High-value endpoints


## **Malware Analysis (by combining MalwareBazaar + VirusTotal)**

### **What Was Done**

* Located **Qilin ransomware samples** in MalwareBazaar.
* Analyzed with VirusTotal:

  * Detection ratios (e.g., *56/70 engines*)
  * Static + Dynamic behavior
  * Network indicators (C2 IPs/domains)
  * Dropped files and registry edits

### **MITRE ATT&CK Techniques Observed**

* **T1486** – Data Encrypted for Impact
* **T1490** – Inhibit System Recovery
* **T1078** – Valid Accounts (abuse of valid accounts)
* **T1529** – System Shutdown/Reboot
* **Other supporting TTPs** across execution, persistence, defense evasion, discovery, and C2

### **IOCs Extracted**

* SHA-256 malware hashes
* C2 domains
* C2 IP addresses
* File paths
* Registry keys
* Encryption behavior


This gave us insight into:

* Attack chain
* Ransomware capabilities
* Possible kill chain alignment

---

## **IOC Normalization & Compilation**

### **What Was Done**

We consolidated IOCs from all tools:

* theHarvester (emails, IPs, hosts)
* HIBP (breach data)
* Hunter.io (staff enumeration)
* Shodan (service exposure)
* crt.sh (subdomains)
* VirusTotal (malware IOCs)
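
The consolidation step can be sketched with standard shell tools. The example below is an assumption about how such a master list might be built, not the exact procedure used in this phase: the three-column `type,value,source` layout and the function name `normalize_iocs` are hypothetical. It trims whitespace, lowercases the type and value (suitable for domains, emails, IPs, and hashes, but not for case-sensitive file paths), and keeps only the first sighting of each indicator.

```bash
# Hypothetical helper: read "type,value,source" lines on stdin and print a
# clean CSV with a header row and one line per unique type+value pair.
normalize_iocs() {
  echo "type,value,source"
  awk -F',' '
    NF >= 3 {
      # trim leading/trailing whitespace from the first three fields
      for (i = 1; i <= 3; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
      type = tolower($1); value = tolower($2)
      if (value == "") next                      # skip rows without an indicator
      if (!seen[type "," value]++) print type "," value "," $3
    }
  '
}

printf '%s\n' \
  'domain, API.nhs.uk ,crt.sh' \
  'domain,api.nhs.uk,theHarvester' \
  'ip,217.199.160.41,Shodan' | normalize_iocs
```

Redirecting the output to a file produces the kind of master CSV referred to in the table below.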
       


---

# **What Each Tool Contributed**

| Tool                  | Input         | Output                           | Enrichment Value                          |
| --------------------- | ------------- | -------------------------------- | ----------------------------------------- |
| **HIBP**              | Emails        | Breach status, password exposure | Confirms credential compromise            |
| **Hunter.io**         | nhs.uk domain | 109 staff emails, org roles      | Role mapping, spear-phishing targeting    |
| **Shodan**            | Exposed IPs   | Ports, services, versions        | Attack surface discovery                  |
| **crt.sh**            | nhs.uk domain | Subdomains, certs                | Hidden infrastructure, legacy systems     |
| **VirusTotal**        | Qilin hashes  | C2 IOCs, sandbox behavior        | MITRE mapping, malware behavior           |
| **IOC Normalization** | All IOCs      | Master CSV                       | Clean, correlated, analysis-ready dataset |

---

This enrichment phase:

### **Transforms raw noise into intelligence**

We went from:

> random emails, ports, domains
> to
> validated breaches, exploitable services, correlated infrastructure

### **Builds the attack surface picture**

* Subdomains
* Exposed services
* Legacy systems
* Staff mapping
* Credential risks

### **Enables threat modeling**

By linking:

* exposed systems
* staff roles
* malware capabilities
* MITRE ATT&CK techniques

### **Prepares the pipeline for analysis**

Clean data leads to meaningful insight.




---

# <span style="color:#3A6EA5;">HaveIBeenPwned (HIBP): Breach Verification & Exposure Analysis</span>

> **Use this for passive OSINT only:** we are checking *public breach records*.
> **Never** attempt to log in or use leaked credentials.
> If breached emails belong to a real organisation (e.g., NHS during an engagement), follow **responsible disclosure** requirements.

---

## 🔍 HaveIBeenPwned

Checking **HaveIBeenPwned** is like checking a **lost-and-found list** to see if someone's **house key (email + password)** is already circulating online.

If a key appears on the list:

* A thief *may already have a copy*.
* We must treat that account as **high-risk**.
* The user may be vulnerable to **account takeover**, **phishing**, or **credential-stuffing**.

---

# <span style="color:#3A6EA5;">1. Accessing HIBP Through OSINT Framework</span>

### **Step 1: Open Firefox in Kali**

* Go to **Applications → Internet → Firefox ESR**
  *(or click the Firefox icon on the dock)*
* Navigate to:

```text
https://osintframework.com
```

* Under **Email Address**, choose **Breach Data**.

### **Step 2: Navigate to HaveIBeenPwned**

Click **HaveIBeenPwned**; the official HIBP site will load.

<img src="pic/ti40.png" alt="hibp output 3">


---

# <span style="color:#3A6EA5;">2. Searching for an Email</span>

### **Step 3: Test Your First Email**

* Click the search box (“email address”)
* Enter an email from your OSINT list, e.g.:

```
arpan.banerjee@heartofengland.nhs.uk
```

<img src="pic/ti41.png" alt="hibp output 3">

or

<img src="pic/ti42.png" alt="hibp output 3">

* Press **Enter** or click **Check**.
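
For longer email lists, the same lookup can be scripted against the HIBP v3 API instead of the web form. The sketch below is a hedged illustration: it assumes you have an HIBP API key exported as `HIBP_API_KEY`; the helper names `hibp_url` and `hibp_lookup` and the user-agent string are made up for this example; and `user@example.org` is a placeholder address. Like the browser method, it only reads public breach records.

```bash
# Hypothetical helper: build the breached-account URL for one email.
# Only "+" and "@" are percent-encoded, which covers typical addresses.
hibp_url() {
  encoded=$(printf '%s' "$1" | sed -e 's/+/%2B/g' -e 's/@/%40/g')
  echo "https://haveibeenpwned.com/api/v3/breachedaccount/${encoded}?truncateResponse=false"
}

# Hypothetical wrapper: query HIBP for one email (needs network access and a key).
# An HTTP 404 response means the address appears in no known breach.
hibp_lookup() {
  curl -s -H "hibp-api-key: ${HIBP_API_KEY}" \
       -H "user-agent: osint-enrichment-notes" \
       "$(hibp_url "$1")"
}

hibp_url "user@example.org"   # prints the URL only; no request is sent
```

Calling `hibp_lookup` with an address returns the breach records as JSON, which can then feed the risk rating described earlier.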

---

# <span style="color:#3A6EA5;">3. Example Results: What We Found</span>

The NHS address above appears in **two breaches**:

1. **Onliner Spambot** (August 2017)
2. **Intelimost** (March 2019)

We analyze each below.

---

# <span style="color:#3A6EA5;">📌 Breach 1: Onliner Spambot (August 2017)</span>

<img src="pic/ti43.png" alt="hibp output 3">

One of the **largest credential-harvesting spam operations** ever discovered.

### **Data exposed included:**

* Email addresses
* Passwords
* Email server configurations
* Authentication logs
* Occasionally contact lists

### **Attacker implications:**

* ✔ Password was likely stolen
* ✔ Attackers could try logging into the mailbox
* ✔ Email could be weaponized as a **spam-sender**
* ✔ Address is widely traded among cybercriminals

### **OSINT relevance:**

Attackers now know:

* The email is **valid**
* It is **highly exposed**
* It was harvested by **malware**, not just leaked
* It likely circulates on **spambot infrastructure**

➡️ **This is a high-value reconnaissance finding.**

---

# <span style="color:#3A6EA5;">📌 Breach 2: Intelimost (March 2019)</span>

A major spam campaign that impersonated people's contacts.

<img src="pic/ti44.png" alt="hibp output 3">

### **Database contents:**

* 3+ million email addresses
* **Plain-text passwords** ← *critical*
  (the password is visible exactly as typed)

### **Security impact:**

* ✔ Attackers could log in if the user reused the password
* ✔ Internal phishing becomes extremely credible
* ✔ Business Email Compromise (BEC) becomes possible
* ✔ Credential stuffing against NHS services is realistic

Attackers can infer:

* The email is **real**
* A real password was leaked
* This is a **high-value phishing target**

---

# <span style="color:#3A6EA5;">4. Exposure Severity</span>

> **More breaches = higher exposure = higher chance that a reused password is already in attackers' hands.**

### **Types of exposed data (for both breaches):**

* Email address
* Password
* Possibly authentication data / configurations

### **Why this matters for NHS (or any organization):**

* Attackers can impersonate staff
* Launch highly believable phishing
* Attempt logins on VPN, portals, O365, SSO
* Pivot deeper into internal networks
* Begin the early stages of **ransomware intrusion**

---

# <span style="color:#3A6EA5;">5. Strong, Simple Security Recommendations</span>

### 🔐 **1. Reset all passwords immediately**

Especially for:

* NHS SSO
* Office 365
* Any portal or staff system

### 🔐 **2. Enforce unique passwords**

Never reuse passwords between systems.

### 🔐 **3. Enable MFA everywhere**

Prevents access even if attackers possess the password.

### 🔎 **4. Monitor inbox rules**

Attackers often create:

* Hidden forwarding rules
* Auto-delete rules
* Persistence mechanisms

### 📅 **5. Monitor for future breaches**

HIBP supports notifications for new breaches.

---

# <span style="color:#3A6EA5;">6. Why This Example Matters</span>

HIBP answers 3 attacker questions:

1. **Does the email exist?**
   → Yes, validated.

2. **Has it been exposed before?**
   → Yes, in two major data breaches.

3. **Does it contain passwords?**
   → Yes, including **plain-text** passwords.

This makes the email:

* A **confirmed attack surface entry point**
* A **high-value phishing target**
* A risk for **credential stuffing**
* Potentially abusable for **internal impersonation**

---

# <span style="color:#3A6EA5;">7. What We Learned</span>

> From this example, the NHS email appears in **two major spambot leaks**, exposing **email + plain-text passwords**.
> This makes the account extremely vulnerable to phishing, credential-stuffing, and internal impersonation attacks.

* TheHarvester found the email.
* HIBP confirmed it is exposed in multiple breaches.

Any such email should be treated as **high-risk**.

---

# <span style="color:#3A6EA5;">8. Ethics & notes</span>

* HIBP lookups are **passive** and permissible for OSINT.
* Only perform HIBP checks during **legitimate assessments**.
* Never attempt to use exposed credentials.
* Follow responsible disclosure rules if working with real organizations.

---


# 🔍 Investigating an IP on Shodan.io

*(Passive Recon Only: Safe, Legal, Observational)*

Everything below is **100% passive**: we only view **publicly indexed information**.
⚠️ **Never probe or scan hosts**.

---

## 🌍 What is Shodan?

Think of Shodan as a **satellite view of the Internet**.

It shows devices that wave a hand publicly:
**web servers, VPN gateways, cameras, printers, IoT devices, cloud services, etc.**

Shodan is **“Google for machines and services exposed to the internet.”**

An attacker uses it to answer:

* What is exposed online?
* What software is running?
* Is it old, vulnerable, or misconfigured?
* Are there unnecessary open ports?
* Any misconfigured Mail/FTP/DNS services?
* Does the organization leak information unintentionally?

After collecting IPs from **TheHarvester**, Shodan.io is used to analyze the services exposed externally.
This reveals **exactly what an attacker sees** and how the **external attack surface** looks.

---

## 🛰️ Reconnaissance Value

Even without exploitation:

* The domain and IP reveal **hosting provider**, **location**, **technologies**.
* Provides material for **phishing**, **social engineering**, and **targeted exploits**.
* Shows exposed entry points, outdated software, misconfigurations.
* Helps evaluate if shared hosting environments increase risk.

---

# 🧭 How to Use Shodan

## **1) Open Firefox**

* Applications → Internet → **Firefox ESR**
  *(or click the fox icon)*

## **2) Go to Shodan**

Type in the address bar:

```
https://shodan.io
```

<img src="pic/ti49.png" alt="shodan output 3">

## **3) Sign In or Create an Account**

* Top-right → **Log in** or **Sign up**

> **Layman note:** Logging in is like using a library card: you get access to more books (more detailed results).

## **4) Paste the IP into the Search Bar**

Example:

```
217.101.xxx.xxx
```

Press **Enter**.

<img src="pic/ti51.png" alt="shodan output 3">


## **5) Wait for Results to Load**

Example IP: **217.199.160.41**

A high-level look at the Shodan record for **server.dalesweb.net** shows:

* Multiple typical shared hosting services exposed.
* Some outdated software versions.
* Some unnecessary open ports.
* Potentially exploitable configurations if left unaddressed.

<img src="pic/ti52.png" alt="shodan output 3">

---

# 🗂️ What the Results Page Contains

You will typically see:

### **Host Overview**

* ASN
* Owner organization
* Country
* Last seen online

### **Ports**

* List of open ports (80, 443, 3389, etc.)

### **Services / Banners**

* Server headers
* Protocol responses
* Software versions

### **Vulnerabilities**

* CVEs associated (if banner matches known vulnerabilities)

### **Geolocation Map**

* Approximate physical location

### **Related Hosts**

* Other hosts on the same ASN/network

---

# 🧪 Detailed Findings

## 🔌 **Open Ports and Services**

<table>
<thead>
<tr><th>Port</th><th>Service</th><th>Description</th><th>Risk Level</th></tr>
</thead>
<tbody>
<tr><td>21</td><td>FTP</td><td>File Transfer (plaintext, insecure)</td><td><b>High</b></td></tr>
<tr><td>22</td><td>SSH</td><td>Remote login (OpenSSH 7.4, outdated)</td><td>Medium</td></tr>
<tr><td>25</td><td>SMTP</td><td>Email sending (weak auth methods)</td><td>Medium</td></tr>
<tr><td>53</td><td>DNS</td><td>Domain name resolution</td><td>Low</td></tr>
<tr><td>80</td><td>HTTP</td><td>Website (unencrypted)</td><td>Low</td></tr>
<tr><td>443</td><td>HTTPS</td><td>Secure website</td><td>Low</td></tr>
<tr><td>3306</td><td>MySQL</td><td>Database exposed to internet</td><td><b>High</b></td></tr>
<tr><td>8443 / 8880</td><td>Plesk</td><td>Hosting control panel</td><td>Medium</td></tr>
</tbody>
</table>

### Key Notes

* More open ports mean a larger attack surface
* **FTP + MySQL** are the most critical exposures
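
When many hosts need triage, the table above can be turned into a simple lookup. This is an illustrative sketch only: the function name `port_risk` is hypothetical, and the risk levels are copied from this assessment's table rather than from any standard.

```bash
# Hypothetical lookup: map a port number to the risk level used in the table above.
port_risk() {
  case "$1" in
    21|3306)         echo "High"    ;;  # FTP, MySQL
    22|25|8443|8880) echo "Medium"  ;;  # SSH, SMTP, Plesk
    53|80|443)       echo "Low"     ;;  # DNS, HTTP, HTTPS
    *)               echo "Unrated" ;;  # port not covered by this assessment
  esac
}

for port in 21 22 3306 8443; do
  echo "$port: $(port_risk "$port")"
done
```

The input is a list of ports already published by Shodan, so the check stays passive.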

---

# 🔍 Service-by-Service Analysis

## 🔐 **SSH (Port 22)**

* Version: **OpenSSH 7.4** (2016)
* Supports weak algorithms/CBC ciphers

<img src="pic/ti53.png" alt="shodan output 3">

**Risks**

* Brute-force attacks possible
* Cryptographic weaknesses
* Older versions may contain known vulnerabilities

**Recommendations**

* Upgrade OpenSSH
* Disable weak ciphers

---

## 📁 **FTP (Port 21)**

* Accepts plaintext login
* TLS may exist but may not be enforced

**Risks**

* Credentials visible on network
* Attackers could upload malicious files
* May expose website directories/backups

**Recommendation**

* Disable plain FTP
* Use **FTPS** or **SFTP** only

---

## 🗄️ **MySQL (Port 3306)**

* **Publicly accessible database**
* May accept external login attempts

**Risks**

* Password guessing
* Data theft (emails, accounts, payments)
* GDPR / compliance violations

**Recommendation**

* Firewall the port
* Allow only internal/VPN access

---

## ⚙️ **Plesk (Ports 8443/8880)**

If compromised:

* Full control over **websites**, **mail**, **databases**
* Possible upload of malicious files
* Potential takeover of entire hosting environment

---

## 🔒 **SSL/TLS Configuration**

* Let’s Encrypt certificate (valid)
* RSA 2048-bit key (standard)
* No major SSL/TLS vulnerabilities reported

**Risk: Low**
Good TLS configuration protects traffic, but underlying services remain exposed.

---

# 🧠 Low-Hanging Fruit for Credential Attacks

Attackers commonly try:

* `admin/admin`
* `root/password`
* Reused credentials

If **FTP**, **MySQL**, and **Plesk** reuse passwords →
**One successful login = full server compromise.**

---

# 🧭 Potential Pivot Opportunities

Once inside one service, attackers can escalate:

### **FTP compromise: Web Shell**

* Upload shell script
* Read config files to steal credentials

### **SSH compromise: Full server control**

* Become root
* Install malware, ransomware, cryptominers

### **Database compromise: Sensitive data**

* Steal emails, hashed passwords, personal info
* Trigger GDPR penalties

---

# 🎯 How an Attacker Would Use This Info

### **1. Scan for weaknesses**

* Identify outdated software
* Map open ports/services

### **2. Credential attacks**

* Brute-force SSH, FTP, MySQL, Plesk
* Try leaked password combinations

### **3. Exploit weak protocols**

* FTP / MySQL without encryption
* Upload malicious scripts
* Abuse Plesk vulnerabilities

### **4. Lateral movement**

* Move between services
* Explore internal directories

### **5. Exfiltration / Sabotage**

* Steal personal data, emails, accounts
* Deploy ransomware or deface sites

---

# 🧩 In Short

Attackers would see this server as a **moderately valuable** target because:

1. Multiple exposed services (FTP, SSH, MySQL, Plesk)
2. Outdated versions + weak ciphers
3. Publicly exposed database
4. Shared hosting = compromises may affect multiple clients
5. Possible weak credentials
6. Multiple escalation paths once inside

---

# ⚠️ Reminder

❗ **Do NOT connect to services** (no telnet, ssh, nmap).

Shodan is used only to **observe what is already public**, nothing more.



# <h1 style="color:#2A5DB0;">🔍 Certificate Transparency Reconnaissance Using crt.sh (NHS Example)</h1>

> **⚠️ Passive OSINT Only**
> Everything here involves *reading* public Certificate Transparency logs: **no active scanning, no probing, no interacting with NHS systems**.

Certificate Transparency (CT) logs are global, public, append-only logs that record the SSL/TLS certificates issued by publicly trusted Certificate Authorities.
Because publicly trusted certificates are logged (including those for internal-facing, forgotten, test, legacy, or third-party hosts), CT logs unintentionally reveal:

* Hidden subdomains
* Legacy infrastructure
* Third-party integrations
* Expired or abandoned systems
* Possible subdomain takeover points
* Internal structure and architecture patterns
* Certificate hygiene issues

### Why CT Logs Matter

* **To attackers:** CT logs are a blueprint of every digital door, even the forgotten ones.
* **To defenders:** CT logs expose weak spots: expired certs, shadow IT, bad hygiene, orphaned domains.

The tool used is **crt.sh**, one of the most widely used CT search engines.

---

# <h2 style="color:#2A5DB0;">🧠 Understanding certificates</h2>

* **A certificate is like a licence plate + passport** for a website
* **An expired certificate is like an expired passport** → suspicious, vulnerable
* **A wildcard certificate (`*.nhs.uk`) is like one passport for many subdomains**
* **A subdomain takeover is like an old mailbox still registered to a company: an attacker can claim it**

---


# <h2 style="color:#2A5DB0;">🎯 How crt.sh Was Used in this NHS Example</h2>

We queried CT logs for **nhs.uk**, using:

* **Browser / GUI inspection**
* **JSON API (curl + jq)**

This gives:

* Reproducibility
* Full evidence capture
* Ability to filter and correlate certificates
* Ability to automatically extract SANs, issuers, expiry, etc.

All certificate evidence was stored for analysis.

---

# <h2 style="color:#2A5DB0;">🅐 PART A: Browser Method (Click-By-Click)</h2>

### 1) Open Firefox

```
Applications → Internet → Firefox ESR
```

### 2) Go to crt.sh

<img src="pic/ti54.png" alt="theHarvester output 3">

Enter:

```
https://crt.sh
```

<img src="pic/ti55.png" alt="theHarvester output 3">


### 3) Run a wildcard search

Search for:

```
%.nhs.uk
```

<img src="pic/ti56.png" alt="theHarvester output 3">

> **Why `%`?**
> `%` is crt.sh’s wildcard operator.
> It matches any sequence of characters, similar to `%` in SQL `LIKE`.

This retrieves **every logged certificate whose CN or SAN ends in `.nhs.uk`**.

You can click a certificate ID to view its full details.

<img src="pic/ti57.png" alt="theHarvester output 3">

---

# <h2 style="color:#2A5DB0;">📝 What We Looked For</h2>

When examining each certificate on crt.sh, we analyze:

---

## <h3 style="color:#1a4c8f;">1. Issuer</h3>

Common issuers for NHS certificates:

* DigiCert
* Let’s Encrypt
* GlobalSign
* Sectigo
* Government or internal CA (rare)

**Why it matters:**
Unexpected issuers may signal misconfigurations or misuse.

---

## <h3 style="color:#1a4c8f;">2. Common Name (CN)</h3>

Examples:

* `vpn.nhs.uk`
* `autodiscover.nhs.uk`
* `*.nhs.uk` (wildcard)
* `mail.nhs.uk`
* `portal.nhs.uk`

**Why attackers care:**
CN reveals:

* VPN endpoints
* Email infrastructure
* Admin portals
* Cloud platforms
* Legacy systems

---

## <h3 style="color:#1a4c8f;">3. Validity Windows</h3>

* **Not Before** – when the certificate became valid
* **Not After** – expiration date

**Red flags:**

* Certificates in use long after expiry
* Long renewal gaps
* No automation
* Recently revoked certificates

---

## <h3 style="color:#1a4c8f;">4. Serial & Fingerprint</h3>

Used for correlation during:

* Incident response
* Fraudulent cert detection
* Threat hunting

---

## <h3 style="color:#1a4c8f;">5. Subject Alternative Names (SANs): the most valuable OSINT field</h3>

SANs reveal:

* Hidden subdomains
* Staging/test/dev environments
* Vendor integrations
* APIs
* Internal conventions
* Legacy infrastructure

**Attack value:**
SANs expand attack surface dramatically.

---

# <h2 style="color:#2A5DB0;">🔎 High-Value Intelligence Searched For</h2>

---

## <h3 style="color:#1a4c8f;">A. Expired Certificates</h3>

Indicate:

* Poor operational hygiene
* Possible MITM risk
* Legacy neglected services
* Low-hanging fruit

---

## <h3 style="color:#1a4c8f;">B. Wildcard Certificates</h3>

**One cert = protects many subdomains**.
If leaked or compromised → mass impersonation.
If expired → widespread outages.

Attackers love these because:

> “One failure compromises everything.”

---

## <h3 style="color:#1a4c8f;">C. Old Certificates (5–10+ Years)</h3>

May imply systems running:

* Windows Server 2003/2008
* Old Exchange
* Outdated appliances
* Unpatched web servers

---

## <h3 style="color:#1a4c8f;">D. Subdomains from SAN</h3>

These reveal naming schemes such as:

* `dev.nhs.uk`
* `test.nhs.uk`
* `api.nhs.uk`
* `legacy.nhs.uk`
* `vpn.nhs.uk`
* `citrix.nhs.uk`

Attackers use these to identify weak dev/staging systems.

---

## <h3 style="color:#1a4c8f;">E. Third-Party Services</h3>

SANs sometimes include:

* Azure
* AWS
* GitHub Pages
* Heroku
* Cloudflare

If a subdomain's DNS CNAME points to an unclaimed cloud resource → **subdomain takeover** possible.

---

# <h2 style="color:#2A5DB0;">💾 Saving CT Data (CSV/JSON)</h2>

CSV output is ideal for audits.

<img src="pic/ti58.png" alt="theHarvester output 3">

---

# <h2 style="color:#2A5DB0;">🅑 PART B: Programmatic Method (curl + jq)</h2>

### Query CT Logs (JSON):

```bash
mkdir -p raw evidence   # output directories used by the commands below
curl -s "https://crt.sh/?q=%25.nhs.uk&output=json" -o raw/crtsh_nhs.json
```

### List the available fields:

```bash
jq '.[0] | keys' raw/crtsh_nhs.json
```

### Produce CSV summary:

```bash
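# One CSV row per certificate; SAN entries are joined with " | ".
# The extra parentheses keep the pipe from applying to the whole field list.
jq -r '.[] | [
  .common_name,
  ((.name_value // "") | gsub("\n"; " | ")),
  .not_before,
  .not_after,
  .issuer_name
] | @csv' raw/crtsh_nhs.json > evidence/crtsh_cert_list.csv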
```

### Extract expired certificates:

```bash
today=$(date -I)
jq -r --arg today "$today" '.[] |
select(.not_after < $today) |
[.common_name, .not_after, .issuer_name] | @csv' \
raw/crtsh_nhs.json > evidence/crtsh_expired.csv
```

---

# <h2 style="color:#2A5DB0;">🎯 Attacker Use Cases (NHS-Specific)</h2>

---

## <h3 style="color:#1a4c8f;">1. NHS Attack Surface Mapping</h3>

CT logs expose NHS subdomains such as:

* `api.nhs.uk`
* `developer.api.nhs.uk`
* `assets.nhs.uk`
* `login.nhs.uk`
* `test.api.nhs.uk`
* `legacy-n3.nhs.uk`

These provide an involuntary *blueprint* of the NHS ecosystem.

Attackers use CT data to:

* Identify entry points
* Map public APIs
* Detect legacy systems
* Infer segmentation

---

## <h3 style="color:#1a4c8f;">2. Subdomain Takeover Risk: High Impact</h3>

If DNS records for CT-discovered subdomains show:

```
xxx.nhs.uk → CNAME → github.io  
xxx.nhs.uk → CNAME → azurewebsites.net  
```

Attackers check:

* Is the resource still active?
* Is the cloud instance still claimed?

If not → **takeover possible**.

Consequences:

* NHS-themed phishing
* Fake login portals
* Malware delivery under trusted “nhs.uk”
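
A first triage pass over known CNAME targets can be done with plain string matching, without touching any NHS system. The sketch below is illustrative: the function name `saas_provider` and the hostnames in the usage lines are hypothetical, `github.io` and `azurewebsites.net` come from the example above, and `herokuapp.com` is an added assumption for Heroku. A match only marks a record for manual review; it does not prove the resource is unclaimed.

```bash
# Hypothetical helper: name the third-party platform a CNAME target points to,
# or print "none" if the suffix is not on the watch list.
saas_provider() {
  case "$1" in
    *.github.io)         echo "GitHub Pages"      ;;
    *.azurewebsites.net) echo "Azure App Service" ;;
    *.herokuapp.com)     echo "Heroku"            ;;  # assumed suffix, not from the source
    *)                   echo "none"              ;;
  esac
}

saas_provider "old-project.github.io"    # hypothetical CNAME target
saas_provider "www.nhs.uk"
```

Records that match would then be checked by the owning team to confirm the backing resource still exists.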

---

## <h3 style="color:#1a4c8f;">3. Targeted NHS Phishing</h3>

Certificates reveal naming like:

* `autodiscover.nhs.uk`
* `securemail.nhs.uk`
* `passwordreset.nhs.uk`
* `accounts.nhs.uk`

Attackers craft extremely convincing phishing that mirrors real NHS flows.

---

## <h3 style="color:#1a4c8f;">4. Technology Fingerprinting</h3>

By analyzing issuers + SAN, attackers infer:

* Akamai
* Azure
* AWS
* API frameworks
* Developer environments
* Documentation platforms

Then match to:

* Known vulnerabilities
* Zero-days
* Misconfigurations

---

## <h3 style="color:#1a4c8f;">5. Legacy NHS Infrastructure</h3>

Old certs indicate forgotten systems, possibly running:

* Outdated Windows Server
* Old Exchange
* Unsupported devices

Attackers target these specifically.

---

# <h2 style="color:#2A5DB0;">🧩 MITRE ATT&CK Mapping (NHS CT Recon)</h2>

| Technique                                      | Description                                   |
| ---------------------------------------------- | --------------------------------------------- |
| **T1590 – Gather Victim Network Information**  | Extracting NHS subdomains from CT logs        |
| **T1596 – Search Open Technical Databases**    | Using crt.sh datasets                         |
| **T1589 – Gather Victim Identity Information** | Email-style subdomains reveal naming patterns |
| **T1189 – Drive-by Compromise**                | Subdomain takeover → malware delivery         |
| **T1190 – Exploit Public-Facing Application**  | Using CT-discovered endpoints                 |

---

# <h2 style="color:#2A5DB0;">🛠 Recommended Remediation</h2>

---

## <h3 style="color:#1a4c8f;">1. Review DNS Records for Takeover Risk</h3>

Any CNAME → SaaS must be checked:

* Azure
* GitHub
* Heroku
* AWS

If the backend resource is unclaimed → takeover risk.

---

## <h3 style="color:#1a4c8f;">2. Subdomain Inventory</h3>

Centralize an inventory of all subdomains:

* Active
* Legacy
* Third-party

Large NHS teams = easy for things to get lost.

---

## <h3 style="color:#1a4c8f;">3. Automated Certificate Monitoring</h3>

Weekly CT scraping to alert on:

* New certs
* Expired certs
* Unexpected issuers
* Wildcards

Recommended tools for automating certificate issuance and renewal:

* **Let’s Encrypt ACME**
* **Venafi**
* **Sectigo Certificate Manager**

---

## <h3 style="color:#1a4c8f;">4. Subdomain Takeover Testing</h3>

Check:

* orphaned CNAMEs
* dead SaaS mappings
* cloud resources that no longer exist

---

## <h3 style="color:#1a4c8f;">5. Harden DNS</h3>

* Registrar locks
* DNSSEC (where appropriate)
* Remove unused DNS entries

---

# <h2 style="color:#2A5DB0;">🧪 PART D: Live Certificate Check (Optional)</h2>

> This is *not* passive OSINT: it connects to the target host, so use it only if allowed.

```bash
echo | openssl s_client -servername vpn.nhs.uk -connect vpn.nhs.uk:443 2>/dev/null | openssl x509 -noout -dates -issuer -subject
```

Outputs:

* notBefore
* notAfter
* issuer
* subject

Confirms live certificate validity.

---

# <h2 style="color:#2A5DB0;">🚨 PART E: How This Leads to Attacks</h2>

Concrete scenario:

### ✔ Expired wildcard cert + orphaned subdomain

→ **Subdomain takeover**

Example:

* `oldapp.nhs.uk` → CNAME → dead GitHub Pages
* Attacker registers the abandoned GitHub Pages site
* Hosts malicious NHS-themed content
* Gets valid HTTPS via Let’s Encrypt
* Victims see **green lock + nhs.uk**

High-trust + high-impact = dangerous.




# <h1 align="center">🔍 Identifying Important IOCs (Indicators of Compromise)</h1>

When you click on a suspicious file (sample), you will typically see:

* **SHA-256 hash** <span style="color:gray;">Unique fingerprint of the malware</span>
* **File type** <span style="color:gray;">EXE, DLL, VBS, macro, script, etc.</span>
* **Threat family** <span style="color:gray;">Remcos RAT, Qbot, AgentTesla, Qilin, etc.</span>
* **Network communication** <span style="color:gray;">The IPs/domains the malware talks to</span>
* **Behavioral analysis** <span style="color:gray;">Keylogging, persistence, downloading payloads, encryption actions</span>

---

# <h2>🔵 STEP 1: Understanding IOCs</h2>

**IOC = Indicator of Compromise**
Think of IOCs as *clues left behind by a cyber intruder.*

### IOC Analogies

| IOC Type             | Description                                               |
| -------------------- | ----------------------------------------------------- |
| **Malware hash**     | A fingerprint of a virus                              |
| **Malicious domain** | The suspicious address the intruder communicates with |
| **Malicious IP**     | The street the intruder uses to enter the network     |

### Why IOCs matter

Tracking IOCs allows defenders to:

* Detect an intrusion early
* Understand attacker behavior
* Prevent further spread
* Correlate events across multiple systems

Even **one hash or malicious domain** can reveal an active compromise.

---

# <h1>🧪 Search for Qilin Samples on MalwareBazaar</h1>

MalwareBazaar = a repository of community-submitted malware samples.
Think of it as **a shared evidence locker** where researchers upload seized items.

Go to:

* https://bazaar.abuse.ch/
* https://bazaar.abuse.ch/browse/

Search for:

```
tag:qilin
```

<img src="pic/ti59.png" alt="malwarebazaar output 3">


You obtain:

* Multiple hashes
* Windows DLL/EXE samples
* Linux ELF samples
* Signatures tagged “Qilin”
* Upload timestamps (2023–2025)

### What the sample types represent

* **Droppers** – initial infection
* **Payloads** – core malware
* **Encryptors** – ransomware engine
* **Spreaders** – worm components
* **Linux/ESXi payloads** – used against hypervisors

### Why ELF samples matter

Hospitals heavily use **VMware ESXi**.
Finding Qilin ELF samples confirms:

> **Qilin is capable of targeting ESXi hypervisors, which are critical for healthcare environments.**

<img src="pic/ti60.png" alt="malwarebazaar output 3">

---

# <h2>🔐 What is a SHA-256 hash?</h2>

A **digital fingerprint**.
If two files have the same SHA-256 hash → they are **identical**.
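
The fingerprint idea can be tried with the `sha256sum` tool that ships with Kali. This is a small demonstration on harmless strings, not on malware; the helper name `sha256_of` is made up for this example.

```bash
# Hypothetical helper: print only the SHA-256 digest of a string.
sha256_of() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

sha256_of "abc"   # same bytes in ...
sha256_of "abc"   # ... same digest out
sha256_of "abd"   # one changed character gives a completely different digest
```

For a file on disk, `sha256sum <file>` prints the digest that can be pasted into MalwareBazaar or VirusTotal.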

---

# <h1>🧬 Pick One Hash → Analyze It in VirusTotal</h1>

VirusTotal (VT) =

* ✔ multi-AV engine
* ✔ behavioral sandbox
* ✔ IOC correlation graph
* ✔ community intelligence hub

Open: **[https://www.virustotal.com](https://www.virustotal.com)**


<img src="pic/ti61.png" alt="malwarebazaar output 3">

<img src="pic/ti63.png" alt="malwarebazaar output 3">
<img src="pic/ti64.png" alt="malwarebazaar output 3">

### Search your sample

* Paste the **SHA-256 hash**
* Or search: `tag:qilin`

VirusTotal shows:

* **Detection ratio** (e.g., *56 / 70 AV engines flagged it*)
* **Threat label** (e.g., *ransomware.qilin*)

> Analogy:
> “56 of 70 security guards report this person as suspicious.”

If major AV vendors detect it → treat the file as malicious.
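
When comparing several samples, it can help to turn the detection ratio into a single percentage. The helper below is a sketch with a made-up name (`detection_pct`); it uses integer arithmetic, so the result is rounded down.

```bash
# Hypothetical helper: whole-percent share of engines that flagged a sample.
# $1 = engines that detected it, $2 = engines that scanned it
detection_pct() {
  [ "$2" -gt 0 ] || return 1      # avoid division by zero
  echo $(( $1 * 100 / $2 ))       # integer division, rounds down
}

detection_pct 56 70   # the example ratio from above
```

The percentage is only a sorting aid; which vendors flagged the file still matters more than the raw count.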

---

# <h2>📡 RELATIONS TAB: What We Learned</h2>

<img src="pic/ti62.png" alt="malwarebazaar output 3">

Example observed URLs:

```
http://crt.sectigo.com/SectigoPublicCodeSigningCAR36.crt
http://crt.sectigo.com/SectigoPublicCodeSigningRootR46.p7c
```

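### What this reveals

These URLs point to Sectigo’s public code-signing CA certificates, which suggests the sample carries a code-signing signature and that its certificate chain is being fetched. In practice this means:

* Validating the signing certificate
* Downloading the certificate chain
* Building the trust path

**Why?**
A signed binary appears more legitimate, and the lookups blend into normal network traffic.

> Analogy:
> “A thief wearing a uniform to avoid suspicion.”

In a real NHS compromise:
Such certificate requests could **blend into the hospital’s routine certificate-validation traffic**.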

---

# <h1>🧪 Move to Behavioral Sandboxes</h1>

You can use:

* **Triage** → [https://tria.ge](https://tria.ge)
* **AnyRun** → [https://any.run](https://any.run)

<img src="pic/ti65.png" alt="malwarebazaar output 3">

Paste the **same hash**.

The sandbox will show:

* Execution behavior
* Network connections
* File system artifacts
* Registry modifications
* Persistence mechanisms
* Encryption patterns
* Dropped files
* Additional payloads

### Why this matters

Qilin operators **do not start with ransomware**.
They first need:

* Credentials
* Remote access
* Lateral movement
* Data exfiltration
* Privilege escalation

RATs such as **Remcos** are often used before the final encryption stage.

---

# <h1>📁 Behavior Tab: What to Look For</h1>

### 1. Network Activity

Example:
`185.189.112.19:30311` → record this IOC.
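
To record endpoints like this consistently, the `IP:port` pair can be split into separate CSV fields. The sketch below uses a hypothetical helper name (`endpoint_to_csv`) and a made-up `network_endpoint` type label; it only reformats text and never contacts the address.

```bash
# Hypothetical helper: split "IP:port" into CSV fields for the IOC list.
endpoint_to_csv() {
  ip=${1%:*}       # text before the last colon
  port=${1##*:}    # text after the last colon
  case "$port" in
    ''|*[!0-9]*) echo "invalid endpoint: $1" >&2; return 1 ;;  # port must be numeric
  esac
  echo "network_endpoint,$ip,$port"
}

endpoint_to_csv "185.189.112.19:30311"
```

Keeping the port in its own column makes it easier to correlate the endpoint with firewall or proxy logs later.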

### 2. File System Actions

* Are shadow copies deleted?
* Are files overwritten or encrypted?
* Is a ransom note created?

### 3. Process / Service Manipulation

Does it:

* Stop backup services?
* Disable security tools?
* Spawn child processes?

These behaviors reveal the **actual impact** and **intrusion sequence**.

---

# <h1>📘 VirusTotal MITRE Mapping</h1>

VT automatically maps observed behaviors to MITRE ATT&CK.

<img src="pic/ti66.png" alt="malwarebazaar output 3">

Examples we saw:

* **TA0002 – Execution**
* **TA0003 – Persistence**
* **TA0004 – Privilege Escalation**
* **TA0005 – Defense Evasion**
* **TA0006 – Credential Access**
* **TA0007 – Discovery**
* **TA0009 – Collection**
* **TA0011 – C2 (Command & Control)**
* **TA0040 – Impact**

This forms the backbone of your attack timeline.

---

# <h1>🔗 STEP 3: Correlate IOCs & TTPs with Known Actors</h1>

### Tools to use:

* **VirusTotal** (relations, detections)
* **AlienVault OTX** (pulses, actor profiles)
* **MITRE ATT&CK** (technique mapping)
* **Vendor reports**:

  * CrowdStrike
  * Check Point
  * Blackpoint
  * Kaspersky
  * Microsoft

---

# <h2>IOC: VirusTotal gives us</h2>


* File hashes
* IPs
* Domains
* URLs
* Sandbox behaviors

and

* MITRE techniques

---

# <h1>🔥 Introducing MITRE ATT&CK</h1>

**MITRE ATT&CK is like the global dictionary of hacker techniques.**

It allows us to understand:

* **What technique is being used?**
* **Where are we in the kill chain?**
* **Which defensive controls stop this?**

ATT&CK =

* ✔ Structured
* ✔ Universal
* ✔ Recognized by governments
* ✔ Used in cyber threat intelligence everywhere

We are going to use it in the Analysis phase.
