# Working Directory (First Step)

```bash
mkdir -p ~/recon/mike014 && cd ~/recon/mike014
```
What it does: Creates the folder to save all results and navigates into it.

---

## A — Passive OSINT (Ordinary Web Requests Only)

1. Download page + HTTP headers

```bash
curl -sSL -D headers.txt -o page.html https://mike014.github.io/michele-portfolio/resume.html
```
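What it does: Saves the response headers to `headers.txt` and the HTML source to `page.html`. Because `-L` follows redirects, `headers.txt` holds one header block per response in the redirect chain.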

2. Extract all links from the page

```bash
grep -Eo "https?://[A-Za-z0-9./?=_&%-]*" page.html | sort -u > links.txt
```
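What it does: Finds every absolute `http://` or `https://` URL in the HTML and saves the unique ones to `links.txt`. Relative links (for example `href="about.html"`) are not matched.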
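
Since the pattern above matches only absolute URLs, relative references to stylesheets, scripts, and images are missed. The following is a minimal sketch of one way to list them too, assuming attribute values are double-quoted; `extract_attr_targets` is a hypothetical helper name, not part of any tool.

```bash
# Hypothetical helper: print every unique href/src value in an HTML file.
# Assumption: attribute values are double-quoted; single-quoted or
# unquoted attributes are ignored.
extract_attr_targets() {
  local html_file="$1"
  grep -Eo '(href|src)="[^"]*"' "$html_file" \
    | sed -E 's/^(href|src)="//; s/"$//' \
    | sort -u
}
```

Usage: `extract_attr_targets page.html > targets.txt`.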

3. Extract emails from the source

```bash
grep -Eio "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}" page.html | sort -u > emails.txt
```
What it does: Finds and saves any exposed email addresses from the page.

4. Check robots.txt and sitemap (if they exist)

```bash
curl -fsS https://mike014.github.io/robots.txt -o robots.txt || echo "no robots"
curl -fsS https://mike014.github.io/sitemap.xml -o sitemap.xml || echo "no sitemap"
```
What it does: Saves `robots.txt` and `sitemap.xml` when they exist. `-f` makes curl exit with an error on HTTP 4xx/5xx responses, so a missing file prints "no robots" or "no sitemap" to the terminal instead of saving the 404 page.
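
Once `robots.txt` is on disk, the paths it asks crawlers to skip are usually the first thing to review. A minimal sketch, assuming the standard `Disallow:` line syntax; `list_disallowed` is a hypothetical helper name.

```bash
# Hypothetical helper: print the unique, non-empty Disallow paths
# from a robots.txt file, with trailing comments stripped.
list_disallowed() {
  local robots_file="$1"
  grep -Ei '^[[:space:]]*disallow:' "$robots_file" \
    | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
    | grep -v '^$' \
    | sort -u
}
```

Usage: `list_disallowed robots.txt`. No output means the file has no `Disallow` rules (or is not a real robots file).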

5. Search public certificates with crt.sh (passive)

```bash
curl -s "https://crt.sh/?q=%25.mike014.github.io" | grep -Eo "([A-Za-z0-9-]+\.)*mike014\.github\.io" | sort -u
```
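What it does: Queries the crt.sh certificate transparency search for `%.mike014.github.io` (`%25` is the URL-encoded `%` wildcard) and prints the unique hostnames that match your domain or its subdomains.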
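
The same filtering can be reused for any domain by escaping the dots before building the pattern. This is a sketch only; `filter_hosts` is a hypothetical helper name, and it matches on substrings, so review the output by hand.

```bash
# Hypothetical helper: read text on stdin and print the unique occurrences
# of the given domain and its subdomains, lowercased. Dots in the domain
# are escaped so they match literally rather than as "any character".
filter_hosts() {
  local domain="$1"
  local escaped="${domain//./\\.}"
  grep -Eio "([a-z0-9_-]+\.)*${escaped}" \
    | tr '[:upper:]' '[:lower:]' \
    | sort -u
}
```

Usage: `curl -s "https://crt.sh/?q=%25.mike014.github.io" | filter_hosts mike014.github.io`.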

---

## B — Passive Collection with Tools

1. theHarvester (OSINT)

```bash
theharvester -d mike014.github.io -b all -l 500 -f theharvester.html
```
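What it does: Collects emails, subdomains, and other public information from multiple sources, capped at 500 results per source (`-l 500`), and writes a report under the name given to `-f`. The report format depends on the theHarvester version installed.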

2. Limited alternative (Google only)

```bash
theharvester -d mike014.github.io -b google -l 200 -f th.html
```
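What it does: Same as above, but queries only Google and caps results at 200, which produces less noise. Source availability varies between theHarvester versions, so check the tool's help output for the supported `-b` values.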

---

## C — Active (Light Scanning — OK Since It's Your Site)

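> Use these only on assets you own or are explicitly authorized to test. The page content here is yours, but `mike014.github.io` resolves to servers operated by GitHub, so keep active scans light.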

1. TLS Certificate Info

```bash
openssl s_client -connect mike014.github.io:443 -servername mike014.github.io </dev/null 2>/dev/null | openssl x509 -noout -text > cert_info.txt
```
What it does: Opens a TLS connection with SNI (`-servername`), decodes the server certificate, and saves the text form to `cert_info.txt` (issuer, validity dates, SAN entries).
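
The decoded certificate is long, so it can help to pull out just the fields named above. A minimal sketch, assuming the usual `openssl x509 -text` layout (`Issuer:`, `Not After :`, and `DNS:` entries); `summarize_cert` is a hypothetical helper name.

```bash
# Hypothetical helper: print the issuer, expiry date, and SAN hostnames
# from a file produced by "openssl x509 -noout -text".
summarize_cert() {
  local cert_text="$1"
  # Issuer and expiry lines, with leading indentation removed.
  grep -E '^[[:space:]]*(Issuer:|Not After)' "$cert_text" \
    | sed -E 's/^[[:space:]]+//'
  # One SAN hostname per line.
  grep -Eo 'DNS:[^,[:space:]]+' "$cert_text" \
    | sed 's/^DNS:/SAN: /' \
    | sort -u
}
```

Usage: `summarize_cert cert_info.txt`.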

2. Light Web Scan (Ports 80 and 443 + Banner)

```bash
nmap -Pn -p 80,443 --script=http-title,http-server-header -oN nmap_http.txt mike014.github.io
```
What it does: Scans only ports 80 and 443, skipping host discovery (`-Pn`). The `http-title` and `http-server-header` scripts report the page title and `Server` header, and the output is saved to `nmap_http.txt`.
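
To pull only the open ports out of the saved report, something like the following may be enough. It is a sketch that assumes nmap's normal (`-oN`) output, where port lines have the columns `PORT STATE SERVICE`; `list_open_ports` is a hypothetical helper name.

```bash
# Hypothetical helper: print "port/proto service" for every port that
# nmap reported as open in a normal-format (-oN) output file.
list_open_ports() {
  local nmap_file="$1"
  awk '$1 ~ /^[0-9]+\/(tcp|udp)$/ && $2 == "open" { print $1, $3 }' "$nmap_file"
}
```

Usage: `list_open_ports nmap_http.txt`.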

3. Deep Scan (Noisy — Use with Caution)

```bash
nmap -sV -A -p 1-65535 --min-rate=1000 -oN nmap_full.txt mike014.github.io
```
What it does: Scans all 65,535 TCP ports. `-A` enables OS detection, version detection (so `-sV` is redundant), script scanning, and traceroute. Very intrusive and not necessary for GitHub Pages.

---

## D — Static Resource Enumeration / Download

1. Download Site (Limited Mirror)

```bash
wget --mirror --level=1 --no-parent -e robots=off https://mike014.github.io/michele-portfolio/
```
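What it does: Downloads the start page and everything it links to under `/michele-portfolio/` (recursion depth 1, never ascending to the parent directory). `-e robots=off` tells wget to ignore `robots.txt`.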
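
A quick way to see what the mirror actually contains is to count the downloaded files by extension. A minimal sketch; `count_by_extension` is a hypothetical helper name, and files without an extension are skipped.

```bash
# Hypothetical helper: count files under a directory by (lowercased)
# extension, most frequent first. Files with no dot in their name are
# not counted.
count_by_extension() {
  local dir="$1"
  find "$dir" -type f -name '*.*' \
    | sed -E 's/.*\.([^./]+)$/\1/' \
    | tr '[:upper:]' '[:lower:]' \
    | sort | uniq -c | sort -rn
}
```

Usage: `count_by_extension mike014.github.io` (wget saves the mirror in a directory named after the host).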

2. Directory Brute-Force (Only If You Have Permission)

```bash
gobuster dir -u https://mike014.github.io/michele-portfolio/ -w /usr/share/wordlists/dirb/common.txt -t 20
```
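What it does: Requests each wordlist entry as a path under the base URL to discover unlinked files and directories, using 20 concurrent threads (`-t 20`). Lower the thread count to reduce the load on the server.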

---

## Complete Script (Passive → Light Active)

Save to `~/recon/recon_mike014.sh`, make it executable and run it:

```bash
#!/bin/bash
# Passive collection plus two light active checks for one GitHub Pages site.
TARGET="mike014.github.io/michele-portfolio/resume.html"
HOST="mike014.github.io"
OUTDIR="$HOME/recon/mike014"
mkdir -p "$OUTDIR"
cd "$OUTDIR" || exit 1

# Page source and response headers (-L follows redirects).
curl -sSL -D headers.txt -o page.html "https://$TARGET"

# Absolute URLs and email addresses found in the HTML.
grep -Eo "https?://[A-Za-z0-9./?=_&%-]*" page.html | sort -u > links.txt || true
grep -Eio "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}" page.html | sort -u > emails.txt || true

# -f makes curl fail on HTTP errors, so a 404 page is not saved as robots.txt.
curl -fsS "https://$HOST/robots.txt" -o robots.txt || echo "no robots" > robots.txt
curl -fsS "https://$HOST/sitemap.xml" -o sitemap.xml || echo "no sitemap" > sitemap.xml

# Decoded TLS certificate (issuer, validity, SAN).
openssl s_client -connect "$HOST:443" -servername "$HOST" </dev/null 2>/dev/null | openssl x509 -noout -text > cert_info.txt || echo "cert fetch failed" > cert_info.txt

# Light scan of the two web ports only.
nmap -Pn -p 80,443 --script=http-title,http-server-header -oN nmap_http.txt "$HOST"

ls -l "$OUTDIR"
```
What it does: Automates the downloads and extractions from section A plus the certificate check and light nmap scan from section C, saving every output to `~/recon/mike014/`.

Execution:

```bash
chmod +x ~/recon/recon_mike014.sh
~/recon/recon_mike014.sh
```
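
After a run, it can be worth confirming that each expected file was actually written. The following is a minimal sketch, assuming the output file names used by the script above; `check_outputs` is a hypothetical helper, not part of the script.

```bash
# Hypothetical helper: report which expected output files exist and are
# non-empty in the given directory. Returns 1 if any is empty or missing.
check_outputs() {
  local outdir="$1" missing=0 f
  for f in headers.txt page.html links.txt emails.txt \
           robots.txt sitemap.xml cert_info.txt nmap_http.txt; do
    if [ -s "$outdir/$f" ]; then
      printf 'ok             %s\n' "$f"
    else
      printf 'empty/missing  %s\n' "$f"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_outputs ~/recon/mike014`. An empty `emails.txt` is not necessarily a failure; it may simply mean the page exposes no addresses.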

---

## 🔍 What to Look for in the Output Files

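* `headers.txt` → Look for `Server:`, `X-Cache`, and security headers such as `Content-Security-Policy` (CSP) and `Strict-Transport-Security` (HSTS).
* `page.html` / `links.txt` → Check external resources, CDNs, and analytics scripts.
* `emails.txt` → Any exposed contact addresses.
* `cert_info.txt` → Issuer (the issuing CA, for example Let's Encrypt), validity dates, and SAN entries (the hostnames the certificate covers).
* `nmap_http.txt` → Open ports and banners (GitHub Pages typically answers on both 80 and 443).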
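
The header review can be partly automated. Below is a minimal sketch that reports which common security headers appear in a saved header dump; the header list is an assumption chosen for illustration, and `check_security_headers` is a hypothetical helper name.

```bash
# Hypothetical helper: for a file of raw HTTP response headers, report
# whether each of a few common security headers is present.
# Header names are matched case-insensitively at the start of a line.
check_security_headers() {
  local headers_file="$1" h
  for h in strict-transport-security content-security-policy \
           x-content-type-options x-frame-options referrer-policy; do
    if grep -qi "^${h}:" "$headers_file"; then
      printf 'present  %s\n' "$h"
    else
      printf 'absent   %s\n' "$h"
    fi
  done
}
```

Usage: `check_security_headers headers.txt`. If the request was redirected, the file contains several header blocks, and a header counts as present when it appears in any of them.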

