# Welcome to the Malware Portion 👋

In this section we’ll explore **adversarial machine learning for malware** using the **EMBER 2024 (feature v3)** dataset. Unlike older EMBER releases that depended on LIEF, v3 is **Colab-friendly**: features are extracted with a pure-Python pipeline and each Portable Executable (PE) is represented by a **fixed-length vector (2,568 features)**. These features capture signals like byte histograms, byte-entropy patterns, string statistics, and PE header metadata—compact summaries a classifier can learn from.

We’ve already done the heavy lifting: a **multi-layer perceptron (MLP)** has been **trained on millions of Win32 samples** (many gigabytes). Your job now is to **use that trained model** to probe how sensitive it is to small changes in feature space and to try techniques similar to the image portion (e.g., FGSM/PGD-style moves, top-k edits).

### What you’ll do here
- **Load the trained model** and the stats needed to standardize features (mean/variance).
- **Classify sample executables** and observe the model’s confidence.
- **Run controlled feature-space experiments** (one-step and iterative, top-k changes) to see how targeted adjustments affect the score.
- **Reflect on why changes work (or don’t)** by relating them back to feature families (byte hist, entropy, strings, headers).

### How to get the most out of this
- **Poke and prod.** Tweaking is encouraged: adjust the step size (`eps` / `epsilon`), the number of edited features (`k` / `k_top`), and which feature families you allow (`byte_hist`, `byte_entropy`, `generic`, `pe_only`, or `all`), then rerun cells to compare outcomes.
- **Think in first-order terms.** Many methods rely on gradients (local sensitivity) to propose a direction; connect those proposals to the underlying features and constraints.
- **Keep it safe and scientific.** We’ll stay in **feature space** for our manipulations—perfect for learning and discussion—without modifying real binaries.

**Ready?** In the next cells, we’ll load the model and a test sample, then start experimenting.


### Environment setup (run once per session)

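This cell prepares your Colab runtime for the malware portion: it clones the workshop repo into `/content`, installs the requirements, and prints the versions of the key libraries as a sanity check.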

> - Run this **once** when you open the notebook.
> - If Colab restarts or you **Factory reset runtime**, run it again before continuing.

After this completes, you’re ready to load the trained model and begin experiments.

In [1]:
%%bash
set -e

REPO_URL="https://github.com/japheth45/Adversarial_ML25.git"
REPO_DIR="/content/Adversarial_ML25"
SUBDIR="$REPO_DIR/malware_classification"
REQ="$SUBDIR/requirements.txt"

rm -rf "$REPO_DIR"
git clone "$REPO_URL" "$REPO_DIR" -q
echo "✅ Cloned into: $REPO_DIR"
cd "$SUBDIR"
echo "📂 CWD is now: $(pwd)"

python -m pip install --upgrade pip setuptools wheel
echo "📦 Installing requirements from: $REQ"
python -m pip install -r "$REQ"

python - <<'PY'
import sys, numpy, sklearn, torch
print("Python:", sys.version.split()[0])
print("numpy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)
print("torch:", torch.__version__)
import thrember
print("thrember:", getattr(thrember, "__version__", "import ok"))
PY




### Paths & imports

This cell defines the paths we’ll use throughout the malware section and makes the helper module importable:

- `REPO_DIR` → the cloned workshop repo under `/content`
- `SUBDIR` → the malware classification folder with helpers and models
- `MODELS_DIR` → trained model + scaler stats
- `DATA_DIR` / `PE_PATH` → where the sample PE lives

Feel free to **change `PE_PATH` later** to try different files. Re-running this cell is fast and safe.

In [None]:
import os, sys, json

REPO_DIR = "/content/Adversarial_ML25"
SUBDIR    = os.path.join(REPO_DIR, "malware_classification")
MODELS_DIR = os.path.join(SUBDIR, "models")
DATA_DIR   = os.path.join(SUBDIR, "data")
PE_PATH    = os.path.join(DATA_DIR, "toy_malware.exe")

# make helpers importable
if SUBDIR not in sys.path:
    sys.path.append(SUBDIR)


### Baseline classification (sanity check)

Let’s confirm everything is wired up by classifying our "ransomware" sample once with the **trained MLP**:

- Asserts that the model artifacts and sample file exist.
- Runs `classify_path(PE_PATH, MODELS_DIR)` to print the **logit**, **probability**, and **label**.
- This gives us a starting point for the feature-space experiments that follow.
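
As a rough illustration of how a logit becomes a probability and a label, here is a minimal sketch. It is not the code inside `classify_path`: the four-feature vector, the linear scorer standing in for the MLP, the 0.5 threshold, and the result keys are all assumptions made for the example.

```python
import numpy as np

def standardize(x, mean, var, eps=1e-8):
    """Scale raw features to zero mean / unit variance using saved stats."""
    return (x - mean) / np.sqrt(var + eps)

def logit_to_result(logit, threshold=0.5):
    """Map a raw logit to a probability and a label (hypothetical key names)."""
    prob = 1.0 / (1.0 + np.exp(-logit))
    label = "malicious" if prob >= threshold else "benign"
    return {"logit": float(logit), "prob_malicious": float(prob), "label": label}

# Toy stand-in: 4 features and a linear scorer instead of the 2,568-dim MLP.
x_raw = np.array([10.0, 0.5, 3.0, 200.0])
mean = np.array([8.0, 0.4, 2.0, 150.0])
var = np.array([4.0, 0.01, 1.0, 2500.0])
w = np.array([0.8, -0.3, 0.5, 0.2])   # assumed weights, not the trained model
b = -0.1

x_scaled = standardize(x_raw, mean, var)
result = logit_to_result(float(w @ x_scaled + b))
print(result)
```

Every later experiment perturbs the standardized vector (`x_scaled` here), which is why step sizes are quoted in standardized units.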

In [None]:
from malware_helpers_v3 import classify_path

assert os.path.exists(PE_PATH), f"PE not found: {PE_PATH}"
assert os.path.isdir(MODELS_DIR), f"Models dir not found: {MODELS_DIR}"

print("Classifying:", PE_PATH)
result = classify_path(PE_PATH, MODELS_DIR)
print(json.dumps(result, indent=2))

### Single-step FGSM

This cell applies a **Fast Gradient Sign Method (FGSM)** step to the EMBER v3 feature vector of our sample.

1. **Extract & standardize** the 2,568-dim vector using the saved mean/variance.
2. **Compute the gradient** of the BCE-with-logits loss with respect to the standardized input (for a benign target, we move in the direction that reduces the malicious score).
3. **One-shot “sign” step:** move every coordinate by `eps` in the direction given by the sign of its gradient component, with the sign chosen so the step lowers the loss toward the target.
4. **Visualize Δ:** the helper plots the per-feature change and saves
   `original_scaled_vector_v3.npy` and `adversarial_scaled_vector_v3.npy` for reference.

**Parameter to try**
- `eps`: step size in standardized units (typical 0.01–0.10). Larger values push harder but can overshoot.

**Reading the plot**
- It’s a **line chart** of Δ over feature indices (0…2567). Dense FGSM moves nearly every coordinate by ±`eps`, so the plot usually looks like a band between −`eps` and +`eps`: near-vertical connectors appear wherever adjacent features have opposite signs, and long **white streaks** appear where a run of adjacent features shares the same sign (the line stays flat and never crosses).

> This is a **single** step. For stronger effects, we’ll try an **iterative** variant later (recompute gradient → step again).
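
The steps above can be sketched on a toy problem. This is a minimal illustration, not the implementation of `attack_and_plot_v3`: a linear scorer stands in for the MLP so the BCE-with-logits gradient can be written by hand, and the names `bce_grad_wrt_input` and `fgsm_step` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_grad_wrt_input(x, w, b, target):
    """Gradient of BCE-with-logits w.r.t. x for the linear scorer z = w.x + b."""
    return (sigmoid(w @ x + b) - target) * w

def fgsm_step(x, w, b, eps, target=0.0):
    """One dense sign step that lowers the loss toward `target` (0 = benign)."""
    grad = bce_grad_wrt_input(x, w, b, target)
    return x - eps * np.sign(grad)

rng = np.random.default_rng(0)
w = rng.normal(size=16)     # assumed weights of the stand-in scorer
b = 0.0
x = 0.2 * np.sign(w)        # a standardized point the toy scorer rates as malicious

x_adv = fgsm_step(x, w, b, eps=0.05)
delta = x_adv - x
print("prob before:", sigmoid(w @ x + b))
print("prob after: ", sigmoid(w @ x_adv + b))
print("max |delta|:", np.abs(delta).max())
```

Every entry of `delta` has magnitude `eps`, which is the band you see in the dense FGSM plot. With a real MLP the gradient would come from automatic differentiation rather than a closed form.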


In [None]:
from IPython.display import Image, display
from malware_helpers_v3 import attack_and_plot_v3

res = attack_and_plot_v3(
    pe_path=PE_PATH,
    models_dir=MODELS_DIR,
    eps=0.05,                         # tweak epsilon to taste
    plot_path=os.path.join(SUBDIR, "v3_perturbation.png"),
    save_npy=True,
)

display(Image(filename=res["plot_path"]))


### Sparse FGSM (top-K features)

In this step we run a **sparse** version of FGSM that updates only the **top-K most sensitive features** in **feature space**. Compared to the dense FGSM you just ran, this produces a **targeted** perturbation you can see clearly in the plot.

**What the helper does:**
1. **Extract & standardize** the EMBER v3 feature vector (2,568 dims).
2. **Differentiate the loss** (BCE-with-logits) with respect to the standardized input to get a **saliency/gradient** for each feature.
3. **Rank features by sensitivity** (magnitude of the gradient).
   The `k` highest-magnitude coordinates are chosen as the **update set**.
4. **One-shot “sign” step on K coords:** apply a step of size `ε` **only on those K features** (±ε depending on the target), leaving all others unchanged.
5. **Plot** the per-feature change (Δ). With sparse FGSM you’ll see **K distinct needles** rather than a solid band.

**Parameters to try**
- `k`: how many features to change (e.g., 5, 50, 200). Smaller `k` → more interpretable; larger `k` → stronger but less surgical.
- `eps` (`ε`): step size in standardized units (typical 0.01–0.10). Larger steps push harder but can overshoot.
- `family`: restrict which feature families are eligible (`"byte_hist"`, `"byte_entropy"`, `"generic"`, `"pe_only"`, or `"all"`). This lets you ask *which block carries most leverage?*

**Interpreting the output**
- The printed summary shows the **baseline** and **after-step** probabilities, plus **how many features changed** (this should equal `k` unless the chosen `family` has fewer than `k` eligible features).
- The plot shows Δ at each index; most entries will be zero, with **K spikes** at the selected coordinates.
- If the score doesn’t move much, try **increasing `k` or `ε`**, or allow a broader `family`.

> Tip: pair this sparse step with the dense FGSM plot you just saw to compare “spread the budget everywhere” vs “concentrate on a few coordinates.”
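
The ranking and masked update can be sketched as follows. This is an assumption-laden toy rather than the helper's code: the gradient is given directly, and the hypothetical `eligible` index list plays the role of the `family` filter.

```python
import numpy as np

def topk_fgsm_step(x, grad, k, eps, eligible=None):
    """Sign step on the k eligible coordinates with the largest |grad|."""
    score = np.abs(grad).astype(float)
    if eligible is not None:
        mask = np.zeros(x.shape[0], dtype=bool)
        mask[eligible] = True
        score[~mask] = -np.inf            # ineligible features can never be picked
    k = min(k, int(np.isfinite(score).sum()))
    idx = np.argsort(score)[::-1][:k]     # indices of the k largest scores
    x_new = x.copy()
    x_new[idx] -= eps * np.sign(grad[idx])
    return x_new, np.sort(idx)

grad = np.array([0.1, -2.0, 0.3, 1.5, -0.05, 0.7])
x = np.zeros(6)

x_new, changed = topk_fgsm_step(x, grad, k=2, eps=0.05)
print("changed indices:", changed)
print("delta:", x_new - x)

# Restricting to a 3-feature "family" caps the update set at 3 even if k=5.
x_fam, changed_fam = topk_fgsm_step(x, grad, k=5, eps=0.05, eligible=[0, 2, 4])
print("with a 3-feature family:", changed_fam)
```

Only the selected coordinates move, which is why the sparse plot shows isolated needles instead of a band.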


In [None]:
from IPython.display import Image, display
from malware_helpers_v3 import attack_and_plot_topk_v3

res = attack_and_plot_topk_v3(
    pe_path=PE_PATH,                  # from your earlier cell
    models_dir=MODELS_DIR,
    k=200,
    eps=0.05,
    family="all",                     # try 'byte_hist', 'byte_entropy', 'generic', or 'pe_only'
    plot_path=f"{SUBDIR}/v3_perturbation_topk.png",
)

display(Image(filename=res["plot_path"]))
#res  # show the JSON-style summary too


### Iterative PGD (top-K, projected)

Now we move from a single sign step to **Projected Gradient Descent (PGD)**—many small steps with a projection that keeps us inside a per-feature **L∞ budget**.

**What this cell does**
1. **Initialize** at the current standardized vector.
2. For each iteration:
   - **Compute gradient** toward the target.
   - **Select top-K** most sensitive coordinates (by magnitude), filtered by `family` and any masks.
   - **Step by `alpha`** on those K coordinates (sign of the gradient).
   - **Project** back into the L∞ ball of radius `eps` (so no coordinate moves more than `eps` total).
   - If `reselection=True`, re-pick the top-K each step (stronger but less stable); if `False`, keep the first set (easier to interpret).
3. **Plot** the per-feature Δ after all steps.

**Key parameters**
- `k`: number of coordinates updated each step (e.g., 50–200). Smaller K → more surgical; larger K → stronger but more coupled.
- `alpha`: step size per iteration (e.g., 0.005–0.02). If nothing moves, increase it; if you overshoot, reduce it.
- `steps`: number of iterations (e.g., 10–40). More steps let the attack “track” curvature.
- `eps`: maximum total change per coordinate (standardized units).
- `family`: restrict updates to certain feature families (`"byte_hist"`, `"byte_entropy"`, `"generic"`, `"pe_only"`, or `"all"`).
- `reselection`: `True` recomputes the K indices each step (usually stronger); `False` keeps a fixed set (more interpretable).

**Reading the plot**
- Expect **sparse points** at the indices updated across iterations. With reselection on, the pattern can shift as the optimizer adapts.
- If the score barely changes, try **raising `alpha` or `steps`**, or allow a broader `family`. If it becomes unstable, reduce `alpha` or K.
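
The loop described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the body of `attack_and_plot_pgd_topk_v3`: a linear scorer replaces the MLP, the function name `pgd_topk` is hypothetical, and `family` filtering is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_topk(x0, w, b, k, eps, alpha, steps, reselection=True, target=0.0):
    """Top-k PGD toward `target` with projection onto an L-infinity ball around x0."""
    x = x0.copy()
    idx = None
    for _ in range(steps):
        grad = (sigmoid(w @ x + b) - target) * w      # BCE-with-logits gradient
        if reselection or idx is None:
            idx = np.argsort(np.abs(grad))[::-1][:k]  # k most sensitive coordinates
        x[idx] -= alpha * np.sign(grad[idx])          # small signed step
        x = x0 + np.clip(x - x0, -eps, eps)           # project back into the budget
    return x

rng = np.random.default_rng(1)
w = rng.normal(size=32)     # assumed weights of the stand-in scorer
b = 0.0
x0 = 0.1 * np.sign(w)       # a point the toy scorer rates as malicious

x_adv = pgd_topk(x0, w, b, k=8, eps=0.05, alpha=0.01, steps=20)
delta = x_adv - x0
print("prob before:", sigmoid(w @ x0 + b))
print("prob after: ", sigmoid(w @ x_adv + b))
print("coords changed:", np.count_nonzero(np.abs(delta) > 1e-12))
print("max |delta|:", np.abs(delta).max())
```

With `alpha=0.01` and `eps=0.05`, a coordinate that is selected on every step reaches its budget after five steps and the projection holds it there. Because the stand-in scorer is linear, the top-k set never changes, so `reselection` makes no difference here; it only matters for a nonlinear model whose gradient shifts as the input moves.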


In [None]:
from IPython.display import Image, display
from malware_helpers_v3 import attack_and_plot_pgd_topk_v3

# Try 20 steps, alpha=0.01 first; bump steps or alpha if needed
res = attack_and_plot_pgd_topk_v3(
    pe_path=PE_PATH,                 # from your earlier cell
    models_dir=MODELS_DIR,
    k=200,
    eps=0.05,
    alpha=0.01,
    steps=20,
    family="all",                    # or 'byte_hist', 'byte_entropy', 'generic', 'pe_only'
    reselection=True,                # True often helps; try False for fixed set comparison
    plot_path=f"{SUBDIR}/v3_perturbation_pgd_topk.png",
)

display(Image(filename=res["plot_path"]))
#res  # show the JSON-style summary too


### Sparse L0 flip (find the smallest set of features that flips)

This cell searches for the **smallest number of feature coordinates (k)** that must be changed—within the selected family—to push the model past a target margin (e.g., `prob ≤ 0.499` for benign).

**How the search works**
1. **Exponential search:** start at `k_min=1` and try k = 1,2,4,8,… until *some* k succeeds.
2. **Binary search:** once a successful k is found, search downward to the **minimal k** that still meets the `margin`.
3. **Inner optimizer (for each k):** with the current k-mask, run a short gradient-based routine for up to `inner_steps` to find a good perturbation; multiple `random_restarts` improve robustness.

**Key controls**
- `family`: restricts the pool of editable features (e.g., `"byte_hist"`). Use `"all"` to give the search more room.
- `allowed_idx` / `exclude_idx`: optional fine-grained allow/deny lists.
- `margin`: the probability the sample must reach to count as flipped (e.g., `0.499` means `prob ≤ 0.499` for a benign target); values further from 0.5 are harder to reach.
- `inner_steps`, `inner_lr`: effort and step size of the inner optimization.
- `random_restarts`, `patience`: help escape flat spots and tolerate brief non-improvement.
- `max_outer_steps`: caps the exponential/binary search effort.

**Output**
- Prints the **baseline probability**, whether it **flipped**, the **minimal k** found, and **probability after**.
- Returns a summary that includes the list of **changed indices** (inside the final k-mask) and saves the before/after standardized vectors (`original_scaled_vector_v3.npy`, `perturbed_scaled_vector_v3.npy`).

> Tips: If the search stalls near the boundary, try increasing `inner_steps` (e.g., 400–800), `random_restarts`, or loosening `family` (e.g., from `"byte_hist"` to `"all"`).
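
The outer search can be sketched independently of the attack itself. In this sketch the hypothetical callback `succeeds(k)` stands in for "the inner optimizer met the margin with k features", and `minimal_k` and `k_max` are illustrative names, not part of `malware_helpers_v3`. It assumes success is monotonic in k, which a stochastic inner optimizer only approximates.

```python
def minimal_k(succeeds, k_min=1, k_max=2568):
    """Smallest k in [k_min, k_max] for which succeeds(k) is True, else None.

    Assumes monotonicity: if some k succeeds, every larger k succeeds too.
    """
    lo, hi = k_min, k_min
    # Exponential phase: double k until an attempt succeeds.
    while not succeeds(hi):
        if hi >= k_max:
            return None
        lo = hi + 1                  # everything up to hi is treated as failing
        hi = min(hi * 2, k_max)
    # Binary phase: shrink [lo, hi] while keeping succeeds(hi) true.
    while lo < hi:
        mid = (lo + hi) // 2
        if succeeds(mid):
            hi = mid
        else:
            lo = mid + 1
    return hi

tried = []

def toy_succeeds(k):
    """Toy oracle: pretend the flip needs at least 37 features."""
    tried.append(k)
    return k >= 37

best = minimal_k(toy_succeeds)
print("minimal k:", best)
print("k values tried:", tried)
```

The number of inner runs grows only logarithmically with the answer, which is why a cap such as `max_outer_steps=24` leaves ample room even for 2,568 features.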


In [None]:
from malware_helpers_v3 import run_sparse_flip_demo_v3

res = run_sparse_flip_demo_v3(
    pe_path=PE_PATH,
    models_dir=MODELS_DIR,
    k_min=1,
    inner_steps=200,          # try 400–800 if you tighten margin
    inner_lr=0.05,
    family="byte_hist",       # or 'all' to give the search more room
    allowed_idx=None,         # e.g., [0,1,2] for precise allow-list
    exclude_idx=None,         # e.g., [10,20,30] optional hard excludes
    margin=0.499,             # tighter targets may need more effort
    random_restarts=10,       # more restarts helps
    patience=5,               # tolerate more no-improve steps
    max_outer_steps=24,       # add a bit more room on the exp. search
    save_npy=True,
)
#res  # show the JSON-style summary too


### Overlay “closed loop” (exploratory)

This cell runs an **overlay loop** that appends small chunks of bytes to a **copy** of the sample and measures how the model’s score changes after each attempt. It is intended as an **exploratory, educational demo** of how byte-level edits can influence feature vectors (e.g., byte histograms). It is **not** intended for real-world evasion.

**What the loop does**
1. **Baseline:** measure the model’s probability on the original file.
2. **Propose a chunk:** pick a set of byte-histogram bins (controlled by `family` and `k_top`) and synthesize a small **overlay** chunk (`chunk_size_bytes`) targeting those bins.
3. **Evaluate & accept:** apply the candidate to a **temporary copy**, re-score, and **accept** only if the score moves in the desired direction by at least `min_improve`.
   - If the full chunk doesn’t help, the loop line-searches **smaller sub-chunks** (½, ¼, …) before counting it as a miss.
4. **Stop conditions:** stop when the model crosses the **target threshold**, when **patience** is exhausted (no useful improvement), or when the **total appended bytes** hits `max_total_append`.
5. **Optional shrink-to-fit:** after a successful step, try a smaller last chunk that still keeps the result.

**Key controls**
- `k_top`: how many bins to target per attempt (larger spreads the budget; smaller is more focused).
- `chunk_size_bytes`: size of each candidate overlay (the loop will try smaller sub-chunks if needed).
- `max_total_append`: global budget for all accepted overlays.
- `target_threshold`: stopping goal for the model’s probability (start easier, tighten later).
- `patience_steps` / `min_improve`: acceptance gate and early-stop behavior.
- `family`: restricts which feature families can be targeted (e.g., `"byte_hist"`; try `"all"` or include `"byte_entropy"` for more leverage).

**Output**
- A `summary` showing the **baseline** and **final** probabilities, whether the target was met, **total bytes appended**, and a brief **history** of attempts.

> **Note:** This step demonstrates *sensitivity* at the byte/feature level in a controlled setting. Results will vary by model and parameters, and the loop may stop early when improvements are too small or inconsistent.
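
To see why appended bytes move the byte-histogram features at all, the toy below works on in-memory byte strings rather than a PE file. It assumes the histogram is stored as normalized fractions; `byte_hist` is a hypothetical helper, not the extractor used by the notebook.

```python
from collections import Counter

def byte_hist(data: bytes):
    """Normalized 256-bin byte histogram (fractions that sum to 1)."""
    counts = Counter(data)
    n = len(data)
    return [counts.get(v, 0) / n for v in range(256)]

# 1,000 toy bytes: 60% 0x00, 30% 0x41, 10% 0xFF.
original = bytes([0x00] * 600 + [0x41] * 300 + [0xFF] * 100)
overlay = bytes([0x41] * 1000)        # toy chunk that only feeds bin 0x41

before = byte_hist(original)
after = byte_hist(original + overlay)
for v in (0x00, 0x41, 0xFF):
    print(f"bin 0x{v:02X}: {before[v]:.3f} -> {after[v]:.3f}")
```

Appending to one bin raises that bin's fraction and lowers every other bin's fraction, because the total length grows. An append-only edit therefore cannot move bins independently, which is one reason the loop re-scores after each candidate instead of trusting a single gradient estimate.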


In [None]:
from malware_helpers_v3 import OverlayConfig, run_overlay_closed_loop_v3

cfg = OverlayConfig(
    pe_in_path=f"{SUBDIR}/data/toy_malware.exe",
    pe_out_path=f"{SUBDIR}/data/demo_patch.exe",
    k_top=64,
    chunk_size_bytes=1024,
    max_total_append=1_048_576,
    target_threshold=0.49,    # start easier, tighten after you see progress
    patience_steps=10,
    min_improve=1e-4,         # allow small but real gains to accumulate
    shrink_to_fit=True,
    seed=1337,
    family="byte_hist",       # try 'all' or include 'byte_entropy' for more leverage
    allowed_idx=None,
    exclude_idx=None,
)
summary = run_overlay_closed_loop_v3(cfg, models_dir=MODELS_DIR, device="cpu")
# summary


### Iterative, normalization-aware FGSM (feature space)

Here we run a **multi-step** variant of FGSM that recomputes the gradient **each iteration** and moves only the coordinates where **adding bytes is predicted to help** (a normalization-aware selector):

**What happens each step**
1. **Gradient at the current point** (standardized features).
2. **Normalization-aware selection:** pick up to `k_top` histogram coordinates whose “add-bytes” direction lowers the loss (i.e., safe directions; we skip bins that would hurt).
3. **Nudge by `epsilon`** on those coordinates (optionally decaying `epsilon`), honoring an optional per-coordinate cap (`cap_linf`).
4. **Accept** the step if the probability improves by at least `min_improve`; stop early if the **margin** is met (e.g., `p ≤ 0.49` for benign).

**Parameters**
- `steps`: maximum iterations (try 5–20).
- `epsilon`: step size per iteration (e.g., 0.01–0.05); `epsilon_decay` can gently reduce it.
- `k_top`: number of coordinates to change per step (smaller = more focused).
- `family`: `"all"` or a subset like `"byte_hist"` to isolate where you want to act.
- `cap_linf`: optional per-coordinate budget across all steps (standardized units).
- `min_improve`, `patience`: simple monotonicity/early-stop controls.

**Output**
- Prints a small summary with the **before/after** probabilities, **how many steps** actually updated the vector, and the **number of coordinates** that received any change.
- For deeper inspection, check `run["history"]` to see per-step details (selected indices, step size, and acceptance).
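
One plausible reading of "normalization-aware" is sketched below. This is an assumption, not the logic of `fgsm_normaware_iter_v3`: it works on a normalized histogram `h` with a gradient `grad_h` taken with respect to `h` (the standardization chain rule is omitted), and `normaware_select` is a hypothetical name. To first order, adding bytes to bin `i` moves `h` along `e_i - h`, so the predicted loss change is proportional to `grad_h[i] - grad_h @ h`.

```python
import numpy as np

def normaware_select(grad_h, h, k_top):
    """Pick up to k_top bins where adding bytes is predicted to lower the loss."""
    gain = grad_h - grad_h @ h           # first-order loss change per bin
    helpful = np.flatnonzero(gain < 0)   # skip bins that would hurt or do nothing
    order = helpful[np.argsort(gain[helpful])]   # most helpful first
    return order[:k_top]

h = np.array([0.5, 0.3, 0.2, 0.0])            # toy 4-bin normalized histogram
grad_h = np.array([-0.2, -2.0, 0.5, -1.0])    # assumed loss gradient w.r.t. h

selected = normaware_select(grad_h, h, k_top=3)
print("selected bins:", selected)
print("bins with negative raw gradient:", np.flatnonzero(grad_h < 0))
```

Bin 0 has a negative raw gradient but is not selected: adding to it dilutes bin 1, which matters more, so the net predicted effect is harmful. That gap between the raw gradient sign and the net effect is what a normalization-aware selector accounts for.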


In [None]:
from importlib import reload
import malware_helpers_v3 as mh
reload(mh)

model, mean, var, _ = mh.load_artifacts(MODELS_DIR)
x0 = mh.extract_scaled_from_exe_v3(PE_PATH, mean=mean, var=var, device="cpu")

# Run 10 iterative steps, normalization-aware, feature-space only
x_final, run = mh.fgsm_normaware_iter_v3(
    x_scaled=x0, model=model, mean=mean, var=var,
    steps=10, epsilon=0.03, k_top=50,
    family="all",               # or "byte_hist" to isolate
    allowed_idx=None, exclude_idx=None,
    target=None,                # flip current label
    margin=0.49,                # stop once <= 0.49 (for benign)
    epsilon_decay=1.0,          # keep step size constant; try 0.9 if you like
    cap_linf=None,              # e.g., 0.2 to bound total per-dim movement
    min_improve=0.0,
    patience=3,
)

print({
    "before": run["before_prob"],
    "after": run["after_prob"],
    "steps_run": run["steps_run"],
    "changed_coords": run["final_changed_coords"],
})
# If you want, inspect run["history"] to see per-step detail.
