# LAB 6

Networks and Systems Security
Week 06
Binary Analysis & Symbolic Execution

### Setting up
Download Process Monitor (Procmon.exe) from the official Microsoft Sysinternals website:
https://learn.microsoft.com/sysinternals/downloads/procmon

Install the following libraries:

In [1]:
!pip install pefile yara-python

Collecting pefile
  Downloading pefile-2024.8.26-py3-none-any.whl.metadata (1.4 kB)
Collecting yara-python
  Downloading yara_python-4.5.4-cp312-cp312-win_amd64.whl.metadata (3.0 kB)
Downloading pefile-2024.8.26-py3-none-any.whl (74 kB)
Downloading yara_python-4.5.4-cp312-cp312-win_amd64.whl (1.8 MB)
Installing collected packages: yara-python, pefile
Successfully installed pefile-2024.8.26 yara-python-4.5.4


### Understanding the Role of a Malware Analyst
Before analysing any file, a professional malware analyst typically:
1. Reviews file metadata (hashes, timestamps, size).
2. Performs static analysis to understand structure and imports.
3. Searches for suspicious strings, URLs, IP addresses, or encoded data.
4. Applies custom or organisational YARA rules.
5. Produces IOCs (Indicators of Compromise) for incident response.
6. Writes reports for SOC teams or automated detection pipelines.
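
Step 1 of this list can be sketched with the standard library alone. The helper name `file_metadata` and the temporary stand-in file are assumptions made for illustration; in the lab, the path would point at the sample instead.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone

def file_metadata(path):
    """Return size, modification time (UTC) and SHA-256 for a file."""
    info = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
        "sha256": digest,
    }

# Hypothetical stand-in file (64 bytes) so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"MZ" + b"\x00" * 62)
    demo_path = tmp.name

meta = file_metadata(demo_path)
os.remove(demo_path)
print(meta)
```

Note that the file system timestamp only says when this copy was last written. It is separate from any timestamp stored inside the PE header.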

### Hash Calculation (IOCs)
In malware analysis, cryptographic hashes are one of the most
fundamental Indicators of Compromise (IOCs).
They serve as unique identifiers for a file, allowing:
- Threat intelligence sharing
- Duplicate sample detection
- Quick reputation checks
- SOC correlation and automated triage

MD5, SHA1, and SHA256 all generate a fixed-size fingerprint of the file,
but SHA256 is the industry standard for reliability and collision
resistance.
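
The fixed-size property can be checked directly. This is a minimal sketch, assuming a hypothetical helper `hash_stream` that reads a binary stream in chunks (which also keeps memory use low for large samples) and in-memory stand-ins for real files.

```python
import hashlib
import io

def hash_stream(stream, algorithms=("md5", "sha1", "sha256"), chunk_size=65536):
    """Update several digests from a single pass over a binary stream."""
    digests = {name: hashlib.new(name) for name in algorithms}
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for d in digests.values():
            d.update(chunk)
    return {name: d.hexdigest() for name, d in digests.items()}

# Two inputs of very different size; io.BytesIO stands in for open(path, "rb").
for size in (3, 100_000):
    result = hash_stream(io.BytesIO(b"a" * size))
    # Each hex character encodes 4 bits, so length * 4 is the digest size in bits.
    print(size, {name: len(value) * 4 for name, value in result.items()})
```

Whatever the input length, the digest sizes stay at 128, 160 and 256 bits.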


In [None]:
# copy full path of a benign executable (Procmon.exe)
# Read this file as raw bytes and compute the hash
import hashlib

def compute_hash(path, algorithm):
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        h.update(f.read())
    return h.hexdigest()

sample = r"c:\Users\kierm\Downloads\ProcessMonitor\Procmon.exe"

print("MD5: ", compute_hash(sample, "md5"))
print("SHA1: ", compute_hash(sample, "sha1"))
print("SHA256:", compute_hash(sample, "sha256"))

MD5:  c3e77b6959cc68baee9825c84dc41d9c
SHA1:  bc18a67ad4057dd36f896a4d411b8fc5b06e5b2f
SHA256: 3b7ea4318c3c1508701102cf966f650e04f28d29938f85d74ec0ec2528657b6e


#### Analysis
We read the executable as raw bytes and compute three digests:
- MD5: 128 bits (32 hex characters), the shortest and fastest, but practical collision attacks exist
- SHA1: 160 bits (40 hex characters), slightly longer than MD5 and also no longer collision resistant
- SHA256: 256 bits (64 hex characters), the longest of the three and the one to rely on

These hashes work well for integrity checks (as in the previous workshop), since any small change to the file produces a completely different hash thanks to the avalanche effect.
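
The avalanche effect is easy to observe on a small input. The sketch below flips a single bit of a test message (not the Procmon sample) and counts how many bits of the SHA256 digest change; `bit_difference` is a helper introduced here for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"The quick brown fox jumps over the lazy dog"
# Flip the lowest bit of the first byte: "T" (0x54) becomes "U" (0x55).
modified = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(modified).digest()

print("original:", d1.hex())
print("modified:", d2.hex())
print("bits changed:", bit_difference(d1, d2), "of", len(d1) * 8)
```

For a well-behaved hash function, roughly half of the 256 output bits are expected to differ, even though the inputs differ in only one bit.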

### String Extraction
Binary files, including malware, often contain human-readable text.
Analysts review these strings to identify:
- Hardcoded paths
- Registry keys
- Network infrastructure (domains, URLs, IPs)
- Encryption keys or markers
- Persistence mechanisms

Strings provide valuable hints early in the analysis—often before deeper
reverse engineering.


In [1]:
# Scan the file for printable ASCII sequences of at least four characters.
# These correspond to readable strings embedded in the executable.
import re

def extract_strings(path):
    with open(path, "rb") as f:
        data = f.read()
    pattern = rb"[ -~]{4,}"
    return re.findall(pattern, data)

sample = r"c:\Users\kierm\Downloads\ProcessMonitor\Procmon.exe"
strings = extract_strings(sample)

for s in strings[:20]:
    print(s.decode(errors="ignore"))

!This program cannot be run in DOS mode.
V*0T
0RichU
.text
`.rdata
@.data
.rsrc
@.reloc
hpqQ
h`EN
h|nN
h\nN
hlnN
=UUU
h_rM
hDLN
h`GO
hDLN
h|GO
hDLN


For benign tools like Procmon, strings look legitimate:
- DLL names
- Menu labels
- File paths
- Standard Windows messages

In actual malware, this step often reveals:
- Command-and-control domains
- Suspicious temp file names
- Embedded scripts
- Obfuscation artefacts
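
To go from a raw string dump to the categories listed at the start of this section, the extracted byte strings can be sorted with a few regular expressions. This is a rough sketch: the patterns, the helper name `classify_strings` and most of the demo strings are illustrative assumptions, not a vetted ruleset.

```python
import re

# Illustrative patterns only; real triage rules are broader and better tuned.
PATTERNS = {
    "url": re.compile(rb"https?://[^\s\"'<>]+"),
    "ipv4": re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "registry_key": re.compile(rb"(?i)\b(?:HKLM|HKCU|HKEY_[A-Z_]+)\\[^\x00\"']+"),
    "windows_path": re.compile(rb"(?i)\b[a-z]:\\[^\x00\"'<>|]+"),
}

def classify_strings(strings):
    """Group byte strings by the first match of each indicator pattern."""
    hits = {name: [] for name in PATTERNS}
    for s in strings:
        for name, pattern in PATTERNS.items():
            m = pattern.search(s)
            if m:
                hits[name].append(m.group().decode("ascii", errors="replace"))
    return hits

# Hypothetical strings in the shape returned by extract_strings().
demo_strings = [
    b"!This program cannot be run in DOS mode.",
    b"https://go.microsoft.com/fwlink/?LinkId=521839",
    b"HKLM\\Software\\Example\\Run",
    b"C:\\Users\\Public\\example.tmp",
    b"connect to 192.0.2.10 now",
]

hits = classify_strings(demo_strings)
for category, values in hits.items():
    print(category, values)
```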

#### Analysis
Our output shows a completely normal executable: the DOS stub message and standard section names such as .text, .rdata and .data. There are no suspicious readable commands or strings. The short fragments such as hpqQ and h`EN look like random noise, most likely raw code or data bytes that happen to fall in the printable range.

#### Interpretation
These strings can reveal early signs of malware very quickly. They can also expose file paths or DLL names early on, which makes string extraction a useful first test.
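
One limitation of the ASCII pattern is that it misses text stored as UTF-16LE, which Windows programs commonly use for wide-character strings. A minimal sketch, assuming a hypothetical helper `extract_wide_strings` and a hand-built stand-in buffer in place of the real file:

```python
import re

def extract_wide_strings(data: bytes, min_len: int = 4):
    """Find runs of printable ASCII characters encoded as UTF-16LE."""
    # Each character is a printable byte followed by a zero byte.
    pattern = re.compile(rb"(?:[ -~]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]

# Stand-in buffer: a narrow string, then a wide string, surrounded by padding.
demo = (
    b"\x01\x02.text\x00\xff\xff"
    + "Process Monitor".encode("utf-16-le")
    + b"\x00\x00\x90"
)
print(extract_wide_strings(demo))
```

Here the narrow `.text` is ignored and only the wide string is returned, so running both extractors gives a fuller picture of the readable text in a binary.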

### PE Header Inspection Using pefile
Most Windows malware is delivered as a Portable Executable (PE) file.

Learning to read PE headers reveals:
- How the program is structured
- Which libraries it relies on
- Whether the file shows signs of packing or obfuscation
- Possible capabilities (e.g., networking, registry manipulation)
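
A common heuristic for the packing question is byte entropy: compressed or encrypted data tends to sit close to 8 bits per byte, while ordinary code and data are usually lower. The function below is a stdlib-only sketch of Shannon entropy. With pefile, the bytes of each section could be supplied via `section.get_data()` for each entry in `pe.sections`. Any cut-off used to call a section "packed" is a rule of thumb and not a guarantee.

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, between 0.0 and 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1/p) over the byte values that occur.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

print(shannon_entropy(b"\x00" * 1024))          # one repeated byte value: no uncertainty
print(shannon_entropy(bytes(range(256)) * 4))   # every byte value equally often: maximum
print(shannon_entropy(os.urandom(4096)))        # random bytes: close to the maximum
```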



In [1]:
# Script to load PE file and read:
# • Entry Point: where execution begins
# • Image Base: preferred loading address in memory
# • Import Table: all external functions (APIs) the binary relies on
# These details form part of an analyst’s early triage
import pefile

sample = r"c:\Users\kierm\Downloads\ProcessMonitor\Procmon.exe"
pe = pefile.PE(sample)

print("Entry Point:", hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint)) #where execution starts
print("Image Base:", hex(pe.OPTIONAL_HEADER.ImageBase)) #preferred load address

print("\nImported DLLs and functions:") #all external functions the binary uses
for entry in pe.DIRECTORY_ENTRY_IMPORT:
    print(" ", entry.dll.decode())
    for imp in entry.imports[:5]:
        print(" -", imp.name.decode() if imp.name else "None")

Entry Point: 0xa7f70
Image Base: 0x400000

Imported DLLs and functions:
  WS2_32.dll
 - getsockname
 - listen
 - recv
 - closesocket
 - socket
  VERSION.dll
 - GetFileVersionInfoW
 - VerQueryValueW
 - GetFileVersionInfoSizeW
  COMCTL32.dll
 - ImageList_ReplaceIcon
 - ImageList_SetBkColor
 - ImageList_AddMasked
 - ImageList_BeginDrag
 - ImageList_EndDrag
  FLTLIB.DLL
 - FilterSendMessage
 - FilterGetMessage
 - FilterReplyMessage
 - FilterConnectCommunicationPort
  KERNEL32.dll
 - AcquireSRWLockExclusive
 - AcquireSRWLockShared
 - InitializeSRWLock
 - GetSystemInfo
 - VerSetConditionMask
  USER32.dll
 - LoadStringA
 - DrawEdge
 - GetMessageW
 - TranslateMessage
 - DispatchMessageW
  GDI32.dll
 - SaveDC
 - RestoreDC
 - SetBrushOrgEx
 - SetPixel
 - PatBlt
  COMDLG32.dll
 - ChooseColorW
 - GetOpenFileNameW
 - PrintDlgW
 - ChooseFontW
 - FindTextW
  ADVAPI32.dll
 - RegQueryValueExW
 - ConvertStringSidToSidW
 - ConvertSidToStringSidW
 - RegSetValueW
 - RegEnumKeyW
  SHELL32.dll
 - SHGetSpecia

Procmon imports functions from many legitimate Windows DLLs, such as:
- kernel32.dll
- user32.dll
- advapi32.dll

If this were malware, suspicious API imports might include:
- CreateRemoteThread (process injection)
- VirtualAllocEx (shellcode allocation)
- GetProcAddress and LoadLibraryA (dynamic API resolving)
- WinExec or ShellExecuteA (execution of child processes)
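
Checking an import list against these names is simple to automate. The sketch below works on a plain list of function names, such as the ones printed by the pefile loop. The lookup table reuses the APIs and reasons from the list above, while the helper name `flag_imports` and the demo list are assumptions.

```python
# API names and reasons taken from the list above.
SUSPICIOUS_APIS = {
    "CreateRemoteThread": "process injection",
    "VirtualAllocEx": "shellcode allocation",
    "GetProcAddress": "dynamic API resolving",
    "LoadLibraryA": "dynamic API resolving",
    "WinExec": "execution of child processes",
    "ShellExecuteA": "execution of child processes",
}

def flag_imports(import_names):
    """Return {api: reason} for imported names found in SUSPICIOUS_APIS."""
    return {name: SUSPICIOUS_APIS[name] for name in import_names if name in SUSPICIOUS_APIS}

# Hypothetical import list: two names from the Procmon output plus two flagged ones.
demo_imports = ["RegQueryValueExW", "GetMessageW", "VirtualAllocEx", "CreateRemoteThread"]
for api, reason in flag_imports(demo_imports).items():
    print(f"{api}: {reason}")
```

A hit is a prompt for closer inspection, not a verdict: plenty of benign programs import `GetProcAddress` or `LoadLibraryA`.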

#### Analysis
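Just like before, the output shows no signs of malware. The functions listed (only the first five per DLL) are ordinary Windows APIs, and none of the suspicious imports above appear among them. The networking (WS2_32.dll), registry (ADVAPI32.dll) and filter driver (FLTLIB.DLL) imports are consistent with a system monitoring tool. Overall this looks like a normal Windows PE.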
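
One detail about the header values printed above: `AddressOfEntryPoint` is a relative virtual address (RVA), an offset from the image base and not an absolute address. A short worked example using the two values from the Procmon output:

```python
# Values copied from the Procmon output above.
image_base = 0x400000
entry_point_rva = 0xA7F70

# If the image is loaded at its preferred base, the entry point's
# virtual address is the base plus the RVA.
entry_point_va = image_base + entry_point_rva
print(hex(entry_point_va))
```

If the loader relocates the image (for example under ASLR), the actual base differs and the same RVA is added to that base.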

### YARA Analysis
YARA is the primary tool for:
- Writing detection rules
- Identifying malware families
- Matching file characteristics in SOC pipelines

Analysts use YARA to express signatures based on strings, binary
patterns, and structural features.


In [3]:
# Defines two rules: ContainsHTTP triggers whenever the string "http" is found,
# SusSignatures triggers on any of five script-style strings
# Compiles the rules using yara-python
# Runs them against the sample file
import yara

rule_source = """
rule ContainsHTTP {
    strings:
        $s = "http"
    condition:
        $s
}

rule SusSignatures {
    strings:
        $signature1 = "eval("
        $signature2 = "base64.b64decode"
        $signature3 = "socket.connect"
        $signature4 = "exec("
        $signature5 = "import os"
    condition:
        any of ($signature*)
}
"""

sample = r"c:\Users\kierm\Downloads\ProcessMonitor\Procmon.exe"
rules = yara.compile(source=rule_source) 
matches = rules.match(sample) 
print(matches)

[ContainsHTTP]


#### Analysis
I've defined two rules: ContainsHTTP, which looks for the string "http" anywhere in the file, and SusSignatures, which looks for additional suspicious patterns.

As the output shows, YARA matched only ContainsHTTP and none of the suspicious signatures. On its own this does not indicate anything malicious, since executables contain HTTP URLs quite often. The SusSignatures strings are script-style patterns (Python code), so it is unsurprising that a compiled Windows executable does not contain them. The result does confirm that YARA is working as intended, and it gives a small demonstration of how real analysts might use the tool to detect patterns across large datasets or search for specific keywords.

#### Mini reflection
Since this is an analysis of an executable, I'm not exactly sure how to test it with a suspicious example, so I will leave it as is. In the future I'd like to do more testing with these quick inspection tools.
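
One safe way to exercise a rule like SusSignatures is to scan harmless bytes that merely contain the trigger text. The sketch below is a plain-Python stand-in for the rule's `any of` condition, not YARA itself; the sample contents and the helper name `matched_signatures` are assumptions. yara-python's `match` also accepts a `data=` argument, so the same bytes could be passed to the compiled rules instead.

```python
# Signature strings copied from the SusSignatures rule.
SIGNATURES = [b"eval(", b"base64.b64decode", b"socket.connect", b"exec(", b"import os"]

def matched_signatures(data: bytes):
    """Plain-Python stand-in for `any of`: list the signature strings present."""
    return [sig.decode() for sig in SIGNATURES if sig in data]

# Harmless test content: it only contains the trigger text and is never executed.
harmless_sample = b"# demo file\nimport os\nprint('eval( is just text here')\n"
benign_sample = b"This program cannot be run in DOS mode."

print(matched_signatures(harmless_sample))
print(matched_signatures(benign_sample))
```

The rule would fire on the first buffer (two signatures present) and stay silent on the second, which gives a positive and a negative test case without handling real malware.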

### Complete Static Triage Workflow
This section shows how different analysis techniques come together to
form a coherent triage workflow, similar to what an analyst would do
during a real investigation—except here all files are safe.

You should learn how to:
1. Compute hashes
2. Extract readable strings
3. Enumerate imports
4. Identify potential IOCs
5. Apply a YARA rule


In [4]:
import hashlib, pefile, re, yara

# sample = "samples/procmon.exe"
sample = r"c:\Users\kierm\Downloads\ProcessMonitor\Procmon.exe"

def compute_hashes(path):
    algos = ["md5", "sha1", "sha256"]
    output = {}
    for a in algos:
        h = hashlib.new(a)
        with open(path, "rb") as f:
            h.update(f.read())
        output[a] = h.hexdigest()
    return output

def extract_strings(path):
    with open(path, "rb") as f:
        data = f.read()
    return re.findall(rb"[ -~]{4,}", data)

print("Hashes:", compute_hashes(sample))

print("\nStrings:")
print(extract_strings(sample)[:10])

print("\nImports:")
pe = pefile.PE(sample)
for entry in pe.DIRECTORY_ENTRY_IMPORT:
    print(entry.dll.decode())

print("\nIOCs:")
# Decode leniently so the regexes can run on text; undecodable bytes are dropped.
with open(sample, "rb") as f:
    decoded = f.read().decode(errors="ignore")
print("URLs:", re.findall(r"https?://[^\s\"']+", decoded))
print("IPs:", re.findall(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", decoded))

print("\nYARA:")
rule = yara.compile(source="""
rule Simple {
    strings: $s = "http"
    condition: $s
}
""")
print(rule.match(sample))

Hashes: {'md5': 'c3e77b6959cc68baee9825c84dc41d9c', 'sha1': 'bc18a67ad4057dd36f896a4d411b8fc5b06e5b2f', 'sha256': '3b7ea4318c3c1508701102cf966f650e04f28d29938f85d74ec0ec2528657b6e'}

Strings:
[b'!This program cannot be run in DOS mode.', b'V*0T', b'0RichU', b'.text', b'`.rdata', b'@.data', b'.rsrc', b'@.reloc', b'hpqQ', b'h`EN']

Imports:
WS2_32.dll
VERSION.dll
COMCTL32.dll
FLTLIB.DLL
KERNEL32.dll
USER32.dll
GDI32.dll
COMDLG32.dll
ADVAPI32.dll
SHELL32.dll
ole32.dll
OLEAUT32.dll
SHLWAPI.dll
UxTheme.dll
dwmapi.dll
ntdll.dll

IOCs:
URLs: ['https://go.microsoft.com/fwlink/?LinkId=521839', 'https://go.microsoft.com/fwlink/?LinkId=521839', 'https://go.microsoft.com/fwlink/?LinkId=521839\\ul0\\cf0}}}}\\f0\\fs20', 'https://microsoft.com/exporting', 'https://microsoft.com/exporting}}}}\\f0\\fs19', 'http://www.microsoft.com/pkiops/crl/Microsoft%20Windows%20Third%20Party%20Component%20CA%202012.crl0\x06\x08+\x06\x01\x05\x05\x07\x01\x01\x04u0s0q\x06\x08+\x06\x01\x05\x05\x070\x02ehttp://www.microso

#### Analysis
Combining all the techniques from this lab shows how simple static analysis steps come together to form a useful pipeline, allowing us to quickly spot early signs of malware and understand a file's structure. The URL list also shows the limits of a simple regex: several matches are duplicates or carry trailing RTF markup or certificate bytes, so the results need cleaning before they can be used as IOCs.

Pipelines like this are used in real security analysis: detecting simple characteristics with YARA, inspecting PE files, extracting header information, interpreting findings and identifying potential artefacts.
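
The raw URL and IP matches from the workflow can be tidied before they are reported. The sketch below is one possible approach and not the lab's method: it restricts URLs to characters that are valid in a URI, removes duplicates, and drops IP candidates that are not valid IPv4 addresses. The helper name `clean_iocs` and the demo text are assumptions, with the first URL mimicking the RTF residue seen in the output.

```python
import ipaddress
import re

# Only characters allowed in a URI, so RTF braces and backslashes end the match.
URL_RE = re.compile(r"https?://[A-Za-z0-9\-._~:/?#\[\]@!$&()*+,;=%]+")
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

def clean_iocs(text):
    """Return de-duplicated URLs and syntactically valid IPv4 addresses."""
    urls = sorted(set(URL_RE.findall(text)))
    ips = []
    for candidate in set(IP_RE.findall(text)):
        try:
            ipaddress.IPv4Address(candidate)  # rejects octets above 255
        except ValueError:
            continue
        ips.append(candidate)
    return {"urls": urls, "ips": sorted(ips)}

# Hypothetical decoded text with a duplicate URL and one invalid IP candidate.
demo_text = (
    "https://microsoft.com/exporting}}}}\\f0\\fs19 "
    "https://microsoft.com/exporting "
    "update from 192.0.2.10 and 999.1.1.1"
)
print(clean_iocs(demo_text))
```

Valid-looking IPs still deserve a manual check, since four-part version numbers match the same pattern.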