# Malware Analysis 
This notebook performs the initial stages of malware analysis.

## How To
1. Place the malware samples into the malware directory
2. Walk through the steps in the notebook
3. Interact with the prompts
4. Perform the exercises along the way

# Imports

In [103]:
import os
import hashlib
import requests
import yara
from virustotal_python import Virustotal
from IPython.display import display, Markdown

# Listing of the files in the Malware directory

In [104]:
file_directory_path = '/home/aggelos/Desktop/notebook/malicious'  # Directory containing files
files = [f for f in os.listdir(file_directory_path) if os.path.isfile(os.path.join(file_directory_path, f))]
print(f"Files found in the directory: {file_directory_path}/ : \n")
for file in files:
    print(file)


Files found in the directory: /home/aggelos/Desktop/notebook/malicious/ : 

eicar[dot]exe
text[dot]sh


# VirusTotal API Key Validation

Provide a valid VirusTotal key.

## Exercise

The key you enter is echoed in clear text. Try to hide the input, as you would a sensitive password (hint: the getpass library).
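One possible starting point for this exercise, as a minimal sketch: the standard-library function `getpass.getpass` reads a line without echoing it. The helper names `read_api_key` and `mask_key` are hypothetical and not part of the notebook. The prompt function is passed in as a parameter only so the helper can be exercised without a terminal.

```python
import getpass

def read_api_key(prompt="Enter your VirusTotal API key: ", prompt_fn=getpass.getpass):
    """Read the key without echoing it; prompt_fn is injectable for testing."""
    return prompt_fn(prompt).strip()

def mask_key(key, visible=4):
    """Return the key with all but the last `visible` characters replaced by '*'."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

In the validation loop, `V_API_KEY = read_api_key()` would take the place of the `input(...)` call, and `mask_key(V_API_KEY)` could be printed if some confirmation of the entered key is wanted.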

In [105]:
def validate_vt_api_key(key):
    """ Validate the VirusTotal API key by making a test request. """
    try:
        vtotal_temp = Virustotal(API_KEY=key, API_VERSION="v3")
        # Make a benign test call using a known file hash to check the API key validity
        response = vtotal_temp.request("files/908B64B1971A979C7E3E8CE4621945CBA84854CB98D76367B791A6E22B5F6D53")
        if response.status_code == 200:
            return True
        elif response.status_code == 403:
            print("Invalid API key. Please re-enter.")
            return False
        else:
            print(f"Failed to validate API Key: HTTP Status Code {response.status_code}. Response: {response.data}")
            return False
    except requests.exceptions.RequestException as e:
        print(f"Network error during API request: {str(e)}")
        return False
    except Exception as e:
        print(f"An error occurred: {str(e)}")
        return False

# Loop to ensure the user enters a valid API key
valid_key = False
while not valid_key:
    V_API_KEY = input("Enter your VirusTotal API key: ").strip()
    valid_key = validate_vt_api_key(V_API_KEY)
    if not valid_key:
        print("Please enter a valid VirusTotal API key.")

print("API Key validated successfully.")


Enter your VirusTotal API key: <redacted>
API Key validated successfully.


# Defanging of files

Lets you defang filenames so that a sample is not opened by an accidental click. The cell reports the outcome based on your answer.

## Exercise

The following code defangs all filenames to avoid misclicks. Try to expand it to defang any URLs inside the files as well. For example, some text files might contain malicious URLs, and those should be defanged too.
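A hedged sketch of the URL part of this exercise: it rewrites `http`/`https` URLs found in a string using the common `hxxp` and `[.]` convention. The regular expression and the names `URL_PATTERN` and `defang_urls` are assumptions made for illustration, and the pattern is deliberately simple rather than a full URL parser.

```python
import re

# Matches http/https URLs; the host is everything up to the first '/',
# whitespace, quote or angle bracket. This is a simplification.
URL_PATTERN = re.compile(r"\b(https?)://([^\s/'\"<>]+)([^\s'\"<>]*)", re.IGNORECASE)

def _defang_match(match):
    """Rewrite one matched URL: http -> hxxp, dots in the host -> [.]"""
    scheme, host, rest = match.groups()
    safe_scheme = scheme.lower().replace("http", "hxxp")
    safe_host = host.replace(".", "[.]")
    return f"{safe_scheme}://{safe_host}{rest}"

def defang_urls(text):
    """Return text with every http/https URL defanged."""
    return URL_PATTERN.sub(_defang_match, text)
```

Writing the defanged text back into a sample changes its contents and therefore its SHA-256, so it is probably safer to apply `defang_urls` to a copy or to text that is about to be displayed.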

In [106]:
def defang_filenames(directory, perform_defang, user_input):
    files = os.listdir(directory)
    all_defanged = True  # Assume all are defanged unless proven otherwise
    any_defanged = False  # Track if any file has been defanged

    for filename in files:
        if not is_defanged(filename):  # Check if the file needs defanging
            all_defanged = False  # Found a file that is not defanged
            if perform_defang:
                defanged_name = defang_filename(filename)
                original_filepath = os.path.join(directory, filename)
                new_filepath = os.path.join(directory, defanged_name)
                os.rename(original_filepath, new_filepath)
                any_defanged = True
    
    if any_defanged:
        return "DEFANGED"
    elif all_defanged:
        return "No actions needed, all files are already DEFANGED"  # All were already defanged
    else:
        return "NOT DEFANGED" if user_input.lower() == 'no' else "INVALID INPUT"

def is_defanged(filename):
    """Check if the filename is already defanged."""
    return '[dot]' in filename or '[slash]' in filename or '[colon]' in filename

def defang_filename(filename):
    """Defang the filename by replacing certain characters."""
    defanged_name = filename.replace('.', '[dot]').replace('/', '[slash]').replace(':', '[colon]')
    return defanged_name

# User interaction for defanging decision
defang_choice = input("Do you want to defang the filenames? Yes/No: ")
perform_defang = defang_choice.strip().lower() == 'yes'

# Directory containing files
file_directory_path = '/home/aggelos/Desktop/notebook/malicious'

# Execute defanging based on user choice and print the appropriate response
defanging_result = defang_filenames(file_directory_path, perform_defang, defang_choice)
print(defanging_result)


Do you want to defang the filenames? Yes/No: yes
No actions needed, all files are already DEFANGED


# Calculating the sha256 of each file

## Exercise

Try to create a file analyzer that reports the MIME type of each file. This gives a better all-round view when there are many malware samples to analyze.
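As a rough sketch for this exercise, the type can be guessed from the first bytes of a file, which keeps working after the filenames are defanged (extension-based lookups would not). The signature table below is an assumption for illustration: it covers only a handful of formats, and the MIME labels loosely follow what common file-identification tools report. The names `MAGIC_SIGNATURES`, `sniff_mime` and `mime_of_file` are hypothetical.

```python
# Illustrative signature table (an assumption, far from complete)
MAGIC_SIGNATURES = [
    (b"MZ", "application/x-dosexec"),      # DOS/Windows executables
    (b"\x7fELF", "application/x-elf"),     # ELF binaries
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),    # also Office documents, JARs, APKs
    (b"#!", "text/x-shellscript"),         # any script with a shebang line
]

def sniff_mime(header):
    """Guess a MIME type from the leading bytes of a file."""
    for magic, mime in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return mime
    return "application/octet-stream"  # unknown or plain data

def mime_of_file(filepath, header_size=16):
    """Read only the first bytes of the file and classify them."""
    with open(filepath, "rb") as f:
        return sniff_mime(f.read(header_size))
```

A dictionary comprehension over `files`, in the same style as the hashing cell, would give one MIME guess per sample.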

In [107]:
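def sha256sum(filepath):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(filepath, 'rb') as file:
        for chunk in iter(lambda: file.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()

# Re-list the directory: the defanging step may have renamed the files
files = [f for f in os.listdir(file_directory_path) if os.path.isfile(os.path.join(file_directory_path, f))]
hashes = {file: sha256sum(os.path.join(file_directory_path, file)) for file in files}
for file, hash_val in hashes.items():
    print(f"SHA256 of {file}: {hash_val}")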

SHA256 of eicar[dot]exe: e038b5168d9209267058112d845341cae83d92b1d1af0a10b66830acb7529494
SHA256 of text[dot]sh: 5b5b114edabcdf33ad1dcc78f7c2e5be0ad3412a0c97afaaee2f2d585bc581bf


# Check the files on VirusTotal

## Exercise

Try to extend the existing code and add a section where you leverage the VirusTotal API to check malicious URLs as well.
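A possible sketch for this exercise, under stated assumptions: the v3 API addresses a URL report by an identifier that is the URL-safe base64 encoding of the URL with the `=` padding stripped, requested under the `urls/` path. The function names `vt_url_id` and `check_url_virustotal` are hypothetical, and `client` is assumed to be a `Virustotal` instance like the `vtotal` object created in the cell below.

```python
import base64

def vt_url_id(url):
    """URL identifier for the v3 API: URL-safe base64 of the URL without '=' padding."""
    return base64.urlsafe_b64encode(url.encode()).decode().strip("=")

def check_url_virustotal(client, url):
    """Return 'malicious / total' engine counts for a URL, or a status message."""
    try:
        resp = client.request(f"urls/{vt_url_id(url)}")
        stats = resp.data["attributes"]["last_analysis_stats"]
    except Exception:
        # Same coarse handling as the file check: any failure counts as not found
        return "Not Found on VirusTotal"
    if stats["malicious"] > 0:
        return f"{stats['malicious']} / {sum(stats.values())}"
    return "No detections"
```

Passing the client in as a parameter keeps the helper independent of the global `vtotal` object, which also makes it easy to try out with a stub before spending API quota.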

In [108]:
vtotal = Virustotal(API_KEY=V_API_KEY, API_VERSION="v3")
def check_virustotal(file_hash):
    """Return 'malicious / total' engine counts for a hash, or a status message."""
    try:
        resp = vtotal.request(f"files/{file_hash}")
        stats = resp.data['attributes']['last_analysis_stats']
        if stats['malicious'] > 0:
            score = f"{stats['malicious']} / {sum(stats.values())}"
        else:
            score = "No detections"
    except Exception:
        # Any failure (unknown hash, network error, quota) is reported as not found
        score = "Not Found on VirusTotal"
    return score

vt_scores = {file: check_virustotal(hash_val) for file, hash_val in hashes.items()}
for file, score in vt_scores.items():
    print(f"VirusTotal Score for {file}: {score}")


VirusTotal Score for eicar[dot]exe: 60 / 74
VirusTotal Score for text[dot]sh: Not Found on VirusTotal


# Load Yara rules and Scan Files

## Exercise

Try creating your own YARA rule that can be used to detect ransomware.
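One way to start on this exercise, offered as an illustrative sketch rather than a production rule: the rule text below looks for typical ransom-note phrases combined with commands often used to delete backups. The strings, the rule name, the file name `ransomware_sketch.yar` and the helper `save_rule` are all assumptions chosen for the example; real ransomware families will need far more specific indicators.

```python
import os

# Hypothetical heuristic rule: ransom-note wording AND a backup-deletion command
RANSOMWARE_RULE = r'''
rule Suspected_Ransomware_Strings
{
    meta:
        description = "Illustrative heuristic only; expect false positives and misses"
    strings:
        $note1 = "your files have been encrypted" nocase ascii wide
        $note2 = "decrypt your files" nocase ascii wide
        $cmd1 = "vssadmin delete shadows" nocase ascii wide
        $cmd2 = "wbadmin delete catalog" nocase ascii wide
    condition:
        any of ($note*) and any of ($cmd*)
}
'''

def save_rule(directory, filename="ransomware_sketch.yar", source=RANSOMWARE_RULE):
    """Write the rule text into the rules directory and return its path."""
    path = os.path.join(directory, filename)
    with open(path, "w") as f:
        f.write(source.lstrip())
    return path
```

Calling `save_rule(yara_rule_directory_path)` before the loading cell runs should be enough for the rule to be picked up, since that cell reads every file ending in `.yar`.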

In [109]:
yara_rule_directory_path = '/home/aggelos/Desktop/notebook/yara/test'
rules = {}
for file in os.listdir(yara_rule_directory_path):
    if file.endswith('.yar'):
        path = os.path.join(yara_rule_directory_path, file)
        with open(path, 'r') as f:
            rules[file] = f.read()

compiled_rules = yara.compile(sources=rules)
def scan_with_yara(filepath, rules):
    matches = rules.match(filepath)
    return ", ".join([match.rule for match in matches]) if matches else "No matches"

yara_results = {file: scan_with_yara(os.path.join(file_directory_path, file), compiled_rules) for file in files}
for file, result in yara_results.items():
    print(f"YARA Matches for {file}: {result}")


YARA Matches for eicar[dot]exe: eicar
YARA Matches for text[dot]sh: No matches


In [110]:
# Final Summary Report for Each File
print("\nFinal Summary Report:")
print("---------------------------------------------------------------------")
for file in files:
    file_path = os.path.join(file_directory_path, file)
    file_hash = hashes[file]
    vt_score = vt_scores[file]
    yara_result = yara_results[file]
    defanged_status = "DEFANGED" if is_defanged(file) else "NOT DEFANGED"

    print(f"File Name: {file}")
    print(f"File Hash: {file_hash}")
    print(f"VirusTotal Score: {vt_score}")
    print(f"Yara rule Matches: {yara_result}")
    print(f"Defanged: {defanged_status}")
    print("---------------------------------------------------------------------\n")


Final Summary Report:
---------------------------------------------------------------------
File Name: eicar[dot]exe
File Hash: e038b5168d9209267058112d845341cae83d92b1d1af0a10b66830acb7529494
VirusTotal Score: 60 / 74
Yara rule Matches: eicar
Defanged: DEFANGED
---------------------------------------------------------------------

File Name: text[dot]sh
File Hash: 5b5b114edabcdf33ad1dcc78f7c2e5be0ad3412a0c97afaaee2f2d585bc581bf
VirusTotal Score: Not Found on VirusTotal
Yara rule Matches: No matches
Defanged: DEFANGED
---------------------------------------------------------------------

