#**Task 1**
##**FILE INTEGRITY CHECKER**
A File Integrity Checker is a security tool that monitors files for unauthorized or unintended changes. It calculates a cryptographic hash (SHA-256) for each file and stores the results as a trusted baseline. On subsequent checks, the tool recalculates the hashes and compares them with the stored values to detect modified, added, or deleted files. Any change to a file's content produces a different hash with overwhelming probability, so a mismatch reliably signals an integrity violation.
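
As a small illustration of why a hash comparison detects changes, the sketch below hashes two in-memory byte strings that differ by a single byte. The variable names are chosen here for illustration only and are not part of the checker itself.

```python
import hashlib

# Two inputs that differ by one trailing byte (illustrative values).
original = b"Hello World"
tampered = b"Hello World!"

original_digest = hashlib.sha256(original).hexdigest()
tampered_digest = hashlib.sha256(tampered).hexdigest()

print(original_digest)
print(tampered_digest)
print("changed:", original_digest != tampered_digest)
```

Both digests are 64 hexadecimal characters long, yet they share no obvious resemblance, which is why comparing digests is enough to flag even a one-byte edit.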

###**Hash Generation and Directory Scanning**
This section implements the core of the File Integrity Checker: calculating SHA-256 hashes for files. The code walks a specified directory tree, reads each file in binary mode in 4096-byte chunks, and produces a digest that represents the file's content. The digests are stored in a dictionary keyed by file path and are later compared against the baseline to detect modifications.

In [41]:
import hashlib
import os
import json

DIRECTORY = "/content/test_files"
HASH_FILE = "file_hashes.json"

def calculate_hash(file_path):
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        while chunk := f.read(4096):
            sha256.update(chunk)
    return sha256.hexdigest()

def scan_directory(directory):
    hashes = {}
    for root, _, files in os.walk(directory):
        for file in files:
            path = os.path.join(root, file)
            hashes[path] = calculate_hash(path)
    return hashes
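
One possible variation, sketched here as an assumption rather than as part of the notebook: keying the dictionary by paths relative to the scanned directory, so that a baseline stays usable if the directory is moved. The function name `scan_directory_relative` and the `chunk_size` parameter are hypothetical.

```python
import hashlib
import os
import tempfile

def calculate_hash(file_path, chunk_size=4096):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        while chunk := f.read(chunk_size):
            sha256.update(chunk)
    return sha256.hexdigest()

def scan_directory_relative(directory):
    """Hypothetical variant: map paths relative to `directory` to digests."""
    hashes = {}
    for root, _, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            rel_path = os.path.relpath(path, directory)
            hashes[rel_path] = calculate_hash(path)
    return hashes

# Try it on a throwaway directory so nothing outside it is touched.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.txt"), "w") as f:
        f.write("Hello World")
    with open(os.path.join(tmp, "sub", "b.txt"), "w") as f:
        f.write("nested")
    result = scan_directory_relative(tmp)

print(sorted(result))
```

The trade-off is that relative keys only make sense when the same root directory is passed at baseline time and at verification time.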


###**File Creation and Initial Setup**

This code creates the target directory for monitoring and generates sample text files used for integrity checking. If the directory already exists, it is reused without errors. Two text files are created with initial content, establishing a baseline state for subsequent hash generation and integrity verification.

In [42]:
os.makedirs(DIRECTORY, exist_ok=True)

with open(f"{DIRECTORY}/sample1.txt", "w") as f:
    f.write("Hello World")

with open(f"{DIRECTORY}/sample2.txt", "w") as f:
    f.write("File Integrity Checker")

print("Files created")


Files created


###**Baseline Hash Generation and Storage**

This code generates the initial baseline by scanning the directory and calculating SHA-256 hash values for all files. The computed hashes are then stored in a JSON file, which serves as the trusted reference for future integrity checks. Printing the baseline hashes confirms successful baseline creation and allows verification of the stored hash values.

In [43]:
baseline_hashes = scan_directory(DIRECTORY)

with open(HASH_FILE, "w") as f:
    json.dump(baseline_hashes, f, indent=4)

print("🔹 BASELINE HASHES CREATED")
print(json.dumps(baseline_hashes, indent=4))


🔹 BASELINE HASHES CREATED
{
    "/content/test_files/sample2.txt": "63abe25b32836891729d470d1cfbdbe2ab3d55eae757b7bfb6d2790549e80951",
    "/content/test_files/sample1.txt": "a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e"
}
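
The output above lists `sample2.txt` before `sample1.txt` because `os.walk` returns directory entries in arbitrary order. The sketch below shows one way the baseline could be written with sorted keys so the JSON file is stable between runs, along with a loader that tolerates a missing baseline. The helper names `save_baseline` and `load_baseline` and the digest strings are placeholders, not part of the notebook.

```python
import json
import os
import tempfile

def save_baseline(hashes, hash_file):
    """Write the path-to-digest mapping as JSON with sorted keys."""
    with open(hash_file, "w") as f:
        json.dump(hashes, f, indent=4, sort_keys=True)

def load_baseline(hash_file):
    """Return the stored mapping, or an empty dict if no baseline exists yet."""
    if not os.path.exists(hash_file):
        return {}
    with open(hash_file, "r") as f:
        return json.load(f)

# Placeholder digests, used only to demonstrate the round trip.
sample_hashes = {"b.txt": "digest-b", "a.txt": "digest-a"}

with tempfile.TemporaryDirectory() as tmp:
    hash_file = os.path.join(tmp, "file_hashes.json")
    missing = load_baseline(hash_file)      # no baseline written yet
    save_baseline(sample_hashes, hash_file)
    restored = load_baseline(hash_file)
    with open(hash_file, "r") as f:
        raw_text = f.read()

print(raw_text)
```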


###**File Modification for Integrity Testing**

This code modifies an existing file by appending new content to it. The change is intentionally introduced to simulate a file update or unauthorized modification. This allows the File Integrity Checker to detect the change during the next integrity verification by comparing updated hash values with the stored baseline.

In [44]:
with open(f"{DIRECTORY}/sample1.txt", "a") as f:
    f.write("\nThis line was added later.")

print("sample1.txt modified")


sample1.txt modified


###**File Integrity Verification and Change Detection**

This code performs the integrity check by loading the stored baseline hashes and recalculating the current hashes of all files in the directory. Each current hash is compared with its baseline value. A file whose hash differs is reported as modified; if no hashes differ, the code reports that no changes were detected. Files that were added or deleted since the baseline are not reported by this check.

In [45]:
with open(HASH_FILE, "r") as f:
    old_hashes = json.load(f)

new_hashes = scan_directory(DIRECTORY)

print("CURRENT FILE HASHES:")
print(json.dumps(new_hashes, indent=4))

modified = []

for file in new_hashes:
    if file in old_hashes and new_hashes[file] != old_hashes[file]:
        modified.append(file)

print("\nINTEGRITY REPORT")
print("-------------------")

if modified:
    print("Modified Files:")
    for path in modified:
        print(" -", path)
else:
    print("No changes detected")


CURRENT FILE HASHES:
{
    "/content/test_files/sample2.txt": "63abe25b32836891729d470d1cfbdbe2ab3d55eae757b7bfb6d2790549e80951",
    "/content/test_files/sample1.txt": "d3eab03f21b7800a04368f4c08f7ca7a25fa0285fdd4ca8ec2bfca5b6266eed3"
}

INTEGRITY REPORT
-------------------
Modified Files:
 - /content/test_files/sample1.txt
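
The check above reports only modified files. A minimal sketch of how the comparison could be extended to also report added and deleted files, using set operations on the dictionary keys, follows. The function name `compare_hashes` and the digest strings are placeholders introduced for illustration.

```python
def compare_hashes(old_hashes, new_hashes):
    """Classify paths as modified, added, or deleted relative to the baseline."""
    old_paths = set(old_hashes)
    new_paths = set(new_hashes)
    return {
        # Present in both, but the digest differs.
        "modified": sorted(
            p for p in old_paths & new_paths if old_hashes[p] != new_hashes[p]
        ),
        # Present now, absent from the baseline.
        "added": sorted(new_paths - old_paths),
        # Present in the baseline, absent now.
        "deleted": sorted(old_paths - new_paths),
    }

# Placeholder digests standing in for real SHA-256 values.
old = {"sample1.txt": "aaa", "sample2.txt": "bbb", "old.txt": "ccc"}
new = {"sample1.txt": "zzz", "sample2.txt": "bbb", "new.txt": "ddd"}

report = compare_hashes(old, new)
for category, paths in report.items():
    print(category, paths)
```

In the notebook, `old_hashes` and `new_hashes` from the cell above could be passed in directly, since both are plain dictionaries keyed by file path.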
