# **📜 LAB 2**  - Hashing and secured ledger


## Introduction to Hashing - hashlib

Before jumping into using the hashlib library, let's first understand what hashing is.

What is Hashing?

Hashing is a process where you take input data (like a text or number) and convert it into a fixed-size string of characters.

The output of hashing is called a hash or digest.

Hashing is a one-way process, meaning once you get the hash, you cannot reverse it to get the original data back.

Even a small change in the input data will produce a completely different hash.

A common use of hashing in blockchain and security is to ensure data integrity—that is, ensuring that the data hasn't been tampered with.

For example, if someone tries to change a patient's record in our hospital ledger, the hash will no longer match, and we will know that the data was altered.

Why is Hashing Important?

* Data Integrity: It helps verify that the data has not been changed.

* Security: A hash does not expose the original data and cannot be reversed into it, which helps keep information safe.

Step-by-Step Guide to Using the hashlib Library

Python has a built-in library called hashlib that makes hashing easy. Let's first show the basics of how to use it.

1. Import the hashlib Library
To use hashing, we need to import the hashlib library. This is a built-in Python library, so you don't need to install anything.

In [None]:
import hashlib

2. Creating a Hash
Now, let's create a simple hash. We'll start with the basic SHA-256 hash function, which is one of the most commonly used hash functions.

In [None]:
# Input data
data = "Hello, blockchain!"

# Create a SHA-256 hash of the input data
hashed_data = hashlib.sha256(data.encode()).hexdigest()

# Print the hash
print(f"Original Data: {data}")
print(f"Hashed Data: {hashed_data}")


Original Data: Hello, blockchain!
Hashed Data: e485541186cd67682999c1ad80eac78fb803ec57885c26d5489efb01f15ae913


Explanation:

* hashlib.sha256() creates a SHA-256 hash object.

* .encode() converts the string into bytes (which is required for hashing).

* .hexdigest() converts the hash object into a readable hexadecimal string.
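To make the three points above concrete, here is a minimal sketch (the variable names are illustrative and not part of the lab). It shows that `sha256()` rejects a plain string, that `.digest()` returns the raw bytes behind `.hexdigest()`, and that data can also be fed in pieces with `.update()`:

```python
import hashlib

text = "Hello, blockchain!"

# Passing a str directly fails: hash functions operate on bytes.
try:
    hashlib.sha256(text)
    str_accepted = True
except TypeError:
    str_accepted = False
print(f"str accepted without encoding: {str_accepted}")

hasher = hashlib.sha256(text.encode("utf-8"))
raw_digest = hasher.digest()      # 32 raw bytes
hex_digest = hasher.hexdigest()   # the same value written as 64 hex characters
print(len(raw_digest), len(hex_digest))
print(raw_digest.hex() == hex_digest)

# Feeding the data in chunks gives the same result as hashing it all at once.
chunked = hashlib.sha256()
chunked.update("Hello, ".encode("utf-8"))
chunked.update("blockchain!".encode("utf-8"))
print(chunked.hexdigest() == hex_digest)
```

The chunked form is handy when the data is too large to hold in memory at once, such as a big file read block by block.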

3. Example: Hashing Simple Strings
Now, let's try it with different strings. Each string produces its own hash.

In [None]:
# Example 1: Hashing "apple"
hash1 = hashlib.sha256("apple".encode()).hexdigest()
print(f"Hash for 'apple': {hash1}")

# Example 2: Hashing "banana"
hash2 = hashlib.sha256("banana".encode()).hexdigest()
print(f"Hash for 'banana': {hash2}")


Hash for 'apple': 3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b
Hash for 'banana': b493d48364afe44d11c0165cf470a4164d1e2609911ef998be868d46ade3de4e


Expected Output:

The hashes for "apple" and "banana" are completely different, and both are the same length even though the words are not. Nothing in either hash hints at the word it came from.

4. Understanding the Output

Every SHA-256 hash is a string of 64 hexadecimal characters (the digits 0-9 and the letters a-f). For "apple", as printed above:

Hash for 'apple': `3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b`


This is the "digital fingerprint" of the word "apple".

Even if we change just one character in the input, the hash will be completely different:

In [None]:
# Example: Changing 'apple' to 'applz'
hash3 = hashlib.sha256("applz".encode()).hexdigest()
print(f"Hash for 'applz': {hash3}")


Hash for 'applz': 5af3dcd5b4a9c41fff050423eba31def641b469283f7e5b6431d0ba65d3f4f9a


Output: The hash for 'applz' will be completely different from 'apple'.
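One way to put a number on "completely different" is to count how many of the 256 output bits change between the two hashes. The following is a small sketch; the helper name `sha256_as_int` is made up for this example:

```python
import hashlib

def sha256_as_int(text):
    # Interpret the 32-byte digest as one large integer so we can XOR two digests.
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")

# XOR leaves a 1 in every bit position where the two hashes disagree.
xor = sha256_as_int("apple") ^ sha256_as_int("applz")
differing_bits = bin(xor).count("1")
print(f"{differing_bits} of 256 bits differ")
```

For a well-designed hash function, roughly half of the bits are expected to change on average, even when the inputs differ by a single character.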

5. Verifying Data Integrity

Let's now apply this to our hospital ledger. By hashing the visit data (like treatment details), we can ensure the data has not been changed.

In [None]:
# A patient's visit details
patient_name = "John Doe"
treatment = "X-ray"
cost = 200
date_of_visit = "2025-02-09"

# Combine all visit details into one string
visit_data = patient_name + treatment + str(cost) + date_of_visit

# Create a hash for this visit data
visit_hash = hashlib.sha256(visit_data.encode()).hexdigest()

# Print the visit hash
print(f"Visit Data: {visit_data}")
print(f"Visit Hash: {visit_hash}")


Visit Data: John DoeX-ray2002025-02-09
Visit Hash: 835364b0944dfcaec6d532f430c171785540e87834933f5efe84fa124c43d5e5


Explanation:

The visit_data combines all the relevant information into one string.

We then hash the combined string to get a unique hash for this visit.

If someone tries to change the cost or treatment, the hash will no longer match.
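One caveat about gluing the fields together with no separator: two different records can end up as the same combined string, and therefore the same hash. The sketch below illustrates this with a hypothetical second record and one possible workaround (joining with `|`). The function names are invented for this example, and the separator is an assumption, not part of the lab:

```python
import hashlib

def naive_hash(name, treatment, cost, date):
    # Same approach as above: plain concatenation, no separators.
    combined = name + treatment + str(cost) + date
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()

def delimited_hash(name, treatment, cost, date):
    # Hypothetical variant: a separator keeps the field boundaries in the hashed text.
    combined = "|".join([name, treatment, str(cost), date])
    return hashlib.sha256(combined.encode("utf-8")).hexdigest()

record_a = ("John Doe", "X-ray", 200, "2025-02-09")
record_b = ("John Doe", "X-ray", 2002, "025-02-09")  # different cost and (malformed) date

# Both records concatenate to "John DoeX-ray2002025-02-09".
print(naive_hash(*record_a) == naive_hash(*record_b))          # True
print(delimited_hash(*record_a) == delimited_hash(*record_b))  # False
```

A separator only helps if it cannot appear inside a field. Serializing the record in a structured format before hashing is another option.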

Recap of Key Concepts

* Hashing: A one-way process that converts input data into a fixed-length string.

* Hash Function: A mathematical function (like SHA-256) that always produces the same hash for the same input and, in practice, different hashes for different inputs.

* hashlib Library: Python’s built-in library to create hash values using different hash functions like sha256, md5, etc.
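As a quick illustration of the "different hash functions" point, the sketch below hashes the same message with a few algorithms that `hashlib` always provides and records the digest size of each. The choice of algorithms here is just an example:

```python
import hashlib

message = b"Hello, blockchain!"
sizes = {}

for name in ("sha256", "sha512", "sha3_256", "blake2b"):
    # hashlib.new() looks up the algorithm by name.
    hasher = hashlib.new(name, message)
    sizes[name] = hasher.digest_size * 8  # digest_size is in bytes
    print(f"{name:8} {sizes[name]:4} bits  {hasher.hexdigest()[:16]}...")
```

Note that md5 (and sha1) are still available in `hashlib`, but practical collision attacks against them are known, so they are not a good choice where tamper resistance matters.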

------------------

Can Hashes Be Decoded?

No, hashes cannot be decoded. Here's why:

One-Way Function: Hashing is a one-way function, meaning that once data is hashed, you cannot reverse the process to get the original data back. This is by design.

Fixed-Length Output: A hash function takes an input of any length and converts it into a fixed-length output. For example, SHA-256 always produces a 256-bit digest (64 hexadecimal characters), regardless of the size of the input. Because every digest has the same length, the hash reveals nothing about how long the input was, and because there are far more possible inputs than possible hashes, a hash alone cannot identify a single input.
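The fixed-length property is easy to check. This short sketch hashes inputs ranging from an empty string to a million characters (the inputs are arbitrary examples):

```python
import hashlib

inputs = ["", "a", "Hello, blockchain!", "x" * 1_000_000]
lengths = []

for text in inputs:
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    lengths.append(len(digest))
    print(f"input length {len(text):>9} -> digest length {len(digest)}")
```

Every line reports a digest length of 64 hexadecimal characters.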

No Reversible Process: Unlike encryption, which has a method to decrypt the data and get the original information back, hashing does not have a decryption method. The hash does not contain enough information to rebuild the original data, so the data cannot be recovered from the hash alone.

Can Hashes Be Cracked?
While hashes cannot be decoded directly, they can sometimes be cracked or reversed through methods like:

* Brute Force: Trying all possible inputs until the hash matches.

* Rainbow Tables: Pre-computed tables of hash values for common inputs, like passwords. However, using a salt (random data added to the input before hashing) can make this much harder.

* Dictionary Attacks: Trying words from a list of common words (like from a dictionary) to see if any hash matches.

But even in these cases, the process is not decoding the hash. It’s just finding a match between the hash and a known input.
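The attacks listed above all come down to "hash a guess and compare". Here is a minimal sketch of a dictionary attack and of how a salt changes the picture. The word list, the helper names, and the 16-byte salt length are assumptions made for this example:

```python
import hashlib
import os

def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()

def dictionary_attack(target_hash, candidates):
    # Hash each candidate word and compare it with the target hash.
    for word in candidates:
        if sha256_hex(word.encode("utf-8")) == target_hash:
            return word
    return None

word_list = ["password", "123456", "hello", "letmein"]  # made-up list of common inputs

# Unsalted hash of a common word: the attack finds it.
unsalted = sha256_hex("hello".encode("utf-8"))
found_unsalted = dictionary_attack(unsalted, word_list)
print(found_unsalted)  # hello

# Salted hash: random bytes are mixed in before hashing, so hashes of bare words no longer match.
salt = os.urandom(16)
salted = sha256_hex(salt + "hello".encode("utf-8"))
found_salted = dictionary_attack(salted, word_list)
print(found_salted)  # None
```

The salt is normally stored next to the hash, so an attacker who has both can still try guesses one by one. What the salt defeats is precomputed tables, since the same table can no longer be reused across records.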

Example to Clarify
Let’s say we hash the word "hello":

In [None]:
import hashlib

data = "hello"
hashed_data = hashlib.sha256(data.encode()).hexdigest()

print(f"Original data: {data}")
print(f"Hashed data: {hashed_data}")


Original data: hello
Hashed data: 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824


Now, even if we know the hash

```
2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

there’s no way to reverse it to get the original input "hello" directly. The only way to figure out what input generated that hash is to try candidate inputs until one matches (which is time-consuming for anything but short or common inputs) or to look it up in a rainbow table (which works only if the input is common enough to be in the table).

Why Is This Useful?

This property of one-way functions is what makes hashing valuable in security contexts such as:

* Password Storage: Websites store hashed passwords instead of plain text. If someone gains access to the database, they can't easily see the users' passwords.
* Blockchain: In blockchain, hashing ensures that each block’s data cannot be tampered with. Any change in the data would completely change the hash, alerting the system to tampering.
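The blockchain idea can be sketched in a few lines: each block's hash covers both its own data and the previous block's hash, so editing any block breaks the chain from that point on. This is a simplified illustration, not how any particular blockchain is implemented. The function names, the `|` separator, and the all-zero starting hash are assumptions made for the example:

```python
import hashlib

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(data, previous_hash):
    return hashlib.sha256((previous_hash + "|" + data).encode("utf-8")).hexdigest()

def build_chain(records):
    chain = []
    previous_hash = GENESIS_HASH
    for data in records:
        current_hash = block_hash(data, previous_hash)
        chain.append({"data": data, "previous_hash": previous_hash, "hash": current_hash})
        previous_hash = current_hash
    return chain

def is_chain_valid(chain):
    previous_hash = GENESIS_HASH
    for block in chain:
        # Each block must point at the block before it...
        if block["previous_hash"] != previous_hash:
            return False
        # ...and its stored hash must match its current contents.
        if block["hash"] != block_hash(block["data"], block["previous_hash"]):
            return False
        previous_hash = block["hash"]
    return True

chain = build_chain(["visit 1", "visit 2", "visit 3"])
valid_before = is_chain_valid(chain)
print(valid_before)  # True

chain[1]["data"] = "visit 2 (edited)"  # tamper with the middle block
valid_after = is_chain_valid(chain)
print(valid_after)   # False
```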


Summary:
* No, hashes cannot be decoded. They are one-way functions.

* The only way to "break" a hash is by guessing the original input, which is very difficult when the input is long or unpredictable.

* Hashing is used for data integrity and security, because the original data cannot be recovered from the hash.

-----------------

## Secured and Optimized Performance Ledger

3.1 Securing the Ledger: Hashing for Data Integrity

One of the most important aspects of securing data is hashing. A hash is like a "digital fingerprint" for any data.

Once we hash the data, it is practically impossible to reverse the hash to get the original data, and any change to the data produces a different hash. This lets us detect when one of our ledger’s records has been tampered with.

We will use the SHA-256 hash function (a common cryptographic hash function) to secure our records. By hashing the entire record (patient’s name, treatment, cost, and date), we can make sure that no one can change the record without being detected.

-------------------

Step 1: Add Hashing to the Ledger
First, we will import the hashlib library to generate the hash. Then, we will modify the function that adds visits to hash the data before storing it in the ledger.

In [None]:
import hashlib

# Initialize an empty hospital ledger (a dictionary where keys are patient names)
hospital_ledger_advanced = {}

# Function to generate a hash for a visit record
def generate_hash(patient_name, treatment, cost, date_of_visit):
    # Combine all visit details into one string
    visit_data = patient_name + treatment + str(cost) + date_of_visit
    # Generate the SHA-256 hash of the visit data
    visit_hash = hashlib.sha256(visit_data.encode()).hexdigest()
    return visit_hash

# Function to add or update patient visits
def add_patient_visit_advanced():
    # Get patient details from the user
    patient_name = input("Enter the patient's name: ")
    treatment = input("Enter the treatment received: ")
    cost = float(input("Enter the cost of the treatment: $"))
    date_of_visit = input("Enter the date of visit (YYYY-MM-DD): ")

    # Generate a hash for this visit record
    visit_hash = generate_hash(patient_name, treatment, cost, date_of_visit)

    # Create a dictionary for the visit with the hash
    visit = {
        "patient_name": patient_name,
        "treatment": treatment,
        "cost": cost,
        "date_of_visit": date_of_visit,
        "visit_hash": visit_hash  # Store the hash to verify data integrity
    }

    # Add the visit to the patient's list of visits
    if patient_name not in hospital_ledger_advanced:
        hospital_ledger_advanced[patient_name] = []

    hospital_ledger_advanced[patient_name].append(visit)
    print(f"Visit added for {patient_name} on {date_of_visit} for treatment {treatment} costing ${cost}.")
    print(f"Visit hash: {visit_hash}")

# Ask the user to add visits
add_patient_visit_advanced()
add_patient_visit_advanced()

# Display the advanced hospital ledger
print("\nAdvanced Hospital Ledger:")
for patient, visits in hospital_ledger_advanced.items():
    print(f"\nPatient: {patient}")
    for visit in visits:
        print(f"  Treatment: {visit['treatment']}, Cost: ${visit['cost']}, Date: {visit['date_of_visit']}, Hash: {visit['visit_hash']}")


Explanation:

The generate_hash function takes the patient’s name, treatment, cost, and date of visit, combines them into one string, and generates a SHA-256 hash.
This hash is stored alongside the other visit details in the ledger.
By storing the hash, we can later verify the integrity of each visit: if someone changes the visit data, the hash will no longer match, indicating that the data has been tampered with.
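The verification step described above can be sketched as follows. `verify_visit` is a hypothetical helper that is not part of the lab code. It recomputes the hash from the stored fields with the same `generate_hash` logic and compares the result with the stored `visit_hash`:

```python
import hashlib

def generate_hash(patient_name, treatment, cost, date_of_visit):
    # Same logic as the ledger's generate_hash function.
    visit_data = patient_name + treatment + str(cost) + date_of_visit
    return hashlib.sha256(visit_data.encode()).hexdigest()

def verify_visit(visit):
    # Hypothetical helper: True if the stored hash still matches the stored fields.
    expected = generate_hash(
        visit["patient_name"], visit["treatment"], visit["cost"], visit["date_of_visit"]
    )
    return expected == visit["visit_hash"]

visit = {
    "patient_name": "John Doe",
    "treatment": "X-ray",
    "cost": 200.0,
    "date_of_visit": "2025-02-09",
}
visit["visit_hash"] = generate_hash(
    visit["patient_name"], visit["treatment"], visit["cost"], visit["date_of_visit"]
)

ok_before = verify_visit(visit)
print(ok_before)  # True

visit["cost"] = 20.0  # someone edits the record after the fact
ok_after = verify_visit(visit)
print(ok_after)   # False
```

This check only helps if the stored hash itself is out of the attacker's reach. Someone who can rewrite both the record and its hash can simply recompute a matching hash.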


Exercise:

Add some visits and check the hash value.
Try changing one of the visit details and see how the hash changes, proving that the data is different.

------------------

3.2 Optimizing Performance: Fast Ledger Access with a Dictionary
In the previous sections, we stored visits using lists and dictionaries. How quickly we can find a patient matters as the ledger grows: scanning through records one by one gets slower with every record added. Python dictionaries are built on the hash table concept, so looking up a key takes roughly the same time on average no matter how many patients are stored.

----------------------

Step 1: Performance Optimization with Dictionary Access
The key to fast performance lies in the way we store and access data. Dictionaries allow fast lookups. We will continue to use the dictionary to store patient records, but we will focus on optimizing how we add, remove, and look up visits.


Let’s say we want to search for a patient's visits very quickly. With the patient’s name as the key, a dictionary takes us straight to that patient's record. This avoids searching through all records, which can be slow.

In [None]:
# Initialize an empty hospital ledger (a dictionary where keys are patient names)
hospital_ledger_advanced = {}

# Function to add or update patient visits (optimized)
def add_patient_visit_advanced():
    # Get patient details from the user
    patient_name = input("Enter the patient's name: ")
    treatment = input("Enter the treatment received: ")
    cost = float(input("Enter the cost of the treatment: $"))
    date_of_visit = input("Enter the date of visit (YYYY-MM-DD): ")

    # Check if the patient already exists
    if patient_name in hospital_ledger_advanced:
        print(f"Updating visit record for {patient_name}.")
    else:
        print(f"Adding new visit record for {patient_name}.")

    # Generate a hash for this visit record (generate_hash is defined in the cell from section 3.1)
    visit_hash = generate_hash(patient_name, treatment, cost, date_of_visit)

    # Create a dictionary for the visit with the hash
    visit = {
        "treatment": treatment,
        "cost": cost,
        "date_of_visit": date_of_visit,
        "visit_hash": visit_hash  # Store the hash to verify data integrity
    }

    # Add the visit to the patient's list of visits (using a dictionary)
    if patient_name not in hospital_ledger_advanced:
        hospital_ledger_advanced[patient_name] = []

    hospital_ledger_advanced[patient_name].append(visit)
    print(f"Visit added for {patient_name} on {date_of_visit} for treatment {treatment} costing ${cost}.")
    print(f"Visit hash: {visit_hash}")

# Ask the user to add visits
add_patient_visit_advanced()
add_patient_visit_advanced()

# Searching for a patient’s visit quickly
search_patient = input("\nEnter patient name to search for: ")
if search_patient in hospital_ledger_advanced:
    print(f"\nVisit records for {search_patient}:")
    for visit in hospital_ledger_advanced[search_patient]:
        print(f"  Treatment: {visit['treatment']}, Cost: ${visit['cost']}, Date: {visit['date_of_visit']}, Hash: {visit['visit_hash']}")
else:
    print(f"Patient {search_patient} not found in the ledger.")


Enter the patient's name: kumar
Enter the treatment received: neuralink
Enter the cost of the treatment: $20
Enter the date of visit (YYYY-MM-DD): 5050-02-02
Adding new visit record for kumar.
Visit added for kumar on 5050-02-02 for treatment neuralink costing $20.0.
Visit hash: 1e83e6eca0fbc4b35199ccfb1f2fad3fad3b5078465f682218acdc5283eea95d
Enter the patient's name: kirti
Enter the treatment received: murti pooja
Enter the cost of the treatment: $0005
Enter the date of visit (YYYY-MM-DD): 1203-20-08
Adding new visit record for kirti.
Visit added for kirti on 1203-20-08 for treatment murti pooja costing $5.0.
Visit hash: 944e35c7b697ef56baf8bf95b02a955f1d24060bbc70826d14dd1186d2a0068f

Enter patient name to search for: kumar

Visit records for kumar:
  Treatment: neuralink, Cost: $20.0, Date: 5050-02-02, Hash: 1e83e6eca0fbc4b35199ccfb1f2fad3fad3b5078465f682218acdc5283eea95d


Explanation:

We still use a dictionary to store the patient’s name as the key and their list of visits as the value.
Searching for a patient is fast because we use the dictionary key (the patient’s name) to access the visits directly. On average, the lookup time does not depend on how many patients are stored.

This optimizes performance, especially as the ledger grows large.
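To see the difference, the sketch below times a lookup by dictionary key against a scan through a list of records. The data is synthetic (100,000 made-up patient names), the helper names are invented for this example, and the measured times will vary from machine to machine:

```python
import timeit

names = [f"patient_{i}" for i in range(100_000)]
visits_list = [(name, "checkup") for name in names]   # must be scanned entry by entry
visits_dict = {name: "checkup" for name in names}     # looked up directly by key
target = names[-1]                                    # worst case for the list scan

def find_in_list(name):
    for patient, treatment in visits_list:
        if patient == name:
            return treatment
    return None

def find_in_dict(name):
    return visits_dict.get(name)

list_time = timeit.timeit(lambda: find_in_list(target), number=20)
dict_time = timeit.timeit(lambda: find_in_dict(target), number=20)

print(f"list scan:   {list_time:.6f} s for 20 lookups")
print(f"dict lookup: {dict_time:.6f} s for 20 lookups")
```

Both functions return the same result. Only the time taken to find it differs, and the gap widens as the number of patients grows.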


Exercise:

* Add multiple visits for different patients.

* Search for a patient’s visits by entering their name and see how fast the search is.

* Try adding a new patient, then update an existing one, and observe the performance.

-----------

3.3 Summary

In this section, we:

Secured the ledger by adding a hash for each visit. This ensures that once a visit is recorded, any later change to it can be detected.

Optimized the ledger’s performance by using a dictionary to store and quickly access patient visits, improving the speed of search and updates.

By introducing hashing, we learned how to detect unauthorized changes to our data. Using dictionaries allowed us to efficiently store and search for patient records, making our ledger both secure and fast.