Good reference: 
https://youtu.be/qHMLy5JjbjQ?feature=shared

### What Are Merkle Trees?  
A **Merkle Tree** is a data structure used in cryptography and computer science to efficiently verify the integrity and consistency of large sets of data. It is a binary tree where:  
1. **Leaf nodes** contain cryptographic hashes of data blocks.  
2. **Non-leaf nodes** contain hashes of their child nodes.  
3. The **root hash (Merkle Root)** represents the entire dataset's integrity.  

Merkle Trees are widely used in blockchain, distributed systems, and version control to verify data without transferring entire datasets.
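
To make "verify data without transferring entire datasets" concrete, here is a minimal sketch of a Merkle inclusion proof: a verifier holding only the root hash can check one data block using a short list of sibling hashes. The helper names (`build_levels`, `make_proof`, `verify`) are hypothetical, and promoting an unpaired node unchanged is an assumption; other designs duplicate the last node instead.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()

def build_levels(blocks):
    """Return every tree level, leaves first. An unpaired node is promoted unchanged."""
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(level[i] + level[i + 1]))
            else:
                nxt.append(level[i])
        level = nxt
        levels.append(level)
    return levels

def make_proof(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # the neighbour in the same pair
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(block, proof, root):
    """Recompute the root from one block plus its sibling hashes."""
    acc = h(block)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root

blocks = [b"tx1", b"tx2", b"tx3", b"tx4", b"tx5"]
levels = build_levels(blocks)
root = levels[-1][0]
proof = make_proof(levels, 2)
print(verify(b"tx3", proof, root))       # True
print(verify(b"tampered", proof, root))  # False
```

The proof holds at most one hash per tree level, so its size grows with the logarithm of the number of blocks rather than with the dataset itself.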

---

### Example: Resolving Conflicts in Two Merkle Trees  
Let's assume we have two Merkle Trees representing two versions of a dataset (e.g., two nodes in a blockchain network or two repositories in a distributed system). The goal is to find differences efficiently and resolve conflicts.

#### **Scenario**  
- **Merkle Tree A** belongs to Node A  
- **Merkle Tree B** belongs to Node B  
- Each leaf represents a file (or transaction)  
- If the root hashes match, both datasets are identical (barring a hash collision, which is negligible for SHA-256). If not, we traverse the tree to find discrepancies.

#### **Step-by-Step Conflict Resolution**
1. **Compare the Merkle Root of A and B**
   - If they match, no conflict exists.
   - If they differ, at least one subtree has different data.

2. **Traverse the Tree to Find the Mismatched Subtree**
   - Compare each level's hashes.
   - When a mismatch is found, drill down to the child nodes.

3. **Resolve Conflicts**
   - If a leaf node differs, determine the latest version (e.g., using timestamps or version numbers).
   - If a node exists in A but not in B (or vice versa), decide whether to merge or delete the extra data.
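
The comparison and traversal steps above can be sketched as a recursive diff that skips any subtree whose hashes already match. This is a sketch under stated assumptions: both trees are built from the same, non-zero number of leaves (so they have the same shape), and the `Node`, `build`, and `diff` names are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional

def sha256(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()

@dataclass
class Node:
    digest: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    index: Optional[int] = None  # leaf position; None for inner nodes

def build(blocks: List[str]) -> Node:
    """Build a tree bottom-up; an unpaired node is promoted unchanged."""
    nodes = [Node(sha256(b), index=i) for i, b in enumerate(blocks)]
    while len(nodes) > 1:
        paired = []
        for i in range(0, len(nodes), 2):
            if i + 1 < len(nodes):
                l, r = nodes[i], nodes[i + 1]
                paired.append(Node(sha256(l.digest + r.digest), l, r))
            else:
                paired.append(nodes[i])
        nodes = paired
    return nodes[0]

def diff(a: Node, b: Node) -> List[int]:
    """Return the leaf indices whose hashes differ (assumes same-shaped trees)."""
    if a.digest == b.digest:
        return []  # identical subtree: nothing below needs checking
    if a.left is None or b.left is None:
        return [a.index]  # reached a mismatched leaf
    return diff(a.left, b.left) + diff(a.right, b.right)

tree_a = build(["D1", "D2", "D3", "D4"])
tree_b = build(["D1", "D2", "D3'", "D5"])
print(diff(tree_a, tree_b))  # [2, 3]
```

Because matching subtrees are pruned at the first equal hash, the work is proportional to the number of differing leaves times the tree height, not to the full dataset.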

#### **Example Tree Structure**
Consider two trees representing versions of a dataset:

```
  Tree A                   Tree B
     H1                        H2
    /   \                     /   \
  H3     H4                 H3     H5
 /  \    /  \              /  \    /  \
D1  D2  D3  D4          D1  D2  D3'  D5
```

Here:
- `H1 ≠ H2`, so we check their children.
- `H3` is identical in both trees, so the left subtree (`D1`, `D2`) is skipped.
- `H4 ≠ H5` → Drill down further.
- `D3 ≠ D3'` → Conflict detected.
- `D4` is absent from Tree B, which holds `D5` in that position → Determine if it should be added.

#### **Resolution**
- If `D3'` is a newer version of `D3`, update A’s `D3` to `D3'`.
- If `D4` should exist in B, add it.

After both nodes apply the same resolution and recompute the hashes along the affected paths, both trees will have the same Merkle Root, ensuring consistency.
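
As a rough illustration of this resolution step, the sketch below reconciles two replicas slot by slot and checks that their roots converge. The `(version, payload)` record layout, the version numbers, and the "higher version wins" rule are all assumptions made for the example, not something the scenario above prescribes.

```python
import hashlib
from typing import List, Tuple

Record = Tuple[int, str]  # (version, payload); hypothetical layout

def merkle_root(records: List[Record]) -> str:
    """Hash each record, then pair hashes upward until one root remains."""
    level = [hashlib.sha256(f"{v}:{p}".encode()).hexdigest() for v, p in records]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = (level[i] + level[i + 1]).encode()
                nxt.append(hashlib.sha256(pair).hexdigest())
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
    return level[0]

def reconcile(mine: List[Record], theirs: List[Record]) -> List[Record]:
    """Per slot, keep the record with the higher version (ties keep `mine`)."""
    return [m if m[0] >= t[0] else t for m, t in zip(mine, theirs)]

replica_a = [(1, "D1"), (1, "D2"), (1, "D3"), (2, "D4")]
replica_b = [(1, "D1"), (1, "D2"), (2, "D3'"), (1, "D5")]

print(merkle_root(replica_a) == merkle_root(replica_b))  # False

# Each side reconciles against the other, then recomputes its root.
new_a = reconcile(replica_a, replica_b)
new_b = reconcile(replica_b, replica_a)
print(merkle_root(new_a) == merkle_root(new_b))  # True
```

Note that a tie on version with different payloads would leave the replicas diverged under this rule, which is one reason real systems add a deterministic tie-breaker.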

The **Python implementation** at the end of this section pairs a **Merkle Tree** with a **dummy distributed user profile database** and demonstrates conflict resolution using four strategies:  

1. **Last-Write-Wins (LWW)**
2. **Operational Transformation (OT)**
3. **Vector Clocks**
4. **Custom Merge Rules**

---

### **🔹 Overview**
- **`MerkleTree`**: Implements the Merkle Tree structure.
- **`UserProfileDB`**: Simulates a distributed database with replicas.
- **Conflict Resolution Mechanisms**: Demonstrates how different strategies work.

---

### **🔹 Code Implementation**
The full listing, covering the **Merkle Tree implementation** and the **conflict resolution mechanisms** for the distributed database, appears after the example output.

### **🔹 Explanation**
1. **Merkle Tree Construction**
   - Computes SHA-256 hashes for user profile data.
   - Constructs a binary tree to efficiently detect changes.

2. **Simulated Distributed Database**
   - **Stores user profiles** with timestamps.
   - Updates only if the incoming data is newer.

3. **Conflict Resolution Strategies**
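   - **LWW**: Replaces the stored profile only if the remote timestamp is strictly newer.
   - **OT**: If both sides agree on the email, takes the remote phone; otherwise falls back to LWW. This is a simplified field-level merge, not full operational transformation.
   - **Vector Clocks**: If the timestamps are equal, keeps both emails side by side; otherwise falls back to LWW. A single integer timestamp stands in for a real vector clock here.
   - **Custom Rule**: Keeps the local email and takes the remote phone.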
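
Since the listing approximates vector clocks with a single timestamp, it may help to see what an actual vector clock comparison looks like. The sketch below is illustrative only: the `compare` and `merge` names and the replica ids `"A"` and `"B"` are hypothetical, and each clock is assumed to be a dict mapping a replica id to its update counter.

```python
from typing import Dict

VectorClock = Dict[str, int]  # replica id -> number of updates seen from it

def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify clock a relative to clock b."""
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "concurrent"  # neither update saw the other: a true conflict
    if a_ahead:
        return "after"       # a already includes everything b has seen
    if b_ahead:
        return "before"
    return "equal"

def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum, used after a conflict has been resolved."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}

print(compare({"A": 2, "B": 1}, {"A": 1, "B": 1}))  # after
print(compare({"A": 2, "B": 0}, {"A": 1, "B": 1}))  # concurrent
print(merge({"A": 2, "B": 0}, {"A": 1, "B": 1}) == {"A": 2, "B": 1})  # True
```

Only the `"concurrent"` case needs a merge rule such as keeping both values; in the `"before"` and `"after"` cases one version simply supersedes the other.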

---

### **🔹 Example Output**
```
LWW: alice.new@example.com 54321
OT: alice.new@example.com 54321
Vector Clock: alice.new@example.com 54321
Custom: alice.new@example.com 11111
```


```python
import hashlib
from typing import Dict, List


def sha256(data: str) -> str:
    """Return the hex-encoded SHA-256 digest of a string."""
    return hashlib.sha256(data.encode()).hexdigest()


class MerkleNode:
    """A tree node holding a hash and optional children (leaves have none)."""

    def __init__(self, hash_value: str, left=None, right=None):
        self.hash = hash_value
        self.left = left
        self.right = right


class MerkleTree:
    def __init__(self, data_blocks: List[str]):
        self.data_blocks = data_blocks
        self.root = self.build_tree(data_blocks)

    def build_tree(self, data: List[str]) -> MerkleNode:
        if not data:
            raise ValueError("MerkleTree needs at least one data block")
        nodes = [MerkleNode(sha256(d)) for d in data]

        # Pair nodes level by level until only the root remains.
        while len(nodes) > 1:
            new_level = []
            for i in range(0, len(nodes), 2):
                if i + 1 < len(nodes):
                    combined_hash = sha256(nodes[i].hash + nodes[i + 1].hash)
                    parent = MerkleNode(combined_hash, nodes[i], nodes[i + 1])
                    new_level.append(parent)
                else:
                    # An unpaired node is promoted to the next level unchanged.
                    new_level.append(nodes[i])
            nodes = new_level
        return nodes[0]


class UserProfile:
    def __init__(self, name: str, email: str, phone: str, timestamp: int):
        self.name = name
        self.email = email
        self.phone = phone
        self.timestamp = timestamp


class UserProfileDB:
    """Simulated replica of a distributed database with conflict resolution."""

    def __init__(self):
        self.db: Dict[str, UserProfile] = {}

    def update(self, user_id: str, name: str, email: str, phone: str, timestamp: int):
        # Accept the write only if it is new or strictly newer than the stored one.
        if user_id not in self.db or self.db[user_id].timestamp < timestamp:
            self.db[user_id] = UserProfile(name, email, phone, timestamp)

    def resolve_conflict_lww(self, user_id: str, local: UserProfile, remote: UserProfile):
        # Last-Write-Wins: the remote copy replaces ours only if it is newer.
        if remote.timestamp > local.timestamp:
            self.db[user_id] = remote

    def resolve_conflict_ot(self, user_id: str, local: UserProfile, remote: UserProfile):
        # Simplified stand-in for OT: merge the phone when the emails agree.
        if local.email == remote.email:
            self.db[user_id].phone = remote.phone
        else:
            self.resolve_conflict_lww(user_id, local, remote)

    def resolve_conflict_vector_clock(self, user_id: str, local: UserProfile, remote: UserProfile):
        # Equal timestamps are treated as concurrent writes; keep both emails.
        if local.timestamp == remote.timestamp:
            self.db[user_id].email = f"{local.email} / {remote.email}"
        else:
            self.resolve_conflict_lww(user_id, local, remote)

    def resolve_conflict_custom(self, user_id: str, local: UserProfile, remote: UserProfile):
        self.db[user_id].email = local.email   # Prioritize local email
        self.db[user_id].phone = remote.phone  # Prioritize remote phone


# Example Usage
if __name__ == "__main__":
    user_data = ["Alice:alice@example.com:12345", "Bob:bob@example.com:67890"]
    tree1 = MerkleTree(user_data)
    tree2 = MerkleTree(user_data)
    # Identical data always produces identical root hashes.
    assert tree1.root.hash == tree2.root.hash

    db = UserProfileDB()
    db.update("Alice", "Alice", "alice@example.com", "12345", 1000)
    db.update("Alice", "Alice", "alice.new@example.com", "54321", 2000)

    # Note: `local` is the same object that the database stores.
    local = db.db["Alice"]
    remote = UserProfile("Alice", "alice.remote@example.com", "11111", 1500)

    db.resolve_conflict_lww("Alice", local, remote)
    print("LWW:", db.db["Alice"].email, db.db["Alice"].phone)

    db.resolve_conflict_ot("Alice", local, remote)
    print("OT:", db.db["Alice"].email, db.db["Alice"].phone)

    db.resolve_conflict_vector_clock("Alice", local, remote)
    print("Vector Clock:", db.db["Alice"].email, db.db["Alice"].phone)

    db.resolve_conflict_custom("Alice", local, remote)
    print("Custom:", db.db["Alice"].email, db.db["Alice"].phone)
```


```
LWW: alice.new@example.com 54321
OT: alice.new@example.com 54321
Vector Clock: alice.new@example.com 54321
Custom: alice.new@example.com 11111
```
