In [5]:
class Chain_Joins:
    def hash_join(self, R1, R2):
        """Join R1 and R2 where the last attribute of R1 equals the first attribute of R2."""
        # Build phase: index R2 by its first attribute.
        hash_table = {}
        for tup in R2:
            hash_table.setdefault(tup[0], []).append(tup)

        # Probe phase: look up the last attribute of each R1 tuple.
        result = []
        for tup in R1:
            for r2_tup in hash_table.get(tup[-1], []):
                # Append only the non-join attribute (assumes R2 is binary).
                result.append(tup + (r2_tup[-1],))
        return result

    def chain_join(self, relations):
        """Join the relations left to right: ((R1 join R2) join R3) ..."""
        current_result = relations[0]
        for R in relations[1:]:
            current_result = self.hash_join(current_result, R)
        return current_result

In [9]:
R1 = [
    (1, 2),
    (3, 4)
]

R2 = [
    (2, 5),
    (4, 6)
]

R3 = [
    (5, 7),
    (6, 8)
]

R4 = [
    (7, 9),
    (8, 10)
]

R5 = [
    (9, 11),
    (10, 12)
]

R = [R1, R2, R3, R4, R5]
chain_joiner = Chain_Joins()
final_result = chain_joiner.chain_join(R)
print("Final Result:")
for line in final_result:
    print(line)

Final Result:
(1, 2, 5, 7, 9, 11)
(3, 4, 6, 8, 10, 12)


In [2]:
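def chain_join(relations):
    # Assumes a module-level hash_join(R1, R2), the Problem 1 implementation,
    # has already been defined in an earlier cell.
    current_result = relations[0]
    for R in relations[1:]:
        current_result = hash_join(current_result, R)
    return current_result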

In [3]:
R1 = [
    (1, 2),
    (3, 4)
]

R2 = [
    (2, 5),
    (4, 6)
]

R3 = [
    (5, 7),
    (6, 8)
]

R4 = [
    (7, 9),
    (8, 10)
]

R5 = [
    (9, 11),
    (10, 12)
]

relations = [R1,R2]

print(chain_join(relations))

[(1, 2, 5), (3, 4, 6)]


**Problem Statement:** Implement a simpler algorithm to evaluate a line join query as follows. Assume $q(A_1, \ldots, A_{k+1}) \text{ :- } R_1(A_1, A_2), R_2(A_2, A_3), \ldots, R_k(A_k, A_{k+1})$. We first compute $R_{1-2} = R_1 \bowtie R_2$ using the implementation from Problem 1. Then we compute $R_{1-3} = R_{1-2} \bowtie R_3$, and so on. In the end, we compute $R_{1-k} = R_{1-(k-1)} \bowtie R_k$.

**Algorithm Implementation:**

The algorithm is structured into two main parts:

1. **The Join Function:**
Purpose: This function performs an inner join between two relations. The join condition is that the last attribute of a tuple from the first relation equals the first attribute of a tuple from the second relation.
Implementation: It first builds a hash table over the second relation, keyed by each tuple's first attribute. It then iterates over the first relation and looks up each tuple's last attribute in the hash table. For every matching tuple, it appends the second tuple's last attribute to the first tuple, so the shared attribute appears only once in the result.

2. **Chain Join Function:**
Purpose: This function sequentially applies the join operation across a series of relations, starting from the first relation and incrementally incorporating each subsequent relation through the join.
Implementation: It initializes the result with the first relation and iteratively joins this result with each subsequent relation in the list. The join operation is performed using the previously defined join function.
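
The two parts described above can also be sketched as standalone functions. This is an illustrative variant rather than the notebook's class: the names `hash_join_sketch` and `chain_join_sketch` are hypothetical, and it uses `collections.defaultdict` for the build phase and `functools.reduce` for the left-to-right fold.

```python
from collections import defaultdict
from functools import reduce


def hash_join_sketch(left, right):
    """Join on: last attribute of `left` == first attribute of `right`."""
    # Build phase: index the right relation by its first attribute.
    index = defaultdict(list)
    for tup in right:
        index[tup[0]].append(tup)
    # Probe phase: look up each left tuple's last attribute and append the
    # remaining attributes of every match (the shared attribute is dropped).
    return [l + r[1:] for l in left for r in index.get(l[-1], [])]


def chain_join_sketch(relations):
    """Fold hash_join_sketch left to right over a non-empty list of relations."""
    return reduce(hash_join_sketch, relations)


R1 = [(1, 2), (3, 4)]
R2 = [(2, 5), (4, 6)]
R3 = [(5, 7), (6, 8)]
print(chain_join_sketch([R1, R2, R3]))  # [(1, 2, 5, 7), (3, 4, 6, 8)]
```

Appending `r[1:]` instead of only the last attribute gives the same output for binary relations and would also carry over any extra columns of a wider right-hand relation.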


**Time Complexity Analysis:**

1. `hash_join` function:
   - Building the hash table for R2 takes O(n) expected time, where n is the number of tuples in R2.
   - Probing takes O(m + r) expected time, where m is the number of tuples in R1 and r is the number of result tuples: each R1 tuple costs one expected constant-time lookup, and each match costs one append. (This treats building an output tuple as constant time; strictly, it is proportional to the tuple's width.)
   - Therefore, the overall time complexity of the `hash_join` function is O(m + n + r). When every R1 tuple has at most one match, r ≤ m and this simplifies to O(m + n).

2. `chain_join` function:
   - The `chain_join` function iterates over the list of relations, and for each relation, it calls the `hash_join` function with the current result and the new relation.
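   - Assuming there are k relations with N tuples each, the i-th call costs O(|T(i-1)| + N + |T(i)|), where T(i) is the intermediate result after i joins and T(0) = R1. The total is therefore O(k * N) plus the sum of the intermediate result sizes. If each join value matches at most one tuple, as in the example above, every intermediate result has at most N tuples and the total is O(k * N). With many-to-many matches, a single join can already produce N^2 tuples and later intermediate results can grow further, so the bound O(k * N^2) holds only when every intermediate result stays within N^2 tuples.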
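
To make the dependence on intermediate result sizes concrete, the following sketch compares a one-to-one chain with a skewed pair of relations in which every tuple shares the same join value. The helper `intermediate_sizes` and the synthetic inputs are assumptions introduced for illustration. `hash_join` is repeated here as a plain function so the snippet runs on its own.

```python
def hash_join(r1, r2):
    # Same build/probe scheme as the notebook's hash_join.
    table = {}
    for tup in r2:
        table.setdefault(tup[0], []).append(tup)
    return [t + (m[-1],) for t in r1 for m in table.get(t[-1], [])]


def intermediate_sizes(relations):
    """Return the size of the running result after each join step."""
    sizes, current = [], relations[0]
    for rel in relations[1:]:
        current = hash_join(current, rel)
        sizes.append(len(current))
    return sizes


N = 50
# One-to-one chain: every join value matches exactly one tuple.
one_to_one = [[(i, i) for i in range(N)] for _ in range(3)]
# Skewed pair: all R1 tuples end in 0 and all R2 tuples start with 0.
skewed = [[(i, 0) for i in range(N)], [(0, j) for j in range(N)]]

print(intermediate_sizes(one_to_one))  # [50, 50]
print(intermediate_sizes(skewed))      # [2500]
```

In the skewed case a single join already yields N^2 = 2500 tuples, so the output size dominates the running time, not the O(m + n) build and probe work.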


**Correctness Explanation**
1. Compliance with Join Conditions: Each join step combines two tuples only when the last attribute of the left tuple equals the first attribute of the right tuple, which is exactly the attribute shared by consecutive relations in the chain. No invalid tuple combination can therefore appear in the result.
2. Sequential Processing: The relations are joined left to right in the order given, so after i joins the running result is the join of the first i + 1 relations. Because natural join is associative, this order yields the same final result as any other join order; the order affects only the cost, not the answer.
3. No Data Loss or Duplication: The shared attribute is emitted once, since only the non-join attribute of the right tuple is appended (the code appends `r2_tup[-1]`, which assumes binary relations). Every pair that satisfies the join condition is emitted, because the hash table stores all R2 tuples for each key, so no valid result is lost. Duplicate tuples in the input do produce duplicate tuples in the output.
4. Generality and Flexibility: The implementation works for any number of binary relations, provided the first attribute of each relation is the one it shares with the last attribute of the preceding relation. This mirrors the common case of joining tables along a chain of shared keys.
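
One way to gain confidence in these claims is to compare the hash-based join against a brute-force nested-loop join on random inputs. The sketch below is such a check under stated assumptions: `nested_loop_join`, `chain`, and the random test data are hypothetical helpers introduced for illustration, and `hash_join` is repeated as a plain function so the snippet is self-contained.

```python
import random


def hash_join(r1, r2):
    # Same build/probe scheme as the notebook's hash_join.
    table = {}
    for tup in r2:
        table.setdefault(tup[0], []).append(tup)
    return [t + (m[-1],) for t in r1 for m in table.get(t[-1], [])]


def nested_loop_join(r1, r2):
    # Reference implementation: test the join condition on every pair.
    return [t + (m[-1],) for t in r1 for m in r2 if t[-1] == m[0]]


def chain(join, relations):
    """Apply `join` left to right over a non-empty list of relations."""
    current = relations[0]
    for rel in relations[1:]:
        current = join(current, rel)
    return current


random.seed(0)  # fixed seed so the check is repeatable
relations = [
    [(random.randint(0, 4), random.randint(0, 4)) for _ in range(8)]
    for _ in range(4)
]
# Compare as sorted lists so that duplicates count but output order does not.
assert sorted(chain(hash_join, relations)) == sorted(chain(nested_loop_join, relations))
print("hash join and nested-loop join agree")
```

Agreement on random inputs is evidence, not proof, but it exercises many-to-many matches and duplicate tuples that the small one-to-one example does not.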