This problem is the same as `merge` in Pandas or an inner join in SQL. One solution: loop through list1, and for each element of list1 loop through every element of list2. That is a double for loop, so for two lists of length N the complexity of such code is **O(N^2)**.
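
The double loop described above can be sketched as follows. This is only an illustration: the name `nested_loop_join` and the sample rows are made up, and it assumes each row is a list whose first element is the join key.

```python
def nested_loop_join(list1, list2):
    """Naive inner join: compare every row of list1 with every row of list2."""
    result = []
    for row1 in list1:
        for row2 in list2:
            # Rows match when their first elements (the keys) are equal.
            if row1[0] == row2[0]:
                result.append([row1[0]] + row1[1:] + row2[1:])
    return result


# Made-up sample data: keys 2 and 3 appear in both lists.
left = [[1, "a"], [2, "b"], [3, "c"]]
right = [[2, "x"], [3, "y"], [4, "z"]]
print(nested_loop_join(left, right))  # [[2, 'b', 'x'], [3, 'c', 'y']]
```

Every row of `list1` is compared with every row of `list2`, which is where the quadratic cost comes from.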

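Instead, we could build a dictionary from list1, keyed on the first element of each item. Then we loop through the elements of list2 and look each key up in the dictionary. Because a dictionary lookup takes constant time on average, the overall complexity is **O(N)**.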

In [None]:
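def my_join(list1, list2):
    # Index list1 by its key (the first element of each row).
    dict1 = {item[0]: item[1:] for item in list1}
    result = []

    for item in list2:
        # Keep only rows whose key appears in both lists (inner join).
        if item[0] in dict1:
            result.append([item[0]] + dict1[item[0]] + item[1:])

    return result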
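
One caveat: the dictionary comprehension keeps only the last row for each key, so duplicate keys in `list1` are silently dropped, whereas an SQL inner join would return every matching pair. Below is a minimal sketch of one way to handle this, grouping rows per key with `collections.defaultdict`. The name `my_join_with_duplicates` and the sample rows are hypothetical.

```python
from collections import defaultdict


def my_join_with_duplicates(list1, list2):
    # Map each key to *all* of its rows in list1, not just the last one.
    index = defaultdict(list)
    for item in list1:
        index[item[0]].append(item[1:])

    result = []
    for item in list2:
        # .get avoids creating empty entries for keys missing from list1.
        for rest1 in index.get(item[0], []):
            result.append([item[0]] + rest1 + item[1:])
    return result


# Made-up sample data: key 1 appears twice on the left.
left = [[1, "a"], [1, "b"], [2, "c"]]
right = [[1, "x"], [3, "y"]]
print(my_join_with_duplicates(left, right))  # [[1, 'a', 'x'], [1, 'b', 'x']]
```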

In general, though, the dictionary is built before the join/merge query runs. That step is called **indexing**, and a dictionary-type index is called a **hash table index**. Other kinds of index are also used, such as the B-tree. Each has its advantages and disadvantages, which we'll look at in the coming weeks.
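
To illustrate the idea, the sketch below builds the hash index once and reuses it for several joins. The names `build_index` and `join_with_index` and the sample data are made up for this example, and, like `my_join`, it assumes keys are unique within the indexed list.

```python
def build_index(rows):
    """Build a hash table index: key -> remaining columns of the row."""
    return {row[0]: row[1:] for row in rows}


def join_with_index(index, rows):
    """Inner join `rows` against a prebuilt index, without rebuilding it."""
    return [[row[0]] + index[row[0]] + row[1:] for row in rows if row[0] in index]


# Made-up sample data.
customers = [[1, "Ann"], [2, "Bob"]]
index = build_index(customers)  # the indexing cost is paid once

orders_monday = [[1, "book"], [3, "pen"]]
orders_tuesday = [[2, "lamp"]]
print(join_with_index(index, orders_monday))   # [[1, 'Ann', 'book']]
print(join_with_index(index, orders_tuesday))  # [[2, 'Bob', 'lamp']]
```

Each join then costs only one pass over the second list, since the index already exists.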

In [None]:
def inner_join(list1, list2):
    # Index both lists by key (the first element of each row).
    dict1 = {item[0]: item[1:] for item in list1}
    dict2 = {item[0]: item[1:] for item in list2}

    result = []

    for key, value in dict1.items():
        # Emit a row only for keys present in both indexes.
        if key in dict2:
            result.append([key] + value + dict2[key])

    return result

Below is a small routine for testing the performance of the functions. You can use it to test your own code. Try swapping lists, dictionaries, and list comprehensions in and out and see how the performance changes. Of course, the results will also depend on the speed of your computer.

In [None]:
import random
random.seed(123)
test1 = [[index] + [random.random() for _ in range(100)] for index in range(1000)]
test2 = [[index] + [random.random() for _ in range(100)] for index in range(1000)]

In [None]:
%timeit my_join(test1, test2)
%timeit inner_join(test1, test2)
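
`%timeit` is an IPython magic, so it only works inside a notebook or IPython shell. In a plain Python script, the standard library `timeit` module can play a similar role. The sketch below is self-contained, so it defines its own compact `hash_join` and `nested_loop_join` (hypothetical names, not the functions above) and uses smaller lists so that the quadratic version finishes quickly.

```python
import random
import timeit


def hash_join(list1, list2):
    # Dictionary-based join, in the spirit of my_join.
    dict1 = {item[0]: item[1:] for item in list1}
    return [[item[0]] + dict1[item[0]] + item[1:] for item in list2 if item[0] in dict1]


def nested_loop_join(list1, list2):
    # Double-loop join, for comparison.
    return [a + b[1:] for a in list1 for b in list2 if a[0] == b[0]]


random.seed(123)
small1 = [[i, random.random()] for i in range(200)]
small2 = [[i, random.random()] for i in range(200)]

# Both approaches should produce the same rows before we compare speed.
assert hash_join(small1, small2) == nested_loop_join(small1, small2)

runs = 20
for func in (hash_join, nested_loop_join):
    # timeit.timeit returns the total time in seconds for `number` calls.
    seconds = timeit.timeit(lambda: func(small1, small2), number=runs)
    print(f"{func.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

The absolute timings will differ from machine to machine; the gap between the two functions is the interesting part.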