# Optimize dict comparison

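Suppose we have two lists of dictionaries, and the dictionaries in both lists share some keys. How do we find the matching dictionaries across the two lists?
- solution 1: use a nested loop. For lists of length n and m the complexity is O(n*m), i.e. O(n^2) when the lists have similar size.
- solution 2: convert one of the lists to a dict. Use the values of the shared keys to build a new key; the value can be the item itself or the index of the item in the original list. Building the dict and looking up every item of the other list takes O(n+m) on average.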
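
As a baseline for comparison, solution 1 can be sketched as follows. This is a minimal illustration, not part of the original notebook: the helper name `match_nested`, the `SHARED_KEYS` constant, and the small sample lists are all assumptions made for the example.

```python
# Hypothetical baseline: brute-force matching with a nested loop.
SHARED_KEYS = ("domain", "year", "tableName")

def match_nested(left, right, keys=SHARED_KEYS):
    """Return (left_index, right_index) pairs whose shared keys are all equal."""
    pairs = []
    for i, a in enumerate(left):
        for j, b in enumerate(right):
            # Every item of `left` is compared with every item of `right`.
            if all(a[k] == b[k] for k in keys):
                pairs.append((i, j))
    return pairs

left = [
    {"domain": "d1", "year": 2010, "tableName": "tab1"},
    {"domain": "d1", "year": 2011, "tableName": "tab1"},
]
right = [
    {"domain": "d1", "year": 2011, "tableName": "tab1"},
    {"domain": "d1", "year": 2012, "tableName": "tab1"},
]

print(match_nested(left, right))  # [(1, 0)]
```

The number of comparisons grows with the product of the two list lengths, which is what the dict-based approach avoids.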

In the example below, we have two lists of dictionaries. The shared keys are ['domain','year','tableName'].

In [15]:
list1 = [
    {"id": 1, "domain":"d1","year":2010,"tableName":"tab1","matchDescIndex":-1},
    {"id": 2, "domain":"d1","year":2011,"tableName":"tab1","matchDescIndex":-1},
    {"id": 3, "domain":"d1","year":2012,"tableName":"tab1","matchDescIndex":-1},
    {"id": 4, "domain":"d1","year":2013,"tableName":"tab1","matchDescIndex":-1},
    {"id": 5, "domain":"d1","year":2014,"tableName":"tab1","matchDescIndex":-1},
    {"id": 6, "domain":"d1","year":2015,"tableName":"tab1","matchDescIndex":-1}
]

list2 = [
    {"id": 1, "domain":"d1","year":2010,"tableName":"tab1","matchDsIndex":-1},
    {"id": 2, "domain":"d1","year":2011,"tableName":"tab1","matchDsIndex":-1},
    {"id": 3, "domain":"d1","year":2012,"tableName":"tab1","matchDsIndex":-1},
    {"id": 4, "domain":"d1","year":2013,"tableName":"tab1","matchDsIndex":-1},
    {"id": 5, "domain":"d1","year":2014,"tableName":"tab1","matchDsIndex":-1},
    {"id": 6, "domain":"d1","year":2015,"tableName":"tab1","matchDsIndex":-1},
    {"id": 7, "domain":"d1","year":2016,"tableName":"tab1","matchDsIndex":-1}
]

In [16]:
# build the key for the converted dictionary
# a dict is not hashable, so the shared (name, value) pairs are packed into a frozenset
def buildKey(inDict):
    return frozenset({
        ("domain", inDict["domain"]),
        ("year", inDict["year"]),
        ("tableName", inDict["tableName"]),
    })
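
The reason for building a separate key can be checked directly. The short sketch below is not from the original notebook and uses a made-up `record`; it shows that a plain dict cannot be used as a dict key, while two frozensets holding the same pairs compare equal regardless of the order in which the pairs were written.

```python
record = {"domain": "d1", "year": 2010, "tableName": "tab1"}

# A dict is mutable and therefore unhashable: hash() raises TypeError.
try:
    hash(record)
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# The same pairs in a different order produce an equal frozenset.
key_a = frozenset({("domain", "d1"), ("year", 2010), ("tableName", "tab1")})
key_b = frozenset({("tableName", "tab1"), ("year", 2010), ("domain", "d1")})

print(dict_is_hashable)                            # False
print(key_a == key_b, hash(key_a) == hash(key_b))  # True True
```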

In [17]:
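# convert list1 to a ref dict which uses the combination (domain, year, tableName) as the key
# and the position of the item in list1 as the value
# note: if two items of list1 share the same key, the later index overwrites the earlier one
refDict = {}
for index, item in enumerate(list1):
    key = buildKey(item)
    refDict[key] = index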

print(refDict)

{frozenset({('tableName', 'tab1'), ('domain', 'd1'), ('year', 2010)}): 0, frozenset({('tableName', 'tab1'), ('year', 2011), ('domain', 'd1')}): 1, frozenset({('tableName', 'tab1'), ('domain', 'd1'), ('year', 2012)}): 2, frozenset({('tableName', 'tab1'), ('domain', 'd1'), ('year', 2013)}): 3, frozenset({('tableName', 'tab1'), ('domain', 'd1'), ('year', 2014)}): 4, frozenset({('tableName', 'tab1'), ('domain', 'd1'), ('year', 2015)}): 5}


In [18]:
# loop through the second list and look up each item's key in refDict
# when a match is found, store the index of the matching item of list1;
# otherwise matchDsIndex keeps its initial value of -1
for item in list2:
    key = buildKey(item)
    if key in refDict:
        item["matchDsIndex"] = refDict[key]


for item in list2:
    print(item)

{'id': 1, 'domain': 'd1', 'year': 2010, 'tableName': 'tab1', 'matchDsIndex': 0}
{'id': 2, 'domain': 'd1', 'year': 2011, 'tableName': 'tab1', 'matchDsIndex': 1}
{'id': 3, 'domain': 'd1', 'year': 2012, 'tableName': 'tab1', 'matchDsIndex': 2}
{'id': 4, 'domain': 'd1', 'year': 2013, 'tableName': 'tab1', 'matchDsIndex': 3}
{'id': 5, 'domain': 'd1', 'year': 2014, 'tableName': 'tab1', 'matchDsIndex': 4}
{'id': 6, 'domain': 'd1', 'year': 2015, 'tableName': 'tab1', 'matchDsIndex': 5}
{'id': 7, 'domain': 'd1', 'year': 2016, 'tableName': 'tab1', 'matchDsIndex': -1}
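
The same idea can be wrapped in reusable functions. The sketch below is one possible variant, not the notebook's own code: the names `build_index` and `match_indexes` are hypothetical, it uses a plain tuple of values as the key instead of a frozenset (this works because the key names are fixed and always read in the same order), and it keeps the first index when a key occurs more than once, which differs from the loop above where the last index wins.

```python
def build_index(items, keys):
    """Map the tuple of shared-key values to the position of the item."""
    index = {}
    for position, item in enumerate(items):
        key = tuple(item[k] for k in keys)
        # Assumption: on duplicate keys, keep the first occurrence.
        index.setdefault(key, position)
    return index

def match_indexes(source, target, keys):
    """For each item in target, return the index of its match in source, or -1."""
    index = build_index(source, keys)
    return [index.get(tuple(item[k] for k in keys), -1) for item in target]

keys = ("domain", "year", "tableName")
a = [
    {"domain": "d1", "year": 2010, "tableName": "tab1"},
    {"domain": "d1", "year": 2011, "tableName": "tab1"},
]
b = [
    {"domain": "d1", "year": 2011, "tableName": "tab1"},
    {"domain": "d1", "year": 2012, "tableName": "tab1"},
    {"domain": "d1", "year": 2010, "tableName": "tab1"},
]

print(match_indexes(a, b, keys))  # [1, -1, 0]
print(match_indexes(b, a, keys))  # [2, 0]
```

Calling the function once in each direction gives the matching index for both lists without mutating the input dictionaries.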
