## Importing CSVs



First, let's import two CSV files that I've generated. Both contain a `Food` column; their other columns have different headers.

The following reads `food_amounts.csv` and prints out each row as a list.

In [53]:
import csv

with open ("food_amounts.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter = ";")
    for row in readCSV:
        print (row)

['Food', 'Amount', 'Expiration']
['Apples', '4', '04-Jun']
['Bananas', '2', '06-Jun']
['Noodles', '10', '10-Sep']
['Peanuts', '200', '12-Dec']
['Sausages', '3', '13-Jun']
['Yogurt', '1', '12-Jun']
['Pineapples', '2', '03-Jul']


What I would like to do is turn these lists into a list of dictionaries, e.g.

`foodList = [{"Food": "Apples", "Amount": "4", "Expiration": "04-Jun"}, {"Food": "Bananas", "Amount": "2", "Expiration": "06-Jun"}, ...]`



To achieve this, I start by collecting the rows into one big list of lists. This can easily be done with **`.append()`**.

In [2]:
csvList = []
with open ("food_amounts.csv") as csvfile:
    readCSV = csv.reader(csvfile, delimiter = ";")
    for row in readCSV:
        csvList.append(row)
print (csvList)
        

[['Food', 'Amount', 'Expiration'], ['Apples', '4', '04-Jun'], ['Bananas', '2', '06-Jun'], ['Noodles', '10', '10-Sep'], ['Peanuts', '200', '12-Dec'], ['Sausages', '3', '13-Jun'], ['Yogurt', '1', '12-Jun'], ['Pineapples', '2', '03-Jul']]


In [3]:
type (csvList)

list

Now I want to turn this list of lists into a list of dictionaries. To do that I have to assign certain list items as keys and other list items as values. 

The format to do this is the following:

**`dictionary [key] = "value"`**

To point to a specific item in a list of lists, use the format `list[x][y]`, where `x` is the index of a sublist and `y` is the index of an item within that sublist, e.g. `csvList[0][0]` is the first item in the first list, i.e. `"Food"`.

In [4]:
csvDict = {}
csvDict [csvList[0][0]] = csvList[1][0] 
print (csvDict)

{'Food': 'Apples'}


This looks good. The first entry in my dictionary has the key `"Food"` and the value `"Apples"`. 

Now I want to loop over the whole list of lists to add it to this dictionary as I did with `"Food"` and `"Apples"`. To do this, it's easier to have two separate lists, one that contains only the keys and one that contains only the values. 

In [5]:
keysList = csvList [0] [:]
print (keysList)

['Food', 'Amount', 'Expiration']


In [9]:
valuesList = csvList [1:] #all rows after the header row
print (valuesList)

[['Apples', '4', '04-Jun'], ['Bananas', '2', '06-Jun'], ['Noodles', '10', '10-Sep'], ['Peanuts', '200', '12-Dec'], ['Sausages', '3', '13-Jun'], ['Yogurt', '1', '12-Jun'], ['Pineapples', '2', '03-Jul']]


I will assign values to keys in the same way as I did above, i.e. 

`dictionary [key] = "value"`

But now I want to loop over each item in the `valuesList` list. The first item in my `keysList` should be the key for the first item of each list in `valuesList`, the second item in `keysList` should be the key for the second item of each list in `valuesList`, and so on.

I first create an empty dictionary and then use `for` loops to fill it with the right items. In these `for` loops I will call each sublist `foodInformation`. `foodInformation` consists of three items: Food, Amount and Expiration.

In [16]:
newDict = {}


for foodInformation in valuesList:
    newDict [keysList [0]] = foodInformation [0] #key: Food
    print (newDict)
    
for foodInformation in valuesList:
    newDict [keysList [1]] = foodInformation [1] #key: Amount
    print (newDict)

for foodInformation in valuesList:
    newDict [keysList [2]] = foodInformation [2] #key: Expiration
    print (newDict)
   




{'Food': 'Apples'}
{'Food': 'Bananas'}
{'Food': 'Noodles'}
{'Food': 'Peanuts'}
{'Food': 'Sausages'}
{'Food': 'Yogurt'}
{'Food': 'Pineapples'}
{'Food': 'Pineapples', 'Amount': '4'}
{'Food': 'Pineapples', 'Amount': '2'}
{'Food': 'Pineapples', 'Amount': '10'}
{'Food': 'Pineapples', 'Amount': '200'}
{'Food': 'Pineapples', 'Amount': '3'}
{'Food': 'Pineapples', 'Amount': '1'}
{'Food': 'Pineapples', 'Amount': '2'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '04-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '06-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '10-Sep'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '12-Dec'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '13-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '12-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul'}


This can be shortened by putting all three assignments into one single `for` loop instead of three separate ones.

In [22]:
newDict = {}

for foodInformation in valuesList:
    newDict [keysList [0]] = foodInformation [0]
    newDict [keysList [1]] = foodInformation [1]
    newDict [keysList [2]] = foodInformation [2]
    print (newDict)


{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}
{'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun'}
{'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}
{'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}
{'Food': 'Sausages', 'Amount': '3', 'Expiration': '13-Jun'}
{'Food': 'Yogurt', 'Amount': '1', 'Expiration': '12-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul'}


It gets even shorter if I also loop over the positions in `foodInformation`. This can be done by nesting a second `for` loop inside the first one and using the length of `foodInformation`.

In [28]:
print (len(foodInformation))

3


In [29]:
newDict = {}

for foodInformation in valuesList:
    for i in range (0, len(foodInformation)):
        newDict [keysList [i]] = foodInformation [i]
    print (newDict)

{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}
{'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun'}
{'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}
{'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}
{'Food': 'Sausages', 'Amount': '3', 'Expiration': '13-Jun'}
{'Food': 'Yogurt', 'Amount': '1', 'Expiration': '12-Jun'}
{'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul'}
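
As an aside, the index-based inner loop can also be written by pairing each key with its value directly. The following is a minimal sketch using Python's built-in `zip` and `dict`. The short `keysList` and `valuesList` literals are copied from the output above so that the block runs on its own, and the name `rowDict` is only illustrative.

```python
# Sample data copied from the cells above so this sketch is self-contained.
keysList = ['Food', 'Amount', 'Expiration']
valuesList = [['Apples', '4', '04-Jun'], ['Bananas', '2', '06-Jun']]

for foodInformation in valuesList:
    # zip pairs the i-th key with the i-th value; dict turns the pairs into a dictionary
    rowDict = dict(zip(keysList, foodInformation))
    print(rowDict)
```

Like the nested loop, this builds one dictionary per row. If a row had fewer items than `keysList`, `zip` would simply stop at the shorter of the two.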


At the moment, each pass through the loop does not add new key:value pairs to `newDict` but overwrites the values from the previous row. To keep every row, I will create a fresh dictionary for each row and add it to a new list called `newList` by using the `.append()` method for lists.

In [51]:
newList = [] #create an empty list

for foodInformation in valuesList:
    newDict = {} #create an empty dictionary
    for i in range (0, len(foodInformation)):
        newDict [keysList [i]] = foodInformation [i]
    newList.append(newDict) #add the new entry of newDict to the list newList
    
print (newList)

[{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}, {'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun'}, {'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}, {'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}, {'Food': 'Sausages', 'Amount': '3', 'Expiration': '13-Jun'}, {'Food': 'Yogurt', 'Amount': '1', 'Expiration': '12-Jun'}, {'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul'}]
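
Note that `newDict = {}` sits inside the outer loop in the cell above. Here is a short sketch of why that matters, using made-up two-row data (`rows`, `sharedList` and `freshList` are hypothetical names). If the same dictionary object is reused, every list entry refers to that one object and ends up showing the last row.

```python
keys = ['Food', 'Amount']
rows = [['Apples', '4'], ['Bananas', '2']]

# Variant 1: one dictionary created outside the loop and reused
sharedList = []
shared = {}
for row in rows:
    for i in range(len(row)):
        shared[keys[i]] = row[i]
    sharedList.append(shared)  # appends a reference to the same object each time

# Variant 2: a new dictionary for every row (as in the cell above)
freshList = []
for row in rows:
    fresh = {}
    for i in range(len(row)):
        fresh[keys[i]] = row[i]
    freshList.append(fresh)

print(sharedList)  # both entries show the Bananas row
print(freshList)   # one entry per row
```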


So now I have a list of dictionaries, one for each `foodInformation` row.

Since I want to do this with at least one more CSV file, I will wrap the above code in a function that creates a list of dictionaries from a CSV file.

In [90]:
def csvToListDic (inputCsv):
    "Converts a csv into a list of dictionaries." #docstring
    
    import csv #import the csv package

    #create an empty list that will contain the data from the csv
    csvList = [] 
    
    #read in the csv file and convert each row into a list
    with open (inputCsv) as csvfile:
        readCSV = csv.reader(csvfile, delimiter = ";")
        for row in readCSV:
            csvList.append(row)
    
    #make a new list containing the keys (1st row = keys)
    keysList = csvList [0][:]
    
    #make a new list containing the values (all other rows)
    valuesList = csvList [1:]

    #create an empty list that will contain the dictionaries
    newList = [] 

    for foodInformation in valuesList:
        newDict = {} #create an empty dictionary
        for i in range (0, len(foodInformation)):
            newDict [keysList [i]] = foodInformation [i]
            
        #add the new entry of newDict to the list newList
        newList.append(newDict)
        
    #return the list newList
    return newList 

In [98]:
amounts = csvToListDic ("food_amounts.csv")
print (amounts)

[{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}, {'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun'}, {'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}, {'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}, {'Food': 'Sausages', 'Amount': '3', 'Expiration': '13-Jun'}, {'Food': 'Yogurt', 'Amount': '1', 'Expiration': '12-Jun'}, {'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul'}]


In [99]:
prices = csvToListDic ("food_prices.csv")
print (prices)

[{'Food': 'Bananas', 'Price': '12', 'Bought': 'TRUE'}, {'Food': 'Pineapples', 'Price': '87', 'Bought': 'FALSE'}, {'Food': 'Cheese', 'Price': '50', 'Bought': 'FALSE'}, {'Food': 'Yogurt', 'Price': '10', 'Bought': 'TRUE'}, {'Food': 'Toast', 'Price': '23', 'Bought': 'TRUE'}, {'Food': 'Sausages', 'Price': '42', 'Bought': 'TRUE'}, {'Food': 'Tomatoes', 'Price': '11', 'Bought': 'FALSE'}]
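
For comparison, the standard library's `csv.DictReader` does much the same job as `csvToListDic`: it uses the first row as keys and yields one dictionary per following row. The sketch below is an alternative, not the function used in the rest of this post. It writes a small temporary file (its content is made up here to mirror `food_amounts.csv`) so that it runs without the original files, and `csvToListDicShort` and `samplePath` are hypothetical names.

```python
import csv
import os
import tempfile

def csvToListDicShort(inputCsv):
    "Converts a semicolon-separated csv into a list of dictionaries."
    with open(inputCsv, newline="") as csvfile:
        # DictReader takes the header row as keys for every following row
        return [dict(row) for row in csv.DictReader(csvfile, delimiter=";")]

# Write a small sample file so the sketch is self-contained
samplePath = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(samplePath, "w", newline="") as f:
    f.write("Food;Amount;Expiration\nApples;4;04-Jun\nBananas;2;06-Jun\n")

sample = csvToListDicShort(samplePath)
print(sample)
```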


Now finally I would like to merge these two lists of dictionaries, so that I get a new list containing dictionaries with keys and values from both `amounts` and `prices`. 

In [101]:
mergedDic1 = {x["Food"]:x for x in amounts + prices}.values()
print (mergedDic1)

dict_values([{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}, {'Food': 'Bananas', 'Price': '12', 'Bought': 'TRUE'}, {'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}, {'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}, {'Food': 'Sausages', 'Price': '42', 'Bought': 'TRUE'}, {'Food': 'Yogurt', 'Price': '10', 'Bought': 'TRUE'}, {'Food': 'Pineapples', 'Price': '87', 'Bought': 'FALSE'}, {'Food': 'Cheese', 'Price': '50', 'Bought': 'FALSE'}, {'Food': 'Toast', 'Price': '23', 'Bought': 'TRUE'}, {'Food': 'Tomatoes', 'Price': '11', 'Bought': 'FALSE'}])


So this solution didn't work as I wanted. It didn't *merge* the entries that appear in both lists, but instead overwrote them (e.g. the `"Bananas"` and `"Pineapples"` entries now only contain the values from the `prices` list). Whenever a food appears in both lists, the dictionary from `prices` replaces the one from `amounts`.
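
The overwriting happens because a dictionary can hold each key only once: in a dict comprehension, a later item with the same key replaces the earlier value entirely. A minimal sketch with made-up one-entry lists (`a`, `b` and `byFood` are illustrative names):

```python
a = [{'Food': 'Bananas', 'Amount': '2'}]
b = [{'Food': 'Bananas', 'Price': '12'}]

# Both dictionaries map to the key 'Bananas'; the one from b comes later and wins
byFood = {x['Food']: x for x in a + b}
print(list(byFood.values()))
```

Only the dictionary from `b` survives, so the `Amount` value is lost.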



Here is another solution that defines a function `merge_lists` with the parameters `list1`, `list2` and `dictKey`.

The **`update()`** method updates a dictionary with the elements from another dictionary. It adds elements to the dictionary if the key is not in the dictionary. If the key is in the dictionary, it updates the key with the new value. 
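
Here is a small sketch of `update()` on its own, using one entry from each list (the variable names are only for illustration). Note that `update()` changes the dictionary in place and returns `None`.

```python
entry = {'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun'}
other = {'Food': 'Bananas', 'Price': '12', 'Bought': 'TRUE'}

result = entry.update(other)  # modifies entry in place

print(entry)   # now holds the keys of both dictionaries
print(result)  # None
```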

In [158]:
def merge_lists (list1, list2, dictKey):
    merged = {}
    for item in list1 + list2: #for each dictionary in both lists
        if item[dictKey] in merged: #if dictionary with specified key is already in merged
            merged [item[dictKey]].update(item)
            print (item)
        else:
            merged[item[dictKey]] = item
    return [val for (_, val) in merged.items()]

In [157]:
print (merge_lists (amounts, prices, "Food"))


{'Food': 'Bananas', 'Price': '12', 'Bought': 'TRUE'}
{'Food': 'Pineapples', 'Price': '87', 'Bought': 'FALSE'}
{'Food': 'Yogurt', 'Price': '10', 'Bought': 'TRUE'}
{'Food': 'Sausages', 'Price': '42', 'Bought': 'TRUE'}
[{'Food': 'Apples', 'Amount': '4', 'Expiration': '04-Jun'}, {'Food': 'Bananas', 'Amount': '2', 'Expiration': '06-Jun', 'Price': '12', 'Bought': 'TRUE'}, {'Food': 'Noodles', 'Amount': '10', 'Expiration': '10-Sep'}, {'Food': 'Peanuts', 'Amount': '200', 'Expiration': '12-Dec'}, {'Food': 'Sausages', 'Amount': '3', 'Expiration': '13-Jun', 'Price': '42', 'Bought': 'TRUE'}, {'Food': 'Yogurt', 'Amount': '1', 'Expiration': '12-Jun', 'Price': '10', 'Bought': 'TRUE'}, {'Food': 'Pineapples', 'Amount': '2', 'Expiration': '03-Jul', 'Price': '87', 'Bought': 'FALSE'}, {'Food': 'Cheese', 'Price': '50', 'Bought': 'FALSE'}, {'Food': 'Toast', 'Price': '23', 'Bought': 'TRUE'}, {'Food': 'Tomatoes', 'Price': '11', 'Bought': 'FALSE'}]


For now I don't quite understand how the `merge_lists` function works. What exactly are the `if` and `else` branches doing? Why is there a `+` between `list1` and `list2`? If I replace it with `and` I get a different output, and if I replace it with `or` I get yet another output.
I also don't understand the `return` statement. In the original Stack Overflow answer, the `return` statement is simply `merged.values()`, but this results in a slightly different output than the `[val for (_, val) in merged.items()]` that I got from a blog post.
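
A possible explanation, sketched with tiny made-up data (`list1`, `list2` and `merged` here are illustrative, not the real lists): `+` concatenates two lists, so the loop visits every dictionary from both, whereas `and` and `or` do not combine anything and return just one of their operands. Similarly, `merged.values()` returns a view object (printed as `dict_values([...])`), while the list comprehension builds a plain list with the same items.

```python
list1 = [{'Food': 'Apples'}]
list2 = [{'Food': 'Bananas'}]

print(list1 + list2)    # concatenation: items of list1 followed by items of list2
print(list1 and list2)  # list1 is non-empty (truthy), so the result is list2
print(list1 or list2)   # list1 is non-empty (truthy), so the result is list1

merged = {'Apples': {'Food': 'Apples'}}
print(merged.values())                       # a dict_values view
print([val for (_, val) in merged.items()])  # a plain list
print(list(merged.values()) == [val for (_, val) in merged.items()])  # same contents
```

As for the `if`/`else` in `merge_lists`: the first time a food is seen, its dictionary is stored under that food's name (`else`); when the same food shows up again, the stored dictionary is extended with `update()` (`if`).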
