

- One way to represent a CSV file's content is a "dictionary of dictionaries."
    - Each row in the CSV file maps to a Python dictionary.
        - The keys for the dictionary representing the row come from the CSV file's header row.
        - The values for the dictionary representing the row come from the data in the row's columns.
    - One of the columns is chosen as the "key" column. Its value in each row identifies that row's dictionary within the outer dictionary representing the file.
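
- The mapping described above can be sketched in a few lines. This is only an illustration, assuming Python 3 and the standard `csv` module; the names `SAMPLE` and `rows_by_key` and the two-column data are hypothetical.

```python
import csv
import io

# Hypothetical two-row CSV used only for this illustration.
SAMPLE = "Word,Times_Selected\nthe,0\nand,11\n"

def rows_by_key(text, key_column):
    """Map each row's key_column value to a dict of that row's fields."""
    table = {}
    # DictReader takes its keys from the header row.
    for row in csv.DictReader(io.StringIO(text)):
        table[row[key_column]] = dict(row)
    return table

table = rows_by_key(SAMPLE, "Word")
print(table["and"]["Times_Selected"])  # prints 11
```
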


- Consider the CSV file below.
```
Word,Relative_Frequency,User_Created,Times_Selected,Frequency,Original
the,1,,0,"1,086,322,084",
and,0.524925431,True,11,"570,238,088",
to,0.522060324,,0,"567,125,659",
```


- We could choose the column ```Word``` as the column that identifies rows. In this case, the dictionary would have the following logical structure:

```
{
  "the": {
    "Word": "the",
    "Relative_Frequency": "1",
    "User_Created": "",
    "Times_Selected": "0",
    "Frequency": "1,086,322,084",
    "Original": ""
  },
  "and": {
    "Word": "and",
    "Relative_Frequency": "0.524925431",
    "User_Created": "True",
    "Times_Selected": "11",
    "Frequency": "570,238,088",
    "Original": ""
  },
  "to": {
    "Word": "to",
    "Relative_Frequency": "0.522060324",
    "User_Created": "",
    "Times_Selected": "0",
    "Frequency": "567,125,659",
    "Original": ""
  }
}
```
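
- Note that every value in this structure is a string, including the numbers. As a small sketch (assuming Python 3; the variable names are illustrative only), the outer key selects the row, the inner key selects the column, and numeric fields have to be converted before use:

```python
# A single entry copied from the structure above.
words = {
    "the": {
        "Word": "the",
        "Relative_Frequency": "1",
        "User_Created": "",
        "Times_Selected": "0",
        "Frequency": "1,086,322,084",
        "Original": "",
    },
}

entry = words["the"]                    # outer key selects the row
raw = entry["Frequency"]                # inner key selects the column
frequency = int(raw.replace(",", ""))   # strip thousands separators first
times_selected = int(entry["Times_Selected"])

print(frequency)       # prints 1086322084
print(times_selected)  # prints 0
```
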

- For the extra-credit assignment, you will implement the functions below. The functions should work for any CSV file with a header row and correct formatting, e.g. each row has the correct number of columns/fields.
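
- One way to check the "correct formatting" assumption before loading is to compare each row's field count with the header's. The helper below is a hypothetical sketch (assuming Python 3), not part of the required functions:

```python
import csv
import io

def check_rectangular(text):
    """Return the header if every data row has as many fields as the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row_no, row in enumerate(reader, start=1):
        if len(row) != len(header):
            raise ValueError(
                "data row %d has %d fields, expected %d"
                % (row_no, len(row), len(header))
            )
    return header

# Quoted fields may contain commas; they still count as a single field.
print(check_rectangular('Word,Frequency\nthe,"1,086,322,084"\n'))
```
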


In [196]:
# Load the dictionary from a CSV file using key_column as the key
import csv, json, pprint
d = {}

def load(fn, key_column):
    global d
    # The with block closes the file on exit, so no explicit close() is needed.
    with open(fn) as f:
        for row in csv.DictReader(f):
            # Round-trip through JSON to store a plain dict copy of the row.
            json_data = json.loads(json.dumps(row))
            d[row[key_column]] = json_data
    return d

# Get an entry
def get_entry(k):
    global d
    return d[k]

# Add a new entry
def add_entry(k, v):
    global d
    d[k] = v
    
# Update the value of an entry
def set_entry(k, v):
    global d
    if k in d:
        d[k] = v
    else:
        return "key not in dictionary"

# Get the value associated with key c in dictionary entry associated with k
def get_cell(k, c):
    global d
    return d[k][c]

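# Set the value associated with key c in the dictionary entry associated with k
def set_cell(k, c, v):
    global d
    # Only the row key is checked; a new column name c is added to the entry.
    if k in d:
        d[k][c] = v
    else:
        return "key not in dictionary"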

# Save the changes
def save():
    global d
    print("In writer.")

    with open("./really-small.csv", "w") as f:
        csv_writing = csv.writer(f)
        print()

        # Take the column order from the first entry and write the header row.
        # next(iter(d)) works in Python 2 and 3, unlike d.keys()[0].
        header = list(d[next(iter(d))].keys())
        csv_writing.writerow(header)

        # Write one data row per entry, always in header order.
        for key in d:
            csv_writing.writerow([d[key][col] for col in header])

# Pretty-print the dictionary (prints directly and returns None).
def to_str():
    global d
    pprint.pprint(d)

- Some sample function calls using the small file above are:

In [197]:
# Python file where I put my implementation.
#import dictionary_nonoo as d

In [198]:
load("really-small.csv", "Word")
print(to_str())

{'and': {u'Frequency': u'570,238,088',
         u'Original': u'',
         u'Relative_Frequency': u'0.524925431',
         u'Times_Selected': u'11',
         u'User_Created': u'TRUE',
         u'Word': u'and'},
 'the': {u'Frequency': u'1,086,322,084',
         u'Original': u'',
         u'Relative_Frequency': u'1',
         u'Times_Selected': u'0',
         u'User_Created': u'',
         u'Word': u'the'},
 'to': {u'Frequency': u'567,125,659',
        u'Original': u'',
        u'Relative_Frequency': u'0.522060324',
        u'Times_Selected': u'0',
        u'User_Created': u'',
        u'Word': u'to'}}
None


In [199]:
add_entry("coney", { "Word": "coney", "Relative_Frequency": 0, "User_Created": "", "Times_Selected": 0, \
                     "Frequency": 1, "Original": "con"})

In [200]:
print(to_str())

{'and': {u'Frequency': u'570,238,088',
         u'Original': u'',
         u'Relative_Frequency': u'0.524925431',
         u'Times_Selected': u'11',
         u'User_Created': u'TRUE',
         u'Word': u'and'},
 'coney': {'Frequency': 1,
           'Original': 'con',
           'Relative_Frequency': 0,
           'Times_Selected': 0,
           'User_Created': '',
           'Word': 'coney'},
 'the': {u'Frequency': u'1,086,322,084',
         u'Original': u'',
         u'Relative_Frequency': u'1',
         u'Times_Selected': u'0',
         u'User_Created': u'',
         u'Word': u'the'},
 'to': {u'Frequency': u'567,125,659',
        u'Original': u'',
        u'Relative_Frequency': u'0.522060324',
        u'Times_Selected': u'0',
        u'User_Created': u'',
        u'Word': u'to'}}
None


In [201]:
print("Data for 'to' is ", get_entry("to"))

("Data for 'to' is ", {u'Word': u'to', u'Times_Selected': u'0', u'Frequency': u'567,125,659', u'User_Created': u'', u'Relative_Frequency': u'0.522060324', u'Original': u''})


In [202]:
to_selected = int(get_cell("to", "Times_Selected"))
print("'to' was selected ", to_selected, 'times.')
set_cell("to", "Times_Selected", to_selected+1)
print("'to' was NOW selected ", get_cell("to", "Times_Selected"), 'times.')

("'to' was selected ", 0, 'times.')
("'to' was NOW selected ", 1, 'times.')


In [203]:
save()

In writer.
()


In [204]:
load("really-small.csv", "Word")
print(to_str())

{'and': {u'Frequency': u'570,238,088',
         u'Original': u'',
         u'Relative_Frequency': u'0.524925431',
         u'Times_Selected': u'11',
         u'User_Created': u'TRUE',
         u'Word': u'and'},
 'coney': {u'Frequency': u'1',
           u'Original': u'con',
           u'Relative_Frequency': u'0',
           u'Times_Selected': u'0',
           u'User_Created': u'',
           u'Word': u'coney'},
 'the': {u'Frequency': u'1,086,322,084',
         u'Original': u'',
         u'Relative_Frequency': u'1',
         u'Times_Selected': u'0',
         u'User_Created': u'',
         u'Word': u'the'},
 'to': {u'Frequency': u'567,125,659',
        u'Original': u'',
        u'Relative_Frequency': u'0.522060324',
        u'Times_Selected': u'1',
        u'User_Created': u'',
        u'Word': u'to'}}
None


- For the submission:
    - Test your functions on two different CSV files. The files must have different columns and use different key columns.
    - The tests for each file require:
        - Loading the file
        - Printing the file
        - Calling each function at least once
        - Saving the file
        - Reloading and reprinting the file
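
- The load, modify, save, reload cycle above can be checked automatically. The following is a rough sketch, assuming Python 3; `load_table`, `save_table`, and `round_trip_ok` are hypothetical helpers that take an explicit path (unlike the functions in this assignment), and both CSV snippets are made-up test data.

```python
import csv
import os
import tempfile

def load_table(path, key_column):
    """Read a CSV file into (header, dict of row dicts keyed by key_column)."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        table = {row[key_column]: dict(row) for row in reader}
        return list(reader.fieldnames), table

def save_table(path, header, table):
    """Write the row dicts back out in header order."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=header)
        writer.writeheader()
        writer.writerows(table.values())

def round_trip_ok(csv_text, key_column):
    """Load, change one cell, save, reload, and compare with what was saved."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline="") as f:
            f.write(csv_text)
        header, table = load_table(path, key_column)
        first_key = next(iter(table))
        table[first_key][header[-1]] = "changed"  # modify the last column
        save_table(path, header, table)
        _, reloaded = load_table(path, key_column)
        return reloaded == table
    finally:
        os.remove(path)

# Two made-up files with different columns and different key columns.
print(round_trip_ok("Word,Times_Selected\nthe,0\nand,11\n", "Word"))  # prints True
print(round_trip_ok("id,name,score\n7,alpha,3\n9,beta,5\n", "id"))    # prints True
```
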