## Storing the data used for TMA 02 16J, Q2 in a MongoDB database

In [1]:
!head -5 'data/EU-referendum-result-data.csv'

"id","Region_Code","Region","Area_Code","Area","Electorate","ExpectedBallots","VerifiedBallotPapers","Pct_Turnout","Votes_Cast","Valid_Votes","Remain","Leave","Rejected_Ballots","No_official_mark","Voting_for_both_answers","Writing_or_mark","Unmarked_or_void","Pct_Remain","Pct_Leave","Pct_Rejected"
"108","E12000006","East","E06000031","Peterborough","120892","87474","87469","72.35","87469","87392","34176","53216","77","0","32","7","38","39.11","60.89","0.09"
"109","E12000006","East","E06000032","Luton","127612","84633","84636","66.31","84616","84481","36708","47773","135","0","85","0","50","43.45","56.55","0.16"
"112","E12000006","East","E06000033","Southend-on-Sea","128856","93948","93939","72.90","93939","93870","39348","54522","69","0","21","0","48","41.92","58.08","0.07"
"113","E12000006","East","E06000034","Thurrock","109897","79969","79954","72.75","79950","79916","22151","57765","34","0","8","3","23","27.72","72.28","0.04"


In [2]:
!wc -l 'data/EU-referendum-result-data.csv'

383 data/EU-referendum-result-data.csv


The command for importing files into MongoDB is `mongoimport`. It loads a file into a named collection in a named database. It accepts many options; the ones used here are:

* `--port` gives the port the Mongo server is listening on
* `--drop` drops the collection first if it already exists
* `--db` and `--collection` specify where the imported data should go
* `--type csv` says the input is a CSV file (the default is JSON)
* `--headerline` indicates that the first line in the file contains the column names, which will be used as keys for the created documents
* `--ignoreBlanks` means that keys with empty values will not be created in the imported documents
* `--file` tells `mongoimport` where the data resides.

In [3]:
!/usr/bin/mongoimport --port 27351 --drop --db referendum --collection resultsdata \
    --type csv --headerline --ignoreBlanks \
    --file data/EU-referendum-result-data.csv

2018-02-22T18:21:21.146+0000	connected to: localhost:27351
2018-02-22T18:21:21.147+0000	dropping: referendum.resultsdata
2018-02-22T18:21:21.237+0000	imported 382 documents
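
To make the effect of the header line and blank-skipping options more concrete, here is a rough pure-Python sketch of the kind of transformation the import performs on each CSV row. This is not how `mongoimport` is implemented; the function names `csv_to_documents` and `_convert` are made up for illustration, the numeric conversion is only an approximation of the tool's type guessing, and the second sample row is hypothetical (it is not in the referendum data) and exists only to show a blank field being skipped.

```python
import csv
import io


def _convert(value):
    """Try int, then float; otherwise keep the string.

    Assumption: this only approximates mongoimport's type guessing.
    """
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            pass
    return value


def csv_to_documents(csv_text):
    """Build one dict per data row, keyed by the header line, skipping blanks."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first line supplies the keys
    return [
        {key: _convert(value)
         for key, value in row.items()
         if value not in ("", None)}  # mimic the blank-skipping behaviour
        for row in reader
    ]


sample = (
    '"id","Region","Area","Electorate","Pct_Turnout"\n'
    '"108","East","Peterborough","120892","72.35"\n'
    # Hypothetical row with a blank Electorate, for illustration only
    '"999","Nowhere","Exampleton","","50.0"\n'
)

docs = csv_to_documents(sample)
print(docs[0])
print(docs[1])
```

The first document keeps every column as a key, with numeric-looking strings converted to numbers; the second has no `Electorate` key at all, rather than a key with an empty value.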


In [4]:
# Import the required libraries
import pymongo
import bson

In [5]:
# Open a connection to the Mongo server
client = pymongo.MongoClient('mongodb://localhost:27351/')

In [6]:
# Open the imported database and collection.
db = client.referendum
results = db.resultsdata

In [8]:
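# Check that the number of documents matches the count reported by mongoimport
# (count_documents() needs PyMongo 3.7+; Cursor.count() was removed in PyMongo 4)
results.count_documents({})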

382

In [9]:
# Look at one document
results.find_one()

{'Area': 'Peterborough',
 'Area_Code': 'E06000031',
 'Electorate': 120892,
 'ExpectedBallots': 87474,
 'Leave': 53216,
 'No_official_mark': 0,
 'Pct_Leave': 60.89,
 'Pct_Rejected': 0.09,
 'Pct_Remain': 39.11,
 'Pct_Turnout': 72.35,
 'Region': 'East',
 'Region_Code': 'E12000006',
 'Rejected_Ballots': 77,
 'Remain': 34176,
 'Unmarked_or_void': 38,
 'Valid_Votes': 87392,
 'VerifiedBallotPapers': 87469,
 'Votes_Cast': 87469,
 'Voting_for_both_answers': 32,
 'Writing_or_mark': 7,
 '_id': ObjectId('5a8f0a21da8ba752b13bccaa'),
 'id': 108}
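
Because the numeric columns were imported as numbers rather than strings, simple arithmetic checks on a document are possible straight away. The sketch below copies a few fields from the Peterborough document shown above into a plain dictionary and checks that the totals are internally consistent. The helper name `check_vote_totals` is made up for illustration, and the assumption that the percentages are taken over valid votes has only been checked against this one row.

```python
# A few fields copied from the document returned by find_one() above
doc = {
    "Area": "Peterborough",
    "Votes_Cast": 87469,
    "Valid_Votes": 87392,
    "Remain": 34176,
    "Leave": 53216,
    "Rejected_Ballots": 77,
    "Pct_Remain": 39.11,
    "Pct_Leave": 60.89,
}


def check_vote_totals(doc):
    """Return a dict of named consistency checks for one results document."""
    valid = doc["Valid_Votes"]
    return {
        # Every valid vote is either Remain or Leave
        "valid_is_remain_plus_leave":
            doc["Remain"] + doc["Leave"] == valid,
        # Votes cast are either valid or rejected
        "cast_is_valid_plus_rejected":
            valid + doc["Rejected_Ballots"] == doc["Votes_Cast"],
        # Assumption: percentages are of valid votes, rounded to 2 d.p.
        "pct_remain_matches":
            abs(100 * doc["Remain"] / valid - doc["Pct_Remain"]) < 0.005,
        "pct_leave_matches":
            abs(100 * doc["Leave"] / valid - doc["Pct_Leave"]) < 0.005,
    }


checks = check_vote_totals(doc)
print(checks)
```

With a live connection, the same helper could be applied to `results.find_one()` or to each document from `results.find()`.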