# MongoDB - Lab
https://github.com/learn-co-students/dsc-mongodb-lab-onl01-dtsc-pt-041320/tree/solution

## Introduction

In this lesson, we'll get some hands-on experience with MongoDB!

## Objectives
You will be able to: 

- Create a MongoDB database   
- Insert data into a MongoDB database   
- Read data from a MongoDB database   
- Update data in a MongoDB database   

## Getting Started

To begin this lab, make sure that you start up the MongoDB server in your terminal first! **You must do this lab locally on your computer, not in Learn.**
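
If you are not sure whether the server is up, a quick check like the sketch below can help. It uses only the standard library and assumes the server listens on the default `localhost:27017`; the helper name `mongo_port_is_open` is made up for this example. Keep in mind that an open port only tells us that *something* is listening there, not that it is necessarily MongoDB.

```python
import socket


def mongo_port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError if nothing is listening
        # or if the attempt times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if mongo_port_is_open():
    print("A server is listening on localhost:27017.")
else:
    print("Nothing is listening on localhost:27017; start the MongoDB server first.")
```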


## Connecting to the MongoDB Database

In the cell below, import the appropriate library and connect to the MongoDB server. Create a new database called `'lab_db'`.

In [1]:
import pymongo
client = pymongo.MongoClient()
client

MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True)

In [2]:
client.list_database_names()

['admin', 'config', 'example_database', 'lab_db', 'local', 'test']

## Creating a Collection

Now, create a collection called `'lab_collection'` inside `'lab_db'`.

In [3]:
db = client['lab_db']
db

Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True), 'lab_db')

In [4]:
db.list_collection_names()

['lab_collection']

In [5]:
collection = db['lab_collection']
collection.delete_many({})

<pymongo.results.DeleteResult at 0x10a739b48>

## Adding Some Data

Now, we're going to add some example records into our database. In the cells below, create dictionary representations of the following customer records:


|     Name     |            Email           |  Mailing_Address  | Balance |                         Notes                         |
|:------------:|:--------------------------:|:-----------------:|:-------:|:-----------------------------------------------------:|
|  John Smith  |    j.smith@thesmiths.com   | 123 mulberry lane |   0.0   |    Called technical support, issue not yet resolved   |
|  Jane Smith  |  jane_smith@thesmiths.com  |         Null          |  25.00  |                   Null                                    |
|  Adam Enbar  | adam@theflatironschool.com |    11 Broadway    |  14.99  |           Set up on recurring billing cycle           |
| Avi Flombaum |  avi@theflatironschool.com |    11 Broadway    |   0.0   |                   Null                                    |
|   Steven S.  |     steven.s@gmail.net     |         Null          |  -20.23 | Refunded for overpayment due to price match guarantee |


Your first challenge is to take all of the data in the table above, create the corresponding documents, and add them to our MongoDB database. Note that fields that contain 'Null' should not be included in the document (recall that since MongoDB is schema-less, each document can have different fields).

Create the documents from the table listed above, and then use `insert_many()` to insert them into the collection.
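
One way to build these documents by hand is sketched below. The rows are copied from the table, with `None` standing in for the cells shown as 'Null'; the names `FIELDS`, `ROWS`, and `build_documents` are just illustrative choices. The `insert_many()` call is left as a comment because it needs the running server and the `collection` object from the earlier cells.

```python
# Rows copied from the customer table; None stands in for the cells shown as "Null".
FIELDS = ("Name", "Email", "Mailing_Address", "Balance", "Notes")
ROWS = [
    ("John Smith", "j.smith@thesmiths.com", "123 mulberry lane", 0.0,
     "Called technical support, issue not yet resolved"),
    ("Jane Smith", "jane_smith@thesmiths.com", None, 25.00, None),
    ("Adam Enbar", "adam@theflatironschool.com", "11 Broadway", 14.99,
     "Set up on recurring billing cycle"),
    ("Avi Flombaum", "avi@theflatironschool.com", "11 Broadway", 0.0, None),
    ("Steven S.", "steven.s@gmail.net", None, -20.23,
     "Refunded for overpayment due to price match guarantee"),
]


def build_documents(rows, fields=FIELDS):
    """Turn each row into a dict, leaving out any field whose value is None."""
    return [
        {field: value for field, value in zip(fields, row) if value is not None}
        for row in rows
    ]


all_records = build_documents(ROWS)
for record in all_records:
    print(record)

# With the collection from the earlier cells, the documents could then be added with:
# insertion_results = collection.insert_many(all_records)
```

The comparison is `is not None` rather than a plain truthiness test, so a balance of `0.0` is kept while the missing fields are dropped.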

In [9]:
import pandas as pd
df = pd.read_clipboard(sep='|')
df

Unnamed: 0.1,Unnamed: 0,Name,Email,Mailing_Address,Balance,Notes,Unnamed: 6
0,,:------------:,:--------------------------:,:-----------------:,:-------:,:---------------------------------------------...,
1,,John Smith,j.smith@thesmiths.com,123 mulberry lane,0.0,"Called technical support, issue not yet re...",
2,,Jane Smith,jane_smith@thesmiths.com,Null,25.00,Null ...,
3,,Adam Enbar,adam@theflatironschool.com,11 Broadway,14.99,Set up on recurring billing cycle ...,
4,,Avi Flombaum,avi@theflatironschool.com,11 Broadway,0.0,Null ...,
5,,Steven S.,steven.s@gmail.net,Null,-20.23,Refunded for overpayment due to price match g...,


In [10]:
df.columns 

Index(['Unnamed: 0', '     Name     ', '            Email           ',
       '  Mailing_Address  ', ' Balance ',
       '                         Notes                         ',
       'Unnamed: 6'],
      dtype='object')

In [11]:
drop_cols = [col for col in df.columns if 'Unnamed' in col]
drop_cols

['Unnamed: 0', 'Unnamed: 6']

In [12]:
df.drop(columns=drop_cols,inplace=True)
df.columns

Index(['     Name     ', '            Email           ', '  Mailing_Address  ',
       ' Balance ', '                         Notes                         '],
      dtype='object')

In [13]:
[col.strip() for col in df.columns]

['Name', 'Email', 'Mailing_Address', 'Balance', 'Notes']

In [14]:
df.columns = [col.strip() for col in df.columns]
df.head()

Unnamed: 0,Name,Email,Mailing_Address,Balance,Notes
0,:------------:,:--------------------------:,:-----------------:,:-------:,:---------------------------------------------...
1,John Smith,j.smith@thesmiths.com,123 mulberry lane,0.0,"Called technical support, issue not yet re..."
2,Jane Smith,jane_smith@thesmiths.com,Null,25.00,Null ...
3,Adam Enbar,adam@theflatironschool.com,11 Broadway,14.99,Set up on recurring billing cycle ...
4,Avi Flombaum,avi@theflatironschool.com,11 Broadway,0.0,Null ...


In [15]:
df.drop(0,inplace=True)
df

Unnamed: 0,Name,Email,Mailing_Address,Balance,Notes
1,John Smith,j.smith@thesmiths.com,123 mulberry lane,0.0,"Called technical support, issue not yet re..."
2,Jane Smith,jane_smith@thesmiths.com,Null,25.0,Null ...
3,Adam Enbar,adam@theflatironschool.com,11 Broadway,14.99,Set up on recurring billing cycle ...
4,Avi Flombaum,avi@theflatironschool.com,11 Broadway,0.0,Null ...
5,Steven S.,steven.s@gmail.net,Null,-20.23,Refunded for overpayment due to price match g...


In [16]:
df['Name'].iloc[0]

'  John Smith  '

In [17]:
df['Name'].iloc[0].strip()

'John Smith'

In [18]:
# Strip the padding that the Markdown table left around every cell
for col in df.columns:
    df[col] = df[col].str.strip()
df

Unnamed: 0,Name,Email,Mailing_Address,Balance,Notes
1,John Smith,j.smith@thesmiths.com,123 mulberry lane,0.0,"Called technical support, issue not yet resolved"
2,Jane Smith,jane_smith@thesmiths.com,Null,25.0,Null
3,Adam Enbar,adam@theflatironschool.com,11 Broadway,14.99,Set up on recurring billing cycle
4,Avi Flombaum,avi@theflatironschool.com,11 Broadway,0.0,Null
5,Steven S.,steven.s@gmail.net,Null,-20.23,Refunded for overpayment due to price match gu...


In [19]:
df['Balance']=df['Balance'].astype(float)

In [20]:
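# Note: cells marked 'Null' are still present here as the string 'Null';
# drop those keys first if the documents should omit them, as the instructions ask.
records = df.to_dict('records')
records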

[{'Name': 'John Smith',
  'Email': 'j.smith@thesmiths.com',
  'Mailing_Address': '123 mulberry lane',
  'Balance': 0.0,
  'Notes': 'Called technical support, issue not yet resolved'},
 {'Name': 'Jane Smith',
  'Email': 'jane_smith@thesmiths.com',
  'Mailing_Address': 'Null',
  'Balance': 25.0,
  'Notes': 'Null'},
 {'Name': 'Adam Enbar',
  'Email': 'adam@theflatironschool.com',
  'Mailing_Address': '11 Broadway',
  'Balance': 14.99,
  'Notes': 'Set up on recurring billing cycle'},
 {'Name': 'Avi Flombaum',
  'Email': 'avi@theflatironschool.com',
  'Mailing_Address': '11 Broadway',
  'Balance': 0.0,
  'Notes': 'Null'},
 {'Name': 'Steven S.',
  'Email': 'steven.s@gmail.net',
  'Mailing_Address': 'Null',
  'Balance': -20.23,
  'Notes': 'Refunded for overpayment due to price match guarantee'}]

In [21]:
list(records[0].keys())

['Name', 'Email', 'Mailing_Address', 'Balance', 'Notes']

In [22]:

insertion_results = collection.insert_many(records)


Now, access the appropriate attribute from the results object returned from the insertion to see the unique IDs for each record inserted, so that we can confirm that each was inserted correctly.

In [23]:
insertion_results.inserted_ids

[ObjectId('5ebc5d8e6457b8a3fe3c83fc'),
 ObjectId('5ebc5d8e6457b8a3fe3c83fd'),
 ObjectId('5ebc5d8e6457b8a3fe3c83fe'),
 ObjectId('5ebc5d8e6457b8a3fe3c83ff'),
 ObjectId('5ebc5d8e6457b8a3fe3c8400')]

## Querying and Filtering

In the cell below, return the name and email address for every customer record. Then, print every item from the query to show that it worked correctly. 

In [24]:
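# Note: find({}) with no projection returns every field. To limit the output to
# names and emails, pass a projection as the second argument,
# e.g. {'Name': 1, 'Email': 1}.
query_1 = collection.find({})
[print(x) for x in query_1]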

{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc'), 'Name': 'John Smith', 'Email': 'j.smith@thesmiths.com', 'Mailing_Address': '123 mulberry lane', 'Balance': 0.0, 'Notes': 'Called technical support, issue not yet resolved'}
{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fd'), 'Name': 'Jane Smith', 'Email': 'jane_smith@thesmiths.com', 'Mailing_Address': 'Null', 'Balance': 25.0, 'Notes': 'Null'}
{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fe'), 'Name': 'Adam Enbar', 'Email': 'adam@theflatironschool.com', 'Mailing_Address': '11 Broadway', 'Balance': 14.99, 'Notes': 'Set up on recurring billing cycle'}
{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83ff'), 'Name': 'Avi Flombaum', 'Email': 'avi@theflatironschool.com', 'Mailing_Address': '11 Broadway', 'Balance': 0.0, 'Notes': 'Null'}
{'_id': ObjectId('5ebc5d8e6457b8a3fe3c8400'), 'Name': 'Steven S.', 'Email': 'steven.s@gmail.net', 'Mailing_Address': 'Null', 'Balance': -20.23, 'Notes': 'Refunded for overpayment due to price match guarantee'}


[None, None, None, None, None]

#### Removing By ID

In [25]:
## https://api.mongodb.com/python/current/api/bson/objectid.html
cur = collection.find({'Name':'John Smith'},{"_id":1,"Name":1})
list(cur)

[{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc'), 'Name': 'John Smith'}]

In [26]:
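from bson import ObjectId
# Note: this id does not match any document in the collection (compare it with the
# id returned above), so delete_one() removes nothing and John Smith's record remains.
collection.delete_one({"_id":ObjectId('5ebc5a5381decc05df4dc872')})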

<pymongo.results.DeleteResult at 0x11a7e7e08>

In [27]:
cur = collection.find({'Name':'John Smith'},{"_id":1})
list(cur)

[{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc')}]
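
John Smith's record is still there because the id passed to `delete_one()` is not the id of his document. The first 4 bytes of an ObjectId store its creation time in seconds since the Unix epoch, so we can compare the two ids with nothing but the standard library. The helper `objectid_created_at` below is a hypothetical name for this sketch; with `bson` installed, `ObjectId(...).generation_time` exposes the same information.

```python
from datetime import datetime, timezone


def objectid_created_at(hex_id):
    """Decode the creation time stored in the first 4 bytes (8 hex characters) of an ObjectId."""
    seconds = int(hex_id[:8], 16)  # big-endian seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


deleted_id = "5ebc5a5381decc05df4dc872"  # id passed to delete_one() above
john_id = "5ebc5d8e6457b8a3fe3c83fc"     # id returned by the find() calls above

print(deleted_id == john_id)  # False: the ids differ, so the filter matches nothing
print(objectid_created_at(deleted_id))
print(objectid_created_at(john_id))
```

The id that was passed to `delete_one()` was generated a few minutes before the ids in the current collection, which suggests it was copied from an earlier run of the notebook.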

Great! Now, let's write a query that gets an individual record based on a stored key-value pair a document contains. 

In the cell below, write a query to get the record for `'John Smith'` by using his name. Then, print the results of the query to demonstrate that it worked correctly.  

In [28]:
query_2 = collection.find({'Name':'John Smith'})
[print(x) for x in query_2]

{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc'), 'Name': 'John Smith', 'Email': 'j.smith@thesmiths.com', 'Mailing_Address': '123 mulberry lane', 'Balance': 0.0, 'Notes': 'Called technical support, issue not yet resolved'}


[None]

Great! Now, write a query to get the names, email addresses, and balances for customers that have a balance greater than 0. Use a modifier to do this. 

**_HINT_**: In the query below, you'll be passing in two separate dictionaries. The first one defines the logic of the query, while the second tells which fields we want returned. 

In [29]:
query_3 = collection.find({'Balance':{'$gt':0.0}},{'Name':1,'Email':1,'Balance':1})
[print(x) for x in query_3]

{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fd'), 'Name': 'Jane Smith', 'Email': 'jane_smith@thesmiths.com', 'Balance': 25.0}
{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fe'), 'Name': 'Adam Enbar', 'Email': 'adam@theflatironschool.com', 'Balance': 14.99}


[None, None]
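
The two dictionaries from the hint can also be built separately and inspected before they are sent to the server. The sketch below does that; `positive_balance_query` and `customers_with_balance_above` are hypothetical helper names, and the second function expects a PyMongo collection like the one created earlier in this lab. Adding `"_id": 0` to the projection is optional: it hides the `_id` field, which is otherwise returned by default, as the output above shows.

```python
def positive_balance_query(threshold=0.0):
    """Return the (filter, projection) pair for customers whose Balance exceeds threshold."""
    query_filter = {"Balance": {"$gt": threshold}}                # which documents to match
    projection = {"_id": 0, "Name": 1, "Email": 1, "Balance": 1}  # which fields to return
    return query_filter, projection


def customers_with_balance_above(collection, threshold=0.0):
    """Run the query against a PyMongo collection and return the matches as a list."""
    query_filter, projection = positive_balance_query(threshold)
    return list(collection.find(query_filter, projection))


query_filter, projection = positive_balance_query()
print(query_filter)
print(projection)

# Assuming `collection` exists as in the cells above:
# customers_with_balance_above(collection)
```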

## Updating a Record

Now, let's update some records. In the cell below, set the mailing address for `'John Smith'` to `'367 55th St., apt 2A'`.

In [30]:
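record_to_update_1 = {'Name':'John Smith'}
# Note: the key below is 'Mailing Address' (with a space), which does not match the
# existing 'Mailing_Address' field, so $set adds a new field instead of overwriting
# the old address. Use 'Mailing_Address' to update the existing field in place.
update_1 = {'$set':{'Mailing Address':'367 55th St., apt 2A'}}
collection.update_one(record_to_update_1,update_1)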


<pymongo.results.UpdateResult at 0x11a8282c8>

Now, write a query to check that the update worked for this document in the cell below:  

In [31]:
query_4 = collection.find(record_to_update_1)
[print(x) for x in query_4]

{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc'), 'Name': 'John Smith', 'Email': 'j.smith@thesmiths.com', 'Mailing_Address': '123 mulberry lane', 'Balance': 0.0, 'Notes': 'Called technical support, issue not yet resolved', 'Mailing Address': '367 55th St., apt 2A'}


[None]
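
Notice that the document now holds both `'Mailing_Address'` and `'Mailing Address'`. For top-level fields, `$set` overwrites a field that already exists and adds one that does not, so a key that differs by a single character creates a second field. The sketch below imitates that behavior on a plain dictionary; `apply_set` is a stand-in written for this example, not a PyMongo function.

```python
def apply_set(document, changes):
    """Imitate $set on top-level fields: overwrite keys that exist, add keys that do not."""
    updated = dict(document)  # copy so the original dict is left untouched
    updated.update(changes)
    return updated


john = {"Name": "John Smith", "Mailing_Address": "123 mulberry lane"}

# Key with a space: no such field exists, so a new one is added next to the old one.
print(apply_set(john, {"Mailing Address": "367 55th St., apt 2A"}))

# Key with an underscore: matches the existing field, so its value is replaced.
print(apply_set(john, {"Mailing_Address": "367 55th St., apt 2A"}))
```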

Now, let's assume that we want to add birthdays for each customer record. Consider the following table:

|     Name     |  Birthday  |
|:------------:|:----------:|
|  John Smith  | 02/20/1986 |
|  Jane Smith  | 07/07/1983 |
|  Adam Enbar  | 12/02/1982 |
| Avi Flombaum | 04/17/1983 |
|   Steven S.  | 08/30/1991 |

We want to add birthdays for each person, but we want to do so in a way where we don't have to do the same repetitive task over and over again. This seems like a good opportunity to write a function to handle some of the logic for us!

In the cell below:

* Store the names in the `names_list` variable as a list.
* Store the birthdays in the `birthdays_list` variable as a list.
* Write a function that takes in the two different lists, and updates each record by adding in the appropriate key-value pair containing that user's birthday.

**_Hint:_** There are several ways that you could write this, depending on whether you want to use the `update_one()` method or the `update_many()` method. See if you can figure out both approaches!
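
As a rough sketch of the `update_one()` route, the function below updates one document per name and reports any name that matched nothing. The function name and its return value are choices made for this example; it expects a PyMongo collection and relies on the `matched_count` attribute of the result that `update_one()` returns.

```python
def update_birthdays_one_by_one(collection, names, birthdays):
    """Set 'Birthday' on one document per name; return the names that matched nothing."""
    if len(names) != len(birthdays):
        raise ValueError("names and birthdays must have the same length")

    unmatched = []
    for name, birthday in zip(names, birthdays):
        result = collection.update_one({"Name": name}, {"$set": {"Birthday": birthday}})
        # matched_count is 0 when no document has this Name
        if result.matched_count == 0:
            unmatched.append(name)
    return unmatched


# Assuming `collection`, `names_list` and `birthdays_list` exist as described above:
# missing = update_birthdays_one_by_one(collection, names_list, birthdays_list)
```

Since `update_one()` touches only the first matching document, this version fits data where each name appears once; `update_many()` would update every document that shares the name.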

In [32]:
names_list = list(df['Name'].values)
birthdays_list = ['02/20/1986', '07/07/1983', '12/02/1982', '04/17/1983', '08/30/1991']

In [33]:
def update_birthdays(names, birthdays):
    """Add a 'Birthday' field to every document whose 'Name' matches."""
    # zip pairs each name with the birthday at the same position
    for name, birthday in zip(names, birthdays):
        query = {'Name': name}
        update = {'$set': {'Birthday': birthday}}
        collection.update_many(query, update)

update_birthdays(names_list, birthdays_list)

Now, write a query to check your work and see that the birthdays were added correctly.

In [34]:
for name in names_list:
    res = collection.find({'Name':name})
    [print(r) for r in res]
    print('\n\n\n')

{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fc'), 'Name': 'John Smith', 'Email': 'j.smith@thesmiths.com', 'Mailing_Address': '123 mulberry lane', 'Balance': 0.0, 'Notes': 'Called technical support, issue not yet resolved', 'Mailing Address': '367 55th St., apt 2A', 'Birthday': '02/20/1986'}




{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fd'), 'Name': 'Jane Smith', 'Email': 'jane_smith@thesmiths.com', 'Mailing_Address': 'Null', 'Balance': 25.0, 'Notes': 'Null', 'Birthday': '07/07/1983'}




{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83fe'), 'Name': 'Adam Enbar', 'Email': 'adam@theflatironschool.com', 'Mailing_Address': '11 Broadway', 'Balance': 14.99, 'Notes': 'Set up on recurring billing cycle', 'Birthday': '12/02/1982'}




{'_id': ObjectId('5ebc5d8e6457b8a3fe3c83ff'), 'Name': 'Avi Flombaum', 'Email': 'avi@theflatironschool.com', 'Mailing_Address': '11 Broadway', 'Balance': 0.0, 'Notes': 'Null', 'Birthday': '04/17/1983'}




{'_id': ObjectId('5ebc5d8e6457b8a3fe3c8400'), 'Name': 'Steven S.', 'Emai

Great! It looks like the birthdays have been added to every record correctly!
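
Reading five printed documents by eye works fine here. For a larger collection, one option is to ask the server for any documents that still lack the field, using the `$exists` operator. The helper name `names_missing_birthday` is hypothetical, and the function expects a PyMongo collection like the one used throughout this lab.

```python
def names_missing_birthday(collection):
    """Return the Name of every document that has no 'Birthday' field."""
    cursor = collection.find(
        {"Birthday": {"$exists": False}},  # match documents without the field
        {"_id": 0, "Name": 1},             # return only the name
    )
    return [doc.get("Name") for doc in cursor]


# Assuming `collection` exists as in the cells above, an empty list means
# that every document has a birthday:
# names_missing_birthday(collection)
```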

## Summary

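In this lab, we got some hands-on practice working with MongoDB: we connected to a server, created a database and a collection, inserted documents, queried and filtered them, and updated records.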