
1. SQL vs noSQL
2. Installation
3. Create a database
4. Adding a document to a collection
5. Retrieving documents from a collection
6. Deleting a document
7. Modifying a document
8. Practical application

## Intro to Mongo
MongoDB is one of the most widely used NoSQL databases. It stores data as JSON-like documents.

A NoSQL database is designed for managing large collections of unstructured or semi-structured data, that is, data that does not fit neatly into the tables and relations of a relational database.

## SQL vs noSQL
Why? When?

SQL is the language used to interface with relational databases, which model data as rows in tables with logical links between them. NoSQL is a class of DBMSs that are non-relational and generally do not use SQL.
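To make the contrast concrete, here is a minimal sketch that stores the same customer once as a relational row (using Python's built-in `sqlite3`) and once as a JSON-like document. The `orders` field is a hypothetical addition used only to show nesting; it does not appear elsewhere in this notebook.

```python
import json
import sqlite3

# Relational: columns are declared up front and rows are queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, address TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?)", ("John", "Highway 37"))
row = conn.execute(
    "SELECT name, address FROM customers WHERE name = ?", ("John",)
).fetchone()
conn.close()

# Document: a self-describing record whose fields can nest and can
# differ from one document to the next.
document = {
    "name": "John",
    "address": "Highway 37",
    "orders": [{"item": "book", "qty": 2}],  # hypothetical nested field
}
serialized = json.dumps(document)

print(row)
print(serialized)
```

In the relational version, adding `orders` would typically mean a second table and a join; in the document version it is just another field.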

## Installation
Install MongoDB.

### Windows
- https://docs.mongodb.com/manual/installation/
- https://www.mongodb.com/try/download/


### macOS / Linux
Using [Homebrew](https://brew.sh/):

- `brew tap mongodb/brew`
- `brew install mongodb-community`
- `brew services start mongodb/brew/mongodb-community`

### PyMongo
`!conda install pymongo -y`

In [1]:
%conda install pymongo -y

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/suneelchakravorty/opt/anaconda3

  added / updated specs:
    - pymongo


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pymongo-3.11.3             |   py38h23ab428_0         1.2 MB
    ------------------------------------------------------------
                                           Total:         1.2 MB

The following packages will be UPDATED:

  pymongo                             3.11.2-py38h23ab428_0 --> 3.11.3-py38h23ab428_0



Downloading and Extracting Packages
pymongo-3.11.3       | 1.2 MB    | ##################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done


In [3]:
from pymongo import MongoClient

In [7]:
client = MongoClient("mongodb://localhost:27017/")

In [10]:
client.list_database_names()

['admin', 'config', 'local']

## Create a database

In [8]:
mydb = client['mydatabase']
# MongoDB creates the database lazily: it only shows up in
# list_database_names() once it holds at least one document.

## Create a collection

In [12]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]

mycol = mydb["customers"]

[]

# ***Inserting a document***

## ***Let's create a dictionary***

`insert_one`

Insert a record in the "customers" collection:

In [None]:
mydict = { "name": "John", "address": "Highway 37" }
x = mycol.insert_one(mydict)

`insert_many`

To insert multiple documents into a collection in MongoDB, we use the insert_many() method.

The first parameter of the insert_many() method is a list containing dictionaries with the data you want to insert:

In [None]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mylist = [
  { "name": "Amy", "address": "Apple st 652"},
  { "name": "Hannah", "address": "Mountain 21"},
  { "name": "Michael", "address": "Valley 345"},
  { "name": "Sandy", "address": "Ocean blvd 2"},
  { "name": "Betty", "address": "Green Grass 1"},
  { "name": "Richard", "address": "Sky st 331"},
  { "name": "Susan", "address": "One way 98"},
  { "name": "Vicky", "address": "Yellow Garden 2"},
  { "name": "Ben", "address": "Park Lane 38"},
  { "name": "William", "address": "Central st 954"},
  { "name": "Chuck", "address": "Main Road 989"},
  { "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

#print list of the _id values of the inserted documents:

print(x.inserted_ids)

`inserted_ids`

If you do not want MongoDB to assign unique ids to your documents, you can specify the _id field when you insert the document(s).

Remember that the values have to be unique. Two documents cannot have the same _id.

In [None]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

mylist = [
  { "_id": 1, "name": "John", "address": "Highway 37"},
  { "_id": 2, "name": "Peter", "address": "Lowstreet 27"},
  { "_id": 3, "name": "Amy", "address": "Apple st 652"},
  { "_id": 4, "name": "Hannah", "address": "Mountain 21"},
  { "_id": 5, "name": "Michael", "address": "Valley 345"},
  { "_id": 6, "name": "Sandy", "address": "Ocean blvd 2"},
  { "_id": 7, "name": "Betty", "address": "Green Grass 1"},
  { "_id": 8, "name": "Richard", "address": "Sky st 331"},
  { "_id": 9, "name": "Susan", "address": "One way 98"},
  { "_id": 10, "name": "Vicky", "address": "Yellow Garden 2"},
  { "_id": 11, "name": "Ben", "address": "Park Lane 38"},
  { "_id": 12, "name": "William", "address": "Central st 954"},
  { "_id": 13, "name": "Chuck", "address": "Main Road 989"},
  { "_id": 14, "name": "Viola", "address": "Sideway 1633"}
]

x = mycol.insert_many(mylist)

#print a list of the _id values of the inserted documents:
print(x.inserted_ids)

### Exercise: What happens when...
- You specify the `_id`?
- What if it matches an existing `_id`?

## Data Retrieval

`find_one`

Find the first document in the customers collection:

In [None]:
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]
x = mycol.find_one()
print(x)

`find`

Return all documents in the "customers" collection, and print each document:

In [None]:
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

for x in mycol.find():
  print(x)

`{ "_id": 0, "name": 1, "address": 1 }`

Return only the names and addresses, not the _ids:

In [None]:
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]
for x in mycol.find({},{ "_id": 0, "name": 1, "address": 1 }):
    print(x)

## Querying
Equality

Starts with

`{ "$gt": "S"}`

In [None]:
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

#address greater than S:
myquery = { "address": {"$gt": "S"} }
mydoc = mycol.find(myquery)
for x in mydoc:
    print(x)

Regex

In [None]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

#address starts with S:
myquery = { "address": { "$regex": "^S" } }

mydoc = mycol.find(myquery)

for x in mydoc:
  print(x)
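The `$gt` query and the regex query above are not equivalent. The sketch below imitates both filters in plain Python on a few of the addresses inserted earlier, so it runs without a database. It assumes MongoDB's default string comparison (no collation configured), which orders strings much like Python's `>` does for these ASCII values.

```python
import re

# A subset of the addresses inserted earlier in this notebook.
addresses = [
    "Highway 37", "Apple st 652", "Mountain 21", "Valley 345",
    "Ocean blvd 2", "Sky st 331", "Yellow Garden 2", "Sideway 1633",
]

# Imitates { "address": { "$gt": "S" } }: everything that sorts after "S".
greater_than_s = sorted(a for a in addresses if a > "S")

# Imitates { "address": { "$regex": "^S" } }: only values starting with "S".
starts_with_s = sorted(a for a in addresses if re.match(r"^S", a))

print(greater_than_s)
print(starts_with_s)
```

The `$gt` version also returns "Valley 345" and "Yellow Garden 2", because "V" and "Y" sort after "S".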

`count_documents`
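A minimal sketch of how `count_documents` might be wrapped. The helper name `count_matching` is hypothetical; `collection` is assumed to be a PyMongo collection such as `mycol` from the cells above.

```python
def count_matching(collection, query=None):
    """Return how many documents in `collection` match `query`.

    `collection` is assumed to be a PyMongo collection (for example
    `mycol`). With no query, every document is counted.
    """
    # count_documents always needs a filter; {} matches every document.
    return collection.count_documents(query or {})
```

For example, `count_matching(mycol, {"address": {"$regex": "^S"}})` would count the customers whose address starts with "S".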

`.sort()`

*   -1(Descending order)
*   1 (Ascending order)

In [None]:
# Sort the result reverse alphabetically by name:

import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]
mydoc = mycol.find().sort("name", -1)
for x in mydoc:
  print(x)

## Delete 

`delete_one()`

Delete the document with the address "Mountain 21":

In [None]:
import pymongo
myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]
myquery = { "address": "Mountain 21" }
mycol.delete_one(myquery)

#print the customers collection after the deletion:
for x in mycol.find():
  print(x)

### Exercise
`delete_many`

In [None]:
#delete several documents

## Update

`.update_one(query, new_values_dict)`

In [None]:
# Can you guess this one?

`.update_many`

To update all documents that meet the criteria of the query, use the update_many() method.

In [None]:
import pymongo

myclient = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = myclient["mydatabase"]
mycol = mydb["customers"]

myquery = { "address": { "$regex": "^S" } }

newvalues = { "$set": { "name": "Minnie" } }

x = mycol.update_many(myquery, newvalues)

print(x.modified_count, "documents updated.")

`.drop()`
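`drop()` removes a whole collection, documents and indexes included. One place it can be handy is resetting a collection before re-running the cells in this notebook. The sketch below assumes `collection` is a PyMongo collection such as `mycol`; the helper name `reset_collection` is hypothetical.

```python
def reset_collection(collection, documents):
    """Drop `collection`, then refill it with `documents`.

    Returns the list of inserted _id values (empty if there was
    nothing to insert).
    """
    # Remove the collection together with all of its documents.
    collection.drop()
    # insert_many rejects an empty list, so skip the call in that case.
    if not documents:
        return []
    return collection.insert_many(documents).inserted_ids
```

Calling `reset_collection(mycol, mylist)` before the insert examples would avoid piling up duplicate customers across runs.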


# ***Exercise***

1. Change the address from "Valley 345" to "Canyon 123" using update_one.
2. Delete all documents in a collection. Hint: Use the delete_many method.
3. Insert another record in the "customers" collection, and return the value of the _id field.

# (Optional) Mini-Lab
MongoDB is great for data that does not easily conform to a tabular, SQL-esque structure, for example documents or highly variable JSON.

It is also a good fit when there is a lot of nested data that we don't need to query on but still want to store and access in some form.

Let's go through such a case.

### Part 1: MongoDB as Cache datastore
Create a function `smart_request` that first checks whether we have already made a request, by searching Mongo for the target URL. If the result is already there, return it; if not, make the request and save the result in Mongo.

You'll need to create a collection for this.

### Part 2: Invalidate cache if it has been more than a day
Modify the above function to check WHEN the record was last modified (HINT: you will need to track this date now). If more than a day has elapsed, make the request again and update the record.

You can manually alter the last modified date to a much older date in order to check that your function works.
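If you get stuck, here is one possible shape for Parts 1 and 2, offered as a sketch rather than the expected solution. It assumes the URL is used as the document `_id`, that the timestamp is stored as epoch seconds in a hypothetical `fetched_at` field, and that `fetch` is any callable you supply that takes a URL and returns the response body. The `now` parameter exists only so the expiry logic can be checked without waiting a day.

```python
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # one day

def smart_request(collection, url, fetch, now=None):
    """Return the body for `url`, using `collection` as a cache.

    `collection` is assumed to be a PyMongo collection and `fetch`
    a callable that performs the real request.
    """
    now = time.time() if now is None else now

    # Look for a cached copy keyed by the URL.
    cached = collection.find_one({"_id": url})
    if cached is not None and now - cached["fetched_at"] < MAX_AGE_SECONDS:
        return cached["body"]

    # Cache miss or stale entry: make the request and store the result.
    body = fetch(url)
    collection.replace_one(
        {"_id": url},
        {"_id": url, "body": body, "fetched_at": now},
        upsert=True,  # insert if missing, overwrite if stale
    )
    return body
```

To test the expiry, call the function once, then call it again with `now` set more than a day later and confirm that `fetch` runs a second time.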

### Part 3: Populate our DB with HN API
Write code to call the Hacker News (HN) API and populate an Authors collection, based on the top stories.

### Part 4: API Endpoint
Write an API endpoint (using Bottle or another framework) that returns the 5 authors in our dataset with the most karma points.