<h1 style='text-align:center'>NoSQL - Not Only SQL</h1>

#### What's wrong with SQL? 

- SQL offers a ton of structure for storing data 
    - That structure requires data to arrive in a fixed shape (in other words, your data must have structure) 
    - Enforcing that structure comes at the cost of speed 
    
    
- SQL structure is very rigid: if you want to change the schema, you have to migrate all of your existing data to match the new schema 


- Large data requires distributed computing (many computers working together to accomplish the same task), and executing joins across machines is a very complex problem in relational databases. 
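To make the rigidity point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `people` table and its columns are made up for illustration, and other relational databases may behave somewhat differently. A row carrying a field the schema does not know about is rejected until the schema itself is changed.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.execute("INSERT INTO people (name, age) VALUES (?, ?)", ("John Doe", 28))

# A row with a column the schema does not define is rejected.
try:
    conn.execute(
        "INSERT INTO people (name, age, address) VALUES (?, ?, ?)",
        ("Jane Doe", 30, "123 Elm Street"),
    )
    insert_rejected = False
except sqlite3.OperationalError as err:
    insert_rejected = True
    print("Insert failed:", err)

# The schema has to change first; existing rows get NULL in the new column.
conn.execute("ALTER TABLE people ADD COLUMN address TEXT")
conn.execute(
    "INSERT INTO people (name, age, address) VALUES (?, ?, ?)",
    ("Jane Doe", 30, "123 Elm Street"),
)
rows = conn.execute(
    "SELECT name, age, address FROM people ORDER BY name"
).fetchall()
print(rows)
```

In this sketch the older row ends up with an empty `address`, which is the kind of backfilling a schema change forces you to think about.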

#### What does NoSQL offer? 

- Schemaless: the number of fields, the content, and the size of a data object can differ from one data object to another.
- You can store virtually any kind of data.
- Structure of a single object is clear.
- No complex joins.
- To scale up and handle more queries, just add more machines.
- You can change the schema of your database on the fly.
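As a rough illustration of "schemaless", the sketch below models a collection as a plain Python list of dicts, with no database involved. The `favorite_color` field is a hypothetical addition used only to show a new field appearing without any migration.

```python
# A "collection" modelled as a list of dicts; each dict is one document.
collection = [
    {"name": "John Doe", "age": 28, "children": ["Jane", "Joe"]},
    {"name": "FangFang", "address": "Somewhere"},
    {"name": "Vishal"},
]

# Adding a brand-new field needs no migration: only this document has it.
collection.append({"name": "TayTay", "favorite_color": "blue"})

# Every document can expose a different set of fields.
fields_per_document = [sorted(doc) for doc in collection]
all_fields = sorted(set().union(*(doc.keys() for doc in collection)))

# Code that reads the data has to tolerate missing fields.
ages = [doc.get("age") for doc in collection]

print(fields_per_document)
print(all_fields)
print(ages)
```

The flip side of this flexibility is visible in the last step: since nothing guarantees a field exists, readers of the data must handle its absence.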

#### Types of NoSQL Databases

<img style='width: 400px' src='images/nosql-types.png'>

<b>Document databases</b> pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.

<div   style='clear: both; display: table;'>
    <div style='float:left; size: 250px'>
        <img  style='align: center; width:150px' src='images/mongodb.png' /></div>
    <div style='float:left; size: 250px'>
        <img style='align: center;' src='images/couchdb.png' /></div>
    <div style= 'float: left; width: 250px'>
        <img style='align: center; width: 200px' src='images/documentdb.png' /></div>
</div>
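A document of this kind maps naturally onto a Python dict. The sketch below is only an illustration using the standard `json` module, not how any particular database stores data internally. It shows a key-value pair, a key-array pair, and a nested document surviving a round trip through JSON text.

```python
import json

document = {
    "title": "MongoDB Lab",                              # key-value pair
    "topics": ["mongodb", "database", "NoSQL", "JSON"],  # key-array pair
    "author": {                                          # nested document
        "name": "Vishal Patel",
        "building": "11 Broadway",
    },
}

# Serialize to JSON text and read it back; the structure is preserved.
as_json = json.dumps(document, indent=2)
restored = json.loads(as_json)

print(as_json)
print(restored["author"]["name"])
```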

<b>Graph stores</b> are used to store information about networks of data, such as social connections. Graph stores include Neo4j and Giraph.

<div style='clear: both; display: table;'>
    <div style='float:left; size: 250px'>
        <img  style='align: center; width:150px' src='images/ApacheGiraph.svg' /></div>
    <div style='float:left; size: 250px'>
        <img style='align: center;' src='images/neo4j.png' /></div>
</div>
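To give a feel for the kind of question a graph store is built to answer, here is a toy sketch in plain Python. The people and connections are invented, and a real graph database would use its own query language rather than this hand-written breadth-first search.

```python
from collections import deque

# Hypothetical social network: person -> set of direct connections.
friends = {
    "Ana": {"Ben", "Cal"},
    "Ben": {"Ana", "Dee"},
    "Cal": {"Ana"},
    "Dee": {"Ben", "Eli"},
    "Eli": {"Dee"},
}


def degrees_of_separation(graph, start, goal):
    """Breadth-first search: number of hops from start to goal, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == goal:
            return hops
        for neighbor in graph.get(person, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return None


def friends_of_friends(graph, person):
    """People exactly two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    second = set()
    for friend in direct:
        second |= graph.get(friend, set())
    return second - direct - {person}


print(degrees_of_separation(friends, "Ana", "Eli"))
print(friends_of_friends(friends, "Ana"))
```

Expressing the same traversal with relational joins would need one self-join per hop, which is part of why connection-heavy data is often kept in a graph store.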

<b>Key-value stores</b> are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or 'key'), together with its value. Examples of key-value stores are Riak and Berkeley DB. 

<div   style='clear: both; display: table;'>
    <div style='float:left; size: 250px'>
        <img  style='align: center; width:150px' src='images/riak.png' /></div>
    <div style='float:left; size: 250px'>
        <img style='align: center;' src='images/dynamodb.jpeg' /></div>
</div>
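The key-value model is close to a Python dict. The class below is a toy, in-memory sketch (the class name, methods, and key naming scheme are assumptions for illustration) and leaves out everything that makes real stores such as Riak useful, like persistence and replication.

```python
class KeyValueStore:
    """Toy in-memory key-value store: every item is a key plus its value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store a value under a key, overwriting any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or the default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed, False otherwise."""
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1:name", "John Doe")
store.put("user:1:age", 28)

print(store.get("user:1:name"))
print(store.get("user:2:name", "unknown"))
```

Note that the only way in is through the key: there is no built-in way to ask "which users are 28?" without scanning every entry.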

<b>Wide-column stores</b> such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.

<img src='images/widcolumn.jpg'/>
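The difference between storing rows together and storing columns together can be sketched with plain Python structures. This is a simplified picture with made-up ages and cities, and real wide-column stores such as Cassandra and HBase organize data in more elaborate ways.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"name": "John Doe", "age": 28, "city": "New York"},
    {"name": "FangFang", "age": 30, "city": "Boston"},
    {"name": "Vishal", "age": 35, "city": "New York"},
]

# Column-oriented layout: each field keeps all of its values together.
columns = {field: [row[field] for row in rows] for field in rows[0]}

# Averaging one field reads a single list instead of every full record.
average_age = sum(columns["age"]) / len(columns["age"])

print(columns)
print(average_age)
```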

#### What is MongoDB?

MongoDB stores data in flexible, JSON-like documents, meaning fields can vary from document to document and the data structure can change over time.

<b>Data Structure</b>

Single Entry = Document

` { 
  _id: ObjectId(8af37bd7891c), 
  title: 'MongoDB Lab',
  description: 'Introductory lab on how to use MongoDB',
  by: 'Flatiron School',
  topics: ['mongodb', 'database', 'NoSQL', 'JSON']  
   } `

You can embed documents inside documents! 

<img src ='images/househouse.gif' />

`{ 
  _id: ObjectId(8af37bd78ssc), 
  title: 'Other Lab',
  description: 'Introductory lab on how to use something',
  by: 'Flatiron School',
  topics: ['blah', 'blah', 'blah', 'blah'],
  author: {
          _id: ObjectId(83928shkjw183),
          name: 'Vishal Patel',
          building: '11 Broadway'
          }
   }`
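Fields inside an embedded document are commonly addressed with a dotted path; MongoDB queries use this dot notation, for example a filter such as `{'author.name': 'Vishal Patel'}`. The helper below, `get_path`, is a hypothetical pure-Python function written for this illustration to show what such a path resolves to. It is not part of pymongo.

```python
def get_path(document, dotted_path, default=None):
    """Follow a dotted path like 'author.name' through nested dicts."""
    current = document
    for key in dotted_path.split("."):
        # Stop as soon as the path leaves the nested-dict structure.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


lab = {
    "title": "Other Lab",
    "by": "Flatiron School",
    "author": {
        "name": "Vishal Patel",
        "building": "11 Broadway",
    },
}

print(get_path(lab, "author.name"))
print(get_path(lab, "author.phone", "not recorded"))
```

Because the author travels with the lab, reading the lab and its author takes one lookup rather than two.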

##### Why would we want to nest objects? 

Multiple Documents = Collection

` { 
  _id: ObjectId(8af37bd7891c), 
  title: 'MongoDB Lab',
  description: 'Introductory lab on how to use MongoDB',
  by: 'Flatiron School',
  topics: ['mongodb', 'database', 'NoSQL', 'JSON']  
   }, 
{ 
  _id: ObjectId(8af37bd78ssc), 
  title: 'Other Lab',
  description: 'Introductory lab on how to use something',
  by: 'Flatiron School',
  topics: ['blah', 'blah', 'blah', 'blah']  
   }
`

#### Working with MongoDB

This section assumes you have installed and set up MongoDB, that the server is running, and that you have installed pymongo with pip.

In [None]:
import pymongo

In [None]:
#connect to your server - it should be running at 'mongodb://127.0.0.1:27017/'
myclient = pymongo.MongoClient("mongodb://127.0.0.1:27017/")

#grab a database from your server 
mydb = myclient['example_data']

#this can be a new database or an existing one (if it doesn't exist, it will be
# created when you first write data into it)

In [None]:
myclient.list_database_names()

In [None]:
#initialize an empty collection - this is where your 'documents' will go
mycollection = mydb['example_collection']

In [None]:
example_data = {'name': 'John Doe', 'address': '123 elm street', 'age': 28, 'children': ['Jane', 'Joe']}
mycollection.insert_one(example_data)

In [None]:
#get all the documents in a collection
query = mycollection.find({})

In [None]:
for document in query:
    print(document)

In [None]:
example_data_2 = [{ 'name': 'FangFang', 'address': 'Somewhere'}, 
                  {'name':'Vishal'}, 
                  {'name' : 'TayTay', 'address': 'everywhere'}
                 ]
mycollection.insert_many(example_data_2)

In [None]:
query_1 = mycollection.find({})

In [None]:
for document in query_1:
    print(document)

In [None]:
query_2 = mycollection.find({'name': 'John Doe'})

In [None]:
for document in query_2:
    print(document)

In [None]:
#searching for a record
query_3 = mycollection.find({'name': 'John Doe'})

In [None]:
for document in query_3:
    print(document)

In [None]:
#updating records is super easy! 
record_to_update = {'name' : 'John Doe'}
update_1 = {'$set': {'age': 29, 'birthday': '2/8/1990'}}

mycollection.update_many(record_to_update, update_1)

In [None]:
#searching in a list in a document
query_4 = mycollection.find({'children': 'Jane'})
for item in query_4:
    print(item)
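As a rough mental model of the equality filters used so far, the sketch below re-implements the matching in plain Python. The `matches` function is a hypothetical helper, not pymongo code, and it ignores query operators and edge cases that the real server handles. It does capture the behavior relied on above: an empty filter matches everything, and a scalar matches a list field if the list contains it.

```python
def matches(document, query):
    """Rough model of an equality filter: every field in query must match."""
    for field, wanted in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(value, list) and not isinstance(wanted, list):
            # A scalar matches a list field when the list contains it.
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


documents = [
    {"name": "John Doe", "address": "123 elm street", "age": 28,
     "children": ["Jane", "Joe"]},
    {"name": "FangFang", "address": "Somewhere"},
    {"name": "Vishal"},
    {"name": "TayTay", "address": "everywhere"},
]


def find(query):
    """Return the names of all documents that satisfy the filter."""
    return [doc["name"] for doc in documents if matches(doc, query)]


print(find({}))
print(find({"name": "John Doe"}))
print(find({"children": "Jane"}))
```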

In [None]:
#removing a key:value from a document
update_2 = {'$unset': {'birthday': ''}}

mycollection.update_many(record_to_update, update_2)

In [None]:
query_5 = mycollection.find({'name': 'John Doe'})
for item in query_5:
    print(item)
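The effect of `$set` and `$unset` on a single document can be pictured with a small pure-Python model. `apply_update` is a hypothetical helper for illustration only, and it covers just these two operators on top-level fields. One detail worth noticing is that unsetting a field that does not exist, for instance because its name is misspelled, changes nothing and raises no error.

```python
import copy


def apply_update(document, update):
    """Return a copy of document with '$set' and '$unset' applied."""
    updated = copy.deepcopy(document)
    # '$set' adds new fields and overwrites existing ones.
    for field, value in update.get("$set", {}).items():
        updated[field] = value
    # '$unset' removes fields; the values in the spec are ignored.
    for field in update.get("$unset", {}):
        updated.pop(field, None)
    return updated


john = {"name": "John Doe", "address": "123 elm street", "age": 28}

after_set = apply_update(john, {"$set": {"age": 29, "birthday": "2/8/1990"}})
after_typo = apply_update(after_set, {"$unset": {"brithday": ""}})
after_unset = apply_update(after_set, {"$unset": {"birthday": ""}})

print(after_set)
print(after_typo == after_set)
print(after_unset)
```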

In [None]:
#delete record
mycollection.delete_one({'name' : 'John Doe'})

In [None]:
#delete all records
mycollection.delete_many({})

#### Your Turn

Open password.py and enter your Instagram username and password (remember to remove your credentials or add the file to your .gitignore before pushing up to GitHub). 

In [None]:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

from password import *

In [None]:
from selenium.common.exceptions import TimeoutException

driver = webdriver.Chrome()
driver.get("https://www.instagram.com/accounts/login/")
# the first two inputs in the login form are the username and password fields
form_inputs = driver.find_elements(By.CSS_SELECTOR, 'form input')
email_input = form_inputs[0]
password_input = form_inputs[1]
email_input.send_keys(username)
password_input.send_keys(password)
login = driver.find_element(By.XPATH, '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[4]/button')
login.click()
try:
    # dismiss the "Not Now" prompt if it shows up within 15 seconds
    not_now = WebDriverWait(driver, 15).until(
        lambda d: d.find_element(By.XPATH, '//button[text()="Not Now"]')
    )
    not_now.click()
except TimeoutException:
    # the prompt never appeared, so carry on
    pass
driver.get("https://www.instagram.com/explore/tags/puppies/")
# grab the HTML of the image grid before closing the browser
grid = driver.find_element(By.XPATH, '//*[@id="react-root"]/section/main/article/div[1]/div/div')
html = grid.get_attribute('innerHTML')
driver.close()
# name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(html, 'html.parser')

The variable `soup` now contains a BeautifulSoup object holding all the HTML elements of the image grid on Instagram. Loop over this object and store each image URL and its category text in your MongoDB database. 

In [None]:
#the code below accesses the image src (url) and the description (alt text) of the first
#element (an image) in your 'soup'
first_image = soup.find_all('img')[0]
first_image['src'], first_image['alt']

In [None]:
myclient = pymongo.MongoClient("mongodb://127.0.0.1:27017/")
insta_db = myclient['insta_db']
#insert your data into this collection
insta_collection = insta_db['insta_collection']
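One possible shape for the loop, sketched on a tiny hard-coded HTML snippet rather than the real Instagram markup (which will differ and changes over time). The URLs, alt texts, and the `image_url` and `description` field names are all assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the grid HTML scraped above.
sample_html = """
<div>
  <img src="https://example.com/puppy1.jpg" alt="A puppy on grass">
  <img src="https://example.com/puppy2.jpg" alt="Two puppies playing">
  <img src="https://example.com/no-alt.jpg">
</div>
"""
sample_soup = BeautifulSoup(sample_html, "html.parser")

# Build one document per image; tolerate images without alt text.
documents = [
    {"image_url": img["src"], "description": img.get("alt", "")}
    for img in sample_soup.find_all("img")
    if img.get("src")
]

for doc in documents:
    print(doc)
```

With the real `soup` in place of `sample_soup`, a list like `documents` could then be written in one call with `insta_collection.insert_many(documents)`, provided the list is not empty.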