# Mongo tutorial

JSON - https://www.json.org/json-en.html

MongoDB stores documents as BSON (binary JSON), which you read and write much like ordinary JSON. You can easily check whether JSON is formatted properly with a validator like this one:
https://jsonlint.com/

JSON is pretty easy to get used to if you haven't used it before, and it's quite relevant outside of MongoDB. Fun fact: Python dictionaries map almost one-to-one onto JSON objects, so pymongo accepts plain dictionaries as documents and queries without any extra conversion.
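
To make the dictionary/JSON relationship concrete, here is a minimal sketch using Python's standard `json` module. The sample record is made up for illustration, and the last part shows that a dictionary is only valid JSON when its values have a JSON equivalent.

```python
import json

# A made-up record for illustration; keys mirror the style used later in this tutorial.
person = {"First Name": "Ada", "State": "IL", "Tags": ["night shift", "manager"]}

# Serialize the dictionary to a JSON string...
as_json = json.dumps(person)

# ...and parse it back into an equal dictionary.
round_tripped = json.loads(as_json)
print(round_tripped == person)

# Not every dictionary is valid JSON: a set, for example, has no JSON equivalent.
try:
    json.dumps({"States": {"IL", "TX"}})
except TypeError as err:
    print("not JSON serializable:", err)
```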

Gmail credentials below

  **gmail address:** bradleyscheduler@gmail.com
  
  **password:** gotoBRADLEY
  
  **mongo website:** https://cloud.mongodb.com/

I've whitelisted all IPs.

You may get an alert about a firewall exception for Python. Allow it: you are connecting to a cloud-based database.

Mongo documentation:

I'll provide a few code samples, but you will want to take a brief moment to learn the syntax behind querying in MongoDB.

https://docs.mongodb.com/manual/introduction/

https://docs.mongodb.com/manual/tutorial/query-documents/

https://docs.mongodb.com/manual/reference/method/db.collection.count/

https://docs.mongodb.com/manual/tutorial/insert-documents/

In [None]:
# In Jupyter you can run this cell as-is; in a terminal, remove the leading !
# You only need to do this once (just to install)
# IMPORTANT: install pymongo[srv] as shown below, not plain pymongo (the srv extra pulls in the DNS support that mongodb+srv:// connection strings need)
!pip3 install 'pymongo[srv]'

In [None]:
# Install Faker to generate fake data
# I've decided we can replicate the same structure as QPI if we want, but we are starting from scratch
# on the off chance that an actual employee's information was captured (I'm taking no chances)
!pip3 install Faker

In [None]:
from faker import Faker
fake = Faker()  # generator instance used by get_fake_person() below

In [None]:
from pymongo import MongoClient
import json

In [None]:
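client = MongoClient("mongodb+srv://bob:bob@bradleyschedulerapplica.s3n3e.mongodb.net/myFirstDatabase?retryWrites=true&w=majority")
# MongoClient connects lazily, so creating it does not prove the server is reachable.
# Send a ping to confirm the connection actually works.
try:
    client.admin.command("ping")
    print("successful connection")
except Exception as err:
    print(":(", err)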
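
The connection string above embeds the username and password directly. One possible alternative, sketched here as an assumption rather than how this project is set up, is to read them from environment variables and escape them with `urllib.parse.quote_plus`. The variable names `MONGO_USER` and `MONGO_PASSWORD` and the helper `build_mongo_uri` are hypothetical.

```python
import os
from urllib.parse import quote_plus

def build_mongo_uri(user, password, host, database):
    # Percent-escape the credentials so characters like @ or : cannot break the URI
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/{database}?retryWrites=true&w=majority"
    )

# Hypothetical environment variable names; the fallback values are placeholders
uri = build_mongo_uri(
    os.environ.get("MONGO_USER", "some_user"),
    os.environ.get("MONGO_PASSWORD", "p@ss:word"),
    "bradleyschedulerapplica.s3n3e.mongodb.net",
    "myFirstDatabase",
)

# The result can be passed to MongoClient in place of the literal string:
# client = MongoClient(uri)
```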

In [None]:
def nuke_mongo(mongo_client):
    databases = mongo_client.list_database_names()
    print("Current database count:", len(databases))
    for db in databases:
        # admin and local are system databases that persist after a delete; skip them
        if db not in ("admin", "local"):
            print("dropping...", db)
            # Use the client that was passed in, not the global one
            mongo_client.drop_database(db)
    databases = mongo_client.list_database_names()
    print("Current database count:", len(databases))

nuke_mongo(client)

In [None]:
def get_fake_person(seed=None):
    # Seeding makes the generated person reproducible for a given seed
    if seed is not None:
        Faker.seed(seed)
    return {
        "First Name": fake.first_name(),
        "Last Name": fake.last_name(),
        "Phone Number": fake.phone_number(),
        "Email": fake.email(),
        "Street": fake.street_address(),
        "City": fake.city(),
        "State": fake.state_abbr(),
        "Zipcode": fake.zipcode(),
    }

In [None]:
mydb = client["application_database"]
my_collection = mydb["employees"]

In [None]:
# Add 1337 fake people to the database
def load_fake_data(client):
    people=[get_fake_person(i) for i in range(1,1338)]
    mydb = client["application_database"]
    my_collection = mydb["employees"]
    my_collection.insert_many(people)
load_fake_data(client)

In [None]:
# Find all documents in a collection, equivalent to SELECT * FROM a table in SQL
# You will notice there is a field _id: Mongo adds it automatically, and it must be unique within the collection
cursor = my_collection.find({})

In [None]:
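# limit(10) asks the server for only the first 10 documents instead of fetching them all
for i in my_collection.find({}).limit(10):
    print(i)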

In [None]:
# Query to find all people who live in Illinois or Texas (using state codes)
results = my_collection.find({"$or": [{"State":"IL"}, {"State":"TX"}] } )

In [None]:
# Distinct states among the matched documents (should be only IL and TX)
results.distinct(key="State")

In [None]:
# Get the distinct states for all fake people
my_collection.distinct("State")

In [None]:
# Count all documents
my_collection.count_documents({})

In [None]:
# Count only documents that match a query
my_collection.count_documents({"$or": [{"State":"IL"}, {"State":"TX"}] })
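
When every branch of an `$or` tests the same field, MongoDB's `$in` operator expresses the same filter more compactly. The helper name `states_filter` below is made up for illustration; it only builds the filter dictionary, so it runs without a database connection.

```python
def states_filter(states):
    # Matches documents whose "State" equals any value in the list
    return {"State": {"$in": list(states)}}

query = states_filter(["IL", "TX"])
print(query)

# With the collection from this tutorial, this should count the same documents
# as the $or query:
# my_collection.count_documents(query)
```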

In [None]:
# TX only
my_collection.count_documents({"State":"TX"})

In [None]:
# IL only
my_collection.count_documents({"State":"IL"})

In [None]:
# All people who have a first name starting with "P"
my_collection.count_documents({"First Name":{"$regex": "^P"}})
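
The pattern `^P` is anchored to the start of the string, so it matches only names that begin with a capital P. As a rough local illustration, the sketch below applies the same pattern to a few made-up names. Python's `re` module stands in for MongoDB's regex engine here, on the assumption that both treat a simple pattern like this the same way.

```python
import re

# Made-up names for illustration
names = ["Paul", "Stephanie", "Priya", "philip", "Joseph"]
pattern = re.compile("^P")

# Keep only the names that start with a capital P
starts_with_p = [name for name in names if pattern.search(name)]
print(starts_with_p)
```

To also match lowercase names such as "philip", the MongoDB query can add `"$options": "i"` next to `"$regex"` for a case-insensitive match.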

In [None]:
# Find all people in Illinois with a last name starting with "S"
results = my_collection.find({"$and": [{"State":"IL"}, {"Last Name":{"$regex": "^S"}}]})

for person in results:
    print(person)
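
Listing several fields in one filter document is an implicit AND, so the explicit `$and` is optional for a query like this. The sketch below uses made-up sample documents and filters them locally in plain Python, only to show which records the shorter query is meant to select.

```python
import re

# Equivalent filter without $and: every listed condition must hold
query = {"State": "IL", "Last Name": {"$regex": "^S"}}

# Made-up documents for illustration
people = [
    {"Last Name": "Smith", "State": "IL"},
    {"Last Name": "Stone", "State": "TX"},
    {"Last Name": "Jones", "State": "IL"},
]

# Local stand-in for what the server does with the query above
matches = [
    p for p in people
    if p["State"] == query["State"]
    and re.search(query["Last Name"]["$regex"], p["Last Name"])
]
print(matches)

# Against the real collection the same filter would be used as:
# my_collection.find(query)
```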