<h1>MongoDB and JSON: Using Python to Insert Data</h1>
Data is read from an external JSON file and pulled from the Google Maps API, then inserted into MongoDB.

Before running this code, make sure you've started MongoDB from your shell and connected to it; otherwise, your records won't be added.
<p>On Windows and Mac: https://docs.mongodb.com/manual/mongo/</p>
<p>On Linux: https://dzone.com/articles/mongodb-commands-cheat-sheet-for-beginners</p>

In this file we will:
<ol type = '1'>  
<li>Set things up by bringing in our dependencies and connecting to MongoDB </li>
<li>Insert records using an external JSON file</li>
<li>Insert records using an API call</li>
<li>Look up records by ID</li>
<li>Look up records by string search</li>
</ol>


<h3><u>Setup</u></h3>

In [5]:
#Dependencies

##create tables
import pandas as pd 

##Tool for converting Zip Codes into Latitude and Longitude coordinates
import pgeocode 

##Use API key
from config import gkey, okey 

##API lookup
import requests

##Reading JSONS
import json 

##Connecting and using MongoDB commands in Python
import pymongo 

##Create artificial delay if lookups cause timeouts
import time 

#Makes data readable
from pprint import pprint

In [6]:
#Connect to the MongoDB server running locally on the default port
conn = 'mongodb://localhost:27017'
client = pymongo.MongoClient(conn)

#Reference the "locations_mdb" database and assign it to a variable
#(note: MongoDB creates a database lazily, the first time data is written to it)
db = client.locations_mdb

#Create or switch to a collection
clients = db["clients"]
hotels = db["hotels"]

#Notice that the database and collections don't appear in MongoDB at this stage, because MongoDB does not create them until data is loaded into them. Assigning the collections to variables is optional: they can also be referenced directly, as in db.clients.
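
<p>pymongo's MongoClient connects lazily, so the cell above succeeds even when no server is running; the failure only surfaces on the first read or write. A minimal sketch of an early check, assuming MongoDB listens on localhost:27017 as in the connection string above. The helper name <code>port_is_open</code> is hypothetical, and the check only shows that something accepts connections on that port, not that it is MongoDB.</p>

```python
import socket

def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

if not port_is_open():
    print("Nothing is listening on localhost:27017; start MongoDB first.")
```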

<h2>Method 1: Insert Data into MongoDB from an External JSON file</h2>
<p>Using a randomly generated JSON of fictional clients, we will add records to our collection.</p>

In [7]:
#Read and parse the JSON file (the "with" block closes the file when it finishes)
with open("generated.json", "r") as myfile:
    json_file = json.load(myfile)

#Load into MongoDB
db.clients.insert_many(json_file)

<pymongo.results.InsertManyResult at 0x7ff99c8b2640>
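
<p>insert_many rejects an empty input, so it can help to validate the parsed JSON before sending it to MongoDB. A minimal sketch, assuming the file holds a JSON array of objects; <code>load_documents</code> is a hypothetical helper name, not part of pymongo.</p>

```python
import json

def load_documents(path):
    """Parse a JSON file and return its contents as a list of dicts.

    Raises ValueError if the file is not a non-empty JSON array of objects.
    """
    with open(path, "r", encoding="utf-8") as handle:
        data = json.load(handle)
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON array")
    if not all(isinstance(item, dict) for item in data):
        raise ValueError("every array element must be a JSON object")
    return data
```

<p>With a connected database, this could be used as <code>db.clients.insert_many(load_documents("generated.json"))</code>.</p>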

<h2>Method 2: Insert Data Directly into MongoDB from the API Pull</h2>

In [35]:
# If you are looking up data from an API, you might have a table of values to drive your queries. Here, we will use zip code data to look up the locations of hotels in San Francisco.

# Build the zip code reference table in pandas
zips_df=pd.read_csv('resources/zip_codes.csv')
zips_df.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,Unnamed: 0.1.1,ZIPCode,Type,County,Population,Area Code(s),Latitude,Longitude
0,0,0,0,94102,Standard,San Francisco,31176,415 / 510,37.7813,-122.4167
1,1,1,1,94103,Standard,San Francisco,27170,415 / 510 / 650,37.7725,-122.4147
2,2,2,2,94104,Standard,San Francisco,406,415 / 510 / 650 / 628,37.7915,-122.4018
3,3,3,3,94105,Standard,San Francisco,5846,415 / 510 / 650 / 628,37.7864,-122.3892
4,4,4,4,94107,Standard,San Francisco,26599,415 / 510 / 650,37.7621,-122.3971


In [28]:
#Load data from the Google API into the "hotels" collection in MongoDB

#Save the api_url to a variable
api_url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"

#Search settings shared by every request (radius is in meters)
target_radius = 50000
target_type = "lodging"

#Loop through the zip code data and...
for index, row in zips_df.iterrows():
    #time.sleep(0.2) #--> Uncomment to add a delay if pulling API data too quickly causes timeouts.

    #--> save each lat,long pair to a variable
    target_coordinates = f"{row['Latitude']},{row['Longitude']}"

    #--> set up a parameters dictionary
    params = {
        "location": target_coordinates,
        "radius": target_radius,
        "type": target_type,
        "key": gkey}

    #--> request a JSON using our parameters dictionary
    response = requests.get(api_url, params=params).json()

    #--> insert each record into MongoDB using our connected database and Mongo's "insert_one" function
    #Note: Remember that we created a variable "db" earlier to represent our database. That's where the "db" below comes from.
    #(.get returns an empty list if the response has no "results" key, so a failed request is skipped)
    for hotel in response.get("results", []):
        db.hotels.insert_one(hotel)
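
<p>Because every search uses a 50,000 m radius around zip codes in the same city, neighbouring searches are likely to return many of the same hotels, and insert_one stores each copy. A minimal sketch of removing repeats before inserting, assuming each result carries a <code>place_id</code> field; <code>dedupe_by_place_id</code> is a hypothetical helper and the sample records are made up for illustration.</p>

```python
def dedupe_by_place_id(results):
    """Return results with duplicates removed, keeping the first of each place_id."""
    seen = set()
    unique = []
    for place in results:
        place_id = place.get("place_id")
        if place_id is None:
            # Keep records without an ID rather than guessing whether they repeat.
            unique.append(place)
        elif place_id not in seen:
            seen.add(place_id)
            unique.append(place)
    return unique

# Made-up sample data standing in for the combined "results" lists.
sample = [
    {"place_id": "a1", "name": "Hotel One"},
    {"place_id": "b2", "name": "Hotel Two"},
    {"place_id": "a1", "name": "Hotel One"},
]
unique_hotels = dedupe_by_place_id(sample)
print(len(unique_hotels))  # 2
```

<p>One way to use this is to collect every response's results into a single list inside the loop, then call <code>db.hotels.insert_many(dedupe_by_place_id(all_results))</code> once at the end.</p>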
    

<h3>Retrieving Your Data</h3>
<p>Once you've connected to MongoDB, you can use Python to search and organize the data however you like.</p>

In [62]:
#Retrieve entire collection
records=db.clients.find()
for entry in records:
    pprint(entry)

{'_id': '5fc8c59d0fe2bdfd21c0d22d',
 'about': 'Id reprehenderit ut nostrud anim reprehenderit aliqua irure '
          'occaecat aliquip occaecat aliqua aute ut amet. Officia in ad '
          'voluptate est aliquip quis labore nostrud et ullamco amet occaecat. '
          'Fugiat dolore dolore commodo reprehenderit sit id adipisicing '
          'aliquip mollit anim commodo laboris laborum velit. Elit ea non '
          'laboris elit laborum duis labore fugiat consectetur. Magna nostrud '
          'Lorem labore proident et.\r\n',
 'address': '434 Ash Street, Nutrioso, Connecticut, 9665',
 'age': 38,
 'balance': '$2,823.20',
 'company': 'ZENOLUX',
 'email': 'helenegrimes@zenolux.com',
 'eyeColor': 'blue',
 'favoriteFruit': 'banana',
 'friends': [{'id': 0, 'name': 'Estrada Walker'},
             {'id': 1, 'name': 'Lorena Becker'},
             {'id': 2, 'name': 'Leah Dominguez'}],
 'gender': 'female',
 'greeting': 'Hello, Helene Grimes! You have 10 unread messages.',
 'guid': '1757c76e

In [19]:
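#Retrieve entries by field value:
#The first argument is the filter (which documents to match); the second is the projection (which fields to return, alongside _id).
female_clients=db.clients.find({"gender": "female"}, {"gender": 1})
for entry in female_clients:
    print(entry)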

{'_id': '5fc8c59d0fe2bdfd21c0d22d', 'gender': 'female'}
{'_id': '5fc8c59df2000e6c437bef8b', 'gender': 'female'}
{'_id': '5fc8c59d5fd2c767dd475516', 'gender': 'female'}
{'_id': '5fc8c59d404bfede4557dc80', 'gender': 'female'}
{'_id': '5fc8c59dae099587f4db3628', 'gender': 'female'}
{'_id': '5fc8c59d3fdd41504485c2ff', 'gender': 'female'}
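
<p>Looking up a single record by ID works the same way: the ID goes in the filter. A minimal sketch, assuming the <code>_id</code> values are plain strings as in the output above (they come from the JSON file rather than being ObjectId values generated by MongoDB). <code>find_client_by_id</code> is a hypothetical helper that accepts any collection object with pymongo's <code>find_one</code> method.</p>

```python
def find_client_by_id(collection, client_id):
    """Return the document whose _id equals client_id, or None if there is no match."""
    return collection.find_one({"_id": client_id})
```

<p>For example: <code>pprint(find_client_by_id(db.clients, "5fc8c59d0fe2bdfd21c0d22d"))</code>.</p>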


In [74]:
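#Retrieve entries by a range of values on any field:
#(the cursor gets its own name so it doesn't overwrite the "clients" collection variable from the setup)
all_clients=db.clients.find()
for person in all_clients:
    #Balances are stored as strings like "$2,823.20", so strip the formatting before comparing
    balance=person["balance"]
    balance=balance.replace("$","")
    balance=balance.replace(",","")
    balance=float(balance)
    if balance > 2000:
        name=person["name"]
        age=person["age"]
        print(f"Name:{name}, Age:{age}")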

Name:Helene Grimes, Age:38
Name:Lamb Bird, Age:26
Name:Mathews Maldonado, Age:30
Name:Nona Mcfadden, Age:23
Name:Noble Newman, Age:39
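
<p>The balance clean-up above can be pulled into small reusable functions. A minimal sketch, assuming every balance is a string formatted like "$2,823.20"; <code>parse_balance</code> and <code>clients_above</code> are hypothetical helper names, and the second sample record is made up for illustration.</p>

```python
def parse_balance(text):
    """Convert a currency string such as '$2,823.20' to a float."""
    return float(text.replace("$", "").replace(",", ""))

def clients_above(people, threshold):
    """Return (name, age) pairs for people whose balance exceeds threshold."""
    return [
        (person["name"], person["age"])
        for person in people
        if parse_balance(person["balance"]) > threshold
    ]

# The first record mirrors the output above; the second is made up.
sample_people = [
    {"name": "Helene Grimes", "age": 38, "balance": "$2,823.20"},
    {"name": "Example Person", "age": 41, "balance": "$1,250.00"},
]
print(clients_above(sample_people, 2000))  # [('Helene Grimes', 38)]
```

<p>With a live connection, <code>clients_above(db.clients.find(), 2000)</code> would run the same filter over the collection. If balances were stored as numbers instead of strings, MongoDB could do the comparison itself with a filter such as <code>{"balance": {"$gt": 2000}}</code>.</p>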
