<h1>Working With JSON Data in Python</h1>

<p>Since its inception, JSON has quickly become the de facto standard for information exchange. Chances are you’re here because you need to transport some data from here to there. Perhaps you’re gathering information through an API or storing your data in a document database. One way or another, you’re up to your neck in JSON, and you’ve got to Python your way out.</p>

<p> Luckily, this is a pretty common task, and—as with most common tasks—Python makes it almost disgustingly easy. Have no fear, fellow Pythoneers and Pythonistas. This one’s gonna be a breeze!</p>

<p>So, we use JSON to store and exchange data? Yup, you got it! It’s nothing more than a standardized format the community uses to pass data around. Keep in mind, JSON isn’t the only format available for this kind of work, but XML and YAML are probably the only other ones worth mentioning in the same breath.</p>

<a href="https://realpython.com/python-json/">Source for the information on this page: https://realpython.com/python-json/</a>


<h2> A (Very) Brief History of JSON </h2>

<p>Not so surprisingly, JavaScript Object Notation was inspired by a subset of the JavaScript programming language dealing with object literal syntax. They’ve got a nifty website that explains the whole thing. Don’t worry though: JSON has long since become language agnostic and exists as its own standard, so we can thankfully avoid JavaScript for the sake of this discussion.</p>

<p>Ultimately, the community at large adopted JSON because it’s easy for both humans and machines to create and understand.</p>

<h2>Look, it’s JSON!</h2>

<p>
Get ready. I’m about to show you some real life JSON—just like you’d see out there in the wild. It’s okay: JSON is supposed to be readable by anyone who’s used a C-style language, and Python is a C-style language…so that’s you! </p>

In [4]:
a= {
    "firstName": "Jane",
    "lastName": "Doe",
    "hobbies": ["running", "sky diving", "singing"],
    "age": 35,
    "children": [
        {
            "firstName": "Alice",
            "age": 6
        },
        {
            "firstName": "Bob",
            "age": 8
        }
    ]
}
print (type(a))
print (a)


<class 'dict'>
{'firstName': 'Jane', 'lastName': 'Doe', 'hobbies': ['running', 'sky diving', 'singing'], 'age': 35, 'children': [{'firstName': 'Alice', 'age': 6}, {'firstName': 'Bob', 'age': 8}]}



<p>As you can see, JSON supports primitive types, like strings and numbers, as well as nested lists and objects.</p>

<p><b><font style="background-color:yellow">Wait, that looks like a Python dictionary! I know, right? It’s pretty much universal object notation at this point, but I don’t think UON rolls off the tongue quite as nicely. Feel free to discuss alternatives in the comments.</font></b></p>

<p>Whew! You survived your first encounter with some wild JSON. Now you just need to learn how to tame it.</p>

<h2>Python Supports JSON Natively!</h2>

<p>Python comes with a built-in package called json for encoding and decoding JSON data.</p>

<p>Just throw this little guy up at the top of your file:</p>


In [2]:
import json

<h2>A Little Vocabulary</h2>

<p>The process of encoding JSON is usually called <font style="color:blue"><b><i>serialization</i></b></font>. This term refers to the transformation of data into a series of bytes (hence serial) to be stored or transmitted across a network. You may also hear the term <font style="color:blue"><b><i>marshaling</i></b></font>, but that’s a whole other discussion. Naturally, <font style="color:blue"><b><i>deserialization</i></b></font> is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.</p>

<p><font style="background-color:tan">Yikes! That sounds pretty technical. Definitely. But in reality, all we’re talking about here is reading and writing. Think of it like this: encoding is for writing data to disk, while decoding is for reading data into memory.</font></p>

<h2>Serializing JSON - json.dump() and json.dumps()</h2>

<p>What happens after a computer processes lots of information? It needs to take a data <b>dump</b>. Accordingly, the json library exposes the dump() method for writing data to files. There is also a <b>dumps()</b> method (pronounced as “dump-s”) for writing to a Python string.</p>

<p>Simple Python objects are translated to JSON according to a fairly intuitive conversion.</p>


<table>
<tr>
    <th>Python</th><th>JSON</th>
</tr>
<tr>
    <td>dict</td><td>object</td>
</tr>
<tr>
<td>list, tuple</td><td>	array</td>
</tr>
<tr>
<td>str	</td><td>string</td>
</tr>
<tr>
<td>int, float</td><td>number</td>
</tr>
<tr>
<td>True	</td><td>true</td>
</tr>
<tr>
<td>False</td><td>	false</td>
</tr>
<tr>
<td>None</td><td>	null</td>
</tr>
</table>
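<p>As a quick check of the table above, the sketch below serializes one value of each Python type. The dictionary and its key names are made up for illustration.</p>

```python
import json

# One made-up value for each Python type in the conversion table.
sample = {
    "a_dict": {"nested": 1},
    "a_list": [1, 2],
    "a_tuple": (3, 4),
    "a_str": "text",
    "an_int": 7,
    "a_float": 1.5,
    "a_true": True,
    "a_false": False,
    "a_none": None,
}

# dumps() returns the JSON text as a Python str.
encoded = json.dumps(sample)
print(encoded)
```

<p>In the printed text, the tuple comes out as a JSON array, and True, False, and None become true, false, and null.</p>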

<h2>A Simple Serialization Example</h2>

<p>Imagine you’re working with a Python object in memory that looks a little something like this:</p>

In [7]:
data = {
    "president": {
        "name": "Zaphod Beeblebrox",
        "species": "Betelgeusian"
    }
}

<p>It is critical that you save this information to disk, so your mission is to write it to a file.</p>

<p>Using Python’s context manager, you can create a file called data_file.json and open it in write mode. (JSON files conveniently end in a .json extension.)</p>

In [8]:
with open("data_file.json", "w") as write_file:
    json.dump(data, write_file)

<p>Note that dump() takes two positional arguments: (1) the data object to be serialized, and (2) the file-like object to which the JSON text will be written.</p>

<p>Or, if you were so inclined as to continue using this serialized JSON data in your program, you could write it to a native Python str object.</p>

In [10]:
json_string = json.dumps(data)
print (type(json_string))
print (json_string)

<class 'str'>
{"president": {"name": "Zaphod Beeblebrox", "species": "Betelgeusian"}}


<p>Notice that the file-like object is absent since you aren’t actually writing to disk. Other than that, dumps() is just like dump().</p>
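<p>To see that the two functions really do produce the same text, here is a minimal sketch that hands dump() an in-memory file-like object instead of a file on disk. Using io.StringIO for this is an illustrative choice, not something the original example does.</p>

```python
import io
import json

data = {
    "president": {
        "name": "Zaphod Beeblebrox",
        "species": "Betelgeusian"
    }
}

# io.StringIO behaves like a text file held in memory,
# so dump() can write to it just as it would to an open file.
buffer = io.StringIO()
json.dump(data, buffer)

# Compare what dump() wrote with what dumps() returns.
same_text = buffer.getvalue() == json.dumps(data)
print(same_text)  # prints True
```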

<p>Hooray! You’ve birthed some baby JSON, and you’re ready to release it out into the wild to grow big and strong.</p>

<h2>Some Useful Keyword Arguments</h2>

<p>Remember, JSON is meant to be easily readable by humans, but readable syntax isn’t enough if it’s all squished together. Plus you’ve probably got a different programming style than me, and it might be easier for you to read code when it’s formatted to your liking.</p>

<font style="background-color:yellow"><b>NOTE:</b> Both the dump() and dumps() methods use the same keyword arguments.</font>

<p>
The first option most people want to change is whitespace. You can use the indent keyword argument to specify the indentation size for nested structures. Check out the difference for yourself by using data, which we defined above, and running the following commands in a console:
</p>

In [11]:
print(json.dumps(data))
print(json.dumps(data, indent=4))

{"president": {"name": "Zaphod Beeblebrox", "species": "Betelgeusian"}}
{
    "president": {
        "name": "Zaphod Beeblebrox",
        "species": "Betelgeusian"
    }
}


<p>Another formatting option is the separators keyword argument. By default, this is a 2-tuple of the <b>separator</b> strings <b> (", ", ": ")</b>, but a common alternative for compact JSON is <b>(",", ":")</b>. Take a look at the sample JSON again to see where these separators come into play.</p>

<p>There are others, like sort_keys, which sorts each dictionary’s keys in the output. You can find a whole list in the docs if you’re curious.</p>
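<p>Here is a small sketch of both keyword arguments in action. The dictionary is the same president example, with its keys deliberately written out of alphabetical order so that the effect of sort_keys (which orders keys alphabetically) is visible.</p>

```python
import json

# Keys are intentionally out of alphabetical order.
data = {"president": {"species": "Betelgeusian", "name": "Zaphod Beeblebrox"}}

# Compact separators drop the space after each comma and colon.
compact = json.dumps(data, separators=(",", ":"))

# sort_keys=True writes each dictionary's keys in sorted order.
sorted_output = json.dumps(data, sort_keys=True)

print(compact)
print(sorted_output)
```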

<h2>Deserializing JSON - json.load() and json.loads()</h2>

<p>Great, looks like you’ve captured yourself some wild JSON! Now it’s time to whip it into shape. In the json library, you’ll find <b>load()</b> and <b>loads()</b> for turning JSON encoded data into Python objects.</p>

<p>Just like serialization, there is a simple conversion table for deserialization, though you can probably guess what it looks like already.</p>

<table>
<tr>
    <th>JSON</th><th>Python</th>
</tr>
<tr>
    <td>object</td><td>dict</td>
</tr>
<tr>
<td>	array</td><td>list</td>
</tr>
<tr>
<td>string</td><td>str	</td>
</tr>
<tr>
<td>	number(int)</td><td>int</td>
</tr>
<tr>
<td>	number(real)</td><td>float</td>
</tr>
<tr>
<td>true</td><td>True	</td>
</tr>
<tr>
<td>	false</td><td>False</td>
</tr>
<tr>
<td>	null</td><td>None</td>
</tr>
</table>

<p>Technically, this conversion isn’t a perfect inverse to the serialization table. That basically means that if you encode an object now and then decode it again later, you may not get exactly the same object back. I imagine it’s a bit like teleportation: break my molecules down over here and put them back together over there. Am I still the same person?</p>

<p>In reality, it’s probably more like getting one friend to translate something into Japanese and another friend to translate it back into English. Regardless, the simplest example would be encoding a tuple and getting back a list after decoding, like so:</p>

In [16]:
blackjack_hand = (8, "Q")
encoded_hand = json.dumps(blackjack_hand)
decoded_hand = json.loads(encoded_hand)

print (decoded_hand)         # prints [8, 'Q']
print (type(blackjack_hand)) # prints <class 'tuple'>
print (type(decoded_hand))   # prints <class 'list'>

[8, 'Q']
<class 'tuple'>
<class 'list'>
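
<p>Tuples are not the only case. JSON object keys are always strings, so a dictionary with integer keys also comes back changed. The dictionary below is a made-up example:</p>

```python
import json

scores = {1: "one", 2: "two"}

# The integer keys are written as JSON strings and stay strings after decoding.
decoded_scores = json.loads(json.dumps(scores))

print(decoded_scores)            # prints {'1': 'one', '2': 'two'}
print(decoded_scores == scores)  # prints False
```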


<h2>A Simple Deserialization Example - json.load(read_file) and json.loads(read_string)</h2>

<p>This time, imagine you’ve got some data stored on disk that you’d like to manipulate in memory. You’ll still use the context manager, but this time you’ll open up the existing data_file.json in read mode.</p>


In [17]:
with open("data_file.json", "r") as read_file:
    data = json.load(read_file)
    
print (type(data))
print (data)    

<class 'dict'>
{'president': {'name': 'Zaphod Beeblebrox', 'species': 'Betelgeusian'}}


<p>Things are pretty straightforward here, but keep in mind that load() can return any of the allowed data types from the conversion table. This is only important if you’re loading in data you haven’t seen before. In most cases, the root object will be a dict or a list.</p>
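<p>A short sketch of that point, using loads() on a few hand-written JSON documents so that no file is needed. Each string below is a complete, valid JSON document with a different root type.</p>

```python
import json

documents = ['{"a": 1}', '[1, 2]', '"hello"', '42', '3.14', 'true', 'null']

# Record the Python type that each document decodes to.
root_types = []
for text in documents:
    value = json.loads(text)
    root_types.append(type(value).__name__)
    print(text, "->", type(value).__name__)
```

<p>If your code expects a dict, an isinstance() check right after loading is a cheap way to catch surprises.</p>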

<p>If you’ve pulled JSON data in from another program or have otherwise obtained a string of JSON formatted data in Python, you can easily deserialize that with loads(), which naturally loads from a string:</p>

In [18]:
json_string = """
{
    "researcher": {
        "name": "Ford Prefect",
        "species": "Betelgeusian",
        "relatives": [
            {
                "name": "Zaphod Beeblebrox",
                "species": "Betelgeusian"
            }
        ]
    }
}
"""

data = json.loads(json_string)

print (type(data))
print (data)   

<class 'dict'>
{'researcher': {'name': 'Ford Prefect', 'species': 'Betelgeusian', 'relatives': [{'name': 'Zaphod Beeblebrox', 'species': 'Betelgeusian'}]}}
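
<p>Strings that arrive from elsewhere are not always valid JSON. When parsing fails, loads() raises json.JSONDecodeError. The helper below, parse_or_none, is a hypothetical name used only for this sketch:</p>

```python
import json

def parse_or_none(text):
    """Return the decoded value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # The exception carries a message and the position of the problem.
        print(f"Invalid JSON: {err.msg} (line {err.lineno}, column {err.colno})")
        return None

good = parse_or_none('{"name": "Ford Prefect"}')
# Single quotes are fine in a Python dict literal but not in JSON.
bad = parse_or_none("{'name': 'Ford Prefect'}")

print(good)
print(bad)
```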


<p>Voilà! You’ve tamed the wild JSON, and now it’s under your control. But what you do with that power is up to you. You could feed it, nurture it, and even teach it tricks. It’s not that I don’t trust you…but keep it on a leash, okay?</p>

<h2>A Real World Example (sort of)</h2>

<p>For your introductory example, you’ll use JSONPlaceholder, a great source of fake JSON data for practice purposes.</p>

<a href="https://jsonplaceholder.typicode.com/"> JSONPlaceholder </a>

<p>First create a script file called scratch.py, or whatever you want. I can’t really stop you.</p>

<p>You’ll need to make an API request to the JSONPlaceholder service, so just use the requests package to do the heavy lifting. Add these imports at the top of your file:</p>

In [14]:
import json
import requests

<p>Now, you’re going to be working with a list of TODOs cuz like…you know, it’s a rite of passage or whatever.</p>

<p>Go ahead and make a request to the JSONPlaceholder API for the /todos endpoint. If you’re unfamiliar with requests, there’s actually a handy json() method that will do all of the work for you, but you can practice using the json library to deserialize the text attribute of the response object. It should look something like this:</p>


In [4]:
response = requests.get("https://jsonplaceholder.typicode.com/todos")
todos = json.loads(response.text)
print (type(todos))

for i in todos:
    print(i)


<class 'list'>
{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}
{'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False}
{'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False}
{'userId': 1, 'id': 4, 'title': 'et porro tempora', 'completed': True}
{'userId': 1, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'completed': False}
{'userId': 1, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'completed': False}
{'userId': 1, 'id': 7, 'title': 'illo expedita consequatur quia in', 'completed': False}
{'userId': 1, 'id': 8, 'title': 'quo adipisci enim quam ut ab', 'completed': True}
{'userId': 1, 'id': 9, 'title': 'molestiae perspiciatis ipsa', 'completed': False}
{'userId': 1, 'id': 10, 'title': 'illo est ratione doloremque quia maiores aut', 'completed': True}
{'userId': 1, 'id': 11, 'title': 'vero rerum temporibus dolor', 'completed': True}
{'userId': 1,

In [9]:
assert todos == response.json()  # response.json() decodes the same text, so the two results match
print(type(todos))
print (todos[:10])

<class 'list'>
[{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}, {'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False}, {'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False}, {'userId': 1, 'id': 4, 'title': 'et porro tempora', 'completed': True}, {'userId': 1, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'completed': False}, {'userId': 1, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'completed': False}, {'userId': 1, 'id': 7, 'title': 'illo expedita consequatur quia in', 'completed': False}, {'userId': 1, 'id': 8, 'title': 'quo adipisci enim quam ut ab', 'completed': True}, {'userId': 1, 'id': 9, 'title': 'molestiae perspiciatis ipsa', 'completed': False}, {'userId': 1, 'id': 10, 'title': 'illo est ratione doloremque quia maiores aut', 'completed': True}]


<p>All right, time for some action. You can see the structure of the data by visiting the endpoint in a browser, but here’s a sample TODO:</p>

In [None]:
{
    "userId": 1,
    "id": 1,
    "title": "delectus aut autem",
    "completed": false
}

<p>There are multiple users, each with a unique userId, and each task has a Boolean completed property. Can you determine which users have completed the most tasks?</p>

In [14]:
# Map of userId to number of complete TODOs for that user
todos_by_user = {}

# Increment complete TODOs count for each user.
for todo in todos:
    if todo["completed"]:
        try:
            # Increment the existing user's count.
            todos_by_user[todo["userId"]] += 1
        except KeyError:
            # This user has not been seen. Set their count to 1.
            todos_by_user[todo["userId"]] = 1

print (todos_by_user)
# Create a sorted list of (userId, num_complete) pairs.
top_users = sorted(todos_by_user.items(), 
                   key=lambda x: x[1], reverse=True)
print (top_users)

# Get the maximum number of complete TODOs.
max_complete = top_users[0][1]
print (max_complete)

# Create a list of all users who have completed
# the maximum number of TODOs.
users = []
for user, num_complete in top_users:
    if num_complete < max_complete:
        break
    users.append(str(user))

max_users = " and ".join(users)
print (max_users)

{1: 11, 2: 8, 3: 7, 4: 6, 5: 12, 6: 6, 7: 9, 8: 11, 9: 8, 10: 12}
[(5, 12), (10, 12), (1, 11), (8, 11), (7, 9), (2, 8), (9, 8), (3, 7), (4, 6), (6, 6)]
12
5 and 10


<p>Yeah, yeah, your implementation is better, but the point is, you can now manipulate the JSON data as a normal Python object!</p>
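
<p>For comparison, here is one possible shorter version of the counting step, sketched with collections.Counter. It runs on a small made-up list shaped like the /todos records rather than on the live API response, and the names sample_todos, sample_max, and sample_top_users are chosen for this sketch only.</p>

```python
from collections import Counter

# Made-up records with the same keys as the /todos data.
sample_todos = [
    {"userId": 1, "id": 1, "title": "a", "completed": True},
    {"userId": 1, "id": 2, "title": "b", "completed": False},
    {"userId": 2, "id": 3, "title": "c", "completed": True},
    {"userId": 2, "id": 4, "title": "d", "completed": True},
    {"userId": 3, "id": 5, "title": "e", "completed": True},
    {"userId": 3, "id": 6, "title": "f", "completed": True},
]

# Count completed TODOs per userId.
completed_counts = Counter(
    todo["userId"] for todo in sample_todos if todo["completed"]
)

# Find the highest count and every user who reached it.
sample_max = max(completed_counts.values())
sample_top_users = sorted(
    user for user, count in completed_counts.items() if count == sample_max
)

print(sample_top_users, sample_max)  # prints [2, 3] 2
```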

<p>I don’t know about you, but when I run the script interactively again, I get the following results:</p>

In [15]:
s = "s" if len(users) > 1 else ""
print(f"user{s} {max_users} completed {max_complete} TODOs")

users 5 and 10 completed 12 TODOs


<p>That’s cool and all, but you’re here to learn about JSON. For your final task, you’ll create a JSON file that contains the completed TODOs for each of the users who completed the maximum number of TODOs.</p>

<p>All you need to do is filter todos and write the resulting list to a file. For the sake of originality, you can call the output file filtered_data_file.json. There are many ways you could go about this, but here’s one:</p>

In [16]:
# Define a function to filter out completed TODOs 
# of users with max completed TODOS.
def keep(todo):
    is_complete = todo["completed"]
    has_max_count = str(todo["userId"]) in users
    return is_complete and has_max_count

# Write filtered TODOs to file.
with open("filtered_data_file.json", "w") as data_file:
    filtered_todos = list(filter(keep, todos))
    json.dump(filtered_todos, data_file, indent=2)
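
<p>A quick way to check a file like this is to load it back and compare it with what you wrote. The sketch below does that in a temporary directory with a single made-up record, so it does not touch the real output file.</p>

```python
import json
import os
import tempfile

# A made-up record standing in for the real filtered list.
filtered_sample = [
    {"userId": 5, "id": 81, "title": "example", "completed": True}
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "filtered_data_file.json")

    # Write the list with the same indent setting as above.
    with open(path, "w") as data_file:
        json.dump(filtered_sample, data_file, indent=2)

    # Read it back into memory.
    with open(path, "r") as data_file:
        reloaded = json.load(data_file)

print(reloaded == filtered_sample)  # prints True
```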

<h2>Another example - Getting Geocoding information from the US Census Bureau</h2>
https://geocoding.geo.census.gov/geocoder/Geocoding_Services_API.pdf
<h3>Geocoding Definition</h3>

<p>Geocoding is the process of taking an address and returning an actual or calculated latitude/longitude
coordinate. The parts of the address that are provided determine the granularity to which it
is possible to geocode.</p>

<p>The current Geocoding Services engine requires that a structured address be provided. The resulting lat/long
is calculated along an address range.</p>

<p>There are two entry points for the geocoding service – single record submission and batch.</p>

<p>The acceptable input address parts are:</p>
<ul>
<li>Structure number and street name (required)</li>
<li>City name (optional)</li>
<li>State (optional)</li>
<li>ZIP code (optional)</li>
</ul>
<p>The single record service allows for all of these parts to be submitted in a single line, or as separate
fields. The batch requires each field to exist (either with text or blank) in a delimited form, preceded by
a unique ID.</p>

<h3>Audience</h3>
<p>This document is intended for application, website, and mobile developers within the U.S. Census
Bureau and the general public who want to leverage the Geocoding Services capability.</p>

<p>This service is designed for coding a provided address, or file of addresses, to a latitude/longitude
coordinate based on data that’s been loaded into the geocoding engine from a MAF/TIGER benchmark
database.</p>

<p>The optional inclusion of the Geographic Lookup (geoLookup) adds information to the result relating to
various levels of geography that cover the aforementioned latitude/longitude coordinate. GeoLookup
results can also be obtained directly by searching on the latitude/longitude coordinates.</p>


<h2>Single Record Geocoding Service Requests</h2>

<p>A Geocoding Service API request must be in the following form:</p>

https://geocoding.geo.census.gov/geocoder/returntype/searchtype?parameters

<h3>Required Parameters</h3>

<ul>
    <li><font style="color:green">returntype</font> – <b>locations</b> (to get just the geocoding response) or <b>geographies</b> (to get the geocoding
response as well as geoLookup)</li>
    <li><font style="color:green">searchtype</font> – <b>onelineaddress</b> OR <b>address</b> OR <b>coordinates</b></li>
<li><font style="color:green">benchmark</font> – A numerical ID or name that references what version of the locator should be
searched. This generally corresponds to MTDB data which is benchmarked twice yearly. A full
    list of options can be accessed at <b>https://geocoding.geo.census.gov/geocoder/benchmarks</b>. The
general format of the name is DatasetType_SpatialBenchmark. The valid values for these
include:
<ul>
<li>DatasetType: <b>Public_AR</b></li>
<li>SpatialBenchmark: <b>Current</b>, <b>ACS2018</b>, <b>Census2010</b></li>
</ul>
So a resulting benchmark name could be “Public_AR_Current”, “Public_AR_Census2010”, etc.
Over time, there will always be a “Current” benchmark. It will change as the underlying dataset
changes.</li>

<li>
<font style="color:green">vintage</font> – a numerical ID or name that references what vintage of geography is desired for
the geoLookup (only needed when returntype = geographies). A full list of options for a given
benchmark can be accessed at
https://geocoding.geo.census.gov/geocoder/vintages?benchmark=benchmarkId. The general
format of the name is GeographyVintage_SpatialBenchmark. The SpatialBenchmark variable
should always match the same named variable in what was chosen for the benchmark
parameter. The GeographyVintage can be Current, ACS2018, etc. So a resulting vintage name
could be “ACS2018_Current”, “Current_Census2010”, etc. Over time, there will always be a
“Current” vintage. It will change as the underlying dataset changes.
</li>
<li>
    <font style="color:green">address (searchtype = onelineaddress)</font> – A single line containing the full address
to be searched.
</li>
<li>
<font style="color:green">street, city, state, zip (searchtype = address)</font> – The address split into
the parts indicated. Not all parts need to be specified.
</li>
<li>
    <font style="color:green">x,y (searchtype = coordinates)</font> – The longitude (x) and latitude (y) represented as
decimal x/y values. Only returns geoLookup data. Can only be used with returntype =
geographies.
</li>
</ul>

<h3>Optional Parameters</h3>
<ul>
<li>
    <font style="color:green">format</font> – The format to be used for returning the standardized output (json, html).    
</li>
    
<li>
    <font style="color:green">layers </font>  – By default, State, County, Tract, and Block layers are displayed when “geographies”
is the chosen returntype. If additional or different layers are desired, they can be specified in a
comma delimited list by ID or name as listed in the TigerWeb WMS layers, for instance here:
<font style="color:blue">https://tigerweb.geo.census.gov/ArcGIS/rest/services/TIGERweb/tigerWMS_Current/MapServer</font>
a valid entry could be: layers=14,16,18 OR layers=Unified School Districts,Secondary School
Districts,Elementary School Districts. Only layers without the word “Labels” are considered. If
all layers are desired, layers=all is an accepted entry.
In cases where the SpatialBenchmark selected is Census2010, the TIGERweb WMS needed is:
<font style="color:blue">https://tigerweb.geo.census.gov/ArcGIS/rest/services/Census2010/tigerWMS_Census2010/MapServer</font>
    
    
</li>
</ul>
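
<p>The request form and parameters described above can be assembled in code instead of typed by hand. The sketch below uses urllib.parse.urlencode to escape the values. The function name build_geocode_url is hypothetical, and the address and benchmark ID are just example values.</p>

```python
from urllib.parse import urlencode

BASE_URL = "https://geocoding.geo.census.gov/geocoder"

def build_geocode_url(returntype, searchtype, **parameters):
    """Join returntype, searchtype and the query parameters into one URL."""
    # urlencode() escapes spaces as "+" and commas as "%2C".
    return f"{BASE_URL}/{returntype}/{searchtype}?{urlencode(parameters)}"

url = build_geocode_url(
    "locations",
    "onelineaddress",
    address="4600 Silver Hill Rd, Suitland, MD 20746",
    benchmark="9",
    format="json",
)
print(url)
```

<p>If you are already using requests, passing a dictionary through the params argument of requests.get() gives you the same escaping without building the string yourself.</p>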

<h3>Geocoding Service Responses</h3>

<p>Geocoding Services responses are returned in the format indicated by the format parameter value in
the URL request’s path.</p>

<h3>JSON Output Format</h3>
<p>In this example, a Geocoding Services API request asks for a JSON response for the address “4600 Silver Hill Rd,
Suitland, MD 20746”:</p>


<h2>Example for returntype=locations</h2>

In [50]:
# create the Geocoding Service Request
geocodeRequest = \
"https://geocoding.geo.census.gov/geocoder/locations/onelineaddress?address=4600+Silver+Hill+Rd%2C+Suitland%2C+MD+20746&benchmark=9&format=json"

#send the request
geocodeResponse = requests.get(geocodeRequest)
print (type(geocodeResponse))  # prints <class 'requests.models.Response'>

# convert JSON to python data
geocodeResponseJSON = json.loads(geocodeResponse.text)
print (type(geocodeResponseJSON))  # prints <class 'dict'>

# serialize/dump data
print(json.dumps(geocodeResponseJSON, indent=4))

<class 'requests.models.Response'>
<class 'dict'>
{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "address": {
                "address": "4600 Silver Hill Rd, Suitland, MD 20746"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "4600 Silver Hill Rd, SUITLAND, MD, 20746",
                "coordinates": {
                    "x": -76.92691,
                    "y": 38.846542
                },
                "tigerLine": {
                    "tigerLineId": "613199520",
                    "side": "L"
                },
                "addressComponents": {
                    "fromAddress": "4600",
                    "toAddress": "4712",
                    "preQualifier": "",
        

In [22]:

geocodeRequest1 = "https://geocoding.geo.census.gov/geocoder/locations/address?street=4600+Silver+Hill+Rd&city=Suitland&state=MD&zip=20746&benchmark=Public_AR_Census2010&format=json"
geocodeResponse1 = requests.get(geocodeRequest1)
print (geocodeResponse1)

geocodeRequest2 ="https://geocoding.geo.census.gov/geocoder/locations/address?street=4600+Silver+Hill+Rd&city=Suitland&state=MD&benchmark=9&format=json"
geocodeResponse2 = requests.get(geocodeRequest2)
print (geocodeResponse2)

<Response [200]>
<Response [200]>


<h3>Getting geoLookup data </h3>

<p>In the example below we set the <font style="color:green">returntype</font> argument to geographies to get <b>geoLookup</b> data </p>

In [6]:

geocodeRequest3 = "https://geocoding.geo.census.gov/geocoder/geographies/address?street=4600+Silver+Hill+Rd&city=Suitland&state=MD&benchmark=Public_AR_Census2010&vintage=Census2010_Census2010&layers=14&format=json"
geocodeResponse3 = requests.get(geocodeRequest3)
geocodeResponseJSON = json.loads(geocodeResponse3.text)
print(json.dumps(geocodeResponseJSON, indent=4))

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "4600 Silver Hill Rd",
                "city": "Suitland",
                "state": "MD"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "4600 Silver Hill Rd, SUITLAND, MD, 20746",
                "coordinates": {
                    "x": -76.92691,
                    "y": 38.846542
                },
                "tigerLine": {
                    "tigerLineId

In [7]:
# pull out the state and county code from the geoLookup section of response
STATE_CODE=geocodeResponseJSON["result"]["addressMatches"][0]["geographies"]["Census Blocks"][0]["STATE"]
COUNTY_CODE=geocodeResponseJSON["result"]["addressMatches"][0]["geographies"]["Census Blocks"][0]["COUNTY"]

<class 'dict'>


In [23]:
print (STATE_CODE)
print (COUNTY_CODE)

24
033
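
<p>Indexing straight into the response works when there is a match, but it raises an error when addressMatches is empty. Below is a minimal sketch of a more defensive lookup. The function name extract_state_county is hypothetical, and both sample dictionaries are hand-written stand-ins that contain only the keys this lookup needs, not full API responses.</p>

```python
def extract_state_county(response_json):
    """Return (STATE, COUNTY) from a geographies response, or None if absent."""
    matches = response_json.get("result", {}).get("addressMatches", [])
    if not matches:
        return None
    blocks = matches[0].get("geographies", {}).get("Census Blocks", [])
    if not blocks:
        return None
    return blocks[0]["STATE"], blocks[0]["COUNTY"]

# Hand-written stand-in for a response with one match.
sample_response = {
    "result": {
        "addressMatches": [
            {"geographies": {"Census Blocks": [{"STATE": "24", "COUNTY": "033"}]}}
        ]
    }
}

# Hand-written stand-in for a response with no match.
no_match_response = {"result": {"addressMatches": []}}

print(extract_state_county(sample_response))    # prints ('24', '033')
print(extract_state_county(no_match_response))  # prints None
```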


In [27]:
# let's see if I can get a dealer's information
geocodeRequest4 = "https://geocoding.geo.census.gov/geocoder/geographies/address?street=6252+VIRGINIA+BEACH+BLVD&city=NORFOLK&state=VA&benchmark=Public_AR_Census2010&vintage=Census2010_Census2010&layers=14&format=json"
geocodeResponse4 = requests.get(geocodeRequest4)
geocodeResponseJSON = json.loads(geocodeResponse4.text)
print(json.dumps(geocodeResponseJSON, indent=4))

STATE_CODE=geocodeResponseJSON["result"]["addressMatches"][0]["geographies"]["Census Blocks"][0]["STATE"]
COUNTY_CODE=geocodeResponseJSON["result"]["addressMatches"][0]["geographies"]["Census Blocks"][0]["COUNTY"]

print (STATE_CODE)
print (COUNTY_CODE)

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "6252 VIRGINIA BEACH BLVD",
                "city": "NORFOLK",
                "state": "VA"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "6252 E Virginia Beach Blvd, NORFOLK, VA, 23502",
                "coordinates": {
                    "x": -76.19094,
                    "y": 36.85509
                },
                "tigerLine": {
                    "ti

In [52]:
# Look up the state and county codes for an address with the Census geocoder.
# The address parts must already be URL-ready (spaces replaced by "+").
# TODO: change the second request to use http://www.yaddress.net/WebApi (limited to 1000)
def getCounty(street_address, city, state, zipcode):
    base_url = "https://geocoding.geo.census.gov/geocoder/geographies/address"
    # Try the Census 2010 benchmark first; if it finds no match, retry with ACS2018.
    attempts = [("Public_AR_Census2010", "Census2010_Census2010"),
                ("Public_AR_ACS2018", "ACS2018_ACS2018")]

    for benchmark, vintage in attempts:
        geocodeRequest = (base_url + "?street=" + street_address + "&city=" + city
                          + "&state=" + state + "&zip=" + zipcode
                          + "&benchmark=" + benchmark + "&vintage=" + vintage
                          + "&layers=14&format=json")
        geocodeResponse = requests.get(geocodeRequest)
        geocodeResponseJSON = json.loads(geocodeResponse.text)
        print(json.dumps(geocodeResponseJSON, indent=4))

        matches = geocodeResponseJSON["result"]["addressMatches"]
        if len(matches) > 0:
            # Use the first match and its first census block.
            block = matches[0]["geographies"]["Census Blocks"][0]
            return (block["STATE"], block["COUNTY"])

    # Neither benchmark produced a match.
    return ("NA", "NA")


In [43]:
r = getCounty("46+Silver+Hill+Rd","Suitland","MA","746" )

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "46 Silver Hill Rd",
                "city": "Suitland",
                "state": "MA",
                "zip": "746"
            }
        },
        "addressMatches": []
    }
}
{
    "result": {
        "input": {
            "benchmark": {
                "id": "8",
                "benchmarkName": "Public_AR_ACS2018",
                "benchmarkDescription": "Public Address Ranges - ACS2018 Benchmark",
           

In [44]:
print (r)

('N/A', 'N/A')
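
<p>The try-one-benchmark-then-the-next pattern can be written as a loop instead of nested if/else branches. A minimal sketch, assuming a hypothetical fetch callable that takes an address, a benchmark and a vintage and returns a (state, county) tuple or None. The benchmark and vintage names are the ones that appear in the responses and request string in this notebook; the function name and the default placeholder are illustrative.</p>

```python
# Benchmark/vintage pairs seen in this notebook, in the order they are tried.
BENCHMARKS = [
    ("Public_AR_Census2010", "Census2010_Census2010"),
    ("Public_AR_ACS2018", "ACS2018_ACS2018"),
]


def lookup_with_fallback(fetch, address, benchmarks=BENCHMARKS, default=("NA", "NA")):
    """Try each benchmark in order and return the first result that is not None.

    `fetch` is a hypothetical callable: fetch(address, benchmark, vintage)
    returns a (state, county) tuple, or None when there is no match.
    """
    for benchmark, vintage in benchmarks:
        codes = fetch(address, benchmark, vintage)
        if codes is not None:
            return codes
    return default


# A stand-in fetcher that never finds a match, so the placeholder comes back.
print(lookup_with_fallback(lambda address, benchmark, vintage: None, "46 Silver Hill Rd"))
```

<p>Adding a third source then means appending to the list rather than nesting another else branch.</p>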


In [54]:
# For each dealer address that still needs a state and county code,
# look it up and insert the mapped codes into the database.
import os
import cx_Oracle

# Read the credentials from the environment instead of hard-coding them in the notebook
# (the variable names below are placeholders).
con = cx_Oracle.connect(os.environ["ORACLE_USER"], os.environ["ORACLE_PASSWORD"], os.environ["ORACLE_DSN"])
cur = con.cursor()
cur2 = con.cursor()
cur.execute("select BAC_CODE_A, trim(replace(ADDRESS,' ', '+')) ADDRESS, trim(replace(CITY,' ','+')) CITY, trim(STATE) STATE, substr(trim(ZIP),1,5) zip from pco_polk_to_hh_census_20")
for result in cur:
    bacCode, address, city, state, zipCode = result
    print(bacCode,address,city,state,zipCode)
    r = getCounty(address,city,state,zipCode)
    print (r)

    # Bind variables avoid quoting problems and SQL injection in the insert
    cur2.execute("insert into pco_dealer_counties values (:1, :2, :3)", (bacCode, r[0], r[1]))
cur.close()
con.commit()
cur2.close()
con.close()
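
<p>Passing values as bind parameters, rather than splicing them into the SQL text, avoids broken statements when a value contains a quote. The sketch below uses the standard library's sqlite3 so it runs without an Oracle instance. The column names, the save_codes helper and the sample rows are assumptions for illustration; sqlite3 uses ? placeholders where cx_Oracle uses the :1, :2 style.</p>

```python
import sqlite3


def save_codes(con, rows):
    """Insert (dealer code, state code, county code) rows using bind parameters."""
    con.executemany("insert into pco_dealer_counties values (?, ?, ?)", rows)
    con.commit()


# In-memory database with hypothetical column names, for illustration only.
con = sqlite3.connect(":memory:")
con.execute(
    "create table pco_dealer_counties (bac_code text, state_code text, county_code text)"
)

# Illustrative rows; the second has a quote that would break a concatenated statement.
save_codes(con, [("113725", "51", "710"), ("O'BRIEN", "NA", "NA")])

for row in con.execute("select * from pco_dealer_counties order by bac_code"):
    print(row)
```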



113725 6252+VIRGINIA+BEACH+BLVD NORFOLK VA 23502
{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "6252 VIRGINIA BEACH BLVD",
                "city": "NORFOLK",
                "state": "VA",
                "zip": "23502"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "6252 E Virginia Beach Blvd, NORFOLK, VA, 23502",
                "coordinates": {
                    "x": -76.19094,
                    "y": 3

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "1940 E JOPPA RD",
                "city": "BALTIMORE",
                "state": "MD",
                "zip": "21234"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "1940 E Joppa Rd, BALTIMORE, MD, 21234",
                "coordinates": {
                    "x": -76.54671,
                    "y": 39.399876
                },
                "tigerLine": {
      

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "2375 VIRGINIA BEACH BLVD",
                "city": "VIRGINIA BEACH",
                "state": "VA",
                "zip": "23454"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "2375 Virginia Beach Blvd, VIRGINIA BCH, VA, 23454",
                "coordinates": {
                    "x": -76.05137,
                    "y": 36.84176
                },
            

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "791 SAINTE GENEVIEVE DRIVE",
                "city": "SAINTE GENEVIEVE",
                "state": "MO",
                "zip": "63670"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "791 Ste Genevieve Dr, SAINTE GENEVIEVE, MO, 63670",
                "coordinates": {
                    "x": -90.05099,
                    "y": 37.96815
                },
        

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "2244 S KINGSHIGHWAY",
                "city": "SAINT LOUIS",
                "state": "MO",
                "zip": "63110"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "2244 S Kingshighway Blvd, SAINT LOUIS, MO, 63110",
                "coordinates": {
                    "x": -90.26742,
                    "y": 38.613636
                },
                "tig

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "6127 S LINDBERGH BLVD",
                "city": "SAINT LOUIS",
                "state": "MO",
                "zip": "63123"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "6127 S Lindbergh Blvd, SAINT LOUIS, MO, 63123",
                "coordinates": {
                    "x": -90.34303,
                    "y": 38.519188
                },
                "tige

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "665 SW 8TH ST",
                "city": "MIAMI",
                "state": "FL",
                "zip": "33130"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "665 SW 8th Ave, MIAMI, FL, 33130",
                "coordinates": {
                    "x": -80.20747,
                    "y": 25.767277
                },
                "tigerLine": {
                 

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "8880 BISCAYNE BLVD",
                "city": "MIAMI",
                "state": "FL",
                "zip": "33138"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "8880 Biscayne Blvd, MIAMI, FL, 33138",
                "coordinates": {
                    "x": -80.18477,
                    "y": 25.856937
                },
                "tigerLine": {
        

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "11701 SW 152ND ST",
                "city": "MIAMI",
                "state": "FL",
                "zip": "33177"
            }
        },
        "addressMatches": []
    }
}
{
    "result": {
        "input": {
            "benchmark": {
                "id": "8",
                "benchmarkName": "Public_AR_ACS2018",
                "benchmarkDescription": "Public Address Ranges - ACS2018 Benchmark",
            

{
    "result": {
        "input": {
            "benchmark": {
                "id": "9",
                "benchmarkName": "Public_AR_Census2010",
                "benchmarkDescription": "Public Address Ranges - Census 2010 Benchmark",
                "isDefault": false
            },
            "vintage": {
                "id": "910",
                "vintageName": "Census2010_Census2010",
                "vintageDescription": "Census2010 Vintage - Census2010 Benchmark",
                "isDefault": true
            },
            "address": {
                "street": "1415 BOXWOOD TERRACE",
                "city": "BEDFORD",
                "state": "VA",
                "zip": "24523"
            }
        },
        "addressMatches": [
            {
                "matchedAddress": "1415 Boxwood Ter, BEDFORD, VA, 24523",
                "coordinates": {
                    "x": -79.50244,
                    "y": 37.32558
                },
                "tigerLine": {
     

In [46]:
geocodeRequest4 = "http://www.yaddress.net/api/address?AddressLine1=506+Fourth+Avenue+Unit+1&AddressLine2=Asbury+Prk+NJ&UserKey="
geocodeResponse4 = requests.get(geocodeRequest4)
geocodeResponseJSON = json.loads(geocodeResponse4.text)
print(json.dumps(geocodeResponseJSON, indent=4))


{
    "ErrorCode": 0,
    "ErrorMessage": "",
    "AddressLine1": "506 4TH AVE APT 1",
    "AddressLine2": "ASBURY PARK, NJ 07712-6086",
    "Number": "506",
    "PreDir": "",
    "Street": "4TH",
    "Suffix": "AVE",
    "PostDir": "",
    "Sec": "APT",
    "SecNumber": "1",
    "City": "ASBURY PARK",
    "State": "NJ",
    "Zip": "07712",
    "Zip4": "6086",
    "County": "MONMOUTH",
    "StateFP": "34",
    "CountyFP": "025",
    "CensusTract": "8070.03",
    "CensusBlock": "1015",
    "Latitude": 40.223571,
    "Longitude": -74.005973,
    "GeoPrecision": 5
}


In [47]:
geocodeResponseJSON["StateFP"]

'34'
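
<p>The YAddress response carries the state and county codes as separate fields, StateFP and CountyFP. A five-digit county FIPS code is the two-digit state code followed by the three-digit county code, so the two can be joined directly. This sketch uses a hypothetical helper name, county_fips, and assumes that a nonzero ErrorCode means the lookup failed.</p>

```python
def county_fips(response_json):
    """Join StateFP and CountyFP into a five-digit county FIPS code, or return None."""
    # Assumption: a nonzero ErrorCode signals a failed lookup.
    if response_json.get("ErrorCode", 0) != 0:
        return None
    state = response_json.get("StateFP")
    county = response_json.get("CountyFP")
    if not state or not county:
        return None
    # Pad defensively so the result is always 2 + 3 digits.
    return state.zfill(2) + county.zfill(3)


# Fields taken from the response printed above.
sample = {"ErrorCode": 0, "County": "MONMOUTH", "StateFP": "34", "CountyFP": "025"}
print(county_fips(sample))  # prints 34025
```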