# Parsing JSON

Python has many options for parsing GeoJSON data. In almost all cases, the encoded string must be loaded into memory in its entirety before the parser can work on it. In our case, that could be quite a large chunk of data.
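
As a minimal sketch of that load-everything pattern, the standard library's `json.load` reads and parses the complete document before any feature can be touched. The two-feature collection and the names `SAMPLE` and `count_features` are made up for illustration; a real file would be opened with `open()` rather than wrapped in `io.StringIO`.

```python
import io
import json

# Hypothetical two-feature collection standing in for a large GeoJSON file.
SAMPLE = """
{"type": "FeatureCollection",
 "features": [
   {"type": "Feature", "properties": {"name": "a"},
    "geometry": {"type": "Point", "coordinates": [-108.9, 36.6]}},
   {"type": "Feature", "properties": {"name": "b"},
    "geometry": {"type": "Point", "coordinates": [-105.0, 35.0]}}
 ]}
"""


def count_features(fh):
    """Parse the whole document, then count its features."""
    doc = json.load(fh)  # reads and parses the entire file before returning
    return len(doc["features"])


n_features = count_features(io.StringIO(SAMPLE))
print(n_features)
```

Memory use here grows with the size of the whole file, not with the size of a single feature.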

The alternative is to use `ijson`, which parses the data as it is read and yields items from the JSON through Python iterators.

In [1]:
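import ijson

try:
    # Open in binary mode: ijson reads bytes, and text-mode files are deprecated.
    with open("/home/trantham/nldi-crawler-py/CrawlerData_10_dfw0go0s.geojson", "rb") as fh:
        count = 1  # starts at 1, so the final value is one more than the number of features
        # Stream one feature at a time from the top-level "features" array.
        for itm in ijson.items(fh, "features.item"):
            print(".", end="")
            count += 1
            if count % 120 == 0:
                print(" ")  # wrap the row of progress dots
except ijson.JSONError:
    # Only reached if the parser hits malformed or truncated JSON.
    print("\nDone.\n")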

......................................................................

In [2]:
count

71

In [29]:
import shapely
import json

itm_string = r'{ "type": "Feature", "properties": { "Site Name": "Aching Shoulder Slope, New Mexico, USA", "SBID": "5fe395bbd34ea5387deb4950", "Location": " Mitten Rock, New Mexico USA", "Principal Investigator": "William Emmett", "Date of site establishment and/or field measurements": "08/1963; 08/1964; 08/1965; 08/1968", "Original date of submission to the Vigil Network": null, "Purposes": "Erosion\r\nChannel change\r\nMass-movement\r\nSedimentation", "Annual Precipitation (mm)": 220.0, "Elevation (m)": 1815.0, "Drainage Area (square km)": null, "Geology": "Igneous-rhyolite and Sedimentary-hornstone", "Hydrology": null, "Vegetation": "Sparse grasses", "Bench marks": "10", "Photography": null, "USGS 7.5 minute maps": "Mitten Rock, NM", "Hillslopes": "Erosion stakes: 2\r\nMass-movement pins: 1\r\nPainted rock lines: 1\r\nProfiles: 2", "Stream Channels": "Channel cross sections: 3\r\nBed profile: 1\r\nHeadcut retreat: 1\r\nOther: 8 n/w sections", "Vegetation Data": null, "Miscellaneous": null, "COMID": null, "REACHCODE": null, "REACH_meas": null, "offset": null, "SBURL": "https://www.sciencebase.gov/catalog/item/5fe395bbd34ea5387deb4950" }, "geometry": { "type": "Point", "coordinates": [ -108.945277777777775, 36.605277777777779 ] } }'
itm = json.loads(itm_string)



The 'parsed' GeoJSON item (stored in `itm`) is what we get from `ijson`, so the next steps need to work from that data structure.

In [30]:
from shapely import from_geojson
# from_geojson expects a GeoJSON string, so re-serialize the parsed item first.
shp = from_geojson(json.dumps(itm))
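
One caveat when feeding real `ijson` output into this step: by default, `ijson` is understood to return non-integer numbers as `decimal.Decimal`, which `json.dumps` refuses to serialize. The sketch below reproduces the problem with a hand-built stand-in (`fake_item` is hypothetical, not actual `ijson` output) and works around it with `default=float`. `ijson` also documents a `use_float=True` option that avoids the Decimals at the source; both are suggestions to check against the installed version.

```python
import json
from decimal import Decimal

# Hypothetical stand-in for an item yielded by ijson, with Decimal numbers.
fake_item = {
    "type": "Feature",
    "properties": {"Elevation (m)": Decimal("1815.0")},
    "geometry": {
        "type": "Point",
        "coordinates": [
            Decimal("-108.945277777777775"),
            Decimal("36.605277777777779"),
        ],
    },
}

try:
    json.dumps(fake_item)
    decimal_ok = True
except TypeError:
    # The stdlib encoder has no built-in handling for Decimal.
    decimal_ok = False

# default= is called for any object the encoder cannot handle;
# float() converts each Decimal to a regular Python float.
geojson_text = json.dumps(fake_item, default=float)
round_trip = json.loads(geojson_text)
print(decimal_ok, round_trip["geometry"]["coordinates"])
```

Converting to `float` loses nothing that the later WKB step would keep, since WKB stores coordinates as 64-bit floats anyway.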

In [32]:
shapely.to_wkb(shp)

b'\x01\x01\x00\x00\x00;L]n\x7f<[\xc0\x8bF\x02\xbeyMB@'
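
The bytes above are easier to read once unpacked. Well-known binary for a 2D point is a byte-order flag, a 32-bit geometry type, and two 64-bit floats. As a rough sketch, the standard library's `struct` module can decode the output directly (the variable names are illustrative, and the format string assumes little-endian data, which the first byte confirms).

```python
import struct

# The WKB bytes printed by shapely.to_wkb(shp) above.
wkb = b'\x01\x01\x00\x00\x00;L]n\x7f<[\xc0\x8bF\x02\xbeyMB@'

# "<" little-endian, "B" byte-order flag, "I" geometry type, "dd" x and y.
byte_order, geom_type, x, y = struct.unpack("<BIdd", wkb)

print(byte_order)  # 1 means little-endian
print(geom_type)   # 1 means Point
print(x, y)        # longitude and latitude from the GeoJSON coordinates
```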

In [33]:
from geoalchemy2.shape import from_shape
from geoalchemy2.elements import WKBElement

In [34]:
e = WKBElement(shapely.to_wkb(shp))

In [35]:
e

<WKBElement at 0x7f72dd8a2e80; 01010000003b4c5d6e7f3c5bc08b4602be794d4240>
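
The hex string in that repr is the same WKB payload, just hex-encoded, which can be checked without any geometry library (a small sketch; the variable names are illustrative). Note that the element was built without an SRID. The `from_shape` helper imported earlier accepts an `srid` argument, but which value fits this data (for example 4326 for longitude/latitude) is an assumption to confirm against the target table.

```python
# Hex payload copied from the WKBElement repr, and the bytes from shapely.to_wkb.
hex_from_repr = "01010000003b4c5d6e7f3c5bc08b4602be794d4240"
wkb = b'\x01\x01\x00\x00\x00;L]n\x7f<[\xc0\x8bF\x02\xbeyMB@'

# Decoding the hex should give back exactly the same bytes.
same_payload = bytes.fromhex(hex_from_repr) == wkb
print(same_payload)
```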