## Demonstrate Basic Querying of DocumentDB

### Prerequisites:
1. Install the DocumentDB Python SDK (pip install pydocumentdb)
1. Create a DocumentDB account and a DocumentDB database from the Azure portal
1. Download the "DocumentDB Migration tool" from [here](http://www.microsoft.com/downloads/details.aspx?FamilyID=cda7703a-2774-4c07-adcc-ad02ddc1a44d)
1. Import the JSON data (volcano data) stored in a local file into DocumentDB by passing the following command parameters to the migration tool. You can also use the GUI tool and enter the same source and target location parameters.

<code>/s:JsonFile /s.Files:[JSON File Location] /t:DocumentDBBulk /t.ConnectionString:AccountEndpoint=https://[DocDBAccountName].documents.azure.com:443/;AccountKey=[KEY];Database=volcano /t.Collection:volcano1</code>

A copy of the volcano data can also be found in a blob: https://cahandson.blob.core.windows.net/samples/volcano.json

Once the data is imported, execute the rest of the code below.

In [4]:
import pydocumentdb.documents as documents
import pydocumentdb.document_client as document_client
import pydocumentdb.errors as errors

In [5]:
# DocumentDB access parameters.
# You can find the DocumentDB account name and key in the Azure portal.
# A read-only key is adequate if you are not writing new records.
masterKey = 'ENTER DOC DB Master KEY'
host = 'https://[ENTER DOCDB ACCOUNT NAME].documents.azure.com:443'
db = u'volcano'
collection = 'volcano1'
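
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the same four settings could be read from environment variables instead. The function name `load_docdb_settings` and the variable names `DOCDB_ACCOUNT`, `DOCDB_KEY`, `DOCDB_DATABASE`, and `DOCDB_COLLECTION` are hypothetical choices for this illustration, not something the SDK defines.

```python
import os


def load_docdb_settings(environ=os.environ):
    """Build the connection settings from environment variables.

    The variable names used here are assumptions for this sketch.
    """
    account = environ.get('DOCDB_ACCOUNT')
    key = environ.get('DOCDB_KEY')
    if not account or not key:
        raise ValueError('DOCDB_ACCOUNT and DOCDB_KEY must both be set')
    return {
        # Same host pattern as in the cell above.
        'host': 'https://{0}.documents.azure.com:443'.format(account),
        'masterKey': key,
        # Fall back to the names used in this notebook.
        'db': environ.get('DOCDB_DATABASE', 'volcano'),
        'collection': environ.get('DOCDB_COLLECTION', 'volcano1'),
    }
```

The returned values can then be assigned to `host`, `masterKey`, `db`, and `collection` so the remaining cells run unchanged.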

In [6]:
# The client object is the main entry point for working with DocumentDB
client = document_client.DocumentClient(host, {'masterKey': masterKey})

In [7]:
# Look up the database resource whose id matches the name you want
database = next((data for data in client.ReadDatabases() if data['id'] == db))

In [8]:
# Look up the collection within the database; its '_self' field is the link used for queries
coll = next((coll for coll in client.ReadCollections(database['_self']) if coll['id'] == collection))
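
Both lookups above call `next()` without a default, so a misspelled database or collection name surfaces as a bare `StopIteration`. One possible way to get a clearer failure is a small helper that returns `None` when nothing matches. The helper name `find_by_id` is hypothetical, and the sketch works on any iterable of dictionaries with an `'id'` key, which is the shape the cells above rely on.

```python
def find_by_id(resources, target_id):
    """Return the first resource whose 'id' equals target_id, or None."""
    return next((res for res in resources if res['id'] == target_id), None)


# Stand-in data shaped like the resources iterated over above.
sample = [{'id': 'other', '_self': 'dbs/aaa/'},
          {'id': 'volcano', '_self': 'dbs/bbb/'}]

match = find_by_id(sample, 'volcano')
missing = find_by_id(sample, 'no-such-db')
```

In the notebook this could be used as `find_by_id(client.ReadDatabases(), db)`, followed by an explicit check for `None` before reading `['_self']`.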

In [9]:
# Use the DocumentDB SQL-like query language.
# Cheat sheet for DocDB SQL: https://azure.microsoft.com/en-us/documentation/articles/documentdb-sql-query-cheat-sheet/

# This query lists the volcanoes within 300 km of a given coordinate (Redmond, WA in this case).
# It uses the geospatial built-in function ST_DISTANCE(point1, point2), which returns meters, hence 300 * 1000.

query = u'SELECT * \
FROM volcanoes v \
WHERE ST_DISTANCE(v.Location, { \
	"type": "Point", \
	"coordinates": [-122.19, 47.36] \
	}) < 300 * 1000 \
AND v.Type = "Stratovolcano" \
AND v["Last Known Eruption"] = "Last known eruption from 1800-1899, inclusive"'

In [10]:
query

u'SELECT * FROM volcanoes v WHERE ST_DISTANCE(v.Location, { \t"type": "Point", \t"coordinates": [-122.19, 47.36] \t}) < 300 * 1000 AND v.Type = "Stratovolcano" AND v["Last Known Eruption"] = "Last known eruption from 1800-1899, inclusive"'
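
To get a feel for what the `ST_DISTANCE(...) < 300 * 1000` filter accepts, the distance can be approximated locally with the haversine formula. This is only a rough sketch: it treats the Earth as a sphere with a 6371 km mean radius, so its numbers are not guaranteed to match the service's own calculation exactly. Points are GeoJSON-style `[longitude, latitude]` pairs, and the two volcano coordinates are the ones that appear in the query results of this notebook.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def haversine_m(point_a, point_b):
    """Approximate great-circle distance in meters between two
    [longitude, latitude] points given in degrees."""
    lon1, lat1 = point_a
    lon2, lat2 = point_b
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


center = [-122.19, 47.36]     # query point from the cell above
rainier = [-121.758, 46.87]   # from the query results
hood = [-121.694, 45.374]     # from the query results

for name, point in [('Rainier', rainier), ('Hood', hood)]:
    print('{0}: {1:.0f} km'.format(name, haversine_m(center, point) / 1000.0))
```

Both distances come out well under 300 km, which is consistent with these two documents passing the filter.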

In [11]:
# Run the query
docs = list(client.QueryDocuments(coll['_self'], {'query': query, 'parameters':[]}))
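
The query above is passed with an empty `parameters` list. As a sketch of how that list might be used, the literal values can be replaced with `@`-prefixed placeholders and supplied as name/value pairs, which avoids editing the query string for each search. The function name `build_volcano_query` and the placeholder names are hypothetical; the point literal is kept inline exactly as in the original query.

```python
def build_volcano_query(radius_m, volcano_type, last_eruption):
    """Return a query spec dict with 'query' and 'parameters' keys."""
    query = (
        'SELECT * FROM volcanoes v '
        'WHERE ST_DISTANCE(v.Location, '
        '{"type": "Point", "coordinates": [-122.19, 47.36]}) < @radius '
        'AND v.Type = @type '
        'AND v["Last Known Eruption"] = @eruption'
    )
    return {
        'query': query,
        'parameters': [
            {'name': '@radius', 'value': radius_m},
            {'name': '@type', 'value': volcano_type},
            {'name': '@eruption', 'value': last_eruption},
        ],
    }


query_spec = build_volcano_query(
    300 * 1000,
    'Stratovolcano',
    'Last known eruption from 1800-1899, inclusive',
)
```

Assuming the service accepts these placeholders as expected, the spec would be passed in place of the inline dictionary: `list(client.QueryDocuments(coll['_self'], query_spec))`.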

In [12]:
docs

[{u'Country': u'United States',
  u'Elevation': 4392,
  u'Last Known Eruption': u'Last known eruption from 1800-1899, inclusive',
  u'Location': {u'coordinates': [-121.758, 46.87], u'type': u'Point'},
  u'Region': u'US-Washington',
  u'Status': u'Dendrochronology',
  u'Type': u'Stratovolcano',
  u'Volcano Name': u'Rainier',
  u'_attachments': u'attachments/',
  u'_etag': u'"0000c501-0000-0000-0000-5696f9700000"',
  u'_rid': u'+-hhAPg2OAC5AQAAAAAAAA==',
  u'_self': u'dbs/+-hhAA==/colls/+-hhAPg2OAA=/docs/+-hhAPg2OAC5AQAAAAAAAA==/',
  u'_ts': 1452734832,
  u'id': u'682fe1d3-1e2a-c135-d47f-f3351afd03e3'},
 {u'Country': u'United States',
  u'Elevation': 3426,
  u'Last Known Eruption': u'Last known eruption from 1800-1899, inclusive',
  u'Location': {u'coordinates': [-121.694, 45.374], u'type': u'Point'},
  u'Region': u'US-Oregon',
  u'Status': u'Historical',
  u'Type': u'Stratovolcano',
  u'Volcano Name': u'Hood',
  u'_attachments': u'attachments/',
  u'_etag': u'"0000d202-0000-0000-0000-56