**Secrets**

The secrets below, such as the Cosmos DB account key, are retrieved from a secret scope. If you have not yet defined a secret scope for the Cosmos DB account you want to use with this sample, the following links explain how to create one:
- [Create a new secret scope](./#secrets/createScope) for the current Databricks workspace
  - Create an [Azure Key Vault backed secret scope](https://docs.microsoft.com/azure/databricks/security/secrets/secret-scopes#--create-an-azure-key-vault-backed-secret-scope)
  - Create a [Databricks backed secret scope](https://docs.microsoft.com/azure/databricks/security/secrets/secret-scopes#create-a-databricks-backed-secret-scope)
- [Add secrets to your Spark configuration](https://docs.microsoft.com/azure/databricks/security/secrets/secrets#read-a-secret)

If you prefer not to use secrets, you can assign the values in clear text below, but we recommend secrets so that the account key is not stored in the notebook.

In [None]:
cosmosEndpoint = spark.conf.get("spark.cosmos.accountEndpoint")
cosmosMasterKey = spark.conf.get("spark.cosmos.accountKey")
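
The text above notes that clear-text values can be used instead of secrets. One way to support both is a small lookup helper, sketched below. `resolve_setting`, its parameters, and `demo_conf` are hypothetical names introduced for illustration; they are not part of the connector or of Databricks. The sketch only assumes that the lookup callable raises when a key is not configured.

```python
def resolve_setting(key, lookup, fallback=None):
    """Return lookup(key), or fallback when the lookup fails or yields nothing.

    `lookup` is any callable that takes a key and returns its value.
    """
    try:
        value = lookup(key)
    except Exception:
        # The lookup raised, for example because the key is not configured.
        value = None
    if value:
        return value
    if fallback is not None:
        return fallback
    raise ValueError(f"No value configured for {key!r} and no fallback given")


# Stand-in for the Spark configuration so the sketch runs anywhere.
# The endpoint is a placeholder, not a real account.
demo_conf = {"spark.cosmos.accountEndpoint": "https://example.documents.azure.com:443/"}

endpoint = resolve_setting("spark.cosmos.accountEndpoint", demo_conf.__getitem__)
key = resolve_setting(
    "spark.cosmos.accountKey", demo_conf.__getitem__, fallback="<clear-text-key>"
)
```

In the notebook, `spark.conf.get` could be passed as `lookup` in place of `demo_conf.__getitem__`.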

**Preparation - creating the Cosmos DB container to ingest the data into**

Configure the Catalog API to be used

In [None]:
import uuid
spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", cosmosEndpoint)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", cosmosMasterKey)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.views.repositoryPath", "/viewDefinitions" + str(uuid.uuid4()))
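
The four settings above share the prefix `spark.sql.catalog.<catalog name>`. As a minimal sketch, the helper below builds them as a dictionary so the pattern is easier to see and to reuse for another catalog name. `catalog_settings` is a hypothetical helper, and the endpoint and key passed to it here are placeholders.

```python
import uuid


def catalog_settings(catalog_name, endpoint, key):
    """Build the Spark settings that register a Cosmos DB catalog."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # The catalog implementation class, as used in the cell above.
        prefix: "com.azure.cosmos.spark.CosmosCatalog",
        f"{prefix}.spark.cosmos.accountEndpoint": endpoint,
        f"{prefix}.spark.cosmos.accountKey": key,
        # A random suffix keeps view definitions of separate runs apart.
        f"{prefix}.spark.cosmos.views.repositoryPath": "/viewDefinitions" + str(uuid.uuid4()),
    }


settings = catalog_settings("cosmosCatalog", "<endpoint>", "<key>")
for name in settings:
    print(name)
```

In the notebook, the result could be applied with `for name, value in settings.items(): spark.conf.set(name, value)`.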

Create the database and container used by the push-down sample

In [None]:
%sql
CREATE DATABASE IF NOT EXISTS cosmosCatalog.PushDownSample;

CREATE TABLE IF NOT EXISTS cosmosCatalog.PushDownSample.PushDownSample
USING cosmos.oltp
TBLPROPERTIES(partitionKeyPath = '/id', manualThroughput = '400', indexingPolicy = 'OnlySystemProperties');

Set up the write and read configurations for the new container

In [None]:
writeCfg = {
  "spark.cosmos.accountEndpoint": cosmosEndpoint,
  "spark.cosmos.accountKey": cosmosMasterKey,
  "spark.cosmos.database": "PushDownSample",
  "spark.cosmos.container": "PushDownSample",
  "spark.cosmos.write.strategy": "ItemOverwrite",
}

readCfg = {
  "spark.cosmos.accountEndpoint": cosmosEndpoint,
  "spark.cosmos.accountKey": cosmosMasterKey,
  "spark.cosmos.database": "PushDownSample",
  "spark.cosmos.container": "PushDownSample",
  "spark.cosmos.read.inferSchema.includeSystemProperties": "True"
}
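
The two dictionaries above differ only in their last entry. A possible way to avoid repeating the connection settings is sketched below; `cosmos_configs` is a hypothetical helper, and the endpoint and key in the example call are placeholders.

```python
def cosmos_configs(endpoint, key, database, container):
    """Return (write_cfg, read_cfg) built from one set of connection settings."""
    base = {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.accountKey": key,
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }
    # Same write strategy as in the cell above.
    write_cfg = {**base, "spark.cosmos.write.strategy": "ItemOverwrite"}
    # Include system properties such as _ts in the inferred schema.
    read_cfg = {**base, "spark.cosmos.read.inferSchema.includeSystemProperties": "True"}
    return write_cfg, read_cfg


demo_write_cfg, demo_read_cfg = cosmos_configs(
    "<endpoint>", "<key>", "PushDownSample", "PushDownSample"
)
```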

Ingest two batches of sample data with a 5-second delay between them, so that the second batch gets later `_ts` values than the first

In [None]:
from pyspark.sql import Row
import time
initialRows = [('00001','First Record'),('00002','Second Record')]
# Build the DataFrame directly from Row objects (no intermediate RDD needed)
initialDF = spark.createDataFrame([Row(id=x[0], someValue=x[1]) for x in initialRows])

initialDF \
  .write \
  .format("cosmos.oltp") \
  .mode("Append") \
  .options(**writeCfg) \
  .save()

time.sleep(5)

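# _ts is stored in epoch seconds, so records written from here on should have _ts >= tsThreshold
# (this assumes the local clock and the service clock are reasonably in sync)
tsThreshold = int(time.time())
nextRows = [('00003','Third Record'),('00004','Fourth Record')]
# Build the DataFrame directly from Row objects (no intermediate RDD needed)
nextDF = spark.createDataFrame([Row(id=x[0], someValue=x[1]) for x in nextRows])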

nextDF \
  .write \
  .format("cosmos.oltp") \
  .mode("Append") \
  .options(**writeCfg) \
  .save()


Read all records to see their `_ts` values

In [None]:
query_df = spark.read.format("cosmos.oltp").options(**readCfg).load()
query_df.show()

assert query_df.count() == 4

Show the query plan for the unfiltered query

In [None]:
query_df.explain()

Read only the records whose `_ts` is at or above the threshold, which should return just the second batch

In [None]:
raw_query_df = spark.read.format("cosmos.oltp").options(**readCfg).load()
filtered_query_df = raw_query_df.where("_ts >= " + str(tsThreshold))
filtered_query_df.show()

assert filtered_query_df.count() == 2
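
To make the expected result concrete, the plain-Python sketch below applies the same `_ts >= threshold` condition to four in-memory records. The `_ts` values and the threshold are invented for illustration; real values are assigned by the service at write time. `build_filter` is a hypothetical helper that mirrors the string concatenation used in the cell above.

```python
# Invented timestamps: the first batch is written, then a pause,
# then the threshold is captured just before the second batch.
records = [
    {"id": "00001", "someValue": "First Record", "_ts": 1700000000},
    {"id": "00002", "someValue": "Second Record", "_ts": 1700000000},
    {"id": "00003", "someValue": "Third Record", "_ts": 1700000006},
    {"id": "00004", "someValue": "Fourth Record", "_ts": 1700000006},
]
ts_threshold = 1700000005


def build_filter(threshold):
    """Return the filter expression passed to DataFrame.where in the notebook."""
    return "_ts >= " + str(threshold)


# Apply the same condition in plain Python.
matching_ids = [r["id"] for r in records if r["_ts"] >= ts_threshold]

print(build_filter(ts_threshold))
print(matching_ids)
```

Only the two records of the second batch satisfy the condition, which is why the notebook asserts a count of 2.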

Show the query plan for the filtered query

In [None]:
filtered_query_df.explain()

**Cleanup - deleting the Cosmos DB container and database again (to reduce cost) - skip this step if you want to keep them**

In [None]:
%sql
DROP TABLE IF EXISTS cosmosCatalog.PushDownSample.PushDownSample;
DROP DATABASE IF EXISTS cosmosCatalog.PushDownSample CASCADE;