# Batch ingestion into Azure Cosmos DB collection

In this notebook, we'll:

1. Load the IoTDeviceInfo dataset from ADLS Gen2 to a dataframe
2. Write the dataframe to the Azure Cosmos DB collection

>**Did you know?**  [Azure Synapse Link for Azure Cosmos DB](https://docs.microsoft.com/en-us/azure/cosmos-db/synapse-link) is a hybrid transactional and analytical processing (HTAP) capability that enables you to run near real-time analytics over operational data in Azure Cosmos DB.

>**Did you know?**  [Azure Cosmos DB analytical store](https://docs.microsoft.com/en-us/azure/cosmos-db/analytical-store-introduction) is a fully isolated column store that enables large-scale analytics against operational data in Azure Cosmos DB, without any impact on your transactional workloads.


### 1. Load the IoTDeviceInfo dataset from ADLS Gen2 to a dataframe
>**Did you know?**  The Synapse workspace is attached to an ADLS Gen2 storage account, and files placed in that default storage account can be read using a relative path, as shown below.



In [3]:
dfDeviceInfo = (spark
                .read
                .csv("/IoTData/IoTDeviceInfo.csv", header=True, inferSchema=True)
                )
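
The relative path in the cell above is resolved against the workspace's default ADLS Gen2 storage. As a minimal sketch of what the equivalent fully qualified path looks like, the helper below builds an `abfss://` URI. The function name `abfss_path` and the container and account names are hypothetical placeholders, not values from this workspace.

```python
def abfss_path(container: str, account: str, relative_path: str) -> str:
    """Build a fully qualified ADLS Gen2 URI from a workspace-relative path."""
    # Drop the leading slash so the URI has exactly one separator after the host.
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"{relative_path.lstrip('/')}"
    )


# Hypothetical container and storage account names, for illustration only.
full_path = abfss_path("mycontainer", "mystorageaccount", "/IoTData/IoTDeviceInfo.csv")
print(full_path)
```

In a Synapse notebook, a path of this form could be passed to `spark.read.csv` in place of the relative path, which is mainly useful when the file lives outside the default storage account.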

### 2. Write the dataframe to the Azure Cosmos DB collection
>**Did you know?** "cosmos.oltp" is the Spark format that connects to the Azure Cosmos DB transactional store.

>**Did you know?** Ingestion into an Azure Cosmos DB collection always goes through the transactional store, regardless of whether the analytical store is enabled.

In [4]:
dfDeviceInfo.write\
            .format("cosmos.oltp")\
            .option("spark.synapse.linkedService", "CosmosDBIoTDemo")\
            .option("spark.cosmos.container", "IoTDeviceInfo")\
            .option("spark.cosmos.write.upsertEnabled", "true")\
            .mode('append')\
            .save()
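
When the same write is repeated for several containers, the connector options from the cell above can be collected in one place. The sketch below is one possible way to do that; the helper name `cosmos_write_options` is an assumption introduced for illustration, while the option keys and values are the ones used in the cell above.

```python
def cosmos_write_options(linked_service: str, container: str, upsert: bool = True) -> dict:
    """Collect the Cosmos DB write options used above into a single dict."""
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
        # The original cell passes this flag as a lowercase string.
        "spark.cosmos.write.upsertEnabled": str(upsert).lower(),
    }


opts = cosmos_write_options("CosmosDBIoTDemo", "IoTDeviceInfo")
print(opts)

# In a Synapse notebook, where dfDeviceInfo is defined, the dict could be applied as:
# dfDeviceInfo.write.format("cosmos.oltp").options(**opts).mode("append").save()
```

Keeping the options in a dict makes it easier to reuse the same linked service with a different container name without retyping each `.option(...)` call.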
