
# Cloud Computing Systems
## 2024/25

Lab 3
https://smduarte.github.io/scc2425/

Sérgio Duarte, Kevin Gallagher 

# Goals

+ Create a Cosmos DB account / database and container @ Azure;
+ Create the resources for the project by storing data at Azure Cosmos DB;
+ Start converting Tukano to leverage CosmosDB NoSQL

# Goals

+ **Create a Cosmos DB account / database and container @ Azure;**
+ Start converting Tukano to leverage CosmosDB NoSQL

# Add CosmosDB Account (1)

<img src="cosmosdb-1.png" width="75%"></img>

# Add CosmosDB Account (2)

<img src="cosmosdb-2.png" width="75%"></img>

# Add CosmosDB Account (3)

<img src="cosmosdb-3.png" width="75%"></img>

# Add CosmosDB Account (4)

<img src="cosmosdb-4.png" width="75%"></img>

# Add CosmosDB Account (5)

<img src="cosmosdb-5.png" width="75%"></img>

# Add CosmosDB Account (6)

<img src="cosmosdb-6.png" width="75%"></img>

# Add CosmosDB Account (7)

<img src="cosmosdb-7.png" width="75%"></img>

Note: This takes time, sometimes several minutes.

# Add Database + Container (1)

<img src="cosmosdb-8.png" width="75%"></img>

# Add Database + Container (2)

<img src="cosmosdb-9.png" width="75%"></img>

### A database may include a set of containers. A container can be a *collection*, *table*, *graph*, etc.

# Add Database + Container (3)

<img src="cosmosdb-10.png" width="75%"></img>

A container is **horizontally partitioned** across multiple machines according to the **partition key**.
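
As a mental model only, hash partitioning can be sketched in plain Java as follows. This is not Cosmos DB's actual placement algorithm: the hash function, the fixed partition count, and all names are assumptions chosen for illustration. What it shows is that items sharing the same partition key value are always routed to the same partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PartitionSketch {
    // Maps a partition key value to one of `partitions` buckets (illustrative hash, an assumption).
    static int partitionOf(String partitionKey, int partitions) {
        return Math.floorMod(partitionKey.hashCode(), partitions);
    }

    // Groups key values by the partition they would be routed to.
    static Map<Integer, List<String>> place(List<String> keys, int partitions) {
        Map<Integer, List<String>> placement = new TreeMap<>();
        for (String key : keys) {
            placement.computeIfAbsent(partitionOf(key, partitions), p -> new ArrayList<>()).add(key);
        }
        return placement;
    }

    public static void main(String[] args) {
        // Both "alice" entries end up in the same partition.
        System.out.println(place(List.of("alice", "bob", "alice", "carol"), 4));
    }
}
```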


# Add Database + Container (4)

Cosmos DB throughput is provisioned in request units per second (RU/s).

Prefer provisioning it at the database level: ***the first 1000 RU/s are free***.

**Minimum**: 400 RU/s.

Reading a 1 KB document costs 1 RU.

[Info on request units](https://docs.microsoft.com/en-us/azure/cosmos-db/request-units)

<img src="cosmosdb-11.png" width="45%"></img>
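
To make these numbers concrete, here is a rough back-of-the-envelope sketch relating read load to provisioned throughput. It uses the rule of thumb above (reading a 1 KB item costs about 1 RU) and, as a simplification that is not from the slides, assumes read cost grows linearly with item size. The class and method names are hypothetical.

```java
// Rough RU/s estimate for a read-only workload (illustrative only).
final class RuEstimate {
    // Rule of thumb from the slide: reading a 1 KB item costs about 1 RU.
    static final double RU_PER_KB_READ = 1.0;

    // Assumption: cost scales linearly with item size (a simplification).
    static double readRuPerSecond(double readsPerSecond, double itemSizeKb) {
        return readsPerSecond * itemSizeKb * RU_PER_KB_READ;
    }

    // True if the estimated load fits within the provisioned throughput.
    static boolean fits(double requiredRu, double provisionedRu) {
        return requiredRu <= provisionedRu;
    }

    public static void main(String[] args) {
        double required = readRuPerSecond(300, 1.0); // 300 reads/s of 1 KB items
        System.out.println(required + " RU/s needed; fits in 400 RU/s: " + fits(required, 400));
    }
}
```

Writes and queries generally cost more than point reads, so treat an estimate like this as a lower bound.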

# Add Database + Container (5)

By default, Cosmos DB indexes every property of every item and enforces range indexes for any string or number, in all containers.

It is possible to override the default policy by:

+ Overriding the indexing mode: consistent, lazy (indexes are updated in the background), or none;

+ Specifying that some properties do not need to be indexed, using the exclude path.


# Add Database + Container (6)

<img src="cosmosdb-12-1.png" width="45%"></img>

<img src="cosmosdb-12-2.png" width="45%"></img>

### The partition key controls which items are co-located...

Note: Some data models support "transactions", often with limitations, where atomic updates are only available for items belonging to the same partition. The partition key is important for this kind of usage.

# Cosmos DB account: access keys (for code)

<img src="cosmosdb-13.png" width="75%"></img>

# Goals

+ Create a Cosmos DB account / database and container @ Azure;
+ **Create the resources for the project by storing data at Azure Cosmos DB;**
+ Start converting Tukano to leverage CosmosDB NoSQL

# Accessing Cosmos DB: useful links

We will use the library provided by Microsoft.

Java Docs available at:

+ [https://azuresdkdocs.blob.core.windows.net/web/java/azure-cosmos/latest/index.html](https://azuresdkdocs.blob.core.windows.net/web/java/azure-cosmos/latest/index.html)

Overview on how to use at:

+ [https://docs.microsoft.com/en-us/azure/cosmos-db/create-sql-api-java](https://docs.microsoft.com/en-us/azure/cosmos-db/create-sql-api-java)

Cosmos DB SQL cheat sheet:

+ [https://go.microsoft.com/fwlink/?LinkId=623215](https://go.microsoft.com/fwlink/?LinkId=623215)


# Goals

+ Create a Cosmos DB account / database and container @ Azure;
+ **Create the resources for the project by storing data at Azure Cosmos DB;**
+ Start converting Tukano to leverage CosmosDB NoSQL

# Maven dependencies

```xml
	<dependency>
		<groupId>com.azure</groupId>
		<artifactId>azure-cosmos</artifactId>
		<version>4.63.3</version>
	</dependency>
	<dependency>
		<groupId>org.slf4j</groupId>
		<artifactId>slf4j-api</artifactId>
		<version>2.0.16</version>
	</dependency>
	<dependency>
		<groupId>org.slf4j</groupId>
		<artifactId>slf4j-simple</artifactId>
		<version>2.0.16</version>
	</dependency>
```

# Step 1: create client to Cosmos DB (1)

```java
    private static final String CONNECTION_URL = ...; // Replace with values from the Azure portal
    private static final String DB_KEY = ...;
    private static final String DB_NAME = ...;
```
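
Since the key grants full access to the account, one option is to keep it out of the source code and read it from the environment at startup. The sketch below is one possible way to do that in plain Java; the environment variable names and the `CosmosSettings` helper are hypothetical and not part of the sample project.

```java
import java.util.Map;

final class CosmosSettings {
    // Hypothetical variable names; pick whatever fits your deployment.
    static final String URL_VAR = "COSMOSDB_URL";
    static final String KEY_VAR = "COSMOSDB_KEY";
    static final String DB_VAR = "COSMOSDB_DATABASE";

    // Returns the named setting, failing fast with a clear message if it is missing or empty.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing setting: " + name);
        }
        return value;
    }
}
```

With this in place, a constant can be initialised as `CosmosSettings.require(System.getenv(), CosmosSettings.URL_VAR)` instead of a literal pasted from the portal.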

# Step 1: create client to Cosmos DB (2)

```java
    CosmosClient client = new CosmosClientBuilder()
         .endpoint(CONNECTION_URL)
         .key(DB_KEY)
         .directMode() // comment this line out if direct mode should not be used
         .consistencyLevel(ConsistencyLevel.SESSION)
         .connectionSharingAcrossClientsEnabled(true)
         .contentResponseOnWriteEnabled(true) // On write, return the object written
         .buildClient();
```

# Step 2: get reference to container

```java
static String CONTAINER_NAME = "users";

CosmosDatabase db = client.getDatabase(DB_NAME);
CosmosContainer users = db.getContainer(CONTAINER_NAME);

```

# Step 3: User Class

```java
public class User {
	
	private String id;
	private String pwd;
	private String email;	
	private String displayName;

    ...
```

# Step 3: User Class (What's actually stored)

```java
public class User {
	private String _rid; // Cosmos generated unique id of item
	private String _ts; // timestamp of the last update to the item

	private String id;
	private String pwd;
	private String email;	
	private String displayName;

    ...
```

```java
	private String _rid; // Cosmos generated unique id of item
	private String _ts; // timestamp of the last update to the item
```

Include these fields in your database objects (the classes that items are read into) if your application finds this information useful, for instance to manage a cache.

It is possible to use two classes, one that includes the CosmosDB fields, the other without them. The drawback is the need to add methods/constructors to perform conversions between the two types.

It is also possible to use class inheritance, defining a second class that extends the User class with the new fields. Check the sample code for an actual example.
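
The inheritance option can be sketched in plain Java as below. This is only an illustration under stated assumptions: `UserRecord` is a hypothetical name, the `User` shown is a minimal stand-in for the class on the slide, and a class actually read by the SDK would additionally need whatever constructors and accessors JSON mapping requires. The sample code remains the reference.

```java
// Minimal stand-in for the application's User class (fields from the slide).
class User {
    private final String id, pwd, email, displayName;

    User(String id, String pwd, String email, String displayName) {
        this.id = id;
        this.pwd = pwd;
        this.email = email;
        this.displayName = displayName;
    }

    String getId() { return id; }
    String getPwd() { return pwd; }
    String getEmail() { return email; }
    String getDisplayName() { return displayName; }
}

// Hypothetical storage-side class: a User plus the Cosmos DB system fields.
class UserRecord extends User {
    private String _rid; // set by Cosmos DB; null until the item is read back
    private String _ts;  // timestamp of the last update; also set by Cosmos DB

    // Conversion from the plain application object.
    UserRecord(User u) {
        super(u.getId(), u.getPwd(), u.getEmail(), u.getDisplayName());
    }

    String getRid() { return _rid; }
    String getTs() { return _ts; }

    // Conversion back to the application type, dropping the system fields.
    User toUser() {
        return new User(getId(), getPwd(), getEmail(), getDisplayName());
    }
}
```

Because `UserRecord` is a `User`, code that only needs the application fields can accept either type, and only the conversion constructor and `toUser()` have to be maintained.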

# Step 4: write an item

```java
CosmosItemResponse<User> res = users.createItem(u);
if( res.getStatusCode() < 300)
	return res.getItem();
else
	throw new Exception("ERROR:" + res.getStatusCode());    
```

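Note: Calls return HTTP status codes that indicate *success* (2XX) or *failure* (>=300). The Java SDK typically reports failures by throwing a `CosmosException`, whose `getStatusCode()` gives the same code.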
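
The status check can be factored into a small helper. The sketch below is plain Java with hypothetical names and illustrates the 2XX-versus-rest rule from the note. The specific codes it singles out (404, 409, 429) are standard HTTP codes that Cosmos DB uses for a missing item, an id conflict, and throttling, respectively.

```java
final class StatusCodes {
    // Success means any 2XX code.
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }

    // Coarse classification of a status code for logging or error mapping.
    static String describe(int statusCode) {
        if (isSuccess(statusCode)) {
            return "OK";
        }
        switch (statusCode) {
            case 404: return "NOT_FOUND"; // item does not exist
            case 409: return "CONFLICT";  // an item with that id already exists
            case 429: return "THROTTLED"; // provisioned RU/s exceeded
            default:  return "ERROR";
        }
    }
}
```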

# Step 5: read items

```java
CosmosPagedIterable<User> res = users.queryItems(
	"SELECT * FROM users WHERE users.id=\"" + id + "\"", 
    new CosmosQueryRequestOptions(), User.class);

for(User u : res) {
	System.out.println(u);
}
```

Note: Although the backend is a NoSQL database, Cosmos DB provides a query language that is a dialect (subset) of SQL.
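
One caveat with the query above: the id is spliced directly into the query text, so an id containing a double quote changes the query itself. The plain-Java sketch below (hypothetical helper name, no SDK calls) reproduces the same string construction to show the effect.

```java
final class QueryConcat {
    // Same string construction as in the queryItems call above.
    static String byId(String id) {
        return "SELECT * FROM users WHERE users.id=\"" + id + "\"";
    }

    public static void main(String[] args) {
        // A well-behaved id yields the intended query.
        System.out.println(byId("alice"));
        // A crafted id closes the string literal and appends its own condition.
        System.out.println(byId("x\" OR \"1\"=\"1"));
    }
}
```

The usual remedy is a parameterized query. The SDK provides `SqlQuerySpec` and `SqlParameter` for that purpose; check the Java docs linked earlier for their exact usage.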

# Additional Notes

Is it possible to access Cosmos DB from a program running on my machine?

**Yes**. The example provided does that.

NOTE: This is particularly useful for getting the code correct.

Is it possible to use Cosmos DB with other data models, for instance SQL?

**Yes**. Check the documentation.


# Sample Code

The code provided [scc2425-lab3.zip](scc2425-lab3.zip) is a Maven project for interfacing with CosmosDB (directly).

To test it from the command line, run:

```mvn clean compile assembly:single```

This will compile and create a single file with all compiled classes and dependencies.

Run the program as follows:

```java -cp target/scc2425-lab3-1.0-jar-with-dependencies.jar scc.utils.TestUsers```

# Goals

+ Create a Cosmos DB account / database and container @ Azure;
+ Create the resources for the project by storing data at Azure Cosmos DB;
+ **Start converting Tukano to leverage CosmosDB NoSQL**