# Explore non-relational data in Azure

Learn the fundamentals of database concepts in a cloud environment, build basic skills in cloud data services, and develop foundational knowledge of those services within Microsoft Azure. You will explore non-relational data offerings, provision and deploy non-relational databases, and work with non-relational data stores in Microsoft Azure.

## Explore non-relational data offerings in Azure

**Learning objectives**

In this module, you will:

* Explore use-cases and management benefits of using Azure Table Storage
* Explore use-cases and management benefits of using Azure Blob Storage
* Explore use-cases and management benefits of using Azure File Storage
* Explore use-cases and management benefits of using Azure Cosmos DB

To help ensure fast access, Azure Table Storage splits a table into partitions.
Partitioning helps to organize data and improve scalability and performance.

![Table partition](table-partitions.png)

This scheme enables an application to quickly perform point queries that identify a single row, and range queries that fetch a contiguous block of rows in a partition.

In a range query, the application searches for a set of rows in a partition, specifying the start and end point of the set as row keys.
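The two query shapes can be illustrated with a small in-memory sketch. This is not the Azure Table Storage API; it is a toy model that assumes each partition keeps its rows sorted by row key. The table contents, the function names `point_query` and `range_query`, and the choice to treat the end row key as inclusive are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for a table: partition key -> list of (row key, entity),
# kept sorted by row key. All names and values here are hypothetical.
table = {
    "hardware": [
        ("0001", {"name": "bolt"}),
        ("0002", {"name": "nut"}),
        ("0003", {"name": "washer"}),
        ("0004", {"name": "screw"}),
    ],
    "tools": [
        ("0001", {"name": "hammer"}),
        ("0002", {"name": "wrench"}),
    ],
}


def point_query(partition_key, row_key):
    """Return the single entity identified by partition key and row key, or None."""
    rows = table.get(partition_key, [])
    keys = [k for k, _ in rows]
    i = bisect_left(keys, row_key)
    if i < len(keys) and keys[i] == row_key:
        return rows[i][1]
    return None


def range_query(partition_key, start_row_key, end_row_key):
    """Return entities in one partition whose row keys fall between start and end (inclusive)."""
    rows = table.get(partition_key, [])
    keys = [k for k, _ in rows]
    lo = bisect_left(keys, start_row_key)
    hi = bisect_right(keys, end_row_key)
    return [entity for _, entity in rows[lo:hi]]


print(point_query("hardware", "0003"))
print(range_query("hardware", "0002", "0003"))
```

Both lookups touch only one partition and use the sorted row keys, which is why queries that specify the partition and row keys tend to be fast.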

### Use cases and management benefits of using Azure Table Storage

The primary advantages of using Azure Table Storage tables over other ways of storing data include:

* It's simpler to scale. It takes the same time to insert data in an empty table, or a table with billions of entries. An Azure storage account can hold up to 5 PB of data.
* A table can hold semi-structured data.
* There's no need to map and maintain the complex relationships typically required by a normalized relational database.
* Row insertion is fast.
* Data retrieval is fast, if you specify the partition and row keys as query criteria.
  
There are disadvantages to storing data this way though, including:

* Consistency needs consideration, because transactional updates across multiple entities aren't guaranteed.
* There's no referential integrity; any relationships between rows need to be maintained externally to the table.
* It's difficult to filter and sort on non-key data. Queries that search based on non-key fields could result in full table scans.

### Explore Azure Blob storage

Many applications need to store large, binary data objects, such as images and video streams. Microsoft Azure virtual machines use blob storage for holding virtual machine disk images. These objects can be several hundreds of GB in size.

**What is Azure Blob storage?**

Azure Blob storage is a service that enables you to store massive amounts of unstructured data, or blobs, in the cloud.

Azure currently supports three different types of blob:

* *Block blobs*. A block blob is handled as a set of blocks. Each block can vary in size, up to 100 MB. A block blob can contain up to 50,000 blocks, giving a maximum size of over 4.7 TB. The block is the smallest amount of data that can be read or written as an individual unit. Block blobs are best used to store discrete, large, binary objects that change infrequently.

* *Page blobs*. A page blob is organized as a collection of fixed-size 512-byte pages. A page blob is optimized to support random read and write operations; you can fetch and store data for a single page if necessary. A page blob can hold up to 8 TB of data. Azure uses page blobs to implement virtual disk storage for virtual machines.

* *Append blobs*. An append blob is a block blob optimized to support append operations. You can only add blocks to the end of an append blob; updating or deleting existing blocks isn't supported. Each block can vary in size, up to 4 MB. The maximum size of an append blob is just over 195 GB.
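The maximum sizes quoted for block blobs and append blobs follow from multiplying the block count by the block size. The short calculation below is a sketch of that arithmetic. It assumes binary units (1 MB taken as 1024² bytes) and assumes the 50,000-block limit stated for block blobs also applies to append blobs, which the text above does not say explicitly.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# Block count quoted above for block blobs; assumed here to apply to append blobs too.
MAX_BLOCKS = 50_000

block_blob_max = MAX_BLOCKS * 100 * MIB   # up to 100 MB per block
append_blob_max = MAX_BLOCKS * 4 * MIB    # up to 4 MB per block

print(f"block blob:  {block_blob_max / TIB:.2f} TiB")
print(f"append blob: {append_blob_max / GIB:.2f} GiB")
```

Under these assumptions the results land at a little over 4.7 TB and a little over 195 GB, matching the figures in the list.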


Inside an Azure storage account, you create blobs inside containers. A container provides a convenient way of grouping related blobs together, and you can organize blobs in a hierarchy of folders, similar to files in a file system on disk.

![Blob Container](Blob_container.png)
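One way to picture the folder hierarchy inside a container is to treat `/` in a blob name as a folder separator. The sketch below groups a few hypothetical blob names by their folder path. It is plain Python, not a call to the Azure SDK, and the blob names and the helper `group_by_folder` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical blob names in one container; "/" in a name acts as a folder separator.
blob_names = [
    "products/bikes/road-bike.png",
    "products/bikes/mountain-bike.png",
    "products/helmets/sport-helmet.png",
    "logo.png",
]


def group_by_folder(names):
    """Group blob names by the folder path that precedes the final "/"."""
    folders = defaultdict(list)
    for name in names:
        folder, _, leaf = name.rpartition("/")
        folders[folder].append(leaf)
    return dict(folders)


for folder, leaves in group_by_folder(blob_names).items():
    print(folder or "(container root)", leaves)
```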

Blob storage provides three access tiers, which help to balance access latency and storage cost:

* Hot tier
* Cool tier
* Archive tier

Common uses of Azure Blob Storage include:

* Serving images or documents directly to a browser, in the form of a static website. See the Azure documentation on static website hosting in Azure Storage for details.
* Storing files for distributed access
* Streaming video and audio
* Storing data for backup and restore, disaster recovery, and archiving
* Storing data for analysis by an on-premises or Azure-hosted service

### Explore Azure File Storage

For storing files...

### Explore Azure Cosmos DB

**What is Azure Cosmos DB?**

Azure Cosmos DB is a multi-model NoSQL database management system. Cosmos DB manages data as a partitioned set of documents. A document is a collection of fields, identified by a key. 

Cosmos DB provides APIs that enable you to access these documents using a set of well-known interfaces.

* SQL API
* Table API
* MongoDB API
* Cassandra API
* Gremlin API: provides graph-like relationships

## Explore provisioning and deploying non-relational data services in Azure

**Learning objectives**

In this module, you will:

* Provision non-relational data services
* Configure non-relational data services
* Explore basic connectivity issues
* Explore data security components

### Describe provisioning non-relational data services

As the data engineer, you're asked to set up data stores using Azure Cosmos DB, Azure Blob storage, Azure Data Lake store, and Azure File storage.

The act of increasing (or decreasing) the resources used by a service is called scaling.

Azure provides several tools you can use to provision services:

* *The Azure portal*
* *The Azure command-line interface (CLI)*
* *Azure PowerShell*
* *Azure Resource Manager templates*


**Provision Azure Cosmos DB**

Azure Cosmos DB is a document database. In Cosmos DB, you organize your data as a collection of documents stored in containers. Containers are held in a database. 
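The database, container, and document hierarchy can be sketched with nested Python structures. This is only a mental model, not the Cosmos DB SDK. The names `ProductData` and `ProductCatalog` are borrowed from the exercise output at the end of these notes; the document fields and the helper `find_document` are hypothetical.

```python
import json

# Hypothetical account contents: database name -> container name -> list of documents.
account = {
    "ProductData": {
        "ProductCatalog": [
            {"id": "1", "productname": "Road bike", "stocklevel": 12},
            {"id": "2", "productname": "Helmet", "stocklevel": 40},
        ]
    }
}


def find_document(database, container, doc_id):
    """Look up one document by walking database -> container -> document."""
    for doc in account[database][container]:
        if doc["id"] == doc_id:
            return doc
    return None


print(json.dumps(find_document("ProductData", "ProductCatalog", "2"), indent=2))
```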


### Provision other non-relational data services

This unit describes how to provision Data Lake storage, Blob storage, and File Storage. As with Cosmos DB, you can provision these services using the Azure portal, the Azure CLI, Azure PowerShell, and Azure Resource Manager templates.

### Exercise: Provision non-relational Azure data services

In the sample scenario, you've decided to create the following data stores:

* A Cosmos DB database for holding information about the volume of items in stock. You need to store current and historic information about volume levels, so you can track how levels vary over time. The data is recorded daily.
* A Data Lake store for holding production and quality data.
* A blob container for holding images of the products the company manufactures.
* File storage for sharing reports.

In this exercise, you'll provision and configure the Cosmos DB account, and test it by creating a database, a container, and a sample document. You'll also provision an Azure Storage account that can provide blob, file, and Data Lake storage.

## Manage non-relational data stores in Azure

Learn how to upload and retrieve data held in Azure Cosmos DB, Azure Blob storage, and Azure File storage.

**Learning objectives**

In this module, you will:

* Describe Azure Cosmos DB APIs
* Describe non-relational Azure storage management

### Manage Azure Cosmos DB

Azure Cosmos DB is a NoSQL database management system. It's compatible with some existing NoSQL systems, including MongoDB and Cassandra. In the Contoso scenario, you've created a Cosmos DB database for holding information about the quantity of items in stock. You now need to understand how to populate this database, and how to query it.

![partitioned data](partitioned-data.png)

### Query Azure Cosmos DB

Although Azure Cosmos DB is described as a NoSQL database management system, the SQL API enables you to run SQL-like queries against Cosmos DB databases. These queries use a syntax similar to that of SQL, but there are some differences, because the data in a Cosmos DB database is structured as documents rather than tables.
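To give a feel for the document-oriented query style, the sketch below pairs a SQL-like query string with a plain-Python filter that produces the same result over a small list of documents. The documents, field names, and the helper `run_equivalent_filter` are hypothetical, and the query is not sent to any service here; it only illustrates how a query addresses document fields through an alias (`c`) rather than table columns.

```python
# Hypothetical documents in a container; field names are illustrative only.
documents = [
    {"id": "1", "productname": "Road bike", "category": "bikes", "price": 950},
    {"id": "2", "productname": "Helmet", "category": "accessories", "price": 45},
    {"id": "3", "productname": "Mountain bike", "category": "bikes", "price": 1200},
]

# SQL-like query text; "c" is an alias for the documents in the container.
QUERY = 'SELECT c.id, c.productname FROM c WHERE c.category = "bikes"'


def run_equivalent_filter(docs):
    """Plain-Python equivalent of QUERY: filter on category, project id and productname."""
    return [
        {"id": d["id"], "productname": d["productname"]}
        for d in docs
        if d["category"] == "bikes"
    ]


print(QUERY)
print(run_equivalent_filter(documents))
```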

### Manage Azure Blob storage

Azure Blob storage is a suitable repository for holding large binary objects, such as images, video, and audio files. In the Contoso scenario, you've created a blob container for holding images of the products the company manufactures.

### Manage Azure File storage

You can use Azure File storage to store shared files. Users can connect to a shared folder (also known as a file share) and read and write files (if they have the appropriate privileges) in much the same way as they would use a folder on a local machine.
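Because a connected file share behaves like a local folder, ordinary file operations apply to it. The sketch below uses a temporary directory as a stand-in for the mount point; the real path or drive letter depends on how the share is connected, so treat `mount_point` and the file names as assumptions.

```python
import tempfile
from pathlib import Path

# Stand-in for a mounted file share. A real mount point (drive letter or path)
# depends on how the share is connected on the user's machine.
with tempfile.TemporaryDirectory() as mount_point:
    share = Path(mount_point)
    reports = share / "reports"
    reports.mkdir()

    # Write and read a file exactly as you would in a local folder.
    report = reports / "weekly.txt"
    report.write_text("Units shipped: 120\n", encoding="utf-8")
    contents = report.read_text(encoding="utf-8")
    listing = sorted(p.name for p in reports.iterdir())

print(listing)
print(contents, end="")
```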

### Exercise: Upload, download, and query data in a non-relational data store


In the sample scenario, suppose that you've created the following data stores:

* A Cosmos DB database for holding information about the products that Contoso manufactures.
* A blob container in Azure Storage for holding the images of products.

In this exercise, you'll run a script to upload data to these data stores. You'll perform queries against the data in the Cosmos DB database. Then, you'll download and view the images held in Azure Storage.

```Azure Cloud Shell
Requesting a Cloud Shell.Succeeded.
Connecting terminal...

Welcome to Azure Cloud Shell

Type "az" to use Azure CLI
Type "help" to learn about Cloud Shell

diego@Azure:~$ git clone https://github.com/MicrosoftLearning/DP-900T00A-Azure-Data-Fundamentals dp-900
Cloning into 'dp-900'...
remote: Enumerating objects: 3646, done.
remote: Counting objects: 100% (21/21), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 3646 (delta 15), reused 12 (delta 12), pack-reused 3625
Receiving objects: 100% (3646/3646), 4.78 MiB | 17.18 MiB/s, done.
Resolving deltas: 100% (732/732), done.
diego@Azure:~$ cd dp-900/nosql
diego@Azure:~/dp-900/nosql$ bash setup.sh
Creating Cosmos DB account

Cosmos DB account name:  cosmos31779
Cosmos DB database:  ProductData
Cosmos DB container:  ProductCatalog
Storage account name:  storage32042

diego@Azure:~/dp-900/nosql$
```