
Commit

Merge pull request #11865 from MicrosoftDocs/learn-build-service-prodbot/docutune-autopr-20240415-050706-1277797-ignore-build

[DocuTune-Remediation] - DocuTune scheduled execution in AAC (part 3)
v-dirichards committed Apr 15, 2024
2 parents 171d0a8 + 1732418 commit 5be601f
Showing 9 changed files with 255 additions and 257 deletions.
8 changes: 4 additions & 4 deletions docs/ai-ml/architecture/real-time-recommendation-content.md
@@ -24,7 +24,7 @@ This reference architecture is for training and deploying a real-time recommende
### Components

- [Azure Databricks][databricks-service-page]. Databricks is a development environment used to prepare input data and train the recommender model on a Spark cluster. Azure Databricks also provides an interactive workspace to run and collaborate on notebooks for any data processing or machine learning tasks.
- [Azure Kubernetes Service (AKS)][aks-service-page]. AKS is used to deploy and operationalize a machine learning model service API on a Kubernetes cluster. AKS hosts the containerized model, providing scalability that meets your throughput requirements, identity and access management, and logging and health monitoring.
- [Azure Cosmos DB][cosmosdb-service-page]. Azure Cosmos DB is a globally distributed database service used to store the top 10 recommended movies for each user. Azure Cosmos DB is well-suited for this scenario, because it provides low latency (10 ms at 99th percentile) to read the top recommended items for a given user.
- [Machine Learning][mls-service-page]. This service is used to track and manage machine learning models, and then package and deploy these models to a scalable AKS environment.
- [Microsoft Recommenders][github]. This open-source repository contains utility code and samples to help users get started in building, evaluating, and operationalizing a recommender system.
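The serving pattern behind these components is a precomputed top-N lookup: the model scores items offline, and the API does a single point read per user at request time. The sketch below illustrates that pattern only; the document shape and field names (`id`, `recommendations`) and the score values are assumptions for illustration, not the schema used by the reference implementation.

```python
# Illustrative sketch of the precomputed top-N lookup pattern: the
# recommender writes each user's top items to a document store
# (Azure Cosmos DB in this architecture), and the serving API does a
# single point read per request. Field names here are assumptions.

def top_n(scores, n=10):
    """Return the n highest-scoring item ids, best first."""
    return [item for item, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:n]]

# Offline: scores as a trained ALS model might produce (hypothetical values).
user_scores = {
    "user-1": {"movie-a": 0.91, "movie-b": 0.72, "movie-c": 0.88},
}

# Documents as they might be stored, one per user.
documents = {
    uid: {"id": uid, "recommendations": top_n(scores)}
    for uid, scores in user_scores.items()
}

# Online: a point read by user id returns the precomputed list.
print(documents["user-1"]["recommendations"])  # ['movie-a', 'movie-c', 'movie-b']
```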
@@ -99,7 +99,7 @@ To deploy this architecture, follow the **Azure Databricks** instructions in the
1. Clone the [Microsoft Recommenders][github] repository into an environment where you can execute scripts (for example, your local computer).
1. Follow the **Quick install** setup instructions to [install the relevant libraries][setup] on Azure Databricks.
1. Follow the **Quick install** setup instructions to [prepare Azure Databricks for operationalization][setupo16n].
1. Import the [ALS Movie Operationalization notebook][als-example] into your workspace. After signing in to your Azure Databricks workspace, do the following:
1. Click **Home** on the left side of the workspace.
1. Right-click on white space in your home directory. Select **Import**.
1. Select **URL**, and paste the following into the text field:
@@ -110,13 +110,13 @@ To deploy this architecture, follow the **Azure Databricks** instructions in the

## Contributors

*This article is maintained by Microsoft. It was originally written by the following contributors.*

Principal authors:

- [Miguel Fierro](https://www.linkedin.com/in/miguelgfierro) | Principal Data Scientist Manager
- [Nikhil Joglekar](https://www.linkedin.com/in/nrjoglekar) | Product Manager, Azure algorithms and data science

*To see non-public LinkedIn profiles, sign in to LinkedIn.*

## Next steps
10 changes: 5 additions & 5 deletions docs/ai-ml/architecture/search-blob-metadata-content.md
@@ -1,6 +1,6 @@
This article demonstrates how to create a search service that enables users to search for documents based on document content in addition to any metadata that's associated with the files.

You can implement this service by using [multiple indexers](/azure/search/search-indexer-overview#indexer-scenarios-and-use-cases) in [Azure AI Search](/azure/search/search-what-is-azure-search).

This article uses an example workload to demonstrate how to create a single [search index](/azure/search/search-what-is-an-index) that's based on files in [Azure Blob Storage](/azure/storage/blobs/storage-blobs-overview). The file metadata is stored in [Azure Table Storage](/azure/storage/tables/table-storage-overview).
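A minimal sketch of the two-indexer setup follows; the indexer, data source, and index names are placeholders, not values from the example workload. The key point is that both indexers set the same `targetIndexName`, so the content fields from Blob Storage and the metadata fields from Table Storage merge by document key in one index.

```json
{
  "blobIndexer": {
    "name": "docs-blob-indexer",
    "dataSourceName": "docs-blob-datasource",
    "targetIndexName": "docs-index"
  },
  "tableIndexer": {
    "name": "docs-table-indexer",
    "dataSourceName": "docs-table-datasource",
    "targetIndexName": "docs-index"
  }
}
```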

@@ -20,7 +20,7 @@ This article uses an example workload to demonstrate how to create a single [sea

### Components

- [Blob Storage](https://azure.microsoft.com/products/storage/blobs/) provides cost-effective cloud storage for file data, including data in formats like PDF, HTML, and CSV, and in Microsoft 365 files.
- [Table Storage](https://azure.microsoft.com/products/storage/tables/) provides storage for nonrelational structured data. In this scenario, it's used to store the metadata for each document.
- [Azure AI Search](https://azure.microsoft.com/products/search/) is a fully managed search service that provides infrastructure, APIs, and tools for building a rich search experience.

@@ -44,7 +44,7 @@ Because the files that are searched in this scenario are binary documents, you c

### Searching file metadata

If you want to include additional information about the files, you can directly associate [metadata](/azure/storage/blobs/storage-blob-properties-metadata) with the blobs, without using a separate store. The built-in [Blob Storage search indexer can even read this metadata](/azure/search/search-howto-indexing-azure-blob-storage#indexing-blob-metadata) and place it in the search index. This enables users to search for metadata along with the file content. However, the [amount of metadata is limited to 8 KB per blob](/rest/api/storageservices/Setting-and-Retrieving-Properties-and-Metadata-for-Blob-Resources#Subheading1), so the amount of information that you can place on each blob is fairly small. You might choose to store only the most critical information directly on the blobs. In this scenario, only the document's *author* is stored on the blob.

To overcome this storage limitation, you can place additional metadata in another [data source that has a supported indexer](/azure/search/search-indexer-overview#supported-data-sources), like [Table Storage](/azure/storage/tables/table-storage-overview). You can add the document type, business impact, and other metadata values as separate columns in the table. If you configure the built-in [Table Storage indexer](/azure/search/search-howto-indexing-azure-tables) to target the same search index as the blob indexer, the blob and table storage metadata is combined for each document in the search index.
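The 8 KB cap can be checked client-side before deciding where a metadata value goes. The helper below is an illustrative sketch, assuming the limit counts the combined size of all metadata names and values; the function names are ours, not part of any Azure SDK.

```python
# Illustrative client-side check against the 8 KB blob metadata cap.
# Assumption: the limit applies to the combined size of all metadata
# names and values, so we sum the UTF-8 byte lengths of both.

METADATA_LIMIT_BYTES = 8 * 1024

def metadata_size(metadata: dict) -> int:
    """Approximate the size of a metadata dict as counted against the cap."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in metadata.items())

def fits_on_blob(metadata: dict) -> bool:
    return metadata_size(metadata) <= METADATA_LIMIT_BYTES

# Small, critical fields fit on the blob itself ...
print(fits_on_blob({"author": "jdoe"}))       # True
# ... while bulkier metadata belongs in Table Storage instead.
print(fits_on_blob({"summary": "x" * 9000}))  # False
```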

@@ -73,7 +73,7 @@ These considerations implement the pillars of the Azure Well-Architected Framewo

Reliability ensures that your application can meet the commitments you make to your customers. For more information, see [Overview of the reliability pillar](/azure/architecture/framework/resiliency/overview).

Azure AI Search provides a [high service-level agreement (SLA)](https://go.microsoft.com/fwlink/?LinkId=716855) for *reads* (querying) if you have at least two [replicas](/azure/search/search-capacity-planning#concepts-search-units-replicas-partitions-shards). It provides a high SLA for *updates* (updating the search indexes) if you have at least three replicas. You should therefore provision at least two replicas if you want your users to be able to search reliably, and three if actual changes to the index also need to be high availability operations.
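The replica thresholds above can be captured in a small helper when sizing a deployment; this is an illustrative sketch, and the function name is ours, not an Azure API.

```python
# Rule of thumb from the SLA guidance above: reads (queries) need at
# least 2 replicas, updates (indexing) need at least 3.

def sla_coverage(replicas: int) -> dict:
    """Report which operations the SLA covers at a given replica count."""
    return {"reads": replicas >= 2, "updates": replicas >= 3}

print(sla_coverage(2))  # {'reads': True, 'updates': False}
print(sla_coverage(3))  # {'reads': True, 'updates': True}
```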

[Azure Storage always stores multiple copies of your data](/azure/storage/common/storage-redundancy) to help protect it against planned and unplanned events. Azure Storage provides additional redundancy options for replicating data across regions. These safeguards apply to data in blob and table storage.

@@ -91,7 +91,7 @@ You can connect from the search service to the storage account by using a [priva

Cost optimization is about reducing unnecessary expenses and improving operational efficiencies. For more information, see [Overview of the cost optimization pillar](/azure/architecture/framework/cost/overview).

For information about the costs of running this scenario, see this preconfigured [estimate in the Azure pricing calculator](https://azure.com/e/375d2b930db14fbe90537421331f41de). All the services described here are configured in this estimate. The estimate is for a workload that has a total document size of 20 GB in Blob Storage and 1 GB of metadata in Table Storage. Two search units are used to satisfy the SLA for read purposes, as described in the [Reliability](#reliability) section of this article. To see how the pricing would change for your particular use case, change the appropriate variables to match your expected usage.

If you review the estimate, you can see that the cost of blob and table storage is relatively low. Most of the cost is incurred by Azure AI Search, because it performs the actual indexing and compute for running search queries.

