Merge pull request #2769 from vmagelo/batch2-seo-fix8
Batch2 seo fix8 newsfeed-ingestion
Ed Price - MSFT committed Apr 28, 2021
2 parents bc658b0 + ad969c1 commit 6977828
Showing 9 changed files with 81 additions and 32 deletions.
5 changes: 5 additions & 0 deletions .openpublishing.redirection.json
@@ -2101,6 +2101,11 @@
"source_path": "docs/solution-ideas/articles/cctv-mask-detection.yml",
"redirect_url": "/azure/architecture/solution-ideas/articles/cctv-iot-edge-for-covid-19-safe-environment-and-mask-detection",
"redirect_document_id": true
},
{
"source_path": "docs/example-scenario/ai/newsfeed-ingestion.yml",
"redirect_url": "/azure/architecture/example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis",
"redirect_document_id": true
}
]
}
20 changes: 19 additions & 1 deletion docs/aws-professional/services.md
@@ -10,6 +10,9 @@ ms.service: architecture-center
ms.subservice: cloud-fundamentals
ms.custom:
- fcp
ms.category:
- analytics
- ai-machine-learning
keywords:
- cloud services comparison
- cloud services compared
@@ -18,6 +21,21 @@ keywords:
- compare Azure and AWS
- compare AWS and Azure
- IT capabilities
categories:
- compute
- storage
- databases
- networking
- security
- ai-machine-learning
products:
- azure-cosmos-db
- azure-functions
- azure-storage
- azure-search
- azure-cognitive-search
- azure-computer-vision
- azure-translator-text
---

<!-- cSpell:ignore Alexa Rekognition Cognito ElastiCache Greengrass Firehose -->
@@ -129,7 +147,7 @@ For an overview of Azure for AWS users, see [Introduction to Azure for AWS profe

[!INCLUDE [Advanced Analytics Architecture](../../includes/cards/advanced-analytics-on-big-data.md)]
[!INCLUDE [Automated enterprise BI](../../includes/cards/enterprise-bi-adf.md)]
[!INCLUDE [Mass ingestion and analysis of news feeds on Azure](../../includes/cards/newsfeed-ingestion.md)]
[!INCLUDE [Mass ingestion and analysis of news feeds on Azure](../../includes/cards/news-feed-ingestion-and-near-real-time-analysis.md)]

</ul>

6 changes: 3 additions & 3 deletions docs/browse/data/architectures.json
@@ -7297,8 +7297,8 @@
"pricing-guidance"
],
"type": "example-workload",
"file_url": "example-scenario/ai/newsfeed-ingestion.md",
"http_url": "/azure/architecture/example-scenario/ai/newsfeed-ingestion",
"file_url": "example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis.md",
"http_url": "/azure/architecture/example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis",
"word_count": 1334,
"read_time": "5 min read",
"Title": "Mass ingestion and analysis of news feeds on Azure",
@@ -7324,7 +7324,7 @@
},
"sample_code": true,
"github_url": "https://github.com/Azure/cognitive-services",
"name": "newsfeed-ingestion",
"name": "news-feed-ingestion-and-near-real-time-analysis",
"popularity": 69,
"topic": "Analytics"
},
File renamed without changes
@@ -3,22 +3,22 @@



This example scenario describes a pipeline for mass ingestion and near real-time analysis of documents using public RSS news feeds. It uses Azure Cognitive Services to offer useful insights including text translation, facial recognition, and sentiment detection.
This example scenario describes a pipeline for mass ingestion and near real-time analysis of documents coming from public RSS news feeds. It uses [Azure Cognitive Services](/azure/cognitive-services/what-are-cognitive-services) to provide useful insights based on text translation, facial recognition, and sentiment detection. Specifically, image and natural language processing steps are connected together in a messaging pipeline based on [Azure Service Bus](/azure/service-bus-messaging/service-bus-messaging-overview). The output of the pipeline is a notification containing the insight or analysis.

This scenario contains examples for [English][english], [Russian][russian], and [German][german] news feeds, but you can easily extend it to other RSS feeds. For ease of deployment, the data collection, processing, and analysis are based entirely on Azure services.
This scenario contains examples for [English][english], [Russian][russian], and [German][german] news feeds, but you can easily extend it to other RSS feeds and other languages. For ease of deployment, the data collection, processing, and analysis are based entirely on Azure services.

## Relevant use cases
## Potential use cases

While this scenario is based on processing of RSS feeds, it's relevant to any document, website, or article where you would need to:

- Translate any text to the language of choice.
- Translate text to a language of choice.
- Find key phrases, entities, and user sentiment in digital content.
- Detect objects, text, and landmarks in images associated with a digital article.
- Detect people by their gender and age in any image associated with digital content.
- Detect people by gender and age in images associated with digital content.

## Architecture

![Diagram of the architecture][architecture]
![Architecture diagram: ingest and analyze RSS feeds using image and text processing and send notifications.][architecture]

The data flows through the solution as follows:

@@ -32,7 +32,7 @@ The data flows through the solution as follows:

5. A detect function is triggered from the queued article. It uses the [Computer Vision][vision] service to detect objects, landmarks, and written words in the associated image, then passes the article to the next queue.

6. A face function is triggered is triggered from the queued article. It uses the [Azure Face API][face] service to detect faces for gender and age in the associated image, then passes the article to the next queue.
6. A face function is triggered from the queued article. It uses the [Azure Face API][face] service to detect faces for gender and age in the associated image, then passes the article to the next queue.

7. When all functions are complete, the notify function is triggered. It loads the processed records for the article and scans them for any results you want. If found, the content is flagged and a notification is sent to the system of your choice.
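The serial hand-off described in steps 5 to 7 can be sketched in a few lines. This is a minimal in-memory illustration, not the scenario's actual function code: `make_stage`, `run_serial_pipeline`, and the lambda analyzers are hypothetical names, and a plain `deque` stands in for each Service Bus queue. A real deployment would call the Cognitive Services endpoints from Azure Functions instead.

```python
from collections import deque


def make_stage(name, analyze):
    """Wrap an analysis callable as a stage that stores its result under `name`."""
    def stage(article):
        enriched = dict(article)  # copy so the queued message is not mutated
        enriched[name] = analyze(article)
        return enriched
    return stage


def run_serial_pipeline(articles, stages, is_interesting):
    """Pass every article through each stage in order, then keep the flagged ones."""
    queue = deque(articles)
    for stage in stages:
        next_queue = deque()
        while queue:
            # Each stage reads from its queue and writes to the next one.
            next_queue.append(stage(queue.popleft()))
        queue = next_queue
    # The final "notify" step scans the processed records for wanted results.
    return [article for article in queue if is_interesting(article)]


if __name__ == "__main__":
    # Placeholder analyzers; they return made-up values for illustration only.
    stages = [
        make_stage("detect", lambda a: ["landmark"] if a.get("image") else []),
        make_stage("face", lambda a: []),
    ]
    flagged = run_serial_pipeline(
        [{"id": "a1", "image": "a1.png"}, {"id": "a2"}],
        stages,
        lambda a: "landmark" in a["detect"],
    )
    print([a["id"] for a in flagged])
```

Because each stage only sees the output of the previous one, the order of the `stages` list fixes the processing order, which mirrors the serial design of this example.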

@@ -56,23 +56,23 @@ The following list of Azure components is used in this example.

### Alternatives

- Instead of using a pattern based on queue notification and Azure Functions, use another pattern for this data flow. For example, [Azure Service Bus Topics][topics] can be used to processes the various parts of the article in parallel as opposed to the serial processing done in this example. For more information, compare [queues and topics][queues-topics].
- Instead of using a pattern based on *queue notification* and Azure Functions, you could use a *topic and subscription* pattern for this data flow. [Azure Service Bus Topics][topics] can be used to process the various parts of the article in parallel as opposed to the serial processing done in this example. For more information, compare [queues and topics][queues-topics].

- Use [Azure Logic Apps][logic-app] to implement the function code and implement record-level locking such as [Redlock][redlock] (needed for parallel processing until Azure Cosmos DB supports [partial document updates][partial]). For more information, [compare Functions and Logic Apps][compare].
- Use [Azure Logic Apps][logic-app] to implement the function code and implement record-level locking such as that provided by the [Redlock algorithm][redlock] (which is needed for parallel processing until Azure Cosmos DB supports [partial document updates][partial]). For more information, [compare Functions and Logic Apps][compare].

- Implement this architecture using customized AI components rather than existing Azure services. For example, extend the pipeline using a customized model that detects certain people in an image as opposed to the generic people count, gender, and age data collected in this example. To use customized machine learning or AI models with this architecture, build the models as RESTful endpoints so they can be called from Azure Functions.

- Use a different input mechanism instead of RSS feeds. Use multiple generators or ingestion processes to feed Azure Cosmos DB and Azure Storage.

- [Azure Cognitive Search](/azure/search) is an AI feature in Azure Search that can also used to extract text from images, blobs, and other unstructured data sources.
- [Azure Cognitive Search](/azure/search) is an AI feature in Azure Search that can also be used to extract text from images, blobs, and other unstructured data sources.
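To make the difference between the queue pattern and the topic and subscription alternative more concrete, here is a small in-memory sketch. It does not use the Azure Service Bus SDK; the `Topic` class and the subscription names are hypothetical and only model the fan-out behavior, where every subscription receives its own copy of a published message.

```python
from collections import deque


class Topic:
    """Toy model of a topic: each subscription gets its own copy of every message."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name):
        """Create a named subscription and return its private message queue."""
        self._subscriptions[name] = deque()
        return self._subscriptions[name]

    def publish(self, message):
        """Deliver an independent copy of the message to every subscription."""
        for subscription in self._subscriptions.values():
            subscription.append(dict(message))


if __name__ == "__main__":
    topic = Topic()
    detect_sub = topic.subscribe("detect")
    face_sub = topic.subscribe("face")
    topic.publish({"id": "a1"})
    # Both consumers can now work on the same article independently.
    print(len(detect_sub), len(face_sub))
```

With a single queue, one message is consumed by one receiver, so the stages run one after another. With a topic, the detect and face consumers could run in parallel, which is why the Logic Apps alternative above mentions record-level locking when several consumers update the same document.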

## Considerations

For simplicity, this example scenario uses only a few of the available APIs and services from Azure Cognitive Services. For example, text in images can be analyzed using the [Text Analytics API][text-analytics]. The target language in this scenario is assumed to be English, but you can change the input to any [supported language][language] of your choice.
For simplicity, this example scenario uses only a few of the available APIs and services from Azure Cognitive Services. For example, text in images can be analyzed using the [Text Analytics API][text-analytics]. The target language in this scenario is assumed to be English, but you can change the input to any [supported language][language].

### Scalability

Azure Functions scaling depends on the [hosting plan][plan] you use. This solution assumes a [Consumption plan][plan-c], in which compute power is automatically allocated to the functions when required. You pay only when your functions are running. Another option is to use an [Azure App Service][plan-aas] plan, which allows you to scale between tiers to allocate a different amount of resources.
Azure Functions scaling depends on the [hosting plan][plan] you use. This solution assumes a [Consumption plan][plan-c], in which compute power is automatically allocated to the functions when required. You pay only when your functions are running. Another option is to use a [Dedicated plan][plan-ded], which allows you to scale between tiers to allocate a different amount of resources.

With Azure Cosmos DB, the key is to distribute your workload roughly evenly among a sufficiently large number of [partition keys][keys]. There's no limit to the total amount of data that a container can store or to the total amount of
[throughput][throughput] that a container can support.
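One rough way to check whether a workload is spread "roughly evenly" is to compare the busiest partition key against the average. The sketch below is an assumption-laden illustration: `partition_skew` is a hypothetical helper, and it works on a plain list of partition key values that you would have to sample yourself, not on any Azure Cosmos DB API.

```python
from collections import Counter


def partition_skew(partition_keys):
    """Return the largest per-key count divided by the mean count.

    A value of 1.0 means the sampled items are spread perfectly evenly;
    larger values mean one key holds a disproportionate share.
    """
    counts = Counter(partition_keys)
    if not counts:
        raise ValueError("no partition keys supplied")
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean
```

For example, a sample with keys `["a", "a", "a", "b"]` gives a skew of 1.5, because key `a` holds three items while the average per key is two.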
@@ -85,7 +85,7 @@ To view the logs generated by the solution:

1. Go to [Azure portal][portal] and navigate to the resource group created for the deployment.

2. Click the **Application Insights** instance.
2. Select the **Application Insights** instance.

3. From the **Application Insights** section, navigate to **Investigate\\Search** and search the data.

@@ -95,8 +95,6 @@ Azure Cosmos DB uses a secured connection and shared access signature through th

## Pricing

The estimated daily cost to keep this deployment available is approximately \$20 U.S. with no data moving through the system.

Azure Cosmos DB is powerful but incurs the greatest [cost][db-cost] in this deployment. You can use another storage solution by refactoring the Azure Functions code provided.

Pricing for Azure Functions varies depending on the [plan][function-plan] it runs in.
@@ -108,6 +106,27 @@ Pricing for Azure Functions varies depending on the [plan][function-plan] it run
All the code for this scenario is available in the [GitHub][github] repository. This repository contains the source code used to build the generator application that feeds the pipeline for this demo.

## Next steps

* [Choosing an analytical data store in Azure](/azure/architecture/data-guide/technology-choices/analytical-data-stores)
* [Choosing a data analytics technology in Azure](/azure/architecture/data-guide/technology-choices/analysis-visualizations-reporting)
* [Choosing a big data storage technology in Azure](/azure/architecture/data-guide/technology-choices/data-storage)
* [Introduction to Azure Blob storage](/azure/storage/blobs/storage-blobs-introduction)
* [Welcome to Azure Cosmos DB](/azure/cosmos-db/introduction)
* [Introduction to Azure Functions](/azure/azure-functions/functions-overview)

## Related resources

Additional analytics architectures:

* [Automated enterprise BI](/azure/architecture/reference-architectures/data/enterprise-bi-adf)
* [Analytics end-to-end with Azure Synapse](/azure/architecture/example-scenario/dataplate2e/data-platform-end-to-end)
* [Data warehousing and analytics](/azure/architecture/example-scenario/data/data-warehouse)
* [Mass ingestion and analysis of news feeds on Azure](/azure/architecture/example-scenario/ai/newsfeed-ingestion)
* [Stream processing with Azure Databricks](/azure/architecture/reference-architectures/data/stream-processing-databricks)
* [Stream processing with Azure Stream Analytics](/azure/architecture/reference-architectures/data/stream-processing-stream-analytics)


[architecture]: ./media/mass-ingestion-newsfeeds-architecture.png
[aai]: /azure/azure-monitor/app/app-insights-overview
[aas]: https://azure.microsoft.com/try/app-service
@@ -130,8 +149,8 @@ All the code for this scenario is available in the [GitHub][github] repository.
[queues-topics]: /azure/service-bus-messaging/service-bus-queues-topics-subscriptions
[partial]: https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/6693091-be-able-to-do-partial-updates-on-document
[plan]: /azure/azure-functions/functions-scale
[plan-aas]: /azure/azure-functions/functions-scale#app-service-plan
[plan-c]: /azure/azure-functions/functions-scale#consumption-plan
[plan-c]: /azure/azure-functions/consumption-plan
[plan-ded]: /azure/azure-functions/dedicated-plan
[portal]: https://portal.azure.com
[redlock]: https://redis.io/topics/distlock
[russian]: http://government.ru/all/rss
@@ -141,4 +160,4 @@ All the code for this scenario is available in the [GitHub][github] repository.
[topics]: /azure/service-bus-messaging/service-bus-dotnet-how-to-use-topics-subscriptions
[text-analytics]: /azure/cognitive-services/text-analytics
[translate-text]: /azure/cognitive-services/translator/translator-info-overview
[vision]: /azure/cognitive-services/computer-vision/home
[vision]: /azure/cognitive-services/computer-vision/home
@@ -1,6 +1,6 @@
### YamlMime:Architecture
metadata:
title: Mass ingestion and analysis of news feeds on Azure
title: Analyze news feeds with near real-time analytics
description: Create a pipeline for ingesting and analyzing RSS news feed data using Azure Cosmos DB, Azure Cognitive Services, and other Azure services.
author: njray
ms.date: 2/1/2019
@@ -16,17 +16,24 @@ metadata:
- example-scenario
- internal-intro
- ai-machine-learning
- aml-project-improvement
social_image_url: /azure/architecture/example-scenario/ai/media/mass-ingestion-newsfeeds-architecture.png
name: Mass ingestion and analysis of news feeds on Azure
name: Analyze news feeds with near real-time analytics using image and natural language processing
azureCategories:
- analytics
- ai-machine-learning
summary: Create a pipeline for ingesting and analyzing RSS news feed data using Azure Cosmos DB, Azure Cognitive Services, and other Azure services.
products:
- azure-cosmos-db
- azure-functions
- azure-search
- azure-storage
- azure-search
- azure-cognitive-search
- azure-service-bus
- azure-computer-vision
- azure-translator-text
- azure-face
- azure-application-insights
thumbnailUrl: /azure/architecture/browse/thumbs/mass-ingestion-newsfeeds-architecture.png
content: |
[!include[](newsfeed-ingestion-content.md)]
[!include[](news-feed-ingestion-and-near-real-time-analysis-content.md)]
4 changes: 2 additions & 2 deletions docs/toc.yml
@@ -801,8 +801,8 @@ items:
href: example-scenario/data/data-warehouse.yml
- name: Demand forecasting for shipping
href: solution-ideas/articles/demand-forecasting-for-shipping-and-distribution.yml
- name: Mass ingestion of news feeds on Azure
href: example-scenario/ai/newsfeed-ingestion.yml
- name: Ingestion and analysis of news feeds
href: example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis.yml
- name: Partitioning in Event Hubs and Kafka
href: reference-architectures/event-hubs/partitioning-in-event-hubs-and-kafka.yml
- name: Stream processing with Azure Databricks
@@ -6,11 +6,11 @@
<article class="card">
<div class="card-header has-margin-bottom-none" aria-hidden="true">
<figure class="image diagram has-height-175 has-overflow-hidden level">
<a href="/azure/architecture/example-scenario/ai/newsfeed-ingestion"><img src="/azure/architecture/browse/thumbs/newsfeed-ingestion.png" class="diagram" alt="Thumbnail of Mass ingestion and analysis of news feeds on Azure Architectural Diagram." data-linktype="relative-path"></a>
<a href="/azure/architecture/example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis"><img src="/azure/architecture/browse/thumbs/news-feed-ingestion-and-near-real-time-analysis.png" class="diagram" alt="Thumbnail of mass ingestion and analysis of news feeds on Azure Architectural Diagram." data-linktype="relative-path"></a>
</figure>
</div>
<div class="card-content">
<a class="card-content-title has-margin-top-none" href="/azure/architecture/example-scenario/ai/newsfeed-ingestion">
<a class="card-content-title has-margin-top-none" href="/azure/architecture/example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis">
<p>Mass ingestion and analysis of news feeds on Azure</p>
</a>
<ul class="card-content-metadata">
4 changes: 2 additions & 2 deletions includes/scenario-articles.md
@@ -71,13 +71,13 @@ ms.subservice: example-scenario
</a>
</li>
<li style="display: flex; flex-direction: column;">
<a href="~/example-scenario/ai/newsfeed-ingestion.md" style="display: flex; flex-direction: column; flex: 1 0 auto;">
<a href="~/example-scenario/ai/news-feed-ingestion-and-near-real-time-analysis.md" style="display: flex; flex-direction: column; flex: 1 0 auto;">
<div class="cardSize" style="flex: 1 0 auto; display: flex;">
<div class="cardPadding" style="display: flex;">
<div class="card">
<div class="cardImageOuter">
<div class="cardImage">
<img src="~/example-scenario/ai/media/mass-ingestion-newsfeeds-architecture.png" alt="Architecture diagram for Mass ingestion and analysis of news feeds on Azure" height="140px" />
<img src="~/example-scenario/ai/media/mass-ingestion-newsfeeds-architecture.png" alt="Architecture diagram for mass ingestion and analysis of news feeds on Azure" height="140px" />
</div>
</div>
<div class="cardText">