Commit 1d28bc6

Merge pull request #14900 from MicrosoftDocs/references
[Canopy] Address some missed references and update long description
2 parents 837cd8a + 5277c6f commit 1d28bc6

5 files changed: 27 additions & 31 deletions

docs/data-guide/technology-choices/stream-processing.md

Lines changed: 26 additions & 25 deletions
Original file line number | Diff line number | Diff line change
@@ -33,8 +33,8 @@ To help you choose the right technology, this section outlines common options in
3333

3434
### High-level stream processing flow
3535

36-
:::image type="complex" source="../images/stream-processing.svg" alt-text="A diagram that shows the dataflow for the end-to-end data processing solution." lightbox="../images/stream-processing.svg" border="false":::
37-
The flow starts with mobile apps and customer-facing apps. Step 1 is labeled stream producers. It includes three subsections. The subsection labeled device endpoint telemetry contains Azure IoT Hub and Azure IoT Edge. The subsection labeled CDC generated from databases contains Azure Cosmos DB and Azure SQL Database. These two subsections point to step 2. The subsection labeled telemetry and events from custom applications contains Azure Kubernetes Service (AKS) and Azure Functions. This subsection points to the line that goes from the first two subsections to step 2. Step 2 is labeled stream ingestion. It contains Azure Event Hubs, Azure Event Grid, Kafka on HDInsight, and Kafka on Confluent. This step points to step 3, which is labeled stream processing. It contains Azure Stream Analytics, Fabric eventstream, and Azure Functions. It includes a subsection labeled Spark Structured Streaming, which contains Microsoft Fabric, Azure Synapse Analytics, and Azure Databricks. Step 3 points to step 4, which is labeled streaming sinks. This step contains Azure Data Explorer, Azure Cosmos DB, Azure Blob Storage, One Lake, and Fabric eventhouse.
36+
:::image type="complex" source="../images/stream-processing.svg" alt-text="A diagram that shows the dataflow for a stream processing solution." lightbox="../images/stream-processing.svg" border="false":::
37+
Left to right, the diagram shows a four-stage numbered streaming data pipeline: 1 Stream producers, 2 Stream ingestion, 3 Stream processing, 4 Streaming sinks. At the far left a tall box contains Mobile apps above Customer-facing apps. To its right three stacked producer boxes: the top box labeled Device endpoint metrics contains IoT Hub left of IoT Edge; the middle box labeled CDC generated from databases contains Azure Cosmos DB left of SQL Database; the bottom box labeled metrics and events from custom applications contains AKS left of Functions. Right-pointing arrows from each producer box converge on the ingestion box, which lists (top row left to right) Event Hubs and Event Grid, and (bottom row) HDInsight Kafka left of Confluent Kafka. A single arrow leads to the processing box showing (top row) Stream Analytics, Eventstream, Functions, and below them an inset Spark Structured Streaming box containing Fabric left of Azure Databricks. Another arrow points to the sinks box listing Azure Data Explorer above Blob Storage, Azure Cosmos DB above One Lake, and Eventhouse at the bottom.
3838
:::image-end:::
3939

4040
*Download a [Visio file](https://arch-center.azureedge.net/stream-processing.vsdx) of this architecture.*
@@ -64,10 +64,10 @@ Stream producers provide the following benefits:
6464
#### General capabilities
6565

6666
| Capability | IoT Hub | CDC producers |Custom applications|
67-
| --- | --- | --- | --- |
68-
| Device telemetry | Yes | No | No |
69-
| Managed service | Yes | No | No |
70-
| Scalability | Yes | Yes | Yes |
67+
| --- | --- | --- | --- |
68+
| Device metrics and logs | Yes | No | No |
69+
| Managed service | Yes | No | No |
70+
| Scalability | Yes | Yes | Yes |
7171

7272
### Stream ingestion
7373

@@ -80,7 +80,7 @@ Consider the following factors:
8080
- **Scalability:** Ensure that the ingestion layer can scale dynamically as data volume, variety, and velocity increase.
8181
- **Data integrity and reliability:** Prevent data loss or duplication during transmission.
8282

83-
#### Components
83+
#### Stream ingestion components
8484

8585
- [Event Hubs](/azure/well-architected/service-guides/event-hubs) is a real-time data ingestion service that can handle millions of events per second, which makes it ideal for high-throughput scenarios. It can scale dynamically and process massive volumes of data with low latency.
8686

@@ -92,14 +92,14 @@ Consider the following factors:
9292

9393
- [Apache Kafka on Confluent Cloud](/azure/partner-solutions/apache-kafka-confluent-cloud/overview) is a fully managed Apache Kafka service for real-time data ingestion. It integrates with Azure to simplify deployment and scaling. This solution includes features like schema registry, ksqlDB for stream queries, and enterprise-grade security. Use this option if you use Confluent's extended ecosystem of connectors and stream processing tools.
9494

95-
#### General capabilities
95+
#### Stream ingestion general capabilities
9696

9797
| Capability | Event Hubs | Kafka on HDInsight | Kafka on Confluent|
98-
| --- | --- | --- | --- |
99-
| Message retention | Yes | Yes | Yes |
98+
| --- | --- | --- | --- |
99+
| Message retention | Yes | Yes | Yes |
100100
| Message size limit| 1 MB | Configurable | Configurable |
101-
| Managed service | Yes | Managed infrastructure as a service | Yes |
102-
| Autoscale | Yes | Yes | Yes |
101+
| Managed service | Yes | Managed infrastructure as a service | Yes |
102+
| Autoscale | Yes | Yes | Yes |
103103
| Partner offering | No | No | Yes |
104104
| Pricing model | [Based on tier](https://azure.microsoft.com/pricing/details/event-hubs/) | [Per cluster hour](https://azure.microsoft.com/pricing/details/hdinsight/) | [Consumption models](https://azuremarketplace.microsoft.com/marketplace/apps/confluentinc.confluent-cloud-azure-prod?tab=PlansAndPrice) |
105105

@@ -115,7 +115,7 @@ Consider the following factors:
115115
- **Windowing:** Use sliding or tumbling windows to manage time-based aggregations and analytics.
116116
- **Fault tolerance:** Ensure that the system can recover from failures without data loss or reprocessing errors.
117117

118-
#### Components
118+
#### Stream processing components
119119

120120
- [Stream Analytics](/azure/stream-analytics/stream-analytics-introduction) is a managed service that uses a SQL-based query language to enable real-time analytics. Use this service for simple processing tasks like filtering, aggregating, and joining data streams. It integrates seamlessly with Event Hubs, IoT Hub, and Azure Blob Storage for input and output. Stream Analytics best suits low-complexity, real-time tasks where a simple, managed solution with SQL-based queries is sufficient.
121121

@@ -125,15 +125,15 @@ Consider the following factors:
125125

126126
- [Azure Functions](/azure/well-architected/service-guides/azure-functions) is a serverless compute service for event-driven processing. It's useful for lightweight tasks, like transforming data or triggering workflows based on real-time events. Azure functions are stateless by design. The durable functions feature extends capabilities to support stateful workflows for complex event coordination.
127127

128-
#### General capabilities
128+
#### Stream processing general capabilities
129129

130130
| Capability | Stream Analytics | Spark Structured Streaming (Fabric, Azure Databricks) | Fabric eventstreams| Azure Functions|
131-
| --- | --- | --- | --- | --- |
132-
| Micro-batch processing| Yes | Yes | Yes| No |
133-
| Event-based processing| No | No | Yes| Yes |
134-
| Stateful processing | Yes | Yes | Yes| No |
131+
| --- | --- | --- | --- | --- |
132+
| Micro-batch processing| Yes | Yes | Yes| No |
133+
| Event-based processing| No | No | Yes| Yes |
134+
| Stateful processing | Yes | Yes | Yes| No |
135135
| Support for check pointing | Yes | Yes | Yes| No |
136-
| Low-code interface | Yes| No | Yes | No |
136+
| Low-code interface | Yes| No | Yes | No |
137137
| Pricing model | [Streaming units](/azure/stream-analytics/stream-analytics-streaming-unit-consumption) | Yes | [Fabric SKU](https://azure.microsoft.com/pricing/details/microsoft-fabric/)| Yes |
138138

139139
### Streaming sinks
@@ -144,23 +144,23 @@ Consider the following factors:
144144

145145
- **Data consumption and usage:** Use Power BI for real-time analytics or reporting dashboards. It integrates well with Azure services and provides live visualizations of data streams.
146146

147-
- **Low-latency requirements:** Determine whether your system must deliver analytics on real-time data streams, such as device telemetry and application logs. Some applications might also require ultra-low latency for reads and writes, which makes them suitable for operational analytics or real-time applications.
147+
- **Low-latency requirements:** Determine whether your system must deliver analytics on real-time data streams, such as device metrics and application logs. Some applications might also require ultra-low latency for reads and writes, which makes them suitable for operational analytics or real-time applications.
148148
- **Scalability and volume:** Assess your workload's need to ingest large volumes of data, support diverse data formats, and scale efficiently and cost-effectively.
149149

150-
#### Components
150+
#### Streaming sinks components
151151

152152
- [Azure Data Lake Storage](/azure/storage/blobs/data-lake-storage-introduction) is a scalable, distributed, and cost-effective solution for storing unstructured and semi-structured data. It supports petabyte-scale storage and high-throughput workloads for storing large volumes of streaming data. It also enables fast read and write operations, which support analytics on streaming data and real-time data pipelines.
153153

154-
- A [Fabric eventhouse](/fabric/real-time-intelligence/eventhouse) is a KQL database for real-time analytics and exploration on vent-based data, such as telemetry and log data, time-series data, and IoT data. It supports ingestion of millions of events per second with low latency. This feature enables near-instant access to streaming data. An eventhouse deeply integrates with the Fabric ecosystem. It enables users to query and analyze streaming data immediately by using tools like Power BI.
154+
- A [Fabric eventhouse](/fabric/real-time-intelligence/eventhouse) is a KQL database for real-time analytics and exploration on event-based data, such as metrics and log data, time-series data, and IoT data. It supports ingestion of millions of events per second with low latency. This feature enables near-instant access to streaming data. An eventhouse deeply integrates with the Fabric ecosystem. It enables users to query and analyze streaming data immediately by using tools like Power BI.
155155

156156
- [Azure Cosmos DB](/azure/well-architected/service-guides/cosmos-db) is a NoSQL database for low-latency, globally distributed, and highly scalable data storage. It delivers high throughput and can handle large volumes of streaming data with consistent performance.
157157

158-
- [SQL Database](/azure/well-architected/service-guides/azure-sql-database-well-architected-framework) is a fully managed, cloud-based relational database service. It's built on the SQL Server engine. So it provides the capabilities of a traditional SQL Server database with the benefits of cloud-based scalability, reliability, and reduced management overhead.
158+
- [SQL Database](/azure/well-architected/service-guides/azure-sql-database-well-architected-framework) is a fully managed, cloud-based relational database service. It's built on the SQL Server engine. So it provides the capabilities of a traditional SQL Server database with the benefits of cloud-based scalability, reliability, and reduced management overhead.
159159

160-
#### General capabilities
160+
#### Streaming sinks general capabilities
161161

162162
| Capability | Data Lake Storage | Fabric eventhouse | Azure Cosmos DB|SQL Database|
163-
| --- | --- | --- | --- | --- |
163+
| --- | --- | --- | --- | --- |
164164
| General purpose object store | Yes | No | No | No|
165165
| Streaming data aggregations | No | Yes | No | No|
166166
| Low-latency reads and writes for JSON documents | No | Yes | Yes | No|
@@ -184,6 +184,7 @@ Principal author:
184184
- [Data Lake Storage](/azure/storage/blobs/data-lake-storage-introduction)
185185

186186
Explore the following training modules:
187+
187188
- [Explore Azure Functions](/training/modules/explore-azure-functions)
188189
- [Get started with Stream Analytics](/training/modules/introduction-to-data-streaming)
189190
- [Perform advanced streaming data transformations](/training/modules/perform-advanced-streaming-data-transformations-with-spark-kafka)

docs/guide/sap/sap-netweaver-content.md

Lines changed: 1 addition & 1 deletion
@@ -304,7 +304,7 @@ For infrastructure security, data is encrypted in transit and at rest. For infor
304304

305305
You can use [Azure Disk Encryption](/azure/security/azure-security-disk-encryption-overview) to encrypt Windows VM disks. This service uses the BitLocker feature of Windows to provide volume encryption for the operating system and the data disks. The solution also works with Azure Key Vault to help you control and manage the disk-encryption keys and secrets in your key vault subscription. Data on the VM disks is encrypted at rest in your Azure storage.
306306

307-
For data-at-rest encryption, SQL Server transparent data encryption (TDE) encrypts SQL Server, Azure SQL Database, and Azure Synapse Analytics data files. For more information, see [SQL Server Azure Virtual Machines DBMS deployment for SAP NetWeaver](/azure/virtual-machines/workloads/sap/dbms_guide_sqlserver).
307+
For data-at-rest encryption, SQL Server transparent data encryption (TDE) encrypts SQL Server and Azure SQL Database data files. For more information, see [SQL Server Azure Virtual Machines DBMS deployment for SAP NetWeaver](/azure/virtual-machines/workloads/sap/dbms_guide_sqlserver).
308308

309309
To monitor threats from inside and outside the firewall, consider deploying [Microsoft Sentinel (preview)](https://www.microsoft.com/security/blog/2021/05/19/protecting-sap-applications-with-the-new-azure-sentinel-sap-threat-monitoring-solution). The solution provides continuous threat detection and analytics for SAP systems that are deployed on Azure, in other clouds, or on-premises. For deployment guidance, see [Deploy Threat Monitoring for SAP in Microsoft Sentinel](/azure/sentinel/sap-deploy-solution).
310310

docs/guide/technology-choices/data-stores-getting-started.md

Lines changed: 0 additions & 1 deletion
@@ -109,7 +109,6 @@ The following table lists common use scenario requirements and the recommended d
109109
| Deliver high availability and elastic scaling to open-source mobile and web apps by using a managed community MySQL database service, or migrate MySQL workloads to the cloud. | [Azure Database for MySQL](/azure/mysql/overview) |
110110
| Build applications that have guaranteed low latency and high availability anywhere, at any scale, or migrate Cassandra, MongoDB, Gremlin, and other NoSQL workloads to the cloud. | [Azure Cosmos DB](/azure/cosmos-db/introduction) |
111111
| Modernize existing Cassandra data clusters and apps, and gain flexibility by using a managed instance service. | [Azure Managed Instance for Apache Cassandra](/azure/managed-instance-apache-cassandra/introduction) |
112-
| Build a managed elastic data warehouse that has security at every level of scale at no extra cost. | [Azure Synapse Analytics](/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overview-what-is) |
113112
| Deliver fast, scalable applications by using an open-source-compatible in-memory data store. | [Azure Cache for Redis](/azure/azure-cache-for-redis/cache-overview) |
114113

115114
## Database feature comparison

docs/integration/integration-start-here-content.md

Lines changed: 0 additions & 2 deletions
@@ -29,7 +29,6 @@ The following resources can help you learn the core concepts of integration:
2929
- [Integration design for Dynamics 365 solutions][Integration design for Dynamics 365 solutions]
3030
- [Data integrations with Finance and Operations apps][Data integrations with Finance and Operations apps]
3131
- [Examine business integration for IoT solutions][Examine business integration for IoT solutions]
32-
- [Integrate data with Azure Data Factory or Azure Synapse Pipeline][Integrate data with Azure Data Factory or Azure Synapse Pipeline]
3332
- [Explore Event Grid integration][Explore Event Grid integration]
3433
- [Architect API integration in Azure][Architect API integration in Azure]
3534

@@ -148,7 +147,6 @@ The following resources provide practical recommendations and information for sp
148147
[Google Cloud to Azure services comparison—Messaging and eventing]: ../gcp-professional/services.md#messaging-and-eventing
149148
[Google Cloud to Azure services comparison—Miscellaneous workflow]: ../gcp-professional/services.md#miscellaneous
150149
[Identify microservice boundaries]: ../microservices/model/microservice-boundaries.yml
151-
[Integrate data with Azure Data Factory or Azure Synapse Pipeline]: /training/modules/data-integration-azure-data-factory
152150
[Integrate Event Hubs with serverless functions on Azure]: ../serverless/event-hubs-functions/event-hubs-functions.yml
153151
[Integrate IBM mainframe and midrange message queues with Azure]: ../example-scenario/mainframe/integrate-ibm-message-queues-azure.yml
154152
[Integration design for Dynamics 365 solutions]: /training/modules/integration

docs/solution-ideas/articles/azure-databricks-modern-analytics-architecture-content.md

Lines changed: 0 additions & 2 deletions
@@ -180,7 +180,6 @@ To learn about related solutions, see the following guides and architectures.
180180
[Medallion model]: /azure/databricks/lakehouse/medallion
181181
[MLflow]: https://mlflow.org
182182
[MLflow Model Registry]: https://www.mlflow.org/docs/latest/registry.html
183-
[Native connectors]: /azure/databricks/data/data-sources/azure/synapse-analytics
184183
[Photon improves performance]: /azure/databricks/compute/photon
185184
[Power BI connector for Azure Databricks]: /azure/databricks/integrations/bi/power-bi
186185
[Stream processing with Azure Databricks]: ../../reference-architectures/data/stream-processing-databricks.yml
@@ -196,4 +195,3 @@ To learn about related solutions, see the following guides and architectures.
196195
[Microsoft Fabric]: /fabric/get-started/microsoft-fabric-overview
197196
[Data Factory in Microsoft Fabric]: /fabric/data-factory/data-factory-overview
198197
[Direct Lake]: /fabric/get-started/direct-lake-overview
199-
