| description | title | ms.date | dev_langs | helpviewer_keywords | ms.topic |
|---|---|---|---|---|---|
| Learn more about: Walkthrough: Using BatchBlock and BatchedJoinBlock to Improve Efficiency | Walkthrough: Using BatchBlock and BatchedJoinBlock to Improve Efficiency | 03/30/2017 | | | tutorial |
The TPL Dataflow Library provides the `BatchBlock<T>` and `BatchedJoinBlock<T1,T2>` classes (both in the `System.Threading.Tasks.Dataflow` namespace) so that you can receive and buffer data from one or more sources and then propagate that buffered data as one collection. This batching mechanism is useful when you collect data from one or more sources and then process multiple data elements as a batch. For example, consider an application that uses dataflow to insert records into a database. This operation can be more efficient if multiple items are inserted at the same time instead of one at a time. This document describes how to use the `BatchBlock<T>` class to improve the efficiency of such database insert operations. It also describes how to use the `BatchedJoinBlock<T1,T2>` class to capture both the results and any exceptions that occur when the program reads from a database.
[!INCLUDE tpl-install-instructions]
- Read the Join Blocks section in the Dataflow document before you start this walkthrough.
- Ensure that you have a copy of the Northwind database, Northwind.sdf, available on your computer. This file is typically located in the folder %Program Files%\Microsoft SQL Server Compact Edition\v3.5\Samples\.
> [!IMPORTANT]
> In some versions of Windows, you cannot connect to Northwind.sdf if Visual Studio is running in non-administrator mode. To connect to Northwind.sdf, start Visual Studio or a Developer Command Prompt for Visual Studio in **Run as administrator** mode.
- In Visual Studio, create a Visual C# or Visual Basic Console Application project. In this document, the project is named `DataflowBatchDatabase`.
- In your project, add a reference to System.Data.SqlServerCe.dll and a reference to System.Threading.Tasks.Dataflow.dll.
- Ensure that Program.cs (Module1.vb for Visual Basic) contains the following `using` (`Imports` in Visual Basic) statements.

  [!code-csharpTPLDataflow_BatchDatabase#1] [!code-vbTPLDataflow_BatchDatabase#1]
- Add the following data members to the `Program` class.

  [!code-csharpTPLDataflow_BatchDatabase#2] [!code-vbTPLDataflow_BatchDatabase#2]
Add the `Employee` class to the `Program` class.
[!code-csharpTPLDataflow_BatchDatabase#3] [!code-vbTPLDataflow_BatchDatabase#3]
The `Employee` class contains three properties: `EmployeeID`, `LastName`, and `FirstName`. These properties correspond to the `Employee ID`, `Last Name`, and `First Name` columns in the `Employees` table in the Northwind database. For this demonstration, the `Employee` class also defines the `Random` method, which creates an `Employee` object that has random values for its properties.
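As a rough illustration of the shape described above, the following minimal sketch defines a class with the three properties and a `Random` factory method. The five-letter lowercase names, the `RandomName` helper, and the placeholder ID of `-1` (on the assumption that the database assigns the real identifier) are choices made for this sketch, not details taken from the walkthrough's snippet.

```csharp
using System;
using System.Linq;

// Sketch of an Employee type; not the walkthrough's actual snippet.
public class Employee
{
    public int EmployeeID { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }

    // Shared generator. System.Random is not thread-safe, so access is locked.
    private static readonly System.Random rand = new System.Random();

    // Creates an Employee with random names.
    // Assumption: -1 is a placeholder until the database assigns an ID.
    public static Employee Random()
    {
        return new Employee
        {
            EmployeeID = -1,
            LastName = RandomName(5),
            FirstName = RandomName(5)
        };
    }

    // Hypothetical helper: builds a lowercase string of the given length.
    private static string RandomName(int length)
    {
        lock (rand)
        {
            char[] chars = Enumerable.Range(0, length)
                .Select(_ => (char)('a' + rand.Next(26)))
                .ToArray();
            return new string(chars);
        }
    }
}
```

Random names make it unlikely that a later lookup by name finds a match, which is useful when you want to exercise the failure path of a read operation.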
Add the `InsertEmployees`, `GetEmployeeCount`, and `GetEmployeeID` methods to the `Program` class.
[!code-csharpTPLDataflow_BatchDatabase#4] [!code-vbTPLDataflow_BatchDatabase#4]
The `InsertEmployees` method adds new employee records to the database. The `GetEmployeeCount` method retrieves the number of entries in the `Employees` table. The `GetEmployeeID` method retrieves the identifier of the first employee that has the provided name. Each of these methods takes a connection string to the Northwind database and uses functionality in the `System.Data.SqlServerCe` namespace to communicate with the database.
Add the `AddEmployees` and `PostRandomEmployees` methods to the `Program` class.
[!code-csharpTPLDataflow_BatchDatabase#5] [!code-vbTPLDataflow_BatchDatabase#5]
The `AddEmployees` method adds random employee data to the database by using dataflow. It creates an `ActionBlock<TInput>` object that calls the `InsertEmployees` method to add an employee entry to the database. The `AddEmployees` method then calls the `PostRandomEmployees` method to post multiple `Employee` objects to the `ActionBlock<TInput>` object. Finally, the `AddEmployees` method waits for all insert operations to finish.
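The unbatched pattern can be sketched without a database. In the following minimal example, `InsertEmployees` is a hypothetical stand-in that only counts simulated round trips, and plain strings replace `Employee` objects. The class name `UnbatchedInsertSketch` and the `count` parameter are likewise assumptions made for this sketch.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class UnbatchedInsertSketch
{
    // Number of simulated database round trips.
    private static int roundTrips;

    // Hypothetical stand-in for a database insert: one call is one round trip.
    private static void InsertEmployees(string[] names)
    {
        Interlocked.Increment(ref roundTrips);
    }

    // Posts `count` items one at a time and returns the number of round trips.
    public static int AddEmployees(int count)
    {
        roundTrips = 0;

        // Each posted item triggers its own insert call.
        var insertEmployee = new ActionBlock<string>(
            name => InsertEmployees(new[] { name }));

        for (int i = 0; i < count; i++)
        {
            insertEmployee.Post("employee" + i);
        }

        // Signal that no more items will arrive, then wait for the work to drain.
        insertEmployee.Complete();
        insertEmployee.Completion.Wait();
        return roundTrips;
    }
}
```

Calling `UnbatchedInsertSketch.AddEmployees(25)` results in 25 simulated round trips, one per item, which is the cost that batching aims to reduce.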
Add the `AddEmployeesBatched` method to the `Program` class.
[!code-csharpTPLDataflow_BatchDatabase#6] [!code-vbTPLDataflow_BatchDatabase#6]
This method resembles `AddEmployees`, except that it also uses the `BatchBlock<T>` class to buffer multiple `Employee` objects before it sends those objects to the `ActionBlock<TInput>` object. Because the `BatchBlock<T>` class propagates multiple elements as a collection, the `ActionBlock<TInput>` object is modified to act on an array of `Employee` objects. As in the `AddEmployees` method, `AddEmployeesBatched` calls the `PostRandomEmployees` method to post multiple `Employee` objects; however, `AddEmployeesBatched` posts these objects to the `BatchBlock<T>` object. The `AddEmployeesBatched` method also waits for all insert operations to finish.
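The batched pipeline can be sketched in the same database-free way. Here `InsertEmployees` is again a hypothetical stand-in that counts simulated round trips, and strings replace `Employee` objects. This sketch propagates completion through `DataflowLinkOptions.PropagateCompletion`, which is one way to chain completion and not necessarily how the walkthrough's snippet does it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks.Dataflow;

public static class BatchedInsertSketch
{
    // Number of simulated database round trips.
    private static int roundTrips;

    // Hypothetical stand-in for a database insert: one call is one round trip,
    // regardless of how many items the array holds.
    private static void InsertEmployees(string[] names)
    {
        Interlocked.Increment(ref roundTrips);
    }

    // Posts `count` items through a BatchBlock and returns the number of round trips.
    public static int AddEmployeesBatched(int batchSize, int count)
    {
        roundTrips = 0;

        // Groups incoming items into arrays of up to `batchSize` elements.
        var batchEmployees = new BatchBlock<string>(batchSize);

        // Acts on a whole array per invocation.
        var insertEmployees = new ActionBlock<string[]>(
            batch => InsertEmployees(batch));

        batchEmployees.LinkTo(
            insertEmployees,
            new DataflowLinkOptions { PropagateCompletion = true });

        for (int i = 0; i < count; i++)
        {
            batchEmployees.Post("employee" + i);
        }

        // Completing the BatchBlock flushes any remaining partial batch.
        batchEmployees.Complete();
        insertEmployees.Completion.Wait();
        return roundTrips;
    }
}
```

With a batch size of 10, `BatchedInsertSketch.AddEmployeesBatched(10, 25)` performs 3 simulated round trips: two full batches and one final partial batch of 5 items.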
Add the `GetRandomEmployees` method to the `Program` class.
[!code-csharpTPLDataflow_BatchDatabase#7] [!code-vbTPLDataflow_BatchDatabase#7]
This method prints information about random employees to the console. It creates several random `Employee` objects and calls the `GetEmployeeID` method to retrieve the unique identifier for each object. Because the `GetEmployeeID` method throws an exception if there is no matching employee with the given first and last names, the `GetRandomEmployees` method uses the `BatchedJoinBlock<T1,T2>` class to store `Employee` objects for successful calls to `GetEmployeeID` and `System.Exception` objects for calls that fail. The `ActionBlock<TInput>` object in this example acts on a `Tuple<T1,T2>` object that holds a list of `Employee` objects and a list of `Exception` objects. The `BatchedJoinBlock<T1,T2>` object propagates this data when the sum of the received `Employee` and `Exception` object counts equals the batch size.
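The result-or-exception pattern described above can be sketched as follows. The `Lookup` method is a hypothetical stand-in for `GetEmployeeID` that fails for odd keys, and the `Run` method and its per-batch summary are assumptions made so that the behavior is easy to observe without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

public static class BatchedJoinSketch
{
    // Hypothetical lookup standing in for GetEmployeeID: fails for odd keys.
    private static int Lookup(int key)
    {
        if (key % 2 != 0)
        {
            throw new InvalidOperationException("No match for key " + key);
        }
        return key * 10;
    }

    // Returns one (successCount, failureCount) pair per propagated batch.
    public static List<Tuple<int, int>> Run(int batchSize, int count)
    {
        var summary = new List<Tuple<int, int>>();

        // Target1 collects results; Target2 collects exceptions.
        var join = new BatchedJoinBlock<int, Exception>(batchSize);

        // Receives both lists once their combined count reaches batchSize.
        // ActionBlock processes one message at a time by default, so the
        // unsynchronized List is safe here.
        var report = new ActionBlock<Tuple<IList<int>, IList<Exception>>>(
            data => summary.Add(Tuple.Create(data.Item1.Count, data.Item2.Count)));

        join.LinkTo(report, new DataflowLinkOptions { PropagateCompletion = true });

        for (int key = 0; key < count; key++)
        {
            try
            {
                join.Target1.Post(Lookup(key));
            }
            catch (Exception e)
            {
                join.Target2.Post(e);
            }
        }

        // Completing the join block flushes any remaining partial batch.
        join.Complete();
        report.Completion.Wait();
        return summary;
    }
}
```

Because every attempt posts to exactly one of the two targets, each batch accounts for `batchSize` attempts in total, however they split between successes and failures.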
The following example shows the complete code. The `Main` method compares the time required to perform batched database insertions with the time required to perform non-batched insertions. It also demonstrates the use of a buffered join to read employee data from the database and report any errors.
[!code-csharpTPLDataflow_BatchDatabase#100] [!code-vbTPLDataflow_BatchDatabase#100]