Adjust for Data Conveyer 3.1.0.

mavidian committed Jun 16, 2019
1 parent 4772cef commit 24e404158fea60d335e912d7f964cdb22f1ca134
@@ -1,12 +1,12 @@
-<Project Sdk="Microsoft.NET.Sdk">
+<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>netcoreapp2.1</TargetFramework>
</PropertyGroup>

<ItemGroup>
-<PackageReference Include="DataConveyer" Version="3.0.1" />
+<PackageReference Include="DataConveyer" Version="3.1.0" />
</ItemGroup>

</Project>
@@ -32,7 +32,7 @@ internal FileProcessor(string inLocation, string outFile)
GlobalCacheElements = new string[] { "TokenSummary" }, //a single element - a dictionary - Dict<string,Tuple<int,int>>
InputDataKind = KindOfTextData.XML,
IntakeReaders = () => Directory.GetFiles(_inLocation, "*.xml").Select(f => File.OpenText(f)), //note that we're neglecting to dispose the stream readers here (not a production code)
-XmlJsonIntakeSettings = "RecordNode|Token,IncludeExplicitText|true",
+XmlJsonIntakeSettings = "RecordNode|Token,IncludeExplicitText|true,IncludeAttributes|true",
ExplicitTypeDefinitions = "__explicitText__|I", //in our case, explicit text in Token node contains integer value
ClusterMarker = (rec,pRec,n) => pRec == null ? true : rec.SourceNo != pRec.SourceNo, // each file (source) constitutes a cluster
MarkerStartsCluster = true, //predicate (marker) matches the first record in cluster
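
For orientation, here is a hypothetical input fragment of the kind these intake settings appear to target. The `Tokens` root element and the values are assumptions made for illustration; only the `Token` node, its `color` attribute and its integer text are implied by the diff. With `IncludeAttributes|true`, each `Token` would presumably yield a record with an `@color` key from the attribute and an `__explicitText__` key from the node text.

```xml
<!-- Hypothetical sample; not copied from the repository's Data folder -->
<Tokens>
  <Token color="red">5</Token>
  <Token color="blue">2</Token>
  <Token color="red">7</Token>
</Tokens>
```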
@@ -103,7 +103,7 @@ private IEnumerable<ICluster> CumulateTokenData(ICluster cluster)
//Regular cluster - cumulate data in the global cache
foreach (var rec in cluster.Records)
{
-var color = (string)rec["color"];
+var color = (string)rec["@color"]; //keys originating from attributes are prepended with @ (by default)
var value = (int)rec["__explicitText__"];

tokenSummary.AddOrUpdate(color, (1, value), (c, t) => (t.count + 1, t.total + value));
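
The accumulation step in this hunk can be tried in isolation. The sketch below is not code from the repository: it assumes `tokenSummary` is a `ConcurrentDictionary<string, (int count, int total)>` (the diff only shows the `AddOrUpdate` call), and the `TokenSummaryDemo` class, the `Accumulate` method and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class TokenSummaryDemo
{
    // Hypothetical helper: folds (color, value) pairs into a per-color (count, total) summary.
    public static ConcurrentDictionary<string, (int count, int total)> Accumulate(
        IEnumerable<(string color, int value)> tokens)
    {
        var tokenSummary = new ConcurrentDictionary<string, (int count, int total)>();
        foreach (var (color, value) in tokens)
        {
            // First occurrence of a color stores (1, value); later ones bump count and add to total.
            tokenSummary.AddOrUpdate(color, (1, value), (c, t) => (t.count + 1, t.total + value));
        }
        return tokenSummary;
    }

    public static void Main()
    {
        var summary = Accumulate(new[] { ("red", 5), ("blue", 2), ("red", 7) });
        foreach (var kvp in summary)
        {
            Console.WriteLine($"{kvp.Key}: count={kvp.Value.count}, total={kvp.Value.total}");
        }
    }
}
```

With the sample values above, `red` ends up with a count of 2 and a total of 12, and `blue` with a count of 1 and a total of 2. The enumeration order of a `ConcurrentDictionary` is not guaranteed.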
@@ -1,7 +1,7 @@
# DataConveyer_AggregateTokens

DataConveyer_AggregateTokens is a console application to demonstrate how Data Conveyer can be
-used to accumulate data extracted from a sequence of input files.
+used to accumulate data extracted from a sequence of input files.

There are 10 sample XML files located in ...Data folder. Data Conveyer will process all these
files and identify tokens contained in them (in this example, tokens are just Token nodes). The
