TX Text Control Form Field Mapper

This sample opens a TX Text Control template document, reads form fields from the document, compares their names with a JSON data source, and renames matching form fields through FormFieldCollection.

When the app is started without parameters, it uses these default paths for the template, the JSON data source, and the output document:

data\forms.tx
data\sample-data.json
data\mapped-forms.tx

You can also pass explicit paths:

dotnet run -- template.tx data.json mapped-template.tx 0.72

The optional last value is the minimum match score. The default is 0.72.
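
The argument handling described above could be sketched as follows. MapperOptionsSketch is a hypothetical name used only for illustration; the sample's own AppOptions class may be structured differently. Falling back to the default for each missing argument individually is an assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical options holder; not the sample's actual AppOptions class.
public sealed record MapperOptionsSketch(
    string TemplatePath, string JsonPath, string OutputPath, double MinimumScore)
{
    public static MapperOptionsSketch Parse(string[] args)
    {
        // Assumption: every missing argument falls back to its documented default.
        string template = args.Length > 0 ? args[0] : @"data\forms.tx";
        string json = args.Length > 1 ? args[1] : @"data\sample-data.json";
        string output = args.Length > 2 ? args[2] : @"data\mapped-forms.tx";

        double minimumScore = 0.72;
        // Parse with the invariant culture so "0.72" reads the same on every machine.
        if (args.Length > 3 &&
            !double.TryParse(args[3], NumberStyles.Float, CultureInfo.InvariantCulture, out minimumScore))
        {
            throw new ArgumentException($"Invalid minimum match score: '{args[3]}'.");
        }

        return new MapperOptionsSketch(template, json, output, minimumScore);
    }
}
```

Calling MapperOptionsSketch.Parse(args) from Program.cs would then yield either the defaults or the explicit values from the command line.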

Matching Technique

The mapper does not require exact field names. It uses a small fuzzy-matching pipeline so names like these can still match:

company_name  <->  companyname
first_name    <->  firstName
street        <->  street_address

1. JSON Field Extraction

JsonFieldExtractor parses the JSON document with System.Text.Json.

For every object property, it adds:

the leaf property name, such as company_name
the nested path, such as customer.company_name

This allows templates to match either short names or more explicit nested names.
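
A minimal sketch of this extraction step, using only System.Text.Json. JsonNameSketch is a hypothetical name, not the sample's JsonFieldExtractor, and the handling of arrays (items share the parent path, without an index segment) is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class JsonNameSketch
{
    // Collects the leaf name and the dotted path of every object property.
    public static SortedSet<string> ExtractNames(string json)
    {
        var names = new SortedSet<string>(StringComparer.Ordinal);
        using JsonDocument document = JsonDocument.Parse(json);
        Visit(document.RootElement, "", names);
        return names;
    }

    private static void Visit(JsonElement element, string prefix, ISet<string> names)
    {
        if (element.ValueKind == JsonValueKind.Object)
        {
            foreach (JsonProperty property in element.EnumerateObject())
            {
                string path = prefix.Length == 0 ? property.Name : prefix + "." + property.Name;
                names.Add(property.Name); // leaf name, such as company_name
                names.Add(path);          // nested path, such as customer.company_name
                Visit(property.Value, path, names);
            }
        }
        else if (element.ValueKind == JsonValueKind.Array)
        {
            // Assumption: array items contribute names under the parent's path.
            foreach (JsonElement item in element.EnumerateArray())
            {
                Visit(item, prefix, names);
            }
        }
    }
}
```

For the input {"customer":{"company_name":"ACME"}}, this sketch collects customer, company_name, and customer.company_name.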

2. Tokenization

FieldNameTokenizer converts a field name into lowercase tokens.

It handles:

underscores: company_name
camel case: firstName
numbers: street1
punctuation or separators

Examples:

company_name     -> company, name
companyname      -> companyname
firstName        -> first, name
street_address   -> street, address

The tokens are also joined into a normalized string:

company_name -> companyname
firstName    -> firstname
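
One way to get this behavior is a single regular expression that picks out word and digit runs. The sketch below is an assumption about how such a tokenizer could work, not the code in FieldNameTokenizer.cs; TokenizerSketch is a hypothetical name, and the pattern only covers ASCII letters and digits.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenizerSketch
{
    // Acronym runs (HTTP), capitalized or lowercase words (Name, first), digit runs (1).
    // Everything else, such as underscores and punctuation, acts as a separator.
    private static readonly Regex TokenPattern =
        new Regex("[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", RegexOptions.Compiled);

    public static List<string> Tokenize(string fieldName)
    {
        var tokens = new List<string>();
        foreach (Match match in TokenPattern.Matches(fieldName))
        {
            tokens.Add(match.Value.ToLowerInvariant());
        }
        return tokens;
    }

    // Joins the tokens into the normalized string used for comparisons.
    public static string Normalize(string fieldName) => string.Concat(Tokenize(fieldName));
}
```

With this pattern, street1 splits into street and 1, which is an assumption about how numbers are separated.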

3. Exact Normalized Matching

If two field names are different but their normalized form is the same, they receive a very high score.

Example:

company_name <-> companyname

Both normalize to:

companyname

Score: 0.99
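
The check can be sketched as below. ExactMatchSketch is a hypothetical name; the normalization here is a simplified stand-in (keep letters and digits, lowercase them), and scoring identical names as 1.0 is an assumption, since only the 0.99 case is documented.

```csharp
using System;
using System.Linq;

public static class ExactMatchSketch
{
    public const double NormalizedMatchScore = 0.99;

    // Simplified normalization: drop separators, lowercase the rest.
    public static string Normalize(string name) =>
        new string(name.Where(c => char.IsLetterOrDigit(c))
                       .Select(c => char.ToLowerInvariant(c))
                       .ToArray());

    public static double Score(string formFieldName, string jsonName)
    {
        if (string.Equals(formFieldName, jsonName, StringComparison.Ordinal))
        {
            return 1.0; // assumption: identical names get the maximum score
        }

        // Different names with the same normalized form get the documented 0.99.
        return Normalize(formFieldName) == Normalize(jsonName) ? NormalizedMatchScore : 0.0;
    }
}
```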

4. Token Set Matching

The mapper compares token sets using a Jaccard-style score:

matching tokens / all unique tokens

This helps when the same words appear in a different order.

Example:

name_company <-> company_name

Both contain:

company, name
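
The ratio above is the standard Jaccard index of the two token sets. A short sketch, with JaccardSketch as a hypothetical name and a score of 0 for two empty sets as an assumption:

```csharp
using System;
using System.Collections.Generic;

public static class JaccardSketch
{
    // Shared tokens divided by all unique tokens across both names.
    public static double Score(IEnumerable<string> left, IEnumerable<string> right)
    {
        var a = new HashSet<string>(left, StringComparer.Ordinal);
        var b = new HashSet<string>(right, StringComparer.Ordinal);
        if (a.Count == 0 && b.Count == 0)
        {
            return 0.0; // assumption: two empty names are not treated as a match
        }

        var union = new HashSet<string>(a, StringComparer.Ordinal);
        union.UnionWith(b);
        a.IntersectWith(b); // a now holds only the shared tokens
        return (double)a.Count / union.Count;
    }
}
```

Because sets ignore order, name_company and company_name score 1.0, while street and street_address share one of two unique tokens and score 0.5.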

5. Token Containment Matching

The mapper also checks whether all tokens from the shorter name are contained in the longer name.

This is useful for cases like:

street <-> street_address

The full token set is not identical, but street is completely contained in street_address, so the mapper treats it as a confident match.

Score: 0.86
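
A sketch of the containment check under the assumption that it works on token sets and returns the fixed score 0.86 whenever the smaller set is a subset of the larger one. ContainmentSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class ContainmentSketch
{
    public const double ContainmentScore = 0.86;

    public static double Score(IEnumerable<string> left, IEnumerable<string> right)
    {
        var a = new HashSet<string>(left, StringComparer.Ordinal);
        var b = new HashSet<string>(right, StringComparer.Ordinal);
        if (a.Count == 0 || b.Count == 0)
        {
            return 0.0; // an empty name cannot be meaningfully contained
        }

        // Treat the set with fewer tokens as the shorter name.
        HashSet<string> shorter = a.Count <= b.Count ? a : b;
        HashSet<string> longer = ReferenceEquals(shorter, a) ? b : a;
        return shorter.IsSubsetOf(longer) ? ContainmentScore : 0.0;
    }
}
```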

6. Levenshtein Distance

For typo-like differences, the mapper computes a Levenshtein similarity score.

This compares the normalized strings and gives partial credit for small edit distances.

Example:

compnyname <-> companyname

This helps when names are close but not structurally identical.
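
The edit distance itself is the classic dynamic-programming algorithm. How the sample converts the distance into a similarity is not stated, so the formula below (1 minus distance divided by the longer length) is an assumption, and LevenshteinSketch is a hypothetical name.

```csharp
using System;

public static class LevenshteinSketch
{
    // Minimum number of single-character insertions, deletions, and substitutions.
    public static int Distance(string s, string t)
    {
        var previous = new int[t.Length + 1];
        var current = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++)
        {
            previous[j] = j;
        }

        for (int i = 1; i <= s.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            (previous, current) = (current, previous); // reuse the two rows
        }
        return previous[t.Length];
    }

    // Assumed normalization: 1.0 for equal strings, falling toward 0.0 as edits grow.
    public static double Similarity(string s, string t)
    {
        int longest = Math.Max(s.Length, t.Length);
        return longest == 0 ? 1.0 : 1.0 - (double)Distance(s, t) / longest;
    }
}
```

For compnyname and companyname the distance is 1 (one missing letter), so under this normalization the similarity is 1 - 1/11, or about 0.91.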

7. Best Score Wins

For each form field, the mapper calculates all matching scores against all JSON fields and uses the highest score.

The final score is the best of:

exact normalized match
token set match
token containment match
Levenshtein similarity
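
Combining the four scores is then a plain maximum. BestScoreSketch is a hypothetical name, and the parameters stand for whatever the individual techniques return for one pair of names.

```csharp
using System;

public static class BestScoreSketch
{
    // The final score for one form field / JSON field pair is the highest component score.
    public static double Combine(
        double exactNormalized, double tokenSet, double tokenContainment, double levenshtein)
    {
        return Math.Max(
            Math.Max(exactNormalized, tokenSet),
            Math.Max(tokenContainment, levenshtein));
    }
}
```

For street and street_address, the exact normalized score is 0, the token set score is 0.5, and containment gives 0.86. With a hypothetical Levenshtein similarity of 0.46, Combine(0.0, 0.5, 0.86, 0.46) returns 0.86, so containment decides the match.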

8. Threshold and Ambiguity

A match is accepted only if it is above the minimum score.

Default threshold:

0.72

The mapper also skips ambiguous matches. If the best match and second-best match are too close, the field is not renamed. This avoids unsafe mappings when two JSON fields look nearly equally likely.
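
The decision logic could look like the sketch below. DecisionSketch and the ambiguityMargin parameter are hypothetical: the README does not say how close "too close" is, so the margin is left to the caller. Accepting a score equal to the minimum is also an assumption.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DecisionSketch
{
    // Returns true and the chosen JSON name only for a confident, unambiguous match.
    public static bool TryChoose(
        IEnumerable<(string JsonName, double Score)> candidates,
        double minimumScore,
        double ambiguityMargin,
        out string jsonName)
    {
        jsonName = "";
        var ranked = candidates.OrderByDescending(c => c.Score).Take(2).ToList();

        // Reject when there is no candidate or the best one is below the threshold.
        if (ranked.Count == 0 || ranked[0].Score < minimumScore)
        {
            return false;
        }

        // Reject when the runner-up is within the margin of the best candidate.
        if (ranked.Count > 1 && ranked[0].Score - ranked[1].Score < ambiguityMargin)
        {
            return false;
        }

        jsonName = ranked[0].JsonName;
        return true;
    }
}
```

With a minimum of 0.72 and an illustrative margin of 0.03, candidates scoring 0.86 and 0.85 would be skipped as ambiguous, while 0.86 against 0.20 would be accepted.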

9. Duplicate Form Fields

JSON fields are not consumed after the first match.

This means multiple form fields in the document can map to the same JSON field:

street_address -> street
street1        -> street

This is important for templates where the same data appears in several places.
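
The effect of not consuming JSON fields can be shown with a small mapping loop. DuplicateMappingSketch is a hypothetical name, the scoring function is passed in by the caller, and the ambiguity check is left out to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateMappingSketch
{
    // Maps each form field to its best JSON name; the JSON list is never modified,
    // so several form fields can resolve to the same JSON field.
    public static Dictionary<string, string> MapAll(
        IEnumerable<string> formFieldNames,
        IReadOnlyList<string> jsonNames,
        Func<string, string, double> score,
        double minimumScore)
    {
        var mapping = new Dictionary<string, string>(StringComparer.Ordinal);
        foreach (string fieldName in formFieldNames)
        {
            int bestIndex = -1;
            double bestScore = 0.0;
            for (int i = 0; i < jsonNames.Count; i++)
            {
                double current = score(fieldName, jsonNames[i]);
                if (current > bestScore)
                {
                    bestScore = current;
                    bestIndex = i;
                }
            }

            if (bestIndex >= 0 && bestScore >= minimumScore)
            {
                mapping[fieldName] = jsonNames[bestIndex];
            }
        }
        return mapping;
    }
}
```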

TX Text Control Integration

The document is loaded with ServerTextControl:

textControl.Load(templatePath, StreamType.InternalUnicodeFormat);

The mapper receives the document form fields:

mapper.RenameFormFields(textControl.FormFields);

Each accepted match updates the field name:

field.Name = decision.JsonName;

The changed template is saved as a TX document:

textControl.Save(outputPath, StreamType.InternalUnicodeFormat);
