Store a table of common pattern value resolutions with Similarity Percentage (confidence) #70

gelliottrsg · 2021-03-09T16:33:23Z

Users love the pattern detection but would like to match the detected patterns against a dataset that stores the most common resolutions for each pattern, as a potential one-to-many name/value lookup. For instance, the patterns 9999999999, 999-999-9999, and +9 9999999999 would have entries in this new dataset that flag them as potential phone numbers. A sample of the output could look like the attached image.

dcamper · 2021-03-10T12:46:34Z

@gelliottrsg This is a good idea. A couple of questions:

1. I feel that the meaning behind a pattern is often specific to a use case. Very few patterns are globally true (latitude/longitude comes to mind as one example). Phone number patterns are not global, but there is a finite set of them, so they would be harder but doable. SSN patterns could easily be confused with other things. The point is: does it make sense for this functionality to have a built-in dictionary of pattern->meaning pairs, or should the caller be required to supply the dictionary?
2. How do you envision the 'similarity percent' and 'resolution ranking' values in your example being computed? The similarity value could be "number of records matching that pattern out of the total number of records", but that is not clear from the example.
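
To make the two questions concrete, here is a rough Python sketch of the caller-supplied variant. It is not how the library works, only an illustration. It assumes that digits map to 9, uppercase letters to A, lowercase letters to a, and all other characters are kept as-is. It also assumes the "records matching that pattern out of total records" reading of similarity floated above. The names `PATTERN_MEANINGS`, `to_pattern`, and `resolve_patterns` are hypothetical, as is the SSN entry in the dictionary.

```python
from collections import Counter

# Hypothetical caller-supplied dictionary: pattern -> candidate meanings (one to many).
PATTERN_MEANINGS = {
    "9999999999": ["phone number"],
    "999-999-9999": ["phone number"],
    "+9 9999999999": ["phone number"],
    "999-99-9999": ["SSN", "other identifier"],  # ambiguous on purpose
}


def to_pattern(value):
    """Assumed mapping: digit -> 9, uppercase -> A, lowercase -> a, else unchanged."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isupper():
            out.append("A")
        elif ch.islower():
            out.append("a")
        else:
            out.append(ch)
    return "".join(out)


def resolve_patterns(values, meanings):
    """Return one row per pattern, most frequent first.

    similarity_pct is the share of records with that pattern (an assumed
    definition); rank is the pattern's position when ordered by frequency.
    """
    counts = Counter(to_pattern(v) for v in values)
    total = sum(counts.values())
    rows = []
    for rank, (pattern, n) in enumerate(counts.most_common(), start=1):
        rows.append({
            "pattern": pattern,
            "similarity_pct": round(100.0 * n / total, 1),
            "rank": rank,
            "meanings": meanings.get(pattern, []),  # empty if the pattern is unknown
        })
    return rows


if __name__ == "__main__":
    sample = ["5551234567", "555-123-4567", "555-123-4568", "+1 5551234567", "abc"]
    for row in resolve_patterns(sample, PATTERN_MEANINGS):
        print(row)
```

Passing the dictionary as an argument keeps use-case-specific meanings out of the library. A built-in default table could still be offered as a starting point that the caller overrides.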
