Skip to content
This repository has been archived by the owner on Feb 5, 2024. It is now read-only.

Latest commit

 

History

History
103 lines (97 loc) · 6.66 KB

Table.FuzzyNestedJoin.md

File metadata and controls

103 lines (97 loc) · 6.66 KB

Table.FuzzyNestedJoin

Performs a fuzzy join between tables on supplied columns and produces the join result in a new column.

function (table1 as table, optional key1 as nullable any, table2 as table, optional key2 as nullable any, newColumnName as text, optional joinKind as nullable any, optional joinOptions as nullable record) as table

Description

Joins the rows of table1 with the rows of table2 based on a fuzzy matching of the values of the key columns selected by key1 (for table1) and key2 (for table2). The results are returned in a new column named newColumnName.

Fuzzy matching is a comparison based on similarity of text rather than equality of text.

The optional joinKind specifies the kind of join to perform. By default, a left outer join is performed if a joinKind is not specified. Options include:

  • JoinKind.Inner
  • JoinKind.LeftOuter
  • JoinKind.RightOuter
  • JoinKind.FullOuter
  • JoinKind.LeftAnti
  • JoinKind.RightAnti
  • An optional set of joinOptions may be included to specify how to compare the key columns. Options include:

    • ConcurrentRequests
    • Culture
    • IgnoreCase
    • IgnoreSpace
    • NumberOfMatches
    • Threshold
    • TransformationTable
    • The following table provides more details about the advanced options.

      Advanced Option Default Allowed Description
      ConcurrentRequests 1 Between 1 and 8 The ConcurrentRequests option supports parallelizing the join operation by specifying the number of parallel threads to to use.
      Culture Culture neutral A valid culture name The Culture option allows matching records based on culture-specific rules.
      For example a Culture option of 'ja-JP' matches records based on the Japanese language.
      IgnoreCase true true or false The IgnoreCase option allows matching records irrespective of the case of the text.
      For example, 'Grapes' (sentence case) is matched with 'grapes' (lower case) if the IgnoreCase option is set to true.
      IgnoreSpace true true or false The IgnoreSpace option allows matching records ignoring the spaces in the string.
      For example, 'Grapes' is matched with 'Grape s' (space before 's') if the IgnoreSpace option is set to true.
      NumberOfMatches 2147483647 Between 0 and 2147483647 The NumberOfMatches option specifies the maximum number of matching rows that can be returned.
      Threshold 0.80 Between 0.00 and 1.00 The similarity Threshold option provides the ability to match records above a given similarity score. A threshold of 1.00 is the same as specifying an exact match criteria.
      For example, 'Grapes' matches with 'Graes' (missing 'p') only if the thresold is set to less than 0.90.
      TransformationTable A valid table with at least 2 columns named 'From' and 'To'. The TransformationTable option allows matching records based on custom value mappings.
      For example, 'Grapes' are matched with 'Raisins' if a transformation table is provided with the 'From' column containing 'Grapes' and the 'To' column containing 'Raisins'.

      # Category Table.Transformation # Examples Left inner fuzzy join of two tables based on [FirstName] ``` Table.FuzzyNestedJoin(Table.FromRecords({ [CustomerID = 1, FirstName1 = "Bob", Phone = "555-1234"], [CustomerID = 2, FirstName1 = "Jim", Phone = "555-2345"], [CustomerID = 3, FirstName1 = "Paul", Phone = "555-3456"], [CustomerID = 4, FirstName1 = "Zoe", Phone = "555-4567"], [CustomerID = 5, FirstName1 = "Robert", Phone = "555-4567"] }), "FirstName1", Table.FromRecords({ [CustomerStateID = 1, FirstName2 = "Bob", State = "TX"], [CustomerStateID = 2, FirstName2 = "bob", State = "WA"], [CustomerStateID = 3, FirstName2 = "BOb", State = "PA"], [CustomerStateID = 4, FirstName2 = "bOB", State = "CA"], [CustomerStateID = 5, FirstName2 = "Pul", State = "TX"], [CustomerStateID = 6, FirstName2 = "Paul", State = "WA"], [CustomerStateID = 7, FirstName2 = "paul", State = "PA"], [CustomerStateID = 8, FirstName2 = "Peter", State = "CA"], [CustomerStateID = 9, FirstName2 = "Adam", State = "TX"], [CustomerStateID = 10, FirstName2 = "Zoe", State = "WA"], [CustomerStateID = 11, FirstName2 = "Zo", State = "PA"], [CustomerStateID = 12, FirstName2 = "Zaza", State = "CA"] }), "FirstName2", "NewColumnTable2", JoinKind.LeftOuter, [IgnoreCase = true, IgnoreSpace = false] ) ``` > Table.FromRecords({ [ CustomerID = 1, FirstName1 = "Bob", Phone = "555-1234", NewColumnTable2 =Table.FromRecords({ [ CustomerStateID = 1, FirstName2 = "Bob", State = "TX" ], [ CustomerStateID = 2, FirstName2 = "bob", State = "WA" ], [ CustomerStateID = 3, FirstName2 = "BOb", State = "PA" ], [ CustomerStateID = 4, FirstName2 = "bOB", State = "CA" ] }, { "CustomerStateID", "FirstName2", "State" }) ], [ CustomerID = 2, FirstName1 = "Jim", Phone = "555-2345", NewColumnTable2 =Table.FromRecords({}, {}) ], [ CustomerID = 3, FirstName1 = "Paul", Phone = "555-3456", NewColumnTable2 =Table.FromRecords({ [ CustomerStateID = 6, FirstName2 = "Paul", State = "WA" ], [ CustomerStateID = 7, FirstName2 = "paul", State = "PA" ], [ CustomerStateID = 5, FirstName2 = "Pul", State = "TX" ] }, { "CustomerStateID", "FirstName2", "State" }) ], [ CustomerID = 4, FirstName1 = "Zoe", Phone = "555-4567", NewColumnTable2 =Table.FromRecords({ [ CustomerStateID = 10, FirstName2 = "Zoe", State = "WA" ], [ CustomerStateID = 11, FirstName2 = "Zo", State = "PA" ] }, { "CustomerStateID", "FirstName2", "State" }) ], [ CustomerID = 5, FirstName1 = "Robert", Phone = "555-4567", NewColumnTable2 =Table.FromRecords({}, {}) ] }, { "CustomerID", "FirstName1", "Phone", "NewColumnTable2" })