Wrangling Data Flow

Wrangling Data Flow in Azure Data Factory (ADF) lets you do code-free data preparation (wrangling) iteratively at cloud scale. It integrates with Power Query Online and makes Power Query M functions available for data wrangling at cloud scale via Spark execution. The sections below list the supported transformations and the M functions that correspond to them.

Column Management

Selection, Removal, Renaming, and Reordering (corresponding M functions: Table.SelectColumns, Table.RemoveColumns, Table.RenameColumns, Table.ReorderColumns, Table.PrefixColumns, Table.TransformColumnNames)
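As an illustration, the column management functions above can be chained in a single query. This is a minimal sketch: the sample table built with #table and all column names are hypothetical, and Text.Upper is not named in this document, so its support in Wrangling Data Flow is an assumption.

```powerquery
let
    // Hypothetical sample data; in a wrangling data flow the source would be an ADF dataset.
    Source = #table(
        {"id", "first_name", "last_name", "city"},
        {{1, "Ada", "Lovelace", "London"}, {2, "Alan", "Turing", "Wilmslow"}}
    ),
    // Keep three columns (drops last_name).
    Selected = Table.SelectColumns(Source, {"id", "first_name", "city"}),
    // Drop one more column by name.
    Removed = Table.RemoveColumns(Selected, {"city"}),
    // Rename using {old, new} pairs.
    Renamed = Table.RenameColumns(Removed, {{"first_name", "FirstName"}}),
    // Put FirstName before id.
    Reordered = Table.ReorderColumns(Renamed, {"FirstName", "id"}),
    // Apply a function to every column name (Text.Upper is an assumed-supported helper).
    Uppercased = Table.TransformColumnNames(Reordered, Text.Upper),
    // Prefix every column name, producing names of the form Customer.<name>.
    Prefixed = Table.PrefixColumns(Uppercased, "Customer")
in
    Prefixed
```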

Row Filters

(corresponding M function: Table.SelectRows)
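A row filter might look like the following sketch. The table and column names are hypothetical, and which comparison and logical operators are supported inside the condition is not stated here, so treat the predicate as an assumption.

```powerquery
let
    // Hypothetical sample data.
    Source = #table(
        {"order_id", "region", "amount"},
        {{1, "West", 250}, {2, "East", 80}, {3, "West", 40}}
    ),
    // Keep rows where both conditions hold; "each" defines a function over the current row.
    Filtered = Table.SelectRows(Source, each [region] = "West" and [amount] > 100)
in
    Filtered
```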

Adding and Transforming Columns

(corresponding M functions: Table.AddColumn, Table.TransformColumns, Table.ReplaceValue, Table.DuplicateColumn)
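The four functions above could be combined as in this sketch. The data and column names are hypothetical. Text.Upper and Replacer.ReplaceValue are standard M library members, but this document does not confirm that they are supported in Wrangling Data Flow. The optional type arguments are deliberately omitted.

```powerquery
let
    // Hypothetical sample data.
    Source = #table(
        {"product", "region", "quantity", "unit_price"},
        {{"widget", "West", 3, 2.5}, {"gadget", "N/A", 1, 10.0}}
    ),
    // Add a computed column from two existing columns.
    Added = Table.AddColumn(Source, "total", each [quantity] * [unit_price]),
    // Transform an existing column in place ({column, function} pairs).
    Transformed = Table.TransformColumns(Added, {{"product", Text.Upper}}),
    // Replace whole-cell values equal to "N/A" in the region column.
    Replaced = Table.ReplaceValue(Transformed, "N/A", "unknown", Replacer.ReplaceValue, {"region"}),
    // Copy a column under a new name.
    Duplicated = Table.DuplicateColumn(Replaced, "total", "total_copy")
in
    Duplicated
```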

Merging/Joining tables

  • Power Query will generate a nested join (Table.NestedJoin; users can also manually write Table.AddJoinColumn). Users must then expand the nested join column to flatten the result into a regular join (Table.ExpandTableColumn, which is supported only in this context).

  • The M function Table.Join can be manually written directly to avoid the need for an additional expansion step, but the user must ensure that there are no duplicate column names amongst the joined tables (because ADF does not support them).

  • Supported Join Kinds: Inner, LeftOuter, RightOuter, FullOuter

  • Both Value.Equals and Value.NullableEquals are supported as key equality comparers
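The two join styles described above can be sketched as follows. The tables and column names are hypothetical; the key columns are named differently (customer_id and cust_id) so that the Table.Join variant has no duplicate column names.

```powerquery
let
    // Hypothetical sample tables.
    Orders = #table(
        {"order_id", "customer_id", "amount"},
        {{1, 10, 25.0}, {2, 11, 40.0}, {3, 12, 15.0}}
    ),
    Customers = #table(
        {"cust_id", "name"},
        {{10, "Ada"}, {11, "Alan"}}
    ),
    // Style 1: nested join, which adds a table-valued column named "customer"...
    Nested = Table.NestedJoin(
        Orders, {"customer_id"}, Customers, {"cust_id"}, "customer", JoinKind.LeftOuter
    ),
    // ...followed by the required expansion into ordinary columns.
    Expanded = Table.ExpandTableColumn(Nested, "customer", {"name"}, {"customer_name"}),
    // Style 2: a hand-written flat join with no expansion step.
    // All column names across both tables must be distinct.
    Direct = Table.Join(Orders, {"customer_id"}, Customers, {"cust_id"}, JoinKind.LeftOuter)
in
    // Return Expanded, or swap in Direct to compare the two approaches.
    Expanded
```

Both variants keep every order because the join kind is LeftOuter. The Table.Join result additionally carries the cust_id column from the right-hand table.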

Group by

(corresponding M function: Table.Group)
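A grouping step might be written as in this sketch. The data and column names are hypothetical, and the aggregation helpers List.Sum and Table.RowCount are standard M functions that this document does not list, so their support in Wrangling Data Flow is an assumption.

```powerquery
let
    // Hypothetical sample data.
    Source = #table(
        {"region", "amount"},
        {{"West", 250}, {"East", 80}, {"West", 40}}
    ),
    // Group by region; each aggregation is a {new column name, function over the group} pair.
    Grouped = Table.Group(
        Source,
        {"region"},
        {
            {"total_amount", each List.Sum([amount])},
            {"order_count", each Table.RowCount(_)}
        }
    )
in
    Grouped
```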

Sorting

(corresponding M function: Table.Sort)
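A multi-column sort could look like this sketch (hypothetical data and column names). Each sort criterion is a {column, order} pair, applied in the order listed.

```powerquery
let
    // Hypothetical sample data.
    Source = #table(
        {"region", "amount"},
        {{"West", 40}, {"East", 80}, {"West", 250}}
    ),
    // Sort by region ascending, then by amount descending within each region.
    Sorted = Table.Sort(Source, {{"region", Order.Ascending}, {"amount", Order.Descending}})
in
    Sorted
```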

Notable Unsupported Functionality (not exhaustive; subject to change)

  • Merge columns (however, the same effect can often be achieved with Table.AddColumn)

  • Split column (can often be worked around as well, though the workaround is trickier)

  • Append queries

  • Changing Column Types

  • “Use first row as headers” or “Use headers as first row”
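The Table.AddColumn workaround for merging columns mentioned in the list above might look like the following sketch. The data and column names are hypothetical, and it assumes the text concatenation operator (&) is supported in Wrangling Data Flow.

```powerquery
let
    // Hypothetical sample data.
    Source = #table(
        {"first_name", "last_name"},
        {{"Ada", "Lovelace"}, {"Alan", "Turing"}}
    ),
    // Build the merged value as a new column instead of using the Merge Columns command.
    WithFullName = Table.AddColumn(Source, "full_name", each [first_name] & " " & [last_name]),
    // Drop the original columns to mimic the result of a merge.
    Merged = Table.RemoveColumns(WithFullName, {"first_name", "last_name"})
in
    Merged
```

Note that in M, concatenating a null with text yields null, so rows with a missing part would need separate handling.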
