# Zip Partitions

`zip_partitions` produces the same result as the `append_columns` operation but runs faster, because it assumes the Dataflows come from the same source and therefore have the same partitioning and number of records. When two or more DataPrep Dataflows are combined with `append_columns`, the data processing engine makes no assumptions about the partitioning of the input flows, so it reads all the data in each flow and repartitions the result. When you have split a Dataflow by column and then need to recombine the pieces, this extra work is an unnecessary performance cost.

The `zip_partitions` operation assumes that all Dataflows being combined have the same number of partitions and the same number of records in each partition. If the input Dataflows do not have aligned partitions and matching row counts in each partition, the operation fails.
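To make the alignment requirement concrete, here is a minimal pure-Python sketch of a partition-wise column zip. It is a conceptual model only, not the DataPrep engine's implementation: the function name `zip_partitions_sketch`, the representation of a flow as a list of partitions (each a list of row dicts), and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, object]
Partition = List[Row]


def zip_partitions_sketch(flows: List[List[Partition]]) -> List[Partition]:
    """Combine flows column-wise, partition by partition (conceptual model only)."""
    if not flows:
        return []

    # Every flow must have the same number of partitions.
    partition_count = len(flows[0])
    if any(len(flow) != partition_count for flow in flows):
        raise ValueError("flows do not have the same number of partitions")

    result: List[Partition] = []
    for index in range(partition_count):
        parts = [flow[index] for flow in flows]

        # Every flow must have the same number of rows in this partition.
        row_count = len(parts[0])
        if any(len(part) != row_count for part in parts):
            raise ValueError(f"partition {index} has mismatched row counts")

        zipped: Partition = []
        for rows in zip(*parts):
            merged: Row = {}
            for row in rows:
                # Simplification: a later flow overwrites a duplicate column name.
                merged.update(row)
            zipped.append(merged)
        result.append(zipped)
    return result


# Hypothetical flows split from one source: two partitions with aligned row counts.
flow_a = [[{"id": 1}, {"id": 2}], [{"id": 3}]]
flow_b = [[{"city": "x"}, {"city": "y"}], [{"city": "z"}]]
zipped = zip_partitions_sketch([flow_a, flow_b])
```

Because rows are matched purely by position within each partition, no flow has to be read in full or repartitioned before combining. The same property is why misaligned inputs cannot be handled and raise an error in this sketch.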

In [None]:
import azureml.dataprep as dprep

# Read the two files whose columns will be combined.
dflow_a = dprep.auto_read_file('../data/crime-winter.csv')
dflow_b = dprep.auto_read_file('../data/crime-winter-dup.csv')

In [None]:
df_a = dflow_a.to_pandas_dataframe()
df_a

In [None]:
df_b = dflow_b.to_pandas_dataframe()
df_b

In [None]:
# Append the columns of dflow_b to dflow_a, partition by partition.
zipped_dflow = dflow_a.zip_partitions([dflow_b])
zipped_df = zipped_dflow.to_pandas_dataframe()
zipped_df

In [None]:
dflow_c = dprep.auto_read_file('../data/crime-winter-dup2.csv')
df_c = dflow_c.to_pandas_dataframe()
df_c

In [None]:
# zip_partitions takes a list, so several Dataflows can be zipped in one call.
zipped_dflow = dflow_a.zip_partitions([dflow_b, dflow_c])
zipped_df = zipped_dflow.to_pandas_dataframe()
zipped_df