use of base_columns should be allowed as alias for base_column with multiple base_columns #47

ronanstokes-db · 2021-07-15T22:09:20Z

Code to support this is already present, but it does not work for the following snippet:

import dbldatagen as dg
from pyspark.sql.types import StructType, StructField, StringType

shuffle_partitions_requested = 8
partitions_requested = 8
data_rows = 10000000

dataspec = (dg.DataGenerator(spark, rows=10000000, partitions=8)
            .withColumn("name", percent_nulls=1.0, template=r'\w \w|\w a. \w')
            .withColumn("payment_instrument_type", values=['paypal', 'visa', 'mastercard', 'amex'], random=True)
            .withColumn("payment_instrument", minValue=1000000, maxValue=10000000, template="dddd dddddd ddddd")
            .withColumn("email", template=r'\w.\w@\w.com')
            .withColumn("md5_payment_instrument",
                        expr="md5(concat(payment_instrument_type, ':', payment_instrument))",
                        base_columns=['payment_instrument_type', 'payment_instrument'])
            )
df1 = dataspec.build()

df1.display()

ronanstokes-db added the bug label Jul 15, 2021

ronanstokes-db self-assigned this Jul 15, 2021

ronanstokes-db added the wontfix and enhancement labels and removed the bug and wontfix labels Jul 27, 2021

ronanstokes-db added this to the initial-release milestone Jul 27, 2021

ronanstokes-db linked a pull request Jul 27, 2021 that will close this issue

added baseColumn and base_columns as alias for baseColumn #56

Merged

ronanstokes-db closed this as completed in #56 Jul 28, 2021
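
For illustration, here is a minimal sketch of how alternative spellings of a keyword option could be normalized to one list of column names. The helper name `resolve_base_columns` and the set of accepted spellings are assumptions made for this example; they are not taken from dbldatagen's source or from the change in #56.

```python
from typing import List, Optional

# Hypothetical spellings treated as equivalent (an assumption, not dbldatagen's actual list).
_ALIASES = ("base_column", "base_columns", "baseColumn", "baseColumns")


def resolve_base_columns(**options) -> Optional[List[str]]:
    """Return the base columns named by whichever alias was supplied, as a list."""
    supplied = [name for name in _ALIASES if options.get(name) is not None]
    if not supplied:
        return None  # no base column option was given
    if len(supplied) > 1:
        # Refuse ambiguous input rather than silently picking one alias.
        raise ValueError(f"conflicting options: {', '.join(supplied)}")
    value = options[supplied[0]]
    # A single column name becomes a one-element list; any other iterable is copied.
    if isinstance(value, str):
        return [value]
    return list(value)
```

Under these assumptions, passing `['payment_instrument_type', 'payment_instrument']` as either `base_column` or `base_columns` yields the same list, which is the behavior the issue title asks for.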
