v0.4

genegc released this 23 Aug 22:59
· 2410 commits to main since this release
675f035

Release notes

  • new data lake structure (stage1np, stage2np, stage2p, stage3np, stage3p)
  • packaged the PySpark code in the framework and modules as Python classes in notebooks, so that it can be imported
  • added support for Delta Lake
  • added pseudonymization logic based on a specified schema
  • added logging to Application Insights
  • modified the GitHub repo structure so that module assets can be deployed directly from the framework GitHub repo into the customer's implementation repo (making it easy to pull in pipeline assets)
  • improved setup script with logging to a timestamped log file
  • updated implementation of the ContosoISD package to work with the new architecture
  • updated docs
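
To illustrate what "pseudonymization logic based on a specified schema" can look like, here is a minimal sketch in plain Python. It is not the OEA implementation: the schema format, the operation names (`hash`, `mask`, `keep`), the column names, and the `pseudonymize` function are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical schema: (column name, operation). These operation names are
# illustrative assumptions, not taken from the OEA framework.
SCHEMA = [
    ("student_id", "hash"),
    ("student_name", "mask"),
    ("grade_level", "keep"),
]


def pseudonymize(row, schema, salt):
    """Return a copy of row with each column transformed as the schema says."""
    out = {}
    for column, operation in schema:
        value = row[column]
        if operation == "hash":
            # Keyed SHA-256: the same input and salt always give the same
            # token, so hashed columns can still be used as join keys.
            digest = hmac.new(salt, str(value).encode("utf-8"), hashlib.sha256)
            out[column] = digest.hexdigest()
        elif operation == "mask":
            # Replace the value entirely; nothing about it is retained.
            out[column] = "*"
        elif operation == "keep":
            out[column] = value
        else:
            raise ValueError(f"unknown operation: {operation}")
    return out


row = {"student_id": 1001, "student_name": "Ada", "grade_level": 7}
result = pseudonymize(row, SCHEMA, salt=b"example-salt")
```

Driving the transformation from a schema keeps the decision about which columns are sensitive in one place, separate from the processing code.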

Migration notes
The three big changes in this release that require consideration when migrating from a previous version of OEA are:

  1. Modifications to the data lake structure: the names of the containers in the data lake have changed, and additional containers have been added.
  2. Use of Delta Lake, which requires creating an updated Spark pool that runs Spark v3.1 (which comes with Delta Lake v1.0).
  3. Use of OEA_py as a class, in order to simplify processing of data.
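
When updating existing code for the first point, it may help to validate container names in one place. The sketch below is only an assumption about how that could be done: the `container_url` helper and the storage account name are hypothetical, while the container names come from the release notes above and the `abfss://` form is the standard URI scheme for Azure Data Lake Storage Gen2.

```python
# Container names introduced in this release (from the release notes).
CONTAINERS = ("stage1np", "stage2np", "stage2p", "stage3np", "stage3p")


def container_url(storage_account, container, path=""):
    """Build an abfss:// URL for one of the new data lake containers.

    Hypothetical helper for illustration; it is not part of the OEA framework.
    """
    if container not in CONTAINERS:
        # Catches leftover references to container names from older versions.
        raise ValueError(f"unknown container: {container}")
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    return f"{base}/{path.lstrip('/')}" if path else base


# "examplestorage" is a placeholder storage account name.
url = container_url("examplestorage", "stage2p", "contoso_isd/students")
```

Routing every path through a check like this makes references to the old container names fail loudly during migration instead of silently pointing at a missing location.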