This workshop has been prepared in collaboration between Cuusoo and Databricks.
The workshop aims to provide participants with:
- A deep understanding of the Delta Lake file format and its role in the data lakehouse architecture.
- The ability to ingest and transform data using the medallion methodology (bronze, silver, gold).
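As a rough illustration of the medallion idea, the sketch below walks a handful of records through bronze, silver and gold stages in plain Python. It is not taken from the workshop notebooks, which use Databricks and Delta tables; the sample records and the function names (`to_bronze`, `to_silver`, `to_gold`) are hypothetical and exist only to show what each layer is responsible for.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from a source system.
raw_events = [
    {"order_id": "1", "country": "AU", "amount": "10.50"},
    {"order_id": "2", "country": "au", "amount": "4.50"},
    {"order_id": "2", "country": "au", "amount": "4.50"},          # duplicate
    {"order_id": "3", "country": "NZ", "amount": "not-a-number"},  # bad value
    {"order_id": "4", "country": "NZ", "amount": "20.00"},
]


def to_bronze(records):
    """Bronze: keep the data as ingested, with no cleaning applied."""
    return [dict(record) for record in records]


def to_silver(bronze):
    """Silver: cast types, normalise values, drop bad rows and duplicates."""
    seen_ids = set()
    silver = []
    for row in bronze:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if row["order_id"] in seen_ids:
            continue  # skip repeated order ids
        seen_ids.add(row["order_id"])
        silver.append(
            {
                "order_id": int(row["order_id"]),
                "country": row["country"].upper(),
                "amount": amount,
            }
        )
    return silver


def to_gold(silver):
    """Gold: aggregate the cleaned rows into a reporting-ready summary."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["country"]] += row["amount"]
    return dict(totals)


bronze = to_bronze(raw_events)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'AU': 15.0, 'NZ': 20.0}
```

In a Databricks setting each stage would typically be written to its own Delta table rather than held in a Python list, but the division of responsibilities between the layers is the same.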
📹 Recording: https://www.youtube.com/watch?v=mrHfdeH6az0
Within each topic folder, there will be sub-folders for:
- unsolved: contains the unsolved starter code. Students should refer to the Python comments for the activity instructions.
- solved: contains the solutions.
Clone this git repository into your Databricks Repos by following the steps below (taken from link):
- Click Repos in the sidebar.
- Click Add Repo.
- In the Add Repo dialog, click Clone remote Git repo and enter the repository URL. Select your Git provider from the drop-down menu, optionally change the name to use for the Databricks repo, and click Create. The contents of the remote repository are cloned to the Databricks repo.
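For anyone who prefers to script the steps above rather than click through the UI, here is a minimal sketch that builds the equivalent request for the Databricks Repos REST API (`POST /api/2.0/repos`). The workspace host, token, repository URL and workspace path are all placeholders, and `build_clone_request` is a hypothetical helper name. The request is only constructed here, not sent; check the field names and provider values against the API documentation for your workspace before relying on it.

```python
import json
import urllib.request


def build_clone_request(host, token, repo_url, provider, workspace_path):
    """Build (but do not send) a request that asks Databricks to clone a repo."""
    payload = {
        "url": repo_url,         # remote Git repository URL
        "provider": provider,    # Git provider, e.g. "gitHub"
        "path": workspace_path,  # where the repo should appear under /Repos
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/repos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are placeholders, not the real workshop settings.
request = build_clone_request(
    host="https://example.cloud.databricks.com",
    token="<personal-access-token>",
    repo_url="https://github.com/example-org/example-workshop.git",
    provider="gitHub",
    workspace_path="/Repos/someone@example.com/example-workshop",
)

# To actually send the request, uncomment the next line:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from the send step makes it easy to inspect the payload first, which is useful when the token or workspace path is still being worked out.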