Follow these steps for Part 2:
- Open the provided Notebook link and allocate the cluster.
- Specify the input dataset by updating the `filepath` variable accordingly.
- Execute each command sequentially in the Notebook.
- View the query outputs within the Notebook as well as in the specified output path.
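
As a rough illustration of the `filepath` step, the sketch below shows one way to sanity-check the variable before running the remaining cells. It assumes the dataset sits on a filesystem path visible to the Notebook's driver; the example path and the `check_input_path` helper are hypothetical and not part of the provided Notebook. For object-store URIs, a local existence check like this would not apply.

```python
from pathlib import Path

# Hypothetical value: replace with the actual location of your input dataset.
filepath = "data/input.csv"


def check_input_path(path: str) -> Path:
    """Return the path as a Path object, or raise if it is empty or missing."""
    if not path or not path.strip():
        raise ValueError("filepath is empty; set it to the input dataset location")
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(f"input dataset not found: {p}")
    return p
```

Calling `check_input_path(filepath)` in the first cell would surface a wrong or missing path immediately, rather than partway through the sequence of commands.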