### Setting widgets

Run this cell on its own first to create the widget UI. Fill in every widget before running the remaining cells.

In [0]:
dbutils.widgets.text('kaggle_username','username','Kaggle username')
dbutils.widgets.text('kaggle_apikey','API token','Kaggle key')
dbutils.widgets.text('url','https://www.kaggle.com/datasets/user/dataset','Dataset url')
dbutils.widgets.text('wkdir','Directory name','Working directory')

### Getting widget values and storing them into variables

In [0]:
import os

handle = dbutils.widgets.get('url').replace("https://www.kaggle.com/datasets/", "")
wkdir = dbutils.widgets.get('wkdir')

os.environ['handle'] = handle
os.environ['ds'] = handle.split('/')[1]
os.environ['ddir'] = wkdir
os.environ['kaggle_user'] = dbutils.widgets.get('kaggle_username')
os.environ['kaggle_key'] = dbutils.widgets.get('kaggle_apikey')
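
The `replace` call above assumes the URL is pasted exactly in the form `https://www.kaggle.com/datasets/<owner>/<dataset>`. As a minimal sketch, a more tolerant parser could look like the following; `parse_kaggle_dataset_url` is a hypothetical helper name, not part of the notebook or of the Kaggle package, and it only assumes the URL layout shown in the widget default.

```python
from urllib.parse import urlparse

def parse_kaggle_dataset_url(url):
    """Return (handle, dataset_slug) for a Kaggle dataset URL (hypothetical helper)."""
    # Keep only the path so that query strings and fragments are ignored.
    parts = [p for p in urlparse(url.strip()).path.split('/') if p]
    # Assumed layout: ['datasets', '<owner>', '<dataset>', ...]
    if len(parts) < 3 or parts[0] != 'datasets':
        raise ValueError('Not a Kaggle dataset URL: {0!r}'.format(url))
    owner, dataset = parts[1], parts[2]
    return '{0}/{1}'.format(owner, dataset), dataset

# Example with the widget's default value plus a trailing slash.
example_handle, example_ds = parse_kaggle_dataset_url(
    'https://www.kaggle.com/datasets/user/dataset/')
print(example_handle, example_ds)  # prints: user/dataset dataset
```

The two returned values correspond to what the cell above stores in `os.environ['handle']` and `os.environ['ds']`.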

### Linking to Kaggle and downloading the dataset

In [0]:
%sh
pip install kaggle
export KAGGLE_USERNAME=$kaggle_user
export KAGGLE_KEY=$kaggle_key
kaggle datasets download $handle --force

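### Creating a working directory (inside the driver)

In [0]:
%sh
# -p avoids an error if the directory already exists (e.g. when the cell is rerun).
# A `cd` here would not carry over to later cells, since each %sh cell starts a new shell.
mkdir -p /databricks/driver/$ddir/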

### Moving the downloaded dataset into the working directory

In [0]:
%sh
mv /databricks/driver/$ds.zip /databricks/driver/$ddir/$ds.zip

### Unzipping files

In [0]:
%sh
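# The archive was moved into the working directory by the previous cell, so unzip it there.
unzip /databricks/driver/$ddir/$ds.zip -d /databricks/driver/$ddir/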

### Creating two lists for the files' paths and names (for later copying)

In [0]:
csv_files = []
file_names = []

files = dbutils.fs.ls('file:/databricks/driver/')

while files:
    path = files.pop(0).path
    if path.endswith('/'):
        files += dbutils.fs.ls(path)
    elif path.endswith('.csv'):
        csv_files.append(path)
        file_names.append(path.rsplit('/',1)[1])
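
Because only the base name of each file is kept in `file_names`, two CSV files with the same name in different subfolders would end up at the same destination path. The sketch below shows one way to detect that situation before copying; `find_duplicate_names` and `sample_paths` are hypothetical and used only for illustration.

```python
from collections import defaultdict

def find_duplicate_names(paths):
    """Map each repeated base name to the list of paths that share it."""
    by_name = defaultdict(list)
    for path in paths:
        # Same base-name rule as the cell above: text after the last '/'.
        by_name[path.rsplit('/', 1)[-1]].append(path)
    return {name: group for name, group in by_name.items() if len(group) > 1}

# Illustrative paths only; in the notebook this would be the csv_files list.
sample_paths = [
    'file:/databricks/driver/a/data.csv',
    'file:/databricks/driver/b/data.csv',
    'file:/databricks/driver/a/other.csv',
]
print(find_duplicate_names(sample_paths))
```

An empty result means every base name is unique, so the flat copy would not overwrite anything.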

### Mounting to ADLS

Replace the placeholders (between `<>`) in the first cell below with your client and ADLS details so that a connection can be established.

In [0]:
configs = {"fs.azure.account.auth.type": 'OAuth',
           "fs.azure.account.oauth.provider.type": 'org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider',
           "fs.azure.account.oauth2.client.id": '<application-id>',
           "fs.azure.account.oauth2.client.secret": '<service-credential-key-name>',
           "fs.azure.account.oauth2.client.endpoint": 'https://login.microsoftonline.com/<directory-id>/oauth2/token'}

# No trailing comma here: a trailing comma would turn the string into a one-element tuple.
adls_origin = 'abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/'
mount_location = '/mnt/<mount-name>'

In [0]:
if not any(mount.mountPoint == mount_location for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source = adls_origin,
        mount_point = mount_location,
        extra_configs = configs)
else:
    print('The specified mount point was already in use, nothing needed to be done in this cell.')
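
As a rough sketch, the OAuth settings above could also be built by a small function so that the three placeholders are filled in one place, with a check that none of them was left untouched. `build_oauth_configs` and `unfilled_placeholders` are hypothetical helpers, not Databricks APIs; the configuration keys are the ones used in the cell above.

```python
import re

def build_oauth_configs(application_id, client_secret, directory_id):
    """Build the extra_configs dict for an OAuth client-credentials mount."""
    return {
        'fs.azure.account.auth.type': 'OAuth',
        'fs.azure.account.oauth.provider.type':
            'org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider',
        'fs.azure.account.oauth2.client.id': application_id,
        'fs.azure.account.oauth2.client.secret': client_secret,
        'fs.azure.account.oauth2.client.endpoint':
            'https://login.microsoftonline.com/{0}/oauth2/token'.format(directory_id),
    }

def unfilled_placeholders(values):
    """Return the values that still contain a <placeholder> marker."""
    return [v for v in values if re.search(r'<[^<>]+>', v)]

# Illustrative values only; real IDs and secrets come from your own Azure setup.
example_configs = build_oauth_configs('my-app-id', 'my-secret', 'my-directory-id')
print(unfilled_placeholders(list(example_configs.values()) + ['/mnt/<mount-name>']))
```

In a notebook, the secret would usually be read from a secret scope rather than typed into the cell.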

### Setting the raw folder's path (in ADLS)

In [0]:
# Build the path from mount_location so it matches the mount point defined above.
raw = '{0}/{1}/raw/'.format(mount_location, wkdir)

### Copying files into the raw folder

In [0]:
for file, name in zip(csv_files, file_names):
    dbutils.fs.cp(file, raw + name)

### Final Check - Ensuring all files were correctly copied into the raw folder

In [0]:
display(dbutils.fs.ls(raw))
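
The visual check above can be complemented by a programmatic comparison. The following is a minimal sketch under the assumption that the names listed in the raw folder have been collected into a plain list of strings; `missing_files` is a hypothetical helper.

```python
def missing_files(expected_names, listed_names):
    """Return the expected file names that do not appear in the listing, sorted."""
    listed = set(listed_names)
    return sorted(name for name in expected_names if name not in listed)

# Illustrative values; in the notebook, expected_names would be file_names and
# listed_names could be taken from the `name` field of each dbutils.fs.ls(raw) entry.
print(missing_files(['a.csv', 'b.csv'], ['a.csv']))  # prints: ['b.csv']
```

An empty list indicates that every collected CSV file is present in the raw folder.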