This `Jupyter Notebook` downloads the `BER Public search` dataset to your `Google Drive` and converts it to `parquet` format.

To run it:

- Register your email address with SEAI at https://ndber.seai.ie/BERResearchTool/Register/Register.aspx

- Run the cell below to install all dependencies and (if prompted) select `RESTART RUNTIME` to register them

- Run all cells by selecting `Runtime > Run All` from the dropdown menu and ...
    - Enter your email address
    - (Google Colab) Authenticate `Google Drive` by clicking the URL linked in the [Mount Google Drive](#mount-google-drive) section below

- Once all cells have finished running you can query the converted `BER Public search` data saved on your `Google Drive` in ...
    - [sandbox](https://colab.research.google.com/github/codema-dev/berpublicsearch/blob/main/notebooks/sandbox.ipynb) ... includes links to tutorials
    - [heat-loss-parameter](https://colab.research.google.com/github/codema-dev/berpublicsearch/blob/main/notebooks/heat-loss-parameter.ipynb) ... an example application of this repository

In [None]:
!pip install git+https://github.com/codema-dev/berpublicsearch

In [None]:
email_address = input("Enter your email address: ")

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
from os import mkdir
from os import path

save_directory = "/content/drive/MyDrive/berpublicsearch"

if path.exists(save_directory):
    print(f"Skipping creation of new folder as {save_directory} already exists!")
else:
    mkdir(save_directory)
    
path_to_berpublicsearch_zip = f"{save_directory}/BERPublicsearch.zip"
path_to_berpublicsearch_unzipped = f"{save_directory}/BERPublicsearch"
path_to_berpublicsearch_parquet = f"{save_directory}/BERPublicsearch_parquet"

# Download `BERPublicsearch.zip` and convert to `parquet`

`parquet` is used here in place of the default `txt` format as it is:
1. Compressed on disk: 184 MB vs 972 MB
2. Significantly faster to read, as data is read column-by-column rather than row-by-row

In [None]:
from berpublicsearch.download import download_berpublicsearch

download_berpublicsearch(email_address, path_to_berpublicsearch_zip)

In [None]:
from shutil import unpack_archive

unpack_archive(path_to_berpublicsearch_zip, path_to_berpublicsearch_unzipped)

In [None]:
from berpublicsearch.convert import convert_to_parquet

convert_to_parquet(path_to_berpublicsearch_unzipped, path_to_berpublicsearch_parquet)

# Now try ...
- [sandbox](https://colab.research.google.com/github/codema-dev/berpublicsearch/blob/main/notebooks/sandbox.ipynb) ... includes links to tutorials
- [heat-loss-parameter](https://colab.research.google.com/github/codema-dev/berpublicsearch/blob/main/notebooks/heat-loss-parameter.ipynb) ... an example application of this repository