## 01 - Acquiring smart meter data

In this notebook, we will download a recent extract of smart meter data from [Weave](https://weave.energy/about), an organisation that aims to improve access to energy data.

Weave collects and disseminates smart meter data from four UK distribution network operators (DNOs). Rather than providing data at the individual consumer level, this dataset is aggregated to the low-voltage feeder level.

We're looking at the top row of this diagram:

<img src="../docs/imgs/energy-sa-ingest-flow.png" width="800">




### What does this notebook do?

Since the data is already in a public S3 bucket, we have two options:
1. Clone the source data into our environment and save it as a table.
2. Skip the raw clone and go straight to pre-processing the data.

We're going with option 1, but feel free to modify the code in subsequent notebooks! 🔧
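
The choice between the two options can be sketched as a small decision function in plain Python. This is only an illustration: `plan_raw_ingest` and `IngestAction` are hypothetical names, not part of this project. The three flags are assumed to correspond to `CONFIG.clone_raw_data`, a table-existence check, and `CONFIG.overwrite_data`. The sketch also assumes the intent is to write the raw table only when it is missing or an overwrite is requested.

```python
from enum import Enum


class IngestAction(Enum):
    """Possible outcomes for the raw-data step (hypothetical names)."""

    SKIP_CLONE = "skip_clone"        # option 2: go straight to pre-processing
    KEEP_EXISTING = "keep_existing"  # option 1, but the raw table is already there
    WRITE_TABLE = "write_table"      # option 1: clone from S3 and save as a table


def plan_raw_ingest(
    clone_raw_data: bool, table_exists: bool, overwrite_data: bool
) -> IngestAction:
    """Decide what the raw-data step should do, without touching Spark."""
    if not clone_raw_data:
        return IngestAction.SKIP_CLONE
    if table_exists and not overwrite_data:
        return IngestAction.KEEP_EXISTING
    return IngestAction.WRITE_TABLE
```

For example, `plan_raw_ingest(True, False, False)` returns `IngestAction.WRITE_TABLE`, because cloning is enabled and the table does not exist yet. Keeping the decision separate from the Spark calls makes it easy to check every flag combination without a cluster.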



In [0]:
%run ./includes/common_functions_and_imports

In [0]:
meter_data_uri = "s3://weave.energy/smart-meter"
target_table_name = (
    f"{CONFIG.target_catalog}.{CONFIG.target_schema}.smart_meter_data_raw"
)

print(f"Pipeline is set to clone_raw_data: {CONFIG.clone_raw_data}")

if not CONFIG.clone_raw_data:
    dbutils.notebook.exit(
        f"Skipping clone of raw data from {meter_data_uri} to {target_table_name}"
    )
    
# Write to our lakehouse by reading directly from a public S3 bucket.
# Only write when the table is missing or an overwrite was explicitly requested.
if not spark.catalog.tableExists(target_table_name) or CONFIG.overwrite_data:
    (
        spark.read.format("parquet")
        .load(meter_data_uri)
        .write.saveAsTable(target_table_name, mode="overwrite")
    )