# Summary of this notebook
The weather data in the folder `windfarm_weather_data` (named `windfarm_weather_data_2019-2023` in the code) initially contained data for wind farms that are planned but not yet in service. This notebook does the following:
- Using the file `hydroquebec_wind_farms.csv`, it extracts the in-service wind farms, assigns an integer label to each of them (column `labels`), and saves the result as `hydroquebec_wind_farms_in_service.csv`.
- It moves the weather files of the in-service farms from the original folder `windfarm_weather_data` to the folder `windfarm_weather_data_in_service`.
- Finally, in each weather file in `windfarm_weather_data_in_service`, it renames the columns corresponding to `temperature`, `relative_humidity_2m`, `wind_speed_10m`, `wind_direction_10m`, and `location` by appending `_<label>`, where `<label>` is the integer label of the in-service wind farm.

In [1]:
import os
import pandas as pd

## Extract and save in service wind farms

We have a file listing all wind farms, including some projects that are still under construction (status `Planned`). We remove the unfinished wind farms and add a column of numerical labels, which makes it simpler to tell the farms apart in the large combined file.

In [2]:
# Drop all wind farms whose status is 'Planned'
# Load the CSV file of all wind farms
wf = pd.read_csv("hydroquebec_wind_farms.csv")

# Remove rows where the 'status' column has the value 'Planned'
wf = wf[wf['status'] != 'Planned']

In [3]:
# Build integer labels 1..n, one per in-service wind farm
labels = list(range(1, len(wf) + 1))

# Insert the new 'labels' column at position 1, right after 'name'
wf.insert(1, 'labels', labels)

In [4]:
#save a new csv
wf.to_csv("hydroquebec_wind_farms_in_service.csv", index=False)
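
As a quick sanity check of the filter-and-label steps above, the following sketch runs them on a small made-up table. The farm names and statuses are hypothetical placeholder rows, not values from `hydroquebec_wind_farms.csv`, and the `.copy()` call is an addition that the notebook cells do not use.

```python
import pandas as pd

# Hypothetical toy rows; the real file has more columns and more farms.
toy = pd.DataFrame({
    "name": ["Farm A", "Farm B", "Farm C"],
    "status": ["In Service", "Planned", "In Service"],
})

# Keep only farms that are not planned. Working on a copy means that
# adding a column later does not touch the original frame.
in_service = toy[toy["status"] != "Planned"].copy()

# Number the remaining farms 1..n and place the labels right after 'name'.
in_service.insert(1, "labels", list(range(1, len(in_service) + 1)))

print(in_service)
```

Note that the labels depend on row order, so they stay stable only as long as the order of farms in the source file does not change.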

## Move weather files of in service wind farms
Now we use this file to separate out the weather data files that correspond to wind farms which are in service.
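
Matching farms to weather files relies on a naming convention: spaces in the farm name become underscores, followed by the fixed suffix `_hourly_weather_2019-2023.csv`. The sketch below wraps that convention in a helper and uses it for a dry run that reports which files exist before anything is moved. The function names `weather_filename` and `split_found_missing` are hypothetical, and the demonstration uses a temporary folder with one empty file rather than the real data.

```python
import os
import tempfile

def weather_filename(farm_name, suffix="_hourly_weather_2019-2023.csv"):
    """Build the weather file name for a wind farm name."""
    return farm_name.replace(" ", "_") + suffix

def split_found_missing(names, folder):
    """Split farm names into those with and without a weather file in folder."""
    found, missing = [], []
    for name in names:
        path = os.path.join(folder, weather_filename(name))
        (found if os.path.exists(path) else missing).append(name)
    return found, missing

example = weather_filename("Le Plateau 2 wind farm")

# Dry run in a throwaway folder: one file exists, one does not.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, weather_filename("Carleton wind farm")), "w").close()
    found, missing = split_found_missing(
        ["Carleton wind farm", "Hypothetical wind farm"], tmp
    )

print(example)
print("found:", found)
print("missing:", missing)
```

Checking for missing files first makes it easier to spot a name mismatch (for example in accents or apostrophes) before any file is moved.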

In [5]:
# Paths
wf_csv_path = "hydroquebec_wind_farms_in_service.csv"
source_folder = "../windfarm_weather_data_2019-2023"
destination_folder = "../windfarm_weather_data_in_service"


In [6]:
# Create the destination folder if it does not already exist
os.makedirs(destination_folder, exist_ok=True)

In [7]:
# read csv
wf = pd.read_csv(wf_csv_path)

In [9]:
# Move files
for filename in wf["name"]:
    fullfilename = filename.replace(' ','_') + "_hourly_weather_2019-2023.csv"
    src_path = os.path.join(source_folder, fullfilename)
    dst_path = os.path.join(destination_folder, fullfilename)

    if os.path.exists(src_path):
        os.rename(src_path, dst_path)
        print(f"Moved: {fullfilename}")
    else:
        print(f"File not found: {fullfilename}")

Moved: Baie-des-Sables_wind_farm_hourly_weather_2019-2023.csv
Moved: Carleton_wind_farm_hourly_weather_2019-2023.csv
Moved: Mont-Rothery_wind_farm_hourly_weather_2019-2023.csv
Moved: De_L'Érable_wind_farm_hourly_weather_2019-2023.csv
Moved: Des_Moulins_wind_farm_hourly_weather_2019-2023.csv
Moved: Frampton_wind_farm_hourly_weather_2019-2023.csv
Moved: Gros-Morne_wind_farm_hourly_weather_2019-2023.csv
Moved: Côte-de-Beaupré_wind_farm_hourly_weather_2019-2023.csv
Moved: La_Mitis_wind_farm_hourly_weather_2019-2023.csv
Moved: Lac-Alfred_wind_farm_hourly_weather_2019-2023.csv
Moved: L'Anse-à-Valleau_wind_farm_hourly_weather_2019-2023.csv
Moved: Le_Granit_wind_farm_hourly_weather_2019-2023.csv
Moved: Le_Plateau_wind_farm_hourly_weather_2019-2023.csv
Moved: Le_Plateau_2_wind_farm_hourly_weather_2019-2023.csv
Moved: Massif_du_Sud_wind_farm_hourly_weather_2019-2023.csv
Moved: Montagne_Sèche_wind_farm_hourly_weather_2019-2023.csv
Moved: Montérégie_wind_farm_hourly_weather_2019-2023.csv
Moved: Mo

We have now moved all the relevant CSV files into one folder. They all have the same column names, so I want to append each farm's label to its column names so that they can be told apart in the big CSV.
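
One thing to keep in mind when appending labels to column names is that running the step twice would append the label twice. The sketch below shows one possible way to make the renaming safe to repeat. The helper name `add_label_suffix` and the toy column names `time` and `date` are assumptions for illustration; only `wind_speed_10m` is taken from this notebook.

```python
import pandas as pd

def add_label_suffix(df, label, keep=2):
    """Return a copy of df whose columns after the first `keep` end in _<label>."""
    suffix = f"_{label}"
    renamed = list(df.columns[:keep]) + [
        # Leave columns that already carry the suffix, so a second run is a no-op.
        col if col.endswith(suffix) else f"{col}{suffix}"
        for col in df.columns[keep:]
    ]
    out = df.copy()
    out.columns = renamed
    return out

# Hypothetical two-row weather table.
toy = pd.DataFrame({
    "time": ["2019-01-01T00:00", "2019-01-01T01:00"],
    "date": ["2019-01-01", "2019-01-01"],
    "wind_speed_10m": [5.1, 6.3],
})

once = add_label_suffix(toy, 3)
twice = add_label_suffix(once, 3)
print(list(twice.columns))
```

The `endswith` check is a simple heuristic: a column whose original name already ends in `_<label>` would be skipped, so it is worth confirming that no raw column name has that form.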

## WARNING: this code overwrites the files in the folder "windfarm_weather_data_in_service"! Save a copy of this folder before running!
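
One way to make the copy recommended above is with `shutil.copytree` from the standard library. This is a minimal sketch: the helper name `backup_folder` and the `_backup` suffix are hypothetical choices, and the demonstration runs on a temporary directory rather than the real data folder.

```python
import shutil
import tempfile
from pathlib import Path

def backup_folder(folder, suffix="_backup"):
    """Copy folder to a sibling named <folder><suffix> and return the new path."""
    src = Path(folder)
    dst = src.with_name(src.name + suffix)
    # copytree raises FileExistsError if dst already exists, so an
    # earlier backup is never silently overwritten.
    shutil.copytree(src, dst)
    return dst

# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "windfarm_weather_data_in_service"
    demo.mkdir()
    (demo / "example.csv").write_text("a,b\n1,2\n")
    backup = backup_folder(demo)
    backup_files = sorted(p.name for p in backup.iterdir())

print(backup.name, backup_files)
```

In the notebook this would be called as `backup_folder(destination_folder)` before the renaming cell is run.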

## Modify the column names with wind-farm labels

In [10]:
wf.columns

Index(['name', 'labels', 'project_type', 'capacity_MW', 'region', 'status',
       'commissioning_date', 'latitude', 'longitude'],
      dtype='object')

In [12]:
for filename, label in zip(wf["name"], wf["labels"]):
    
    file_path = os.path.join(destination_folder, filename.replace(' ','_') + "_hourly_weather_2019-2023.csv")

    if os.path.exists(file_path):
        try:
            df = pd.read_csv(file_path)

            # Keep the first two columns unchanged; append the farm label to the rest
            new_columns = list(df.columns[:2]) + [f"{column}_{label}" for column in df.columns[2:]]
            df.columns = new_columns

            # Save the file (overwrite)
            df.to_csv(file_path, index=False)
            print(f"Updated columns in {filename}")
        except Exception as e:
            print(f"Error processing {filename}: {e}")
    else:
        print(f"File not found: {filename}")

Updated columns in Baie-des-Sables wind farm
Updated columns in Carleton wind farm
Updated columns in Mont-Rothery wind farm
Updated columns in De L'Érable wind farm
Updated columns in Des Moulins wind farm
Updated columns in Frampton wind farm
Updated columns in Gros-Morne wind farm
Updated columns in Côte-de-Beaupré wind farm
Updated columns in La Mitis wind farm
Updated columns in Lac-Alfred wind farm
Updated columns in L'Anse-à-Valleau wind farm
Updated columns in Le Granit wind farm
Updated columns in Le Plateau wind farm
Updated columns in Le Plateau 2 wind farm
Updated columns in Massif du Sud wind farm
Updated columns in Montagne Sèche wind farm
Updated columns in Montérégie wind farm
Updated columns in Mont-Louis wind farm
Updated columns in New Richmond wind farm
Updated columns in Pierre-De Saurel wind farm
Updated columns in Mesgi'g Ugju's'n wind farm
Updated columns in Rivière-du-Moulin wind farm
Updated columns in Des Cultures wind farm
Updated columns in Saint-Damase win