# Environment Setup

To run this notebook and its scripts, follow these steps:

1. **Create a virtual environment:**
   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```
2. **Install requirements:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Register the environment as a Jupyter kernel:**
   If you use Jupyter, make sure the kernel runs in your virtual environment. This step requires `ipykernel` to be installed in that environment.
   ```bash
   python -m ipykernel install --user --name=venv
   ```
   Then select the 'venv' kernel in the notebook interface.
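
To confirm that the selected kernel really runs inside the virtual environment, you can inspect the interpreter from a notebook cell. The following is a minimal sketch using only the standard library; the helper name `in_virtualenv` is illustrative and not part of this project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


# The interpreter path should sit inside the venv/ directory created in step 1.
print("Interpreter:", sys.executable)
print("Virtual environment active:", in_virtualenv())
```

If the printed path does not point into `venv/`, the notebook is probably still using a different kernel.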

# Offshore Wind Farm Data Extraction

This notebook documents the script used to extract relevant data from an offshore wind farm dataset. The data was retrieved from `https://emodnet.ec.europa.eu/geoviewer/#` and last updated in August 2025.

## Data Filtering

Two filters are applied to the data:

- **Status:** Only wind farms with `status = Production` are included.
- **Turbines:** Only wind farms with more than one turbine (`n_turbines > 1`) are included.

The filtered data is saved as a CSV file for further analysis.

In [None]:
import pandas as pd
import json

# Load and normalize the wind farm data from JSON
with open('windfarms.json', 'r') as f:
    data = json.load(f)

df = pd.json_normalize(data['features'])

# GeoJSON stores point coordinates as [longitude, latitude]
df[['geometry.coordinates.longitude', 'geometry.coordinates.latitude']] = pd.DataFrame(
    df['geometry.coordinates'].tolist(), index=df.index
)

# Apply filters: status == 'Production' and n_turbines > 1
df_relevant = df[(df['properties.status'] == 'Production') & (df['properties.n_turbines'] > 1)].copy()

# Drop irrelevant columns
df_relevant = df_relevant.drop(columns=[
    'type',
    'id',
    'geometry_name',
    'geometry.type',
    'geometry.coordinates',
    'properties.notes',
])

# Sort the columns of the filtered dataframe alphabetically
df_relevant = df_relevant[sorted(df_relevant.columns)]

# Save the filtered dataframe to CSV
df_relevant.to_csv('windfarms_relevant.csv', index=False)
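
The effect of the two filters can be checked on a small sample before running the cell on the full file. The sketch below assumes the same GeoJSON layout as `windfarms.json` with point geometries; the three features and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample in the same layout as windfarms.json (values are invented)
sample = {
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [7.0, 54.0]},
            "properties": {"status": "Production", "n_turbines": 80},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [3.5, 51.7]},
            "properties": {"status": "Planned", "n_turbines": 40},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [11.2, 55.1]},
            "properties": {"status": "Production", "n_turbines": 1},
        },
    ]
}

toy = pd.json_normalize(sample["features"])

# GeoJSON points are ordered [longitude, latitude]
toy[["longitude", "latitude"]] = pd.DataFrame(
    toy["geometry.coordinates"].tolist(), index=toy.index
)

# Same two conditions as in the notebook cell above
mask = (toy["properties.status"] == "Production") & (toy["properties.n_turbines"] > 1)
kept = toy[mask]

print(kept[["properties.status", "properties.n_turbines", "longitude", "latitude"]])
```

Only the first feature passes both conditions: the second is not in production, and the third has a single turbine. The status comparison is case-sensitive, so `production` would not match `Production`.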

# Extended Database Creation

The dataset created above (`windfarms_relevant.csv`) was supplied with the prompts below to generate extended CSV files. These extended files include the names of the wind farms and of the organizations (vendors and operators) involved. The results can be found under the `operators_vendors/` directory.

The prompt was provided along with `windfarms_relevant.csv` to the following models:

- DeepSeek: DeepSeek-V3
- Perplexity: the built-in default "Best" selection, which chooses a model depending on the prompt
- ChatGPT: ChatGPT 5

Research mode was enabled for all prompts.

## Prompts Used

### DeepSeek
```
Below is a csv with all commercially operational offshore wind farms in europe which are relevant to me. Get me the name of all the operator and vendor organisations involved for each farm. Use the information in the csv in your research. Don't get back to me until its done. Make sure to include more than one vendor or operator where appropriate (which might be referenced in the modat magnify given data).  Compile all the information and give me a csv with everything.
```

### ChatGPT
```
Below is a csv with all commercially operational offshore wind farms in europe which are relevant to me. Get me the name of all the operator and vendor organisations involved for each farm. Use the information in the csv in your research. Don't get back to me until its done. Make sure to include more than one vendor or operator where appropriate (which might be referenced in the modat magnify given data)
```

### Perplexity
```
Below is a csv with all commercially operational offshore wind farms in europe which are relevant to me. Get me the name of all the operator and vendor organisations involved for each farm. Use the information in the csv in your research. Don't get back to me until its done. Make sure to include more than one vendor or operator where appropriate (which might be referenced in the modat magnify given data).  Compile all the information and give me a csv with everything.
```