# **Photovoltaic panel segmentation on building facades**

Ayca Duran*, Pedram Mirabian, Panagiotis Karapiperis, Christoph Waibel,
Bernd Bickel and Arno Schlueter

## Data Collection Script

This notebook provides the script for retrieving the images collected in this project. The collection includes both Google Street View imagery (marked as GSV), which must be accessed with a valid Google Maps Platform API key, and images collected from around the internet (marked as Web). To balance the dataset, the images featuring PV were complemented by a roughly equal number of images of buildings without PV, drawn from GSV captures in the city of Zürich and Web captures from Google Images.

The input file lists each image name (as it appears in the dataset) along with its category (the image source, GSV or Web, and whether or not the image contains PV), its split (training, validation, or test set), and the link used to access it.
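The category labels that appear in the table further down (`PVgsv`, `noPVgsv`, `PVweb`, `noPVweb`) combine the PV flag and the image source in a single string. As a minimal sketch, assuming these four labels are the only ones in use, a small helper can split them apart. The function name `parse_category` is hypothetical and not part of the original script.

```python
def parse_category(cat):
    """Split a category label such as 'noPVgsv' into (source, has_pv)."""
    # Check the longer prefix first so that 'noPV...' is not mistaken for 'PV...'.
    for prefix, has_pv in (('noPV', False), ('PV', True)):
        if cat.startswith(prefix):
            source = cat[len(prefix):]
            if source in ('gsv', 'web'):
                return source, has_pv
    # Assumption: any other label is unexpected and should be reported.
    raise ValueError(f"unexpected category label: {cat!r}")


for label in ['PVgsv', 'noPVgsv', 'PVweb', 'noPVweb']:
    print(label, parse_category(label))
```

This can be useful for filtering, for example to process only the GSV images once an API key is available.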

## Data Availability

In some cases, the links used to save the images became unavailable between the time of data collection and the publication of this notebook, leading to missing values in the 'link' column. Please contact us for further details and access to these images.
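To see which images are affected before starting a download, the rows with an empty 'link' value can be listed. The sketch below uses only the standard library and a small inline sample. The file names and URLs in the sample are hypothetical, and the column layout (including a first column named `index`) is assumed from the table displayed later in this notebook.

```python
import csv
import io

# Hypothetical sample with the same columns as image_links.csv.
sample_csv = """index,file_name,cat,split,link
0,example_a.jpg,PVgsv,train,https://example.com/a.jpg
1,example_b.jpg,noPVweb,train,
2,example_c.jpg,PVweb,test,https://example.com/c.jpg
"""


def rows_without_link(csv_text):
    """Return the file names of all rows whose 'link' field is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row['file_name'] for row in reader if not row['link'].strip()]


missing = rows_without_link(sample_csv)
print(f"{len(missing)} image(s) without a link: {missing}")
```

To run this against the real input file, the contents of `image_links.csv` would be passed in place of `sample_csv`.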

In [None]:
#@title Change this, run the rest

# STEP 1: Set your Google Maps Platform API key
api_key = 'YOUR_API_KEY'

# STEP 2: Load the input file image_links.csv
# if using Google Colab: You can just drag the file into the Files tab on the left.
input_file = '/content/image_links.csv'

# STEP 3: Set a custom save location if needed (otherwise files go to the current working directory, which is temporary on Colab)
# os.chdir(path)
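As an alternative to typing the key into the notebook, it can be read from an environment variable so that it is not saved with the file. This is only a sketch: the variable name `GOOGLE_MAPS_API_KEY` and the helper `load_api_key` are hypothetical choices, not something the original script defines.

```python
import os


def load_api_key(env=os.environ, var_name='GOOGLE_MAPS_API_KEY',
                 placeholder='YOUR_API_KEY'):
    """Return the key stored in the environment, or the placeholder if unset."""
    return env.get(var_name, placeholder)


api_key = load_api_key()

if api_key == 'YOUR_API_KEY':
    # With the placeholder still in place, GSV requests are expected to fail.
    print("No API key set; replace the placeholder before downloading GSV images.")
```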

In [None]:
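def get_image(row, plot=True):
    """Download the image described by one row of the input table."""
    file_name = row['file_name']
    split = row['split']
    link = row['link']

    print(f"\n{file_name}")

    # Rows whose link became unavailable carry an empty 'link' value.
    if link == '':
        print("no links available")
        return

    # Insert the user's key into the placeholder used by the stored links.
    link = link.replace('API_KEY_HERE', api_key)

    response = requests.get(link, headers=headers)
    print(f"response {response.status_code}")

    # Skip failed requests so that error pages are not saved as image files.
    if response.status_code != 200:
        return

    if plot:
        img = Image.open(io.BytesIO(response.content))
        plt.imshow(img)
        plt.axis("off")
        plt.show()

    # Save the raw bytes under downloaded/<split>/<file_name>.
    with open(os.path.join(os.getcwd(), 'downloaded', split, file_name), 'wb') as f:
        f.write(response.content)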
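Because the Web links point to arbitrary servers, a response body is not guaranteed to be an image even when the request succeeds. A minimal sketch of a sanity check that could be applied to `response.content` before writing it to disk is shown below. It only recognises the JPEG and PNG file signatures, which is an assumption about the formats in this collection, and the helper name `looks_like_image` is hypothetical.

```python
def looks_like_image(data):
    """Return True if the bytes start with a JPEG or PNG file signature."""
    jpeg_signature = b'\xff\xd8\xff'
    png_signature = b'\x89PNG\r\n\x1a\n'
    return data.startswith(jpeg_signature) or data.startswith(png_signature)


# An HTML error page does not pass the check, a JPEG header does.
print(looks_like_image(b'<html><body>Not found</body></html>'))
print(looks_like_image(b'\xff\xd8\xff\xe0' + b'\x00' * 16))
```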

In [None]:
import requests
import pandas as pd
import os
import matplotlib.pyplot as plt
import io
from PIL import Image

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0 Safari/537.36"}

df = pd.read_csv(input_file, index_col= 0)
df.fillna('', inplace= True)

df

| index | file_name | cat | split | link |
|---|---|---|---|---|
| 0 | 006_gsv_00_jpg.rf.36be7fd85fcfee1d8d1fea13b103... | PVgsv | train | https://maps.googleapis.com/maps/api/streetvie... |
| 1 | 016_gsv_00_jpg.rf.14e1682aaa9f7deb3aa5d7e39faa... | PVgsv | train | https://maps.googleapis.com/maps/api/streetvie... |
| 2 | 283_147503_135nopv_jpg.rf.fea8ee38e4092d194653... | noPVgsv | train | https://maps.googleapis.com/maps/api/streetvie... |
| 3 | 32_jpg.rf.ba36860180941726699154d186dfe15c.jpg | noPVweb | train | |
| 4 | 265_web_02_jpg.rf.a87e9fe1053def404b991bdb87bd... | PVweb | train | https://assets.solarix.prod.verveagency.com/as... |
| ... | ... | ... | ... | ... |
| 1028 | 275_web_01_jpg.rf.fe7b616aed46d695f3f08daa06b4... | PVweb | test | https://kzp-architekten.com/media/yrewrite_seo... |
| 1029 | 392_151337_179nopv_jpg.rf.1ccb05b42f45d348a1f8... | noPVgsv | test | https://maps.googleapis.com/maps/api/streetvie... |
| 1030 | 240_web_01_jpg.rf.f5f0d78ac430efe7e349ee214b16... | PVweb | test | https://integratedpv.eurac.edu/sites/default/f... |
| 1031 | 496_154381_277nopv_jpg.rf.77f34ca8613da7b9d158... | noPVgsv | test | https://maps.googleapis.com/maps/api/streetvie... |


In [None]:
# download images

for item in ['train', 'val', 'test']:
    os.makedirs(os.path.join(os.getcwd(), 'downloaded', item), exist_ok= True)

# Iterate explicitly, since get_image is called only for its side effects.
for _, row in df.iterrows():
    get_image(row)
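After the download has finished, it can help to compare the number of saved files per split against the input table, since rows without a link are skipped. The sketch below counts the files in each split folder. The helper name `count_downloaded` is hypothetical, and the folder layout is assumed to match the `downloaded/<split>` directories created above.

```python
import os


def count_downloaded(root='downloaded', splits=('train', 'val', 'test')):
    """Count the files saved in each split folder under the given root."""
    counts = {}
    for split in splits:
        folder = os.path.join(root, split)
        # A missing folder is reported as zero files rather than an error.
        counts[split] = len(os.listdir(folder)) if os.path.isdir(folder) else 0
    return counts


print(count_downloaded())
```

The resulting counts could then be compared with `df['split'].value_counts()` to see how many images per split could not be retrieved.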