## **Segmentation and power estimation of solar panels using satellite images**

Devs:
*   Juan Carlos Cortez Aucapiña - 265568
*   Luiza Higino Silva Santos - 264535
*   Sergio Augusto de Almeida Christoforo - 249522

#### **Objective**

The goal of this project is to use image analysis and pattern recognition tools to segment solar panels in satellite images. From the segmentation, the power of the panels is then estimated with a neural network.

## **Libraries and Imports**

In [None]:
import pandas as pd
import os
from google.colab import drive 
import numpy as np
import json
import random
import math
from sklearn.model_selection import train_test_split

from PIL import Image

In [None]:
drive.mount('/content/gdrive')
data_dir = "/content/gdrive/MyDrive/bdappv/"

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


## **Functions**

In [None]:
#______________________________________________________________________________#
#  Function to obtain a list of files
#______________________________________________________________________________#
def get_list(GoogleOrIgn, MaskOrImg):
    """
    GoogleOrIgn: A string that can be 'Google' or 'Ign' to indicate the path of the 
    files.

    MaskOrImg: A string that can be 'Mask' or 'Img' to indicate which files to list.

    Returns a list with all the file names from a directory on the drive.
    """

    if GoogleOrIgn == 'Google':
        path = '/content/gdrive/MyDrive/bdappv/google'

    elif GoogleOrIgn == 'Ign':
        path = '/content/gdrive/MyDrive/bdappv/ign'

    else:
        raise ValueError("GoogleOrIgn must be 'Google' or 'Ign'.")

    if MaskOrImg == 'Mask':
        files_path = path+'/mask'

    elif MaskOrImg == 'Img':
        files_path = path+'/img'

    else:
        raise ValueError("MaskOrImg must be 'Mask' or 'Img'.")

    files_list = os.listdir(files_path)

    return files_list
#______________________________________________________________________________#
#  Function to save a list of files
#______________________________________________________________________________#
def save_list(files_list, GoogleOrIgn, MaskOrImg):
    """
    GoogleOrIgn: A string that can be 'Google' or 'Ign' to indicate the path where 
    the files will be saved.

    MaskOrImg: A string that can be 'Mask' or 'Img' to indicate the name of the file
    to be saved.

    Saves a list of files obtained with the get_list command.
    """

    if GoogleOrIgn == 'Google':
        path = '/content/gdrive/MyDrive/bdappv/google'

    elif GoogleOrIgn == 'Ign':
        path = '/content/gdrive/MyDrive/bdappv/ign'

    else:
        raise ValueError("GoogleOrIgn must be 'Google' or 'Ign'.")

    if MaskOrImg == 'Mask':
        file_name = path+'/listed_masks.txt'

    elif MaskOrImg == 'Img':
        file_name = path+'/listed_imgs.txt'

    else:
        raise ValueError("MaskOrImg must be 'Mask' or 'Img'.")

#   print(file_name)
    np.savetxt(file_name, files_list, fmt='%s')
#______________________________________________________________________________#
#  Function to read a saved list of files
#______________________________________________________________________________#
def read_list(GoogleOrIgn, MaskOrImg):
    """
    GoogleOrIgn: A string that can be 'Google' or 'Ign' to indicate the path where 
    the file to be read is located.

    MaskOrImg: A string that can be 'Mask' or 'Img' to indicate the name of the file
    to be read.

    Reads a list that contains all the files obtained with the get_list command and 
    saved with save_list.
    """

    if GoogleOrIgn == 'Google':
        path = '/content/gdrive/MyDrive/bdappv/google'

    elif GoogleOrIgn == 'Ign':
        path = '/content/gdrive/MyDrive/bdappv/ign'

    else:
        raise ValueError("GoogleOrIgn must be 'Google' or 'Ign'.")

    if MaskOrImg == 'Mask':
        file_name = path+'/listed_masks.txt'

    elif MaskOrImg == 'Img':
        file_name = path+'/listed_imgs.txt'

    else:
        raise ValueError("MaskOrImg must be 'Mask' or 'Img'.")

#    with open(file_name, 'r') as file:
#        lines = file.readlines()
#        files_list = np.loadtxt(lines, dtype=str, delimiter='\n', comments=None)
#    return files_list

    #files_list = np.loadtxt(file_name, dtype=str)
    files_list = np.genfromtxt(file_name, dtype=str, delimiter='\n', invalid_raise=False, filling_values="")

    return files_list
#______________________________________________________________________________#
#  Function to categorize the dataset
#______________________________________________________________________________#
def categorize_data(google_img_list, ign_img_list, google_mask_list, ign_mask_list):
    """
    google_img_list: List of images captured from the Google database;

    ign_img_list: List of images captured from the IGN database;

    google_mask_list: List of masks associated with the Google images;

    ign_mask_list: List of masks associated with the IGN images;

    Returns two category dictionaries with the occurrence count of 'masked' and 
    'unmasked' in the Google and IGN databases. It also returns two dictionaries, 
    google_data and ign_data, that store the categories for each image in their 
    respective databases.
    """

    google_categories = {'masked': 0, 'unmasked': 0}
    ign_categories = {'masked': 0, 'unmasked': 0}

    google_data = {}
    ign_data = {}

    for img in google_img_list:
        if img in google_mask_list:
            google_data[img] = 'masked'
            google_categories['masked'] += 1
        else:
            google_data[img] = 'unmasked'
            google_categories['unmasked'] += 1

    for img in ign_img_list:
        if img in ign_mask_list:
            ign_data[img] = 'masked'
            ign_categories['masked'] += 1
        else:
            ign_data[img] = 'unmasked'
            ign_categories['unmasked'] += 1

    return google_categories, ign_categories, google_data, ign_data


#______________________________________________________________________________#
#  Function to filter data with and without masks
#______________________________________________________________________________#
def filter_dictionary(dictionary, value):
    """
    dictionary: Input data, dictionary of values with and without masks 
    (google_data or ign_data).

    value: Desired rule ('masked' or 'unmasked').

    Returns a data vector that adheres to the established rule.
    """

    filtered_dictionary = [key for key, val in dictionary.items() if val == value]

    return filtered_dictionary
#______________________________________________________________________________#
#  Function to randomly select elements from the list of images
#______________________________________________________________________________#
def select_random_percentage(data_list, percentage):
    """
    data_list: A list of any elements.

    percentage: The percentage (0 to 100) of elements to select at random from the list.

    Returns a new list with randomly chosen elements.
    """
    # Fix the seed so the selection is reproducible
    random.seed(42)
    
    # Compute how many elements correspond to the percentage
    n_elements = int(len(data_list) * (percentage / 100))
    
    # Randomly select n_elements elements
    selected_elements = random.sample(data_list, n_elements)
    
    # Return the list of selected elements
    return selected_elements
#______________________________________________________________________________#
#  Function to select images based on uniformity criterion
#______________________________________________________________________________#
def list_threshold(GoogleOrIgn, threshold=0.95):
    """
    GoogleOrIgn: A string that can be 'Google' or 'Ign' to indicate the path where 
    the file to be read is located.

    threshold (float): Minimum fraction (0 to 1) of values in the image array that 
    must share the same value for the image to be flagged.

    Returns a dictionary that maps each image file name to True when the image 
    meets the uniformity criterion and to False otherwise.
    """

    if GoogleOrIgn == 'Google':
        path = '/content/gdrive/MyDrive/bdappv/google/'
    elif GoogleOrIgn == 'Ign':
        path = '/content/gdrive/MyDrive/bdappv/ign/'
    else:
        raise ValueError("GoogleOrIgn must be 'Google' or 'Ign'.")

    results = {}
    img_list = read_list(GoogleOrIgn, 'Img')
    img_path = path + 'img'

    for i, filename in enumerate(img_list):
        filepath = os.path.join(img_path, filename)
        print(i)  # progress indicator: index of the image being processed
        img = Image.open(filepath)
        img_np = np.array(img)
        counts = np.bincount(img_np.ravel())
        max_val = np.argmax(counts)
        count_max = counts[max_val]

        if (count_max / img_np.size) >= threshold:
            results[filename] = True
        else:
            results[filename] = False
    return results
#______________________________________________________________________________#
#  Function to clear the dictionary
#______________________________________________________________________________#
def clean_dictionary(dictionary, vector):
    """
    dictionary: Dictionary with the list of images.

    vector: Vector with all the terms to be removed.

    Returns a dictionary containing only the terms that are not present in the 
    vector.
    """
    for key in vector:
        dictionary.pop(key)

    return dictionary
#______________________________________________________________________________#
#  Function to save the dictionary
#______________________________________________________________________________#
def save_dictionary(dictionary, path_filename):    
    """
    dictionary: Dictionary with the list of images.

    path_filename: Path where the file will be saved along with the file name.

    Saves the dictionary at the chosen path.
    """

    with open(path_filename, 'w') as file:
        json.dump(dictionary, file)
#______________________________________________________________________________#
#  Function to read the dictionary
#______________________________________________________________________________#
def read_dictionary(path_filename):
    """
    path_filename: Path where the dictionary file is saved, along with the file name

    Return the saved dictionary.
    """

    with open(path_filename, 'r') as file:
        content = file.read()

    dictionary = json.loads(content)

    return(dictionary)
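
The functions `get_list`, `save_list`, `read_list` and `list_threshold` all repeat the same branching on `GoogleOrIgn` and `MaskOrImg`. A minimal sketch of how that lookup could be shared is shown below. The names `BASE_DIR`, `SOURCES`, `KINDS` and `resolve_dir` are hypothetical and not part of the notebook; only the base path and the accepted argument values come from the code above.

```python
# Hypothetical helper: none of these names exist in the original notebook.
BASE_DIR = "/content/gdrive/MyDrive/bdappv"  # base path used throughout the notebook

SOURCES = {"Google": "google", "Ign": "ign"}  # accepted values for GoogleOrIgn
KINDS = {"Mask": "mask", "Img": "img"}        # accepted values for MaskOrImg


def resolve_dir(GoogleOrIgn, MaskOrImg, base_dir=BASE_DIR):
    """Return the directory that holds the requested images or masks."""
    if GoogleOrIgn not in SOURCES:
        raise ValueError("GoogleOrIgn must be 'Google' or 'Ign'.")
    if MaskOrImg not in KINDS:
        raise ValueError("MaskOrImg must be 'Mask' or 'Img'.")
    # Paths are joined with '/' because the notebook targets Colab (Linux).
    return "/".join([base_dir, SOURCES[GoogleOrIgn], KINDS[MaskOrImg]])


print(resolve_dir("Ign", "Mask"))  # /content/gdrive/MyDrive/bdappv/ign/mask
```

With a helper like this, each function would validate its arguments in one place, and a change to the Drive layout would only touch `BASE_DIR`.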

## **Code**


In [None]:
#google_img_list = get_list('Google','Img'); save_list(google_img_list,'Google','Img');
#ign_img_list = get_list('Ign','Img'); save_list(ign_img_list,'Ign','Img');
#google_mask_list = get_list('Google','Mask'); save_list(google_mask_list,'Google','Mask');
#ign_mask_list = get_list('Ign','Mask'); save_list(ign_mask_list,'Ign','Mask');

In [None]:
google_img_list = read_list('Google', 'Img').tolist();
google_mask_list = read_list('Google', 'Mask').tolist();
ign_img_list = read_list('Ign', 'Img').tolist();
ign_mask_list = read_list('Ign', 'Mask').tolist();

In [None]:
google_categories, ign_categories, google_data, ign_data = categorize_data(google_img_list, ign_img_list, google_mask_list, ign_mask_list);

In [None]:
print('Images in the database obtained from Google: ',google_categories);
print('Images in the database obtained from IGN: ',ign_categories);

Images in the database obtained from Google:  {'masked': 13303, 'unmasked': 15522}
Images in the database obtained from IGN:  {'masked': 7685, 'unmasked': 9649}
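
`categorize_data` tests `img in google_mask_list` against a Python list, which scans the list for every image. With tens of thousands of file names per source, a set lookup is likely to be noticeably faster. The sketch below shows the same categorization for a single source. The function name `categorize_one_source` and the toy file names are illustrative assumptions, not part of the notebook.

```python
def categorize_one_source(img_list, mask_list):
    """Label each image as 'masked' or 'unmasked' using a set for O(1) lookups."""
    mask_set = set(mask_list)  # built once, reused for every image
    counts = {'masked': 0, 'unmasked': 0}
    data = {}
    for img in img_list:
        label = 'masked' if img in mask_set else 'unmasked'
        data[img] = label
        counts[label] += 1
    return counts, data


# Toy file names, for illustration only.
imgs = ["a.png", "b.png", "c.png"]
masks = ["a.png", "c.png"]
counts, data = categorize_one_source(imgs, masks)
print(counts)  # {'masked': 2, 'unmasked': 1}
```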


In [None]:
google_masked = filter_dictionary(google_data, 'masked');
google_unmasked = filter_dictionary(google_data, 'unmasked');
ign_masked = filter_dictionary(ign_data, 'masked');
ign_unmasked = filter_dictionary(ign_data, 'unmasked');

In [None]:
# Only needs to be run once to obtain the dictionaries.
#threshold_list_google = list_threshold('Google', threshold=0.95);
#save_dictionary(threshold_list_google, '/content/gdrive/MyDrive/bdappv/google/threshold_data_google.txt')
#threshold_list_ign = list_threshold('Ign', threshold=0.95);
#save_dictionary(threshold_list_ign, '/content/gdrive/MyDrive/bdappv/ign/threshold_data_ign.txt')

In [None]:
#threshold_array_google =np.loadtxt('/content/gdrive/MyDrive/bdappv/google/threshold_data_google.txt', delimiter=',', dtype=str)
#threshold_array_ign = np.loadtxt('/content/gdrive/MyDrive/bdappv/ign/threshold_data_ign.txt', delimiter=',', dtype=str)

threshold_list_google = read_dictionary('/content/gdrive/MyDrive/bdappv/google/threshold_data_google.txt');
threshold_list_ign = read_dictionary('/content/gdrive/MyDrive/bdappv/ign/threshold_data_ign.txt');
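
The threshold dictionaries are stored as JSON even though the files end in `.txt`. JSON keeps booleans as booleans, which is why filtering the reloaded dictionary by the value `True` works without any conversion. The following sketch illustrates the round trip with a temporary file. The file names and flag values are made up for the example.

```python
import json
import os
import tempfile

# Made-up flags in the same shape as the output of list_threshold.
flags = {"A.png": True, "B.png": False, "C.png": True}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "threshold_data.txt")
    with open(path, "w") as file:
        json.dump(flags, file)          # same call as in save_dictionary
    with open(path, "r") as file:
        restored = json.load(file)      # equivalent to json.loads(file.read())

# Same comprehension as filter_dictionary(restored, True).
flagged = [key for key, val in restored.items() if val == True]
print(flagged)  # ['A.png', 'C.png']
```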

In [None]:
trash_data_google = filter_dictionary(threshold_list_google, True);
print('Number of images flagged as errors - Google: ',len(trash_data_google));
trash_data_ign = filter_dictionary(threshold_list_ign, True);
print('Number of images flagged as errors - IGN: ',len(trash_data_ign));

Number of images flagged as errors - Google:  177
Number of images flagged as errors - IGN:  95
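
The flagged images are the ones that `list_threshold` considers almost uniform: the most frequent value accounts for at least 95% of all entries in the image array. The sketch below applies the same `np.bincount` test to two synthetic arrays instead of real files. The helper name `dominant_fraction` is an assumption introduced for the example. The fraction is computed over every channel value, not over whole pixels.

```python
import numpy as np


def dominant_fraction(img_np):
    """Fraction of array entries equal to the most frequent value."""
    counts = np.bincount(img_np.ravel())  # requires non-negative integers, e.g. uint8
    return counts.max() / img_np.size


# Synthetic 4x4 RGB arrays, for illustration only.
blank = np.full((4, 4, 3), 255, dtype=np.uint8)              # every value is 255
varied = np.arange(48, dtype=np.uint8).reshape(4, 4, 3)      # every value is distinct

print(dominant_fraction(blank) >= 0.95)   # True: would be flagged
print(dominant_fraction(varied) >= 0.95)  # False: would be kept
```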


## **Test set selection**

In [None]:
cleaned_list_masked = [term for term in google_masked if term not in trash_data_google];
cleaned_list_unmasked = [term for term in google_unmasked if term not in trash_data_google];

# Full dataset:
random.seed(42);
selected_google_masked = cleaned_list_masked;
selected_google_unmasked = random.sample(cleaned_list_unmasked, math.ceil((0.3*len(selected_google_masked))/0.7))
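
The sample size `math.ceil((0.3*len(selected_google_masked))/0.7)` follows from requiring that unmasked images make up about 30% of the combined set: if `n_unmasked / (n_masked + n_unmasked) = 0.3`, then `n_unmasked = 0.3 * n_masked / 0.7`. A short check of that arithmetic, using a made-up count of 1000 masked images:

```python
import math

n_masked = 1000  # hypothetical count, not the real size of the cleaned list

# Same expression as in the cell above.
n_unmasked = math.ceil((0.3 * n_masked) / 0.7)

unmasked_share = n_unmasked / (n_masked + n_unmasked)
print(n_unmasked)                # 429
print(round(unmasked_share, 3))  # 0.3
```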

In [None]:
train_validation_set_masked, test_set_masked = train_test_split(selected_google_masked, train_size=0.8, random_state=42)
train_set_masked, validation_set_masked = train_test_split(train_validation_set_masked, train_size=0.75, random_state=42)
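
The two chained calls to `train_test_split` first hold out 20% for testing and then split the remaining 80% into 75% training and 25% validation, which gives an overall split of roughly 60/20/20. The sketch below only reproduces that arithmetic. The function name `split_fractions` is hypothetical, and the exact number of items in each set depends on how the library rounds.

```python
def split_fractions(first_train_size, second_train_size):
    """Overall (train, validation, test) fractions of two chained splits."""
    test = 1.0 - first_train_size
    train = first_train_size * second_train_size
    validation = first_train_size * (1.0 - second_train_size)
    return train, validation, test


train, validation, test = split_fractions(0.8, 0.75)
print(round(train, 2), round(validation, 2), round(test, 2))  # 0.6 0.2 0.2
```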

## **Some tests**

In [None]:
cleaned_list_masked_ign = [term for term in ign_masked if term not in trash_data_ign];
cleaned_list_unmasked_ign = [term for term in ign_unmasked if term not in trash_data_ign];

In [None]:
A = random.sample(cleaned_list_unmasked_ign, 1)
print(A)

['GVTSI1CBEVTYEX.png']


In [None]:
def check_common_terms(list1, list2):
    # True when the two lists share at least one element
    return bool(set(list1) & set(list2))

In [None]:
check_common_terms(train_set_masked,validation_set_masked)

False
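
The check above covers only the training and validation sets. A possible extension, sketched under the assumption that all three splits should be mutually disjoint, compares every pair and reports which file names overlap. The function name `overlapping_pairs` and the toy file names are illustrative only.

```python
from itertools import combinations


def overlapping_pairs(named_lists):
    """Return {(name_a, name_b): shared_items} for every pair that overlaps."""
    overlaps = {}
    for (name_a, a), (name_b, b) in combinations(named_lists.items(), 2):
        common = set(a) & set(b)
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Toy splits with one deliberate leak between train and test.
splits = {
    "train": ["a.png", "b.png"],
    "validation": ["c.png"],
    "test": ["d.png", "b.png"],
}
print(overlapping_pairs(splits))  # {('train', 'test'): {'b.png'}}
```

An empty dictionary means that no file name appears in more than one split.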

In [None]:
import matplotlib.pyplot as plt
image_list = trash_data_google
image_size=(2, 2)
num_images = len(image_list)
fig, axes = plt.subplots(1, num_images,figsize=(num_images * image_size[0], image_size[1]))

for i, filename in enumerate(image_list):
    image_path = os.path.join('/content/gdrive/MyDrive/bdappv/google/img/', filename)
    image = Image.open(image_path)
    axes[i].imshow(image)
    axes[i].axis('off')

plt.show()

## **Some References**

"Segmentation of Satellite Images of Solar Panels Using Fast Deep Learning Model" - https://www.ijrer.org/ijrer/index.php/ijrer/article/view/11607/pdf (80/20)

"Estimation of rooftop solar energy generation using Satellite Image Segmentation" - https://ieeexplore.ieee.org/document/8971578 (80/10/10)

"Panel Segmentation: A Python Package for Automated Solar Array Metadata Extraction Using Satellite Imagery" - https://ieeexplore.ieee.org/document/10008194 (63/25/12)

"SolarFinder: Automatic Detection of Solar Photovoltaic Arrays" - https://ieeexplore.ieee.org/abstract/document/9111006?casa_token=BRGGve63_NgAAAAA:hV2kmVbSGPzD9zfckkhISndDHbweEyD1FR4axwkAbxfs6EhkRfY2yR5Y0expG1xTn7-3nbiymck (70/30)

"Multi-resolution dataset for photovoltaic panel segmentation from satellite and aerial imagery" - https://essd.copernicus.org/articles/13/5389/2021/ (80/20)

### Scratch

In [None]:
threshold_list_ign = list_threshold('Ign', threshold=0.95);

In [None]:
trash_data_google = filter_dictionary(threshold_list_google, True);
trash_data_ign = filter_dictionary(threshold_list_ign, True);

In [None]:
print('Number of images that will be removed - Google: ',len(trash_data_google));
print('Number of images that will be removed - IGN: ',len(trash_data_ign));

In [None]:
np.savetxt('/content/gdrive/MyDrive/IA901_Projeto_PV/bdappv/google/trash_data_google.txt', trash_data_google, fmt='%s');
np.savetxt('/content/gdrive/MyDrive/IA901_Projeto_PV/bdappv/google/trash_data_ign.txt', trash_data_ign, fmt='%s')

In [None]:
print(trash_data)

In [None]:
from PIL import Image

img = Image.open('/content/gdrive/MyDrive/IA901_Projeto_PV/bdappv/google/img/AASNS567FOZJTZ.png')
img.show()

In [None]:
img_np = np.array(img)  # convert the image opened above to a NumPy array
counts = np.bincount(img_np.ravel())
max_val = np.argmax(counts)
count_max = counts[max_val]

if (count_max / img_np.size) >= 0.98:
  results = True
else:
  results = False

print(results)

In [None]:
print(A)


In [None]:
trash_data = filter_dictionary(threshold_list, True);
len(trash_data)
