# <center> **YVPD: YOLO Visual Pollution Detection System**</center>


## Introduction
Visual pollution refers to objects or activities in the environment that detract from its aesthetic appeal, such as graffiti, billboards, unkempt facades, and cluttered sidewalks. These forms of pollution can lower an area's value, distract drivers, and reduce residents' quality of life. Cities therefore need to act against visual pollution for the benefit of both residents and visitors.

## Problem Statement
Visual pollution in urban areas is a growing concern that affects residents' quality of life. This project aims to change how it is measured and addressed: by simulating human learning, it builds a "visual pollution score/index" from data collected by a fleet of vehicles in KSA.


>**NOTE:** A detailed report has been submitted separately on the SDAIA platform. This notebook contains the related code and information. The manual work done in Excel is also described in that report.

## Solution: 

### Step-1

In [None]:
#Importing libraries
import glob
import os
import random
import shutil
from warnings import filterwarnings

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import PIL
import torch
from IPython.display import Image as IPyImage  # aliased so it does not shadow PIL's Image
from PIL import Image, ImageDraw
from sklearn.model_selection import train_test_split

filterwarnings('ignore')

In [None]:
#Reading the csv into df
df = pd.read_csv('train.csv')

In [None]:
#Reading unique class names
print(df['name'].unique())

# Convert to list
print(df['name'].unique().tolist())

In [None]:
#making separate folders to store training and validation data
!mkdir training val

In [None]:
#Making image and label folders inside training and validation folders
!mkdir training/images training/labels val/images val/labels

In [None]:
#Converting labels into YOLO format
# The label files are written to a local "labels" folder; create it if it does not exist
os.makedirs("labels", exist_ok=True)
# Creating the list of images from the CSV file
imgs = df['image_path'].unique().tolist()
# Loop through each image
for img in imgs:
    boundingDetails = []
    # First get the bounding box information for this image from the CSV file
    boundingInfo = df.loc[df.image_path == img, :]
    # Loop through each row of the details
    for idx, row in boundingInfo.iterrows():
        # Get the class id for the row (as an integer, so the label reads "3" rather than "3.0")
        class_id = int(row["class"])
        # Convert the bounding box info into the YOLO label format
        # Get the width
        bb_width = row['xmax'] - row['xmin']
        # Get the height
        bb_height = row['ymax'] - row['ymin']
        # Get the centre coordinates
        bb_xcentre = (row['xmin'] + row['xmax']) / 2
        bb_ycentre = (row['ymin'] + row['ymax']) / 2
        # Normalise the coordinates by dividing by the image width and height
        bb_xcentre /= row['width']
        bb_ycentre /= row['height']
        bb_width /= row['width']
        bb_height /= row['height']
        # Append details to the list
        boundingDetails.append("{} {:.3f} {:.3f} {:.3f} {:.3f}".format(class_id, bb_xcentre, bb_ycentre, bb_width, bb_height))
    # Create the file name to save this info (image name with a .txt extension)
    file_name = os.path.join("labels", os.path.splitext(img)[0] + ".txt")
    # Save the annotation to disk; the with-block closes the file afterwards
    with open(file_name, "w") as label_file:
        label_file.write("\n".join(boundingDetails) + "\n")
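
The per-row arithmetic in the cell above can be checked on a single box. The sketch below isolates it in a helper; the name `pascal_voc_to_yolo`, the class id and the sample coordinates are illustrative assumptions and are not part of the original notebook.

```python
def pascal_voc_to_yolo(xmin, ymin, xmax, ymax, image_w, image_h):
    """Convert pixel corner coordinates to normalised YOLO (xc, yc, w, h)."""
    x_centre = (xmin + xmax) / 2 / image_w
    y_centre = (ymin + ymax) / 2 / image_h
    width = (xmax - xmin) / image_w
    height = (ymax - ymin) / image_h
    return x_centre, y_centre, width, height


# Made-up box on a 1920x1080 image, labelled with a made-up class id 3.
box = pascal_voc_to_yolo(480, 270, 1440, 810, 1920, 1080)
label_line = "{} {:.3f} {:.3f} {:.3f} {:.3f}".format(3, *box)
print(label_line)  # 3 0.500 0.500 0.500 0.500
```

A box covering the central half of the image in each direction has centre (0.5, 0.5) and size (0.5, 0.5), which gives a quick sanity check that all four values stay within [0, 1].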

In [None]:
#Reading all text files and storing them in a variable
annotations = glob.glob('labels' +'/*.txt')
annotations

In [None]:
# Get the list of images from its folder
imagePath = 'C:/smartathon/dataset/images'
images = glob.glob(imagePath + '/*.jpg')
images

In [None]:
# Sort the annotations and images, then prepare the train and validation sets
images.sort()
annotations.sort()
 
# Split the dataset into train-valid splits 
train_images, val_images, train_annotations, val_annotations = train_test_split(images, annotations, test_size = 0.2, random_state = 123)
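
Sorting both lists and splitting them together only keeps each image with its own label file if every image has exactly one annotation file of the same name. A minimal sketch of a check that pairs the two lists by file stem before the split is shown below; the helper `pair_by_stem` and the demo file names are hypothetical.

```python
import os


def pair_by_stem(image_paths, label_paths):
    """Return (images, labels) aligned by file stem, dropping unmatched images."""
    def stem(path):
        return os.path.splitext(os.path.basename(path))[0]

    labels_by_stem = {stem(p): p for p in label_paths}
    paired_images, paired_labels = [], []
    for image_path in sorted(image_paths):
        label_path = labels_by_stem.get(stem(image_path))
        if label_path is not None:
            paired_images.append(image_path)
            paired_labels.append(label_path)
    return paired_images, paired_labels


# Hypothetical file names, for illustration only: b.jpg has no label file.
demo_images = ["images/b.jpg", "images/a.jpg", "images/c.jpg"]
demo_labels = ["labels/a.txt", "labels/c.txt"]
paired_images, paired_labels = pair_by_stem(demo_images, demo_labels)
print(paired_images)  # ['images/a.jpg', 'images/c.jpg']
print(paired_labels)  # ['labels/a.txt', 'labels/c.txt']
```

Passing the paired lists to `train_test_split` would then guarantee that an image and its label always land in the same split.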


In [None]:
#Utility function to copy files to the destination folder
def move_files_to_folder(list_of_files, destination_folder):
    for f in list_of_files:
        try:
            shutil.copy(f, destination_folder)
        except OSError as err:
            # Report the file that failed, then stop with the original error
            print(f"Could not copy {f}: {err}")
            raise

In [None]:
# Copy the splits into the respective folders
move_files_to_folder(train_images, 'training/images')
move_files_to_folder(val_images, 'val/images')
move_files_to_folder(train_annotations, 'training/labels')
move_files_to_folder(val_annotations, 'val/labels')


### Step-2

With the labels converted from Pascal VOC to YOLO format, the next step is to set up and train the model.

In [None]:
#cloning yolov7 into local directory 
!git clone https://github.com/WongKinYiu/yolov7.git

In [None]:
#changing current directory to yolov7 folder
cd C:\smartathon\dataset\images\yolov7

In [None]:
#installing the requirements
!pip install -r requirements.txt

In [None]:
#getting the pre-trained weights file
!wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7_training.pt

In [None]:
#training the yolov7 model for 200 epochs with 8 workers, batch size 16 and image size 640 px
!python train.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img-size 640 640 --cfg cfg/training/yolov7.yaml --weights yolov7_training.pt --name yolov7_custom1 --hyp data/hyp.scratch.custom.yaml --epochs 200
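
The file passed to `--data` tells the training script where the images are and which classes exist. A minimal sketch of generating such a file for the 11 classes used in this notebook is shown below. The helper `build_dataset_yaml`, the suggested file name and the relative paths are assumptions; the paths must be adjusted to wherever the `training` and `val` folders sit relative to the yolov7 directory.

```python
# Class names in class-id order (0 to 10), as used later in this notebook.
CLASS_NAMES = [
    'GRAFFITI', 'FADED_SIGNAGE', 'POTHOLES', 'GARBAGE', 'CONSTRUCTION_ROAD',
    'BROKEN_SIGNAGE', 'BAD_STREETLIGHT', 'BAD_BILLBOARD', 'SAND_ON_ROAD',
    'CLUTTER_SIDEWALK', 'UNKEPT_FACADE',
]


def build_dataset_yaml(train_dir, val_dir, class_names):
    """Build the text of a YOLO dataset config: image folders, class count, class names."""
    names = ", ".join("'{}'".format(name) for name in class_names)
    return (
        "train: {}\n"
        "val: {}\n"
        "nc: {}\n"
        "names: [{}]\n"
    ).format(train_dir, val_dir, len(class_names), names)


# Assumed folder locations; adjust to the actual layout.
yaml_text = build_dataset_yaml("training/images", "val/images", CLASS_NAMES)
print(yaml_text)
```

The printed text could be saved under a name such as `data/visual_pollution.yaml` (a hypothetical name) and passed to `train.py` through `--data`. The `nc` value in the model config given to `--cfg` would need to match the class count as well.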

### Step-3

Once the model has been trained, it is applied to unseen test data.

In [None]:
#The model is being tested with the best trained weights
!python detect.py --weights best.pt --source test_images --save-txt

In [None]:
#Defining a function to convert normalized YOLO labels to pixel corner coordinates (Pascal VOC format)
def yolo_to_pascal_voc(x_center, y_center, w, h,  image_w, image_h):
    w = w * image_w
    h = h * image_h
    x1 = ((2 * x_center * image_w) - w)/2
    y1 = ((2 * y_center * image_h) - h)/2
    x2 = x1 + w
    y2 = y1 + h
    return [x1, y1, x2, y2]

#converting labels from YOLO format to Pascal VOC format and prefixing each row with the label file's path
for annotation in annotations:
    with open(annotation, 'r') as f:
        content = f.readlines()
        new_file_name = annotation.replace(".txt", "_pascal_voc.txt") # new file name with different format
        with open(new_file_name, 'w') as new_f:
            for line in content:
                class_name, x_center, y_center, w, h = line.strip().split()
                x1, y1, x2, y2 = yolo_to_pascal_voc(float(x_center), float(y_center), float(w), float(h),1920, 1080)
                new_f.write(annotation + " " + class_name + " " + str(x1) + " " + str(y1) + " " + str(x2) + " " + str(y2) + "\n")
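
The loop above expects exactly five fields per line. If `detect.py` is run with `--save-conf` in addition to `--save-txt`, each line carries a sixth field with the confidence score, and the five-way unpacking fails. The sketch below, with the hypothetical helper `parse_yolo_line`, accepts both layouts and applies the same conversion to pixel corners.

```python
def parse_yolo_line(line, image_w, image_h):
    """Parse 'class xc yc w h [conf]' into (class_id, [x1, y1, x2, y2], conf or None)."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError("expected 5 or 6 fields, got {}: {!r}".format(len(fields), line))
    class_id = int(fields[0])
    x_center, y_center, w, h = (float(value) for value in fields[1:5])
    conf = float(fields[5]) if len(fields) == 6 else None
    # Scale the normalised box back to pixels, then move from centre to corners.
    box_w = w * image_w
    box_h = h * image_h
    x1 = x_center * image_w - box_w / 2
    y1 = y_center * image_h - box_h / 2
    return class_id, [x1, y1, x1 + box_w, y1 + box_h], conf


# Made-up label lines on a 1920x1080 image.
without_conf = parse_yolo_line("3 0.5 0.5 0.25 0.5", 1920, 1080)
with_conf = parse_yolo_line("7 0.5 0.5 0.25 0.5 0.75\n", 1920, 1080)
print(without_conf)  # (3, [720.0, 270.0, 1200.0, 810.0], None)
print(with_conf)     # (7, [720.0, 270.0, 1200.0, 810.0], 0.75)
```

Keeping the confidence alongside the box makes it possible to filter low-confidence predictions before building the submission file.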


In [None]:
#Now converting labels from txt to csv format
import csv  # needed for csv.writer below

folder_path = 'dest'
csv_file = 'dest2.csv'

# Get list of all text files in the folder
txt_files = [f for f in os.listdir(folder_path) if f.endswith('.txt')]

# Create a list to store the data from the text files
data = []

# Iterate through each text file
for txt_file in txt_files:
    with open(os.path.join(folder_path, txt_file), 'r') as f:
        # Iterate through each line of the text file
        for line in f:
            # Split the line on whitespace (this also drops the trailing newline)
            words = line.split()
            # Append the file name and words as separate columns in the data list
            data.append([txt_file] + words)

# Number of value columns, taken from the longest row (0 if no rows were read)
n_words = max((len(row) - 1 for row in data), default=0)

# Write the data to a CSV file
with open(csv_file, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['File Name'] + ['word_' + str(i) for i in range(n_words)])
    writer.writerows(data)


In [None]:
# Now adding the name column to the csv file according to the class column

# df10 is expected to hold the predictions table (with a 'class' column); it is not created in this notebook
df107 = df10

# Create a dictionary of class-name mapping
class_name_mapping = {'0': 'GRAFFITI', '1': 'FADED_SIGNAGE', '2': 'POTHOLES', '3': 'GARBAGE', '4': 'CONSTRUCTION_ROAD', '5': 'BROKEN_SIGNAGE', '6':'BAD_STREETLIGHT','7':'BAD_BILLBOARD','8':'SAND_ON_ROAD','9':'CLUTTER_SIDEWALK','10':'UNKEPT_FACADE'}

# Add a new column 'name' and assign the corresponding name from the dictionary
# (the class ids are cast to str so that integer ids also match the string keys)
df107['name'] = df107['class'].astype(str).map(class_name_mapping)

# write the changes back to the file
df107.to_csv('example1.csv', index=False)

In [None]:
#As the files have to be submitted in a specific format, the columns in the csv file are reordered and renamed.
# df108 and df105 are expected to hold the predictions table; they are not created in this notebook

# specify the new order of columns
new_column_order = ['class', 'image_path', 'name', 'x_max', 'x_min', 'y_max', 'y_min']

# reorder the columns
df108 = df108[new_column_order]

#renaming the column names
df108 = df108.rename(columns={'x_max': 'xmax', 'y_max':'ymax', 'x_min':'xmin', 'y_min':'ymin'})

# write the changes back to the file (after renaming, so the file carries the final column names)
df108.to_csv('submission.csv', index=False)

#The values in the numerical columns are truncated to integers
# columns to truncate decimal values
columns_to_truncate = ['x_min', 'y_min', 'x_max', 'y_max']

# truncate decimal values in columns
df105[columns_to_truncate] = df105[columns_to_truncate].astype(int)

# write the changes back to the file
df105.to_csv('example.csv', index=False)

All results and related information are covered in the PDF report submitted on the SDAIA platform. The CSV file (finalsubmission.csv) has also been submitted for evaluation.