# Smart Agricultural Bird Pest Control: Utilizing Computer Vision for Pest Management

## Abstract
This project focuses on developing a smart bird detection and repellent system aimed at protecting agricultural fields from bird-related crop damage. By leveraging computer vision and sound-based repellent mechanisms, the system will detect birds and trigger distress calls or ultrasonic sounds to scare them away. The solution is designed to be cost-effective, environmentally friendly, and sustainable, reducing the reliance on harmful methods like chemicals or physical barriers. The project begins by collecting a large dataset of bird images through web scraping, which will be used to train the detection model. A later phase will integrate IoT systems running on a Raspberry Pi for real-time monitoring and automated responses.

## Table of Contents
1. [Overview](#overview)
2. [Data Sources](#data-sources)
3. [Importing Libraries](#importing-libraries)
4. [Data Collection](#data-collection)
5. [Exploratory Data Analysis (EDA)](#exploratory-data-analysis-eda)
6. [Data Preprocessing](#data-preprocessing)
7. [Model Development](#model-development)
8. [Results and Evaluation](#results-and-evaluation)
9. [Conclusion and Next Steps](#conclusion-and-next-steps)
10. [References](#references)

## Overview

### Problem Statement
Birds can cause significant damage to agricultural fields, threatening crop yields and food security worldwide. This problem is best illustrated by the 2023 invasion of red-billed quelea birds in Kenya, where farmers may lose up to 60 tonnes of grain as a result of these pests. This situation is worsened by ongoing drought conditions in the Horn of Africa, which have driven quelea to invade cultivated fields in search of food. In fact, the Food and Agriculture Organization (FAO) estimates that such crop losses amount to $50 million annually across sub-Saharan Africa.

The Kenyan government’s response to this issue has included the controversial use of toxic pesticides like fenthion to eradicate quelea populations. While intended to protect crops, this approach raises serious environmental and health concerns, as the indiscriminate nature of these chemicals poses risks to non-target wildlife, particularly endangered raptors, and can lead to ecological imbalances. Furthermore, the rapid breeding capabilities of birds, coupled with their consumption of up to 10 grams of grain daily, highlight the urgent need for sustainable and effective solutions to mitigate the impacts of avian pests on agriculture.

This project seeks to address the broader issue of bird-related crop damage by developing a smart bird detection and repellent system that leverages computer vision and sound-based mechanisms. The system will detect birds and trigger distress calls or ultrasonic sounds to deter them, providing a cost-effective and environmentally friendly alternative to harmful pesticides. The initial phase will involve collecting a comprehensive dataset of bird images through web scraping, which will be used to train the detection model. The project will later incorporate IoT systems for real-time monitoring and automated responses using Raspberry Pi, empowering farmers to protect their crops while promoting ecological sustainability.

The following sections will detail the data sources, methods, and results of this project, highlighting the significance of technology in modern pest control and its potential impact on agricultural sustainability.

## Data Sources

The project utilised data from the sources listed below, collected through methods such as web scraping.

[African Bird Club](https://www.africanbirdclub.org/)

## Importing Libraries

Below are all the libraries that shall be used for this project.

In [1]:
from bs4 import BeautifulSoup
import requests
import time

## Data Collection

### Web Scraping
#### African Bird Club
The first data collected was from [African Bird Club](https://www.africanbirdclub.org/). In this section, I shall download and store 2,334 bird images of 673 Kenyan bird species through web scraping with Beautiful Soup.

The first thing I did was set the links to the [website](https://www.africanbirdclub.org/) pages that I would be retrieving information from. I split the address into two parts to keep the code clean and flexible, since I shall be collecting information from different pages.
- The `base_url` holds the main address of the site and serves as the foundation for any additional links on the site.
- The `species_info_url` contains the path to the page with the Kenyan bird species information.

This approach makes it easier to update or reuse parts of the URL later on if needed.

In [2]:
# Set the links to the pages to retrieve information from
base_url = 'https://www.africanbirdclub.org'
species_info_url = f'{base_url}/afbid/search/category/-/-/28'

When scraping data from websites, it’s important that a request looks like it comes from a legitimate source, such as a web browser, rather than from a bot or scraper. Websites can block requests if they think the traffic isn’t from a real user. To avoid this, we need to set headers that make the requests look like they're coming from a browser.

I set the headers to include a `User-Agent`, a string that web browsers use to identify themselves when making requests to websites. You can find more information about it in the [User-Agent Documentation](https://pypi.org/project/user-agents/). Sending this header makes it much more likely that the website serves our request instead of blocking it. In this case, I used a User-Agent that mimics a standard Chrome browser on Windows.

To understand more about why headers are important you can read [this article.](https://www.zenrows.com/blog/python-requests-user-agent#what-is)

In [3]:
# Set headers with a User-Agent to allow access
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36'
}
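As a quick sanity check, the headers can be inspected before anything is sent. The sketch below builds a request with `requests.Request(...).prepare()` and prints what would go over the wire; the variable names `url` and `prepared` are only illustrative, and no network call is made.

```python
import requests

# Same header dictionary as in the cell above.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36'
}

# Build the request without sending it, so nothing touches the network.
url = 'https://www.africanbirdclub.org/afbid/search/category/-/-/28'
prepared = requests.Request('GET', url, headers=headers).prepare()

# Inspect the method, target URL and the User-Agent that would be sent.
print(prepared.method, prepared.url)
print(prepared.headers['User-Agent'])
```

If the printed `User-Agent` is not the browser string set above, the headers were probably not passed to the request.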

After setting the headers, I sent a request using the `requests.get` method, which sends a GET request to the `species_info_url`, the endpoint for the bird species information. I also included the headers defined earlier so that the request is more likely to be accepted. You can find more details about the `requests.get` method in the [Requests Documentation](https://pypi.org/project/requests/).

After sending the request, I checked the status code of the response to determine whether the request was successful. A status code of 200 indicates that the page has been retrieved. If a different status code is received, a message reporting the failure is printed, including the specific status code. This makes it easier to identify and troubleshoot any issues with the request.

If the request is successful, the HTML content of the response is parsed using BeautifulSoup. BeautifulSoup is a powerful library for parsing HTML and XML documents, as detailed in the [documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/). By passing the content of the response and specifying the parser (`'html.parser'`), a soup object is created, which allows for easy navigation and extraction of data from the HTML structure of the page.

This step is crucial, as it lays the foundation for extracting the necessary information about the bird species from the website.

In [4]:
# Scrape the species list page
response_info = requests.get(species_info_url, headers=headers)
if response_info.status_code != 200:
    print(f"Failed to retrieve species list page, status code: {response_info.status_code}")
else:
    soup_info = BeautifulSoup(response_info.content, 'html.parser')
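Since the same request-then-parse pattern will be needed for other pages, it could be wrapped in a small helper. The sketch below is one possible shape, not part of the original notebook: the function name `fetch_soup` and the `session` and `timeout` parameters are assumptions made for illustration. It also catches connection errors and timeouts, which the status-code check alone does not cover.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, headers, session=None, timeout=10):
    """Return a BeautifulSoup object for url, or None if the request fails.

    Hypothetical helper: the name and the session/timeout parameters are
    illustrative and do not come from the notebook.
    """
    # requests.get and Session.get accept the same keyword arguments here.
    client = requests if session is None else session
    try:
        response = client.get(url, headers=headers, timeout=timeout)
    except requests.RequestException as exc:
        # Covers connection errors, timeouts and similar transport failures.
        print(f"Request to {url} failed: {exc}")
        return None
    if response.status_code != 200:
        print(f"Failed to retrieve {url}, status code: {response.status_code}")
        return None
    return BeautifulSoup(response.content, 'html.parser')
```

With this in place, the cell above would reduce to `soup_info = fetch_soup(species_info_url, headers)`, followed by a check that `soup_info` is not `None` before any parsing.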

In the next step, the HTML structure of the page is assessed. The bird species are arranged in a list contained in a `<div>` with the class `panel-inner`, as shown below:

<div style="text-align: center;">
    <img src="images-used/species-info-page-html.png" alt="Species information page and HTML structure" style="width: 800px; height: auto;">
</div>
           
From this page, information on the bird's `species_name`, `common_name`, and the `img_link` will be scraped and stored. The aim is to locate the specific elements within the HTML document of the `species_info_url` page that contain the list of bird species.

In [5]:
# Find the div and ul containing the species list
species_div = soup_info.find('div', class_='panel-inner')  # Find the div with class "panel-inner"
species_list = species_div.find('ul', class_='type')  # Find the ul with class "type"

The first line uses the `find` method of BeautifulSoup to search for a `<div>` tag whose class attribute is `panel-inner`, which is the container for the species list.

The second line searches for a `<ul>` tag (unordered list) within the previously identified `species_div`. The class of this `<ul>` is `type`, which indicates that it holds the list of bird species. By extracting this `<ul>`, we can access all the individual list items (`<li>`) that contain the species names and other relevant information.
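One detail worth keeping in mind is that `find` returns `None` when nothing matches, so a changed page layout would surface as an `AttributeError` on the next call. The sketch below shows the two lookups with a guard, run against a small hypothetical HTML fragment modelled on the selectors above; the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the selectors used above;
# the real page markup may differ.
sample_html = """
<div class="panel-inner">
  <ul class="type">
    <li>
      <a href="/afbid/search/browse/species/1503/28">
        <h5>Acrocephalus arundinaceus (1)</h5>
        <span>Great Reed Warbler</span>
      </a>
    </li>
  </ul>
</div>
"""

soup = BeautifulSoup(sample_html, 'html.parser')

# Same two lookups as in the cell above, with a guard for a missing container.
species_div = soup.find('div', class_='panel-inner')
species_list = species_div.find('ul', class_='type') if species_div else None
if species_list is None:
    raise ValueError("Species list not found; the page layout may have changed")

species_items = species_list.find_all('li')
print(len(species_items))

# find returns None when no element matches the given class.
missing = soup.find('div', class_='no-such-class')
print(missing)
```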

In [7]:
# Find all species items (li elements)
species_items = species_list.find_all('li')

# Use indexing to pick the species you want (e.g., the first species)
selected_species = species_items[0]  # Indexing to select the first species (change index as needed)
image_link = selected_species.find('a')['href']  # Get the link from the <a> tag
species_name = selected_species.find('h5').text.strip()  # Get the species name
common_name = selected_species.find('span').text.strip()  # Get the common name

print(f"Selected Species: {species_name}, Common Name: {common_name}, Image Link: {image_link}")

# Add sleep time to avoid overwhelming the server
time.sleep(5)

Selected Species: Acrocephalus arundinaceus (1), Common Name: Great Reed Warbler, Image Link: /afbid/search/browse/species/1503/28
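Note that the scraped link is relative to the site root, so it has to be combined with `base_url` before it can be requested. A minimal sketch using `urllib.parse.urljoin` from the standard library follows. The names `species_page_url`, `raw_name` and `clean_name` are illustrative, and stripping the trailing parenthesised number from the species name is optional and only an assumption about what a later step may want.

```python
import re
from urllib.parse import urljoin

base_url = 'https://www.africanbirdclub.org'
image_link = '/afbid/search/browse/species/1503/28'  # relative link from the output above

# Resolve the relative link against the site root.
species_page_url = urljoin(base_url, image_link)
print(species_page_url)

# Optionally drop the trailing parenthesised number from the species name.
raw_name = 'Acrocephalus arundinaceus (1)'
clean_name = re.sub(r'\s*\(\d+\)$', '', raw_name)
print(clean_name)
```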
