<a href="https://colab.research.google.com/github/raz0208/Agritech-Pest-Prediction/blob/main/Agritech_Pest_Prediction_and_Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Agritech Pest Prediction and Classification

# Datasets

## Overview
This project uses the datasets below to analyze trends in insect pest occurrence and to predict it from meteorological conditions and past insect captures. Two primary data sources are used:
1. **Capture Data**: Records of insect catches over time.
2. **Historical Weather Data**: Meteorological data corresponding to the same period and locations as the capture data.

Both datasets are provided for two areas: **Cicalino** (two sites) and **Imola** (three sites).

## Capture Data
Purpose: Contains historical records of insect catches, used for both regression (predicting the number of insects caught) and classification (detecting new catches).

### Files:
- `Capture_Chart(Cicalino_1).csv`
- `Capture_Chart(Cicalino_2).csv`
- `Capture_Chart(Imola_1).csv`
- `Capture_Chart(Imola_2).csv`
- `Capture_Chart(Imola_3).csv`

### Dataset structure:
- **DateTime:** Timestamp of capture events.
- **Number of insects:** Count of insects caught.
- **New catches (per event):** Indicator of whether new catches occurred.
- **Reviewed:** Status of data review (e.g., "Si" for reviewed).
- **Event:** Additional event information (mostly empty).
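
The two uses named in the purpose statement can be illustrated with a small sketch. It assumes the columns are named as in the list above, and it treats a row as a "new catch" whenever the new-catches value is greater than zero; that threshold is an assumption, not something the dataset description states. The rows are made up for illustration.

```python
import pandas as pd

# Illustrative rows only; column names follow the structure listed above.
capture = pd.DataFrame({
    "DateTime": ["06.07.2024 06:01:00", "07.07.2024 06:04:00", "08.07.2024 06:03:00"],
    "Number of insects": [0, 2, 2],
    "New catches (per event)": [0, 2, 0],
})

# Regression target: the insect count itself.
y_regression = capture["Number of insects"]

# Classification target (assumption): 1 if any new catch was recorded, else 0.
y_classification = (capture["New catches (per event)"] > 0).astype(int)
```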

## Historical Weather Data
Purpose: Contains meteorological data, which will be used as features for insect prediction models.

### Files:
- `Historical_Weather_Data(Cicalino_1).csv`
- `Historical_Weather_Data(Cicalino_2).csv`
- `Historical_Weather_Data(Imola_1).csv`
- `Historical_Weather_Data(Imola_2).csv`
- `Historical_Weather_Data(Imola_3).csv`

### Dataset structure:
- **DateTime:** Timestamp of recorded weather data.
- **Average Temperature:** Mean temperature at that time.
- **Temperature Range:** Minimum and maximum temperatures.
- **Average Humidity:** Humidity level at that time.

## Data Integration Strategy
To effectively utilize these datasets for analysis and model training:
- **Temporal alignment**: The capture and weather data will be merged on the date derived from each dataset's **DateTime** field, so that weather readings line up with capture events for correlation analysis and feature engineering.
- **Feature Engineering**: Extracting meaningful features such as lagged insect counts and weather trends to improve predictive performance.
- **Handling Missing Data**: Implementing imputation strategies for missing weather readings or insect captures.

This dataset preparation will serve as the foundation for building predictive models aimed at analyzing and forecasting insect population trends based on meteorological conditions.
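
The three steps above could look roughly like the following sketch. The values are made up, and the helper column names (`Date`, `insects_lag1`, `temp_roll2`) are illustrative. Linear interpolation is only one possible imputation strategy, chosen here for brevity.

```python
import numpy as np
import pandas as pd

# Made-up readings for illustration; real column names may differ.
capture = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2024-07-06 06:01", "2024-07-07 06:04", "2024-07-08 06:03", "2024-07-09 06:05"]
    ),
    "Number of insects": [0, 1, 1, 3],
})
weather = pd.DataFrame({
    "DateTime": pd.to_datetime(
        ["2024-07-06 00:00", "2024-07-07 00:00", "2024-07-08 00:00", "2024-07-09 00:00"]
    ),
    "Average Temperature": [24.0, np.nan, 26.0, 27.0],
})

# Temporal alignment: reduce both timestamps to a calendar date, then merge.
capture["Date"] = capture["DateTime"].dt.normalize()
weather["Date"] = weather["DateTime"].dt.normalize()
merged = capture.merge(weather.drop(columns="DateTime"), on="Date", how="left")

# Handling missing data: linear interpolation between neighbouring days.
merged["Average Temperature"] = merged["Average Temperature"].interpolate()

# Feature engineering: previous day's count and a 2-day temperature mean.
merged["insects_lag1"] = merged["Number of insects"].shift(1)
merged["temp_roll2"] = merged["Average Temperature"].rolling(2).mean()
```

A left merge keeps every capture event even when no weather reading exists for that date, which makes the gaps visible before imputation.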



### Import required libraries and read the data

In [1]:
# Import required libraries
import os
import pandas as pd
import numpy as np

# Libraries for visualization
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
# Read data
Capture_Chart_Cicalino_1 = pd.read_csv('/content/AgritechPestDataset/Capture_Chart(Cicalino_1).csv')
Capture_Chart_Cicalino_2 = pd.read_csv('/content/AgritechPestDataset/Capture_Chart(Cicalino_2).csv')
Capture_Chart_Imola_1 = pd.read_csv('/content/AgritechPestDataset/Capture_Chart(Imola_1).csv')
Capture_Chart_Imola_2 = pd.read_csv('/content/AgritechPestDataset/Capture_Chart(Imola_2).csv')
Capture_Chart_Imola_3 = pd.read_csv('/content/AgritechPestDataset/Capture_Chart(Imola_3).csv')
Historical_Weather_Data_Cicalino_1 = pd.read_csv('/content/AgritechPestDataset/Historical_Weather_Data(Cicalino_1).csv')
Historical_Weather_Data_Cicalino_2 = pd.read_csv('/content/AgritechPestDataset/Historical_Weather_Data(Cicalino_2).csv')
Historical_Weather_Data_Imola_1 = pd.read_csv('/content/AgritechPestDataset/Historical_Weather_Data(Imola_1).csv')
Historical_Weather_Data_Imola_2 = pd.read_csv('/content/AgritechPestDataset/Historical_Weather_Data(Imola_2).csv')
Historical_Weather_Data_Imola_3 = pd.read_csv('/content/AgritechPestDataset/Historical_Weather_Data(Imola_3).csv')
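
Because the ten files follow one naming scheme, the reads above could also be written as a loop. The following is a minimal sketch: `dataset_paths` and `load_all` are hypothetical helper names, and the dictionary keyed by `(kind, site)` is one possible layout, not the notebook's own.

```python
from pathlib import Path

import pandas as pd

DATA_DIR = Path("/content/AgritechPestDataset")  # same directory as the cell above
SITES = ["Cicalino_1", "Cicalino_2", "Imola_1", "Imola_2", "Imola_3"]
KINDS = ["Capture_Chart", "Historical_Weather_Data"]


def dataset_paths(data_dir=DATA_DIR):
    """Map (kind, site) to the CSV path that follows the naming scheme above."""
    return {
        (kind, site): Path(data_dir) / f"{kind}({site}).csv"
        for kind in KINDS
        for site in SITES
    }


def load_all(data_dir=DATA_DIR):
    """Read every file into a dict of DataFrames keyed by (kind, site)."""
    return {key: pd.read_csv(path) for key, path in dataset_paths(data_dir).items()}
```

With this layout, `load_all()[("Capture_Chart", "Imola_2")]` would return the same frame as `Capture_Chart_Imola_2`.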

In [3]:
# Showing first rows of each file for Cicalino
print("** Capture Chart Cicalino 1 ** \n",Capture_Chart_Cicalino_1.head(), "\n")
print("** Capture Chart Cicalino 2 ** \n",Capture_Chart_Cicalino_2.head(), "\n")
print("** Historical Weather Data Cicalino 1 ** \n",Historical_Weather_Data_Cicalino_1.head(), "\n")
print("** Historical Weather Data Cicalino 2 ** \n",Historical_Weather_Data_Cicalino_2.head(), "\n")

** Capture Chart Cicalino 1 ** 
            Catch chart         Unnamed: 1               Unnamed: 2 Unnamed: 3  \
0             DateTime  Number of insects  New catches (per event)   Reviewed   
1  06.07.2024 06:01:00                  0                        0         Si   
2  07.07.2024 06:04:00                  0                        0         Si   
3  08.07.2024 06:03:00                  0                        0         Si   
4  09.07.2024 06:05:00                  0                        0         Si   

  Unnamed: 4  
0      Event  
1        NaN  
2        NaN  
3        NaN  
4        NaN   

** Capture Chart Cicalino 2 ** 
            Catch Chart         Unnamed: 1                Unnamed: 2  \
0             DateTime  Number of insects  New Catches (per evento)   
1  05.07.2024 11:31:01                  0                         0   
2  06.07.2024 03:02:01                  0                         0   
3  07.07.2024 03:04:01                  0                         0   
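
The previews above show that the first line of each capture file is a title (`Catch chart`), so the real column names end up in row 0 and the remaining columns are labelled `Unnamed: n`. A minimal sketch of one way to handle this is shown below. The inline text is a stand-in whose layout is inferred from the printed output, not the actual file contents.

```python
import io

import pandas as pd

# Inline stand-in for the first lines of a capture file, mirroring the output above.
raw = io.StringIO(
    "Catch chart,,,,\n"
    "DateTime,Number of insects,New catches (per event),Reviewed,Event\n"
    "06.07.2024 06:01:00,0,0,Si,\n"
    "07.07.2024 06:04:00,0,0,Si,\n"
)

# header=1 skips the title line and uses the second line as column names.
capture = pd.read_csv(raw, header=1)

# Timestamps are day-first (DD.MM.YYYY), so give the format explicitly.
capture["DateTime"] = pd.to_datetime(capture["DateTime"], format="%d.%m.%Y %H:%M:%S")
```

Passing an explicit format avoids `06.07.2024` being read as June 7 instead of July 6.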


In [4]:
# Showing first rows of each file for Imola
print("** Capture Chart Imola 1 ** \n",Capture_Chart_Imola_1.head(), "\n")
print("** Capture Chart Imola 2 ** \n",Capture_Chart_Imola_2.head(), "\n")
print("** Capture Chart Imola 3 ** \n",Capture_Chart_Imola_3.head(), "\n")
print("** Historical Weather Data Imola 1 ** \n",Historical_Weather_Data_Imola_1.head(), "\n")
print("** Historical Weather Data Imola 2 ** \n",Historical_Weather_Data_Imola_2.head(), "\n")
print("** Historical Weather Data Imola 3 ** \n",Historical_Weather_Data_Imola_3.head(), "\n")

** Capture Chart Imola 1 ** 
            Catch chart         Unnamed: 1              Unnamed: 2 Unnamed: 3  \
0             DateTime  Number of Insects  New Catch (per evento)    Reviwed   
1  30.07.2024 22:01:00                  0                       0         Si   
2  31.07.2024 22:03:00                  0                       0         Si   
3  01.08.2024 22:01:00                  0                       0         Si   
4  02.08.2024 22:03:00                  0                       0         Si   

  Unnamed: 4  
0      Event  
1        NaN  
2        NaN  
3        NaN  
4        NaN   
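
The embedded header rows are not spelled consistently across files (`Number of insects` vs. `Number of Insects`, `Reviewed` vs. `Reviwed`, `per event` vs. `per evento`), which would complicate concatenating the sites. One way to handle this, sketched below, is to drop the embedded header row and assign a shared set of names by position. `promote_header` and `CANONICAL` are hypothetical names, and the approach assumes every capture file keeps the same column order, as the previews suggest.

```python
import pandas as pd

CANONICAL = ["DateTime", "Number of insects", "New catches (per event)", "Reviewed", "Event"]


def promote_header(df, names=CANONICAL):
    """Drop the embedded header row and apply one shared set of column names.

    Assumes every capture file keeps the same column order.
    """
    out = df.iloc[1:].reset_index(drop=True)
    out.columns = names[: out.shape[1]]
    return out


# Stand-in for a frame read with the default header, as in the Imola 1 preview.
imola_1 = pd.DataFrame(
    [
        ["DateTime", "Number of Insects", "New Catch (per evento)", "Reviwed", "Event"],
        ["30.07.2024 22:01:00", "0", "0", "Si", None],
        ["31.07.2024 22:03:00", "0", "0", "Si", None],
    ],
    columns=["Catch chart", "Unnamed: 1", "Unnamed: 2", "Unnamed: 3", "Unnamed: 4"],
)

clean = promote_header(imola_1)
# Values arrive as strings because the header row shared the column; convert counts.
clean["Number of insects"] = pd.to_numeric(clean["Number of insects"])
```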

** Capture Chart Imola 2 ** 
            Catch Chart         Unnamed: 1                Unnamed: 2  \
0             DateTime  Number of Insects  New Catches (per evento)   
1  31.07.2024 00:00:00                  0                         0   
2  01.08.2024 00:01:00                  0                         0   
3  02.08.2024 00:03:00                  0                         0   
4  03.08.202