## Introduction
Agricultural pests pose a significant threat to crop production, leading to substantial economic losses and impacting food security worldwide. Timely and accurate identification of these pests is crucial for effective pest management strategies. In this project, titled "Agricultural Pests Classification: ResNet-50 V2," we aim to develop a robust machine learning model capable of classifying various agricultural pests using deep learning techniques. The project uses the ResNet-50 V2 architecture, a convolutional neural network widely used for image classification tasks.

The dataset utilized for this classification task comprises two primary components: the Agricultural Pests Dataset and the Agricultural Pests Image Dataset. These datasets contain a diverse collection of images of agricultural pests along with their corresponding labels. The images are sourced from various agricultural settings, ensuring a comprehensive representation of different pest species.

## Datasets
### Agricultural Pests Dataset:

This dataset contains a CSV file with essential information regarding the training and test images. Each entry provides a filename corresponding to an image and its associated label, indicating the type of pest present in the image. The dataset is structured to facilitate the loading and processing of images for training the classification model.
The dataset can be accessed at the following link: Agricultural Pests Dataset.

### Agricultural Pests Image Dataset:

Accompanying the main dataset, this dataset contains the actual images used for training and testing. The images are organized in a manner that reflects the classes they belong to, ensuring ease of access during the data preprocessing and model training phases.
The dataset can be accessed at the following link: Agricultural Pests Image Dataset.

### ResNet-50 V2 Agricultural Pests Classification:

For this project, we have adopted the ResNet-50 V2 model, a deep learning architecture renowned for its residual learning framework. This model allows for the training of very deep networks without suffering from the vanishing gradient problem, making it particularly effective for complex image classification tasks.
The model is pre-trained on a large dataset (ImageNet) and fine-tuned on our specific dataset of agricultural pests. The adaptability of the ResNet-50 V2 architecture enables it to capture intricate features of pest images, leading to improved classification accuracy.

## Model Utilization
The ResNet-50 V2 model is used through the Lobe library, which simplifies loading and running image classification models. In this notebook, Lobe loads the already fine-tuned model from the Kaggle input directory; no training takes place here. Once loaded, the model predicts the pest class for images, including images it has not seen before, providing valuable insights for pest management in agricultural practices.

In summary, this project showcases the integration of deep learning techniques with agricultural pest classification, demonstrating the potential of technology to enhance pest management strategies. Through this notebook, we will explore data preparation, model loading, prediction, and evaluation, ultimately aiming to contribute to the ongoing efforts in sustainable agriculture.

In [1]:
import pandas as pd
train=pd.read_csv('/kaggle/input/agricultural-pests-dataset/train.csv')
train.head()

Unnamed: 0,filename,label
0,/kaggle/input/agricultural-pests-image-dataset...,snail
1,/kaggle/input/agricultural-pests-image-dataset...,wasp
2,/kaggle/input/agricultural-pests-image-dataset...,bees
3,/kaggle/input/agricultural-pests-image-dataset...,grasshopper
4,/kaggle/input/agricultural-pests-image-dataset...,weevil


## We import the necessary library and load the training data from a CSV file.

* Importing pandas:
pandas is a powerful data manipulation library in Python. It is commonly used to load, analyze, and manipulate data. Here, we import it using the alias pd to make future calls shorter.

* Loading the Dataset:
The dataset is stored in a CSV (Comma Separated Values) file. We are using pd.read_csv() to load the data into a DataFrame, which is a two-dimensional tabular data structure in pandas.

* Exploring the Dataset:
Once the data is loaded, train.head() is used to display the first five rows of the dataset. The output shows three columns: an unnamed index column (Unnamed: 0), filename (the path to each image), and label (the pest class).
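
The Unnamed: 0 column usually appears when a DataFrame was saved to CSV together with its index. The following is a minimal sketch of reading such a file so that the first column becomes the index, and of counting images per class. The three-row CSV text and the file names in it are hypothetical stand-ins for train.csv.

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for train.csv; the real file lives on Kaggle.
csv_text = """,filename,label
0,images/snail_1.jpg,snail
1,images/wasp_1.jpg,wasp
2,images/snail_2.jpg,snail
"""

# index_col=0 treats the unnamed first column as the index instead of data.
sample = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(sample.columns.tolist())         # column names after loading
print(sample["label"].value_counts())  # number of images per class
```

In the notebook itself, the same idea would be pd.read_csv(path, index_col=0) on the real CSV path.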

In [2]:
test=pd.read_csv('/kaggle/input/agricultural-pests-dataset/test.csv')
test.head()

Unnamed: 0,filename,label
0,/kaggle/input/agricultural-pests-image-dataset...,wasp
1,/kaggle/input/agricultural-pests-image-dataset...,snail
2,/kaggle/input/agricultural-pests-image-dataset...,catterpillar
3,/kaggle/input/agricultural-pests-image-dataset...,weevil
4,/kaggle/input/agricultural-pests-image-dataset...,beetle


In [3]:
train['filename']=train['filename'].str.replace('/kaggle/input/agricultural-pests-image-dataset/','/kaggle/input/agricultural-pests-dataset/train/')
train.head()

Unnamed: 0,filename,label
0,/kaggle/input/agricultural-pests-dataset/train...,snail
1,/kaggle/input/agricultural-pests-dataset/train...,wasp
2,/kaggle/input/agricultural-pests-dataset/train...,bees
3,/kaggle/input/agricultural-pests-dataset/train...,grasshopper
4,/kaggle/input/agricultural-pests-dataset/train...,weevil


## In this cell, we are modifying the file paths in the training dataset to ensure the correct image locations are referenced.

* Accessing the filename column:
The dataset contains a column named filename, which stores the paths to the images associated with each entry (or pest). This column is being accessed as train['filename'].

* Replacing parts of the file paths:
str.replace() is a method used to replace parts of strings in pandas DataFrame columns. Here, it is being used to modify the file paths in the filename column. The original paths seem to point to an old directory (/kaggle/input/agricultural-pests-image-dataset/), but we are updating them to the correct path where the images are stored (/kaggle/input/agricultural-pests-dataset/train/).

* Displaying the updated data:
After the paths are updated, train.head() is called again to show the first five rows of the training dataset with the modified filename paths.
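
As a small sketch of the replacement step, the snippet below applies the same prefix swap to a two-row DataFrame and checks that every path was updated. The file names are hypothetical; only the two directory prefixes come from the notebook. Passing regex=False is an optional safeguard that makes pandas treat the prefix as a literal string regardless of the pandas version.

```python
import pandas as pd

OLD_PREFIX = "/kaggle/input/agricultural-pests-image-dataset/"
NEW_PREFIX = "/kaggle/input/agricultural-pests-dataset/train/"

# Hypothetical file names; only the two prefixes come from the notebook.
paths = pd.DataFrame(
    {"filename": [OLD_PREFIX + "snail/a.jpg", OLD_PREFIX + "wasp/b.jpg"]}
)

# regex=False makes pandas treat the prefix as a literal string.
paths["filename"] = paths["filename"].str.replace(
    OLD_PREFIX, NEW_PREFIX, regex=False
)

# Sanity check: every path should now start with the new prefix.
all_updated = paths["filename"].str.startswith(NEW_PREFIX).all()
print(all_updated)
```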

In [4]:
test['filename']=test['filename'].str.replace('/kaggle/input/agricultural-pests-image-dataset/','/kaggle/input/agricultural-pests-dataset/test/')
test.head()

Unnamed: 0,filename,label
0,/kaggle/input/agricultural-pests-dataset/test/...,wasp
1,/kaggle/input/agricultural-pests-dataset/test/...,snail
2,/kaggle/input/agricultural-pests-dataset/test/...,catterpillar
3,/kaggle/input/agricultural-pests-dataset/test/...,weevil
4,/kaggle/input/agricultural-pests-dataset/test/...,beetle


## This cell mirrors the operations performed in the previous cell but is applied to the test dataset.

* Accessing the filename column in the Test Dataset:
Similar to the training dataset, the test dataset also has a filename column that contains paths to the images corresponding to each entry (or pest).

* Replacing parts of the file paths:
The str.replace() method is again used to update the file paths in the filename column. The previous path (/kaggle/input/agricultural-pests-image-dataset/) is replaced with the new correct path where the test images are located (/kaggle/input/agricultural-pests-dataset/test/).

* Why This Is Important:
Updating the file paths ensures that the model can correctly access the images needed for evaluation later in the project. Without these changes, the model would be unable to locate the images during testing, resulting in errors.

#### This step is essential for preparing the test dataset so it can be used to assess the performance of the trained model accurately.
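
One way to confirm that the updated paths actually resolve is to check them against the file system before running any predictions. The helper below is a hypothetical sketch (missing_files is not part of the notebook) and is demonstrated on a temporary directory rather than the Kaggle input folder.

```python
import os
import tempfile

def missing_files(paths):
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Demonstration with a temporary directory instead of the Kaggle input folder.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.jpg")
    open(present, "wb").close()  # create an empty placeholder file
    absent = os.path.join(tmp, "absent.jpg")
    not_found = missing_files([present, absent])

print(len(not_found))
```

In the notebook, calling missing_files(test['filename']) should return an empty list if the path replacement worked.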

In [5]:
!pip install lobe[all]

Collecting lobe[all]
  Downloading lobe-0.6.2-py3-none-any.whl (22 kB)
Collecting pillow~=9.0.1
  Downloading Pillow-9.0.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.3 MB)
Collecting onnxruntime~=1.10.0
  Downloading onnxruntime-1.10.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
Collecting tflite-runtime~=2.7.0
  Downloading tflite_runtime-2.7.0-cp37-cp37m-manylinux2014_x86_64.whl (2.2 MB)
Collecting tensorflow~=2.8.0
  Downloading tensorflow-2.8.4-cp37-cp37m-manylinux2010_x86_64.whl (497.9 MB)

## In this cell, the Lobe library is being installed along with its dependencies. Let's break down the key components:

1. Lobe Library Overview:
Lobe is a machine learning library designed to simplify working with image classification models. In this notebook it is used to load an already trained model and run predictions, not to train one. It is focused on computer vision tasks and is user-friendly, making it accessible for both beginners and experienced practitioners.

#### Key Features:

* Ease of Use: Lobe provides a straightforward interface that abstracts much of the complexity of loading a model and running predictions, making it easier to get started with machine learning.
* Integration with Popular Frameworks: It integrates with TensorFlow, TensorFlow Lite, and ONNX, all of which appear among the dependencies installed in this cell, so users can rely on those backends while keeping a simple interface.
* Visualization: The library often includes visual tools for understanding model predictions and data, which can be especially useful in educational and experimental contexts.

2. Importance of Lobe in This Project:
In the context of this agricultural pests classification project, Lobe is beneficial for several reasons:

* Model Loading: It loads the fine-tuned ResNet-50 V2 image classification model in a single call, which is the starting point for identifying pests from images.
* Experimentation: Its user-friendly interface makes it quick to run the model on different images and datasets without extensive coding.
* Deployment: Because the library runs TensorFlow, TensorFlow Lite, and ONNX models, it can make it easier to use the model in real-world applications.

3. Dependencies Installed with Lobe:
When installing Lobe with the [all] option, several packages are also installed or updated. Here are the main ones:

* Pillow: A library for image processing that provides capabilities for opening, manipulating, and saving image files. Version 9.0.1 is specified, which is important for handling image data in your project.

* Matplotlib: A popular plotting library for Python, already present in your environment. It’s used for visualizing data and can help in visualizing model predictions and performance metrics.

* Requests: A library for making HTTP requests, useful for interacting with APIs and downloading datasets or models from the internet.

* ONNX Runtime: A high-performance runtime for executing ONNX models, which allows for model interoperability across different frameworks. Version 1.10.0 is specified here.

* TensorFlow: A comprehensive open-source platform for machine learning. Version 2.8.4 is installed, providing a robust environment for training deep learning models.

* TensorFlow Lite Runtime: A lightweight version of TensorFlow designed for mobile and edge devices. It enables the deployment of machine learning models on devices with limited resources.

* Keras: A high-level neural networks API, which runs on top of TensorFlow. Version 2.8.0 is installed, providing a simplified interface for building and training deep learning models.

* TensorBoard: A suite of visualization tools for TensorFlow, enabling you to visualize training metrics and model architectures.

* Protobuf: A language-neutral data serialization library that is used for serializing structured data, often employed in machine learning frameworks for model storage and transfer.
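
To see which versions pip actually resolved, the installed packages can be queried from Python. The sketch below assumes Python 3.8 or later, where importlib.metadata is in the standard library (the Python 3.7 environment shown in the pip output would need the importlib_metadata backport instead). The helper name installed_version is hypothetical.

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Package names taken from the pip output above.
for name in ["lobe", "tensorflow", "onnxruntime", "tflite-runtime", "pillow"]:
    print(name, installed_version(name))
```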

In [6]:
from lobe import ImageModel

model = ImageModel.load('/kaggle/input/resnet-50-v2-agricultural-pests-classification/train TensorFlow')
model

<lobe.model.image_model.ImageModel at 0x7b7df9323550>

## In this cell, the Lobe library's ImageModel class is used to load a pre-trained model for the agricultural pests classification task.

* Importing the ImageModel Class: The ImageModel class from the Lobe library is imported. This class is designed for image classification and, as used in this notebook, provides methods for loading an exported model and making predictions from image files. It simplifies the workflow for image-related machine learning projects.

* Loading the Pre-trained Model: The load method of the ImageModel class is invoked to load a pre-trained model from a specified path. This model is expected to be based on the ResNet-50 V2 architecture, which is a deep convolutional neural network known for its ability to capture complex patterns in image data.

* Model Object: The model object is referenced at the end of the cell, so the notebook prints its default representation (an ImageModel object and its memory address). This confirms that the model loaded, but it does not show the architecture, configuration, or training history. Once loaded, this model can be used to make predictions on new images and evaluate its performance on test datasets.

* Importance of the ResNet-50 V2 Architecture: ResNet-50 V2 utilizes residual learning to help train very deep networks, mitigating the vanishing gradient problem and allowing for better performance on complex tasks like image classification. Using a pre-trained model like ResNet-50 V2 can significantly speed up the training process and improve accuracy, especially when the available training data is limited.

#### Conclusion:
This cell initializes the model for your agricultural pests classification project, leveraging the capabilities of the Lobe library and the strengths of the ResNet-50 V2 architecture to provide a solid foundation for further predictions and analyses.
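
The idea of residual learning mentioned above can be illustrated with a toy block that computes y = x + F(x). This is only a sketch under simplifying assumptions: dense layers stand in for convolutions, batch normalization is omitted, and the activation is applied before each weight matrix in the pre-activation style associated with the V2 variant. It is not the actual ResNet-50 V2 implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy pre-activation residual block: y = x + F(x)."""
    h = relu(x) @ w1   # activation first, then the first weight matrix
    h = relu(h) @ w2   # activation again, then the second weight matrix
    return x + h       # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

# With zero weights, F(x) is zero and the block is exactly the identity.
zeros = np.zeros((8, 8))
identity_out = residual_block(x, zeros, zeros)

# With small random weights, the output is the input plus a learned correction.
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1
out = residual_block(x, w1, w2)
print(out.shape)
```

Because the skip connection passes x through unchanged, gradients have a direct path back to earlier layers, which is the intuition behind training very deep networks this way.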



In [7]:
result=[]
for i in test.filename:
    result_i= model.predict_from_file(i)
    result.append(result_i.prediction)
result[:5]

['wasp', 'snail', 'catterpillar', 'weevil', 'beetle']

## In this cell, predictions are made for the test images using the pre-trained model.

* Result Initialization: An empty list called result is created to store predictions.

* Predicting for Each Image: A for loop iterates through the test.filename list, which contains the paths to the test images. For each image, the predict_from_file method of the model is called to get its prediction.

* Storing Predictions: The predicted class labels are appended to the result list.

* Displaying Predictions: The first five predictions are retrieved and displayed with result[:5].

#### Conclusion:
This cell generates and stores predictions for all test images, providing a quick view of the model’s output on the test dataset.
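
Since the same loop is later repeated for the training images, it could be wrapped in a small helper. The sketch below is one possible version: predict_labels and StubModel are hypothetical names, and the stub only imitates the predict_from_file call and the prediction attribute used in the notebook so that the example runs without Lobe installed. Recording None for a failed image is a design choice made here, not something the notebook does.

```python
from types import SimpleNamespace

def predict_labels(model, paths):
    """Return one predicted label per path, or None where prediction fails."""
    labels = []
    for path in paths:
        try:
            labels.append(model.predict_from_file(path).prediction)
        except Exception as err:  # e.g. an unreadable or missing image
            print(f"skipped {path}: {err}")
            labels.append(None)
    return labels

class StubModel:
    """Hypothetical stand-in for the loaded model so the sketch runs without Lobe."""
    def predict_from_file(self, path):
        if "missing" in path:
            raise FileNotFoundError(path)
        # Pretend the predicted label is the parent folder name.
        return SimpleNamespace(prediction=path.split("/")[-2])

labels = predict_labels(
    StubModel(),
    ["test/wasp/1.jpg", "test/missing/2.jpg", "test/snail/3.jpg"],
)
print(labels)
```

With the real model, the call would look like predict_labels(model, test.filename), keeping the result aligned row by row with the DataFrame.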



In [8]:
test=test.assign(prediction=result)
test.head()

Unnamed: 0,filename,label,prediction
0,/kaggle/input/agricultural-pests-dataset/test/...,wasp,wasp
1,/kaggle/input/agricultural-pests-dataset/test/...,snail,snail
2,/kaggle/input/agricultural-pests-dataset/test/...,catterpillar,catterpillar
3,/kaggle/input/agricultural-pests-dataset/test/...,weevil,weevil
4,/kaggle/input/agricultural-pests-dataset/test/...,beetle,beetle


#### This cell adds the model's predictions to the test dataset as a new column, making it easier to compare each predicted label with the true label.
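
With labels and predictions side by side, accuracy and the misclassified rows can be read off directly. The following sketch uses a hypothetical four-row DataFrame in the same shape as the notebook's test DataFrame; the file names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical rows in the same shape as the notebook's test DataFrame.
scored = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg", "c.jpg", "d.jpg"],
    "label": ["wasp", "beetle", "snail", "beetle"],
    "prediction": ["wasp", "weevil", "snail", "beetle"],
})

correct = scored["label"] == scored["prediction"]  # True where the model was right
accuracy = correct.mean()                          # fraction of correct rows
mistakes = scored.loc[~correct, ["filename", "label", "prediction"]]

print(accuracy)
print(mistakes)
```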



In [9]:
result=[]
for i in train.filename:
    result_i= model.predict_from_file(i)
    result.append(result_i.prediction)
result[:5]

['snail', 'wasp', 'bees', 'grasshopper', 'weevil']

## In this cell, predictions are generated for the training images using the same pre-trained model.

* Result Initialization: An empty list named result is created to store the predictions for each image in the training dataset.

* Predicting for Each Image: A for loop iterates through the train.filename list, which contains the paths to the training images. For each image:
The predict_from_file method of the model is invoked, which returns a prediction result for the current image.
The predicted class label is extracted from the result_i object and stored in the result list.

* Displaying Predictions: The first five predictions are displayed using result[:5], allowing a quick review of the model's outputs on a subset of the training images.

#### Conclusion:
This cell performs predictions on the training dataset, providing insights into how the model classifies the images it was trained on. This step is useful for evaluating the model's performance and understanding its behavior on the training data.

In [10]:
train=train.assign(prediction=result)
train.head()

Unnamed: 0,filename,label,prediction
0,/kaggle/input/agricultural-pests-dataset/train...,snail,snail
1,/kaggle/input/agricultural-pests-dataset/train...,wasp,wasp
2,/kaggle/input/agricultural-pests-dataset/train...,bees,bees
3,/kaggle/input/agricultural-pests-dataset/train...,grasshopper,grasshopper
4,/kaggle/input/agricultural-pests-dataset/train...,weevil,weevil


#### This cell adds the model's predictions to the training dataset as a new column, making it easier to compare each predicted label with the true label. This step is crucial for evaluating model performance on the training data.

In [11]:
from sklearn.metrics import classification_report
print(classification_report(train['label'],train['prediction']))

              precision    recall  f1-score   support

        ants       0.98      0.96      0.97       400
        bees       0.98      0.98      0.98       405
      beetle       0.87      0.95      0.91       331
catterpillar       0.97      0.93      0.95       329
  earthworms       0.96      0.98      0.97       246
      earwig       0.96      0.94      0.95       390
 grasshopper       0.98      0.96      0.97       390
        moth       1.00      0.99      0.99       397
        slug       0.98      0.96      0.97       316
       snail       1.00      1.00      1.00       405
        wasp       0.99      0.98      0.98       392
      weevil       0.98      0.99      0.99       394

    accuracy                           0.97      4395
   macro avg       0.97      0.97      0.97      4395
weighted avg       0.97      0.97      0.97      4395



## In this cell, the performance of the model is evaluated by generating a classification report based on the predictions made on the training data.

* Importing the Classification Report: The classification_report function from the sklearn.metrics module is imported. This function is used to compute various classification metrics.

* Generating the Classification Report: The classification_report function is called with two arguments:

train['label']: The true labels from the training dataset.
train['prediction']: The predicted labels generated by the model for the training images.

* Displaying the Report: The report generated by the function includes key metrics such as precision, recall, F1-score, and support for each class in the dataset. It provides a comprehensive overview of how well the model performed in classifying the training images.

#### Conclusion:
This cell assesses the model's performance by comparing its predictions against the actual labels in the training dataset. Overall accuracy is 0.97 across 4,395 training images; beetle has the lowest precision (0.87), while snail scores 1.00 on every metric. Because the model was fine-tuned on these images, the figures are likely optimistic and are best read alongside the test-set report.
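
A classification report shows how often each class is missed but not which class it is confused with. A confusion matrix fills that gap. The sketch below builds one with pd.crosstab on a hypothetical five-row example; in the notebook the same call could be applied to the label and prediction columns of train or test.

```python
import pandas as pd

# Hypothetical labels and predictions for illustration only.
scored = pd.DataFrame({
    "label": ["beetle", "beetle", "weevil", "weevil", "snail"],
    "prediction": ["beetle", "weevil", "weevil", "weevil", "snail"],
})

# Rows are the true classes, columns are the predicted classes.
confusion = pd.crosstab(
    scored["label"],
    scored["prediction"],
    rownames=["actual"],
    colnames=["predicted"],
)
print(confusion)
```

Off-diagonal cells count the errors; here one beetle image was predicted as weevil.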

In [12]:
print(classification_report(test['label'],test['prediction']))

              precision    recall  f1-score   support

        ants       0.91      0.97      0.94        99
        bees       0.90      0.96      0.93        95
      beetle       0.56      0.68      0.61        85
catterpillar       0.79      0.73      0.76       105
  earthworms       0.82      0.83      0.83        77
      earwig       0.85      0.72      0.78        76
 grasshopper       0.91      0.81      0.86        95
        moth       0.96      0.91      0.93       100
        slug       0.81      0.81      0.81        75
       snail       0.98      0.98      0.98        95
        wasp       0.97      0.96      0.97       106
      weevil       0.93      0.96      0.94        91

    accuracy                           0.87      1099
   macro avg       0.87      0.86      0.86      1099
weighted avg       0.87      0.87      0.87      1099



## In this cell, the classification performance of the model on the test dataset is evaluated by generating a classification report.

* Generating the Classification Report: Similar to the previous cell, the classification_report function from the sklearn.metrics module is called again, but this time with the test dataset:

test['label']: The true labels from the test dataset, which serve as the ground truth for evaluation.
test['prediction']: The predicted labels generated by the model for the test images.

* Displaying the Report: The output includes important metrics for each class in the dataset:

1. Precision: The ratio of correctly predicted positive observations to the total predicted positives, TP / (TP + FP), where TP and FP are the true and false positives for the class. It indicates the accuracy of the positive predictions.
2. Recall (Sensitivity): The ratio of correctly predicted positive observations to all actual positives, TP / (TP + FN), where FN is the number of false negatives. It reflects the model's ability to find all relevant cases.
3. F1-Score: The harmonic mean of precision and recall, 2 × precision × recall / (precision + recall). It balances the two and is often more informative than accuracy when the class distribution is imbalanced.
4. Support: The number of actual occurrences of each class in the specified dataset.

#### Conclusion:
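This classification report evaluates the model on the test dataset. Overall accuracy is 0.87 across 1,099 images, compared with 0.97 on the training set, which suggests some overfitting. Snail (F1 0.98) and wasp (F1 0.97) are classified most reliably, while beetle is the weakest class (precision 0.56, recall 0.68, F1 0.61), followed by catterpillar (F1 0.76) and earwig (F1 0.78). These per-class results show how well the model generalizes to unseen data and where it needs improvement.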
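
The per-class metrics in the report can be reproduced by hand, which helps when interpreting a weak class such as beetle. The following sketch computes precision, recall, F1, and support for one class in plain Python. The function name per_class_metrics and the six labels are hypothetical and serve only to illustrate the definitions.

```python
def per_class_metrics(y_true, y_pred, cls):
    """Return (precision, recall, f1, support) for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)  # true positives
    fp = sum(t != cls and p == cls for t, p in pairs)  # false positives
    fn = sum(t == cls and p != cls for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1, tp + fn

# Hypothetical labels: one weevil image is wrongly predicted as beetle.
y_true = ["beetle", "beetle", "beetle", "weevil", "weevil", "snail"]
y_pred = ["beetle", "beetle", "beetle", "beetle", "weevil", "snail"]

precision, recall, f1, support = per_class_metrics(y_true, y_pred, "beetle")
print(precision, recall, f1, support)
```

In this example every beetle is found (recall 1.0), but one of the four beetle predictions is wrong (precision 0.75), the same pattern of precision below recall that the beetle row shows in the test report.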