Hi everyone! I just finished fastai's Practical Deep Learning for Coders course and I am in the mood to make a plant image classifier.

If you want to make something like this as well, check [this](http://colab.research.google.com/github/fastai/fastbook/blob/master/02_production.ipynb) notebook (it is actually more detailed than the annotations below). If you prefer a video version, check [this](https://course.fast.ai/videos/?lesson=2).

**Note that Kaggle is the platform used here, since running deep learning models in Google Colab or Jupyter Notebook requires you to have a GPU, while Kaggle already provides one within its notebooks. To enable it, just follow this link: https://medium.com/aiplusoau/how-to-use-kaggle-and-google-colab-notebooks-with-gpu-enabled-c2a1512cd4f Meanwhile, you need to set up Google Cloud if you opt to use Google Colab or Jupyter, and creating a Google Cloud account requires credit card information.**

# Installing a duckduckgo scraper and importing fastai libraries

In [None]:
!pip install jmd_imagescraper

In [None]:
!pip install -Uqq fastbook

import fastbook  #import the fast.ai library
from fastbook import *  # don't worry, it's designed to work with import *
fastbook.setup_book()
from fastai.vision.widgets import *

#import the image scraper by @JoeDockrill, website: https://joedockrill.github.io/blog/2020/09/18/jmd-imagescraper-library/
from jmd_imagescraper.core import * 
from pathlib import Path
from jmd_imagescraper.imagecleaner import *

import ipywidgets as widgets

# Creating a path object that serves as a directory of our scraped images

In [None]:
path = Path().cwd()/"plant"

In [None]:
# Scraping up to 200 images for each class (three diseases plus healthy plants)
duckduckgo_search(path, "bacterial spots", "bacterial spots plant", max_results = 200)
duckduckgo_search(path, "mosaic virus", "mosaic virus plant", max_results = 200)
duckduckgo_search(path, "healthy", "healthy plant", max_results = 200)
duckduckgo_search(path, 'rust fungus', 'rust fungus plant', max_results = 200)


In [None]:
path 

In [None]:
# Collecting the file paths of all the images we scraped
lst = get_image_files(path)
lst

In [None]:
# rm -r plant
# uncomment this if you want to delete all of the images you scraped

In [None]:
# Checking the number of images we have
len(lst)

In [None]:
# Checking for images with errors
failed = verify_images(lst)
failed

In [None]:
# Removing those images with errors
failed.map(Path.unlink)
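
After deleting the broken files, it can be worth checking how many images are left in each class, since a very unbalanced folder may hurt training. Below is a minimal sketch using only the standard library. It assumes the layout created above (one sub-folder per label inside `plant`). `count_images_per_class` and `IMAGE_EXTS` are hypothetical names made up for this illustration; they are not part of fastai or jmd_imagescraper.

```python
from collections import Counter
from pathlib import Path

# Assumption: these are the extensions we treat as images in this sketch.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def count_images_per_class(root):
    """Count image files per label, where the label is the parent folder name."""
    counts = Counter()
    for f in Path(root).rglob("*"):
        if f.is_file() and f.suffix.lower() in IMAGE_EXTS:
            counts[f.parent.name] += 1
    return dict(counts)
```

In the notebook you could call it as `count_images_per_class(path)` right after the cleanup cell.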

# Separating the train and valid data and checking our images

In [None]:
# Separating the training and validation data
plant = DataBlock(
            blocks = (ImageBlock, CategoryBlock),
            get_items = get_image_files,
            splitter = RandomSplitter(valid_pct = 0.2, seed = 42),
            get_y = parent_label,
            item_tfms = Resize(128))
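
To get a feel for what `RandomSplitter(valid_pct = 0.2, seed = 42)` does, here is a rough sketch of a seeded 80/20 split in plain Python. It is only an illustration of the idea: `random_split` is a hypothetical helper, it uses the standard `random` module rather than fastai's own code, so it will not reproduce the exact same split that fastai makes.

```python
import random


def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out valid_pct of the items."""
    rng = random.Random(seed)           # fixed seed -> same split on every run
    idxs = list(range(len(items)))
    rng.shuffle(idxs)
    cut = int(valid_pct * len(items))   # number of validation items
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid
```

The fixed seed is what matters here: the validation set stays the same between runs, so changes in the error rate come from the model and not from a different split.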

In [None]:
# Checking the images we have and their classifications
dls = plant.dataloaders(path)
dls.valid.show_batch(max_n = 8, nrows = 2)

**Uncomment the cell below if aug_transforms() gives you an error. Apparently, you have to downgrade PyTorch on Kaggle.**

More details about that [here](http://www.kaggle.com/product-feedback/279990)

In [None]:
# pip install --user torch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 torchtext==0.10.0


In [None]:
# Performing data augmentation: we create random variations of our input images so that they
# look different, but their meaning (the label) does not change.
plant = plant.new(
           item_tfms = RandomResizedCrop(128, min_scale = 0.5),
           batch_tfms = aug_transforms())
dls = plant.dataloaders(path)
dls.train.show_batch(max_n = 8, nrows =2)
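
For intuition about the `RandomResizedCrop(128, min_scale = 0.5)` step, here is a simplified sketch of how a random crop box could be chosen before the crop is resized to 128 pixels. This is not fastai's implementation. `random_crop_box` is a hypothetical helper, the aspect ratio range of 3/4 to 4/3 is an assumption, and the sketch clamps an oversized box to the image instead of retrying.

```python
import math
import random


def random_crop_box(width, height, min_scale=0.5, ratio=(3 / 4, 4 / 3), rng=None):
    """Return (left, top, crop_width, crop_height) for one random crop."""
    if rng is None:
        rng = random.Random()
    # Keep between min_scale and 100% of the original image area.
    area = width * height * rng.uniform(min_scale, 1.0)
    # Sample the aspect ratio uniformly in log space.
    aspect = math.exp(rng.uniform(math.log(ratio[0]), math.log(ratio[1])))
    w = min(width, max(1, round(math.sqrt(area * aspect))))
    h = min(height, max(1, round(math.sqrt(area / aspect))))
    # Place the box at a random position that stays inside the image.
    left = rng.randint(0, width - w)
    top = rng.randint(0, height - h)
    return left, top, w, h
```

With `min_scale = 0.5`, every crop keeps at least about half of the picture, so the diseased part of a leaf is less likely to be cut out completely.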

# Training Model and Checking the Results

In [None]:
learn = cnn_learner(dls, resnet18, metrics = error_rate)

To avoid guessing the learning rate, I use the learning rate finder and pick either the slide or the valley suggestion.
For details about the four suggestions (minimum, steep, slide and valley), click [here](http://forums.fast.ai/t/new-lr-finder-output/89236/3)

Click [here](http://sgugger.github.io/how-do-you-find-a-good-learning-rate.html) if you want to gain an intuitive sense of picking a learning rate.
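
To make two of the suggestions a bit more concrete, here is a small sketch of how I understand `minimum` and `steep`: the first takes the learning rate with the lowest loss and divides it by 10, the second takes the point where the loss drops fastest with respect to the log of the learning rate. Treat this as a simplified illustration, not the library code. `suggest_minimum` and `suggest_steep` are hypothetical names, and the sketch skips any smoothing of the loss curve.

```python
import math


def suggest_minimum(lrs, losses):
    """Learning rate at the lowest recorded loss, divided by 10."""
    best = min(range(len(losses)), key=lambda i: losses[i])
    return lrs[best] / 10


def suggest_steep(lrs, losses):
    """Learning rate where the loss falls fastest per unit of log(lr)."""
    slopes = [
        (losses[i + 1] - losses[i]) / (math.log(lrs[i + 1]) - math.log(lrs[i]))
        for i in range(len(lrs) - 1)
    ]
    steepest = min(range(len(slopes)), key=lambda i: slopes[i])
    return lrs[steepest]
```

Both functions expect two lists of equal length, like the learning rates and losses recorded during an `lr_find` run.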

In [None]:
lr_min, lr_steep, lr_slide, lr_valley = learn.lr_find(
    suggest_funcs=(minimum, steep, slide, valley))

In [None]:
print(f" minimum:{lr_min}\n steep:{lr_steep}\n slide:{lr_slide}\n valley:{lr_valley}")

In [None]:
# Fine-tuning for 5 epochs with a learning rate picked from the suggestions above
learn.fine_tune(5, 0.001737800776027143)

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
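
If you are wondering what the confusion matrix and the `error_rate` metric actually count, here is a minimal sketch in plain Python. `confusion_counts` and `simple_error_rate` are hypothetical helpers written for this explanation, and they work on plain lists of labels instead of a fastai learner.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def simple_error_rate(actual, predicted):
    """Fraction of items whose predicted label differs from the actual one."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)
```

Each cell of the plotted matrix is one of these pair counts: the diagonal holds the correct predictions, and everything off the diagonal is a mistake that adds to the error rate.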

In [None]:
# Plotting the top losses. I still don't know why the other 4 images don't show up
interp.plot_top_losses(5, nrows = 1)


In [None]:
# This creates a GUI that shows the images that confused the model the most. Feel free to
# delete those images or assign them the correct class
cleaner = ImageClassifierCleaner(learn)
cleaner

Uncomment the code below to apply the changes you made in the GUI

In [None]:
# for idx in cleaner.delete(): cleaner.fns[idx].unlink()
# for idx, cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)

# Exporting the model

In [None]:
learn.export()
path = Path()
path.ls(file_exts = '.pkl')

# The next step is to use this model in a web app or a mobile app (for example, one built with Flutter)
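
As a starting point for the app side, here is a hedged sketch of loading the exported file and classifying a single image with fastai's `load_learner`. The names `classify` and `top_prediction` are hypothetical, and the default `export.pkl` path assumes the file is in the current folder. The fastai import sits inside the function so the helper can be imported even where fastai is not installed.

```python
from pathlib import Path


def top_prediction(vocab, probs):
    """Return (label, probability) for the most likely class."""
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return vocab[idx], float(probs[idx])


def classify(image_path, model_path="export.pkl"):
    """Load the exported learner and predict the class of one image."""
    # Imported lazily: fastai is only needed when a prediction is requested.
    from fastai.vision.all import PILImage, load_learner

    learn_inf = load_learner(Path(model_path))
    _, _, probs = learn_inf.predict(PILImage.create(image_path))
    return top_prediction(list(learn_inf.dls.vocab), probs.tolist())
```

A web backend could call `classify("leaf.jpg")` for each uploaded photo (the file name here is just an example) and send the label and probability back to the app. In a real service you would probably load the learner once at startup instead of on every request.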