<a href="https://colab.research.google.com/github/mohmaed7777/Marine-life-classification-with-fastai/blob/main/Sea_animals_classification_with_fastai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:

# IMPORTANT: RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES
# TO THE CORRECT LOCATION (/kaggle/input) IN YOUR NOTEBOOK,
# THEN FEEL FREE TO DELETE THIS CELL.
# NOTE: THIS NOTEBOOK ENVIRONMENT DIFFERS FROM KAGGLE'S PYTHON
# ENVIRONMENT SO THERE MAY BE MISSING LIBRARIES USED BY YOUR
# NOTEBOOK.

import os
import sys
from tempfile import NamedTemporaryFile
from urllib.request import urlopen
from urllib.parse import unquote, urlparse
from urllib.error import HTTPError
from zipfile import ZipFile
import tarfile
import shutil

CHUNK_SIZE = 40960
DATA_SOURCE_MAPPING = 'sea-animals-image-dataste:https%3A%2F%2Fstorage.googleapis.com%2Fkaggle-data-sets%2F2442436%2F5198507%2Fbundle%2Farchive.zip%3FX-Goog-Algorithm%3DGOOG4-RSA-SHA256%26X-Goog-Credential%3Dgcp-kaggle-com%2540kaggle-161607.iam.gserviceaccount.com%252F20240308%252Fauto%252Fstorage%252Fgoog4_request%26X-Goog-Date%3D20240308T165604Z%26X-Goog-Expires%3D259200%26X-Goog-SignedHeaders%3Dhost%26X-Goog-Signature%3D233b3619037641088d915d34fb10e4a4dfc3cac83f0ee907d250258588f0a13f5d4929752840cb0e213d3bbb2ad9bd4add86266bc17e86d9f11f5e7cdd18e6bd11e0e5dadce676661290fd4bb6f741156959ca4adf81f65994199c17769bbbe6f13252a8c55dcbb68b2dba1f866bead4a4d5cc16868171a79aa6454a90227b5f6e666c6ac1dd52685b55d220e013e742b1780bb53bdfc98c13de189e8eb4cf57cc162942718960f779a0c8c95a05f1ce2d8486b87bc84cfcffa82171b29295662f80b528dc277eb41793e34a0e9af6136d03c617c398ba8b82070580bfb124e2d0e4e231889fc77d1deaa54cbf1b9adfe37be02fc03d46c51b57ad8adabe6dd3'

KAGGLE_INPUT_PATH='/kaggle/input'
KAGGLE_WORKING_PATH='/kaggle/working'
KAGGLE_SYMLINK='kaggle'

!umount /kaggle/input/ 2> /dev/null
shutil.rmtree('/kaggle/input', ignore_errors=True)
os.makedirs(KAGGLE_INPUT_PATH, 0o777, exist_ok=True)
os.makedirs(KAGGLE_WORKING_PATH, 0o777, exist_ok=True)

try:
  os.symlink(KAGGLE_INPUT_PATH, os.path.join("..", 'input'), target_is_directory=True)
except FileExistsError:
  pass
try:
  os.symlink(KAGGLE_WORKING_PATH, os.path.join("..", 'working'), target_is_directory=True)
except FileExistsError:
  pass

for data_source_mapping in DATA_SOURCE_MAPPING.split(','):
    directory, download_url_encoded = data_source_mapping.split(':')
    download_url = unquote(download_url_encoded)
    filename = urlparse(download_url).path
    destination_path = os.path.join(KAGGLE_INPUT_PATH, directory)
    try:
        with urlopen(download_url) as fileres, NamedTemporaryFile() as tfile:
            total_length = fileres.headers['content-length']
            print(f'Downloading {directory}, {total_length} bytes compressed')
            dl = 0
            data = fileres.read(CHUNK_SIZE)
            while len(data) > 0:
                dl += len(data)
                tfile.write(data)
                done = int(50 * dl / int(total_length))
                sys.stdout.write(f"\r[{'=' * done}{' ' * (50-done)}] {dl} bytes downloaded")
                sys.stdout.flush()
                data = fileres.read(CHUNK_SIZE)
            if filename.endswith('.zip'):
              with ZipFile(tfile) as zfile:
                zfile.extractall(destination_path)
            else:
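              tfile.flush()  # make sure all bytes are on disk before reopening by name
              with tarfile.open(tfile.name) as tar:  # avoid shadowing the tarfile module
                tar.extractall(destination_path)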
            print(f'\nDownloaded and uncompressed: {directory}')
    except HTTPError as e:
        print(f'Failed to load (likely expired) {download_url} to path {destination_path}')
        continue
    except OSError as e:
        print(f'Failed to load {download_url} to path {destination_path}')
        continue

print('Data source import complete.')


In [None]:
import pandas as pd
import numpy as np
import torch
import fastai
from fastai.vision.all import *

In [None]:
path = Path('/kaggle/input/sea-animals-image-dataste')

In [None]:
path.ls()

# **Start by creating the DataLoaders:=**

In [None]:
sea_animals = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
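
**With `get_y=parent_label`, each image is labelled with the name of the folder that contains it. A minimal plain-Python sketch of that idea follows; `label_from_parent` and the example file name are hypothetical and only illustrate the behaviour, they are not fastai code.**

```python
from pathlib import PurePosixPath


def label_from_parent(file_path):
    """Return the name of the directory that directly contains the file."""
    return PurePosixPath(file_path).parent.name


# Hypothetical file path, for illustration only
example = "/kaggle/input/sea-animals-image-dataste/Dolphin/img_001.jpg"
print(label_from_parent(example))
```

**This is why the dataset must be organised as one folder per class for the `DataBlock` above to work.**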

In [None]:
dls = sea_animals.dataloaders(path)

# **View Samples of our Data:=**

In [None]:
dls.train.show_batch(max_n=6, nrows=2)

In [None]:
dls.valid.show_batch(max_n=6, nrows=2)

In [None]:
print(dls.vocab)

**Check the dataset length:**

In [None]:
len(dls.train_ds), len(dls.valid_ds)

# **Initializing our Baseline:=**

**Start with a ResNet50 model:**

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.fine_tune(6)
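
**The `error_rate` metric printed after each epoch is the fraction of validation images that were classified incorrectly, that is, 1 minus accuracy. A minimal plain-Python sketch of the calculation follows; `error_rate_from_labels` and the labels are made up for illustration and are not part of fastai.**

```python
def error_rate_from_labels(predicted, actual):
    """Fraction of items whose predicted label differs from the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one label")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Hypothetical labels, for illustration only
actual = ["crab", "dolphin", "eel", "crab"]
predicted = ["crab", "eel", "eel", "crab"]
print(error_rate_from_labels(predicted, actual))  # one of four is wrong: 0.25
```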

# **Model Interpretation:=**

In [None]:
# Plot the validation images with the highest losses
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(3, nrows=3)

# **Image Processing using Data Augmentation (aug_transforms):=**

In [None]:
sea_animals = sea_animals.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = sea_animals.dataloaders(path)

# **View the dataset again after applying aug_transforms:**

In [None]:
dls.train.show_batch(max_n=6, nrows=3)

# **Train the model:=**

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.fine_tune(6)

**As we can see, the model has improved compared with the previous run. Next, we will try another method that may give even better results.**

# **Presizing:=**

In [None]:
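sea_animals = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    # Reuse the baseline split so the results stay comparable
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=Resize(460),
    batch_tfms=aug_transforms(size=224, min_scale=0.75))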

In [None]:
dls = sea_animals.dataloaders(path)

In [None]:
dls.train.show_batch(max_n=6, nrows=3)

# **Train again with the Presizing method:=**

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.fine_tune(11)

# **Model Interpretation:=**

In [None]:
presize_interp = ClassificationInterpretation.from_learner(learn)
presize_interp.plot_top_losses(6, nrows=3)

# **Get the most confused predictions:**

In [None]:
presize_interp.most_confused(min_val=5)
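
**`most_confused` lists the (actual, predicted) class pairs that the model mixes up most often, keeping only pairs that occur at least `min_val` times. The plain-Python sketch below shows the same idea on a handful of made-up labels; `most_confused_pairs` is a hypothetical helper, not the fastai implementation.**

```python
from collections import Counter


def most_confused_pairs(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    # Count only the off-diagonal cells of the confusion matrix
    counts = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    pairs = [(a, p, n) for (a, p), n in counts.items() if n >= min_val]
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Hypothetical labels, for illustration only
actual = ["seal", "seal", "seal", "otter", "otter", "whale"]
predicted = ["otter", "otter", "seal", "seal", "otter", "whale"]
print(most_confused_pairs(actual, predicted, min_val=2))
```

**Raising `min_val` hides rare mistakes so that only the systematic confusions remain.**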

# **Improving the model by finding the best Learning Rate:**

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
lrs = learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))
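
**The learning rate finder runs a short mock training in which the learning rate grows exponentially from a very small to a very large value while the loss is recorded; the suggestion functions then pick points on that loss curve. The sketch below reproduces only the exponential schedule. The values 1e-7, 10 and 100 iterations are what I believe fastai uses by default, so treat them as assumptions, and `exp_lr_schedule` is a hypothetical name.**

```python
def exp_lr_schedule(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Learning rates spaced evenly on a log scale between start_lr and end_lr."""
    if num_it < 2:
        raise ValueError("num_it must be at least 2")
    if start_lr <= 0 or end_lr <= 0:
        raise ValueError("learning rates must be positive")
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


lrs_tried = exp_lr_schedule()
print(f"first: {lrs_tried[0]:.1e}, middle: {lrs_tried[50]:.1e}, last: {lrs_tried[-1]:.1e}")
```

**Because the steps are multiplicative, the resulting plot is drawn with a logarithmic x-axis.**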

# **Train the model with the newly suggested Learning Rate:=**

In [None]:
learn = vision_learner(dls, resnet50, metrics=error_rate)
learn.fine_tune(11, lrs.valley)

# **Unfreezing the model:=**

**Pretrained models, such as the ResNet model we are using here, can be reused on entirely different data and for a different task. This is called transfer learning.**

**Because the model was trained on a different dataset, we can often improve it by effectively removing the final linear layer of the model, which is specifically designed to classify the categories in the original pretraining dataset, and replacing it with a layer specific to our dataset.**

**From FastAI:**

**We want to train a model in such a way that we allow it to remember all of these generally useful ideas from the pretrained model, use them to solve our particular task, and only adjust them as required for the specifics of our particular task.**

**Our challenge when fine-tuning is to replace the random weights in our added linear layers with weights that correctly achieve our desired task without breaking the carefully pretrained weights and the other layers. There is actually a very simple trick to allow this to happen: tell the optimizer to only update the weights in those randomly added final layers. Don't change the weights in the rest of the neural network at all. This is called freezing those pretrained layers.**

**When we call the `fine_tune` method, fastai does two things:**

1. **Trains the randomly added layers for one epoch, with all other layers frozen.**
2. **Unfreezes all of the layers, and trains them all for the number of epochs requested.**
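
**The two phases described above can be sketched as a small function. This is a simplified approximation, not the library's code: the defaults `base_lr=2e-3` and `lr_mult=100` and the halving of the learning rate before the second phase are my recollection of fastai's behaviour and should be treated as assumptions. `manual_fine_tune` and `RecordingLearner` are hypothetical names; the stub only records calls so the example runs without fastai or a GPU.**

```python
def manual_fine_tune(learn, epochs, base_lr=2e-3, freeze_epochs=1, lr_mult=100):
    """Two-phase schedule: train the new head first, then the whole network."""
    # Phase 1: only the randomly initialised final layers are trainable
    learn.freeze()
    learn.fit_one_cycle(freeze_epochs, slice(base_lr))
    # Phase 2: all layers are trainable, with smaller rates for the early layers
    learn.unfreeze()
    base_lr /= 2
    learn.fit_one_cycle(epochs, slice(base_lr / lr_mult, base_lr))


class RecordingLearner:
    """Hypothetical stand-in for a Learner that just records what was called."""

    def __init__(self):
        self.calls = []

    def freeze(self):
        self.calls.append("freeze")

    def unfreeze(self):
        self.calls.append("unfreeze")

    def fit_one_cycle(self, n_epoch, lr_max):
        self.calls.append(("fit_one_cycle", n_epoch, lr_max))


stub = RecordingLearner()
manual_fine_tune(stub, epochs=6)
for call in stub.calls:
    print(call)
```

**With a real fastai `Learner`, the same function could be called as `manual_fine_tune(learn, 6)`; the object only needs `freeze`, `unfreeze` and `fit_one_cycle` methods.**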

In [None]:
learn.unfreeze()
lrs = learn.lr_find(suggest_funcs=(minimum, steep, valley, slide))

In [None]:
learn.fit_one_cycle(10, slice(lrs.minimum, lrs.slide))
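
**Passing `slice(low, high)` as the learning rate requests discriminative learning rates: the earliest layer group trains at `low`, the final group at `high`, and the groups in between receive values spaced by a constant multiplier. The sketch below shows that spacing in plain Python. `spread_lrs` is a hypothetical helper, and the three layer groups and the example rates are assumptions chosen for illustration.**

```python
def spread_lrs(low, high, n_groups):
    """Learning rates from low to high, spaced by a constant multiplier."""
    if low <= 0 or high <= 0:
        raise ValueError("learning rates must be positive")
    if n_groups < 1:
        raise ValueError("need at least one layer group")
    if n_groups == 1:
        return [high]
    step = (high / low) ** (1 / (n_groups - 1))
    return [low * step ** i for i in range(n_groups)]


# Assumed values: three layer groups, rates between 1e-6 and 1e-4
for group, lr in enumerate(spread_lrs(1e-6, 1e-4, 3)):
    print(f"layer group {group}: lr = {lr:.1e}")
```

**Since the first value of the slice is meant for the earliest layers, it is worth printing `lrs` before training to confirm that `lrs.minimum` is in fact smaller than `lrs.slide`.**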

# **Model Interpretation:=**

In [None]:
presize_interp_2 = ClassificationInterpretation.from_learner(learn)
presize_interp_2.plot_top_losses(6, nrows=3)