# Image Classification Task for ML Internship 2023

Task - Your task is to train a ML model on the EuroSAT land cover classification dataset.

Dataset link- Dataset. The dataset contains 64x64 RGB images from Sentinel-2.
You can find references to the dataset here - https://github.com/phelber/EuroSAT

Training [Jupyter notebook]: Include all steps from loading the dataset, to
saving the model and evaluation metrics.

**Plan:**
  1. **Librairies import**
  2. **Data loading & preprocessing**
  4. **Model loading**
  5. **Model testing**

# Librairies import

In [12]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras.models import load_model

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import os
from google.colab import drive

drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


# Data Loading & Preprocessing

In [2]:
#path of the dataset
dataset_path = '/content/drive/MyDrive/dataset/2750/'

In [3]:
#properties of the image
img_height = 64
img_width = 64

In [4]:
test_file_path = "/content/test.csv"

In [7]:
test_df = pd.read_csv(test_file_path)

In [6]:
# Normalize the testing set
test_datagen = ImageDataGenerator(rescale=1./255)

In [15]:
# Create a training data generator
test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory='',
    x_col='path',
    y_col='label',
    target_size=(img_height, img_width),
    batch_size=1,
    class_mode='categorical')

Found 20 validated image filenames belonging to 10 classes.


# Model compilation

In [13]:
model = load_model('/content/drive/MyDrive/best_model_EuroStar.h5')

In [16]:
# Evaluate the model on the testing data
test_loss, test_acc = model.evaluate(test_generator)



In [17]:
print('Loss: %.4f, Accuracy: %.4f' % (test_loss, test_acc))

Loss: 0.4511, Accuracy: 0.9000


In [24]:
#classes
CLASSES = sorted(os.listdir(dataset_path))
for i, e in enumerate(CLASSES):
  print(f'classe {i} : {e}')

classe 0 : AnnualCrop
classe 1 : Forest
classe 2 : HerbaceousVegetation
classe 3 : Highway
classe 4 : Industrial
classe 5 : Pasture
classe 6 : PermanentCrop
classe 7 : Residential
classe 8 : River
classe 9 : SeaLake


In [25]:
y_pred = model.predict(test_generator)
y_pred = np.argmax(y_pred, axis=1)
preds = [CLASSES[e] for e in y_pred]



In [26]:
preds

['Highway',
 'PermanentCrop',
 'HerbaceousVegetation',
 'SeaLake',
 'AnnualCrop',
 'AnnualCrop',
 'Forest',
 'Industrial',
 'Industrial',
 'SeaLake',
 'AnnualCrop',
 'AnnualCrop',
 'Residential',
 'AnnualCrop',
 'PermanentCrop',
 'Industrial',
 'Forest',
 'Pasture',
 'HerbaceousVegetation',
 'Pasture']

Save the result

In [28]:
# Create a new DataFrame with the predicted labels and image paths
df_pred = pd.DataFrame({'path': test_df['path'],'label' : test_df['label'] ,'label_predicted': preds})

# Save the DataFrame to a CSV file
df_pred.to_csv('test_image_predictions.csv', index=False)