**IMPORTING THE MODULES**

Here we import the libraries the notebook needs for data handling, plotting, and building the neural network.



In [None]:
import numpy as np 
import pandas as pd
import os
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import struct
import warnings
warnings.filterwarnings('ignore')
import tensorflow as tf
import seaborn as sns

**LOADING DATA**

Creating a data frame with the image path and label.

This code walks the dataset directory, finds the .png files, and adds the full path of each file along with the name of its parent directory (used as the label) to a DataFrame. Folders whose name ends in "GT" are skipped. We will use this DataFrame later for analysis and model training.






In [None]:
data_dir = '../input/a-large-scale-fish-dataset/Fish_Dataset/Fish_Dataset'
label = []
path = []
# Walk the dataset tree and record every .png outside the "GT" folders
for dirname, _, filenames in os.walk(data_dir):
    for filename in filenames:
        if os.path.splitext(filename)[1] == '.png':
            if dirname.split()[-1] != 'GT':                   # skip folders whose name ends in " GT"
                label.append(os.path.split(dirname)[1])       # folder name is the label
                path.append(os.path.join(dirname, filename))  # full path to the image
data = pd.DataFrame(columns=['path', 'label'])
data['path']=path
data['label']=label                                                     
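
To see how the "GT" filter behaves, the same walk can be run on a tiny throwaway directory tree. The sketch below is illustrative only: the function name, species names, and file names are made up, and it assumes that mask folders are named `<species> GT`, which is what the filter in the cell above implies.

```python
import os
import tempfile

def collect_png_paths(root):
    """Return (path, label) pairs for .png files outside '... GT' folders."""
    pairs = []
    for dirname, _, filenames in os.walk(root):
        # Skip folders whose path ends with the token "GT", e.g. "Trout GT".
        if dirname.split()[-1] == 'GT':
            continue
        for filename in sorted(filenames):
            if os.path.splitext(filename)[1] == '.png':
                pairs.append((os.path.join(dirname, filename),
                              os.path.split(dirname)[1]))
    return pairs

# Build a hypothetical two-species tree with image folders and GT folders.
with tempfile.TemporaryDirectory() as root:
    for parts in (('Trout', 'Trout'), ('Trout', 'Trout GT'),
                  ('Shrimp', 'Shrimp'), ('Shrimp', 'Shrimp GT')):
        folder = os.path.join(root, *parts)
        os.makedirs(folder)
        open(os.path.join(folder, '00001.png'), 'w').close()
    demo_pairs = collect_png_paths(root)

demo_labels = sorted(label for _, label in demo_pairs)
print(demo_labels)
```

Only the two image folders contribute rows; the files under the "GT" folders are ignored.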

**CHECKING THE DATA**

Previewing the first rows shows whether the path and label columns were filled in correctly.



In [None]:
data.head()  
# head() shows the first rows of a pandas DataFrame (5 rows by default).

The data.info() command provides general and quick information about your dataset, making it very useful for checking whether there are any missing values, whether the data types are correct, and the memory usage.

In [None]:
data.info()

The next cell counts how many images there are of each fish species.



In [None]:
data['label']=data['label'].astype('category') 
data['label'].value_counts(ascending=True)   

In [None]:
plt.figure(figsize=(12, 8))
sns.countplot(y='label', data=data)
plt.title('Types Of Fishes')
plt.show()

**DATA ANALYSIS**

In [None]:
data['label'].unique()

Displaying one sample image for each of the 9 fish species

In [None]:
idx = 0
plt.figure(figsize=(15,12)) 
for unique_label in data['label'].unique():  
    plt.subplot(3, 3, idx+1)
    plt.imshow(plt.imread(data[data['label']==unique_label].iloc[0,0])) 
    plt.title(unique_label)
    plt.axis('off')
    idx+=1

This code displays a sample image for each unique class in the DataFrame on a 3x3 grid. The loop selects the first image of each class, displays it in the corresponding subplot, and adds the class name as the title.







**TRAINING AND TEST SETS**

The following code splits the dataset into a training set (80%) and a test set (20%), so that the model can be evaluated on images it has not seen during training.



In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test=train_test_split(data, test_size=0.2, random_state=42)
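
The split above is purely random, so the class proportions in the two sets can drift slightly. One common option, not used in this notebook, is to pass `stratify` to `train_test_split` so that each class keeps the same share in both sets. A minimal sketch on made-up labels:

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 3 classes with 10 samples each.
toy_labels = ['trout'] * 10 + ['shrimp'] * 10 + ['sea_bass'] * 10
toy_paths = [f'img_{i}.png' for i in range(len(toy_labels))]

train_paths, test_paths, train_labels, test_labels = train_test_split(
    toy_paths, toy_labels,
    test_size=0.2,
    random_state=42,
    stratify=toy_labels,  # keep class ratios the same in both splits
)

print(Counter(train_labels))
print(Counter(test_labels))
```

For the DataFrame in this notebook, the equivalent call would be `train_test_split(data, test_size=0.2, random_state=42, stratify=data['label'])`.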

In [None]:
print(X_train.shape)
print(X_test.shape)

**Preparing the Image Dataset for Training, Validation, and Testing (with TensorFlow)**

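This section prepares the dataset for the deep learning model with Keras. The aim is to convert the images into a format suitable for the model and to create generators for the training, validation, and test sets. The generators apply `preprocess_input` to every image and reserve 30% of the training data for validation; no augmentation transforms (such as rotations or flips) are configured here.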



In [None]:
from tensorflow.keras.applications        import ResNet50V2, MobileNetV2
from tensorflow.keras.applications.resnet import preprocess_input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

trainGen = ImageDataGenerator(preprocessing_function=preprocess_input, validation_split=0.3)
testGen =ImageDataGenerator(preprocessing_function= preprocess_input)
X_train_img = trainGen.flow_from_dataframe(dataframe=X_train, x_col='path', y_col='label',class_mode='categorical', subset='training', color_mode='rgb', batch_size=32)
X_val_img = trainGen.flow_from_dataframe(dataframe=X_train, x_col='path', y_col='label',class_mode='categorical', subset='validation', color_mode='rgb', batch_size=32)
X_test_img =testGen.flow_from_dataframe(dataframe=X_test, x_col='path', y_col='label',class_mode='categorical', color_mode='rgb', batch_size=32, shuffle=False)
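
Because `validation_split=0.3` is applied to the 80% training portion, the effective shares are roughly 56% training, 24% validation, and 20% test. The sketch below works this out for a hypothetical total of 9000 images. The total is an assumption for illustration, not a figure reported in this notebook, and the exact counts printed by Keras may differ by one because of rounding.

```python
def split_sizes(n_total, test_pct=20, val_pct=30):
    """Approximate subset sizes for a test split followed by a validation split."""
    n_test = n_total * test_pct // 100       # held out by train_test_split
    n_train_full = n_total - n_test          # rows passed to trainGen
    n_val = n_train_full * val_pct // 100    # subset='validation'
    n_train = n_train_full - n_val           # subset='training'
    return n_train, n_val, n_test

sizes = split_sizes(9000)  # 9000 is a hypothetical dataset size
print(sizes)
```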

**IMAGE PLOTTING AFTER PREPROCESSING**

This code displays 6 images from the test set in a 2x3 grid, taking the first image from each of six consecutive batches. The title of each image is its label as produced by the generator, which is a one-hot vector rather than a class name; if there is no label, "Unknown Label" is shown. These are the true labels from the dataset, not model predictions, so the plot is a check on the input pipeline rather than on the model.

In [None]:
fig, ax = plt.subplots(nrows=2, ncols=3, figsize=(15, 8))
ax = ax.flatten()
j = 0
for _ in range(6):
    img, label = next(X_test_img) 
    ax[j].imshow(img[0].astype('uint8'))  

    print("Label:", label)  
    if isinstance(label, np.ndarray) and label.size > 0:
        ax[j].set_title(str(label[0]))  
    else:
        ax[j].set_title("Unknown Label") 

    ax[j].axis('off')  
    j += 1

plt.tight_layout()  
plt.show()  
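
Two details make the plot above harder to read. The titles show one-hot vectors instead of species names, and `preprocess_input` from `tensorflow.keras.applications.resnet` converts images from RGB to BGR and subtracts the ImageNet channel means, so casting the result straight to `uint8` distorts the colors. The following sketch shows one way to undo both for display. The helper names are hypothetical, and the `class_indices` dictionary is a made-up stand-in for `X_test_img.class_indices`.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by caffe-style preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def undo_caffe_preprocess(img):
    """Turn a mean-subtracted BGR image back into a displayable RGB uint8 image."""
    restored = img + IMAGENET_BGR_MEAN   # add the channel means back
    restored = restored[..., ::-1]       # BGR -> RGB
    return np.clip(np.rint(restored), 0, 255).astype('uint8')

def label_name(one_hot, class_indices):
    """Map a one-hot label vector to its class name."""
    index_to_name = {index: name for name, index in class_indices.items()}
    return index_to_name[int(np.argmax(one_hot))]

# Hypothetical stand-ins for one element of a generator batch.
class_indices = {'Sea Bass': 0, 'Shrimp': 1, 'Trout': 2}
rgb = np.array([[[255, 128, 0]]], dtype='float64')   # a single RGB pixel
preprocessed = rgb[..., ::-1] - IMAGENET_BGR_MEAN     # emulates the preprocessing

print(undo_caffe_preprocess(preprocessed))
print(label_name(np.array([0.0, 0.0, 1.0]), class_indices))
```

In the plotting loop this would correspond to `ax[j].imshow(undo_caffe_preprocess(img[0]))` and `ax[j].set_title(label_name(label[0], X_test_img.class_indices))`.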

The next cells inspect the shape of the first test batch (batch size, height, width, channels) and store the per-image shape in a variable for use in the model definition.



In [None]:
X_test_img[0][0].shape


In [None]:
image_shape=(256,256,3)
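
The input shape largely determines how big the network becomes. As a rough check, the sketch below traces a 256x256x3 input through two 3x3 convolutions without padding, each followed by 2x2 max pooling, and then through dense layers of 128 and 9 units. This mirrors the layer stack used in the CREATING THE MODEL section; the helper functions are illustrative and not part of Keras.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

height, channels = 256, 3
params = 0

for filters in (32, 64):
    params += (3 * 3 * channels + 1) * filters  # kernel weights plus one bias per filter
    height = pool_out(conv_out(height))
    channels = filters

flat = height * height * channels               # length of the Flatten() output
params += (flat + 1) * 128                      # Dense(128)
params += (128 + 1) * 9                         # Dense(9)

print(height, flat, params)
```

Almost all of the roughly 31.5 million parameters sit in the first dense layer, because the flattened feature map is still very large after only two pooling stages.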

Checking Class Indices in the Training, Validation, and Test Sets


In [None]:
X_train_img.class_indices


In [None]:
X_val_img.class_indices


In [None]:
X_test_img.class_indices
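
The three mappings need to be identical; otherwise an index predicted by the model would refer to different species in different sets. A quick guard is sketched here with made-up dictionaries standing in for the generators' `class_indices` attributes (the function name is hypothetical):

```python
def same_class_mapping(*mappings):
    """Return True when every class-to-index dictionary is identical."""
    first = mappings[0]
    return all(mapping == first for mapping in mappings[1:])

# Hypothetical mappings; in the notebook these would come from the generators.
train_map = {'Sea Bass': 0, 'Shrimp': 1, 'Trout': 2}
val_map = {'Sea Bass': 0, 'Shrimp': 1, 'Trout': 2}
test_map_missing = {'Sea Bass': 0, 'Trout': 1}  # a split that lost one class

print(same_class_mapping(train_map, val_map))
print(same_class_mapping(train_map, test_map_missing))
```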


**CREATING THE MODEL**

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Small CNN: two convolution/pooling stages followed by two dense layers
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=image_shape),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(9, activation='softmax')  # one output per fish class
])

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',  # matches class_mode='categorical'
    metrics=['accuracy']
)

# The generators already yield batches of 32, so batch_size is not passed here
history = model.fit(
    X_train_img,
    epochs=10,
    validation_data=X_val_img
)

**EVALUATING THE MODEL'S RESULTS**

In [None]:
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
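
The curves can also be summarized numerically. The sketch below picks the epoch with the lowest validation loss from a dictionary shaped like `history.history`. The function name is hypothetical and the numbers are invented for illustration; they are not results from this notebook.

```python
def best_epoch(history_dict, metric='val_loss'):
    """Return the 1-based epoch with the lowest value of the given metric."""
    values = history_dict[metric]
    return min(range(len(values)), key=values.__getitem__) + 1

# Invented values, shaped like history.history after 5 epochs.
fake_history = {
    'loss':     [1.20, 0.60, 0.35, 0.20, 0.10],
    'val_loss': [1.00, 0.55, 0.40, 0.45, 0.60],
}

print(best_epoch(fake_history))
```

A validation loss that starts rising while the training loss keeps falling, as in the invented values above, is the usual sign of overfitting.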

In [None]:
y_pred = model.predict(X_test_img)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = X_test_img.classes

Creating the Confusion Matrix

In [None]:
from sklearn.metrics import confusion_matrix
import seaborn as sns


cm = confusion_matrix(y_true, y_pred_classes)
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d')
plt.xlabel('Predicted Class')
plt.ylabel('True Class')
plt.show()
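
For reference, a confusion matrix is a table of counts in which row i, column j holds the number of samples of true class i that were predicted as class j. A minimal NumPy sketch on made-up labels (the function name is hypothetical) shows the idea and how per-class recall follows from the rows:

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count (true, predicted) pairs into an n_classes x n_classes matrix."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label, predicted_label] += 1
    return matrix

# Made-up labels for three classes.
toy_true = [0, 0, 1, 1, 2, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0, 2]

toy_cm = confusion_counts(toy_true, toy_pred, n_classes=3)
# Diagonal entries are correct predictions; row sums are samples per true class.
recall_per_class = toy_cm.diagonal() / toy_cm.sum(axis=1)

print(toy_cm)
print(recall_per_class)
```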

Classification Report

The classification report evaluates the performance of a classification model class by class. Precision, recall, and F1 score show which classes the model predicts well and which need further improvement, while support gives the number of test samples in each class.







In [None]:
from sklearn.metrics import classification_report

y_pred = model.predict(X_test_img)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = X_test_img.classes

print("\nClassification Report:")
print(classification_report(y_true, y_pred_classes, 
                          target_names=X_test_img.class_indices.keys()))
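
To make the report's columns concrete, the sketch below computes precision, recall, and F1 for a single class from raw counts. The counts are invented and the function name is hypothetical; it assumes the denominators are nonzero.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 for one class from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Invented counts: 80 correct hits, 20 false alarms, 10 misses.
precision, recall, f1 = precision_recall_f1(80, 20, 10)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```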