## Important: please read



I wanted to provide some context regarding my submission for the bird species classification task. Due to limited memory on my laptop, I was only able to generate predictions for a subset of the test set (the first 300 images). This is why the number of predictions does not currently match the number of ground truth labels.

I kindly request that you rerun the prediction step on the full test set, without the 300-image limit, so that the evaluation yields the final and complete output.

I apologize for any inconvenience this may have caused and greatly appreciate your understanding and assistance in this matter.



To tackle the bird species classification challenge, I use transfer learning with ResNet50, a 50-layer residual convolutional neural network pre-trained on the ImageNet dataset. This approach reuses the general-purpose visual features ResNet50 has already learned and adapts them to our specific classification task by training only a small set of new layers.

1. Utilization of Pre-Trained Model:

   - ResNet50 as a Feature Extractor: By loading ResNet50 without its top classification layers (include_top=False), we extract high-level features from images. With a 224x224x3 input, the model's final convolutional block outputs a 7x7x2048 feature map that captures complex patterns and textures.

2. Custom Model Enhancement:

   - Freezing the Pre-Trained Layers: To preserve the learned features of ResNet50, we set its trainable attribute to False. The weights of these layers then remain unchanged during training, so only the new layers on top are learned.
   - Global Average Pooling: We apply GlobalAveragePooling2D to average each feature map over its spatial dimensions, producing a single 2048-dimensional vector per image. This adds no trainable parameters and keeps the classifier head small, which helps limit overfitting.
   - Custom Dense Layers: A dense layer with 256 units and ReLU activation learns task-specific combinations of the pooled features. A dropout layer with a rate of 0.5 follows to mitigate overfitting by randomly dropping units during training. Finally, the output layer, with 200 units and softmax activation, provides class probabilities for the 200 bird species.

3. Model Compilation and Training:

   - Optimized Training: The model is compiled with the Adam optimizer, an adaptive-learning-rate optimizer that is a common default. Categorical crossentropy is the loss function, which suits multi-class classification and expects one-hot encoded labels. The main training hyperparameters to tune are the number of epochs and the batch size.
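To make the data flow through the custom head concrete, here is a minimal NumPy sketch of global average pooling, softmax, and categorical crossentropy. It is an illustration only: the feature map and logits are random stand-ins rather than real ResNet50 outputs, and `true_class = 17` is an arbitrary example label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the backbone output for one 224x224 image:
# a 7x7 spatial grid with 2048 channels (random values, illustration only).
feature_map = rng.normal(size=(7, 7, 2048))

# Global average pooling: average over the two spatial axes,
# leaving one value per channel.
pooled = feature_map.mean(axis=(0, 1))


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical raw scores for the 200 classes.
logits = rng.normal(size=200)
probs = softmax(logits)

# Categorical crossentropy against a one-hot target.
true_class = 17  # arbitrary example label
one_hot = np.zeros(200)
one_hot[true_class] = 1.0
loss = -np.sum(one_hot * np.log(probs))

print(pooled.shape)
print(float(loss))
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class.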

Prediction and Result Management

1. Prediction Processing:

   - Subset Selection: The prediction function processes only the first 300 images from the test dataset to stay within the available memory. As a result, the output contains fewer rows than the test set.
   - Image Preprocessing: Each image is loaded, resized to 224x224 pixels, and scaled to the [0, 1] range so the model receives consistent input. The bounding box is passed to load_image, but cropping to it is not implemented yet.

2. Results Output:

   - Prediction Generation: The model predicts class probabilities for each image. The class with the highest probability becomes the predicted label, and that probability is reported as the confidence score.
   - CSV Reporting: Results are compiled into a CSV file with image paths, predicted labels, and confidence scores. The first 10 rows are printed for a quick review.

This approach combines a pre-trained feature extractor with a small custom head, so only the new layers need training, which keeps the compute and memory cost of adapting to the 200-class task low.
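The label and confidence step can be sketched with NumPy and pandas as follows. The probabilities and file paths are made up for illustration. The sketch also contrasts `head`, which returns the first rows in file order, with `nlargest`, which returns the rows with the highest confidence.

```python
import numpy as np
import pandas as pd

# Made-up class probabilities for 4 images over 3 classes (illustration only).
probs = np.array([
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.05, 0.05, 0.90],
    [0.40, 0.35, 0.25],
])
# Hypothetical image paths.
paths = [f"images/test/img_{i}.jpg" for i in range(len(probs))]

result_df = pd.DataFrame({
    "path": paths,
    "predicted_label": probs.argmax(axis=1),   # index of the most likely class
    "confidence_score": probs.max(axis=1),     # probability of that class
})

# First rows in file order.
first_two = result_df.head(2)

# Rows with the highest confidence, regardless of file order.
most_confident = result_df.nlargest(2, "confidence_score")

print(first_two)
print(most_confident)
```

Note that `argmax` returns zero-based class indices; whether the evaluation expects zero-based or one-based labels is not stated in this notebook.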

In [None]:
from google.colab import drive
drive.mount('/content/drive')
!cp /content/drive/MyDrive/HV-AI-2024.zip /content/HV-AI-2024.zip
!unzip HV-AI-2024.zip
!rm -rf /content/__MACOSX
!mv /content/HV-AI-2024/* /content/
!rm -rf /content/HV-AI-2024
!rm /content/HV-AI-2024.zip
!rm -rf /content/sample_data

from google.colab import output
output.clear()


# **Load/Preprocess data**

In [None]:
import pandas as pd
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Load the train and test CSVs
train_df = pd.read_csv('train.csv')
test_df = pd.read_csv('test.csv')

# Function to load and preprocess images
def load_image(image_path, bbox):
    img = load_img(image_path)
    img_array = img_to_array(img)
    # You may add bounding box processing here if needed
    return img_array

# Example of loading a sample image
sample_img = load_image('images/train/200_11759.jpg', 'bbox_coords_here')
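The `load_image` function above leaves bounding-box handling as a placeholder. A minimal sketch of what the crop step could look like is shown here. It assumes the `bbox` value is a string of four whitespace-separated numbers in `x y width height` order; the notebook does not show the actual format, so adjust the parsing to match the CSV. The helper name `crop_to_bbox` is hypothetical.

```python
import numpy as np


def crop_to_bbox(img_array, bbox):
    """Crop an (H, W, C) array to a box given as 'x y width height'.

    The bbox format is an assumption; it is not confirmed by the dataset.
    """
    x, y, w, h = (int(float(v)) for v in str(bbox).split())
    height, width = img_array.shape[:2]

    # Clamp the box to the image bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)

    # Fall back to the full image if the clamped box is empty.
    if x1 <= x0 or y1 <= y0:
        return img_array
    return img_array[y0:y1, x0:x1]


# Small synthetic image: 10 rows, 8 columns, 3 channels.
image = np.arange(10 * 8 * 3).reshape(10, 8, 3)
cropped = crop_to_bbox(image, "2 3 4 5")
print(cropped.shape)  # (5, 4, 3)
```

Inside `load_image`, the crop would go between `img_to_array` and the return statement, so that the later resize to 224x224 operates on the bird region only.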


# **Model initialization/Training**

In [None]:
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

# ResNet50 backbone pre-trained on ImageNet, without its classification head
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # freeze the pre-trained weights

# Classification head for the 200 bird species
model = Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(200, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])




# **Model Inference**

In [None]:
import pandas as pd
import tensorflow as tf


def predict_and_save_csv(model, test_data, output_csv_path):
    """Predict labels for the first 300 test images and save them to a CSV file."""
    predictions = []

    # Only the first 300 rows are processed because of memory limits
    test_data_subset = test_data.iloc[:300]

    for _, row in test_data_subset.iterrows():
        img_path = row['path']
        bbox = row['bbox']

        # Load the image, resize it to the model's input size, and scale to [0, 1]
        img_array = load_image(img_path, bbox)
        img_array = tf.image.resize(img_array, [224, 224])
        img_array = img_array / 255.0

        # Add a batch dimension and predict class probabilities
        pred = model.predict(tf.expand_dims(img_array, axis=0))
        predicted_label = int(tf.argmax(pred, axis=1).numpy()[0])
        confidence_score = float(tf.reduce_max(pred).numpy())

        predictions.append([img_path, predicted_label, confidence_score])

    result_df = pd.DataFrame(predictions, columns=['path', 'predicted_label', 'confidence_score'])
    result_df.to_csv(output_csv_path, index=False)
    print(f"Results saved to {output_csv_path}")

    print(f"First 10 rows of {output_csv_path}:")
    print(result_df.head(10))


predict_and_save_csv(model, test_df, 'predictions.csv')


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 5s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 323ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 192ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 196ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 198ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 203ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 192ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 198ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 205ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 195ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 194ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 189ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 191ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1
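Since memory was the reason for stopping at 300 images, one possible way to cover the whole test set is to process it in fixed-size batches and write each batch's rows to the CSV before loading the next. The sketch below is an assumption about how that could be done, not the method used in this notebook. The names `predict_in_batches`, `fake_load`, and `fake_predict` are hypothetical, and the stand-ins replace the real image loader and model so the example runs without TensorFlow.

```python
import csv
import os
import tempfile

import numpy as np
import pandas as pd


def predict_in_batches(predict_fn, load_fn, test_data, output_csv_path, batch_size=32):
    """Write one CSV row per test image, holding only one batch in memory."""
    with open(output_csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "predicted_label", "confidence_score"])
        for start in range(0, len(test_data), batch_size):
            chunk = test_data.iloc[start:start + batch_size]
            # load_fn must return arrays of identical shape so they can be stacked.
            batch = np.stack([load_fn(p, b) for p, b in zip(chunk["path"], chunk["bbox"])])
            probs = np.asarray(predict_fn(batch))
            for path, row in zip(chunk["path"], probs):
                writer.writerow([path, int(row.argmax()), float(row.max())])
    return len(test_data)


def fake_load(path, bbox):
    """Stand-in loader: returns a tiny blank image instead of reading a file."""
    return np.zeros((8, 8, 3), dtype=np.float32)


def fake_predict(batch):
    """Stand-in model: always assigns probability 0.5 to class 3."""
    probs = np.full((len(batch), 200), 0.5 / 199)
    probs[:, 3] = 0.5
    return probs


# 70 hypothetical test rows, so the last batch is only partly full.
demo_df = pd.DataFrame({
    "path": [f"images/test/{i}.jpg" for i in range(70)],
    "bbox": ["0 0 10 10"] * 70,
})
demo_csv = os.path.join(tempfile.mkdtemp(), "predictions_demo.csv")
n_rows = predict_in_batches(fake_predict, fake_load, demo_df, demo_csv, batch_size=32)
written = pd.read_csv(demo_csv)
print(n_rows, len(written))
```

With the real model, `predict_fn` could be something like `lambda batch: model.predict(batch, verbose=0)`, and `load_fn` would need to return images already resized to 224x224 and normalized. Predicting a batch at a time also avoids one `model.predict` call per image.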

# **Helper Functions**

In [None]:
import requests

def send_results_for_evaluation(name, csv_file, email):
    """Upload the predictions CSV to the evaluation server and return its JSON reply."""
    url = "http://43.205.49.236:5050/inference"
    data = {'email': email, 'name': name}

    # The context manager closes the file even if the request raises
    with open(csv_file, 'rb') as f:
        response = requests.post(url, files={'file': f}, data=data)

    return response.json()

response = send_results_for_evaluation('Veda Pranav Guduri', 'predictions.csv', 'pranav.21bce8931@vitapstudent.ac.in')
print(response)


{'error': 'Number of predictions do not match the number of ground truth labels'}


# ***Test Inference***


The cell below sends the saved CSV file to the evaluation server.

Format of CSV file (Follow the header names strictly):

        path (str)              predicted_label(int)   confidence_score(float)
    images/test/xx.jpg                  1                         0.6
    images/test/yy.jpg                  2                         0.9
            :                           :                          :
            :                           :                          :

Once the prediction file is saved as shown in the above format, you can send it to the evaluation server along with your email.

Caution: check your **email** before executing the cell.
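Before submitting, it can help to check the CSV locally against the format described above. The following is a minimal sketch of such a check. The server's actual validation rules are not documented in this notebook, so the checks only mirror the header format above and the row-count error message. The function name `check_predictions` and the sample paths are hypothetical.

```python
import pandas as pd

REQUIRED_COLUMNS = ["path", "predicted_label", "confidence_score"]


def check_predictions(pred_df, expected_paths):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if list(pred_df.columns) != REQUIRED_COLUMNS:
        problems.append(f"unexpected columns: {list(pred_df.columns)}")
        return problems
    if len(pred_df) != len(expected_paths):
        problems.append(f"{len(pred_df)} predictions for {len(expected_paths)} test images")
    missing = set(expected_paths) - set(pred_df["path"])
    if missing:
        problems.append(f"{len(missing)} test images have no prediction")
    if not pred_df["confidence_score"].between(0.0, 1.0).all():
        problems.append("confidence_score outside [0, 1]")
    return problems


# Hypothetical test set of 5 images with predictions for only the first 3.
test_paths = [f"images/test/{i}.jpg" for i in range(5)]
partial = pd.DataFrame({
    "path": test_paths[:3],
    "predicted_label": [1, 2, 3],
    "confidence_score": [0.6, 0.9, 0.4],
})
problems = check_predictions(partial, test_paths)
print(problems)
```

In this notebook the expected paths would come from `test_df['path']`, and the predictions from `pd.read_csv('predictions.csv')`.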


In [None]:

print('Accuracy: ')
print(send_results_for_evaluation('name', '/content/predictions.csv', 'your_email'))




Accuracy: 
{'error': 'Number of predictions do not match the number of ground truth labels'}
