<a href="https://colab.research.google.com/github/Blueorchid1711/ECE/blob/main/Covid_19.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("pranavraikokte/covid19-image-dataset")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/pranavraikokte/covid19-image-dataset?dataset_version_number=2...


100%|██████████| 158M/158M [00:06<00:00, 24.0MB/s]

Extracting files...





Path to dataset files: /root/.cache/kagglehub/datasets/pranavraikokte/covid19-image-dataset/versions/2


# Task
Build a CNN model with TensorFlow for a hackathon using the COVID-19 image dataset (`pranavraikokte/covid19-image-dataset`) downloaded from Kaggle above.

## Load and preprocess the data

### Subtask:
Load the image data downloaded from Kaggle and prepare it for training the CNN model. This includes tasks like splitting the data into training and validation sets, resizing images, and applying data augmentation.


**Reasoning**:
Import necessary libraries and define constants for image loading and preprocessing.



In [None]:
import os

import tensorflow as tf

IMG_HEIGHT = 128
IMG_WIDTH = 128
BATCH_SIZE = 32

# Build the split directories from the `path` returned by kagglehub above
# instead of hard-coding the cache location.
data_root = os.path.join(path, 'Covid19-dataset')
train_dir = os.path.join(data_root, 'train')
test_dir = os.path.join(data_root, 'test')
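
`image_dataset_from_directory` infers one class per subdirectory, so it can help to count the files in each class folder before loading anything. The following is a minimal sketch using only the standard library. The helper name `count_images_per_class`, the extension set, and the demo folder names are assumptions made for illustration; they are not part of the notebook or the dataset.

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of image types


def count_images_per_class(split_dir):
    """Return {class_folder_name: number_of_image_files} for one split."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# Demo on a throwaway directory with hypothetical class names.
with tempfile.TemporaryDirectory() as tmp:
    for name, n_files in [("class_a", 3), ("class_b", 2)]:
        class_dir = Path(tmp) / name
        class_dir.mkdir()
        for i in range(n_files):
            (class_dir / f"img_{i}.png").touch()
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

In the notebook the same helper could be called as `count_images_per_class(train_dir)` to check whether the classes are balanced.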

**Reasoning**:
Load the training and validation datasets from the specified directories using `image_dataset_from_directory` and apply a validation split.



In [None]:
train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE
)

val_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE
)

Found 251 files belonging to 3 classes.
Using 201 files for training.
Found 251 files belonging to 3 classes.
Using 50 files for validation.
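
The file counts in this output can be checked by hand. The sketch below assumes the validation count is the integer part of `validation_split * total_files`, which reproduces the 201/50 split reported above. It also derives the number of batches per epoch, which should match the step counts shown in the training and evaluation progress bars later in the notebook.

```python
import math

total_files = 251        # "Found 251 files" in the output above
validation_split = 0.2
batch_size = 32

# Assumption: the validation count is the floor of split * total.
val_files = int(validation_split * total_files)
train_files = total_files - val_files

# The last batch may be smaller than batch_size, hence the ceiling.
train_batches = math.ceil(train_files / batch_size)
val_batches = math.ceil(val_files / batch_size)

print(train_files, val_files)       # 201 50
print(train_batches, val_batches)   # 7 2
```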


**Reasoning**:
Apply data augmentation to the training dataset only. Rescaling to the [0, 1] range is left to the `Rescaling` layer at the start of the model, so it is not applied a second time in the input pipeline.



In [None]:
data_augmentation = tf.keras.Sequential(
  [
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
  ]
)

# Augment the training images only. The model's own Rescaling(1./255) layer
# handles normalization, so pixels are not divided by 255 twice.
train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
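
A `Rescaling(1./255)` step maps 8-bit pixel values into [0, 1]. It should run exactly once on the path from image file to first convolution: the model defined later in this notebook starts with its own `Rescaling` layer, so a second rescale in the input pipeline would shrink the inputs to the range [0, 1/255]. The plain-Python sketch below illustrates the effect; the `rescale` helper and the pixel values are hypothetical stand-ins, not Keras code.

```python
def rescale(pixels, scale=1.0 / 255):
    """Multiply every pixel value by `scale`, as a rescaling step would."""
    return [p * scale for p in pixels]


raw_pixels = [0, 64, 128, 255]   # hypothetical 8-bit intensities

once = rescale(raw_pixels)       # values spread over [0, 1]
twice = rescale(once)            # values squeezed into [0, 1/255]

print(max(once))
print(max(twice))
```

After the second rescale the brightest pixel is below 0.004, so the network sees almost no contrast between images.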



**Reasoning**:
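Configure the datasets for performance. Both datasets are prefetched, but only the validation set is cached: calling `cache()` after the random augmentation would store the first epoch's augmented images and replay them unchanged in every later epoch.



In [None]:
AUTOTUNE = tf.data.AUTOTUNE

# No cache() on train_ds, so the random augmentation is re-applied each epoch.
train_ds = train_ds.prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)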
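
One caveat with `cache()` is its position relative to random transformations: anything computed before the cache is stored after the first full pass and replayed afterwards. The sketch below shows this on a toy dataset, where a random number stands in for a random augmentation inside `map()`. The dataset and variable names are made up for the illustration.

```python
import tensorflow as tf

# Toy dataset: each element is replaced by a fresh random number,
# standing in for a random augmentation applied inside map().
random_ds = tf.data.Dataset.range(4).map(lambda _: tf.random.uniform([]))

# cache() placed AFTER the random map.
cached_ds = random_ds.cache()

first_pass = [float(v) for v in cached_ds]
second_pass = [float(v) for v in cached_ds]

# The first full pass fills the cache, so the second pass replays it.
print(first_pass == second_pass)
```

For an augmented training set this means the model would see the same augmented images in every epoch, which removes most of the benefit of augmentation.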

## Build the CNN model

### Subtask:
Define the architecture of the Convolutional Neural Network using TensorFlow. This involves adding layers such as convolutional layers, pooling layers, and dense layers.


**Reasoning**:
Import the necessary layers from tensorflow.keras.layers and define the CNN model architecture.



In [None]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Rescaling

num_classes = len(train_ds.class_names)

model = tf.keras.Sequential([
    Rescaling(1./255, input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    Conv2D(16, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(num_classes, activation='softmax')
])

AttributeError: '_PrefetchDataset' object has no attribute 'class_names'

**Reasoning**:
The previous cell failed because `class_names` exists only on the dataset object returned by `image_dataset_from_directory`. Every later transformation (`map()`, `cache()`, `prefetch()`) returns a new dataset object without that attribute, which is why the error mentions `_PrefetchDataset`. I will reload the untransformed training dataset and read `class_names` from it.



In [None]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Rescaling

# Reload the untransformed dataset, which still carries class_names
original_train_ds = tf.keras.utils.image_dataset_from_directory(
    train_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=BATCH_SIZE
)
num_classes = len(original_train_ds.class_names)

model = tf.keras.Sequential([
    Rescaling(1./255, input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    Conv2D(16, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(num_classes, activation='softmax')
])

Found 251 files belonging to 3 classes.
Using 201 files for training.
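
To get a feel for the size of this architecture, the shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used in the cell above (3x3 kernels with `padding='same'`, 2x2 max pooling that halves height and width, bias terms on every layer) and the 3 classes reported in the output. The helper functions are illustrative and not part of Keras.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units


height, width, channels = 128, 128, 3   # IMG_HEIGHT, IMG_WIDTH, RGB
num_classes = 3
total_params = 0

# Three Conv2D + MaxPooling2D stages with 16, 32, and 64 filters.
for filters in (16, 32, 64):
    total_params += conv2d_params(3, channels, filters)
    channels = filters
    height //= 2   # 2x2 pooling halves each spatial dimension
    width //= 2

flat_units = height * width * channels          # size after Flatten
total_params += dense_params(flat_units, 128)   # hidden Dense layer
total_params += dense_params(128, num_classes)  # softmax output layer

print(flat_units)     # 16384
print(total_params)   # 2121251
```

Almost all of the parameters sit in the first `Dense` layer (16384 x 128 weights), which is worth keeping in mind with only 201 training images.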




## Compile the model

### Subtask:
Configure the model for training by specifying the optimizer, loss function, and metrics.


**Reasoning**:
Compile the defined CNN model with the specified optimizer, loss function, and metrics.



In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])
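
Sparse categorical cross-entropy takes an integer label and the predicted probability vector, and returns the negative log of the probability assigned to the true class (Keras then averages this over the batch). The following is a plain-Python sketch of the per-sample formula, not the Keras implementation; the probability vectors are hypothetical.

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Per-sample loss: negative log-probability of the true class."""
    return -math.log(probs[label])


# A model that has learned nothing predicts 1/3 for each of the 3 classes.
chance_loss = sparse_categorical_crossentropy([1 / 3, 1 / 3, 1 / 3], label=0)

# A confident, correct prediction gives a much smaller loss.
confident_loss = sparse_categorical_crossentropy([0.05, 0.90, 0.05], label=1)

print(round(chance_loss, 4))      # 1.0986
print(round(confident_loss, 4))   # 0.1054
```

The chance-level value ln(3) ≈ 1.0986 is a useful reference point: a loss close to it means the model is still predicting roughly uniformly over the three classes.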

## Train the model

### Subtask:
Train the CNN model using the prepared training data and validate its performance on the validation set.


**Reasoning**:
Train the compiled model using the training and validation datasets for a specified number of epochs and store the training history.



In [None]:
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)

Epoch 1/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 10s 1s/step - accuracy: 0.4567 - loss: 1.0968 - val_accuracy: 0.4800 - val_loss: 1.0601
Epoch 2/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 4s 405ms/step - accuracy: 0.4459 - loss: 1.0757 - val_accuracy: 0.4800 - val_loss: 1.0591
Epoch 3/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 3s 414ms/step - accuracy: 0.4459 - loss: 1.0700 - val_accuracy: 0.4800 - val_loss: 1.0498
Epoch 4/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 4s 520ms/step - accuracy: 0.4459 - loss: 1.0580 - val_accuracy: 0.4800 - val_loss: 1.0279
Epoch 5/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 3s 445ms/step - accuracy: 0.4459 - loss: 1.0244 - val_accuracy: 0.6600 - val_loss: 0.9695
Epoch 6/10
7/7 ━━━━━━━━━━━━━━━━━━━━ 3s 396ms/step - accuracy: 0.5613 - loss: 0.9324 - val_accuracy: 0.8000 - val_loss: 0.8357
Epoch 7/10
7/7 ━━━━━━━━━━━━━━ [output truncated]

## Evaluate the model

### Subtask:
Assess the performance of the trained model using appropriate metrics.


**Reasoning**:
Use the evaluate method of the trained model with the validation dataset to calculate the loss and accuracy, then print the results.



In [None]:
loss, accuracy = model.evaluate(val_ds)
print(f"Validation Loss: {loss}")
print(f"Validation Accuracy: {accuracy}")

2/2 ━━━━━━━━━━━━━━━━━━━━ 1s 133ms/step - accuracy: 0.8888 - loss: 0.5447
Validation Loss: 0.5694524645805359
Validation Accuracy: 0.8799999952316284
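
Overall accuracy hides how the errors are distributed across the three classes, which matters for a medical image task. A confusion matrix and per-class recall make that visible. The sketch below is plain Python with made-up labels and predictions; the function names and the data are assumptions for illustration and do not come from the trained model.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]


# Hypothetical labels and predictions for 8 samples and 3 classes.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
recall = per_class_recall(cm)
demo_accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)

print(cm)              # [[2, 1, 0], [0, 2, 0], [1, 0, 2]]
print(demo_accuracy)   # 0.75
```

With real data, `y_true` and `y_pred` would be collected from `val_ds` (or the test set) by taking the arg-max of the model's softmax outputs.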


## Summary:

### Data Analysis Key Findings

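*   The CNN model reached a validation accuracy of 88% (44 of 50 images) and a validation loss of about 0.57 after training for 10 epochs. These figures come from the validation split of the training folder; the images in `test_dir` were not used.
*   In the visible part of the training log (epochs 1 to 6), training accuracy stayed near 45% for the first five epochs and rose to 56% in epoch 6, while validation accuracy went from 48% to 80% and validation loss fell from 1.06 to 0.84.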

### Insights or Next Steps

*   Consider training the model for more epochs or implementing early stopping to potentially improve performance and prevent overfitting.
*   Explore more advanced CNN architectures or transfer learning techniques to further enhance the model's accuracy.
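
For the early-stopping suggestion, Keras provides `tf.keras.callbacks.EarlyStopping` (with parameters such as `monitor`, `patience`, and `restore_best_weights`), passed to `model.fit` through its `callbacks` argument. The plain-Python sketch below illustrates only the underlying patience rule and is a simplification, not the Keras implementation. The first loss list is copied from the visible training log; the second is made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# val_loss for epochs 1 to 6, taken from the training log above.
logged = [1.0601, 1.0591, 1.0498, 1.0279, 0.9695, 0.8357]

# Made-up curve that stops improving after epoch 2.
hypothetical = [0.90, 0.80, 0.85, 0.83, 0.82, 0.81]

print(early_stopping_epoch(logged, patience=3))         # None
print(early_stopping_epoch(hypothetical, patience=3))   # 5
```

On the logged values the rule never triggers because validation loss improves every epoch, which supports training for longer before stopping.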
