<a href="https://colab.research.google.com/github/ayushi-gajendra/SkinShots_AI_powered_skincare_platform/blob/tensorflow-model/Model_for_SkinShots.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Model for our SkinShots Website**
We are creating a Multiclass Classification Neural Network Model for our SkinShots project.

### **Key Concepts & Steps Covered:**

1. **Understanding the Data**

* We are starting with our **Skin** dataset.
* We define **features and labels**.
* We inspect it (shape, number of classes, etc.).

2. **Preprocessing**

* We **scale/normalize** data, if needed.
* We possibly **encode categorical labels**.
* We split it into **training, validation, test** sets.

3. **Building the Model**

* We define a neural network architecture using **tf.keras.Sequential** .
* Layers usually include **input** layer (or specifying input shape), **hidden** layers (Dense, with activation like ReLU), and **output** layer (with activation appropriate for classification, softmax for multi-class).

4. **Compile the Model**

* We specify **loss** function (e.g. sparse_categorical_crossentropy or binary_crossentropy depending on setup).
* Specify **optimizer** (e.g. Adam).
* **Metrics** (like accuracy) to monitor.

5. **Training**

* **Fit** the model on **training data.**
* We use **validation data** to see how well the model is generalizing.
* We use **callbacks** or track **history.**

6. **Evaluation**

* Evaluate on **test data** to see final performance.
* Check metrics, possibly **confusion matrix** or other **classification metrics**.

7. **Prediction**

* Use the model to **predict new/unseen data.**
* We possibly inspect **how certain it is** (softmax probabilities, etc.).



---



## 1. **Understanding the Data**

* The data is in our **Skin** Directory - with 5 subfolders representing **5 classes**
* The 5 classes are - **Acne, Blackheads, Dark Spots, Pores, Wrinkles**


## 2. **Preprocessing the Data**

* We will first split the data into - **Training, Validation** and **Test** sets



In [1]:
import tensorflow as tf

# Getting the Training data (70%)

train_ds = tf.keras.utils.image_dataset_from_directory(
    "/content/drive/MyDrive/Skin",
    image_size=(256,256),
    batch_size=32,
    label_mode="categorical",
    validation_split=0.3,
    subset="training",
    seed=42)

train_ds

Found 8182 files belonging to 5 classes.
Using 5728 files for training.


<_PrefetchDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 5), dtype=tf.float32, name=None))>

Why it shows `None` instead of our batch size?

The dataset object (tf.data.Dataset) is abstract, it doesn’t know the batch size at the metadata level.

None means “variable dimension” — TensorFlow is leaving it flexible, because the last batch might not always be exactly 32.

Example: if our dataset had 5728 images, dividing by 32 gives 179.0 batches exactly.

If it were 5730 images, the last batch would only have 2 images.

So TensorFlow shows None to indicate “batch dimension depends on runtime”.

In [2]:
# Getting Validation and Testing data

temp_ds = tf.keras.utils.image_dataset_from_directory(
    "/content/drive/MyDrive/Skin",
    image_size=(256,256),
    batch_size=32,
    label_mode="categorical",
    validation_split=0.3,
    subset="validation",
    seed=42
)

# Splitting temp (30%) data into Validation (15%) and Testing (15%) data

total_batches = tf.data.experimental.cardinality(temp_ds).numpy()
temp_batches = int(0.5 * total_batches)
val_ds = temp_ds.take(temp_batches)
test_ds = temp_ds.skip(temp_batches)

val_ds, test_ds

Found 8182 files belonging to 5 classes.
Using 2454 files for validation.


(<_TakeDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 5), dtype=tf.float32, name=None))>,
 <_SkipDataset element_spec=(TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None, 5), dtype=tf.float32, name=None))>)

#### 👉🏻 **Theory:**

* When you use **image_dataset_from_directory()** (or other dataset creation functions), you get a **tf.data.Dataset** object.

* A tf.data.Dataset is an **iterator-like pipeline**, not a static list. So you can’t just call len(dataset).

* To know how many elements (batches) are inside a dataset, TensorFlow provides the **tf.data.experimental.cardinality() function**.

* **Cardinality** means “number of elements.”

* Here, **each element** = **1 batch** (not individual images).

* The result is a **tf.Tensor** containing the **count**.

* **.numpy()** converts that tensor to a Python integer.

* 🔹 **Syntax:** `tf.data.experimental.cardinality(dataset)`


* 🔹 **Parameters:** dataset: a tf.data.Dataset object.

* 🔹 **Returns:** A scalar tf.Tensor with the number of elements (batches).



* **take()** and **skip()** are functional dataset transformations → they let us **slice datasets** into **non-overlapping subsets**.
* 🔹 **Syntax:**

* * ` dataset.take(n) → first n batches` : this becomes the validation set


* * ` dataset.skip(n) → everything after the first n batches` : this becomes the testing set

## **Further understanding the data**



### 1a. ***Features & Labels***

Each **dataset** element is a **tuple**: (images, labels)

**Features** (X): The **input data** we feed into our model.

* In our skin project → the images (pixels).

* Data type → tf.float32 (pixel values, usually scaled 0–255 or normalized to 0–1)


**Labels** (y): The **target/output** the model is trying to predict.

* In our project → the skin condition **class** (Acne, Blackheads, Dark Spots, Pores, Wrinkles).

* Because we used label_mode = "categorical", **labels are one-hot encoded vectors**.


### 1b. ***Inspect the dataset***

* Check number of classes and names.
* Check the shapes of one batch.


In [3]:
class_names = train_ds.class_names
print("Class Name:", class_names)

Class Name: ['acne', 'blackheades', 'dark spots', 'pores', 'wrinkles']


In [4]:
print("Number of classes:", len(class_names))

Number of classes: 5


In [5]:
for images, labels in train_ds.take(1):
  print("Image batch shape:", images.shape)
  print("Label batch shape:", labels.shape)

Image batch shape: (32, 256, 256, 3)
Label batch shape: (32, 5)


**Features** (X):

* **Shape:** `(batch_size, height, width, channels)`

* (32, 256, 256, 3) → 32 RGB images, each 256×256.

**Labels** (y):

* **Shape**: `(batch_size, num_classes)`

* (32, 5) → 32 labels, each a vector like [0,0,1,0,0] (meaning "Dark Spots").

## 3. **Building the model**

## 4. **Comipling the model**


## 5. **Training the model**

In [None]:
# Creating the model - CNN model

model = tf.keras.Sequential([

    # 1st convulation + pooling
    tf.keras.layers.Conv2D(filters=30, kernel_size=(3,3), activation="relu", input_shape=(256,256,3)),
    tf.keras.layers.MaxPooling2D(2,2),

    # 2nd convulation + pooling
    tf.keras.layers.Conv2D(filters=60, kernel_size=(3,3), activation="relu"),
    tf.keras.layers.MaxPooling2D(2,2),

    # Flatten feature maps → Dense layers
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax")
])

# Compiling the model
model.compile(loss = tf.keras.losses.CategoricalCrossentropy,
              optimizer = tf.keras.optimizers.Adam(),
              metrics=["accuracy"])

# Training/fitting the model
model.fit(train_ds, validation_data= val_ds, epochs=5)

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/5
[1m 67/179[0m [32m━━━━━━━[0m[37m━━━━━━━━━━━━━[0m [1m9:21[0m 5s/step - accuracy: 0.3676 - loss: 855.0195

### **Why we choose a CNN model for images**

* **Images** are not just random numbers — they have **spatial structure**(neighboring pixels form edges, textures, patterns).

* A normal **Dense** neural net:
* * Would need millions of parameters to handle a 256×256×3 image.
* * Ignores pixel positions (treats them all the same).

* **CNNs** are **designed to handle images** because they:

* * Look at small regions at a time (**local patterns**).

* * Reuse filters across the whole image (**fewer parameters**).

* * Build up from **edges → textures → shapes → objects**.

### **Why we use each part of a CNN model**

1. **Input Layer**

* Why we need it:
* * The image comes in as (256, 256, 3) (height, width, RGB channels).

* Why not flatten immediately?
* * If we flatten right away, we lose the spatial relationships (neighboring pixels that form edges/spots).

2. **Convolutional Layers (Conv2D)**

* Why:
* * Instead of learning weights for every pixel separately, Conv2D uses filters/kernels that slide across the image, learning local patterns (edges, pores, spots).

* * Looks at small patches of the image (like 3×3 pixels).

* Why not Dense from the start?
* * Dense layers treat each pixel independently → too many parameters (millions!) and no sense of location.

* Benefit: Much fewer parameters, captures spatial features.

3. **Activation Function (ReLU)**

* Why:
* * Without non-linearity, the network is basically just doing linear transformations.

* * ReLU (f(x)=max(0,x)) is simple and prevents vanishing gradients (a common training problem).

* * Just turns all negative numbers into 0. This helps the network learn faster and not get stuck.

* Why not Sigmoid/Tanh?
* * They squish values into small ranges, making deep networks hard to train.
* * ReLU is faster and works better for images.

4. **Pooling Layers (MaxPooling2D)**

* Why:
* * Images are huge, and we don’t need every pixel.
* * Pooling reduces size while keeping the most important info (like “was there a spot in this region?”).

* * Shrinks the image while keeping the most important part.

* * Example: instead of remembering every pixel of a spot, it just remembers “there was a strong spot here”.

* Why MaxPooling and not AveragePooling?

* * MaxPooling keeps the strongest signal (the most activated feature). AveragePooling can blur/lose sharp features.

* Benefit: Smaller feature maps → fewer computations, more robust to small shifts (if a spot moves slightly, the model still detects it).

5. **Flatten**

* Why:
* * After convolution + pooling, we have a 3D feature map (height × width × channels).

* * Flatten converts it into a 1D vector so we can feed it into Dense layers.

* Why not keep it 3D? Dense layers only work with 1D input.

6. **Dense (Fully Connected) Layers**

* Why:
* * These layers combine all extracted features and “decide” what class the image belongs to.

* * Think of them as the classifier that sits on top of the feature extractor.

* * Think of this part as the decision maker: “based on these features, is it acne, pores, or wrinkles?”

* Why not only Conv layers?
* * Conv layers are great at feature extraction, but Dense layers are good at combining them for final decisions.

7. **Output Layer (Dense with Softmax)**

* Why Softmax?
* * Converts the raw numbers (logits) into probabilities across your 5 classes.

* * Ensures they add up to 1, so you can interpret it as: “70% Acne, 20% Dark Spot, 10% Pores”.

* Why not Sigmoid?
* * Sigmoid is for binary classification (yes/no).
* * For multiclass, we need Softmax.

**Dropout (optional, but often used)**

Why: Prevents overfitting by randomly turning off some neurons during training.

This forces the model to learn robust patterns, not memorize the training set.

Why not always? Too much dropout = underfitting.


✅ **So in simple terms:**

* Conv2D = feature finder (edges, spots).

* ReLU = makes learning faster.

* Pooling = keeps only important stuff.

* Flatten + Dense = final decision making.

* Dropout = avoids overfitting.

* Softmax = gives probabilities for each class.

CNN is chosen because **it’s built for image data.**

Inside CNN: each part **(Conv → ReLU → Pool → Dense → Softmax)** plays a role in extracting features, simplifying them, and finally making a classification.

### 5. **Evaluating the model**

In [None]:
model.summary()

In [None]:
model.evaluate(test_ds)