Your goal is to develop a Convolutional Neural Network (CNN) that achieves the lowest possible validation loss on the CIFAR-10 dataset.
| Validation Loss Range | Points |
|---|---|
| < 0.63 | 10 pts |
| 0.63 – 0.70 | 9 pts |
| 0.70 – 0.80 | 8 pts |
| 0.80 – 0.90 | 7 pts |
| 0.90 – 1.30 | 6 pts |
| > 1.30 | Fail |
Expect a deep dive into these topics during the review. Use notes or code comments if needed, but ensure you actually understand the logic:
-
Data Prep: What is One-Hot Encoding and why do we use it for labels?
-
Optimization: * Batch Size: What value did you pick? What are the pros/cons of very small vs. very large batches?
-
Learning Rate: How does this affect convergence?
-
Callbacks: Which ones did you trigger (e.g., EarlyStopping, ModelCheckpoint)? Explain their parameters.
-
Architecture Details:
-
Layers: Be ready to explain only the layers you actually used (e.g.,
Conv2D,BatchNormalization,MaxPooling2D,Dropout,Dense). -
Activation Functions: Which ones (ReLU, Softmax, etc.) and why?
-
Weights: How were they initialized? How does the program decide which weights are the "final" best version?
-
Metrics & Analysis:
-
Loss vs. Accuracy: What is the fundamental difference between
val_lossand accuracy? -
Confusion Matrix: * What do the X and Y axes represent?
-
What does the diagonal tell you?
-
What do the off-diagonal numbers signify regarding model errors?
Would you like me to help you draft a high-performance CNN architecture in Python that targets that < 0.63 loss threshold?