# Deep vs Shallow Networks

Neural networks can be broadly categorized as:
- **Shallow Networks**: Only 1 or 2 hidden layers.
- **Deep Networks**: Many hidden layers (can be dozens or even hundreds).

### Key Differences
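1. **Representation Power**:
   - Shallow networks can in principle approximate any continuous function, but may need an impractically wide hidden layer to do so.
   - Deep networks build complex features by composing simpler ones layer by layer, which often takes far fewer units for the same function.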

2. **Training Time**:
   - Shallow = usually faster to train, with fewer parameters when layer widths are comparable.
   - Deep = slower to train and harder to optimize, but more expressive.

3. **Applications**:
   - Shallow = simple tasks (basic classification, regression).
   - Deep = advanced tasks (image recognition, NLP, speech).
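
The representation-power point above can be made concrete with a small pure-Python sketch. It composes a "tent" function, written as a tiny ReLU layer with two units, several times in a row and counts how many linear pieces the result has on [0, 1]. The names `tent`, `deep_sawtooth`, and `count_linear_pieces` are illustrative helpers, not part of any library, and the construction is just one example chosen for clarity, not a statement about how real networks are trained.

```python
def relu(x):
    return max(0.0, x)

def tent(x):
    # One hidden layer with two ReLU units:
    # rises from 0 to 1 on [0, 0.5], then falls back to 0 on [0.5, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Stack `depth` copies of the tent layer (2 * depth ReLU units in total).
    for _ in range(depth):
        x = tent(x)
    return x

def count_linear_pieces(f, n_grid=1024):
    # Sample f on a dyadic grid over [0, 1] and count slope sign changes.
    xs = [i / n_grid for i in range(n_grid + 1)]
    ys = [f(x) for x in xs]
    slopes = [ys[i + 1] - ys[i] for i in range(n_grid)]
    changes = sum(1 for a, b in zip(slopes, slopes[1:]) if (a > 0) != (b > 0))
    return changes + 1

for depth in (1, 2, 3, 4):
    pieces = count_linear_pieces(lambda x: deep_sawtooth(x, depth))
    print(f"depth={depth}  relu_units={2 * depth}  linear_pieces={pieces}")
# depth=1  relu_units=2  linear_pieces=2
# depth=2  relu_units=4  linear_pieces=4
# depth=3  relu_units=6  linear_pieces=8
# depth=4  relu_units=8  linear_pieces=16
```

The number of pieces doubles with every added layer, while the unit count grows only linearly. A one-hidden-layer ReLU network on a scalar input has at most one more linear piece than it has hidden units, so matching depth 4 here would take at least 15 units in a single layer, and the gap widens exponentially with depth.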

In [1]:
import tensorflow as tf
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense

# Example: Shallow network (1 hidden layer), binary classifier on 20 input features
shallow_model = Sequential([
    Input(shape=(20,)),              # explicit input; newer Keras versions warn about input_shape on layers
    Dense(16, activation='relu'),    # the single hidden layer
    Dense(1, activation='sigmoid')   # output: probability of the positive class
])

# Example: Deep network (multiple hidden layers), same input and output
deep_model = Sequential([
    Input(shape=(20,)),
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(16, activation='relu'),
    Dense(1, activation='sigmoid')
])

In [2]:
print("Shallow model summary:")
shallow_model.summary()

print("\nDeep model summary:")
deep_model.summary()

Shallow model summary:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 16)                336       
 dense_1 (Dense)             (None, 1)                 17        
Total params: 353
Trainable params: 353
Non-trainable params: 0
_________________________________________________________________

Deep model summary:
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_2 (Dense)             (None, 64)                1344      
 dense_3 (Dense)             (None, 64)                4160      
 dense_4 (Dense)             (None, 32)                2080      
 dense_5 (Dense)             (None, 16)                528       
 dense_6 (Dense)             (None, 1)                 17        
Total params: 8,129
Trainable params: 8,129
Non-trainable params: 0
_________________________________________________________________
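
The parameter counts in the two summaries can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below does that arithmetic without TensorFlow; `dense_param_count` is a hypothetical helper written for this check, not a Keras function.

```python
def dense_param_count(layer_sizes):
    # layer_sizes lists the input size followed by each Dense layer's unit count.
    # Each layer contributes (n_in * n_out) weights + n_out biases.
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

shallow_params = dense_param_count([20, 16, 1])
deep_params = dense_param_count([20, 64, 64, 32, 16, 1])

print(shallow_params)  # 353
print(deep_params)     # 8129
```

Both totals match the `Total params` lines reported by `summary()`. Note that most of the deep model's parameters come from its wider layers (64 units) rather than from depth alone.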

✅ **Summary**:
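- Shallow networks are easier and cheaper to train, but may need far more units to represent complex functions.
- Deep networks are harder to train but much more expressive for a given number of units.
- In practice, **deep networks usually outperform shallow ones** on large-scale problems, provided there is enough data and compute to train them.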