Cross-entropy is a loss (cost) function that measures how well a set of predicted probabilities P matches the true labels Y.

For binary classification (labels 0 or 1) the formula is

L(Y,P) = - Σ [ Y · log P + (1-Y) · log (1-P) ]

where
· Y is the true label (1 = positive, 0 = negative)
· P is the model's predicted probability that the label is 1
· The sum Σ runs over all samples.

Key points
· If the prediction is nearly perfect (P ≈ 1 when Y = 1, or P ≈ 0 when Y = 0) the loss is near 0.
· Confident wrong predictions (e.g. P ≈ 1 when Y = 0) incur a very large loss: the active term is -log(1-P), and log(1-P) → -∞ as P → 1.
· For logistic regression, cross-entropy is convex in the model's weights, so gradient-based optimisation has no spurious local minima to get stuck in.
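
To make the first two points concrete, here is a small sketch that evaluates the formula on three samples, once with confident correct predictions and once with confident wrong ones. The labels and probabilities are made-up illustration values, not data from your problem.

```python
import numpy as np

def cross_entropy(Y, P):
    # Same formula as above, with inputs converted to float arrays
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    return -np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))

Y = [1, 0, 1]                                  # illustrative labels
good = cross_entropy(Y, [0.99, 0.01, 0.99])    # confident and correct
bad = cross_entropy(Y, [0.01, 0.99, 0.01])     # confident and wrong
print(good, bad)
```

The first loss is about 0.03 (three terms of -log 0.99) and the second is about 13.8 (three terms of -log 0.01), so the same level of confidence costs several hundred times more when it is wrong.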

The code you posted implements this exactly:

```python
return -np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))
```

The full function converts Y and P to float arrays, takes element-wise logs, weights each log by the matching label term (Y or 1 - Y), sums over all samples, and negates the sum so that lower values mean better predictions.

import numpy as np

def cross_entropy(Y, P):
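    # Convert inputs to float arrays (np.float_ was removed in NumPy 2.0)
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    # Negative log-likelihood summed over all samples
    return -np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))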
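
One practical caveat: if any predicted probability is exactly 0 or 1, the function above evaluates log(0) and can return `inf` or `nan`. A common workaround is to clip P into a narrow open interval before taking logs. The sketch below is one way to do that; the name `cross_entropy_clipped` and the default `eps=1e-12` are my own choices for illustration, not part of the code you posted.

```python
import numpy as np

def cross_entropy_clipped(Y, P, eps=1e-12):
    # eps is a hypothetical tolerance; pick one suited to your precision needs
    Y = np.asarray(Y, dtype=float)
    # Keep probabilities strictly inside (0, 1) so both logs stay finite
    P = np.clip(np.asarray(P, dtype=float), eps, 1 - eps)
    return -np.sum(Y * np.log(P) + (1 - Y) * np.log(1 - P))

# Exact 0/1 predictions no longer produce inf or nan
loss_right = cross_entropy_clipped([1, 0], [1.0, 0.0])   # perfectly right
loss_wrong = cross_entropy_clipped([1], [0.0])           # perfectly wrong
print(loss_right, loss_wrong)
```

Clipping slightly changes the loss for extreme predictions: a perfectly wrong prediction now costs a large finite amount instead of infinity, which is usually what you want during training.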


