In [None]:
import numpy as np

<div class="alert alert-warning">Models of categorization.</div>

The inhabitants of the planet Boton love pushing buttons. There are many kinds of buttons on Boton: red buttons, blue buttons, big buttons, small buttons. In fact, there are four dimensions along which buttons vary:

![](images/concepts.png)

Not all buttons do the same thing when pushed. Some are harmless, but others are dangerous and self-destruct. As young Botonans grow up, they are taught which buttons are safe to push and which are unsafe. Unfortunately, there are no hard-and-fast rules about which buttons are safe or unsafe, so young Botonans must develop a scheme for categorizing the buttons. For example, a Botonan might observe the following:
<a name="exemplars"></a>

![](images/exemplars.png)

Clearly, it's not enough to say that all the blue buttons are safe because there is at least one blue button known to be unsafe. So, how do the Botonans determine which buttons are safe?

## Context Model: Similarity
As a cognitive scientist, you know a few different models of categorization that have been proposed. For example, you might remember the *context model* described by Medin and Schaffer (1978). The context model is an exemplar model: the probability that a stimulus is assigned to a category is based on its similarity to all the exemplars in that category.

To use the context model, we represent each stimulus (button) as a vector. For example, we can represent a big square blue textured button as $\mathbf{x}=[1, 0, 1, 1]$. Similarly, we can represent a small red circular textured button as $\mathbf{x}=[0, 1, 0, 1]$. Given this representation, we can now define a similarity function.
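
The encoding can be made explicit with a small helper. This is only a sketch: the `encode_button` function and its argument names are hypothetical, and the mapping assumes the feature order shape, size, color, fill, with square, small, blue, and textured coded as 1 (and circle, big, red, and solid as 0), which is consistent with the two examples above.

```python
import numpy as np

def encode_button(shape, size, color, fill):
    """Hypothetical helper: map feature names to a binary vector
    ordered as [shape, size, color, fill]."""
    return np.array([
        int(shape == "square"),   # circle -> 0, square -> 1
        int(size == "small"),     # big -> 0, small -> 1
        int(color == "blue"),     # red -> 0, blue -> 1
        int(fill == "textured"),  # solid -> 0, textured -> 1
    ])

print(encode_button("square", "big", "blue", "textured"))   # [1 0 1 1]
print(encode_button("circle", "small", "red", "textured"))  # [0 1 0 1]
```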

> <a name="eq:similarity"></a>The similarity $\mathbb{S}$ of one stimulus $\mathbf{x}$ to another stimulus $\mathbf{y}$ is given by the following set of equations:
>
> $$
\begin{align*}
\mathbb{S}(\mathbf{x}, \mathbf{y}) &= \prod_{i = 1}^m s(x_i, y_i)=s(x_1, y_1)\cdot{}s(x_2, y_2)\cdot{}\ldots{}\cdot{}s(x_m, y_m)\\
s(x_i, y_i) &= \left\{
\begin{array}{rl} 1 & \text{if } x_{i} = y_{i} \\
   \theta & \text{if } x_{i} \neq y_{i}\end{array}\right.
   \end{align*}
$$
>
> where $\theta$ is a constant.
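
To make the two equations concrete, here is a minimal sketch that follows them term by term: first the per-feature similarities $s(x_i, y_i)$, then their product. The function name `similarity_sketch` is made up for this illustration, and $\theta = 0.1$ is just an example value.

```python
import numpy as np

def similarity_sketch(x, y, theta=0.1):
    # Per-feature similarity: 1 where the features match, theta where they differ
    s = np.where(x == y, 1.0, theta)
    # Overall similarity is the product over all m features
    return np.prod(s)

x = np.array([1, 0, 1, 1])  # big square blue textured
y = np.array([0, 1, 0, 1])  # small red circular textured

# x and y differ on three features, so S(x, y) = theta**3, about 0.001
print(similarity_sketch(x, y))
# A stimulus is maximally similar to itself: S(x, x) = 1
print(similarity_sketch(x, x))
```

Because every matching feature contributes a factor of 1, the product reduces to $\theta$ raised to the number of mismatching features.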



In [None]:
def calculate_similarity(x, y, theta=0.1):
    """Calculates the similarity between a stimulus x and a 
    stimulus y, where similarity is defined as:
    
        S(x, y) = s(x_1, y_1) * s(x_2, y_2) * ... * s(x_m, y_m)
    
    and:
    
        s(x_i, y_i) = 1 if x_i == y_i, theta otherwise
            
    Parameters
    ----------
    x, y : numpy arrays with shape (m,)
        The stimuli to compute similarity between
    theta : (optional) float
        A parameter to the similarity function. When the function is
        called without theta having been specified, it defaults to 0.1.
        
    Returns
    -------
    float : the similarity between x and y
    
    """
    return theta**(np.sum(x!=y))

In [None]:
# add your own test cases here!
x = np.array([1, 0, 1, 1])
y = np.array([0, 1, 1, 0])
print("S(x, y) = {}".format(calculate_similarity(x, y)))
print("S(x, y, theta=0.3) = {}".format(calculate_similarity(x, y, theta=0.3)))

## Context Model: Categorization
Now that we have defined the similarity between two stimuli, we can take a look at the equation for the context model. In the definition below, you can think of category $A$ as being the *safe buttons*, and category $B$ as being the *unsafe buttons*.

> <a name="eq:context-model"></a>The context model gives the probability that a novel stimulus $\mathbf{x}$ belongs to category $A$ (as opposed to category $B$):
>
> $$
P(A|\mathbf{x}) = \frac{\sum_{\mathbf{a} \in A} \mathbb{S}(\mathbf{x}, \mathbf{a})}{\sum_{\mathbf{a} \in A} \mathbb{S}(\mathbf{x}, \mathbf{a})+ \sum_{\mathbf{b} \in B} \mathbb{S}(\mathbf{x}, \mathbf{b})}
$$
>
> where $\sum_{\mathbf{a} \in A} \mathbb{S}(\mathbf{x}, \mathbf{a})$ is the sum over the similarity of $\mathbf{x}$ to all exemplars $\mathbf{a}$ in category $A$, and $\sum_{\mathbf{b} \in B} \mathbb{S}(\mathbf{x}, \mathbf{b})$ is the sum over the similarity of $\mathbf{x}$ to all exemplars $\mathbf{b}$ in category $B$.

Note that because $P(A|\mathbf{x})$ is a probability, we can easily compute from it the probability that $\mathbf{x}$ belongs to category $B$. The stimulus *must* belong to one of the two categories, thus $P(B|\mathbf{x})=1-P(A|\mathbf{x})$.


In both equations, $\mathbb{S}$ is the [similarity function](#eq:similarity) that we defined above and implemented in `calculate_similarity`.
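
As a worked illustration of the formula, the sketch below computes $P(A|\mathbf{x})$ and $P(B|\mathbf{x})$ for a single stimulus. The two-feature exemplars are made up for this example, and `summed_similarity` is a hypothetical helper rather than part of the assignment.

```python
import numpy as np

def summed_similarity(x, exemplars, theta=0.1):
    # Count mismatching features between x and each exemplar (one count per row)
    mismatches = np.sum(exemplars != x, axis=1)
    # Each exemplar contributes theta ** (number of mismatches); sum over exemplars
    return np.sum(theta ** mismatches)

x = np.array([1, 1])
category_a = np.array([[1, 1], [1, 0]])  # made-up exemplars for category A
category_b = np.array([[0, 0]])          # made-up exemplar for category B

s_a = summed_similarity(x, category_a)   # 1 + 0.1
s_b = summed_similarity(x, category_b)   # 0.1 ** 2
p_a = s_a / (s_a + s_b)                  # roughly 0.99
p_b = 1 - p_a                            # the two probabilities sum to 1
print(p_a, p_b)
```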

In [None]:
def context_model(test_stimuli, exemplars, exemplar_categories, theta=0.1):
    """Computes the probability that each test stimulus belongs to 
    category A.
    
    Parameters
    ----------
    test_stimuli : numpy array with shape (n, m)
        n stimuli, each with m features, to be classified (i.e. 
        compute P(A|x) for each x)
    exemplars : numpy array with shape (k, m)
        k exemplars, each with m features
    exemplar_categories : numpy string array with shape (k,)
        Categories for the k exemplars. You can assume the values of 
        exemplar_categories will always be either "A" or "B".
    theta : (optional) float
        A parameter to pass to the similarity function.
        
    Returns
    -------
    numpy array with shape (n,) such that the i^th element 
    corresponds to P(A|test_stimuli[i])
        
    """
    probability = np.empty(len(test_stimuli))
    for i in range(len(test_stimuli)):
        SA, SB = 0, 0
        for j in range(len(exemplars)):
            if exemplar_categories[j]=="A":
                SA += calculate_similarity(test_stimuli[i,:],exemplars[j,:],theta)
            else:
                SB += calculate_similarity(test_stimuli[i,:],exemplars[j,:],theta)
                
        probability[i] = SA/(SA+SB)
    
    return probability

In [None]:
# add your own test cases here!

test_stimuli = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 1]])
test_exemplars = np.array([[1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0]])
test_exemplar_categories = np.array(["B", "A", "A", "A"])
context_model(test_stimuli, test_exemplars, test_exemplar_categories)

## Context Model: Trying it out
Now that you have an implementation of the context model, let's try it out! First, let's see how well it does at categorizing the [exemplars that we already have](#exemplars):

In [None]:
safe_exemplars = np.array([
    [0, 1, 1, 0], # circle / small / blue / solid
    [1, 0, 1, 0], # square / big   / blue / solid
    [1, 1, 0, 0], # square / small / red  / solid
    [1, 1, 1, 1]  # square / small / blue / textured
])
unsafe_exemplars = np.array([
    [1, 1, 0, 1], # square / small / red  / textured
    [0, 0, 0, 1], # circle / big   / red  / textured
    [0, 1, 1, 1], # circle / small / blue / textured
    [0, 1, 0, 0]  # circle / small / red  / solid
])
all_exemplars = np.vstack([safe_exemplars, unsafe_exemplars])
exemplar_categories = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

In [None]:
context_model(all_exemplars, all_exemplars, exemplar_categories)

<div class="alert alert-success">Describe how well the context model does at categorizing the known exemplars. Does it give a higher value for $P(A|\mathbf{x})$ for those $\mathbf{x}$ which are actually in category $A$? (**0.5 points**)</div>

## Context Model: New exemplars
A young Botonan is strolling through the city, when a flash of blue and red is seen coming from an alleyway. Taking a closer look, the youngster discovers four buttons never before encountered:
<a name="novel"></a>

![](images/novel.png)

As a Botonan, this youngster feels a strong urge to push the buttons. However, the possibility that some of the buttons might be dangerous demands restraint. The Botonan pauses and takes a closer look.

Let's see what the context model says about how likely these buttons are to be dangerous:

In [None]:
novel_stimuli = np.array([
    [0, 0, 0, 0], # circle / big   / red  / solid
    [1, 0, 0, 1], # square / big   / red  / textured
    [1, 0, 1, 1], # square / big   / blue / textured
    [0, 0, 1, 0]  # circle / big   / blue / solid
])

In [None]:
context_model(novel_stimuli, all_exemplars, exemplar_categories)

<div class="alert alert-success">In words, what does the context model say? Which of these buttons are more likely to belong to category A (safe buttons)?</div>

---
## Prototype Model: Creating a prototype

You also know of another model of categorization: the prototype model. This model is similar to the context model, but rather than comparing the new stimulus to all of the exemplars, it compares the new stimulus to the category *prototypes*. How do we know what the category prototype is? We can compute a prototype from a set of exemplars.
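
One way to do this for binary features is a per-feature majority vote. The sketch below assumes that rule, with ties counted as the feature being present (the rule stated in the `prototype` docstring); the category members are made up for illustration.

```python
import numpy as np

# Four made-up category members with three binary features each
members = np.array([
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
])

# Proportion of members that have each feature
proportions = members.mean(axis=0)         # [0.75, 0.5, 0.25]
# The prototype has a feature when at least half of the members have it
proto = (proportions >= 0.5).astype(int)   # [1, 1, 0]
print(proportions, proto)
```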

In [None]:
def prototype(features):
    """
    Compute the prototype features, based on the given features of
    category members. The prototype should have a feature if half or
    more of the category members have that feature.
    
    Parameters
    ----------
    features : numpy array with shape (n, m)
        The first dimension corresponds to n category members, and the
        second dimension to m features.
    
    Returns
    -------
    numpy array with shape (m,) corresponding to the features
    of the prototype of the category members
    
    """
    
    return (np.mean(features,axis=0)>=0.5).astype(int)

In [None]:
# add your own test cases here!
prototype(np.array([[0, 1], [0, 0]]))

<a name="prototypes"></a>Check what your function returns for the "safe button" prototype and the "unsafe button" prototype:

In [None]:
safe_prototype = prototype(safe_exemplars)
unsafe_prototype = prototype(unsafe_exemplars)
print("Safe button prototype: {}".format(safe_prototype))
print("Unsafe button prototype: {}".format(unsafe_prototype))

<div class="alert alert-success">Describe in words what the features of each prototype are (e.g., "the safe button prototype is a [size], [color], [fill] [shape]"). Remember that the first feature corresponds to shape, the second feature corresponds to size, the third feature corresponds to color, and the fourth feature corresponds to fill. (**0.25 points**)</div>

## Prototype Model: Categorization
Now that we have a way of computing prototypes from our exemplars, let's take a look at the actual prototype model.

<a name="eq:prototype"></a>
> The prototype model can be described by the following equation:
>
> $$
P(A|\mathbf{x})=\frac{\mathbb{S}(\mathbf{x}, \mathbf{\mu}_A)}{\mathbb{S}(\mathbf{x}, \mathbf{\mu}_A) + \mathbb{S}(\mathbf{x}, \mathbf{\mu}_B)}
$$
>
> where $\mathbb{S}$ is the [similarity function defined above](#eq:similarity), $\mathbf{x}$ is the novel stimulus, $\mathbf{\mu}_A$ is the prototype of category $A$, and $\mathbf{\mu}_B$ is the prototype of category $B$.
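
For a single stimulus, the equation works out as in the following sketch. The prototypes `mu_a` and `mu_b` are made-up values rather than the Boton prototypes, and `similarity_sketch` is a stand-in for the similarity function.

```python
import numpy as np

def similarity_sketch(x, y, theta=0.1):
    # theta raised to the number of mismatching features
    return theta ** np.sum(x != y)

mu_a = np.array([1, 1, 0])  # made-up prototype for category A
mu_b = np.array([0, 0, 0])  # made-up prototype for category B
x = np.array([1, 1, 1])     # stimulus to classify

s_a = similarity_sketch(x, mu_a)  # one mismatch: theta ** 1
s_b = similarity_sketch(x, mu_b)  # three mismatches: theta ** 3
p_a = s_a / (s_a + s_b)           # roughly 0.99
print(p_a)
```

Unlike the context model, only two similarities enter the ratio, no matter how many exemplars each category contains.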

In [None]:
def prototype_model(test_stimuli, exemplars, exemplar_categories, theta=0.1):
    """Computes the probability that each test stimulus belongs to 
    category A.

    Parameters
    ----------
    test_stimuli : numpy array with shape (n, m)
        n stimuli, each with m features, to be classified (i.e. 
        compute P(A|x) for each x)
    exemplars : numpy array with shape (k, m)
        k exemplars, each with m features
    exemplar_categories : numpy string array with shape (k,)
        Categories for the k exemplars. You can assume the values of 
        exemplar_categories will always be either "A" or "B".
    theta : (optional) float
        A parameter to pass to the similarity function. If theta is not
        specified, it defaults to 0.1.
        
    Returns
    -------
    numpy array with shape (n,) such that the i^th element 
    corresponds to P(A|test_stimuli[i])
        
    """
    # compute prototypes
    prototype_A = prototype(exemplars[exemplar_categories=="A"])
    prototype_B = prototype(exemplars[exemplar_categories=="B"])
    # compute probability for each test item
    probability = np.empty(len(test_stimuli))
    for i in range(len(test_stimuli)):
        SA = calculate_similarity(test_stimuli[i,:],prototype_A,theta)
        SB = calculate_similarity(test_stimuli[i,:],prototype_B,theta)
        probability[i] = SA/(SA+SB)
    
    return probability
    

In [None]:
# add your own test cases here!
test_stimuli = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 1]])
test_exemplars = np.array([[1, 1, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0]])
test_exemplar_categories = np.array(["B", "A", "A", "A"])

prototype_model(test_stimuli, test_exemplars, test_exemplar_categories, theta=0.2)

## Prototype Model: Categorizing the prototype
Let's see how well the prototype model does at categorizing the prototypes of safe and unsafe buttons:

In [None]:
prototypes = np.vstack([safe_prototype, unsafe_prototype])
prototype_model(prototypes, all_exemplars, exemplar_categories)

## Prototype Model: New exemplars
Now let's see what the prototype model would say about the [unknown buttons](#novel) that our young Botonan has encountered.

In [None]:
prototype_model(novel_stimuli, all_exemplars, exemplar_categories)

<div class="alert alert-success">What does the prototype model say? Which of these buttons are more likely to belong to category A (safe buttons)? If the prototype model is an accurate model of how Botonans categorize concepts, do you think they would push any of the buttons? </div>

## Comparing Models
As we develop computational models of cognition, it is not enough to look at a single model and declare that it is good or bad. What is it good or bad in relation to? We must always *compare* our models, in order to get a better sense of how the space of models behaves on a particular type of problem.
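
One simple way to put two models on a common scale is to turn each model's probabilities into category labels and count how often they match the true labels. The sketch below is one possible approach, not the only one: the `accuracy` helper is hypothetical, the 0.5 decision threshold is an assumption, and the probabilities are made up.

```python
import numpy as np

def accuracy(p_a, true_categories):
    # Assumption: predict "A" when P(A|x) > 0.5, otherwise "B"
    predicted = np.where(p_a > 0.5, "A", "B")
    # Fraction of stimuli whose predicted label matches the true label
    return np.mean(predicted == true_categories)

p_a = np.array([0.9, 0.6, 0.4, 0.2])    # made-up model outputs
truth = np.array(["A", "B", "B", "B"])  # made-up true categories
print(accuracy(p_a, truth))             # 0.75
```

Thresholding discards how confident each prediction was, so it is worth looking at the raw probabilities as well.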

We have now analyzed the context model and prototype model independently, but we have not compared them. First, let's compare how they both behave on the exemplars that have already been observed:

In [None]:
print("Context model on exemplars:    {}".format(context_model(all_exemplars, all_exemplars, exemplar_categories)))
print("Prototype model on exemplars:  {}".format(prototype_model(all_exemplars, all_exemplars, exemplar_categories)))
print("True exemplar categories:      {}".format(exemplar_categories))

<div class="alert alert-success">Which model is more accurate at predicting the exemplar categories? 
Even though the models have seen these exemplars, and know their true categories, they do not predict the category labels with 100% certainty. Why is this the case? 
</div>

Now let's take a look at how well the models perform on the prototypes:

In [None]:
print("Context model on prototypes:    {}".format(context_model(prototypes, all_exemplars, exemplar_categories)))
print("Prototype model on prototypes:  {}".format(prototype_model(prototypes, all_exemplars, exemplar_categories)))
print("True prototype categories:      {}".format(np.array(["A", "B"])))

<div class="alert alert-success">Which model is more accurate at predicting the prototypes? Why?</div>

Finally, let's take a look at how the models predict the novel stimuli:

In [None]:
print("Context model on novel:   {}".format(context_model(novel_stimuli, all_exemplars, exemplar_categories)))
print("Prototype model on novel: {}".format(prototype_model(novel_stimuli, all_exemplars, exemplar_categories)))

<div class="alert alert-success">If the true categories of the novel stimuli were $[B, B, B, A]$, which model would be more advantageous in this scenario (remember that category $B$ is the unsafe buttons)? If the true categories of the novel stimuli were $[B, B, A, A]$? (**0.5 points**)</div>

---