Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", and list your collaborators below:

In [None]:
COLLABORATORS = ""

---

In [None]:
import itertools
import numpy as np

<div class="alert alert-warning">This challenge problem is **optional**. If you complete it, as well as the other challenge problems on other problem sets, then you can use the combined challenge problem scores to replace the score of one of the normal problem sets. However, if you do not do the challenge problems, it will not negatively affect your score.</div>

---
## Part A (0.5 points)

Write a function `get_distinguishing_features` that generates a set of rules that uniquely identifies a given animal among all the animals in the data set used in Problem 3. The rules can combine positive features (which the animal must have) and negative features (which the animal must lack). We have provided a function `get_animals` that identifies the set of animals consistent with a set of positive and negative rules; we will use this function to confirm that your set of rules identifies exactly one animal.
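
To make the rule format concrete, here is a minimal sketch on a made-up three-animal data set. The `toy_data` dictionary and the `matches` helper are illustrative assumptions, not part of the problem set; `matches` is only meant to mirror the kind of filtering that `get_animals` performs.

```python
import numpy as np

# Hypothetical toy data (not the real 50-animal data set), laid out with
# the same three fields that get_animals reads.
toy_data = {
    'animal_names': np.array(['bat', 'cat', 'eel']),
    'feature_names': np.array(['flies', 'furry', 'swims']),
    'animal_features': np.array([
        [1, 1, 0],   # bat
        [0, 1, 0],   # cat
        [0, 0, 1],   # eel
    ], dtype=bool),
}

def matches(rules, data):
    """Return the sorted names of animals that have every positive
    feature and none of the negative features (hypothetical helper)."""
    names = list(data['feature_names'])
    pos = np.array([names.index(f) for f in rules['positiveFeatures']], dtype=int)
    neg = np.array([names.index(f) for f in rules['negativeFeatures']], dtype=int)
    feats = data['animal_features']
    keep = feats[:, pos].all(axis=1) & ~feats[:, neg].any(axis=1)
    return sorted(data['animal_names'][keep].tolist())

rules = {'animal': 'cat', 'positiveFeatures': ['furry'], 'negativeFeatures': ['flies']}
print(matches(rules, toy_data))   # ['cat']
```

In this toy example, `furry` alone would also match the bat, so the negative rule `flies` is what narrows the result to a single animal.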

In [None]:
data = np.load('data/50animals.npz')

In [None]:
def get_animals(features, data):
    """Gets all animals with the given features from the data
    
    Parameters
    ----------
    features : dict
        This dict has three fields:
        animal: name of the animal
        positiveFeatures: a list of zero or more features that are true
            for the animal
        negativeFeatures: a list of zero or more features that are false
            for the animal
    data: dict
        The data for the animal guessing game, in dictionary form. This data 
        has fields animal_features, feature_names, and animal_names
        
    Returns
    -------
    A list of animals consistent with positive and negative features
    
    """
    feature_data = data['animal_features']
    feature_names = data['feature_names']
    animal_names = data['animal_names']

    # see how many animals are picked out with features
    positives = feature_data[:, np.array([x in features['positiveFeatures'] 
        for x in feature_names])]
    negatives = feature_data[:, np.array([x in features['negativeFeatures'] 
        for x in feature_names])]
    return(list(set(animal_names[np.array([all(x) for x in positives])]) & 
        set(animal_names[np.array([not any(x) for x in negatives])])))
    

In [None]:
def get_distinguishing_features(animal, data):
    """A function that finds a set of rules that uniquely identifies an
    animal from the given data

    Hint: you may want to use itertools.product to efficiently create 
    sets of rules. 
    
    
    Parameters
    ----------
    animal: string
        The name of the animal for which the function will compute rules
    data: dict
        The data for the animal guessing game, in dictionary form. This data 
        has fields animal_features, feature_names, and animal_names
        
    Returns
    -------
    Dictionary with three fields
        animal: name of the animal
        positiveFeatures: a list of zero or more features that are true
            for the animal
        negativeFeatures: a list of zero or more features that are false
            for the animal
    """
    # YOUR CODE HERE
    raise NotImplementedError()
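
The docstring hint mentions `itertools.product`. One possible reading of that hint, sketched here on hypothetical toy feature names, is to enumerate every include/exclude choice over the features; each tuple of booleans then selects one candidate subset. This is only an illustration of the library call, not a prescribed approach.

```python
import itertools

feature_names = ['flies', 'furry', 'swims']  # hypothetical toy features

# Each tuple of booleans marks which features are included in a subset.
subsets = [
    [name for name, keep in zip(feature_names, mask) if keep]
    for mask in itertools.product([False, True], repeat=len(feature_names))
]

print(len(subsets))   # 8
print(subsets[:3])    # [[], ['swims'], ['furry']]
```

The number of subsets doubles with every additional feature, so enumerating all of them is only practical for a small number of features.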


In [None]:
featureSet = get_distinguishing_features('grizzly bear', data)
featureSet

In [None]:
get_animals(featureSet, data) # this should return one animal, grizzly bear

In [None]:
# add your own test cases here!


In [None]:
"""Test that get_distinguishing_features finds exactly one animal 
in each case. Note that these tests may take some time to run."""
from nose.tools import assert_equal

assert_equal(get_animals(get_distinguishing_features('antelope', data),
    data), ['antelope'])
assert_equal(get_animals(get_distinguishing_features('grizzly bear', data),
    data), ['grizzly bear'])
assert_equal(get_animals(get_distinguishing_features('otter', data),
    data), ['otter'])
assert_equal(get_animals(get_distinguishing_features('rabbit', data), 
    data), ['rabbit'])
assert_equal(get_animals(get_distinguishing_features('dolphin', data), 
    data), ['dolphin'])

print("Success!")

## Part B (0.5 points)

Revise `get_distinguishing_features` so that it returns the *minimal* number of rules necessary to distinguish that animal from all others.
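
One generic way to guarantee minimality is to try candidate rule sets in order of increasing size and stop at the first one that works. The sketch below illustrates that idea on a made-up feature matrix using `itertools.combinations`; the data and the helper name `smallest_distinguishing_columns` are assumptions for illustration, and the sketch is not tuned for the running time of the real data set.

```python
import itertools
import numpy as np

# Hypothetical toy data: rows are animals, columns are features.
names = ['bat', 'cat', 'eel']
features = np.array([
    [1, 1, 0],   # bat: flies, furry
    [0, 1, 0],   # cat: furry
    [0, 0, 1],   # eel: swims
], dtype=bool)

def smallest_distinguishing_columns(target, features):
    """Return a smallest tuple of column indices on which the target row
    differs from every other row, or None if another row is identical."""
    others = np.delete(features, target, axis=0)
    n_features = features.shape[1]
    # Trying sizes in increasing order means the first hit is minimal.
    for size in range(n_features + 1):
        for cols in itertools.combinations(range(n_features), size):
            idx = np.array(cols, dtype=int)
            # Rows that agree with the target on all chosen columns.
            same = (others[:, idx] == features[target, idx]).all(axis=1)
            if not same.any():
                return cols
    return None

print(smallest_distinguishing_columns(1, features))   # (0, 1)
```

For the cat, no single column separates it from both other animals, so the search returns two columns. A chosen column where the target is true would become a positive feature, and one where it is false would become a negative feature.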

In [None]:
# add your own test cases here!


In [None]:
"""Check if get_distinguishing_features is correct"""

# compute minimal rules for each animal (note: this may take a while to run!)
minimalRules = [get_distinguishing_features(x, data) for x in data['animal_names']]
minimalRuleCounts = np.array([len(x['negativeFeatures']) +
    len(x['positiveFeatures']) for x in minimalRules])

# load in the correct minimal rules
goldStandardCounts = np.load('data/minimal_rule_counts.npy')

# compare them
assert not any(minimalRuleCounts > goldStandardCounts), \
    "Your rules are longer than the gold standard rules"

print("Success!")        

---

Before turning this problem in remember to do the following steps:

1. **Restart the kernel** (Kernel$\rightarrow$Restart)
2. **Run all cells** (Cell$\rightarrow$Run All)
3. **Save** (File$\rightarrow$Save and Checkpoint)

<div class="alert alert-danger">After you have completed these three steps, ensure that the following cell has printed "No errors!". If it has <b>not</b> printed "No errors!", then your code has a bug in it and has thrown an error! Make sure you fix this error before turning in your problem set.</div>

In [None]:
print("No errors!")