Merge latest changes (#3)
* Remove unnecessary goal test in search.py (aimacode#953)

Remove unnecessary initial goal test in best_first_graph_search. The loop will catch that case immediately.
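
For reference, a sketch of the loop in question, modeled on `best_first_graph_search` in search.py (`Node` and `PriorityQueue` are assumed from that module); the goal test on each popped node already covers the initial state:

```python
def best_first_graph_search(problem, f):
    """Search the nodes with the lowest f scores first (sketch)."""
    node = Node(problem.initial)
    frontier = PriorityQueue('min', f)
    frontier.append(node)
    explored = set()
    while frontier:
        node = frontier.pop()
        if problem.goal_test(node.state):  # catches the initial node on the first iteration
            return node
        explored.add(node.state)
        for child in node.expand(problem):
            if child.state not in explored and child not in frontier:
                frontier.append(child)
    return None
```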

* Minor Changes in Text (aimacode#955)

* Minor text change (aimacode#957)

To make it more accurate.

* Minor change in text (aimacode#956)

To make it more descriptive and accurate.

* Added relu Activation (aimacode#960)

* added relu activation

* added default parameters
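
A minimal sketch of what the added activation and its derivative plausibly look like (the names match the imports added to learning.py; the exact bodies are assumptions):

```python
def relu(x):
    """Rectified linear unit."""
    return max(0, x)


def relu_derivative(value):
    """Derivative of relu, evaluated at a unit's output value."""
    return 1 if value > 0 else 0
```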

* Changes in texts (aimacode#959)

Added a few new sentences, modified the sentence structure in a few places, and corrected some grammatical errors.

* Change PriorityQueue expansion (aimacode#962)

`self.heap.append` simply appends to the end of `self.heap`, since `self.heap` is just a Python list. `self.append` calls the append method of the class instance, effectively putting the item in its proper place in the heap.
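
A minimal sketch of the distinction, assuming a heapq-backed queue like the one in utils.py:

```python
import heapq


class PriorityQueue:
    """Sketch of a min-heap priority queue ordered by a key function f."""

    def __init__(self, order='min', f=lambda x: x):
        self.heap = []
        self.f = f

    def append(self, item):
        """Insert item at the correct position in the heap."""
        heapq.heappush(self.heap, (self.f(item), item))

    def extend(self, items):
        for item in items:
            self.append(item)  # not self.heap.append, which would break the heap invariant
```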

* added GSoC 2018 contributors

A thank you to contributors from the GSoC 2018 program!

* Revamped the notebook (aimacode#963)

* Revamped the notebook

* A few changes reversed

Changed a few things from my original PR after a review from ad71.

* Text Changes + Colored Table (aimacode#964)

Made a colored table to display dog movement instead. Corrected grammatical errors, improved sentence structure, and fixed any typos found.

* Fixed typos (aimacode#970)

Removed typos and other minor text errors.

* Fixed Typos (aimacode#971)

Corrected typos and made other minor text changes.

* Update intro.ipynb (aimacode#969)

* Added activation functions (aimacode#968)

* Updated label_queen_conflicts function (aimacode#967)

Shortened it; finding conflicts separately and storing them in different variables has no use later in the notebook, so I believe this looks better.
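
A hypothetical sketch of the shortened version, checking each pair of queens once instead of collecting row and diagonal conflicts in separate variables (the grid layout and marker value are assumptions):

```python
def label_queen_conflicts(assignment, grid):
    """Mark in grid every queen that attacks another queen (sketch)."""
    for col, row in assignment.items():
        for col2, row2 in assignment.items():
            if (col, row) != (col2, row2) and (
                    row == row2                    # same row
                    or row + col == row2 + col2    # same / diagonal
                    or row - col == row2 - col2):  # same \ diagonal
                grid[row][col] = 3                 # 3 flags a conflicted queen
    return grid
```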
dsaw committed Oct 4, 2018
1 parent 41c818c commit 041e5f8
Showing 10 changed files with 943 additions and 386 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -168,7 +168,7 @@ Here is a table of the implemented data structures, the figure, name of the impl

# Acknowledgements

Many thanks for contributions over the years. I got bug reports, corrected code, and other support from Darius Bacon, Phil Ruggera, Peng Shao, Amit Patil, Ted Nienstedt, Jim Martin, Ben Catanzariti, and others. Now that the project is on GitHub, you can see the [contributors](https://github.com/aimacode/aima-python/graphs/contributors) who are doing a great job of actively improving the project. Many thanks to all contributors, especially @darius, @SnShine, @reachtarunhere, @MrDupin, and @Chipe1.
Many thanks for contributions over the years. I got bug reports, corrected code, and other support from Darius Bacon, Phil Ruggera, Peng Shao, Amit Patil, Ted Nienstedt, Jim Martin, Ben Catanzariti, and others. Now that the project is on GitHub, you can see the [contributors](https://github.com/aimacode/aima-python/graphs/contributors) who are doing a great job of actively improving the project. Many thanks to all contributors, especially @darius, @SnShine, @reachtarunhere, @MrDupin, @Chipe1, @ad71 and @MariannaSpyrakou.

<!---Reference Links-->
[agents]:../master/agents.py
379 changes: 284 additions & 95 deletions agents.ipynb

Large diffs are not rendered by default.

378 changes: 164 additions & 214 deletions csp.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion intro.ipynb
@@ -62,7 +62,7 @@
"source": [
"From there, the notebook alternates explanations with examples of use. You can run the examples as they are, and you can modify the code cells (or add new cells) and run your own examples. If you have some really good examples to add, you can make a github pull request.\n",
"\n",
"If you want to see the source code of a function, you can open a browser or editor and see it in another window, or from within the notebook you can use the IPython magic function `%psource` (for \"print source\") or the function `psource` from `notebook.py`. Also, if the algorithm has pseudocode, you can read it by calling the `pseudocode` function with input the name of the algorithm."
"If you want to see the source code of a function, you can open a browser or editor and see it in another window, or from within the notebook you can use the IPython magic function `%psource` (for \"print source\") or the function `psource` from `notebook.py`. Also, if the algorithm has pseudocode available, you can read it by calling the `pseudocode` function with the name of the algorithm passed as a parameter."
]
},
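
As an illustration, inspecting an algorithm from within the notebook might look like this (the algorithm name is only an example):

```python
from notebook import psource, pseudocode
from search import breadth_first_tree_search

# or, equivalently, the IPython magic: %psource breadth_first_tree_search
psource(breadth_first_tree_search)  # print the Python source
pseudocode('Breadth-First-Search')  # render the book's pseudocode, if available
```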
{
65 changes: 39 additions & 26 deletions knowledge_FOIL.ipynb

Large diffs are not rendered by default.

37 changes: 23 additions & 14 deletions knowledge_current_best.ipynb
@@ -38,15 +38,15 @@
"source": [
"## OVERVIEW\n",
"\n",
"Like the [learning module](https://github.com/aimacode/aima-python/blob/master/learning.ipynb), this chapter focuses on methods for generating a model/hypothesis for a domain. Unlike though the learning chapter, here we use prior knowledge to help us learn from new experiences and find a proper hypothesis.\n",
"Like the [learning module](https://github.com/aimacode/aima-python/blob/master/learning.ipynb), this chapter focuses on methods for generating a model/hypothesis for a domain; however, unlike the learning chapter, here we use prior knowledge to help us learn from new experiences and find a proper hypothesis.\n",
"\n",
"### First-Order Logic\n",
"\n",
"Usually knowledge in this field is represented as **first-order logic**, a type of logic that uses variables and quantifiers in logical sentences. Hypotheses are represented by logical sentences with variables, while examples are logical sentences with set values instead of variables. The goal is to assign a value to a special first-order logic predicate, called **goal predicate**, for new examples given a hypothesis. We learn this hypothesis by infering knowledge from some given examples.\n",
"Usually knowledge in this field is represented as **first-order logic**; a type of logic that uses variables and quantifiers in logical sentences. Hypotheses are represented by logical sentences with variables, while examples are logical sentences with set values instead of variables. The goal is to assign a value to a special first-order logic predicate, called **goal predicate**, for new examples given a hypothesis. We learn this hypothesis by infering knowledge from some given examples.\n",
"\n",
"### Representation\n",
"\n",
"In this module, we use dictionaries to represent examples, with keys the attribute names and values the corresponding example values. Examples also have an extra boolean field, 'GOAL', for the goal predicate. A hypothesis is represented as a list of dictionaries. Each dictionary in that list represents a disjunction. Inside these dictionaries/disjunctions we have conjunctions.\n",
"In this module, we use dictionaries to represent examples, with keys being the attribute names and values being the corresponding example values. Examples also have an extra boolean field, 'GOAL', for the goal predicate. A hypothesis is represented as a list of dictionaries. Each dictionary in that list represents a disjunction. Inside these dictionaries/disjunctions we have conjunctions.\n",
"\n",
"For example, say we want to predict if an animal (cat or dog) will take an umbrella given whether or not it rains or the animal wears a coat. The goal value is 'take an umbrella' and is denoted by the key 'GOAL'. An example:\n",
"\n",
@@ -73,15 +73,15 @@
"\n",
"### Overview\n",
"\n",
"In **Current-Best Learning**, we start with a hypothesis and we refine it as we iterate through the examples. For each example, there are three possible outcomes. The example is consistent with the hypothesis, the example is a **false positive** (real value is false but got predicted as true) and **false negative** (real value is true but got predicted as false). Depending on the outcome we refine the hypothesis accordingly:\n",
"In **Current-Best Learning**, we start with a hypothesis and we refine it as we iterate through the examples. For each example, there are three possible outcomes: the example is consistent with the hypothesis, the example is a **false positive** (real value is false but got predicted as true) and the example is a **false negative** (real value is true but got predicted as false). Depending on the outcome we refine the hypothesis accordingly:\n",
"\n",
"* Consistent: We do not change the hypothesis and we move on to the next example.\n",
"* Consistent: We do not change the hypothesis and move on to the next example.\n",
"\n",
"* False Positive: We **specialize** the hypothesis, which means we add a conjunction.\n",
"\n",
"* False Negative: We **generalize** the hypothesis, either by removing a conjunction or a disjunction, or by adding a disjunction.\n",
"\n",
"When specializing and generalizing, we should take care to not create inconsistencies with previous examples. To avoid that caveat, backtracking is needed. Thankfully, there is not just one specialization or generalization, so we have a lot to choose from. We will go through all the specialization/generalizations and we will refine our hypothesis as the first specialization/generalization consistent with all the examples seen up to that point."
"When specializing or generalizing, we should make sure to not create inconsistencies with previous examples. To avoid that caveat, backtracking is needed. Thankfully, there is not just one specialization or generalization, so we have a lot to choose from. We will go through all the specializations/generalizations and we will refine our hypothesis as the first specialization/generalization consistent with all the examples seen up to that point."
]
},
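
A sketch of how that refinement loop can be written, assuming helper functions `is_consistent`, `false_positive`, `false_negative`, `specializations` and `generalizations` as described in the implementation section below:

```python
def current_best_learning(examples, h, examples_so_far=None):
    """Refine hypothesis h example by example, backtracking on failure (sketch)."""
    examples_so_far = examples_so_far or []
    if not examples:
        return h
    e = examples[0]
    if is_consistent(e, h):
        return current_best_learning(examples[1:], h, examples_so_far + [e])
    elif false_positive(e, h):
        # specialize: try adding a conjunction that stays consistent with past examples
        for h2 in specializations(examples_so_far + [e], h):
            h3 = current_best_learning(examples[1:], h2, examples_so_far + [e])
            if h3 != 'FAIL':
                return h3
    elif false_negative(e, h):
        # generalize: drop a conjunction/disjunction, or add a disjunction
        for h2 in generalizations(examples_so_far + [e], h):
            h3 = current_best_learning(examples[1:], h2, examples_so_far + [e])
            if h3 != 'FAIL':
                return h3
    return 'FAIL'
```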
{
@@ -138,7 +138,7 @@
"source": [
"### Implementation\n",
"\n",
"As mentioned previously, examples are dictionaries (with keys the attribute names) and hypotheses are lists of dictionaries (each dictionary is a disjunction). Also, in the hypothesis, we denote the *NOT* operation with an exclamation mark (!).\n",
"As mentioned earlier, examples are dictionaries (with keys being the attribute names) and hypotheses are lists of dictionaries (each dictionary is a disjunction). Also, in the hypothesis, we denote the *NOT* operation with an exclamation mark (!).\n",
"\n",
"We have functions to calculate the list of all specializations/generalizations, to check if an example is consistent/false positive/false negative with a hypothesis. We also have an auxiliary function to add a disjunction (or operation) to a hypothesis, and two other functions to check consistency of all (or just the negative) examples.\n",
"\n",
@@ -148,7 +148,9 @@
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [
{
"data": {
@@ -370,7 +372,7 @@
"\n",
"We will take a look at two examples. The first is a trivial one, while the second is a bit more complicated (you can also find it in the book).\n",
"\n",
"First we have the \"animals taking umbrellas\" example. Here we want to find a hypothesis to predict whether or not an animal will take an umbrella. The attributes are `Species`, `Rain` and `Coat`. The possible values are `[Cat, Dog]`, `[Yes, No]` and `[Yes, No]` respectively. Below we give seven examples (with `GOAL` we denote whether an animal will take an umbrella or not):"
"Earlier, we had the \"animals taking umbrellas\" example. Now we want to find a hypothesis to predict whether or not an animal will take an umbrella. The attributes are `Species`, `Rain` and `Coat`. The possible values are `[Cat, Dog]`, `[Yes, No]` and `[Yes, No]` respectively. Below we give seven examples (with `GOAL` we denote whether an animal will take an umbrella or not):"
]
},
{
@@ -427,7 +429,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We got 5/7 correct. Not terribly bad, but we can do better. Let's run the algorithm and see how that performs."
"We got 5/7 correct. Not terribly bad, but we can do better. Lets now run the algorithm and see how that performs in comparison to our current result. "
]
},
{
@@ -472,7 +474,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[{'Rain': '!No', 'Species': 'Cat'}, {'Rain': 'Yes', 'Coat': 'Yes'}, {'Coat': 'Yes', 'Species': 'Cat'}]\n"
"[{'Species': 'Cat', 'Rain': '!No'}, {'Species': 'Dog', 'Coat': 'Yes'}, {'Coat': 'Yes'}]\n"
]
}
],
@@ -563,7 +565,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Say our initial hypothesis is that there should be an alternative option and let's run the algorithm."
"Say our initial hypothesis is that there should be an alternative option and lets run the algorithm."
]
},
{
@@ -613,7 +615,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[{'Pat': '!Full', 'Alt': 'Yes'}, {'Hun': 'No', 'Res': 'No', 'Rain': 'No', 'Pat': '!None'}, {'Fri': 'Yes', 'Type': 'Thai', 'Bar': 'No'}, {'Fri': 'No', 'Type': 'Italian', 'Bar': 'Yes', 'Alt': 'No', 'Est': '0-10'}, {'Fri': 'No', 'Bar': 'No', 'Est': '0-10', 'Type': 'Thai', 'Rain': 'Yes', 'Alt': 'No'}, {'Fri': 'Yes', 'Bar': 'Yes', 'Est': '30-60', 'Hun': 'Yes', 'Rain': 'No', 'Alt': 'Yes', 'Price': '$'}]\n"
"[{'Alt': 'Yes', 'Type': '!Thai', 'Hun': '!No', 'Bar': '!Yes'}, {'Alt': 'No', 'Fri': 'No', 'Pat': 'Some', 'Price': '$', 'Type': 'Burger', 'Est': '0-10'}, {'Rain': 'Yes', 'Res': 'No', 'Type': '!Burger'}, {'Alt': 'No', 'Bar': 'Yes', 'Hun': 'Yes', 'Pat': 'Some', 'Price': '$$', 'Rain': 'Yes', 'Res': 'Yes', 'Est': '0-10'}, {'Alt': 'No', 'Bar': 'No', 'Pat': 'Some', 'Price': '$$', 'Est': '0-10'}, {'Alt': 'Yes', 'Hun': 'Yes', 'Pat': 'Full', 'Price': '$', 'Res': 'No', 'Type': 'Burger', 'Est': '30-60'}]\n"
]
}
],
@@ -627,6 +629,13 @@
"source": [
"It might be quite complicated, with many disjunctions if we are unlucky, but it will always be correct, as long as a correct hypothesis exists."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
Expand All @@ -645,7 +654,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.3"
"version": "3.6.5"
}
},
"nbformat": 4,
48 changes: 36 additions & 12 deletions learning.py
@@ -4,7 +4,8 @@
removeall, unique, product, mode, argmax, argmax_random_tie, isclose, gaussian,
dotproduct, vector_add, scalar_vector_product, weighted_sample_with_replacement,
weighted_sampler, num_or_str, normalize, clip, sigmoid, print_table,
open_data, sigmoid_derivative, probability, norm, matrix_multiplication
open_data, sigmoid_derivative, probability, norm, matrix_multiplication, relu, relu_derivative,
tanh, tanh_derivative, leaky_relu, leaky_relu_derivative, elu, elu_derivative
)

import copy
@@ -652,7 +653,7 @@ def predict(example):


def NeuralNetLearner(dataset, hidden_layer_sizes=None,
learning_rate=0.01, epochs=100):
learning_rate=0.01, epochs=100, activation=sigmoid):
"""Layered feed-forward network.
hidden_layer_sizes: List of number of hidden units per hidden layer
learning_rate: Learning rate of gradient descent
@@ -664,9 +665,9 @@ def NeuralNetLearner(dataset, hidden_layer_sizes=None,
o_units = len(dataset.values[dataset.target])

# construct a network
raw_net = network(i_units, hidden_layer_sizes, o_units)
raw_net = network(i_units, hidden_layer_sizes, o_units, activation)
learned_net = BackPropagationLearner(dataset, raw_net,
learning_rate, epochs)
learning_rate, epochs, activation)

def predict(example):
# Input nodes
@@ -695,7 +696,7 @@ def random_weights(min_value, max_value, num_weights):
return [random.uniform(min_value, max_value) for _ in range(num_weights)]


def BackPropagationLearner(dataset, net, learning_rate, epochs):
def BackPropagationLearner(dataset, net, learning_rate, epochs, activation=sigmoid):
"""[Figure 18.23] The back-propagation algorithm for multilayer networks"""
# Initialise weights
for layer in net:
@@ -743,8 +744,18 @@ def BackPropagationLearner(dataset, net, learning_rate, epochs):
# Error for the MSE cost function
err = [t_val[i] - o_nodes[i].value for i in range(o_units)]

# The activation function used is the sigmoid function
delta[-1] = [sigmoid_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]
# Compute the output-layer delta using the derivative of the chosen activation function
if activation == sigmoid:
    delta[-1] = [sigmoid_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]
elif activation == relu:
    delta[-1] = [relu_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]
elif activation == tanh:
    delta[-1] = [tanh_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]
elif activation == elu:
    delta[-1] = [elu_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]
else:
    delta[-1] = [leaky_relu_derivative(o_nodes[i].value) * err[i] for i in range(o_units)]

# Backward pass
h_layers = n_layers - 2
@@ -756,7 +767,20 @@ def BackPropagationLearner(dataset, net, learning_rate, epochs):
# weights from each ith layer node to each i + 1th layer node
w = [[node.weights[k] for node in nx_layer] for k in range(h_units)]

delta[i] = [sigmoid_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
if activation == sigmoid:
delta[i] = [sigmoid_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
for j in range(h_units)]
elif activation == relu:
delta[i] = [relu_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
for j in range(h_units)]
elif activation == tanh:
delta[i] = [tanh_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
for j in range(h_units)]
elif activation == elu:
delta[i] = [elu_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
for j in range(h_units)]
else:
delta[i] = [leaky_relu_derivative(layer[j].value) * dotproduct(w[j], delta[i+1])
for j in range(h_units)]

# Update weights
@@ -800,14 +824,14 @@ class NNUnit:
weights: Weights to incoming connections
"""

def __init__(self, weights=None, inputs=None):
def __init__(self, activation=sigmoid, weights=None, inputs=None):
self.weights = weights or []
self.inputs = inputs or []
self.value = None
self.activation = sigmoid
self.activation = activation


def network(input_units, hidden_layer_sizes, output_units):
def network(input_units, hidden_layer_sizes, output_units, activation=sigmoid):
"""Create Directed Acyclic Network of given number layers.
hidden_layers_sizes : List number of neuron units in each hidden layer
excluding input and output layers
@@ -818,7 +842,7 @@ def network(input_units, hidden_layer_sizes, output_units):
else:
layers_sizes = [input_units] + [output_units]

net = [[NNUnit() for n in range(size)]
net = [[NNUnit(activation) for n in range(size)]
for size in layers_sizes]
n_layers = len(net)

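
A quick usage sketch of the new `activation` parameter (the dataset, layer sizes, and hyperparameters are illustrative):

```python
from learning import DataSet, NeuralNetLearner
from utils import relu

iris = DataSet(name='iris')
nn = NeuralNetLearner(iris, hidden_layer_sizes=[4],
                      learning_rate=0.15, epochs=100, activation=relu)
print(nn([5.1, 3.5, 1.4, 0.2]))  # predict the class of one flower
```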
