Refactor reporting
saulshanabrook committed Jul 7, 2017
1 parent fffb0fe commit 5f046d3
Showing 28 changed files with 1,423 additions and 815 deletions.
195 changes: 190 additions & 5 deletions CONTRIBUTING.md
@@ -3,13 +3,197 @@
[Here](https://gist.github.com/thelmuth/1361411) is a document describing how
to contribute to this project.

## Logging

We break up the logging of the run into a series of **events**, each
identified by a label.
We call `clojush.log/log!` with both the label of the event and the input
data.

We use
[Plumatic's Graph framework](https://github.com/plumatic/plumbing#graph-the-functional-swiss-army-knife)
to compute data for that event. We define each event's graph
in `clojush.log.events.<label>/compute-graph`. The graphs are compiled with `lazy-compile`
so that their outputs are lazy maps, meaning that values are only calculated
when we ask for them.
We execute the compiled graph for that event with the input
passed into `log!` merged with a mapping of each previously logged event label to its computed
data.
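
As a minimal sketch of this laziness (not one of Clojush's actual graphs), the
example below assumes the `prismatic/plumbing` dependency is on the classpath.
The graph `stats-graph`, its keys, and the `computed` atom are made up for
illustration.

```clojure
(require '[plumbing.core :refer [fnk]]
         '[plumbing.graph :as graph])

;; Hypothetical bookkeeping: records which nodes have actually run.
(def computed (atom []))

;; A made-up graph; each node is a keyword function over the input or other nodes.
(def stats-graph
  {:n      (fnk [xs]   (swap! computed conj :n)      (count xs))
   :mean   (fnk [xs n] (swap! computed conj :mean)   (/ (reduce + xs) n))
   :sorted (fnk [xs]   (swap! computed conj :sorted) (sort xs))})

;; lazy-compile returns a function from an input map to a lazy map of outputs.
(def lazy-stats (graph/lazy-compile stats-graph))

(def out (lazy-stats {:xs [1 2 3 6]}))

;; Asking for :mean forces :n (its dependency) and :mean, but not :sorted.
(def mean-value (:mean out))
```

Because `:sorted` is never requested, its function never runs, which is the
property the logging graphs rely on to avoid unneeded work.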

Then, we call all the **handlers** that have handle functions defined
for this event. We define the mapping of event label to handle functions
in `clojush.log.handlers.<handler name>/handler`. Each handle function
is called with a mapping of each event label to its last computed result.
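
The dispatch described above can be sketched in plain Clojure. This is a
simplified stand-in, not the real `clojush.log/log!`: it stores the input data
directly instead of running a compute graph, and the handler maps, the
`computed` atom, and the `output` atom are hypothetical.

```clojure
;; Event label -> last data logged for that event (no graph computation here).
(def computed (atom {}))

;; Collects what each handler "wrote", so the example stays side-effect free.
(def output (atom []))

;; A handler is a map from event label to a handle function.
(def text-handler
  {:init       (fn [events]
                 (swap! output conj [:text :init]))
   :generation (fn [events]
                 (swap! output conj
                        [:text :generation (get-in events [:generation :index])]))})

;; This handler only cares about :generation, but can still read :init data.
(def json-handler
  {:generation (fn [events]
                 (swap! output conj
                        [:json :generation (get-in events [:init :args])]))})

(def handlers [text-handler json-handler])

(defn log!
  "Records the data for `label`, then calls every handler that defines a
  handle function for that label with the full label -> data mapping."
  [label data]
  (swap! computed assoc label data)
  (doseq [handler handlers
          :let [handle (get handler label)]
          :when handle]
    (handle @computed)))

(log! :init {:args ["demo"]})
(log! :generation {:index 0})
```

Each handle function sees every event logged so far, so a `:generation`
handler can read values that were produced during `:init`.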

To see what each event graph takes as input and what it produces,
run (respectively):

```bash
lein run -m clojush.cli.log/event-inputs
lein run -m clojush.cli.log/event-outputs
```

### Population
There are a bunch of things we might want to know about each
individual in the population for logging purposes. These might
be as simple as the mean of its error vector or as complicated
as the program string of its partially simplified version. Some
of these things we just want to know about the best individual and others
about all individuals, and we never want to compute them unless we need to.

So the `generation` event computes an augmented `population`, where
each item has all the original keys of the individual plus some extra
lazy dynamic ones. It does this by, again, computing a lazy map based on
a graph. It is defined at `clojush.log.events.generation.individual/compute-graph`
and takes as input the original individual, under the key `:individual` as
well as the argmap and some computed stats on the population. The lazy map output
is merged with the original individual record. You can see the input and
output of this graph with:

```bash
lein run -m clojush.cli.log/individual-input
lein run -m clojush.cli.log/individual-output
```

Since we need these extra values on the "best" individual, for logging,
we compute the best from this augmented population.
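
To illustrate the idea of an augmented population, here is a sketch in plain
Clojure. It swaps in `delay` for the lazy map that the real graph produces, so
the extra value has to be dereferenced explicitly. The `:errors` key, the
`augment` function, and the sample population are assumptions made for the
example.

```clojure
(defn augment
  "Returns the individual with an extra :mean-error key. The value is a delay,
  so the mean is only calculated if something dereferences it."
  [individual]
  (assoc individual
         :mean-error (delay (/ (reduce + (:errors individual))
                               (count (:errors individual))))))

;; Made-up individuals; real ones carry many more keys.
(def population
  [{:genome [:a] :errors [4 2]}
   {:genome []   :errors [1 1]}])

(def augmented (map augment population))

;; The "best" is chosen from the augmented population, so it keeps the
;; extra key and can be logged without recomputing anything.
(def best (apply min-key #(deref (:mean-error %)) augmented))
```

With real lazy maps no `deref` is needed; the value is calculated the first
time the key is looked up.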

After the generation finishes, the `pushgp` function needs to know whether
we have succeeded and needs the "best" individual, so that it can return it.
So it gets the computed data from the generation and accesses
the `outcome` and the `best`.

### Modifying Logging

#### Adding computed data
OK let's say you want to log some more data during the run. First, decide
which graph it should be in:

* Depends on the command line arguments? Then it belongs in `init`
* Computes something about the machine environment (like git hash) or depends
on the push argmap? Put that in `config`.
* Computed every generation and is population wide? -> `generation`
* Computed for each individual in every generation? -> `individual`

Then, add it to that graph. The most straightforward way to do that
is to define a keyword function (`defnk`) in that file and put
that keyword function in the `compute-graph` in that same file. Use the above
logic and CLI commands to understand what you can ask for as input in the `defnk`.


#### Adding a handler
If you want to create a new handler to support logging in some new format
or new source, you should:

1. Add a toggle for the handler in `clojush.args/argmap`.
2. Create a `clojush.log.handlers.<handler name>` file. In it, define a `handler`
var that maps from event labels to handle functions. Make those handle
functions execute based on the toggle you defined in the argmap.
3. Add that handler to `clojush.log.handlers/handlers`.


#### Example
For example, let's create a handler that logs, to a file, the number of empty
genomes in each population every 20 generations.

First we add a couple of options to `clojush.args/argmap`:

```clojure
:print-empty-genome-logs false
:empty-genome-logs-every-n-generations 20
:empty-genome-logs-filename "empty-genome-logs.txt"
```

Then, let's make a file for this handler at `clojush.log.handlers.empty-genome`:

```clojure
(ns clojush.log.handlers.empty-genome
  (:require [plumbing.core :refer [defnk]]
            [clojure.java.io :as io]))

(defnk handle-config
  "Save the header of the file before the run starts"
  [[:config [:argmap print-empty-genome-logs empty-genome-logs-filename]]]
  (when print-empty-genome-logs
    (spit empty-genome-logs-filename "Generation NumEmptyGenomes\n")))

(defnk handle-generation
  "At every generation, if it's the nth generation, save the # of empty genomes"
  [[:config [:argmap print-empty-genome-logs
                     empty-genome-logs-filename
                     empty-genome-logs-every-n-generations]]
   [:generation index :as generation]]
  (when (and print-empty-genome-logs
             (= 0 (mod index empty-genome-logs-every-n-generations)))
    (spit
      empty-genome-logs-filename
      ;; End each row with a newline so rows don't run together.
      (str index " " (:empty-genomes-n generation) "\n")
      :append true)))

(def handler
  {:config handle-config
   :generation handle-generation})
```

Then add that `handler` to `clojush.log.handlers/handlers`. As you can
see, I didn't actually do any computation in the handler to figure out
the `empty-genomes-n`. Instead, I just asked for that value from the
`generation`. So let's define this key on the generation. First
we add a new keyword function in `clojush.log.events.generation`:

```clojure
(defnk empty-genomes-n [population]
  (count (filter #(empty? (:genome %)) population)))

(def compute-graph
  (graph/graph
    ...

    empty-genomes-n))
```

Here, `population` is itself computed by another keyword function in the
same graph.

Adding `empty-genomes-n` to `clojush.log.events.generation/compute-graph` registers it
under the key `:empty-genomes-n`. The graph infers the key from the name of the keyword function.

By moving the computation out of the handler, any other handler can now also access
this attribute of the generation.

One other thing we could do to clean this up is to add an `empty-genome?` value on
each individual. To do this, add a keyword function in `clojush.log.events.generation.individual`:

```clojure
(defnk empty-genome? [[:individual genome]]
  (empty? genome))

(def compute-graph
  (graph/graph
    ...
    empty-genome?))
```

Then we can clean up the generation level attribute:

```clojure
(defnk empty-genomes-n [population]
  (count (filter :empty-genome? population)))
```

### Debugging the graph

If you set the `CLOJUSH_DEBUG_GRAPH` environment variable, it will
print to stderr as values in the graph are calculated.
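
The actual implementation of this toggle is not shown here. As a rough sketch
of how such a switch could work, the example below wraps each node function so
that it reports to stderr before running. The names `debug-graph?`,
`wrap-debug`, and `graph-nodes` are hypothetical, and treating any set value of
the variable as "on" is an assumption.

```clojure
;; Assumption: the variable counts as enabled whenever it is set at all.
(def debug-graph?
  (some? (System/getenv "CLOJUSH_DEBUG_GRAPH")))

(defn wrap-debug
  "Wraps node function `f` so that, when debugging is enabled, the key `k`
  is printed to stderr each time the node is calculated."
  [k f]
  (fn [input]
    (when debug-graph?
      (binding [*out* *err*]
        (println "computing" k)))
    (f input)))

;; Made-up nodes standing in for the keyword functions of a real graph.
(def graph-nodes
  {:n     (wrap-debug :n     (fn [{:keys [xs]}] (count xs)))
   :total (wrap-debug :total (fn [{:keys [xs]}] (reduce + xs)))})

(def n ((:n graph-nodes) {:xs [1 2 3]}))
```

Writing to stderr keeps the debug output separate from the normal run log on
stdout, so the two can be redirected independently.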

## Travis CI
Recently we have begun using [Travis CI](https://travis-ci.org) to automate multiple
parts of development.

We use [Travis CI](https://travis-ci.org) for...

### Testing

Primarily it serves as a way to test every branch and pull request, using commands
It tests every branch and pull request, using commands
like `lein check` and `lein test`.


@@ -38,17 +222,18 @@ Since there are some things that will always change (like the time and git hash)
there is some manual find and replace logic in `clojush.test.integration-test`
that tries to replace things that will change with `xxx` in the test output.
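
As an illustration of this kind of masking (not the actual code in
`clojush.test.integration-test`), the sketch below replaces a 40-character git
hash and an ISO-style timestamp with `xxx`. The function name `mask-volatile`
and both regular expressions are assumptions chosen for the example.

```clojure
(require '[clojure.string :as str])

(defn mask-volatile
  "Replaces values that differ between runs with the placeholder xxx,
  so that two outputs can be compared as plain strings."
  [output]
  (-> output
      ;; Full-length git commit hashes (40 lowercase hex characters).
      (str/replace #"\b[0-9a-f]{40}\b" "xxx")
      ;; Timestamps such as 2017-07-07T12:00:00.
      (str/replace #"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}" "xxx")))

(def sample
  (str "git hash: " (apply str (repeat 40 "a")) "\n"
       "started: 2017-07-07T12:00:00\n"
       "population size: 1000"))

(def masked (mask-volatile sample))
```

Only the volatile values are replaced; stable lines such as the population
size pass through unchanged and still take part in the comparison.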


### Docs

Docs are auto generated from function metadata using
The docs are auto generated from function metadata using
[`codox`](https://github.com/weavejester/codox).

On every commit to master, the docs are automatically regenerated and pushed
to the [`gh-pages` branch](http://lspector.github.io/Clojush/).

To generate them locally run `lein codox` and then open `doc/index.html`.

Currently, generating the docs have the side effect of running some examples,
Generating the docs has the side effect of running some examples,
[because I couldn't figure out how to stop codox from loading all example files](https://github.com/weavejester/codox/issues/100).

In the metadata, you can [skip functions](https://github.com/weavejester/codox#metadata-options)
3 changes: 2 additions & 1 deletion project.clj
@@ -15,7 +15,8 @@
;; https://mvnrepository.com/artifact/org.apache.commons/commons-math3
[org.apache.commons/commons-math3 "3.2"]
[cheshire "5.7.1"]
[prismatic/plumbing "0.5.4"]]
[prismatic/plumbing "0.5.4"]
[mvxcvi/puget "1.0.1"]]
:plugins [[lein-codox "0.9.1"]
[lein-shell "0.5.0"]
[lein-gorilla "0.4.0"]
15 changes: 7 additions & 8 deletions src/clojush/args.clj
@@ -1,8 +1,7 @@
(ns clojush.args
(:require [clj-random.core :as random])
(:use [clojush globals random util pushstate]
[clojush.instructions.tag]
[clojush.pushgp report]))
[clojush.instructions.tag]))

(def push-argmap
(atom (sorted-map
@@ -89,8 +88,8 @@
:uniform-addition 0.0
:uniform-addition-and-deletion 0.0
:uniform-combination-and-deletion 0.0
:genesis 0.0
}
:genesis 0.0}

;; The map supplied to :genetic-operator-probabilities should contain genetic operators
;; that sum to 1.0. All available genetic operators are defined in clojush.pushgp.breed.
;; Along with single operators, pipelines (vectors) containing multiple operators are
@@ -376,11 +375,11 @@
;; The number of simplification steps that will happen during final report
;; simplifications.

:problem-specific-initial-report default-problem-specific-initial-report
:problem-specific-initial-report (fn [argmap] :no-problem-specific-initial-report-function-defined)
;; A function can be called to provide a problem-specific initial report, which happens
;; before the normal initial report is printed.

:problem-specific-report default-problem-specific-report
:problem-specific-report (fn [& args] :no-problem-specific-report-function-defined)
;; A function can be called to provide a problem-specific report, which happens before
;; the normal generational report is printed.

@@ -462,10 +461,10 @@
;; Should be in the format "<hostname>:<port>"
;; If set, will send logs of each run to a server running on this
;; host
:label nil
:label nil)))
;; If set, will send this in the configuration of the run, to the
;; external record
)))


(defn load-push-argmap
[argmap]
42 changes: 42 additions & 0 deletions src/clojush/cli/log.clj
@@ -0,0 +1,42 @@
(ns clojush.cli.log
(:require [puget.printer :as puget]
[plumbing.fnk.pfnk :as pfnk]
[plumbing.core :refer [map-vals]]

[clojush.log.events :refer [label->compute-graph]]
[clojush.log.events.generation.individual :as individual])
(:import (schema.core.Predicate)
(schema.core.AnythingSchema)))

(def schema-handlers
{schema.core.Predicate (fn [_1 _2] nil)
schema.core.AnythingSchema (fn [_1 _2] nil)})

(defn my-print [form]
(puget/cprint
form
{:print-handlers schema-handlers}))


(defn event-inputs
"Prints the mapping of each event to its inputs"
[& args]
(my-print
(map-vals pfnk/input-schema label->compute-graph)))

(defn event-outputs
"Prints the mapping of each event to its outputs"
[& args]
(my-print
(map-vals pfnk/output-schema label->compute-graph)))


(defn individual-input [& args]
(my-print
(pfnk/input-schema
individual/compute-graph)))

(defn individual-output [& args]
(my-print
(pfnk/output-schema
individual/compute-graph)))
12 changes: 4 additions & 8 deletions src/clojush/core.clj
@@ -16,8 +16,8 @@
;; for more details.

(ns clojush.core
(:require [clojush.pushgp.record :as r])
(:use [clojush.pushgp pushgp report])
(:require [clojush.log :refer [log!]])
(:use [clojush.pushgp pushgp])
(:gen-class))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -31,18 +31,14 @@
This allows one to run an example with a call from the OS shell prompt like:
lein run examples.simple-regression :population-size 3000"
[& args]
(r/new-run!)
(println "Command line args:" (apply str (interpose \space args)))
(let [param-list (map #(if (.endsWith % ".ser")
(str %)
(read-string %))
(rest args))]
(require (symbol (r/config-data! [:problem-file] (first args))))
(require (symbol (first args)))
(let [example-params (eval (symbol (str (first args) "/argmap")))
params (merge example-params (apply sorted-map param-list))]
(println "######################################")
(println "Parameters set at command line or in problem file argmap; may or may not be default:")
(print-params (into (sorted-map) params))
(log! :init {:args args :params params})
(println "######################################")
(pushgp params)
(shutdown-agents))))
19 changes: 19 additions & 0 deletions src/clojush/log.clj
@@ -0,0 +1,19 @@
(ns clojush.log
(:require [clojush.structured-logger :refer [->structured-logger]]
[clojush.log.events :refer [label->compute-graph]]
[clojush.log.handlers :refer [handlers]]))
;; this stuff is spread out over many files in subdirectories
;; Not only does this help with keeping files shorter, but it also allows
;; us to use the defnk names directly in the graph, because we don't
;; have to namespace them. For example, if we define `best` in the
;; lexicase file, then we can just throw that in the graph we defined
;; there, and it will be registered under :best.

(def structured-logger
(->structured-logger
{:handlers handlers
:label->compute-graph label->compute-graph}))

(def log! (:log! structured-logger))

(def get-computed (:get-computed structured-logger))
9 changes: 9 additions & 0 deletions src/clojush/log/events.clj
@@ -0,0 +1,9 @@
(ns clojush.log.events
(:require [clojush.log.events.init]
[clojush.log.events.config]
[clojush.log.events.generation]))

(def label->compute-graph
{:init clojush.log.events.init/compute-graph
:config clojush.log.events.config/compute-graph
:generation clojush.log.events.generation/compute-graph})
