Refactor reporting
saulshanabrook committed Jul 4, 2017
1 parent 47b44e3 commit 636d1f5
Showing 23 changed files with 1,278 additions and 817 deletions.
113 changes: 102 additions & 11 deletions CONTRIBUTING.md
Expand Up @@ -3,27 +3,120 @@
[Here](https://gist.github.com/thelmuth/1361411) is a document describing how
to contribute to this project.

## Logging

We break up the logging of the run into a series of **events**, each
identified by a label.
We call `clojush.log/log!` with both the label of the event and the input
data.

We use
[Plumatic's Graph framework](https://github.com/plumatic/plumbing#graph-the-functional-swiss-army-knife)
to compute data for that event. We define each event's graph
in `clojush.log.events.<label>/compute-graph`. They are compiled with `lazy-compile`,
so that their outputs are lazy maps. This means that the values are only calculated
when we ask for them.
We execute the compiled graph for that event with the input
passed into `log!` merged with a mapping of each previously logged event label to its computed
data.
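
As a rough illustration of the laziness described above, here is a minimal
sketch using Graph's `lazy-compile`. The graph, its `:total` and `:mean` nodes,
and the `calls` counter are hypothetical and are not one of Clojush's actual
event graphs.

```clojure
(require '[plumbing.core :refer [fnk]]
         '[plumbing.graph :as graph])

;; Counts how many times the :total node actually runs (illustration only).
(def calls (atom 0))

;; Hypothetical event graph: :mean depends on :total, which depends on the
;; :errors input. Nodes are listed in dependency order.
(def compute-graph
  (graph/graph
   :total (fnk [errors] (do (swap! calls inc) (reduce + errors)))
   :mean  (fnk [total errors] (/ total (count errors)))))

;; lazy-compile returns a function from an input map to a lazy map of outputs.
(def compute (graph/lazy-compile compute-graph))

;; Building the lazy map does not run any node yet; a node runs the first
;; time its key (or a key that depends on it) is looked up.
(def computed (compute {:errors [1 2 3]}))
```

Looking up `(:mean computed)` forces `:total` and then `:mean`. Keys that are
never asked for are never computed.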

Then, we call all the **handlers** that have handle functions defined
for this event. We define the mapping of event label to handle functions
in `clojush.log.handlers.<handler name>/handler`. Each handle function
is called with a mapping of each event label to its last computed result.
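
The dispatch step can be pictured with a small sketch in plain Clojure. The
names `text-handler` and `call-handlers` are made up for illustration, and the
real dispatch in Clojush may be organized differently.

```clojure
;; Hypothetical handler: a map from event label to a handle function.
(def text-handler
  {:init       (fn [{:keys [init]}] (str "args: " (:args init)))
   :generation (fn [{:keys [generation]}] (str "best: " (:best generation)))})

(defn call-handlers
  "Calls every handler that defines a handle function for `label`, passing
  the map of event label to its last computed data."
  [handlers label label->computed]
  (doall
   (for [handler handlers
         :let [handle (get handler label)]
         :when handle]
     (handle label->computed))))

;; The second handler has no :generation entry, so it is skipped.
(def results
  (call-handlers [text-handler {:config (fn [_] :unused)}]
                 :generation
                 {:init {:args ["x"]} :generation {:best 42}}))
```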

You can see what each event graph takes in as inputs and what it produces,
by running (respectively):

```bash
lein run -m clojush.cli.log/event-inputs
lein run -m clojush.cli.log/event-outputs
```

### Population
There are a bunch of things we might want to know about each
individual in the population for logging purposes. These might
be as simple as the mean of its error vector and as complicated
as the program string of the partially simplified version of itself. Some
of these things we just want to know about the best individual and others
about all individuals, and we never want to compute them unless we need to.

So the `generation` event computes an augmented `population`, where
each item has all the original keys of the individual plus some extra
lazy dynamic ones. It does this by, again, computing a lazy map based on
a graph. It is defined at `clojush.log.events.generation.individual/compute-graph`
and takes as input the original individual, under the key `:individual` as
well as the argmap and some computed stats on the population. The lazy map output
is merged with the original individual record. You can see the input and
output of this graph with:

```bash
lein run -m clojush.cli.log/individual-input
lein run -m clojush.cli.log/individual-output
```

Since we need these extra values on the "best" individual, for logging,
we compute the best from this augmented population.

After the generation finishes, the pushgp function needs to know if
we have succeeded and get the "best" individual, so that it can return it.
So it gets the computed data from the generation and accesses
the `outcome` and the `best`.
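
To make the idea of an augmented individual concrete, here is a sketch in
plain Clojure. It uses `delay` as a stand-in for the lazy map that Graph
produces, so it is not how Clojush implements it. The keys `:mean-error` and
`:program-str` and the functions `augment` and `best` are hypothetical.

```clojure
(defn augment
  "Returns the individual with extra keys whose values are only computed
  when dereferenced. A stand-in for merging in a lazy map."
  [individual]
  (assoc individual
         :mean-error  (delay (/ (reduce + (:errors individual))
                                (count (:errors individual))))
         :program-str (delay (pr-str (:program individual)))))

(defn best
  "Picks the augmented individual with the lowest mean error."
  [population]
  (apply min-key #(deref (:mean-error %)) population))

(def population
  (map augment [{:errors [1 2] :program '(a)}
                {:errors [0 1] :program '(b)}]))

(def winner (best population))
```

Selecting the best individual forces `:mean-error` on every individual, but
`:program-str` stays uncomputed until something actually reads it.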

### Modifying Logging

#### Adding computed data
OK let's say you want to log some more data during the run. First, decide
which graph it should be in:

* Depends on the command line arguments? Then it belongs in `init`
* Computes something about the machine environment (like git hash) or depends
on the push argmap? Put that in `config`.
* Computed every generation and is population wide? -> `generation`
* Computed for each individual in every generation? -> `individual`

Then, add it to that graph. The most straightforward way to do that
is to define a keyword function (`defnk`) in that file and put
that keyword function in the `compute-graph` in that same file. Use the above
logic and CLI commands to understand what you can ask for as input in the `defnk`.
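
As a sketch of what this might look like, the following adds a made-up
`solution-count` value to a small graph. The node name and the `:population`
input are assumptions for illustration; check the CLI commands above for the
inputs a real graph offers. The graph is written here with an explicit
keyword; the graphs in this repository pass the `defnk` functions directly.

```clojure
(require '[plumbing.core :refer [defnk]]
         '[plumbing.graph :as graph])

;; Hypothetical new value: how many individuals have zero total error.
;; The argument name `population` is the key this node asks the graph for.
(defnk solution-count [population]
  (count (filter #(zero? (reduce + (:errors %))) population)))

;; Register the keyword function in the graph under :solution-count.
(def compute-graph
  (graph/graph
   :solution-count solution-count))

(def compute (graph/lazy-compile compute-graph))
```

Calling `(compute {:population [...]})` returns a lazy map, and
`:solution-count` is only calculated if a handler asks for it.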


#### Adding a handler
You have found the fatal flaw in Clojush! No XML support :/ So you go about
creating an XML handler, so that you can get everything your heart desires
into big long XML files.

All you need to do is create a `clojush.log.handlers.xml` file and,
in it, create a `handlers` map that maps from each event keyword to a
function that takes in a map of each event keyword to its computed data.
The easiest way to do this is to use a keyword function (`defnk`), which
makes destructuring the input simple. Whatever values you pull off
of the computed data will be calculated if they haven't already,
as per the lazy map abstraction.

Then, add that handler to `clojush.log.handlers/handlers` and you are
good to go!
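
A minimal sketch of such a handle function, assuming the generation's computed
data has an `:index` key (hypothetical) alongside the `:outcome` mentioned
above. It only builds a string and does no XML escaping; a real handler would
also write the result somewhere.

```clojure
(require '[plumbing.core :refer [defnk]])

;; The nested binding [:generation index outcome] pulls :index and :outcome
;; out of the :generation entry of the input map.
(defnk handle-generation [[:generation index outcome]]
  (str "<generation index=\"" index "\">"
       "<outcome>" (name outcome) "</outcome>"
       "</generation>"))

;; Maps each event keyword to its handle function.
(def handler
  {:generation handle-generation})
```

Because only `index` and `outcome` are destructured, those are the only
values this handler would force from the lazy computed data.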

## Travis CI
Recently we have begun using [Travis CI](travis-ci.org) to automate multiple
parts of development.
We use [Travis CI](travis-ci.org) for...

### Testing

Primarily it serves as a way to test every branch and pull request, using commands
like `lein check` and `lein test`. Currently, the test cases are very limited
It tests every branch and pull request, using commands
like `lein check` and `lein test`. The test cases are very limited
and do not cover much of the codebase.

### Docs

Docs are auto generated from function metadata using
The docs are auto generated from function metadata using
[`codox`](https://github.com/weavejester/codox).

On every commit to master, the docs are automatically regenerated and pushed
to the [`gh-pages` branch](http://lspector.github.io/Clojush/).

To generate them locally run `lein codox` and then open `doc/index.html`.

Currently, generating the docs has the side effect of running some examples,
Generating the docs has the side effect of running some examples,
[because I couldn't figure out how to stop codox from loading all example files](https://github.com/weavejester/codox/issues/100).

In the metadata, you can [skip functions](https://github.com/weavejester/codox#metadata-options)
Expand All @@ -36,11 +129,11 @@ It needs this so it can push the updated docs back to Github.
### Releases

We use [the `lein release` command](https://github.com/technomancy/leiningen/blob/master/doc/DEPLOY.md#releasing-simplified)
to add a new release on every build on the `master` branch. Check the
`:release-tasks` key in the [`project.clj`](./project.clj) for a list of
to add a new release on every build on the `master` branch. Check the
`:release-tasks` key in the [`project.clj`](./project.clj) for a list of
all steps it takes.

This requires setting the `LEIN_USERNAME` and `LEIN_PASSWORD` in
This requires setting the `LEIN_USERNAME` and `LEIN_PASSWORD` in
the [repository settings in Travis](http://docs.travis-ci.com/user/environment-variables/#Defining-Variables-in-Repository-Settings),
so that it can push the release to Clojars. It also needs the `GITHUB_TOKEN`
in order to push the added tag and commit back to Github.
Expand All @@ -55,5 +148,3 @@ Travis will:
2. Create jar and push that to clojars
3. bump release number to next minor version
4. Push new commits + tag back to github


6 changes: 4 additions & 2 deletions project.clj
@@ -1,4 +1,4 @@
(defproject clojush "3.0.0-1-SNAPSHOT"
(defproject clojush "3.0.0-1-SNAPSHOT"
:description "The Push programming language and the PushGP genetic programming
system implemented in Clojure. See http://pushlanguage.com"
:license {:name "Eclipse Public License"
Expand All @@ -14,7 +14,9 @@
[clj-random "0.1.7"]
;; https://mvnrepository.com/artifact/org.apache.commons/commons-math3
[org.apache.commons/commons-math3 "3.2"]
[cheshire "5.7.1"]]
[cheshire "5.7.1"]
[prismatic/plumbing "0.5.4"]
[mvxcvi/puget "1.0.1"]]
:plugins [[lein-codox "0.9.1"]
[lein-shell "0.5.0"]
[lein-gorilla "0.4.0"]
15 changes: 7 additions & 8 deletions src/clojush/args.clj
@@ -1,8 +1,7 @@
(ns clojush.args
(:require [clj-random.core :as random])
(:use [clojush globals random util pushstate]
[clojush.instructions.tag]
[clojush.pushgp report]))
[clojush.instructions.tag]))

(def push-argmap
(atom (sorted-map
Expand Down Expand Up @@ -89,8 +88,8 @@
:uniform-addition 0.0
:uniform-addition-and-deletion 0.0
:uniform-combination-and-deletion 0.0
:genesis 0.0
}
:genesis 0.0}

;; The map supplied to :genetic-operator-probabilities should contain genetic operators
;; that sum to 1.0. All available genetic operators are defined in clojush.pushgp.breed.
;; Along with single operators, pipelines (vectors) containing multiple operators are
Expand Down Expand Up @@ -376,11 +375,11 @@
;; The number of simplification steps that will happen during final report
;; simplifications.

:problem-specific-initial-report default-problem-specific-initial-report
:problem-specific-initial-report (fn [argmap] :no-problem-specific-initial-report-function-defined)
;; A function can be called to provide a problem-specific initial report, which happens
;; before the normal initial report is printed.

:problem-specific-report default-problem-specific-report
:problem-specific-report (fn [& args] :no-problem-specific-report-function-defined)
;; A function can be called to provide a problem-specific report, which happens before
;; the normal generational report is printed.

Expand Down Expand Up @@ -462,10 +461,10 @@
;; Should be in the format "<hostname>:<port>"
;; If set, will send logs of each run to a server running on this
;; host
:label nil
:label nil)))
;; If set, will send this in the configuration of the run, to the
;; external record
)))


(defn load-push-argmap
[argmap]
42 changes: 42 additions & 0 deletions src/clojush/cli/log.clj
@@ -0,0 +1,42 @@
(ns clojush.cli.log
(:require [puget.printer :as puget]
[plumbing.fnk.pfnk :as pfnk]
[plumbing.core :refer [map-vals]]

[clojush.log.events :refer [label->compute-graph]]
[clojush.log.events.generation.individual :as individual])
(:import (schema.core.Predicate)
(schema.core.AnythingSchema)))

(def schema-handlers
{schema.core.Predicate (fn [_1 _2] nil)
schema.core.AnythingSchema (fn [_1 _2] nil)})

(defn my-print [form]
(puget/cprint
form
{:print-handlers schema-handlers}))


(defn event-inputs
"Prints the mapping of each event to its inputs"
[& args]
(my-print
(map-vals pfnk/input-schema label->compute-graph)))

(defn event-outputs
"Prints the mapping of each event to its outputs"
[& args]
(my-print
(map-vals pfnk/output-schema label->compute-graph)))


(defn individual-input [& args]
(my-print
(pfnk/input-schema
individual/compute-graph)))

(defn individual-output [& args]
(my-print
(pfnk/output-schema
individual/compute-graph)))
15 changes: 5 additions & 10 deletions src/clojush/core.clj
Expand Up @@ -16,33 +16,28 @@
;; for more details.

(ns clojush.core
(:require [clojush.pushgp.record :as r])
(:use [clojush.pushgp pushgp report])
(:require [clojush.log :refer [log!]])
(:use [clojush.pushgp pushgp])
(:gen-class))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; main function

(defn -main
(defn -main
"A main function for Clojush, which assumes that the first argument is the name
of a problem file that contains an argmap of arguments to PushGP.
Exits after completion of the call.
Any arguments after the first are treated as arguments to PushGP as key-value pairs.
This allows one to run an example with a call from the OS shell prompt like:
lein run examples.simple-regression :population-size 3000"
[& args]
(r/new-run!)
(println "Command line args:" (apply str (interpose \space args)))
(let [param-list (map #(if (.endsWith % ".ser")
(str %)
(read-string %))
(rest args))]
(require (symbol (r/config-data! [:problem-file] (first args))))
(require (symbol (first args)))
(let [example-params (eval (symbol (str (first args) "/argmap")))
params (merge example-params (apply sorted-map param-list))]
(println "######################################")
(println "Parameters set at command line or in problem file argmap; may or may not be default:")
(print-params (into (sorted-map) params))
(println "######################################")
(log! :init {:args args :params params})
(pushgp params)
(System/exit 0))))
19 changes: 19 additions & 0 deletions src/clojush/log.clj
@@ -0,0 +1,19 @@
(ns clojush.log
(:require [clojush.structured-logger :refer [->structured-logger]]
[clojush.log.events :refer [label->compute-graph]]
[clojush.log.handlers :refer [handlers]]))
;; this stuff is spread out over many files in subdirectories
;; Not only does this help with keeping files shorter, but it also allows
;; us to use the defnk names directly in the graph, because we don't
;; have to namespace them. For example, if we define `best` in the
;; lexicase file, then we can just throw that in the graph we defined
;; there, and it will be registered under :best.

(def structured-logger
(->structured-logger
{:handlers handlers
:label->compute-graph label->compute-graph}))

(def log! (:log! structured-logger))

(def get-computed (:get-computed structured-logger))
9 changes: 9 additions & 0 deletions src/clojush/log/events.clj
@@ -0,0 +1,9 @@
(ns clojush.log.events
(:require [clojush.log.events.init]
[clojush.log.events.config]
[clojush.log.events.generation]))

(def label->compute-graph
{:init clojush.log.events.init/compute-graph
:config clojush.log.events.config/compute-graph
:generation clojush.log.events.generation/compute-graph})
45 changes: 45 additions & 0 deletions src/clojush/log/events/config.clj
@@ -0,0 +1,45 @@
(ns clojush.log.events.config
(:require [plumbing.core :refer [defnk]]
[plumbing.graph :as graph]
[clj-random.core :as random]
[local-file]
[clojure.string :as string]))

(defnk clojush-version []
(let [version-str (apply str (butlast (re-find #"\".*\""
(first (string/split-lines
(local-file/slurp* "project.clj"))))))]
(.substring version-str 1 (count version-str))))

(defnk argmap [argmap-input]
argmap-input)

(defnk registered-instructions [registered-instructions-input]
registered-instructions-input)

(defnk argmap-with-random-str [argmap-input]
(update argmap-input :random-seed random/seed-to-string))

(defnk git-hash []
(let [dir (local-file/project-dir)]
(string/trim
(slurp
(str dir
"/.git/"
(subs
(string/trim
(slurp
(str dir "/.git/HEAD")))
5))))))

(defnk initialization-ms [timing-map]
(:initialization timing-map))

(def compute-graph
(graph/graph
clojush-version
argmap
registered-instructions
git-hash
initialization-ms
argmap-with-random-str))
