Harness #28

Merged: 239 commits into irit-melodi:master on May 18, 2015

Conversation

@kowey (Contributor) commented on May 18, 2015

Notes:

  • This is part of a cross-repo refactor (including irit-rst-dt and irit-stac), so there will be multiple pull requests with the same name.
  • It only looks like there are a lot of patches: what actually happened is that I merged the irit-rst-dt repo into attelo and did some git mv (irit-rst-dt still stands alone; I just wanted to preserve history). Only the changes after 646b99b matter.

I may just merge this myself, as I understand Mathieu is busy.

kowey added 30 commits July 14, 2014 07:45
Hide any extra arguments we may have to pass along
No need for our own stack trace.
It's confusing.
Makes it more convenient for saving results
Instead of using the monolithic attelo evaluate command,
break the evaluation process down into

- extracting folds (attelo enfold)
- looping over folds:
  - learning and saving model (attelo learn)
  - decoding with saved model (shared!) (attelo decode)
- summarising the results (attelo report)

This may also one day open the way to running folds concurrently.
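As a rough sketch, the split commands make the cross-validation loop explicit. The subcommand names below come from the message above, but the argument lists and the fold count are placeholders, not attelo's actual flags:

```python
import subprocess

def attelo(*args):
    """Run an attelo subcommand; real invocations also take corpus, model
    and fold arguments, which are omitted in this sketch."""
    subprocess.check_call(("attelo",) + args)

N_FOLDS = 10  # illustrative fold count

attelo("enfold")             # assign documents to folds, once
for fold in range(N_FOLDS):
    attelo("learn")          # fit and save the model for this fold
    attelo("decode")         # decode the held-out fold with the saved model
attelo("report")             # summarise the scores across all folds
```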
Only Pandoc supports definition lists, I suppose
We want to maximise code sharing between the various experiments
Pick up an evaluation where we left off
Kill:
- any scratch directories
- any incomplete eval dirs

It's also tempting to get rid of feature directories without
evaluations, but that's not likely to be a common case in future
development.
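Purely as an illustration of that cleanup, here is a sketch under assumed naming conventions; the scratch-*/eval-* glob patterns and the report marker file are hypothetical, not the harness's real layout:

```python
import shutil
from pathlib import Path

def clean(data_dir="TEST"):
    """Remove scratch directories and evaluation directories that were
    never finished (hypothetical conventions: see the glob patterns)."""
    data = Path(data_dir)
    for scratch in data.glob("scratch-*"):
        shutil.rmtree(scratch)
    for evaldir in data.glob("eval-*"):
        if not (evaldir / "report").exists():  # no final report => incomplete
            shutil.rmtree(evaldir)
```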
I want it to be as easy as possible to know where we are at a glance.
kowey and others added 28 commits April 13, 2015 17:53
ENH add random forest and decision tree classifiers
Oops! I must have done this only locally on the cluster
Oops! I must have done this only locally on the cluster
Conflicts:
	irit_rst_dt/local.py
I just noticed that it seems possible to use joblib parallel
in a sort of producer/consumer pattern: we don't need to have
all our jobs ready in one go

So if I understand correctly, this means we can have a generator
expression that yields decoder jobs as soon as the parser for
them has been fit.

Outcome:

1. No more waiting for all the learners to complete before we
   decode; start decoding as soon as the relevant learner is
   done
2. But still apply parallelism across decoders for multiple
   configurations

So if all learners are fast, that's great, we get to run lots
of jobs in parallel.  If some learners are slow, that's hopefully
still OK because we can still work on decoding while they're
crunching away.

I hope this means less dead space, and more cores humming away in
parallel.
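A minimal, self-contained sketch of that producer/consumer idea with joblib; the fit/decode functions and the learner/decoder names are placeholders rather than attelo's API:

```python
from joblib import Parallel, delayed

def fit(learner):
    """Placeholder: fit one learner and return its model."""
    return "model-for-" + learner

def decode_one(model, decoder):
    """Placeholder: decode one configuration with an already-fitted model."""
    return (model, decoder)

def decoder_jobs(learners, decoders):
    # Yield each decoding job as soon as its learner has been fit, so a
    # slow learner only delays its own decoders, not everyone else's.
    for learner in learners:
        model = fit(learner)
        for decoder in decoders:
            yield delayed(decode_one)(model, decoder)

# Parallel consumes the generator lazily (see its pre_dispatch argument),
# so workers can start decoding while later learners are still being fit.
results = Parallel(n_jobs=4)(decoder_jobs(["maxent", "svm"],
                                          ["mst", "astar"]))
```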
FIX update and fix calls to perceptrons
Conflicts:
	.gitignore
	README.md
	requirements.txt
	setup.py
You will need to define a HarnessConfig, which contains the settings,
file name conventions, and evaluations for the particular harness.

This was taken from irit_rst_dt.
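To make the idea concrete, here is a hypothetical sketch of what such a configuration object could look like; the field names and values are illustrative only, and the real HarnessConfig in the code is the authoritative interface:

```python
from collections import namedtuple

# Illustrative only: settings, file name conventions, and the evaluations
# (learner/decoder combinations) that a particular harness wants to run.
HarnessConfig = namedtuple("HarnessConfig",
                           ["dataset_dir",   # where corpora/features live
                            "eval_dir",      # where results should be written
                            "evaluations"])  # learner/decoder pairs to evaluate

CONFIG = HarnessConfig(dataset_dir="data",
                       eval_dir="TEST",
                       evaluations=[("maxent", "mst"),
                                    ("maxent", "astar")])
```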
Don't try to hash models we may not have built
kowey added a commit that referenced this pull request May 18, 2015
kowey merged commit 96468d6 into irit-melodi:master on May 18, 2015