Merge branch 'release/v0.5-release'
markcoletti committed Feb 3, 2023
2 parents 7bc7ce6 + f254d96 commit 071b14e
Showing 9 changed files with 144 additions and 87 deletions.
6 changes: 5 additions & 1 deletion .gitignore
@@ -2,6 +2,10 @@
/local
*.log
*.pyc
*.pdf
*.pt
*.gz
*-ubyte
__pycache__
docker-nom/build/
\#*\#
@@ -80,4 +84,4 @@ __MACOSX
dask-worker-space/

# Generated data
*.csv
*.csv
33 changes: 26 additions & 7 deletions CHANGELOG
@@ -1,3 +1,22 @@
* `v0.5`, 1/27/23

Installed executable now `gremlin` instead of `gremlin.py`. Compensated
for LEAP API changes. (Note that `gremlin` still depends on the LEAP
`develop` branch and not on the official LEAP `master` branch releases to
take advantage of more up-to-date LEAP features.)

Added optional `with_client` `async` config section for code to be executed
after Dask client is started. This can be used to start worker plugins or
wait for a certain number of workers to become available.

`setup.py` now installs third party dependencies. Please note that the
latest LEAP version in LEAP `develop` will have to be installed.

Exceptions raised in LEAP code are now caught so that errors propagating
from there no longer silently kill Gremlin.

Made a number of minor bug fixes and code format changes.

* `v0.4`, 9/30/22

Replaced `imports` with `preamble` in YAML config files thus giving more
@@ -11,19 +30,19 @@

* `v0.3`, 3/9/22

Add support for config variable `algorithm` that denotes if using a
traditional by-generation EA or an asynchronous steady-state EA
Add support for config variable `algorithm` that denotes if using a
traditional by-generation EA or an asynchronous steady-state EA

* `v0.2dev`, 2/17/22

Revamped config system and heavily refactored/simplified code
Revamped config system and heavily refactored/simplified code

* Version `0.1dev` Migrated to github, 10/14/2021

This version moved from code.ornl.gov repository to github to facilitate
use as an open-source project.
This version moved from code.ornl.gov repository to github to facilitate
use as an open-source project.

* Version `0.0` Migrated from internal repository, 7/13/2021

Migrated from internal git repository to code.ornl.gov, and generalized
source to be more readily applicable to new problems.
Migrated from internal git repository to code.ornl.gov, and generalized
source to be more readily applicable to new problems.
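
Note on the `with_client` entry above: the comments in `gremlin/gremlin.py` say the section is a string that is `exec()`'d once a Dask client is connected. The following is a minimal sketch of that idea, not Gremlin's actual code. `FakeClient` and `run_with_client` are hypothetical names introduced here so the example runs without Dask; only `wait_for_workers` mirrors a call named in the source.

```python
class FakeClient:
    """Hypothetical stand-in for a Dask client so the sketch runs without Dask."""

    def __init__(self):
        self.calls = []

    def wait_for_workers(self, n_workers):
        # Record the request instead of blocking on real workers.
        self.calls.append(("wait_for_workers", n_workers))


def run_with_client(client, snippet):
    """Execute an optional config snippet with `client` in scope."""
    if snippet is None:
        return  # the config section is optional, so there may be nothing to run
    exec(snippet, {"client": client})


client = FakeClient()
run_with_client(client, "client.wait_for_workers(n_workers=4)")
run_with_client(client, None)  # no `with_client` section: nothing happens
print(client.calls)
```

Because the snippet is executed as code, a config file using this section should be treated as trusted input.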
30 changes: 22 additions & 8 deletions README.md
@@ -11,17 +11,19 @@ perform better for those sets.
![2022 R&D 100 Award Winner](RD100_2022_Winner_Logo-small.png) Gremlin is a [2022 R&D 100 Award Winner!](https://www.rdworldonline.com/rd-100-winners-for-2022-are-announced/)

## Requires
* Python 3.[78]
* [LEAP https://github.com/AureumChaos/LEAP](https://github.com/AureumChaos/LEAP)
* Python >= 3.7.0
* [LEAP https://github.com/AureumChaos/LEAP/tree/develop](https://github.com/AureumChaos/LEAP/tree/develop) -- **Note that this is for the LEAP `develop` branch.**

## Installation

1. Activate your conda or virtual environment
2. cd into top-level gremlin directory
2. `cd` into top-level gremlin directory
3. `pip install .`

## Configuration
Gremlin is essentially a thin convenience wrapper around [LEAP]
Gremlin is a thin convenience wrapper around
[LEAP](https://github.com/AureumChaos/LEAP). Instead of writing a script in LEAP,
one would instead point the `gremlin` executable at a YAML file that describes
what LEAP classes, subclasses, and functions to use, as well as other salient
@@ -42,7 +44,7 @@ async: # parameters for asynchronous steady-state EA
ind_file_probe: probe.log_ind # optional functor or function for writing ind_file

pop_file: pop.csv # where we will write out each generation in CSV format
problem: problem.QLearnerBalanceProblem("${env:GREMLIN_QLEARNER_CARTPOLE_MODEL_FPATH}")
problem: problem.QLearnerBalanceProblem("${oc.env:GREMLIN_QLEARNER_CARTPOLE_MODEL_FPATH}")
representation: representation.BalanceRepresentation()
preamble: |
import probe # need to import our probe.py so that LEAP sees our probe pipeline operator
@@ -86,8 +88,21 @@ This can be run simply by (must be in `examples/MNIST` directory):
$ gremlin config.yml
```

## Documentation
The [wiki](https://github.com/markcoletti/gremlin/wiki) has more detailed
documentation, particularly on how the YAML config files can be set up for
Gremlin runs.

## Versions

Note that more detailed explanations for version changes can be found in the
`CHANGELOG`.

* `v0.6`, in progress on `develop`
* `v0.5`, 2/3/23
* Main installed executable now `gremlin` and not `gremlin.py`. Added
optional `async.with_client` config section. Improvements made to
`setup.py`.
* `v0.4`, 9/30/22
* Added config variable `async.with_client` that allows for interacting
with Dask before the EA runs; e.g., `client.wait_for_workers()` or
@@ -110,6 +125,5 @@ $ gremlin config.yml

## Main web site

The `gremlin` github repository is [https://github.com/markcoletti/gremlin]
(https://github.com/markcoletti/gremlin). `main` is the release branch and
active work occurs on the `develop` branch.
The `gremlin` github repository is https://github.com/markcoletti/gremlin.
`main` is the release branch and active work occurs on the `develop` branch.
4 changes: 4 additions & 0 deletions examples/MNIST/broken.yml
@@ -0,0 +1,4 @@
# A configuration that intentionally breaks a gremlin run to observe how
# it handles errors/exceptions.
preamble: |
import doesnotexist
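
Note on `broken.yml`: the sketch below shows the kind of failure this config is meant to provoke and one way to report it without the run dying silently. It assumes the `preamble` string is executed as Python, which this diff does not confirm, and it uses the standard library's `traceback` in place of the `rich` console used in the commit. `run_preamble` is a hypothetical helper, not part of Gremlin.

```python
import logging
import traceback

logger = logging.getLogger("gremlin_sketch")


def run_preamble(preamble):
    """Run a config preamble; return True on success, False on failure."""
    try:
        exec(preamble, {})
    except Exception as e:
        # Report the error loudly, then let the caller decide how to exit.
        logger.critical(f"Caught {e!s} during run. Exiting.")
        traceback.print_exc()
        return False
    return True


ok_good = run_preamble("import math")
ok_broken = run_preamble("import doesnotexist")  # raises ModuleNotFoundError inside
```

Catching `Exception` leaves `SystemExit` untouched, so an explicit `sys.exit(1)` inside the guarded block still terminates the process.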
13 changes: 0 additions & 13 deletions examples/MNIST/config_debug.yml
@@ -9,19 +9,6 @@
# Usage:
# $ gremlin.py config.yml config_debug.yml
pop_size: 1
algorithm: bygen
bygen: # parameters that only make sense for a by-generation EA
max_generations: 1
k_elites: 0
problem: problem.MNISTProblem()
representation: representation.MNISTRepresentation()
pop_file: pop.csv # where we will write out each generation in CSV format
preamble: |
import probe # need to import our probe.py so that LEAP sees our probe pipeline operator
pipeline:
- ops.tournament_selection
- ops.clone
- mutate_randint(expected_num_mutations=1, bounds=representation.MNISTRepresentation.genome_bounds)
- ops.evaluate
- probe.IndividualProbeCSV('inds.csv') # our own probe to see every single created offspring
- ops.pool(size=${pop_size})
Empty file added gremlin/__init__.py
Empty file.
2 changes: 1 addition & 1 deletion gremlin/__version__.py
@@ -1 +1 @@
__version__ = 'v0.4'
__version__ = 'v0.5'
123 changes: 73 additions & 50 deletions gremlin/gremlin.py
@@ -27,6 +27,7 @@

from omegaconf import OmegaConf

import rich
from rich.logging import RichHandler

# Create unique logger for this namespace
@@ -43,9 +44,13 @@

pretty.install()

from rich.traceback import install
rich.traceback.install(show_locals=True)

from rich.console import Console
console = Console()



install()

from distributed import Client, LocalCluster

@@ -253,7 +258,7 @@ def run_async_ea(pop_size, init_pop_size, max_births, problem, representation,
client.register_worker_plugin(WorkerLoggerPlugin())

final_pop = asynchronous.steady_state(client,
births=max_births,
max_births=max_births,
init_pop_size=init_pop_size,
pop_size=pop_size,

@@ -285,7 +290,7 @@ def run_async_ea(pop_size, init_pop_size, max_births, problem, representation,
client.register_worker_plugin(WorkerLoggerPlugin())

final_pop = asynchronous.steady_state(client,
births=max_births,
max_births=max_births,
init_pop_size=init_pop_size,
pop_size=pop_size,

@@ -302,7 +307,7 @@ def run_async_ea(pop_size, init_pop_size, max_births, problem, representation,
print([str(x) for x in final_pop])


if __name__ == '__main__':
def main():
logger.info('Gremlin started')

parser = argparse.ArgumentParser(
@@ -332,50 +337,68 @@ def run_async_ea(pop_size, init_pop_size, max_births, problem, representation,

pop_size = int(config.pop_size)

if config.algorithm == 'async':
logger.debug('Using async EA')

scheduler_file = None if 'scheduler_file' not in config['async'] else \
config['async'].scheduler_file

ind_file = None if 'ind_file' not in config['async'] else \
config['async'].ind_file

ind_file_probe = None if 'ind_file_probe' not in config['async'] else \
config['async'].ind_file_probe

# This is for optional code to be executed after the Dask client has
# been established, but before execution of the EA. This allows for
# things like client.wait_for_workers() or client.upload_file() or the
# registering of dask plugins. This is a string that will be `exec()`
# later after a dask client has been connected.
with_client_exec_str = None if 'with_client' not in config['async'] else \
config['async'].with_client

run_async_ea(pop_size,
int(config['async'].init_pop_size),
int(config['async'].max_births),
problem, representation, pipeline,
config.pop_file,
ind_file,
ind_file_probe,
scheduler_file,
with_client_exec_str)
elif config.algorithm == 'bygen':
# default to by generation approach
logger.debug('Using by-generation EA')

# Then run leap_ec.generational_ea() with those classes while writing
# the output to CSV and other, ancillary files.
max_generations = int(config.bygen.max_generations)
k_elites = int(config.bygen.k_elites) if 'k_elites' in config else 1

run_generational_ea(pop_size, max_generations, problem, representation,
pipeline,
config.pop_file, k_elites,
with_client_exec_str)
else:
logger.critical(f'Algorithm type {config.algorithm} not supported')
sys.exit(1)
try:
if config.algorithm == 'async':
logger.debug('Using async EA')

scheduler_file = None if 'scheduler_file' not in config['async'] else \
config['async'].scheduler_file

ind_file = None if 'ind_file' not in config['async'] else \
config['async'].ind_file

ind_file_probe = None if 'ind_file_probe' not in config['async'] else \
config['async'].ind_file_probe

# This is for optional code to be executed after the Dask client has
# been established, but before execution of the EA. This allows for
# things like client.wait_for_workers() or client.upload_file() or the
# registering of dask plugins. This is a string that will be `exec()`
# later after a dask client has been connected.
# TODO generalize this to be algorithm agnostic in config file
with_client_exec_str = None if 'with_client' not in config['async'] else \
config['async'].with_client

run_async_ea(pop_size,
int(config['async'].init_pop_size),
int(config['async'].max_births),
problem, representation, pipeline,
config.pop_file,
ind_file,
ind_file_probe,
scheduler_file,
with_client_exec_str)
elif config.algorithm == 'bygen':
# default to by generation approach
logger.debug('Using by-generation EA')

# Then run leap_ec.generational_ea() with those classes while writing
# the output to CSV and other, ancillary files.
max_generations = int(config.bygen.max_generations)
k_elites = int(config.bygen.k_elites) if 'k_elites' in config.bygen else 1

# This is for optional code to be executed after the Dask client has
# been established, but before execution of the EA. This allows for
# things like client.wait_for_workers() or client.upload_file() or the
# registering of dask plugins. This is a string that will be `exec()`
# later after a dask client has been connected.
# TODO LEAP does not (yet) support Dask for by-generation. Soon!
# with_client_exec_str = None if 'with_client' not in config['bygen'] else \
# config['bygen'].with_client

run_generational_ea(pop_size, max_generations, problem, representation,
pipeline,
config.pop_file, k_elites)
else:
logger.critical(f'Algorithm type {config.algorithm} not supported')
sys.exit(1)
except Exception as e:
logger.critical(f'Caught {e!s} during run. Exiting.')
console.print_exception()

logger.info('Gremlin finished.')



if __name__ == '__main__':
main()
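
Note on the optional lookups in `main()`: each optional setting follows the same "use the value if the key is present, else a default" pattern, and the membership test has to be made on the section that actually holds the key. Since `k_elites` lives under `bygen`, a test against the top-level config always falls back to the default. The sketch below uses a plain `dict` in place of the OmegaConf config object; `optional` is a hypothetical helper.

```python
def optional(section, key, default=None):
    """Return section[key] if the key is present, else the default."""
    return section[key] if key in section else default


# Plain dict standing in for the loaded YAML config (an assumption of this sketch).
config = {
    "algorithm": "bygen",
    "pop_size": 1,
    "bygen": {"max_generations": 1, "k_elites": 0},
}

# Testing the top-level mapping misses the nested key, so the default wins.
top_level = int(config["bygen"]["k_elites"]) if "k_elites" in config else 1

# Testing the section that holds the key picks up the configured value.
nested = int(optional(config["bygen"], "k_elites", 1))

# A missing optional key in a missing section simply yields None.
scheduler_file = optional(config.get("async", {}), "scheduler_file")

print(top_level, nested, scheduler_file)
```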
20 changes: 13 additions & 7 deletions setup.py
@@ -11,17 +11,23 @@
version=__version__,
packages=['gremlin'],
scripts=['gremlin/gremlin.py'],
# entry_points={
# 'console_scripts': [
# 'gremlin = gremlin.gremlin:client'
# ],
# },
python_requires=">=3.7.0",
url='https://github.com/markcoletti/gremlin',
license='MIT License',
author='Mark Coletti',
author_email='colettima@ornl.gov',
long_description=long_description,
long_description_content_type='text/markdown',
description=('Adversarial evolutionary algorithm for'
'training data optimization')
description=('Adversarial evolutionary algorithm for training data '
'optimization'),
entry_points={
"console_scripts": [
"gremlin = gremlin:main"
]
},
install_requires=[
'leap-ec',
'omegaconf',
'tqdm',
'rich']
)
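
Note on `entry_points`: a `console_scripts` entry has the form `name = module:attribute`, and the installer generates a launcher called `name` that imports the module and calls the attribute. The sketch below is a simplified illustration of that resolution, not the packaging tools' actual implementation. `resolve_entry_point` is a hypothetical helper, and the standard library's `json:dumps` stands in for the real target so the example runs anywhere.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = module:attribute' and return (name, resolved object)."""
    name, target = (part.strip() for part in spec.split("=", 1))
    module_path, _, attr = target.partition(":")
    module = importlib.import_module(module_path)
    return name, getattr(module, attr)


# Stand-in entry: the command name is `to_json`, the callable is json.dumps.
name, func = resolve_entry_point("to_json = json:dumps")
print(name, func([1]))
```

Which file `module` resolves to depends on the import path at launch time, so the module part of the entry should name the module that defines the function.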
