Multiproc call signature and Pool parallelism #33

Merged: 14 commits merged into main on Feb 20, 2022


MDCHAMP
Owner

@MDCHAMP MDCHAMP commented Jan 22, 2022

PR for #32

@codecov-commenter

codecov-commenter commented Jan 22, 2022

Codecov Report

Merging #33 (69f5878) into main (21da558) will increase coverage by 0.02%.
The diff coverage is 98.27%.


@@            Coverage Diff             @@
##             main      #33      +/-   ##
==========================================
+ Coverage   98.55%   98.57%   +0.02%     
==========================================
  Files          17       17              
  Lines        1105     1121      +16     
==========================================
+ Hits         1089     1105      +16     
  Misses         16       16              
| Impacted Files | Coverage Δ |
|---|---|
| tests/test_benchmarking.py | 71.42% <0.00%> (ø) |
| src/freelunch/base.py | 100.00% <100.00%> (ø) |
| src/freelunch/util.py | 96.66% <100.00%> (ø) |
| tests/test_base.py | 100.00% <100.00%> (ø) |
| tests/test_optimisers.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@MDCHAMP
Owner Author

MDCHAMP commented Jan 22, 2022

This is a naive approach: it just takes the raw objective function and does not count function evaluations (nfes)...

The trouble is that wrapping the objective function is a nice way to handle bad obj scores and count nfes, but the wrapped objs then cannot be pickled for IPC.
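To make the pickling problem concrete, here is a minimal sketch (not freelunch code; `wrap_obj` and `can_pickle` are hypothetical names made up for illustration). pickle serialises functions by qualified name, so a function defined inside another function cannot be looked up again in the worker process, whereas a `functools.partial` over an importable top-level function can.

```python
import operator
import pickle
from functools import partial

Y_TGT = 10  # illustrative target value


def wrap_obj(hyper):
    # Hypothetical closure-style wrapper: the inner function is a local
    # object, so pickle cannot find it again by qualified name.
    def obj(x):
        return hyper - x
    return obj


def can_pickle(fn):
    # Hypothetical helper: True if fn survives pickle.dumps.
    try:
        pickle.dumps(fn)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


closure_obj = wrap_obj(Y_TGT)
# operator.sub stands in for any importable top-level function.
partial_obj = partial(operator.sub, Y_TGT)

# The partial round-trips through pickle and still computes Y_TGT - x.
restored = pickle.loads(pickle.dumps(partial_obj))
```

Here `can_pickle(closure_obj)` is False and `can_pickle(partial_obj)` is True. A user's own top-level `def` should behave like `operator.sub`, provided the worker processes can import the module that defines it.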

There are ways around this:
- Modify the return signature of optimiser.run to include nfe counts
- Modify the obj return checking to be done functionally by apply_obj()

But these will require a medium-sized amount of work (changes to every optimiser.run method, plus tech and base at a minimum...).
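A rough sketch of what those two workarounds might look like together. This is not the freelunch implementation: the `apply_obj` signature, the `bad_score` parameter and the toy `run` loop are all assumptions made for illustration. The point is that nothing wraps the objective, so it stays picklable, and the evaluation count comes back through the return value.

```python
import math


def apply_obj(obj, x, bad_score=math.inf):
    # Assumed signature: evaluate obj(x) and replace a None or NaN
    # result with bad_score, without wrapping obj itself.
    score = obj(x)
    if score is None or (isinstance(score, float) and math.isnan(score)):
        return bad_score
    return score


def run(obj, candidates):
    # Toy stand-in for an optimiser.run method: returns (best_score, nfe)
    # so the caller can collect evaluation counts from each worker.
    nfe = 0
    best = math.inf
    for x in candidates:
        best = min(best, apply_obj(obj, x))
        nfe += 1
    return best, nfe
```

Returning the count matters under multiprocessing because each worker operates on its own copy of the optimiser, so a counter incremented in a worker is not visible in the parent process.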

The other major issue is that the current API more or less forces end users to wrap their objs before passing them to freelunch if they want to write good code.

I guess some sort of freelunch-specific warning when a wrapped function is provided to the multiproc API could at least let users know that they need to provide a pickle-able obj?
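One possible shape for such a warning, as a sketch only: `warn_if_not_picklable` is a hypothetical helper rather than an existing freelunch function, and the message text is made up. It tries to pickle the objective up front and warns instead of letting the pool fail later.

```python
import pickle
import warnings


def warn_if_not_picklable(obj):
    # Hypothetical pre-flight check: returns True if obj can be pickled,
    # otherwise emits a warning and returns False.
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError) as err:
        warnings.warn(
            "Objective function cannot be pickled for multiprocessing "
            f"({err}). Consider a top-level function combined with "
            "functools.partial instead of a closure or lambda.",
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return True
```

Calling this before any work is dispatched to a pool would surface the problem early, with a message that points at the fix rather than at pickle internals.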

All this is a ballache... unless I'm fundamentally misunderstanding something about multiprocessing/pickle?

@MDCHAMP
Owner Author

MDCHAMP commented Jan 23, 2022

Ok, MWE here using functools.partial that solves the issue, with only a small onus on the user to write different code...

I'll work tomorrow on implementing this in the PR. It will definitely need some documentation / example code. It might even be worth catching the inevitable AttributeError that gets thrown when the pickler can't handle the wrapped functions and including a link to the docs.

import numpy as np
from functools import partial
from multiprocessing import Pool

# MODULE SIDE CODE

def obj_wrapper(obj, opt_inst, x):
    opt_inst.nfe += 1
    print(opt_inst.nfe)
    ret = obj(x)
    if ret > 9.5:
        ret = 0
    else:
        ret = 1
    return ret

class optimiser:

    def __init__(self, obj):
        self.nfe = 0
        self.obj = partial(obj_wrapper, obj, self)
        

    def run(self):
        return [self.obj(np.random.uniform(0,1)) for _ in range(3)], self.nfe

    def __call__(self, runs):
        # The context manager cleans up the worker pool once starmap returns
        with Pool(3) as pool:
            return pool.starmap(self.run, [() for _ in range(runs)])



# USER SIDE CODE

y_tgt = 10 # some prediction / hyperparameter needed for objective function logic

def my_predict(x): # some functional logic needed for objective function logic
    return x

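# Obvious pythonic way to do it... doesn't work, because the returned
# closure is a local function and cannot be pickled

def wrap_my_obj(predictor, hyper):
    def obj(x):
        y_hat = predictor(x)
        return hyper - y_hat  # use the argument, not the global y_tgt
    return obj

not_mp_safe_obj = wrap_my_obj(my_predict, y_tgt)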

# Onus is on the user to prepare functions like this for multiproc...

def my_obj(predictor, hyper, x): # objective function with too many / invalid args for freelunch
    y_hat = predictor(x)
    return hyper - y_hat  # use the argument, not the global y_tgt

mp_safe_obj = partial(my_obj, my_predict, y_tgt) # get a multiproc safe obj without wrapping



# RUNTIME
if __name__ == '__main__':

    opt = optimiser(mp_safe_obj)
    for a in opt(3):
        print(a)


# Example output from the final print loop (random, so values vary between runs):
# ([0, 1, 1], 3)
# ([0, 1, 1], 3)
# ([1, 1, 0], 3)

...python man

@MDCHAMP MDCHAMP marked this pull request as ready for review January 23, 2022 01:46
@MDCHAMP
Owner Author

MDCHAMP commented Jan 23, 2022

Functionality is there at last.

Just need to add some documentation and something to the readme to cover the new functionality and we are good to go!

@MDCHAMP MDCHAMP merged commit 7e10418 into main Feb 20, 2022