Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add neldermead #34

Merged
merged 39 commits into from May 28, 2020
Merged

Add neldermead #34

merged 39 commits into from May 28, 2020

Conversation

jangerit
Copy link
Contributor

@jangerit jangerit commented May 19, 2020

I added the Nelder-Mead Simplex algorithm to strategies (#18). Therefore, I've adapted the Nelder-Mead code from the scipy-optmize package to make it usable for sequential requests.

Also, the modifications proposed by Cortes-Borda et al. in terms of reflecting points, dimension reduction and dimension recovery are implemented. These modifications increase the chance of finding optimal conditions that are close to the predefined bounds of the variables.

Constraints are treated within the suggest_experiments method of the Nelder-Mead strategy. Suggested points that violate the constraints are penalized with a function value of ininity (for minimization problems) and the algorithm is called again. Note that if more than one point is suggested by the algorithm, only the points that meet the constraints are returned. Points that violate the constraints are then stored temporarily and read in (again with an associated infinity value) the next time the algorithm is called.

Note that the number of suggested points can vary for different iterations of the algorithm and one iteration requires at least 2 squential function evalutions, hence one call of the Nelder-Mead algorithm does not correspond to one iteration.

One 2D (Himmelblau function) and one 3D (Hartmann function) application of the Nelder-Mead strategy are implemented in tests/test_strategies.py. The results are plotted, whereby each point corresponds to a suggested point that meets the constraints and connected points indicate the resulting simplex of one iteration.

@jangerit
Copy link
Contributor Author

jangerit commented May 19, 2020

The pytest check raises an error for the pandas.eval() method. When I run pytest locally it runs without an error.

summit/strategies/neldermead.py:109: in suggest_experiments
    next_experiments, xbest, fbest, param = self.inner_suggest_experiments(prev_res=prev_res, prev_param=inner_prev_param)
summit/strategies/neldermead.py:368: in inner_suggest_experiments
    mask_valid_next_experiments = self.check_constraints(next_experiments)
summit/strategies/neldermead.py:656: in check_constraints
    tmp_r = pd.eval(tmp_c, resolvers=[tmp_next_experiments])
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/eval.py:293: in eval
    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env,
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:749: in __init__
    self.terms = self.parse()
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:766: in parse
    return self._visitor.visit(self.expr)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: in visit
    return visitor(node, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:337: in visit_Module
    return self.visit(expr, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: in visit
    return visitor(node, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:340: in visit_Expr
    return self.visit(node.value, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: in visit
    return visitor(node, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:666: in visit_Compare
    return self.visit(binop)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: in visit
    return visitor(node, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:435: in visit_BinOp
    op, op_class, left, right = self._maybe_transform_eq_ne(node)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:372: in _maybe_transform_eq_ne
    left = self.visit(node.left, side='left')
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: in visit
    return visitor(node, **kwargs)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:435: in visit_BinOp
    op, op_class, left, right = self._maybe_transform_eq_ne(node)
../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:374: in _maybe_transform_eq_ne
    right = self.visit(node.right, side='right')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pandas.core.computation.expr.PandasExprVisitor object at 0x7f716320e0d0>
node = <_ast.Constant object at 0x7f716320e250>, kwargs = {'side': 'right'}
method = 'visit_Constant'
visitor = <bound method NodeVisitor.visit_Constant of <pandas.core.computation.expr.PandasExprVisitor object at 0x7f716320e0d0>>

    def visit(self, node, **kwargs):
        if isinstance(node, string_types):
            clean = self.preparser(node)
            try:
                node = ast.fix_missing_locations(ast.parse(clean))
            except SyntaxError as e:
                from keyword import iskeyword
                if any(iskeyword(x) for x in clean.split()):
                    e.msg = ("Python keyword not valid identifier"
                             " in numexpr query")
                raise e
    
        method = 'visit_' + node.__class__.__name__
        visitor = getattr(self, method)
>       return visitor(node, **kwargs)
E       TypeError: visit_Constant() got an unexpected keyword argument 'side'

../home/.cache/pypoetry/virtualenvs/summit-cDQ_M3T8-py3.8/lib/python3.8/site-packages/pandas/core/computation/expr.py:331: TypeError

I tried several modifications of the code (<tmp_next_experiments> is a DataSet object), but it seems more like a compatibility problem (see issue below).

if len(self.domain.constraints)>0:
constr = [c.constraint_type + "0" for c in self.domain.constraints]
constr_mask = [pd.eval(c.lhs + constr[i], resolvers=[tmp_next_experiments]) for i, c in enumerate(self.domain.constraints)]
constr_mask = [tmp_next_experiments.eval(c.lhs + constr[i], resolvers=[tmp_next_experiments]) for i, c in enumerate(self.domain.constraints)]
     constr_mask = []
     for i, c in enumerate(self.domain.constraints):
            tmp_c = c.lhs + constr[i]
            tmp_r = pd.eval(tmp_c, resolvers=[tmp_next_experiments])
            constr_mask.append(tmp_r)

I've found a similar issue here and it seems like there is a problem with python 3.8. However, updating the pandas version to 0.25.3 did not help. On my local machine, the code runs with python 3.7 and 3.8. Do you have any idea what could cause this?

@jangerit jangerit requested a review from marcosfelt May 19, 2020 07:51
@marcosfelt
Copy link
Member

marcosfelt commented May 19, 2020 via email

@jangerit
Copy link
Contributor Author

jangerit commented May 19, 2020

That’s really strange. Maybe try changing the python version to 3.7 on .github/workflows/ci.yml? Just to see if the issue is the python version used on the CI system.

Ok, I'll try this. Thanks!

Also, did you commit the updated poetry lock file? Poetry should install from the lock file when it runs on the CI system.

Yes, the pytest log displayed that the updated version of pandas 0.25.3 was installed.

@jangerit
Copy link
Contributor Author

jangerit commented May 19, 2020

Changing python to version 3.7.7 worked.

I've then adjusted the code and tried again changing back to python 3.8.2 with pandas 0.25.3, it works now. However, it seems like there's been a change from pandas 0.24.2 to 0.25.3, as now the DataSets are printed like this:

pandas 0.25.3:

    NAME temperature flowrate_a             strategy
    TYPE        DATA       DATA             METADATA
    0          0.500      0.500  Nelder-Mead Simplex
    1          0.625      0.500  Nelder-Mead Simplex
    2          0.500      0.625  Nelder-Mead Simplex

compared to pandas 0.24.2:

    NAME temperature flowrate_a             strategy
    0          0.500      0.500  Nelder-Mead Simplex
    1          0.625      0.500  Nelder-Mead Simplex
    2          0.500      0.625  Nelder-Mead Simplex

Maybe we need to make some changes in dataset.py, but the algorithms all work now.

@marcosfelt
Copy link
Member

marcosfelt commented May 19, 2020 via email

Copy link
Member

@marcosfelt marcosfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it was a massive effort. Good job! I never realized the Nelder-Mead Simplex has so much logic.

My main comment was factoring out the test functions into a benchmark to make the tests simpler.

Comment on lines 319 to 340
xlist = np.linspace(-7, 7, 1000)
ylist = np.linspace(-7, 7, 1000)
X, Y = np.meshgrid(xlist, ylist)
Z = (((X**2 + Y - 11)**2+(X + Y**2 -7)**2))
ax.contour(X,Y,Z, levels=np.logspace(-2, 3, 30, base=10), alpha=0.3)
p = PatchCollection(patches, facecolors="None", edgecolors='grey', alpha=1)
ax.add_collection(p)
for c in range(len(points)):
ax.scatter(points[c][0], points[c][1])
ax.text(points[c][0] + .01, points[c][1] + .01, c+1, fontsize=7)
ax.axvline(x=-4, color='k', linestyle='--')
ax.axhline(y=6, color='k', linestyle='--')
ax.axvline(x=4, color='k', linestyle='--')
ax.axhline(y=-6, color='k', linestyle='--')
if constraint:
x = np.linspace(-4, 4, 400)
y = np.linspace(-6, 6, 400)
x, y = np.meshgrid(x, y)
ax.contour(x, y, (x*y+10), [0], colors='grey',linestyles='dashed')
ax.contour(x, y, (x+y+7), [0], colors='grey', linestyles='dashed')
plt.show()
plt.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not to have plotting code in the unit tests. Maybe it would be better to break the simulation function and plotting code out into a benchmark and then run the tests against that? I think the plots are really nice and would be great to have in general for this benchmark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, thanks for the suggestion! I've outsourced the test functions for the optimization algorithms (Himmelblau and Hartmann3D) to an extra file test_functions.py which is part of the benchmark class. Thus, the test functions evaluations are treated similar to the chemical simulations now.

Also, I've added a function plot() to both test functions. The points that are evaluated with run_experiments() are automatically stored and passed to the plot function if called. So, test_strategies.py does not have plotting code anymore. Only the simplex of the current iteration needs to be stored, if the Nelder-Mead algorithm is used, so that later the simplexes can be plotted.

# Get previous results
if prev_res is not None:
initial_run = False
inputs, outputs = self.transform.transform_inputs_outputs(prev_res)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this in because you need to reference the transforms in this way from now on. I'm not sure where to add theun_transform method, so it would be great if you could do that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added the un_transform in neldermead.py. I'm not sure, though, if there is a case where this is used, as neldermead does not return any function evaluations.

@jangerit
Copy link
Contributor Author

jangerit commented May 26, 2020

Updated this PR with following changes:

  • remove test functions from test_strategies.py and create test_functions.py that includes the test functions within benchmark class
  • remove plotting code from test_strategies.py and include it to respective class in test_functions.py by calling a function plot()
  • add un_transform to neldermead.py

Concerning this, I'll adjust the snobfit algorithm and its test in test_strategies.py in a separate PR, as the scikit-quant team has updated the SQSnobFit package to solve the issues I reported. I think, we can now change back from the forked repo to the original SQSnobFit code.

@marcosfelt
Copy link
Member

We're almost there. I removed the error catching on the tests because we definitely shouldn't do that. That exposed the following problem: Stacking pytest.mark.parametrize wrappers causes every combination to be run (see here). However, in the test_nm3D function, I think you want to specify which x_start values should not come with a constraint (e.g., [1,1,0.2]). The way to do that is to combine it all into one call pytest.mark.parametrize using tuples.

I also made a change to the way the loop is written to remove some the branched if statements. That should make the code easier to read.

@jangerit
Copy link
Contributor Author

Perfect, thanks!

The last commit includes:

  • adaption of the test_nm3D function to be parametrized with tuples;
  • change of the test_nm2D similar to your changes for test_nm3D;
  • fix loop in neldermead (stop inner loop after 5 consecutive trials w/o finding a new point and return error)

@marcosfelt marcosfelt merged commit c83856e into master May 28, 2020
@marcosfelt marcosfelt linked an issue May 28, 2020 that may be closed by this pull request
@marcosfelt marcosfelt deleted the add_neldermead branch July 10, 2020 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Nelder-Mead Simplex strategy
2 participants