# Lecture 15, Further topics and current research in optimization

## Black box surrogate-based global optimization

In the current course, (almost) all the models have been based on algebraic equations. 

However, in many cases, you do not have algebraic equations describing the problem. Instead, you have software or a piece of code that can calculate the values for you.

In many cases like this, you need to treat the model as a *"black box"*, which means that you only know what goes in and what comes out.

To solve the problem, you have a (limited) budget of expensive function evaluations to be used. Thus, your method has to be intelligent in choosing which solutions to evaluate and which not.

In addition, these models may be highly nonconvex and, thus, you are going to have to use *global optimization methods*.

The methods described in this course are so-called local optimization methods. Local optimization methods are highly efficient at finding a local minimum of a problem, but they cannot guarantee that it is a global optimum.

Global optimization methods need to have some strategy for searching as much as possible of the search space.

In global optimization, there is the so-called **exploration vs. exploitation** trade-off. Exploitation means that the method acts essentially as a local optimization method and finds the nearest local optimum, whereas exploration means that the method uses some strategy to try to find other local optima.
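
The trade-off can be illustrated with a minimal sketch, assuming a simple multistart strategy: gradient descent plays the role of exploitation and random restarts play the role of exploration. The test function, step size, and number of restarts are illustrative assumptions, not part of the course material.

```python
import random

def f(x):
    """Nonconvex test function with two local minima (illustrative choice)."""
    return x**4 - 3 * x**2 + x

def df(x):
    """Derivative of f."""
    return 4 * x**3 - 6 * x + 1

def local_search(x0, step=0.01, iters=2000):
    """Exploitation: plain gradient descent towards the nearest local minimum."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x

def multistart(n_starts, lower=-2.0, upper=2.0, seed=0):
    """Exploration: restart the local search from random points, keep the best."""
    rng = random.Random(seed)
    candidates = [local_search(rng.uniform(lower, upper)) for _ in range(n_starts)]
    return min(candidates, key=f)

x_local = local_search(1.5)   # gets stuck in the local minimum near x = 1.13
x_best = multistart(20)       # restarts also reach the better minimum near x = -1.30
print(x_local, f(x_local))
print(x_best, f(x_best))
```

Note that every restart costs additional function evaluations, which is exactly what becomes problematic when evaluations are expensive.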

So-called soft-computing methods are very popular, although others also exist.

Finally, these black box models are often *computationally expensive* w.r.t. time and/or money. For example, evaluating objective functions requires 
* doing some lab experiments that cost a lot and also take a lot of time or
* a numerical simulation based on e.g. solving partial differential equations.

Therefore, evaluation times for the objectives can range from minutes to days. In practice, this means that only a very limited number of function evaluations (tens, or 100-200) can be performed, and special attention has to be paid to the values of the decision variables at which these evaluations are done. 

One approach is to use a so-called surrogate to save function calls to the black box model. A *surrogate* is a function that approximates the original objective (or constraint) function but is fast to evaluate. A set of sample points evaluated with the original functions is needed to train the surrogate. This is illustrated below. The approach is called **surrogate-assisted optimization**.

![alt text](images/surrogate.png "Surrogate")

<i>Figure by Mohammad Tabatabaei</i>

In practice, this means that one needs a clever way of deciding
1. whether to evaluate a solution with the black-box model or the surrogate model, and
2. when to update the surrogate with solutions evaluated using the black-box model.

Commonly used surrogates are neural networks, radial basis functions and Kriging models/Gaussian processes. When Bayesian models such as Gaussian processes are used, the term **Bayesian optimization** is often used.
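
As a rough sketch of the idea (not the method of any of the references below), the following loop uses a Gaussian radial basis function surrogate in one dimension. The function `expensive_black_box`, the budget of 15 evaluations, the kernel width, and the distance-based exploration bonus are all assumptions made for illustration.

```python
import numpy as np

def expensive_black_box(x):
    """Stand-in for a costly simulation or experiment (hypothetical)."""
    return np.sin(3.0 * x) + 0.5 * x**2

def rbf_kernel(a, b, eps=2.0):
    """Gaussian RBF kernel matrix between two 1-D point sets."""
    return np.exp(-(eps * (a[:, None] - b[None, :])) ** 2)

def fit_surrogate(xs, ys):
    """Fit RBF weights to the evaluated samples (small ridge term for stability)."""
    weights = np.linalg.solve(rbf_kernel(xs, xs) + 1e-8 * np.eye(len(xs)), ys)
    return lambda x_new: rbf_kernel(x_new, xs) @ weights

budget = 15                              # total number of expensive evaluations
xs = np.linspace(-2.0, 2.0, 5)           # initial design
ys = expensive_black_box(xs)
grid = np.linspace(-2.0, 2.0, 401)       # cheap candidates, scored by the surrogate

while len(xs) < budget:
    surrogate = fit_surrogate(xs, ys)    # update the surrogate with all data so far
    dist = np.min(np.abs(grid[:, None] - xs[None, :]), axis=1)
    score = surrogate(grid) - 0.5 * dist # low prediction (exploit) + far from data (explore)
    score[dist < 0.05] = np.inf          # do not re-evaluate near existing samples
    x_next = grid[np.argmin(score)]
    xs = np.append(xs, x_next)
    ys = np.append(ys, expensive_black_box(x_next))  # one expensive call per iteration

best_x, best_y = xs[np.argmin(ys)], ys.min()
print(best_x, best_y)
```

Here the surrogate is evaluated on hundreds of candidate points per iteration, while the black box is called only once per iteration. Bayesian optimization follows the same pattern but replaces the ad hoc score with an acquisition function based on the model's predictive uncertainty.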

More information on surrogate-assisted optimization can be found e.g. in
* <a href="https://www.sciencedirect.com/science/article/abs/pii/S2210650211000198">Y. Jin, **Surrogate-assisted evolutionary computation: Recent advances and future challenges**, *Swarm and Evolutionary Computation*, 1 (2), 61-70, 2011</a>
* <a href="https://link.springer.com/article/10.1007/s00158-015-1226-z">M. Tabatabaei et al., **A survey on handling computationally expensive multiobjective optimization problems using surrogates: non-nature inspired methods**, *Structural and Multidisciplinary Optimization*, 52 (1), 1-25, 2015</a>
* <a href="https://link.springer.com/chapter/10.1007/978-3-030-18764-4_10"> Stork J. et al., **Open Issues in Surrogate-Assisted Optimization**, In: Bartz-Beielstein T., Filipič B., Korošec P., Talbi EG. (eds) *High-Performance Simulation-Based Optimization*, Springer, 2020</a>

Recent PhD theses related to using surrogates in optimization: 
* Tomi Haanpää, https://jyx.jyu.fi/handle/123456789/40501
* Mohammad Tabatabaei, https://jyx.jyu.fi/handle/123456789/52165
* Tinkle Chugh, https://jyx.jyu.fi/handle/123456789/54314


## Connecting "Big Data" and optimization
### Also called prescriptive analytics

Sometimes, the model of the problem is not based on an algebraic model, nor a computer program, but instead you have (e.g., measured) data about the phenomena concerning the problem. 

**This raises completely new kinds of problems.**

When dealing with "Big Data", you have to deal with the four Vs:
* volume:
  * the data is actually big and you need to have specific tools for accessing it,
  * in addition, one needs to figure out which data is relevant,
* variety:
  * the data is in completely different formats and you may have to deal with all of them (e.g., video, spreadsheets, natural language),
* velocity:
  * the data is constantly changing and more data is being gathered,
* veracity:
  * the data is bad and untrustworthy,
  * there is a lot of missing data.

## An example of prescriptive analytics 
(by Jean Francois Puget, IBM, 2014)

* You are in a yacht race: What would you use to maneuver your ship to reach your destination as fast as possible?
* Naturally, the speed of your boat depends on the wind strength and direction

* The first thing to use is weather reports (**descriptive analytics**) that tell you the current (and past) wind direction and speed. Based on that, one can adjust the direction of the boat to move as fast as possible towards the goal.

* When you start sailing, the direction and the speed of wind will change -> you have to change your course
* Now, weather forecasts (**predictive analytics**) can be used to predict how the wind changes in the (near) future

* Your destination is several days/weeks ahead, how to use weather predictions? 
* Your route can be optimized based on a given weather forecast (**prescriptive analytics**) and this will give you the fastest route to your destination
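
The last step can be sketched as a shortest-path problem, assuming the forecast has already been turned into an expected boat speed for each leg between waypoints. The waypoints, distances, and speeds below are made-up illustrative numbers, and Dijkstra's algorithm is one possible choice of method, not the one used in the original example.

```python
import heapq

# Hypothetical legs: (from, to) -> (distance in nautical miles, forecast boat speed in knots)
legs = {
    ("start", "north"): (60.0, 12.0),
    ("north", "goal"): (60.0, 10.0),
    ("start", "south"): (50.0, 5.0),
    ("south", "goal"): (50.0, 4.0),
    ("north", "south"): (30.0, 10.0),
}

def fastest_route(legs, source, target):
    """Dijkstra's algorithm with travel time (distance / speed) as the leg cost."""
    graph = {}
    for (a, b), (dist, speed) in legs.items():
        graph.setdefault(a, []).append((b, dist / speed))
    queue = [(0.0, source, [source])]
    done = set()
    while queue:
        time, node, path = heapq.heappop(queue)
        if node == target:
            return time, path
        if node in done:
            continue
        done.add(node)
        for nxt, hours in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (time + hours, nxt, path + [nxt]))
    return float("inf"), []

hours, route = fastest_route(legs, "start", "goal")
print(route, hours)  # ['start', 'north', 'goal'] 11.0
```

With these numbers, the geographically longer northern route is faster because the forecast promises better wind there. When a new forecast arrives, the speeds are updated and the route is optimized again.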


E.g., a paper about combining local optimization and big data 

 * <a href="http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6879615&tag=1">V. Cevher et al., **Convex Optimization for Big Data: Scalable, randomized, and parallel algorithms for big data analytics**, *IEEE Signal Processing Magazine*, 31 (5), 32-43, 2014</a>

A more recent paper dealing with evolutionary approaches and data: 

 * <a href="https://ieeexplore.ieee.org/document/8456559">Y. Jin et al., **Data-Driven Evolutionary Optimization: An Overview and Case Studies**, *IEEE Transactions on Evolutionary Computation*, 23 (3), 442-458, 2019 </a>

Also, in this case, one often needs machine learning techniques to first make sense of the data and then to optimize based on the information gathered. 

An example of this can be found in

 * <a href="https://link.springer.com/chapter/10.1007/978-3-030-13709-0_9"> J. Hakanen et al., **Data-Driven Interactive Multiobjective Optimization Using a Cluster-Based Surrogate in a Discrete Decision Space**, In: Nicosia G., Pardalos P., Giuffrida G., Umeton R., Sciacca V. (eds) *Machine Learning, Optimization, and Data Science (LOD 2018)*, Springer, 104-115, 2019</a>
    
which was published in a fairly recent conference series dedicated to combining machine learning and optimization: https://lod2019.icas.xyz/


Last year, there was a course 

***TIES583 Advanced Course in Optimization (5 credits)*** 

that dealt with optimization problems generated from data and how to solve them. 

This will most probably be lectured again in *spring 2022* since it is alternating with **TIES598 Nonlinear multiobjective optimization**.

This year in the JYU Summer School, there will be a course 

***COM3: Multicriteria Design Optimization in the Age of Data Science - Fundamentals and Case Studies*** 

that deals with data and optimization. The lecturer will be <a href="www.universiteitleiden.nl/en/staffmembers/michael-emmerich#tab-1">Prof. Michael Emmerich (Leiden University, The Netherlands)</a>.

More information at www.jyu.fi/en/research/summer-and-winter-schools/jss/courses/courses-in-computational-sciences.

## Multiobjective optimization and decision support systems

**The whole point of optimization is to support decision making!**

However,
* most decision problems have multiple conflicting objectives, and
* human beings are not rational decision makers.

The first item requires methods that can deal with multiple objectives.

There are still a lot of unresolved questions in how the decision makers interact with optimization and, also, in just how to compute Pareto optimal solutions for complicated problems.
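
For a finite set of already evaluated solutions, the Pareto optimal (non-dominated) ones can be filtered out directly. The following is a minimal sketch, assuming that all objectives are minimized. The candidate objective vectors are made-up numbers.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors (f1, f2), both to be minimized
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (6.0, 5.0), (7.0, 2.0)]
front = pareto_front(candidates)
print(front)  # [(1.0, 9.0), (2.0, 7.0), (4.0, 4.0), (7.0, 2.0)]
```

The filter only compares solutions that have already been computed. Finding such solutions for complicated problems, and helping a decision maker choose among them, are the hard parts.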

In addition, a user interface is a very important piece of a decision support system and should be taken into account in any implementation of multiobjective optimization methods. It enables interaction with a decision maker (DM), both when the DM analyses the existing solutions and when the DM provides new preferences. The following aspects are emphasized in the user interface:
* visualization techniques for solutions having a high number of objectives
* linked visualizations that enable simultaneous analysis by using different types of visualizations
* responsive and live interface

This is closely related to <a href="https://en.wikipedia.org/wiki/Visual_analytics">Visual Analytics</a>. The following paper describes visual analytics in more detail: 

<a href="https://link.springer.com/chapter/10.1007/978-3-540-70956-5_7"> D. Keim et al., **Visual Analytics: Definition, Process, and Challenges**, In: A. Kerren et al. (eds) *Information Visualization*, 154-175, Springer, 2008</a>

## A screenshot of a user interface

![alt text](images/Teaser_With_Orange.png "User interface")

The second item (*human beings are not rational decision makers*) needs a completely separate type of research.

In fact, it has been shown that most of the decision making that humans do is dictated by feelings.

Thus, one needs to take into account human beings as complete beings.

**This is studied in <a href="https://en.wikipedia.org/wiki/Behavioral_operations_management">behavioural operations research</a>**

## Further reading
* Multiobjective optimization e.g., in a recent paper by Kaisa Miettinen and others

  * <a href="http://dx.doi.org/10.1007/s11573-015-0786-0">K. Miettinen and F. Ruiz, **NAUTILUS framework: towards trade-off-free interaction in multiobjective optimization**, *Journal of Business Economics*, 86, 5–21, 2016</a>

* Behavioral aspects have been studied e.g., in a paper

  * <a href="http://www.sciencedirect.com/science/article/pii/S0167487015001427">N. Ravaja et al., **Emotional–motivational responses predicting choices: The role of asymmetrical frontal cortical activity**, *Journal of Economic Psychology*, 52, 56-70, 2016</a>

Good books on decision making (written for laymen):
* Daniel Kahneman: *Thinking, Fast and Slow*
www.nytimes.com/2011/11/27/books/review/thinking-fast-and-slow-by-daniel-kahneman-book-review.html
* Dan Ariely: *Predictably Irrational*
http://danariely.com/books/predictably-irrational/


Two articles on machine decision makers that try to mimic human DMs:
* <a href="https://link.springer.com/chapter/10.1007/978-3-319-15892-1_20">M. López-Ibáñez and J. Knowles, **Machine Decision Makers as a Laboratory for interactive EMO**, In: A. Gaspar-Cunha et al. (eds) *Evolutionary Multi-Criterion Optimization (EMO 2015)*, 295-309, Springer, 2015</a>
* <a href="https://link.springer.com/chapter/10.1007/978-3-319-45823-6_45"> V. Ojalehto et al., **Towards Automatic Testing of Reference Point Based Interactive Methods**, In: J. Handl et al. (eds) *Parallel Problem Solving from Nature (PPSN 2016)*, 483-492, Springer, 2016</a>

## Dealing with risk

**Almost all real-life decisions include risk!**

How to deal with this risk is an active research topic in optimization.

Basically, there are two competing underlying approaches:
1. scenario-based approaches, where the possible states involving the decision problem are modelled as different scenarios and
2. probabilistic (and similar, e.g., fuzzy) approaches, where the possible states are modelled using a distribution (or similar).

There are also different risk measures that can be taken into account.
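
As a minimal sketch of the scenario-based approach, the example below compares two hypothetical plans under three scenarios using three risk measures: expected cost, worst-case cost, and the average cost in the worst 20 % of outcomes (a conditional value-at-risk type measure). The plans, probabilities, costs, and the choice of measures are illustrative assumptions.

```python
# Hypothetical scenario probabilities and the cost of each plan in each scenario
scenario_probs = [0.5, 0.3, 0.2]
costs = {
    "plan_a": [10.0, 12.0, 30.0],
    "plan_b": [14.0, 15.0, 16.0],
}

def expected_cost(c, probs):
    """Probability-weighted average cost."""
    return sum(ci * pi for ci, pi in zip(c, probs))

def worst_case_cost(c):
    """Cost in the worst scenario, ignoring probabilities."""
    return max(c)

def tail_average_cost(c, probs, tail=0.2):
    """Average cost over the worst `tail` fraction of the probability mass."""
    remaining, total = tail, 0.0
    for ci, pi in sorted(zip(c, probs), reverse=True):  # worst scenarios first
        take = min(pi, remaining)
        total += ci * take
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / tail

for name, c in costs.items():
    print(name, expected_cost(c, scenario_probs), worst_case_cost(c),
          tail_average_cost(c, scenario_probs))

best_by_expectation = min(costs, key=lambda k: expected_cost(costs[k], scenario_probs))
best_by_worst_case = min(costs, key=lambda k: worst_case_cost(costs[k]))
```

With these numbers, the expected cost favours `plan_a` while the two risk-averse measures favour `plan_b`, so the choice of risk measure changes which decision is considered optimal.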

For example, there is a paper
  * <a href="http://www.nrcresearchpress.com/doi/pdf/10.1139/cjfr-2014-0443">A. Kangas et al., **Simultaneous optimization of harvest schedule and data quality**, *Canadian Journal of Forest Research*, 45 (8), 2015</a>

where the uncertainty is modelled using scenarios, but the twist is that there is a possibility of measuring the states, which removes a part or all of the uncertainty. 