Change grid search for parallel tuning #796

haifeng-jin · 2022-12-23T02:01:29Z

GridSearch Redesign

Things to consider:

Conditional space.
Lazy discovery of hps.
Concurrent calls.

With these things in mind, even a simple grid search algorithm can be hard to get right.

Overall process

At the beginning, we populate all the value sets and put them in a queue. This covers only the hps discovered during Tuner._populate_initial_space, not the undiscovered ones. New populate_space() requests fetch from this queue. When a trial finishes, we check whether there are more combinations between the finished trial and its original next trial. If so, we put all of them into the queue. To check whether anything lies between two trials, we also maintain a linked list of trials sorted in ascending combination order, so that we can find the trial that follows a given one. We use a linked list because we keep inserting new combinations between existing trials while maintaining the ascending order.

Pseudo code:

# Try to exhaust all combinations between a1 & a2:
while next_combination(a1) < a2:
    new_a1 = next_combination(a1)
    queue.append(new_a1)
    link_list.insert_after(item=new_a1, pos=a1)
    a1 = new_a1

In the actual implementation, we do not push multiple combinations into the queue all at once; we do it lazily. We keep a queue of trials that are waiting to populate their next combination. Each time, we pick one trial from the queue and get its next combination. When a trial finishes, we push it into the queue.
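The lazy scheme can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `GRID`, `LazyGrid`, `populate`, and `finish` are hypothetical names, value sets are plain tuples, the linked list is a dict from a value set to its successor, and a single worker runs trials one after another.

```python
from collections import deque

# Hypothetical search space: each position holds the allowed values of one hp.
GRID = [(0, 1), (0, 1, 2)]


def next_combination(values):
    """Return the value set right after `values` in grid order, or None."""
    indices = [GRID[i].index(v) for i, v in enumerate(values)]
    # Increment like an odometer, starting from the rightmost hp.
    for pos in reversed(range(len(indices))):
        if indices[pos] + 1 < len(GRID[pos]):
            indices[pos] += 1
            return tuple(GRID[i][idx] for i, idx in enumerate(indices))
        indices[pos] = 0
    return None  # `values` was the last combination.


class LazyGrid:
    def __init__(self):
        self.successor = {}     # linked list: value set -> next value set
        self.waiting = deque()  # trials waiting to populate their next one
        self.started = False

    def populate(self):
        """Return the next value set to try, or None if none is available."""
        if not self.started:
            self.started = True
            first = tuple(values[0] for values in GRID)
            self.successor[first] = None
            return first
        while self.waiting:
            old = self.waiting.popleft()
            new = next_combination(old)
            if new is None:
                continue
            after_old = self.successor[old]
            # Skip when nothing is left between `old` and its successor.
            if after_old is not None and not new < after_old:
                continue
            # Insert `new` right after `old` to keep the list ascending.
            self.successor[new] = after_old
            self.successor[old] = new
            return new
        return None

    def finish(self, values):
        """A finished trial may now produce its next combination."""
        self.waiting.append(values)


# Single-worker driver: request values, "run" the trial, report it finished.
grid = LazyGrid()
order = []
values = grid.populate()
while values is not None:
    order.append(values)
    grid.finish(values)
    values = grid.populate()
```

With one worker, `order` ends up listing the whole grid in ascending order, one combination per request.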

Compare two sets of values

To achieve the above, we need a function that compares two sets of values and decides which one is larger in combination order, so that we can sort them in ascending order.

When comparing, only the active values are considered. We compare from the leftmost value to the rightmost. The first differing value decides which set is larger.

If the two sets contain different hps because different conditional scopes are active, the comparison still works: the parent hp must differ, and it is to the left of the first hp whose activation differs.

A corner case example

The comparison should also work between a finished trial and an ongoing trial (whose new hps have not been reported back yet). The logic above resolves most cases, but the case where one combination is a prefix of another needs special handling. In this "prefix" case, we judge the longer one as larger.
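A comparison with these rules might look like the sketch below. The encoding is an assumption made for illustration: each value set is a sequence of value indices for its active hps, ordered from the leftmost hp to the rightmost, and `compare` is a hypothetical name.

```python
def compare(a, b):
    """Return -1, 0, or 1 if `a` is smaller than, equal to, or larger than `b`.

    `a` and `b` are sequences of value indices for the active hps, ordered
    from the leftmost hp to the rightmost one (an assumed encoding).
    """
    for x, y in zip(a, b):
        if x != y:
            # The first differing value decides the order.
            return -1 if x < y else 1
    # No difference in the shared positions: either the two are identical,
    # or one is a prefix of the other and the longer one is judged larger.
    if len(a) == len(b):
        return 0
    return -1 if len(a) < len(b) else 1
```

Under this encoding the result agrees with Python's built-in tuple ordering, which also compares left to right and treats a prefix as smaller, for example `(1,) < (1, 0)`.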

This decision is for the following use case:

class MyHyperModel(keras_tuner.HyperModel):
    def build(self, hp):
        hp.Int("hp1", 0, 5)
        ...
    
    def fit(self, hp, model, **kwargs):
        hp.Int("hp2", 0, 5, default=0)
        ...

In the first round of parallel Oracle.create_trial() calls, the oracle does not know about hp2 yet. It populates hp1 from 0 to 5. Suppose the trial with hp1=0 finishes first, and hp2=0 was discovered during that trial. The oracle starts to populate hp1=0 with hp2=1 to 5 by calling next_combination(). After that, it would get hp1=1, hp2=0, which is actually the same as hp1=1 (the second trial), since hp2=0 will be discovered during that trial. So two value sets that look different may still describe the same trial. In this case, we need to check whether hp1=1, hp2=0 is greater than or equal to hp1=1, and we judge that it is.

Here is the general description. Trials a1 < a2 are next to each other before a1 finishes and reports some new hps. a2 keeps running for a long time. a1 starts to produce the combinations whose order falls between a1 and a2 by changing the values of the newly appeared hps. The new value sets are produced using the next-combination mechanism. Each time a new value set is produced, we need to judge whether all the combinations between a1 and a2 are exhausted, which is decided by whether the newly produced set is larger than a2. When a2 is a prefix of the newly produced set, we have exhausted the values.
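The hp1/hp2 case can be traced with a small sketch. It rests on assumptions: value sets are tuples `(hp1,)` or `(hp1, hp2)`, Python's built-in tuple ordering stands in for the comparison function (it compares left to right and treats a prefix as smaller), and `successor` and `queue` are hypothetical stand-ins for the linked list and the queue.

```python
HP1_VALUES = range(6)  # hp.Int("hp1", 0, 5)
HP2_VALUES = range(6)  # hp.Int("hp2", 0, 5, default=0)


def next_combination(values):
    """Next (hp1, hp2) pair in grid order, or None after the last one."""
    hp1, hp2 = values
    if hp2 + 1 in HP2_VALUES:
        return (hp1, hp2 + 1)
    if hp1 + 1 in HP1_VALUES:
        return (hp1 + 1, HP2_VALUES[0])
    return None


# a1 finished and reported hp2; a2 is still running and only knows hp1.
a1 = (0, 0)
a2 = (1,)

# Linked list kept as a dict from a value set to its successor.
successor = {a1: a2, a2: None}
queue = []

current = a1
new = next_combination(current)
# (1, 0) < (1,) is False because a2 is a prefix of (1, 0),
# so the loop stops once hp2 has run through 1..5 for hp1=0.
while new is not None and new < a2:
    queue.append(new)
    successor[new] = successor[current]
    successor[current] = new
    current = new
    new = next_combination(current)
```

After the loop, `queue` holds the five sets with hp1=0 and hp2 from 1 to 5, and `new` is `(1, 0)`, the set that is treated as already covered by the running trial a2.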

With the comparison function above, we achieve the following. Given two trials a1 and a2, we can tell whether there are untried value sets between them by checking next_combination(a1) < a2 (if true, there are sets between a1 and a2), even when a2 has not finished.

So even when a trial finishes with new hps, we can start producing more trials between it and its original next trial. This is good for parallel tuning.

Caveat

Do not use Oracle._tried_so_far, which does not account for a1's new hps in a2. Even when the combinations are exhausted, the new set will not equal a2 because of the new hps.

codecov-commenter · 2022-12-23T02:21:16Z

Codecov Report

Base: 95.09% // Head: 95.18% // Increases project coverage by +0.08% 🎉

Coverage data is based on head (9a42ef8) compared to base (bbff4b5).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #796      +/-   ##
==========================================
+ Coverage   95.09%   95.18%   +0.08%     
==========================================
  Files          50       50              
  Lines        3101     3159      +58     
==========================================
+ Hits         2949     3007      +58     
  Misses        152      152

Impacted Files	Coverage Δ
keras_tuner/tuners/gridsearch.py	`100.00% <100.00%> (ø)`


haifeng-jin marked this pull request as draft December 27, 2022 19:15

haifeng-jin force-pushed the grid branch from c63987c to f73d8fe Compare January 1, 2023 03:14

haifeng-jin changed the title ~~[WIP] Change grid search for parallel tuning~~ Change grid search for parallel tuning Jan 6, 2023

haifeng-jin marked this pull request as ready for review January 6, 2023 06:53

haifeng-jin marked this pull request as draft January 6, 2023 06:53

haifeng-jin added 7 commits January 7, 2023 00:59

design doc

3619b77

update design

6e287ca

add compare

5a44d59

implemented the design

d43d4b7

add tests

2ab046a

remove unreachable lines

0464937

use linked list instead of list

9a42ef8

haifeng-jin force-pushed the grid branch from 4287bdd to 9a42ef8 Compare January 7, 2023 01:00

haifeng-jin marked this pull request as ready for review January 7, 2023 02:21

haifeng-jin merged commit feb1bdb into master Jan 10, 2023

haifeng-jin deleted the grid branch January 10, 2023 19:54
