Commit
new docs, optimizer rehaul and AutoML
- there is now a new sub-module `autom8` inside which several AutoML features live:
  - `AutoParams` automatically generates a parameter dictionary and streamlines its manipulation before the experiment
  - `AutoModel` automatically creates an input model for `Scan()` which is fully wired for use with `AutoParams` or another experiment with comprehensive search
  - `AutoScan` leverages `AutoParams` and `AutoModel` to reduce the whole experiment into a single line of code
  - `AutoPredict` takes the results of `Scan` (or `AutoScan`), picks the best model candidates, evaluates the candidates, picks the winner, and makes predictions with it on input data
- the new docs are now completed
- added `local_strategy` to reduction strategies, which allows making changes to the parameter space from the local system while the experiment is running
- added `pearson` and `kendall` reduction strategies
- streamlined the way custom strategies can be added
- completely rebuilt the `correlation` strategy, including the underlying statistical approach
- added a helper function `cols_to_multilabel` for custom reducers
- added a new generator `SequenceGenerator`
- removed redundant files from the repo
- tests are updated to reflect the changes but do not yet cover the new features
1 parent 64a31ab, commit eed709e
Showing 57 changed files with 1,641 additions and 492 deletions.
@@ -0,0 +1,28 @@
# AutoModel

`AutoModel` provides an automated way to test several network architectures. Currently there are five supported architectures:

- conv1d
- lstm
- bidirectional_lstm
- simplernn
- dense

`AutoModel` creates an input model for `Scan()`. It is optimized for use together with `AutoParams()` and expects one or more of the above architectures to be included in the params dictionary, for example:

```python
p = {
    # ... other parameters ...
    'networks': ['dense', 'conv1d', 'lstm'],
    # ... other parameters ...
}
```

## AutoModel Arguments

Argument | Input | Description
--------- | ------- | -----------
`task` | str or None | `binary`, `multi_label`, `multi_class`, or `continuous`
`metric` | None or list | One or more Keras metrics (functions) to be used in the model

Setting `task` affects various aspects of the model and should be set according to the specific prediction task, or set to `None`, in which case the `metric` input is required.
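
As a minimal sketch of the expectation described above, the snippet below checks that every entry under `networks` in a params dictionary is one of the five architectures listed earlier. The helper name `unsupported_networks` is hypothetical and is not part of Talos; it only illustrates the constraint.

```python
# Architectures listed in this document as supported by AutoModel.
SUPPORTED_NETWORKS = {'conv1d', 'lstm', 'bidirectional_lstm', 'simplernn', 'dense'}


def unsupported_networks(params):
    """Return the 'networks' entries that are not in the supported list.

    Hypothetical helper for illustration; not part of the Talos API.
    """
    return [name for name in params.get('networks', [])
            if name not in SUPPORTED_NETWORKS]


p = {'networks': ['dense', 'conv1d', 'lstm']}
print(unsupported_networks(p))  # an empty list means every entry is supported
```

Running such a check before an experiment may catch a misspelled architecture name early.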
@@ -0,0 +1,82 @@
# AutoParams

`AutoParams()` automates the generation of a comprehensive parameter dictionary to be used as input for `Scan()` experiments, and provides a streamlined way to manipulate parameter dictionaries.

#### to automatically create a params dictionary

```python
p = talos.autom8.AutoParams().params
```

NOTE: The above example yields a very large permutation space, so configure `Scan()` accordingly with `fraction_limit`.
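
To get a feel for why the permutation space grows so quickly, the sketch below counts the combinations in a small, made-up parameter dictionary and estimates how many of them a given `fraction_limit` would leave. The dictionary contents and the helper `permutation_count` are assumptions for illustration, and the exact rounding Talos applies may differ.

```python
import math


def permutation_count(params):
    """Number of unique combinations in a dictionary of value lists."""
    return math.prod(len(values) for values in params.values())


# Hypothetical, deliberately small dictionary; AutoParams generates a far larger one.
p = {
    'batch_size': [20, 40, 60, 80],
    'epochs': [50, 100, 150],
    'dropout': [0.0, 0.1, 0.2, 0.3, 0.4],
    'optimizers': ['Adam', 'Nadam'],
}

total = permutation_count(p)  # 4 * 3 * 5 * 2 combinations
fraction_limit = 0.1
print(total, round(total * fraction_limit))
```

Every additional parameter multiplies the total, which is why a small `fraction_limit` is usually needed with an automatically generated dictionary.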

#### an alternative way where a class object is returned

```python
param_object = talos.autom8.AutoParams()
```

Various properties can now be accessed through `param_object`; these are detailed below. For example:

#### modifying a single parameter in the params dictionary

```python
param_object.batch_size(bottom_value=20, max_value=100, steps=10)
```

The modified params dictionary can now be accessed through `param_object.params`.

#### to append to an existing parameter dictionary

```python
params_dict = talos.autom8.AutoParams(p, task='multi_label').params
```

NOTE: When the dictionary is created for a prediction task other than 'binary', the `task` argument has to be declared accordingly (`binary`, `multi_label`, `multi_class`, or `continuous`).

## AutoParams Arguments

Argument | Input | Description
--------- | ------- | -----------
`params` | dict or None | If `None` then a new parameter dictionary is created
`task` | str | 'binary', 'multi_class', 'multi_label', or 'continuous'
`replace` | bool | Replace current dictionary entries with new ones
`auto` | bool | Automatically generate a params dictionary, or append to an existing one, with all available parameters
`network` | bool | If `True`, several model architectures will be added

## AutoParams Properties

The **`params`** property returns the parameter dictionary, which can be used as an input to `Scan()`.

The **`resample_params`** property accepts `n` as input and resamples the params dictionary so that `n` values remain for each parameter.
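
The idea behind resampling can be sketched in plain Python as follows. This is a hypothetical stand-in written for illustration, with an added `seed` argument for reproducibility; the actual Talos implementation may select values differently.

```python
import random


def resample_params(params, n, seed=None):
    """Keep at most n randomly chosen values for each parameter.

    Hypothetical stand-in for illustration; not the Talos implementation.
    """
    rng = random.Random(seed)
    return {key: rng.sample(values, min(n, len(values)))
            for key, values in params.items()}


p = {'batch_size': [20, 40, 60, 80], 'dropout': [0.0, 0.1, 0.2], 'epochs': [100]}
smaller = resample_params(p, n=2, seed=0)

# Parameters with fewer than n values keep everything they have.
print({key: len(values) for key, values in smaller.items()})
```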

All other properties relate to manipulating individual parameters in the parameter dictionary.

**`activations`** For controlling the corresponding parameter in the parameters dictionary.

**`batch_size`** For controlling the corresponding parameter in the parameters dictionary.

**`dropout`** For controlling the corresponding parameter in the parameters dictionary.

**`epochs`** For controlling the corresponding parameter in the parameters dictionary.

**`kernel_initializer`** For controlling the corresponding parameter in the parameters dictionary.

**`last_activation`** For controlling the corresponding parameter in the parameters dictionary.

**`layers`** For controlling the corresponding parameter (i.e. `hidden_layers`) in the parameters dictionary.

**`losses`** For controlling the corresponding parameter in the parameters dictionary.

**`lr`** For controlling the corresponding parameter in the parameters dictionary.

**`networks`** For controlling the Talos preset network architectures (`dense`, `lstm`, `bidirectional_lstm`, `conv1d`, and `simplernn`). NOTE: the use of preset networks requires the use of the input model from `AutoModel()` for `Scan()`.

**`neurons`** For controlling the corresponding parameter (i.e. `first_neuron`) in the parameters dictionary.

**`optimizers`** For controlling the corresponding parameter in the parameters dictionary.

**`shapes`** For controlling the Talos preset network shapes (`brick`, `funnel`, and `triangle`).

**`shapes_slope`** For controlling the shape parameter with a floating point value to set the slope of the network from input layer to output layer.
@@ -0,0 +1,33 @@
# AutoPredict

`AutoPredict()` automatically handles the process of finding the best models from a completed `Scan()` experiment, evaluates those models, and uses the winning model to make predictions on input data.

```python
scan_object = talos.autom8.AutoPredict(scan_object, x_val=x, y_val=y, x_pred=x)
```

NOTE: the input data must be in the same format as the 'x' that was used in `Scan()`.
Also, `x_val` and `y_val` should not have been exposed to the model during the
`Scan()` experiment.

`AutoPredict()` adds four new properties to the `Scan()` object:

- **`preds_model`** contains the winning Keras model (function)
- **`preds_parameters`** contains the hyperparameters for the selected model
- **`preds_probabilities`** contains the prediction probabilities for `x_pred`
- **`predict_classes`** contains the predicted classes for `x_pred`

## AutoPredict Arguments

Argument | Input | Description
--------- | ------- | -----------
`scan_object` | class object | the class object returned from `Scan()`
`x_val` | array or list of arrays | validation data features
`y_val` | array or list of arrays | validation data labels
`x_pred` | array or list of arrays | prediction data features
`n` | int | number of promising models to be included in the evaluation process
`metric` | None | the metric against which the validation is performed
`folds` | None | number of folds to be used for cross-validation
`shuffle` | None | whether the data is shuffled before splitting
`average` | str | 'binary', 'micro', 'macro', 'samples', or 'weighted'
`asc` | None | should be True if metric is a loss
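
The role of `n` and `asc` in picking candidates can be sketched with a plain list of results. The function `pick_candidates`, the shape of the result rows, and the metric names below are assumptions made for illustration; they are not the Talos implementation.

```python
def pick_candidates(results, metric, n, asc=False):
    """Return the n most promising rows from a list of result dictionaries.

    asc=True sorts ascending, which suits loss-like metrics where lower is better.
    """
    ranked = sorted(results, key=lambda row: row[metric], reverse=not asc)
    return ranked[:n]


# Hypothetical experiment log: one dictionary per tested permutation.
results = [
    {'id': 0, 'val_acc': 0.81, 'val_loss': 0.52},
    {'id': 1, 'val_acc': 0.88, 'val_loss': 0.41},
    {'id': 2, 'val_acc': 0.85, 'val_loss': 0.47},
]

best_by_acc = pick_candidates(results, 'val_acc', n=2)
best_by_loss = pick_candidates(results, 'val_loss', n=2, asc=True)
print([row['id'] for row in best_by_acc], [row['id'] for row in best_by_loss])
```

With an accuracy-like metric the highest values rank first, while `asc=True` flips the order so that the lowest loss ranks first.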
@@ -0,0 +1,31 @@
# AutoScan

`AutoScan()` provides a streamlined way of conducting a hyperparameter search experiment with any dataset. It is particularly useful for early exploration, because with default settings `AutoScan()` casts a very broad parameter space that includes all common hyperparameters, network shapes, and sizes, as well as architectures.

Configure the `AutoScan()` experiment and then use the property `start` in the returned class object to start the actual experiment.

```python
auto = talos.autom8.AutoScan(task='binary', max_param_values=2)
auto.start(x, y, experiment_name='testing.new', fraction_limit=0.001)
```

NOTE: `auto.start()` accepts all `Scan()` arguments.

## AutoScan Arguments

Argument | Input | Description
--------- | ------- | -----------
`task` | str or None | `binary`, `multi_label`, `multi_class`, or `continuous`
`max_param_values` | int | Number of parameter values to be included

Setting `task` affects various aspects of the model and should be set according to the specific prediction task, or set to `None`, in which case the `metric` input is required.

## AutoScan Properties

The only property, **`start`**, starts the actual experiment. `AutoScan.start()` accepts the following arguments:

Argument | Input | Description
--------- | ------- | -----------
`x` | array or list of arrays | prediction features
`y` | array or list of arrays | prediction outcome variable
`kwargs` | arguments | any `Scan()` argument can be passed into `AutoScan.start()`
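
The pass-through of `Scan()` arguments can be pictured as simple keyword forwarding. The sketch below uses a stand-in `scan` function and an `AutoScanSketch` class, both hypothetical and written only to show the pattern; they do not reproduce how Talos is implemented.

```python
def scan(x, y, params, model, **scan_kwargs):
    """Hypothetical stand-in for Scan(); it only reports what it received."""
    return {'n_samples': len(x), 'scan_kwargs': scan_kwargs}


class AutoScanSketch:
    """Illustrative only; not the Talos implementation."""

    def __init__(self, task, max_param_values):
        self.task = task
        self.max_param_values = max_param_values

    def start(self, x, y, **kwargs):
        # In Talos these would come from AutoParams and AutoModel.
        params, model = {}, None
        # Everything the caller passed beyond x and y goes straight to the scan.
        return scan(x, y, params=params, model=model, **kwargs)


auto = AutoScanSketch(task='binary', max_param_values=2)
out = auto.start([[0], [1]], [0, 1], experiment_name='testing.new', fraction_limit=0.001)
print(out['scan_kwargs'])
```

Forwarding keyword arguments this way means the wrapper does not need to be updated whenever the underlying function gains a new option.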
@@ -0,0 +1,17 @@
# Custom Reducer

A custom reduction strategy can be created and dropped into Talos. Read more about the reduction principle.

There are only two criteria to meet:

- The input of the custom strategy is 2-dimensional
- The output of the custom strategy is in the form:

```python
return label, value
```

Here `label` is the name of any hyperparameter and `value` is any value of that hyperparameter. Any arbitrary strategy can be implemented, as long as the input and output criteria are met.
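
A strategy meeting both criteria might look like the sketch below. The layout of the 2-dimensional input (a header row followed by one row per completed permutation, with the metric in the first column) is an assumption made for this example, as is the function name `worst_value_strategy`; check the existing reducers in the Talos package for the layout that is actually passed in.

```python
from collections import defaultdict


def worst_value_strategy(table):
    """Return (label, value) for the hyperparameter value with the lowest mean metric.

    Assumes `table` is a list of rows: a header row, then one row per
    completed permutation with the metric in the first column.
    """
    header, rows = table[0], table[1:]
    scores = defaultdict(list)
    for row in rows:
        metric = row[0]
        # Group the metric by every (hyperparameter name, value) pair in the row.
        for label, value in zip(header[1:], row[1:]):
            scores[(label, value)].append(metric)
    label, value = min(scores, key=lambda k: sum(scores[k]) / len(scores[k]))
    return label, value


table = [
    ['val_acc', 'batch_size', 'dropout'],
    [0.90, 20, 0.1],
    [0.88, 20, 0.5],
    [0.70, 40, 0.1],
    [0.60, 40, 0.5],
]
print(worst_value_strategy(table))
```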

The file containing the strategy can then be placed in `/reducers` in the Talos package, with corresponding changes made to `/reducers/reduce_run.py` to make the strategy available in `Scan()`. Having done this, the reduction strategy is available as per the example [above](#probabilistic-reduction).

A [pull request](https://github.com/autonomio/talos/pulls) is highly encouraged once a beneficial reduction strategy has been successfully added.
@@ -0,0 +1,51 @@
# AutoML

Performing an AutoML style hyperparameter search experiment with Talos takes only a few lines of code.

The single-file code example can be found [here](Examples_AutoML_Code.md).

### Imports

```python
import talos
import wrangle
```

### Loading Data

```python
x, y = talos.templates.datasets.cervical_cancer()

# hold out 10% of the data for testing later
x, y, x_test, y_test = wrangle.array_split(x, y, .1)

# then split off the validation data
x_train, y_train, x_val, y_val = wrangle.array_split(x, y, .2)
```

`x` and `y` are expected to be either numpy arrays or lists of numpy arrays. The same applies when `x_train`, `y_train`, `x_val`, and `y_val` are used instead.
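
The effect of the two consecutive splits on the overall proportions can be worked out with a few lines of arithmetic. The sketch assumes that each split holds out the given fraction of whatever rows it receives; the helper `remaining_after_split` is hypothetical and does not call `wrangle`.

```python
def remaining_after_split(fraction_available, holdout):
    """Return (kept, held_out) fractions of the original data after one split."""
    held_out = fraction_available * holdout
    return fraction_available - held_out, held_out


# Hold out 10% for testing, then 20% of the remainder for validation.
after_test, test = remaining_after_split(1.0, 0.1)
train, val = remaining_after_split(after_test, 0.2)

print(round(train, 2), round(val, 2), round(test, 2))
```

Under that assumption, the validation share is taken from the remaining 90% of the data rather than from the full dataset.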

### Defining the Model

In this case there is no need to define the model. `talos.autom8.AutoModel()` is used behind the scenes, and it provides several model architectures that are fully wired for Talos. We simply initiate the `AutoScan()` object first:

```python
autom8 = talos.autom8.AutoScan('binary', 5)
```

### Parameter Dictionary

There is also no need to worry about the parameter dictionary. This is handled in the background with `AutoParams()`.

### Scan()

The `Scan()` itself is started through the **`start`** property of the `AutoScan()` class object.

```python
autom8.start(x=x_train,
             y=y_train,
             x_val=x_val,
             y_val=y_val,
             fraction_limit=0.000001)
```

We pass data here just as we normally would in `Scan()`. You are also free to use any of the `Scan()` arguments here to configure the experiment. Find the description for all `Scan()` arguments [here](Scan.md#scan-arguments).
@@ -0,0 +1,22 @@
[BACK](Examples_AutoML.md)

# AutoML

```python
import talos
import wrangle

x, y = talos.templates.datasets.cervical_cancer()

# hold out 10% of the data for testing later
x, y, x_test, y_test = wrangle.array_split(x, y, .1)

# then split off the validation data
x_train, y_train, x_val, y_val = wrangle.array_split(x, y, .2)

# task is 'binary'; the second argument is max_param_values
autom8 = talos.autom8.AutoScan('binary', 5)

autom8.start(x=x_train,
             y=y_train,
             x_val=x_val,
             y_val=y_val,
             fraction_limit=0.000001)
```