- Introduction
- Installation and Requirements
- Quickly Build Your Bilevel Meta-Learning Model
- Modification and Extension
- Authors and License
BOML is a bilevel optimization library in Python for meta learning. Before reading the documentation, you can refer to the project's GitHub page for a brief introduction to meta learning and BOML.
Here we provide detailed instructions so that you can quickly get started with your research and test the performance of popular algorithms and new ideas.
BOML implements various meta learning approaches based on TensorFlow, which is one of the most popular machine learning platforms. In addition, NumPy and basic image processing modules are required for installation.
We also provide requirements.txt as a reference for version control.
BOML requires Python 3.5+ and TensorFlow 1.13+.
1. Install from the GitHub page:
    git clone https://github.com/liuyaohua918/boml.git
    cd boml
    python setup.py install
or
    pip install -r requirements.txt
2. Install with pip:
    pip install boml
or
    pip install --upgrade boml
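After installing, it can help to confirm that the environment matches the stated requirements (Python 3.5+ and TensorFlow 1.13+). The sketch below is not part of BOML: `meets_minimum` and `is_importable` are hypothetical helper names, and version strings are compared only on their leading numeric components.

```python
import importlib.util
import sys


def meets_minimum(version, minimum):
    """Return True if a dotted version string is at least `minimum`."""
    def parse(text):
        parts = []
        for piece in text.split("."):
            digits = ""
            for ch in piece:
                if not ch.isdigit():
                    break
                digits += ch
            if not digits:
                break  # stop at the first non-numeric component
            parts.append(int(digits))
        return tuple(parts)

    return parse(version) >= parse(minimum)


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    print("Python >= 3.5:", meets_minimum(python_version, "3.5"))
    for name in ("tensorflow", "numpy", "boml"):
        print(name, "importable:", is_importable(name))
```

If TensorFlow is importable, its `__version__` string could be passed to `meets_minimum` together with "1.13" in the same way.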
- Core Modules:
- load_data
- Related:
- boml.load_data.meta_omniglot
- boml.load_data.meta_mini_imagenet
- boml.load_data.mnist
- ...
    boml.meta_omniglot(
        folder=DATA_FOLDER, std_num_classes=None, examples_train=None,
        examples_test=None, one_hot_enc=True, _rand=0, n_splits=None)
    boml.meta_mini_imagenet(
        folder=DATA_FOLDER, sub_folders=None, std_num_classes=None,
        examples_train=None, examples_test=None, resize=84,
        one_hot_enc=True, load_all_images=True, h5=False)
boml.load_data manages different datasets and generates batches of tasks for training and testing.
- Args:
- folder: str, root folder name. Use os module to modify the path to the datasets. For example, os.environ["DATASETS_FOLDER"] = "../data/"
- std_num_classes: int, number of classes for N-way classification
- examples_train: int, number of examples picked per class for training in each generated task (e.g., 1-shot: examples_train=1)
- examples_test: int, number of examples picked per class for testing in each generated task
- one_hot_enc: bool, whether to adopt one-hot encoding
- _rand: int, random seed or RandomState used to generate the training, validation, and testing meta-dataset splits
- n_splits: number of classes per split
- Usage:
    dataset = boml.meta_omniglot(
        std_num_classes=args.num_classes,
        examples_train=args.num_examples,
        examples_test=args.examples_test)
- Returns: an initialized instance of data loader
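To make std_num_classes, examples_train, and examples_test concrete, the following sketch samples one N-way task from a toy in-memory dataset. It is an illustration in plain Python rather than BOML's loader: `sample_task` and the toy data layout (a dict mapping a class name to a list of examples) are assumptions made for this example.

```python
import random


def sample_task(data_by_class, num_classes, examples_train, examples_test, seed=0):
    """Sample one N-way task with disjoint training and testing examples."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(data_by_class), num_classes)
    train, test = [], []
    for label, name in enumerate(classes):
        # Draw enough distinct examples for both splits, then cut once.
        picked = rng.sample(data_by_class[name], examples_train + examples_test)
        train += [(x, label) for x in picked[:examples_train]]
        test += [(x, label) for x in picked[examples_train:]]
    return train, test


# Toy dataset: 10 classes with 20 placeholder examples each.
toy = {"class_%d" % c: ["img_%d_%d" % (c, i) for i in range(20)] for c in range(10)}
train, test = sample_task(toy, num_classes=5, examples_train=1, examples_test=15)
print(len(train), len(test))  # 5 75
```

With 5 classes, 1 training example and 15 testing examples per class, the task holds 5 training pairs and 75 testing pairs.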
- BOMLExperiment
- Aliases:
- boml.load_data.BOMLExperiment
boml.BOMLExperiment( dataset=None, dtype=tf.float32)
boml.BOMLExperiment manages inputs, outputs and task-specific parameters.
- Args:
- dataset: initialized instance of load_data
- dtype: data type, defaults to tf.float32
- Attributes:
- x: input placeholder for your defined lower level problem
- y: label placeholder for your defined lower level problem
- x_: input placeholder for your defined upper level problem
- y_: label placeholder for your defined upper level problem
- model: used to store the task-specific model
- errors: dictionary that stores the defined loss functions of the different levels
- scores: dictionary that stores the defined accuracy functions
- optimizer: dictionary that stores the optimizers chosen for inner and outer loop optimization
- Usage:
    ex = boml.BOMLExperiment(dataset=dataset)
    ex.errors['training'] = boml.utils.cross_entropy(
        pred=ex.model.out, label=ex.y, method='MetaRper')
    ex.scores['accuracy'] = tf.contrib.metrics.accuracy(
        tf.argmax(tf.nn.softmax(ex.model.out), 1), tf.argmax(ex.y, 1))
    ex.optimizer['apply_updates'], _ = boml.BOMLOptSGD(
        learning_rate=lr0).minimize(ex.errors['training'], var_list=ex.model.var_list)
- Returns: an initialized instance of BOMLExperiment
- BOMLOptimizer
- Aliases:
- boml.boml_optimizer.BOMLOptimizer
    boml.BOMLOptimizer(
        Method=None, inner_method=None, outer_method=None,
        truncate_iter=-1, experiments=[])
BOMLOptimizer is the main class in boml, which takes responsibility for the whole process of model construction and back propagation.
- Args:
- Method: str, defines the basic method for the following training process; it should be one of [MetaInit, MetaFeat]. The MetaInit type includes methods like MAML, FOMAML, MT-net, WarpGrad; the MetaFeat type includes methods like BDA, RHG, TRHG, HOAG, DARTS.
- inner_method: str, method chosen for solving the LL problem, one of [Trad, Simple, Aggr]. The MetaFeat type chooses either Trad for traditional optimization strategies or Aggr for gradient aggregation optimization. The MetaInit type should choose Simple, and set specific parameters for detailed method choices like FOMAML or MT-net.
- outer_method: str, method chosen for solving the UL problem, one of [Reverse, Simple, DARTS, Implicit]. The MetaInit type should choose Simple, and set specific parameters for detailed method choices like FOMAML.
- truncate_iter: int, specific parameter for the Truncated Gradient (TRHG) method, defining the number of iterations to truncate in the back propagation process
- experiments: list of BOMLExperiment objects that have already been initialized
- Usage:
    ex = boml.BOMLExperiment(boml.meta_omniglot(5, 1, 15))
    boml_ho = boml.BOMLOptimizer(
        Method='MetaInit', inner_method='Simple', outer_method='Simple',
        experiments=ex)
- Utility Functions:
- learning_rate(): returns defined inner learning rate
- meta_learning_rate(): returns defined outer learning rate
- Method: returns the defined method type
- param_dict: returns the dictionary that stores general parameters, like use_t, use_warp, output shape of the defined model, learn_lr, s, t, alpha, first_order.
- Returns: an initialized instance of BOMLOptimizer
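The pairing of Method, inner_method, and outer_method can be checked before a BOMLOptimizer is built. The sketch below is a hypothetical helper, not BOML code. It assumes one reading of the argument list: the MetaInit family uses Simple at both levels, and the MetaFeat family uses Trad or Aggr for the inner level. The option spelling "Implicit" is also an assumption.

```python
METHODS = ("MetaInit", "MetaFeat")
INNER_METHODS = ("Trad", "Simple", "Aggr")
OUTER_METHODS = ("Reverse", "Simple", "DARTS", "Implicit")  # spelling assumed


def check_strategy(method, inner_method, outer_method):
    """Return a list of problems; an empty list means the choice looks consistent."""
    problems = []
    if method not in METHODS:
        problems.append("unknown method: %r" % (method,))
    if inner_method not in INNER_METHODS:
        problems.append("unknown inner_method: %r" % (inner_method,))
    if outer_method not in OUTER_METHODS:
        problems.append("unknown outer_method: %r" % (outer_method,))
    if method == "MetaInit":
        # Assumed rule: the MetaInit family uses Simple for both levels.
        if inner_method != "Simple":
            problems.append("MetaInit expects inner_method='Simple'")
        if outer_method != "Simple":
            problems.append("MetaInit expects outer_method='Simple'")
    if method == "MetaFeat" and inner_method == "Simple":
        # Assumed rule: the MetaFeat family uses Trad or Aggr for the inner level.
        problems.append("MetaFeat expects inner_method 'Trad' or 'Aggr'")
    return problems


print(check_strategy("MetaInit", "Simple", "Simple"))  # []
print(check_strategy("MetaInit", "Trad", "Simple"))
```

Returning a list of messages instead of raising on the first mismatch lets a script report every inconsistent option at once.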
- Core Built-in functions of BOMLOptimizer:
- BOMLOptimizer.meta_learner:
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.meta_learner()
    boml.boml_optimizer.BOMLOptimizer.meta_learner(
        _input, dataset, meta_model='V1', name='Hyper_Net',
        use_t=False, use_warp=False, **model_args)
This method must be called once at first to build meta modules and initialize meta parameters and neural networks.
- Args:
- _input: original input for neural network construction
- dataset: which dataset to use for training and testing. It should be initialized before being passed into the function
- meta_model: model chosen for neural network construction: V1 for C4L with a fully connected layer, V2 for residual blocks with a fully connected layer
- name: name of the meta model modules, used for BOMLNet initialization
- use_t: whether to use the T layer for C4L neural networks
- use_warp: whether to use the Warp layer for C4L neural networks
- model_args: optional arguments to set specific parameters of neural networks
- BOMLOptimizer.base_learner:
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.base_learner()
    boml.boml_optimizer.BOMLOptimizer.base_learner(
        _input, meta_learner, name='Task_Net',
        weights_initializer=tf.zeros_initializer)
This method has to be called for every experiment and takes responsibility for defining task-specific modules and inner optimizer.
- Args:
- _input: original input for neural network construction of the task-specific module
- meta_learner: returned value of the meta_learner function, which is an instance of BOMLNet or one of its child classes
- name: name for Base model modules used for BOMLNet initialization
- weights_initializer: initializer function for task_specific network, called by 'MetaInit' method
- Returns: task-specific model part
- BOMLOptimizer.ll_problem:
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.ll_problem()
    boml.boml_optimizer.BOMLOptimizer.ll_problem(
        inner_objective, learning_rate, T, inner_objective_optimizer='SGD',
        outer_objective=None, learn_lr=False, alpha_init=0.0, s=1.0, t=1.0,
        learn_alpha=False, learn_st=False, learn_alpha_itr=False,
        var_list=None, init_dynamics_dict=None, first_order=False,
        loss_func=utils.cross_entropy, momentum=0.5, beta1=0.0, beta2=0.999,
        regularization=None, experiment=None, scalor=0.0, **inner_kargs)
After the neural networks are constructed, the solution strategy for the lower level problem is defined in ll_problem.
- Args:
- inner_objective: loss function for the inner optimization problem
- learning_rate: step size for inner loop optimization
- T: number of steps for inner gradient descent optimization
- inner_objective_optimizer: optimizer type for the inner problem, one of [SGD, Momentum, Adam]
- outer_objective: loss function for the outer optimization problem, which needs to be specified in the BDA algorithm
- alpha_init: initial value of the ratio of inner objective to outer objective in the BDA algorithm
- s, t: coefficients for the aggregation of inner and outer objectives in the BDA algorithm, default to 1.0
- learn_alpha: parameter for the BDA algorithm that decides whether to initialize alpha as a hyperparameter
- learn_alpha_itr: parameter for the BDA algorithm that specifies whether to initialize alpha as a vector, in which every dimension's value is the step-wise scale factor for the optimization process
- learn_st: parameter for the BDA algorithm that decides whether to initialize s and t as hyperparameters
- first_order: specifies whether to use first-order MAML, defaults to False
- loss_func: specifies which type of loss function is used for the MAML-based methods; it should be consistent with the form used to compute the inner objective
- momentum: parameter for Optimizer.BOMLOptMomentum that sets the initial value of momentum
- regularization: whether to add regularization terms to the inner objective
- experiment: instance of BOMLExperiment to use in the lower level problem, especially needed for the MetaFeat type of method
- var_list: optional list of variables (of the inner optimization problem)
- inner_kargs: optional arguments to pass to boml.boml_optimizer.BOMLOptimizer.compute_gradients
- Returns: the inner gradient object, which is passed to ul_problem as inner_grad
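The roles of learning_rate and T can be seen in a plain gradient-descent loop. The sketch below is a scalar toy in pure Python, not BOML's graph construction; `inner_loop` and the quadratic objective are assumptions chosen so that the iterates are easy to verify by hand.

```python
def inner_loop(w0, grad_fn, learning_rate, T):
    """Run T plain gradient-descent steps and return every iterate."""
    trajectory = [w0]
    w = w0
    for _ in range(T):
        w = w - learning_rate * grad_fn(w)
        trajectory.append(w)
    return trajectory


# Toy inner objective f(w) = 0.5 * (w - 3)^2, whose gradient is (w - 3).
steps = inner_loop(w0=0.0, grad_fn=lambda w: w - 3.0, learning_rate=0.5, T=4)
print(steps)  # [0.0, 1.5, 2.25, 2.625, 2.8125]
```

Each step halves the distance to the minimizer at 3, so a larger T brings the last iterate closer to the inner solution at the cost of a longer unrolled computation.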
- BOMLOptimizer.ul_problem
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.ul_problem()
    boml.boml_optimizer.BOMLOptimizer.ul_problem(
        outer_objective, meta_learning_rate, inner_grad, meta_param=None,
        outer_objective_optimizer='Adam', epsilon=1.0, momentum=0.5,
        global_step=None)
This method defines the upper level problem and chooses the optimizer for the meta parameters; it should be called after ll_problem.
- Args:
- outer_objective: scalar tensor for the outer objective
- meta_learning_rate: step size for outer loop optimization
- inner_grad: returned value of boml.BOMLOptimizer.ll_problem()
- meta_param: optional list of outer parameters and model parameters
- outer_objective_optimizer: optimizer type for the outer parameters, one of [SGD, Momentum, Adam]
- epsilon: float, coefficient used in the DARTS algorithm
- momentum: parameter used to initialize the Momentum algorithm
- Returns: meta_param list, used for debugging
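How the outer objective depends on the inner loop can be illustrated with a scalar, MAML-style toy problem in which the meta parameter is the initialization of the inner variable. This is a hand-derived sketch in pure Python, not BOML's implementation; the function names, the quadratic objectives, and the step sizes are all assumptions.

```python
def adapt(theta, target, lr, T):
    """Inner loop: T gradient steps on 0.5 * (w - target)^2, starting from theta."""
    w = theta
    for _ in range(T):
        w = w - lr * (w - target)
    return w


def outer_loss(theta, train_target, val_target, lr, T):
    """Outer objective evaluated at the adapted parameter."""
    return 0.5 * (adapt(theta, train_target, lr, T) - val_target) ** 2


def outer_gradient(theta, train_target, val_target, lr, T, first_order=False):
    """Gradient of the outer loss with respect to the initialization theta."""
    w_T = adapt(theta, train_target, lr, T)
    # Each inner step scales d w / d theta by (1 - lr); first-order drops this factor.
    dw_dtheta = 1.0 if first_order else (1.0 - lr) ** T
    return (w_T - val_target) * dw_dtheta


theta, meta_lr = 0.0, 0.5
initial_loss = outer_loss(theta, 1.0, 2.0, lr=0.1, T=3)
for _ in range(200):
    theta -= meta_lr * outer_gradient(theta, 1.0, 2.0, lr=0.1, T=3)
final_loss = outer_loss(theta, 1.0, 2.0, lr=0.1, T=3)
print(initial_loss, final_loss)
```

Setting first_order=True treats the adapted parameter as if it moved one-to-one with theta, which is the approximation made by first-order variants.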
- aggregate_all:
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.aggregate_all()
    boml.boml_optimizer.BOMLOptimizer.aggregate_all(
        aggregation_fn=None, gradient_clip=None)
Finally, aggregate_all has to be called to aggregate the gradients of different tasks and to define the operations that apply the outer gradients and update the meta parameters.
- Args:
- aggregation_fn: optional operation to aggregate multiple outer gradients (for the same meta parameter), by default reduce_mean
- gradient_clip: optional operation to clip the aggregated outer gradients
- Returns: None
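The effect of aggregation_fn and gradient_clip can be sketched with plain Python numbers standing in for tensors. `aggregate_gradients` below is a hypothetical helper, not BOML code; it assumes each task contributes one gradient per meta parameter and that the default aggregation is the mean.

```python
def mean(values):
    """Default aggregation: arithmetic mean over tasks."""
    return sum(values) / len(values)


def aggregate_gradients(per_task_grads, aggregation_fn=None, gradient_clip=None):
    """Combine per-task gradients (one dict per task) into one gradient per parameter."""
    if aggregation_fn is None:
        aggregation_fn = mean
    aggregated = {}
    for name in per_task_grads[0]:
        value = aggregation_fn([grads[name] for grads in per_task_grads])
        if gradient_clip is not None:
            value = gradient_clip(value)  # clip after aggregating
        aggregated[name] = value
    return aggregated


def clip_to_five(g):
    """Clip a scalar gradient to the interval [-5, 5]."""
    return max(-5.0, min(5.0, g))


task_grads = [{"w": 1.0, "b": -4.0}, {"w": 3.0, "b": -8.0}]
result = aggregate_gradients(task_grads, gradient_clip=clip_to_five)
print(result)  # {'w': 2.0, 'b': -5.0}
```

Clipping is applied to the aggregated value, matching the description of gradient_clip as acting on the aggregated outer gradients.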
- run:
- Aliases:
- boml.boml_optimizer.BOMLOptimizer.run()
    boml.boml_optimizer.BOMLOptimizer.run(
        inner_objective_feed_dicts=None, outer_objective_feed_dicts=None,
        session=None, _skip_hyper_ts=False, _only_hyper_ts=False,
        callback=None)
- Args:
- inner_objective_feed_dicts: an optional feed dictionary for the inner problem. Can be a function of the step, which accounts for, e.g., stochastic gradient descent.
- outer_objective_feed_dicts: an optional feed dictionary for the outer optimization problem (passed to the evaluation of the outer objective). Can be a function of the hyper-iteration step (i.e. the global variable), which may account for, e.g., stochastic evaluation of the outer objective.
- session: optional session
- callback: optional callback function of signature (step (int), feed_dictionary, tf.Session) -> None that is called after every forward iteration
- Returns: None
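A feed dictionary that "can be a function of step" and a per-iteration callback can be handled as in the toy loop below. This is a sketch of the pattern only, with no TensorFlow session involved; `resolve_feed_dict` and `run_steps` are hypothetical names, and the session argument passed to the callback is a placeholder set to None.

```python
def resolve_feed_dict(feed, step):
    """Return a concrete feed dictionary for the given step."""
    if feed is None:
        return {}
    return feed(step) if callable(feed) else feed


def run_steps(num_steps, inner_feed=None, callback=None, session=None):
    """Toy forward loop: resolve the feed at each step, then call the callback."""
    for step in range(num_steps):
        fd = resolve_feed_dict(inner_feed, step)
        # A real loop would execute the inner update here using `fd`.
        if callback is not None:
            callback(step, fd, session)


batches = [[1, 2], [3, 4], [5, 6]]
seen = []
run_steps(
    3,
    inner_feed=lambda step: {"x": batches[step % len(batches)]},
    callback=lambda step, fd, session: seen.append((step, fd["x"])),
)
print(seen)  # [(0, [1, 2]), (1, [3, 4]), (2, [5, 6])]
```

Passing a function instead of a fixed dictionary lets each step draw a different mini-batch, which is the stochastic gradient descent case mentioned above.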
- Simple Running Example
    import tensorflow as tf

    import boml
    from boml import utils
    from test_script.script_helper import *

    dataset = boml.load_data.meta_omniglot(
        std_num_classes=args.classes,
        examples_train=args.examples_train,
        examples_test=args.examples_test,
    )
    # Create an instance of BOMLExperiment for one single task
    ex = boml.BOMLExperiment(dataset)

    # Build the meta learner and the task-specific base learner
    boml_ho = boml.BOMLOptimizer(
        method="MetaInit", inner_method="Simple", outer_method="Simple"
    )
    meta_learner = boml_ho.meta_learner(_input=ex.x, dataset=dataset, meta_model="V1")
    ex.model = boml_ho.base_learner(_input=ex.x, meta_learner=meta_learner)

    # Define the lower level problem
    loss_inner = utils.cross_entropy(pred=ex.model.out, label=ex.y)
    accuracy = utils.classification_acc(pred=ex.model.out, label=ex.y)
    inner_grad = boml_ho.ll_problem(
        inner_objective=loss_inner,
        learning_rate=args.lr,
        T=args.T,
        experiment=ex,
        var_list=ex.model.var_list,
    )

    # Define the upper level problem
    loss_outer = utils.cross_entropy(pred=ex.model.re_forward(ex.x_).out, label=ex.y_)
    boml_ho.ul_problem(
        outer_objective=loss_outer,
        meta_learning_rate=args.meta_lr,
        inner_grad=inner_grad,
        meta_param=tf.get_collection(boml.extension.GraphKeys.METAPARAMETERS),
    )

    # Only needs to be called once, after all the tasks are ready
    boml_ho.aggregate_all()

    with tf.Session() as sess:
        tf.global_variables_initializer().run(session=sess)
        for itr in range(args.meta_train_iterations):
            # Generate the feed_dict for each call to run()
            train_batch = BatchQueueMock(
                dataset.train, 1, args.meta_batch_size, utils.get_rand_state(1)
            )
            tr_fd, v_fd = utils.feed_dict(train_batch.get_single_batch(), ex)
            # Meta training step
            boml_ho.run(tr_fd, v_fd)
            if itr % 100 == 0:
                print(sess.run(loss_inner, utils.merge_dicts(tr_fd, v_fd)))
- Extensible Base Classes and Modules
- BOMLNet
- Aliases:
- boml.networks.BOMLNet
- Methods to be overridden:
- forward(): builds the defined convolutional neural network from the initial input
- re_forward(new_input): reuses the defined convolutional network with a new input and updates the output
- create_outer_parameters(): this method creates the parameters of the upper level problem and adds them to the defined collection named METAPARAMETERS
- Args:
- var_collections: collections used to store the meta parameters created in the calling scope
- Returns: dictionary that indexes the outer parameters
- create_model_parameters(): this method creates model parameters of the upper level problem, like t-layer or Warp-layer, and adds them to the defined collection called METAPARAMETERS
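The relationship between forward() and re_forward(new_input) can be pictured with a toy class that keeps its parameters fixed while the input changes. `ToyNet` below is a hypothetical stand-in written in pure Python; it is not a BOMLNet subclass. Having re_forward return a new object that shares the parameters is an assumption about the pattern, chosen so that `.out` can be read from the result as in the running example.

```python
class ToyNet:
    """Toy linear model illustrating the forward / re_forward pattern."""

    def __init__(self, _input, weight=2.0, bias=1.0):
        self.weight = weight
        self.bias = bias
        self.inp = _input
        self.out = None
        self.forward()

    def forward(self):
        """Compute the output from the stored input and parameters."""
        self.out = [self.weight * x + self.bias for x in self.inp]
        return self

    def re_forward(self, new_input):
        """Reuse the same parameters on a new input; the original is left untouched."""
        return ToyNet(new_input, self.weight, self.bias)


net = ToyNet([1.0, 2.0])
print(net.out)                    # [3.0, 5.0]
print(net.re_forward([0.0]).out)  # [1.0]
```

In a bilevel setting this is what allows the same adapted parameters to be evaluated on the training input and then again on the validation input.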
- Utility functions:
- get_conv_weight(boml_net, layer, initializer):
- Args:
- boml_net: initialized instance of BOMLNet
- layer: int32, index of the convolutional block whose weight is to be created
- initializer: the TensorFlow initializer used to initialize the filters
- Returns: created parameter
- get_bias_weight(boml_net, layer, initializer):
- Args:
- boml_net: initialized instance of BOMLNet
- layer: int32, index of the convolutional block whose bias is to be created
- initializer: the TensorFlow initializer used to initialize the bias
- Returns: created parameter
- get_identity(dim, name, conv=True):
- Args:
- dim: the dimension of the identity matrix
- name: name used to initialize the matrix
- conv: bool, whether to initialize the matrix or the real value, defaults to True
- Returns: the created parameter
- conv_block(boml_net, cweight, bweight): applies the given convolutional weight and bias to the current output of boml_net
- Args:
- boml_net: initialized instance of BOMLNet
- cweight: parameter of the convolutional filter
- bweight: parameter of the bias for convolutional neural networks
- conv_block_t(boml_net, conv_weight, conv_bias, zweight): applies the given convolutional weight, bias, and weights of the t layer to the current output of boml_net
- Args:
- boml_net: initialized instance of BOMLNet
- cweight: parameter of the convolutional filter
- bweight: parameter of the bias for convolutional neural networks
- conv_block_warp(boml_net, cweight, bweight, zweight, zbias): applies the given convolutional weight, bias, and filters of the warp layer to the current output of boml_net
- Args:
- boml_net: initialized instance of BOMLNet
- cweight: parameter of the convolutional filter
- bweight: parameter of the bias for convolutional neural networks
- BOMLInnerGrad
- Aliases:
- boml.LLProblem.BOMLInnerGrad
- Methods to be overridden:
- compute_gradients(boml_opt, loss_inner, loss_outer=None, inner_method=None, param_dict=OrderedDict(), var_list=None, **inner_kargs): delivers functionality equivalent to the compute_gradients() method of tf.train.Optimizer
- Args:
- boml_opt: instance of boml.optimizer.BOMLOpt, which is automatically created by the method in boml.boml_optimizer.BOMLOptimizer
- loss_inner: inner objective, which can be passed by boml.boml_optimizer.BOMLOptimizer.ll_problem or passed directly
- loss_outer: outer objective, which can be passed automatically by boml.boml_optimizer.BOMLOptimizer.ll_problem or passed directly
- param_dict: automatically passed by boml.boml_optimizer.BOMLOptimizer.ll_problem
- var_list: list of lower level variables
- inner_kargs: optional arguments, which are the same as for tf.train.Optimizer
- Returns: self
- Utility functions:
- apply_updates(): descent step, as returned by tf.train.Optimizer.apply_gradients
- initialization(): a list of operations that return the values of the state variables for this learning dynamics after the execution of the initialization operation. If an initial dynamics is set, then it is also executed.
- state(): a generator for all the state variables (optimized variables and possibly auxiliary variables) being optimized
- BOMLOuterGrad
- Aliases:
- boml.ul_problem.BOMLOuterGrad
- Methods to be overridden:
- compute_gradients(outer_objective, bml_inner_grad, meta_param=None):
- Args:
- bml_inner_grad: OptimizerDict object resulting from the inner objective optimization
- outer_objective: a loss function for the outer parameters (scalar tensor)
- meta_param: optional list of outer parameters to consider. If not provided, all variables in the hyperparameter collection in the current scope are used.
- Returns: list of meta parameters involved in the computation
- apply_gradients(inner_objective_feed_dicts=None, outer_objective_feed_dicts=None, initializer_feed_dict=None, param_dict=OrderedDict(), train_batches=None, experiments=[], global_step=None, session=None, online=False, callback=None):
- Args:
- inner_objective_feed_dicts: optional feed dictionary for the inner objective
- outer_objective_feed_dicts: optional feed dictionary for the outer objective (note that this is not used in ForwardHG since hypergradients are not variables)
- initializer_feed_dict: optional feed dictionary for the inner objective
- global_step: optional global step for the optimization process
- param_dict: dictionary of parameters passed by boml.boml_optimizer.BOMLOptimizer
- train_batches: mini batches of data, needed when the Reptile algorithm is implemented
- session: optional session (otherwise the default session is used)
- experiments: list of instances of BOMLExperiment, needed when the Reptile algorithm is implemented
- callback: callback function for the forward optimization
- Utility functions:
- outer_grads_and_vars(meta_param=None, aggregation_fn=None, gradient_clip=None): method for getting the outer gradients and outer parameters as required by the apply_gradients methods of TensorFlow optimizers
- Args:
- meta_param: optional list of outer parameters to consider. If not provided, all variables in the hyperparameter collection in the current scope are used.
- aggregation_fn: optional operation to aggregate multiple outer gradients (for the same parameter of the meta-learner), by default reduce_mean
- gradient_clip: optional operation, such as clipping, to be applied
- initialization(): returns a group of operations that initializes the variables in the computational graph
- state(): returns the current state values of the lower level variables
- BOMLOpt
- Aliases:
- boml.optimizer.BOMLOpt
- Methods to be overridden:
- minimize(loss_inner, var_list=None, global_step=None, gate_gradients=tf.train.Optimizer.GATE_OP, aggregation_method=None, colocate_gradients_with_ops=False, name=None, grad_loss=None):
- Returns: a bml_inner_grad object relative to this minimization, same as tf.train.Optimizer.minimize
- Utility functions:
- learning_rate():
- Returns: the step size of this BOMLOptimizer
- Utility Modules:
- get_default_session(): gets and returns the default tensorflow session
- BatchQueueMock(): generates batches of tasks and feeds them into the corresponding placeholders
- cross_entropy(pred, label): returns the loss function that matches the chosen method in [MetaFeat, MetaInit]
- vectorize_all(var_list, name=None): vectorizes the variables in var_list with the given name
- remove_from_collection(key, *var_list): removes the variables in var_list from the collection identified by the given graph key
- set_gpu(): sets the primary parameters of the GPU configuration
MIT License
Copyright (c) 2020 Yaohua Liu
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.