Combines the easy-to-use features of Keras-TensorFlow, PyTorch, and Caffe into one training and testing workflow.
- Well optimized, so the training process can run very fast.
- Easy for beginners to get started.
- Supported by many engineering tools.
- Easy to customize and debug.
- Ops work flexibly with NumPy.
- config.json makes it easy to change training settings without changing the code.
- python == 3.8.11
- tensorflow == 2.3.0 (Keras is included) (pip install tensorflow==2.3)
- tensorboard == 2.6.0 (pip install tensorboard)
- pydot == 1.4.2 (for saving model_info.png) (pip install pydot)
- graphviz == 0.17 (for saving model_info.png) (pip install graphviz)
Workfolder/
├── Dataset/CatDog/ - default dataset dir
| |
| ├── train/
| | ├── Cat/ - pictures of cat
| | └── Dog/ - pictures of dog
| |
| └── test/
| ├── Cat/ - pictures of cat
| └── Dog/ - pictures of dog
|
├── Saved/CatDog/2021111_111111/ - default saved dir for log and model
| |
| ├── log/ - tensorboard log dir
| └── models/ - model will be saved here
|
├── KerasTemplate/
| │
| ├── train.py - main script to start training
| ├── test.py - evaluation of trained model
| │
| ├── config.json - holds configuration for training
| ├── parse_config.py - class to handle config file
| │
| ├── Base/ - base classes and common / customized callback functions
| │ ├── base_data_loader.py
| │ ├── base_model.py
| │ ├── base_trainer.py
| │ └── callback.py
| │
| ├── DataLoader/ - anything about data loading goes here
| │ └── data_loader.py
| │
| ├── Model/ - models, losses, and metrics
| │ ├── model.py
| │ ├── metric.py
| │ └── loss.py
| │
| ├── Trainer/ - trainers
| │ └── trainer.py
| │
| └── utils/ - small utility functions
| └── util.py
{
"name": "CatDog", // training session name
"gpu":{ // GPU settings
"devices" : "0, 1", // GPU devices to use if you have multiple GPUs
"memory_limit" : true // limit GPU memory so other programs can run alongside
},
"data_loader": {
"type": "CatDogDataLoader", // which data loader to use
"args":{
"data_root_dir": "../Dataset/CatDog/train", // where the dataset is
"validation_split": 0.2, // fraction of the data held out for validation (float)
"dataset_cache_dir" : "../Dataset/CatDog/dataset_cache", // cache the dataset in binary files to speed up data loading
"cache_clean" : true, // delete the cache files left by the previous training run
"batch_size" : 32, // batch size
"shuffle": true // shuffle training data before splitting
}
},
"model": {
"type": "SimpleModel", // name of model architecture to train
"show": true, // show model information and save the model architecture to model_info.png
"args": { // model configuration
"input_h" : 224,
"input_w" : 224,
"input_c" : 3,
"num_classes" : 2
}
},
"optimizer": {
"type": "Adam", // choose optimizer
"args":{
"learning_rate" : 0.001, // set learning rate
"amsgrad": true
}
},
"loss": {
"type": "cce", // choose loss
"args":{}
},
"metrics": [
"acc" // choose metric
],
"trainer": {
"pretrained_weights_path" : null, // path to pretrained weights for transfer learning or fine-tuning
"checkpoint" : null, // checkpoint path for resuming training
"epochs": 20, // number of epochs
"saved_dir": "../Saved", // default saved dir for logs and models
"callbacks":[ // callbacks invoked during training
{
"used" : true, // whether this callback is enabled
"type" : "reduce_lr", // callback name
"args" : { // callback parameters; see the TensorFlow documentation for details
"monitor" : "loss",
"factor" : 0.1,
"patience" : 5,
"verbose" : 1,
"mode" : "auto",
"min_delta" : 1e-4
}
},
{
"used" : true,
"type" : "early_stopping",
"args" : {
"monitor" : "val_loss",
"patience" : 6,
"verbose" : 1,
"mode" : "auto",
"restore_best_weights" : true
}
},
{
"used" : true,
"type" : "model_checkpoint",
"args" : {
"monitor" : "val_loss",
"verbose" : 1,
"save_best_only" : true,
"save_weights_only" : true
}
},
{
"used" : true,
"type" : "tensorboard",
"args" : {
"log_dir": null,
"histogram_freq": 1
}
}
]
}
}
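The `//` annotations in the listing above are explanatory and are not valid JSON, so Python's `json` module would reject the text if it were copied verbatim into a file. The following is a minimal sketch of one way to read such an annotated file, assuming you want to keep the comments. The helper names `strip_line_comments` and `load_annotated_config` are hypothetical and are not part of this template; `parse_config.py` may handle the file differently.

```python
import json


def strip_line_comments(text):
    """Remove // comments that sit outside of JSON string literals."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Keep the escaped character so an escaped quote does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Jump to the end of the line; the newline itself is kept.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def load_annotated_config(text):
    """Parse JSON text that may contain // comments (hypothetical helper)."""
    return json.loads(strip_line_comments(text))


annotated = """
{
    "name": "CatDog",  // training session name
    "data_loader": {
        "type": "CatDogDataLoader",  // which data loader to use
        "args": {"data_root_dir": "../Dataset/CatDog/train", "batch_size": 32}
    }
}
"""
config = load_annotated_config(annotated)
print(config["name"], config["data_loader"]["args"]["batch_size"])
```

Tracking whether the scanner is inside a string keeps values that contain `//`, such as URLs, intact.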
To train, run python train.py -c /path/to/config/config.json
To test, run python test.py -w /path/to/weights/weights.h5
When training finishes, run tensorboard --logdir /path/to/Saved/CatDog/log, then open http://localhost:6006/ in your browser to view the logs.
Customization in two steps
0. If you run into trouble, check the provided examples first
1. Implement your change in code
2. Modify config.json
- BaseDataLoader handles the tf.data.Dataset objects, batch size, cache, and prefetch.
- Inherit BaseDataLoader and handle shuffling and the validation split yourself.
- Modify type and args in the data_loader part of config.json.
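As a rough illustration of the "handle shuffling and validation split yourself" step, here is a minimal pure-Python sketch. The function `split_train_val` and its `seed` parameter are hypothetical and are not taken from this template's `data_loader.py`. It only mirrors the `validation_split` and `shuffle` options from config.json, shuffling before the split as the config comment describes. In a real subclass the two resulting lists would presumably be turned into `tf.data.Dataset` objects.

```python
import random


def split_train_val(items, validation_split=0.2, shuffle=True, seed=None):
    """Optionally shuffle, then split items into (train, val) lists."""
    if not 0.0 <= validation_split < 1.0:
        raise ValueError("validation_split must be in [0, 1)")
    items = list(items)  # copy so the caller's list is left untouched
    if shuffle:
        # A seeded Random instance keeps the split reproducible.
        random.Random(seed).shuffle(items)
    n_val = int(len(items) * validation_split)
    # The first n_val items become the validation set, the rest the training set.
    return items[n_val:], items[:n_val]


files = [f"img_{i:03d}.jpg" for i in range(10)]
train_files, val_files = split_train_val(files, validation_split=0.2, seed=0)
print(len(train_files), len(val_files))
```

Shuffling before the split avoids a validation set made of only the last class on disk, which matters when files are listed folder by folder (Cat/ then Dog/).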
- BaseModel handles build_model, which compiles the model with the loss, metrics, and optimizer.
- Inherit BaseModel, which is a subclass of tf.keras.models.Model, and implement the model architecture yourself.
- Modify type and args in the model part of config.json.
- Implement the loss function yourself with inputs (y_true, y_pred), or use one of the loss functions Keras provides.
- Modify type and args in the loss part of config.json.
- Implement the metric function yourself with inputs (y_true, y_pred), or use one of the metric functions Keras provides.
- Add the function name to the metrics list in config.json.
- BaseTrainer handles the training process and callbacks, so you can run your own logic during training.
- Inherit BaseTrainer and implement the logic you want during the training process. You can also implement the training process from scratch without using the fit() method Keras provides.
- Modify type and args in the trainer part of config.json.
- Implement your callback by inheriting tf.keras.callbacks.Callback, and expose it through a function whose name you will reference in config.json, as the provided callbacks do.
- Add your callback to the callbacks list in config.json as follows, then set type and args for your callback entry.
"callbacks":[
{
"used" : true,
"type" : "reduce_lr",
"args" : {
"monitor" : "loss",
"factor" : 0.1,
"patience" : 5,
"verbose" : 1,
"mode" : "auto",
"min_delta" : 1e-4
}
},
------------------add your callback as follows-------------------
{
"used" : true,
"type" : "your_callback_function_name",
"args" : {}
}
-----------------------------------------------------------------
]
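To make the role of the "used", "type", and "args" fields in the list above more concrete, here is a minimal sketch of how such a list could be turned into callback instances. This is an assumption about the mechanism, not the template's actual code. `CALLBACK_FACTORIES` and `build_callbacks` are hypothetical names, and the factory functions return plain tuples as stand-ins so the example runs without TensorFlow.

```python
def reduce_lr(**args):
    # Stand-in: a real factory would return tf.keras.callbacks.ReduceLROnPlateau(**args).
    return ("reduce_lr", args)


def your_callback_function_name(**args):
    # Stand-in: a real factory would return your tf.keras.callbacks.Callback subclass.
    return ("your_callback_function_name", args)


# Hypothetical registry mapping the "type" string in config.json to a factory.
CALLBACK_FACTORIES = {
    "reduce_lr": reduce_lr,
    "your_callback_function_name": your_callback_function_name,
}


def build_callbacks(callback_configs, factories=CALLBACK_FACTORIES):
    """Instantiate every entry whose "used" flag is true, in config order."""
    callbacks = []
    for entry in callback_configs:
        if not entry.get("used", False):
            continue  # disabled entries are skipped before any lookup
        factory = factories[entry["type"]]
        callbacks.append(factory(**entry.get("args", {})))
    return callbacks


config_callbacks = [
    {"used": True, "type": "reduce_lr", "args": {"monitor": "loss", "patience": 5}},
    {"used": False, "type": "early_stopping", "args": {}},
    {"used": True, "type": "your_callback_function_name", "args": {}},
]
built = build_callbacks(config_callbacks)
print([name for name, _ in built])
```

With a lookup of this kind, switching a callback off is a one-word change in config.json ("used": false) rather than a code edit.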
In theory you can modify anything you want, but knowing how the template is designed may help:
- Parameters: all parameters are passed around in code through the variable config, which is dictionary-like.
- Every submodule in config.json is a dict-like entry with type and args attributes. The former is the module name, and the latter holds the parameters passed to the module in code when the config.init_obj() method initializes the module instance.
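The type/args convention described above can be sketched as follows. This is a guess at the general pattern, not the actual implementation in `parse_config.py`. The `Config` class, the `init_obj` signature, and the stand-in `SimpleModel` are all assumptions made for illustration.

```python
import types


class Config(dict):
    """Dictionary-like config with a hypothetical init_obj helper."""

    def init_obj(self, name, module, *args, **kwargs):
        """Find self[name]["type"] in module and call it with self[name]["args"]."""
        entry = self[name]
        factory = getattr(module, entry["type"])
        params = dict(entry.get("args", {}))
        params.update(kwargs)  # explicit keyword arguments override config values
        return factory(*args, **params)


class SimpleModel:
    """Stand-in for a model class; the real one would inherit BaseModel."""

    def __init__(self, input_h, input_w, input_c, num_classes):
        self.input_shape = (input_h, input_w, input_c)
        self.num_classes = num_classes


# Stand-in for the Model/model.py module that would hold the model classes.
models = types.SimpleNamespace(SimpleModel=SimpleModel)

config = Config({
    "model": {
        "type": "SimpleModel",
        "args": {"input_h": 224, "input_w": 224, "input_c": 3, "num_classes": 2},
    }
})
model = config.init_obj("model", models)
print(model.input_shape, model.num_classes)
```

Under this pattern, adding a new model, loss, or data loader only requires defining it in the relevant module and naming it in the type field; the code that calls init_obj stays the same.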
- Chinese documentation
- Customized Optimizers
- Support more tensorboard functions