Training models
This repository lets you take one or more existing CSV files and train a model to forecast on them. Some basic tutorials appear below, with more extensive documentation underneath.
We take inspiration from AllenNLP and use JSON configuration files to track experiments. To get started, define a configuration file in the following format:
{
  "model_name": string,
  "model_type": string,
  "model_params": {
    ""
  },
  "dataset_params": {
    "type": string,
    "training_path": list,
    "validation_path": list,
    "test_path": list,
    "batch_size": integer,
    "forecast_history": integer,
    "forecast_length": integer
  },
  "training_params": {
    "criterion": string,
    "optimizer": string,
    "optim_params": {
      "lr": number,
      "momentum": number
    },
    "epochs": integer
  },
  "GCS": {
    "run_save_path": string,
    "project_id": string,
    "credential_path": string
  },
  "wandb": {
    "name": string,
    "tags": List[string]
  }
}
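As a rough illustration, a configuration in this shape can be built as a Python dictionary and written to disk with the standard json module. Every value below (the model name, file paths, and hyperparameters) is a hypothetical placeholder rather than a name the repository is known to accept. The REQUIRED_KEYS check is likewise an assumption made for this sketch, and it treats the GCS and wandb sections as optional.

```python
import json
import os
import tempfile

# Hypothetical values throughout; substitute names your installation supports.
config = {
    "model_name": "MyModel",
    "model_type": "PyTorch",
    "model_params": {"hidden_layer_dim": 32, "encoder_layers": 2},
    "dataset_params": {
        "type": "default",
        "training_path": ["train.csv"],
        "validation_path": ["valid.csv"],
        "test_path": ["test.csv"],
        "batch_size": 16,
        "forecast_history": 10,
        "forecast_length": 1,
    },
    "training_params": {
        "criterion": "MSE",
        "optimizer": "SGD",
        "optim_params": {"lr": 0.01, "momentum": 0.9},
        "epochs": 5,
    },
}

# Top-level sections from the template; treating these five as required is an assumption.
REQUIRED_KEYS = [
    "model_name",
    "model_type",
    "model_params",
    "dataset_params",
    "training_params",
]


def missing_keys(cfg):
    """Return the required top-level keys that are absent from cfg."""
    return [key for key in REQUIRED_KEYS if key not in cfg]


# Write the config to a temporary file and read it back to confirm it is valid JSON.
config_path = os.path.join(tempfile.mkdtemp(), "config.json")
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
with open(config_path) as f:
    loaded = json.load(f)

print(missing_keys(loaded))
```

Checking for missing sections before launching a run is a cheap way to catch a malformed file early.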
Model parameters should include every argument needed to initialize your model. For instance, if your model's init function is def __init__(self, hidden_layer_dim, encoder_layers),
then your model_params would be:
"model_params": {
  "hidden_layer_dim": integer,
  "encoder_layers": integer
}
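One way to picture this mapping, assuming the entries of model_params are forwarded to the constructor as keyword arguments (the repository's actual loading code is not shown here), is the sketch below. MyModel is a hypothetical stand-in written in plain Python so the example runs without extra dependencies.

```python
class MyModel:
    """Hypothetical stand-in for a model class with the __init__ signature above."""

    def __init__(self, hidden_layer_dim, encoder_layers):
        self.hidden_layer_dim = hidden_layer_dim
        self.encoder_layers = encoder_layers


# Hypothetical values; the keys mirror the __init__ argument names.
model_params = {"hidden_layer_dim": 32, "encoder_layers": 2}

# Each key must match an __init__ argument name exactly;
# a misspelled or extra key raises TypeError.
model = MyModel(**model_params)
print(model.hidden_layer_dim, model.encoder_layers)
```

Under this assumption, a key that does not match an argument name fails at construction time, so typos in model_params surface immediately.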
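Dataset parameters include things like batch size, file paths, and anything else the data loader needs. If all your data is in the same CSV file, dataset_params also lets you define train_end and valid_start parameters to mark where each split falls. We also support automated methods to interpolate data. The example below is written as a Python dictionary, so values such as wandb_config["batch_size"] and False are Python expressions rather than JSON literals.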
"dataset_params": {
  "class": "default",
  "training_path": file_path,
  "validation_path": file_path,
  "test_path": file_path,
  "batch_size": wandb_config["batch_size"],
  "forecast_history": wandb_config["forecast_history"],
  "forecast_length": wandb_config["out_seq_length"],
  "train_end": int(train_number),
  "valid_start": int(train_number + 1),
  "valid_end": int(validation_number),
  "target_col": ["milk_price"],
  "relevant_cols": ["milk_price", "gallons_produced"],
  "scaler": "StandardScaler",
  "interpolate": False
}
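To make the role of train_end, valid_start, and valid_end concrete, here is a minimal sketch of carving one table into training and validation rows. It assumes the three values are zero-based row offsets with exclusive end points. That is a guess about the semantics, and the repository's data loader may treat the boundaries differently. The in-memory CSV and the split_rows helper are hypothetical.

```python
import csv
import io

# Hypothetical single-file dataset using the column names from the example above.
raw = io.StringIO(
    "milk_price,gallons_produced\n"
    + "\n".join(f"{2.0 + i * 0.1:.1f},{100 + i}" for i in range(10))
)
rows = list(csv.DictReader(raw))


def split_rows(rows, train_end, valid_start, valid_end):
    """Slice one table into train and validation rows by offset (end-exclusive, an assumption)."""
    train = rows[:train_end]
    valid = rows[valid_start:valid_end]
    return train, valid


train, valid = split_rows(rows, train_end=6, valid_start=7, valid_end=9)
print(len(train), len(valid))
```

Note that under this end-exclusive reading, setting valid_start to train_end + 1 leaves the row at offset train_end in neither split. Check how your loader handles the boundary before relying on exact row counts.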
Forward Params
In nearly all cases this will simply be an empty object:
"forward_params":{
}
Inference Parameters
Optional parameters
- early_stopping
"early_stopping":{
"patience": 2
}
- sweep (either true or false). This must be set to true if you plan to use a Wandb sweep.
- wandb (either true or false). Set to true only if you are using Wandb without a sweep; set to false if you plan to use a sweep.
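For readers unfamiliar with the patience setting under early_stopping, the sketch below shows the usual patience-based rule: training halts once the validation loss has failed to improve for patience consecutive epochs. This is a generic illustration, not the repository's own implementation. The EarlyStopper class and the loss values are hypothetical.

```python
class EarlyStopper:
    """Generic patience-based early stopping; not the repository's own class."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, validation_loss):
        """Record one epoch's validation loss and report whether to halt training."""
        if validation_loss < self.best_loss:
            # New best loss: remember it and reset the counter.
            self.best_loss = validation_loss
            self.bad_epochs = 0
        else:
            # No improvement this epoch.
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopper(patience=2)
losses = [0.9, 0.7, 0.8, 0.75, 0.6]  # hypothetical validation loss per epoch
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)
```

With a patience of 2, the loop stops after two epochs in a row without a new best loss, so the final value in the list is never reached. A larger patience tolerates longer plateaus at the cost of extra training time.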