How to switch batch size during training? #153
@cclvr Hi, thanks for the issue. You can do it by setting a callback:

```py
import d3rlpy

cql = d3rlpy.algos.CQL(batch_size=256)

def callback(algo, epoch, total_step):
    if total_step > 10000:
        algo.set_params(batch_size=1024)

cql.fit(..., callback=callback)
```
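For readers unfamiliar with this pattern, the idea can be sketched in plain Python. The `Algo` class, `set_params`, and `fit` below are simplified hypothetical stand-ins, not the real d3rlpy API:

```python
# Minimal sketch of the callback pattern (hypothetical stand-in class,
# not the real d3rlpy implementation).
class Algo:
    def __init__(self, batch_size):
        self.batch_size = batch_size

    def set_params(self, **params):
        # update hyperparameters in place
        for name, value in params.items():
            setattr(self, name, value)

    def fit(self, n_steps, callback=None):
        for total_step in range(1, n_steps + 1):
            # ... one gradient step would happen here ...
            if callback is not None:
                callback(self, epoch=0, total_step=total_step)

algo = Algo(batch_size=256)

def callback(algo, epoch, total_step):
    # grow the batch size after 10 steps (10000 in the real example)
    if total_step > 10:
        algo.set_params(batch_size=1024)

algo.fit(n_steps=20, callback=callback)
print(algo.batch_size)  # prints 1024: the parameter itself does change
```

As the rest of the thread shows, changing the attribute is only half the story: the component that actually samples batches must also pick up the new value.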
@takuseno Thanks for your kind and patient reply. I have tried this approach, but it doesn't work. Since I downloaded this codebase about 4 months ago, I'm wondering whether the feature existed at that time. Thanks again for your patience.
How did you confirm it didn't work? You could print the value inside the callback to check.
@takuseno, I printed batch_size during the callback and it does change. But when I print the shape of batch.observations in functions such as compute_critic_loss() in cql_impl.py, it doesn't change. The time cost of one epoch is also the same for different batch_size values, which again suggests that the batch size is not being applied.
Ah, yes. You're right. Currently, the batch sampling is done at d3rlpy/d3rlpy/iterators/base.py line 46 (commit 8eb11db). If you don't mind, you can hack around there. However, there is no way to change the mini-batch size without the hack for now. Sorry for the inconvenience.
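The reason the printed shape never changes can be illustrated in a few lines: the iterator reads `batch_size` once when it is constructed, so later `set_params` calls on the algorithm never reach it. The classes below are simplified hypothetical stand-ins, not the actual d3rlpy internals:

```python
import numpy as np

class BatchIterator:
    """Simplified stand-in: batch_size is captured once, at creation."""
    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size  # fixed here, never re-read from the algo

    def sample(self):
        idx = np.random.randint(0, len(self.data), size=self.batch_size)
        return self.data[idx]

class Algo:
    def __init__(self, batch_size):
        self.batch_size = batch_size

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

data = np.zeros((1000, 4))
algo = Algo(batch_size=256)
it = BatchIterator(data, batch_size=algo.batch_size)

algo.set_params(batch_size=1024)   # changes the algorithm attribute...
print(it.sample().shape[0])        # ...but the iterator still yields 256

# the "hack": mutate the iterator's own attribute directly
it.batch_size = algo.batch_size
print(it.sample().shape[0])        # now 1024
```

This is why the suggested workaround is to patch the iterator itself rather than the algorithm object.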
OK thanks, it is good enough right now.
Hi, try this instead.
Let me know what happens.
Hi @jamartinh, thanks for your clue. I have tried your method but it still doesn't work.
@cclvr When you set
OK thanks! |
@takuseno, firstly thanks a lot for your clear and complete code base for offline RL. Recently I have been trying to build new algorithms on top of this code base, and I want to switch the batch size during the training process, but I don't know how to do that with the smallest possible changes. Could you give some clue? Looking forward to your reply.