
How to continue training without replacing the previous training? #19

Closed
molo32 opened this issue Mar 30, 2021 · 5 comments


@molo32

molo32 commented Mar 30, 2021

How can I continue training without replacing the previous training?
I have face images of one person split into part 1 and part 2.

First I train on the part 1 images.

When training finishes and I run a driver, the faces from the part 1 images appear in the output.

Then I want to add the images from part 2, so I load the .pth checkpoint and continue training from there with the part 2 images.

The problem is that when I run a driver afterwards, only the part 2 images are chosen; it is as if part 2 had overwritten part 1.

What I expected was more variety of expressions, with expressions from both part 1 and part 2 being chosen.

How can I avoid overwriting images or expressions from a previous training session?

@shrubb
Owner

shrubb commented Mar 30, 2021

Sorry, I didn't get it at all. What do you mean by "overwriting images"? And why do you want to train (or fine-tune?) twice? If you have two datasets, just fine-tune on each of them independently, starting from a single meta-learned checkpoint.

@molo32
Author

molo32 commented Mar 30, 2021

I want to train on two sets of the same person separately, because if I load the whole dataset I get a CUDA out-of-memory error. To avoid that error I split the dataset into A and B, and first train on dataset A, then on dataset B.

The datasets are the cropped images generated by preprocess data.py.

The images are expressions or the face of one person.

By "making a driver" I mean taking a driving video and a checkpoint and producing an output video with driver.py.

By "selecting images" I mean which images from the dataset are chosen to make the output video with driver.py.

By "overwriting images" I mean overwriting expressions or faces.

For example, suppose dataset A has different illumination from dataset B. If I fine-tune a meta-learned model on dataset A with python3 train.py, then repeat with dataset B, then when I make a driver only images with the illumination of B appear, so B has overwritten A.

I don't want to train from scratch; I want to fine-tune.
I want to fine-tune on the two datasets A and B independently. I select latent-pose-release.pth to train with the first dataset A, set DATASET_ROOT to the path of set A, and run python3 train.py.
At the end of training it gives me a checkpoint.pth; if I run a driver with that checkpoint, the expressions of dataset A appear. Then I load that checkpoint.pth to continue training from there on dataset B. Training finishes and
gives me another checkpointB.pth, but when I run a driver with checkpointB, only images from the last dataset B are selected and none from dataset A. That's what I mean by overwriting.

@shrubb
Owner

shrubb commented Mar 30, 2021

because if I load the whole dataset, cuda out of memory error.

This means that you're doing something wrong: GPU memory doesn't depend on the dataset size. Just use smaller batches. For example, with a batch size of 1 you can fine-tune on as many images as you want.
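A minimal sketch (plain Python, no real training loop or GPU) of the point being made: only one batch of images is resident per training step, so a batch size of 1 caps per-step memory regardless of how many cropped images the dataset contains. The `batches` helper and the item counts are illustrative, not part of the repo's code.

```python
# Sketch: per-step memory scales with batch size, not dataset size,
# because each training step only loads one batch at a time.
def batches(dataset, batch_size):
    """Yield successive slices of `dataset` with at most `batch_size` items."""
    for i in range(0, len(dataset), batch_size):
        yield dataset[i:i + batch_size]

dataset = list(range(1000))           # stand-in for 1000 cropped face images
per_step = [len(b) for b in batches(dataset, 1)]
print(max(per_step))                  # peak items held in any step → 1
```

With batch size 1, the peak is 1 item per step even for a dataset of 1000 images, which is why splitting the dataset into A and B is unnecessary to avoid out-of-memory errors.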

As I understand it, you're trying to fine-tune a meta-learned checkpoint on dataset A, then take that fine-tuned model and fine-tune it further on dataset B. Well, we never tried that. I don't know whether it will even work; that's a research question. You'll probably need to modify the code for it, and it's entirely at your own risk. I'm afraid I can't help here.

@molo32
Author

molo32 commented Mar 31, 2021

OK, I understand. Another thing: can I make the checkpoint smaller? The output is always about 1 GB in size.

@shrubb
Owner

shrubb commented Mar 31, 2021

Yes, it's not hard (just don't include the discriminator, embedder, optimizer state, etc. in the checkpoint), but for that you'll have to modify the code yourself.
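The idea can be sketched as follows. This is a hypothetical checkpoint layout using plain dicts and `pickle` as a stand-in for `torch.save`; the actual key names in this repo's checkpoints may differ, so treat it as an illustration of dropping training-only state, not as the repo's real format.

```python
import io
import pickle

# Hypothetical checkpoint layout (key names are assumptions):
checkpoint = {
    "generator":     {"weight": [0.0] * 1000},  # needed at inference time
    "discriminator": {"weight": [0.0] * 1000},  # training-only
    "embedder":      {"weight": [0.0] * 1000},  # training-only
    "optimizer":     {"state":  [0.0] * 2000},  # training-only
}

KEEP = {"generator"}  # keep only what inference needs
slim = {k: v for k, v in checkpoint.items() if k in KEEP}

def pickled_size(obj):
    """Serialized size in bytes (stand-in for the saved file's size)."""
    buf = io.BytesIO()
    pickle.dump(buf_obj := obj, buf)
    return buf.getbuffer().nbytes

print(pickled_size(slim) < pickled_size(checkpoint))  # slim file is smaller
```

The same filtering applied before saving the real checkpoint would shrink the file by roughly the fraction of parameters that belong to the training-only modules.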

@shrubb shrubb closed this as completed Apr 16, 2021