Right now the quick start guide suggests that I don't need to resize my dataset images, because OneTrainer handles this when resolution bucketing is enabled. However, when I select multiple training resolutions, I noticed that with batch size 1 all samples are used, but with batch size 2 the number of steps is less than half, so some images are no longer used.
What is not clear is what actually happens to each image. Take this example:
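One hypothesis that would explain the step count, sketched below: if every batch must come from a single resolution bucket and incomplete batches are dropped, then any bucket with an odd number of images loses one image at batch size 2. This is only an assumption about the behaviour, not something confirmed from the OneTrainer code; the function names and the dataset are hypothetical.

```python
from collections import Counter

def steps_per_epoch(bucket_of_each_image, batch_size):
    """Steps per epoch ASSUMING batches never mix buckets and
    incomplete batches are dropped (unverified assumption)."""
    counts = Counter(bucket_of_each_image)
    return sum(n // batch_size for n in counts.values())

def unused_images(bucket_of_each_image, batch_size):
    """Images left over in each bucket under the same assumption."""
    counts = Counter(bucket_of_each_image)
    return sum(n % batch_size for n in counts.values())

# Hypothetical dataset: 7 images spread over 4 buckets.
buckets = ["512x640"] * 3 + ["640x512"] + ["768x768"] * 2 + ["512x512"]

print(steps_per_epoch(buckets, 1))  # 7 steps, every image used
print(steps_per_epoch(buckets, 2))  # 2 steps, fewer than 7 / 2
print(unused_images(buckets, 2))    # 3 images never seen
```

With many buckets and few images per bucket, the loss grows quickly, which would match what I observe.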
I set training resolutions to 512, 640, 768, 960.
I have a 639x641 image: is it always cropped to 512x640, or sometimes to 512x512?
I have a 256x320 image: is it upscaled to 512x640, or can it sometimes end up at 768x960?
I also noticed that the preview is static even with crop jitter enabled. If I have a 1024x512 image, do I get crops such as image[0:960, 0:512] and image[64:1024, 0:512], or are the crops always centered? Will it sometimes be cropped to a resolution other than 960x512?
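To make the crop question concrete, here is a minimal sketch of the two behaviours I am asking about for a 1024x512 image cropped to 960x512. Both functions are hypothetical illustrations, not OneTrainer's actual implementation; boxes are (left, top, right, bottom) in pixels.

```python
import random

def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Crop window that is always centered (what the preview seems to show)."""
    x0 = (src_w - dst_w) // 2
    y0 = (src_h - dst_h) // 2
    return (x0, y0, x0 + dst_w, y0 + dst_h)

def jitter_crop_box(src_w, src_h, dst_w, dst_h, rng):
    """Crop window at a random offset, one possible meaning of crop jitter.
    Requires the source to be at least as large as the target."""
    x0 = rng.randint(0, src_w - dst_w)  # inclusive bounds
    y0 = rng.randint(0, src_h - dst_h)
    return (x0, y0, x0 + dst_w, y0 + dst_h)

print(center_crop_box(1024, 512, 960, 512))  # (32, 0, 992, 512)

rng = random.Random(0)
for _ in range(3):
    # Left edge can fall anywhere from 0 to 64, top edge is always 0.
    print(jitter_crop_box(1024, 512, 960, 512, rng))
```

If training uses something like the second function while the preview uses the first, the preview does not reflect what the model sees.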
What would you like to see as a solution?
I have 5 proposals to improve both clarity and training:
1. "Use all images" option: when batch size > 1, always try to fill batch_size images for every resolution, even if that means using crops that cover less of the original images.
2. Correctly show the effect of crop jitter in the preview (assuming that right now it only shows a centered square crop and not what is actually used).
3. "Vary scaling" option: if possible, also use samples downscaled to lower resolutions, not only the maximum one.
4. When using samples below a set resolution (even if upscaled), optionally add a set tag (for example "low resolution, low quality") to the prompt, and likewise for samples above a certain resolution (for example "high resolution").
5. Allow setting both horizontal and vertical resolution, so that I can enter something like "384, 512x512, 768" and get "384x384, 384x768, 512x512, 768x768, 768x384" as the set of allowed resolutions.
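A minimal sketch of how the resolution string from the last proposal could be expanded, assuming bare numbers combine with each other in every width/height pairing while explicit WxH entries are taken as-is. The function name `expand_resolutions` and the parsing rules are my own illustration, not an existing OneTrainer feature.

```python
from itertools import product

def expand_resolutions(spec):
    """Expand e.g. "384, 512x512, 768" into a sorted list of (width, height).

    Assumed rules (hypothetical): bare numbers are combined pairwise in
    both orders; "WxH" tokens are added unchanged.
    """
    singles, pairs = [], []
    for token in spec.split(","):
        token = token.strip().lower()
        if not token:
            continue
        if "x" in token:
            w, h = token.split("x")
            pairs.append((int(w), int(h)))
        else:
            singles.append(int(token))
    allowed = set(product(singles, repeat=2)) | set(pairs)
    return sorted(allowed)

print(expand_resolutions("384, 512x512, 768"))
# [(384, 384), (384, 768), (512, 512), (768, 384), (768, 768)]
```

Note that 512 is not combined with 384 or 768 here, because it was only given as an explicit 512x512 entry.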
Have you considered alternatives? List them here.
Right now I could probably keep multiple copies of each image with different resolutions, aspect ratios, and crops, but it would take a lot of copies to truly cover every possible crop of each image.
The main reason I ask is that I noticed that training with multiple resolutions slightly improves quality, but on some datasets it seems to overfit to a subset of the images at each available resolution, so I'd like each image to be used at multiple scales to prevent this.
What I can add is that if you set the number of repeats to more than 1 in your concept settings and enable crop jitter, you actually seem to get multiple versions of the same picture(s), judging by the number of items cached. This also makes sense given that each epoch is a full run over all training data. Hence my guess is that for each repeat a crop-jitter "instance" of each image is created, and each of these instances gets used during one epoch.