
Some questions about model testing #45

Open
Philharmy-Wang opened this issue May 21, 2024 · 2 comments
@Philharmy-Wang

Dear Author,

Firstly, I would like to express my profound gratitude for your contributions and research in this field! I am currently working on multi-modal remote sensing for forest fire detection. Your recent introduction of the img2img method has inspired new directions in my research.

In remote sensing applications, acquiring registered multi-modal data (visible and thermal imaging) is exceedingly difficult. I was very intrigued by your project's handling of scenarios like Day to Night and Clear to Rainy transitions. I am interested in converting visible light images into registered thermal imaging to enrich my dataset with synthetic data.

To this end, I have trained a model locally using the training code you provided, utilizing some multi-modal forest fire data I have, which is already registered. I formatted the RGB and thermal images according to the structure used in your dataset.

[Screenshot: Clip_2024-05-21_15-30-14]

The specific training command was as follows:

accelerate launch src/train_pix2pix_turbo.py \
    --pretrained_model_name_or_path="stabilityai/sd-turbo" \
    --output_dir="output/pix2pix_turbo/fs_rgb_ir_03" \
    --dataset_folder="data/fs_rgb_ir" \
    --resolution=512 \
    --train_batch_size=1 \
    --enable_xformers_memory_efficient_attention --viz_freq 25 \
    --track_val_fid \
    --learning_rate=8e-5 \
    --num_training_epochs=150 \
    --max_train_steps=200000 \
    --lr_scheduler="cosine_with_restarts" \
    --lr_warmup_steps=1000 \
    --lr_num_cycles=1 \
    --report_to "wandb" --tracker_project_name "pix2pix_turbo_fs_rgb_ir"

The model performed exceptionally well during validation, where the generated thermal images were almost identical to the target images.

[Screenshot: Clip_2024-05-21_15-30-34]

Step 10000:
[Screenshot: Clip_2024-05-10_10-33-23]

Step 20000:
[Screenshot: Clip_2024-05-10_10-35-23]

Step 30026:
[Screenshot: Clip_2024-05-10_10-36-59]

Step 40001:
[Screenshot: Clip_2024-05-10_10-43-51]

However, when I applied the model to test on other forest fire datasets, the results were disappointing and vastly different from what was expected.

[Screenshot: Clip_2024-05-15_11-15-40]
[Screenshot: Clip_2024-05-15_11-30-14]

The specific testing command was:

python src/inference_paired.py --model_path "output/pix2pix_turbo/fs_rgb_ir_03/checkpoints/model_66001.pkl" \
    --input_image "VOCdevkit-rsy-all/images/train/1_1.jpg" \
    --prompt "RGB to IR" \
    --output_dir "outputs/fs-rgb_ir-512"

Given the above, I have several questions:

  1. Are there any oversights in my training setup? Due to hardware limitations, I had to set the train_batch_size to 1, which might have compromised effective learning.
  2. I used your project's src/inference_paired.py for inference with all parameters set to default. Do you think the distortion in inference could be related to the script's configuration?
  3. The model performed well during training and validation, but the inference on a new dataset was poor. Does this suggest that the model might be overfitting?
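One way to test the overfitting hypothesis in question 3 is to compute the average L2 reconstruction error on the validation pairs and on the new test pairs and compare them. A minimal sketch of the metric (image loading omitted; arrays are assumed to be HxWx3 and scaled to [0, 1]):

```python
import numpy as np

def l2_error(pred: np.ndarray, target: np.ndarray) -> float:
    """Root-mean-square error between two images, assumed scaled to [0, 1]."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

def mean_l2_error(pairs) -> float:
    """Average L2 error over an iterable of (generated, target) array pairs."""
    return float(np.mean([l2_error(p, t) for p, t in pairs]))
```

A large gap between the validation and test numbers would point to a domain shift or overfitting rather than a problem in the inference script.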

I look forward to your reply and thank you for taking the time to assist with these questions!

@GaParmar (Owner) commented May 26, 2024

Hi,

Thank you for your interest in this project!
A couple of things I can observe from your results:

  • What is the image pre-processing used during inference and training time?
  • What is your L2 reconstruction error for the validation set vs test set?
  • It is indeed strange that the model performs well on the validation set but poorly on the test set. It could be a sign of overfitting. Is your test set similar to the validation set?

-Gaurav

@Philharmy-Wang (Author)

> Hi,
>
> Thank you for your interest in this project! A couple of things I can observe from your results:
>
>   • What is the image pre-processing used during inference and training time?
>   • What is your L2 reconstruction error for the validation set vs test set?
>   • It is indeed strange that the model performs well on the validation set but poorly on the test set. It could be a sign of overfitting. Is your test set similar to the validation set?
>
> -Gaurav

Dear Gaurav,

Thank you very much for your prompt reply and suggestions! Here are the specific answers to the questions you raised:

  1. During inference and training, I adopted the same image preprocessing methods used in your project for training the Fill50k dataset on pix2pix-turbo. These preprocessing steps are implemented by default in the src/train_pix2pix_turbo.py script.

  2. Regarding the L2 reconstruction error, it was 0.147 for the training set (fs_rgb_ir) and 0.202 for the validation set.

  3. Concerning dataset similarity, my test set is indeed quite different from the validation set. Although both show forest fires from a drone's perspective, the test set includes synthesized images as well as our own drone-captured images, which differ significantly from the training and validation sets in flight altitude and overall image style.
    [Screenshot]
    [Screenshot]
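For point 1 above, a mismatch between training-time and inference-time normalization is worth ruling out explicitly. Diffusion-based image-to-image pipelines typically map pixels to [-1, 1] (the ToTensor + Normalize(mean=0.5, std=0.5) convention); the sketch below illustrates that convention in plain NumPy, but whether the repo uses exactly this should be verified against src/train_pix2pix_turbo.py and src/inference_paired.py.

```python
import numpy as np

def to_model_input(img_uint8: np.ndarray) -> np.ndarray:
    """Map an HxWx3 uint8 image to a 3xHxW float32 array in [-1, 1].

    Mirrors the common ToTensor + Normalize(0.5, 0.5) convention;
    this is an assumption, not a copy of the repo's code.
    """
    x = img_uint8.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - 0.5) / 0.5                       # shift to [-1, 1]
    return np.transpose(x, (2, 0, 1))         # HWC -> CHW

def from_model_output(x: np.ndarray) -> np.ndarray:
    """Inverse mapping: 3xHxW array in [-1, 1] back to HxWx3 uint8."""
    img = (np.transpose(x, (1, 2, 0)) * 0.5 + 0.5) * 255.0
    return np.clip(img, 0, 255).astype(np.uint8)
```

If training and inference disagree on any of these steps (scaling range, channel order, resize size), outputs can look badly distorted even for a well-trained model.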

Here are the statistical details of the images in my training and test datasets:

Training set RGB images (train_A):
  Mean: tensor([0.5097, 0.5172, 0.5023])
  Std:  tensor([0.1511, 0.1540, 0.1636])

Training set IR images (train_B):
  Mean: tensor([0.5555, 0.0691, 0.3500])
  Std:  tensor([0.1448, 0.1327, 0.2469])

Test set RGB images (test_A):
  Mean: tensor([0.5082, 0.5153, 0.5008])
  Std:  tensor([0.1526, 0.1554, 0.1663])

Test set IR images (test_B):
  Mean: tensor([0.5495, 0.0680, 0.3495])
  Std:  tensor([0.1478, 0.1346, 0.2481])

Sample image from test set (test_img):
  Mean: tensor([0.4136, 0.4193, 0.4006])
  Std:  tensor([0.2235, 0.2367, 0.2664])
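Per-channel statistics like the ones above can be computed with a short helper (a sketch; images are assumed to be already loaded as HxWx3 float arrays in [0, 1], e.g. via Pillow):

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std over an iterable of HxWx3 arrays in [0, 1]."""
    pixels = np.concatenate([img.reshape(-1, 3) for img in images], axis=0)
    return pixels.mean(axis=0), pixels.std(axis=0)
```

Comparing these numbers between splits is only a first-order check for domain shift: the RGB means above match closely across train and test, so any gap would have to come from content and style differences rather than raw intensity.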

I hope this information is helpful in resolving the issues. I look forward to your further guidance and advice!
