Running Colab code does not give output. #4

Closed
samjoy opened this issue Dec 13, 2020 · 40 comments

Comments

@samjoy

samjoy commented Dec 13, 2020

When I run the Colab demo code, it does not produce correct predictions. No bounding boxes or labels appear in the image titled 'Predictions'.

@benihime91
Owner

benihime91 commented Dec 13, 2020

@samjoy What was your IoU score? If it is low, you need to set the threshold lower as well.
To increase the IoU score you probably need to train for more epochs. For good results the IoU score should be above 0.4.
Try a different optimizer/scheduler, disable early stopping, and tune the hyperparameters for better results.
The demo was not created for best results; it is just a notebook that shows how to use this repo.
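
For readers unfamiliar with the two quantities mentioned above, here is a minimal sketch of how IoU between two boxes is computed and how a score cutoff filters detections. It assumes that "threshold" refers to the detection score cutoff used when drawing predictions; the function names and the detection dictionary layout are hypothetical and not taken from this repo.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def keep_confident(detections, score_threshold):
    """Keep only detections whose score reaches the cutoff (hypothetical layout)."""
    return [d for d in detections if d["score"] >= score_threshold]


detections = [
    {"box": (0, 0, 2, 2), "score": 0.35},
    {"box": (1, 1, 3, 3), "score": 0.10},
]
overlap = box_iou(detections[0]["box"], detections[1]["box"])
visible_strict = keep_confident(detections, 0.5)   # nothing passes a high cutoff
visible_loose = keep_confident(detections, 0.3)    # a lower cutoff lets one box through
```

A weakly trained model tends to produce low scores, so a high cutoff can hide every box even when some predictions are roughly right.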

@samjoy
Author

samjoy commented Dec 13, 2020

OK, I am running your code once more. The dataset you used in the demo is in Pascal VOC XML format, right?

@samjoy
Author

samjoy commented Dec 13, 2020

Ok my AP is 0 when running the demo. Any idea why?

@benihime91
Owner

Can you give me a link to the Colab notebook?

@samjoy
Author

samjoy commented Dec 13, 2020

I mean I did nothing but run your notebook directly. No change

@benihime91
Owner

Wait, let me check. I think I made some changes to the API but forgot to update the notebook.

@benihime91
Owner

Did you change these?

# INSTANTIATE LIGHTNING-TRAINER with CALLBACKS :
# ============================================================ #
# NOTE:
# For a list of all Trainer-specific arguments see:
# https://pytorch-lightning.readthedocs.io/en/latest/trainer.html
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor

# log the learning rate at every optimizer step
lr_logger  = LearningRateMonitor(logging_interval="step")
# stop once val_loss has not improved for 8 consecutive validation checks
early_stop = EarlyStopping(mode="min", monitor="val_loss", patience=8)

# instantiate the Lightning Trainer
trainer    = Trainer(precision=16, gpus=1, callbacks=[lr_logger, early_stop], max_epochs=50, weights_summary="full")
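
To illustrate what patience=8 with mode="min" means in practice, the following is a simplified model of patience-based early stopping. It is a sketch under stated assumptions, not Lightning's implementation: `stopping_epoch` is a hypothetical helper, and the validation losses are made-up numbers chosen only for illustration.

```python
def stopping_epoch(val_losses, patience=8):
    """Return the 0-based epoch at which training would halt, or None if it never does.

    Simplified rule: stop once the monitored value has failed to improve on the
    best value seen so far for `patience` consecutive checks.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up losses: a small improvement at epoch 1, then a plateau.
plateau = [5.3, 5.2] + [5.25] * 8
stopped_at = stopping_epoch(plateau, patience=8)
never_stops = stopping_epoch([3.0, 2.0, 1.0], patience=8)
```

Under this rule a loss that plateaus almost immediately ends the run after roughly patience plus a couple of epochs, long before max_epochs is reached.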

@benihime91
Owner

What's your loss?

@samjoy
Author

samjoy commented Dec 13, 2020

I did make one change. Instead of litModel = RetinaNetModel(hparams=hparams), I used litModel = RetinaNetModel(hparams).

@samjoy
Author

samjoy commented Dec 13, 2020

The loss is 5.2

@benihime91
Owner

That's okay, I changed the name of the argument to conf anyway.
I think it will be better if you give me a link to your Colab.

Save the Colab as a GitHub gist and give me the link. I will get back to you.
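
The exchange above turns on how Python binds arguments: a positional call keeps working after a parameter is renamed, while a call using the old keyword fails. A minimal sketch, using a hypothetical `DemoModel` class as a stand-in (it is not the repo's RetinaNetModel):

```python
class DemoModel:
    """Hypothetical stand-in for a class whose constructor argument is named `conf`."""

    def __init__(self, conf):
        self.conf = conf


settings = {"lr": 0.005}

# A positional call does not depend on the parameter's name.
model = DemoModel(settings)

# A keyword call with the old name is rejected with a TypeError.
try:
    DemoModel(hparams=settings)
    keyword_call_failed = False
except TypeError:
    keyword_call_failed = True
```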

@benihime91
Owner

5.2 after how many epochs? That's too high.

@samjoy
Author

samjoy commented Dec 13, 2020

In 10 epochs with early stopping

@samjoy
Author

samjoy commented Dec 13, 2020

I am now running without early stopping but with max_epochs=50.

@samjoy
Author

samjoy commented Dec 13, 2020

I am using the Pascal VOC XML format from Roboflow.

@samjoy
Author

samjoy commented Dec 13, 2020

I just ran it again without early stopping and with max_epochs=50, and I am getting a loss of 5.28.

@benihime91
Owner

My loss is less than 2 even after 1 epoch with the same basic params. Let it train for some more; I'll share the gist.

@samjoy
Author

samjoy commented Dec 13, 2020

Are you using the BCCD dataset in the Pascal VOC XML format?

@benihime91
Owner

Yes

@benihime91
Owner

benihime91 commented Dec 13, 2020

So I did a bit of tuning. My optimizer config looks like this now:

hparams.optimizer = {
    "class_name": "torch.optim.SGD", 
    "params"    : {"lr": 0.005, "weight_decay": 0.0001, "momentum":0.9},
    }

Current epoch is 14 and loss=0.37.

I'm training for 40 epochs, so once that's over I'll share the notebook.
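
A config of the shape shown above (a dotted "class_name" plus a "params" dict) is typically resolved by importing the class and passing the params as keyword arguments. The sketch below shows one way that could work. It is an assumption about the mechanism, not the repo's actual code: `instantiate` is a hypothetical helper, and the demo uses a standard-library class so it runs without PyTorch.

```python
import importlib


def instantiate(config, *args):
    """Build an object from {"class_name": "pkg.module.Class", "params": {...}}."""
    module_path, _, class_name = config["class_name"].rpartition(".")
    cls = getattr(importlib.import_module(module_path), class_name)
    return cls(*args, **config.get("params", {}))


# Standard-library demo so the sketch runs anywhere.
delta = instantiate(
    {"class_name": "datetime.timedelta", "params": {"hours": 1, "minutes": 30}}
)

# With PyTorch installed, the same helper could build the optimizer from the
# config above, assuming `model` is an existing torch.nn.Module:
# optimizer = instantiate(dict(hparams.optimizer), model.parameters())
```

Keeping the optimizer as data makes it easy to swap SGD for another optimizer without touching the training code.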

@samjoy
Author

samjoy commented Dec 13, 2020

ok thanks

@benihime91
Owner

Please check this : https://colab.research.google.com/gist/benihime91/00996411c8174a81f6c1389750012103/github-retinanet-demo.ipynb

40 epochs, loss = 0.543 , classification_loss=0.233, regression_loss=0.153, val_loss=0.435

coco-evaluation results:

IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.322
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.751
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.154
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.333
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.321
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.491
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.405
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.451
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.433
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.398
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.567

@samjoy
Author

samjoy commented Dec 13, 2020

I will run it now and let you know about the results

@benihime91
Owner

If it still doesn't work, try installing pytorch-lightning==1.0.0 (but I don't think that should be an issue 😌) and share your notebook with me.

@samjoy
Author

samjoy commented Dec 13, 2020

Yeah sure :)

@samjoy
Author

samjoy commented Dec 13, 2020

I ran your notebook exactly as is and I am getting poor results.
40 epochs, loss=2.91, v_num=0, classification_loss=2.39, regression_loss=0.518, val_loss=2.88

IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000


DATALOADER:0 TEST RESULTS
{'AP': tensor(0., dtype=torch.float64),
'val_loss': tensor(2.8816, device='cuda:0')}

[{'AP': 0.0, 'val_loss': 2.8816070556640625}]

@samjoy
Author

samjoy commented Dec 13, 2020

I just opened the link and ran the notebook and I am getting the above poor results

@benihime91
Owner

Can you try with pytorch-lightning version 1.0.0? Just run pip install pytorch-lightning==1.0.0

@benihime91
Owner

If it doesn't work, please share your notebook with me. Otherwise I'm afraid I won't be able to do anything more.

@samjoy
Author

samjoy commented Dec 13, 2020

Did you run your notebook on Colab or some other platform?

@benihime91
Owner

on colab itself

@samjoy
Author

samjoy commented Dec 13, 2020

Can you list the versions of all the essential libraries that you used, such as pytorch-lightning, PyTorch, torchvision, etc.?

@benihime91
Owner

The only library that may cause conflicts is pytorch-lightning, because it had some massive changes. That's why I am saying to try pytorch-lightning version 1.0.0. The other libraries are all installed by default in Colab, and OmegaConf and albumentations should not cause conflicts.

!pip install pytorch-lightning==1.0.0
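
Since the fix hinges on which version is actually installed, it can help to confirm it from inside the notebook after installing (pip's exact-version pin is written with a double equals sign, `==`). A minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); `check_pin` is a hypothetical helper name:

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package, expected):
    """Report whether the installed version of `package` matches `expected`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if installed == expected:
        return f"{package}=={installed} matches the pin"
    return f"{package}=={installed} differs from the pinned {expected}"


print(check_pin("pytorch-lightning", "1.0.0"))
```

In Colab the runtime usually needs a restart after reinstalling a package that has already been imported, otherwise the old version stays loaded.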

@samjoy
Author

samjoy commented Dec 13, 2020

ok I will do that now

@benihime91
Owner

Also, please share your notebook, or I am out of options.

@samjoy
Author

samjoy commented Dec 13, 2020

It's working now. You were right: it was due to the pytorch-lightning version. Thanks for your help.
I was comparing the original image and the predictions, and some of the bounding boxes do not align. Any tips on how to fix this?

@benihime91
Owner

That's to be expected. You need to do hyperparameter tuning: try the Adam/AdamW optimizer and train for more epochs. As your AP increases, the accuracy of the bounding boxes will also increase.
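
Following the optimizer config format used earlier in this thread, switching to AdamW might look like the sketch below. The learning rate and weight decay are illustrative placeholders, not tuned or recommended values, and the variable name `adamw_optimizer` is hypothetical.

```python
# Same {"class_name", "params"} shape as the SGD config shown earlier.
adamw_optimizer = {
    "class_name": "torch.optim.AdamW",
    # Illustrative starting values only; tune them for your dataset.
    "params": {"lr": 1e-4, "weight_decay": 1e-4},
}

# In the notebook this would replace the SGD entry, assuming `hparams` exists:
# hparams.optimizer = adamw_optimizer
```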

@samjoy
Author

samjoy commented Dec 13, 2020

Ok I will look into that. Again thanks for your help

@benihime91
Owner

I have updated the Colab notebook and requirements.txt; be sure to get the latest ones.

@benihime91 benihime91 pinned this issue Dec 13, 2020
@eapolo

eapolo commented Dec 2, 2021

Hey man:
I get this error
ValueError: Expected y_max for bbox (0.009765625, 0.94140625, 0.05859375, 1.001953125, 1) to be in the range [0.0, 1.0], got 1.001953125.
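
This error has the shape of a range check on normalized box coordinates: a y_max of 1.001953125 means the box extends slightly past the bottom of the image. One common workaround is to clip annotation boxes to the image bounds before augmentation. The sketch below assumes pixel-coordinate boxes in (x_min, y_min, x_max, y_max) order and a 512x512 image; the image size is an assumption (the thread does not state it), chosen because 513 / 512 equals the reported value. `clip_box` and `normalize_box` are hypothetical helpers.

```python
def clip_box(box, width, height):
    """Clamp a pixel box (x_min, y_min, x_max, y_max) to the image bounds."""
    x_min, y_min, x_max, y_max = box
    return (
        min(max(x_min, 0), width),
        min(max(y_min, 0), height),
        min(max(x_max, 0), width),
        min(max(y_max, 0), height),
    )


def normalize_box(box, width, height):
    """Scale a pixel box to the [0, 1] range used by normalized-bbox checks."""
    x_min, y_min, x_max, y_max = box
    return (x_min / width, y_min / height, x_max / width, y_max / height)


# Hypothetical annotation whose bottom edge is one pixel outside a 512x512 image.
raw = (5, 482, 30, 513)
before = normalize_box(raw, 512, 512)   # y_max exceeds 1.0 here
clipped = clip_box(raw, 512, 512)
after = normalize_box(clipped, 512, 512)
```

If many boxes fall outside the image, it is worth checking the annotation export or any resize step first, since clipping only hides the symptom.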
