Loss graph during training #614

Open
abeyang00 opened this issue Apr 1, 2018 · 41 comments

@abeyang00

Is there a way to show a loss graph during training, like in TensorFlow?

@springkim

Hi @abeyang00
Here: https://github.com/AlexeyAB/darknet
He added a loss plot for training.

@abeyang00
Author

@springkim can you tell me where the plot code is located in his repo? Is it in a .c file in src?

@ahsan856jalal

AlexeyAB#504 (comment)
Your answer is at the bottom.

@Caroline1994

Can someone tell me how to show a loss graph during training when I use pjreddie's darknet?

@Sikandarkhan

Any update on this thread?

@rbarman

rbarman commented Mar 22, 2019

I found one solution here: https://github.com/Jumabek/darknet_scripts/#how-to-plot-yolo-loss

You basically need to save the output of ./darknet detector train <> into a log file and then run python plot_yolo_log.py log_file.log

Note that the plot does not show in a Jupyter notebook, even with %matplotlib inline. A workaround is to copy all the plot-related code from https://github.com/Jumabek/darknet_scripts/blob/master/plot_yolo_log.py into a new function.
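
The log-and-plot approach described above can also be sketched without the linked script. This is a minimal sketch, not the code of plot_yolo_log.py. It assumes each training iteration prints a summary line shaped like `1000: 0.53, 0.60 avg loss, ...` (iteration number, current loss, running average); that format is an assumption, not something stated in this thread. The names `parse_loss_log` and `plot_loss` are hypothetical.

```python
import re

# Assumed log line shape (not confirmed in this thread), for example:
#   "1000: 0.532110, 0.601234 avg loss, 0.001000 rate, 3.2 seconds, 64000 images"
# The pattern also accepts a bare "avg" without the word "loss".
LOSS_LINE = re.compile(r"^\s*(\d+):\s*([^\s,]+),\s*([^\s,]+)\s+avg")


def parse_loss_log(lines):
    """Return (iterations, avg_losses) extracted from an iterable of log lines."""
    iterations, avg_losses = [], []
    for line in lines:
        match = LOSS_LINE.match(line)
        if match is None:
            continue  # not an iteration summary line
        try:
            avg_loss = float(match.group(3))
        except ValueError:
            continue  # matched the shape but the value is not numeric
        iterations.append(int(match.group(1)))
        avg_losses.append(avg_loss)
    return iterations, avg_losses


def plot_loss(log_path, out_path="loss.png"):
    """Plot average loss against iteration and save the figure to out_path."""
    # Imported here so that parsing works even without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    with open(log_path) as log_file:
        iterations, avg_losses = parse_loss_log(log_file)

    fig, ax = plt.subplots()
    ax.plot(iterations, avg_losses)
    ax.set_xlabel("iteration")
    ax.set_ylabel("average loss")
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

Usage might look like `plot_loss("log_file.log")` once the training output has been redirected into log_file.log. Saving the figure to a file rather than showing it interactively may also help in notebooks where the plot does not appear.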

@AlexeyAB
Collaborator

You can use the repo https://github.com/AlexeyAB/darknet, which shows a Loss & mAP chart during training:
[image: chart_full_occlusion]

@groszste

@AlexeyAB is this plot of the training loss or validation loss? If training loss, do you have a way of viewing the validation loss?

@AlexeyAB
Collaborator

@groszste
It is the training loss and the validation mAP.
For me it isn't necessary to see the validation loss; it is much better to see the validation mAP.

@JakupGuven

@AlexeyAB
What commands do you use to display validation mAP during training?

@kschwethelm

@JakupGuven

From README: https://github.com/AlexeyAB/darknet/blob/master/README.md

"Or just train with -map flag:

darknet.exe detector train data/obj.data yolo-obj.cfg darknet53.conv.74 -map

So you will see mAP-chart (red-line) in the Loss-chart Window. mAP will be calculated for each 4 Epochs using valid=valid.txt file that is specified in obj.data file (1 Epoch = images_in_train_txt / batch iterations)"
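
The interval quoted from the README can be worked through as simple arithmetic. The sketch below only restates that formula (1 epoch = images in train.txt / batch iterations, mAP every 4 epochs). The function name and the sample numbers are made up for illustration, and the real implementation may round the value or apply limits that are not covered here.

```python
def map_interval_iterations(num_train_images, batch, epochs_between_map=4):
    """Iterations between mAP calculations, following the README formula."""
    # 1 epoch = images_in_train_txt / batch iterations
    iterations_per_epoch = num_train_images / batch
    return epochs_between_map * iterations_per_epoch


# Hypothetical dataset: 2000 training images, batch=64
print(map_interval_iterations(2000, 64))  # 125.0
```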

@yjdeveloper

yjdeveloper commented Jun 20, 2019

I followed the steps given by Mr. @AlexeyAB and got the red line, but my problem is how to plot the mAP every 100 iterations. Your documentation mentions 1000 iterations, but I want it every 100 iterations.

@devphilno

@yjdeveloper have you figured out how to downscale the mAP calculation to a shorter interval?

@fcakyon

fcakyon commented Sep 25, 2019

@yjdeveloper @snphnolt use this version with -map 0.02 for mAP calculation every 0.02 epochs (it starts after the warmup iterations)

https://github.com/fcakyon/darknet

@neso613

neso613 commented Feb 10, 2020

mAP-chart

Where can this mAP graph be seen?

@neso613

neso613 commented Feb 10, 2020

I have followed the steps given by Mr. @AlexeyAB and got the red line but my problem is how to plot a mAP after every 100 iteration. In your documentation until 1000 iteration, but i want in every 100 iteration.

How did you get the red line?

@ghost

ghost commented Feb 25, 2020

@AlexeyAB

I am using your repo to detect custom objects with YOLOv3, but I have run into trouble. The predictions.jpg image does not show the confidence score; it only shows the class ID.

I traced the image.c code and found this function definition:

void draw_detections_v3(image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes, int ext_output)

How can I resolve the issue?

@ghost

ghost commented Feb 26, 2020

Please, can anyone help? Which function do I have to use in the AlexeyAB YOLO repository to get confidence scores drawn on the predictions.jpg image file? I only get the class ID using this:

!./darknet detector test data/trainer.data cfg/yolov3.cfg backup/yolov3_last.weights -thresh 0.1 -iou_thresh 0.3 data/img/tb500.jpg

predictions

@Leprechault

You can use repo https://github.com/AlexeyAB/darknet that shows Loss & mAP chart during Training:
chart_full_occlusion

The command ./darknet detector demo ... -json_port 8070 -mjpeg_port 8090 works very well, but is there any way to save the image in a vector format, e.g. *.pdf, *.svg, or *.ps?

@Leprechault

I found one solution here : https://github.com/Jumabek/darknet_scripts/#how-to-plot-yolo-loss

You basically need to save the output of ./darknet detector train <> into a log file and then python plot_yolo_log.py log_file.log

Note that the plot does not show in a jupyter-notebook even with %matplotlib inline. A work around is to copy all plot related code from https://github.com/Jumabek/darknet_scripts/blob/master/plot_yolo_log.py into a new function.

@rbarman what information about mAP is there in the log.txt output?

@ak3509311

How can I save the loss graph to Drive? I run the code on Colab.

@harshkc03

harshkc03 commented Jun 16, 2020

I'm training Yolov3-tiny on colab using the following command-
!./darknet detector train /content/obj.data /content/yolov3-tiny-obj.cfg backup/yolov3-tiny-obj_last.weights -dont_show -mjpeg_port 8090 -map

It shows "MJPEG-stream sent" in the output after every iteration, and I know we have to use the http://ip-address:8090 format to access the chart, but I'm unable to find the IP address of my Colab notebook. I tried the addresses from !ifconfig and !curl ipecho.net/plain, but still no result.
Any help would be appreciated.

@himewel

himewel commented Jul 5, 2020

@harshkc03 I found this on Stack Overflow. I still have not found a way to expose the JSON and the graph at the same time, but you can try something like this to train and watch your graph update. The following commands print a URL where you can access your loss graph:

!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip ngrok-stable-linux-amd64.zip

get_ipython().system_raw('./ngrok http 8090 &')

!curl -s http://localhost:4040/api/tunnels | python3 -c \
 "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

After this, start your training:

!./darknet detector train /content/obj.data /content/yolov3-tiny-obj.cfg backup/yolov3-tiny-obj_last.weights -dont_show -mjpeg_port 8090 -map
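
The curl one-liner above reads the tunnel list from ngrok's local API at http://localhost:4040/api/tunnels and prints the first public_url. The same step might be written in plain Python as below. This is only a sketch: the function names are hypothetical, and it assumes the response has the shape implied by the one-liner (a "tunnels" list whose entries carry a "public_url" field).

```python
import json
from urllib.request import urlopen


def first_public_url(tunnels_json):
    """Return the public_url of the first tunnel in the JSON text, or None."""
    payload = json.loads(tunnels_json)
    tunnels = payload.get("tunnels", [])
    if not tunnels:
        return None  # ngrok is running but no tunnel is open yet
    return tunnels[0]["public_url"]


def fetch_public_url(api_url="http://localhost:4040/api/tunnels", timeout=5):
    """Query the local ngrok API and return the first tunnel's public URL."""
    with urlopen(api_url, timeout=timeout) as response:
        return first_public_url(response.read().decode("utf-8"))
```

Returning None when the list is empty avoids the IndexError that the one-liner would raise if it runs before the tunnel is ready.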

@francismontalbo

Is there a way to produce the loss curve and mAP from an existing weights file?

@harshkc03

@francismontalbo you can obtain the mAP of an existing weights file using the command:
./darknet detector map data/obj.data yolo-obj.cfg backup/yolo-obj_last.weights
but you cannot generate the loss curve for an existing weights file. The loss curve is generated only during training.

@francismontalbo

@francismontalbo you can obtain mAP of the existing weight using the command-

./darknet detector map data/obj.data yolo-obj.cfg backup\yolo-obj_last.weights

but you cannot generate the loss curve of an existing weight. Loss curve generates only during training.

Yes, I've been using that. I see, thank you for the response good sir.

@wc1997

wc1997 commented Aug 4, 2020

You can use the pyngrok Python package to display the loss graph

!pip install pyngrok
from pyngrok import ngrok

# Open an HTTP tunnel on port 8090
public_url = ngrok.connect(port = '8090')

public_url

Then run your training with the flags

-mjpeg_port 8090 -map

@harshkc03

You can use pyngrok python package to display loss graph

!pip install pyngrok
from pyngrok import ngrok# Open a HTTP tunnel on port 8090
public_url = ngrok.connect(port = '8090')

public_url

Then run your training with flags

-mjpeg_port 8090 -map

Thank you sir, it works as expected.

@himewel

himewel commented Aug 25, 2020

You can use pyngrok python package to display loss graph

!pip install pyngrok
from pyngrok import ngrok# Open a HTTP tunnel on port 8090
public_url = ngrok.connect(port = '8090')

public_url

Then run your training with flags

-mjpeg_port 8090 -map

Seems much more elegant than my response, ty auhdsuahsduahs

@vishnuvardhan58

vishnuvardhan58 commented Nov 2, 2020

Hello, I am getting the following error while using the command "!./darknet detector train data/obj.data cfg/yolov3_custom.cfg darknet53.conv.74 -dont_show -mjpeg_port 8090 -map". I am using Google Colab.

The connection to http://d80c91c46410.ngrok.io was successfully tunneled to your ngrok client, but the client failed to establish a connection to the local address localhost:8090.

Make sure that a web service is running on localhost:8090 and that it is a valid address.

The error encountered was: dial tcp 127.0.0.1:8090: connect: connection refused
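
The message says that nothing accepted connections on localhost:8090 when the tunnel was used. One possible reading is that the training process was not serving the MJPEG stream at that moment, though the thread does not confirm the cause. A quick way to check from the notebook whether anything is listening on the port is sketched below; `port_is_open` is a hypothetical helper, not part of darknet or ngrok.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False


if __name__ == "__main__":
    # 8090 is the port passed to -mjpeg_port in the command above.
    print("listening" if port_is_open(8090) else "nothing listening on 8090")
```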

@sercangokturk

I have followed the steps given by Mr. @AlexeyAB and got the red line but my problem is how to plot a mAP after every 100 iteration. In your documentation until 1000 iteration, but i want in every 100 iteration.

Did you find a solution? Thanks in advance.

@shawntyshawny

shawntyshawny commented Jan 8, 2021

You can use repo https://github.com/AlexeyAB/darknet that shows Loss & mAP chart during Training:
chart_full_occlusion

I've followed this tutorial, but my mAP line seems to start at 68% and there is a broken line between 0 and 68%. How do I resolve this?
119930805_385605669128251_2197230891200090292_n

@Doomleet

You can use repo https://github.com/AlexeyAB/darknet that shows Loss & mAP chart during Training:
chart_full_occlusion

Hello! Does it work only on Windows? I don't know how to use it on Linux. I use the -map flag.

@Doomleet

You can use repo https://github.com/AlexeyAB/darknet that shows Loss & mAP chart during Training:
chart_full_occlusion

Hello! It is work only on Windows? I dont know how use it on Linux. I use tag -map

OK, I had run make without OPENCV=1; now I ran make with OPENCV=1 and it works :)

@mhdayub

mhdayub commented Feb 3, 2021

How do I show the red line (percentage), and which file do I run?

@oo92

oo92 commented Feb 22, 2021

What is the y-axis? What do those numbers on the y-axis represent?

@khinmaunghtay4ah

khinmaunghtay4ah commented Sep 16, 2021

how to show red line (percentage) and run on what file?

@mhdayub You just have to add the -map flag at the end of the command used for training, and you will see the accuracy (mAP) during training.

For example: darknet.exe detector train data/obj.data cfg/yolov4-obj.cfg backup/yolov4-obj_last.weights -map

@hainguyen201

Go to the darknet folder; you will see the chart image file 'chart.png'.
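
For Colab users who want to keep the chart, as asked earlier in the thread, one option may be to copy chart.png to a mounted Google Drive folder from time to time. The sketch below assumes Drive is already mounted in the notebook. The paths in the usage comment are placeholders, and `backup_chart` is a hypothetical helper.

```python
import shutil
import time
from pathlib import Path


def backup_chart(chart_path, dest_dir):
    """Copy the chart image into dest_dir under a timestamped name.

    Returns the path of the copy, or None if the chart does not exist yet.
    """
    src = Path(chart_path)
    if not src.is_file():
        return None  # the chart has not been written yet
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, target)
    return target


# Placeholder paths, adjust to your own layout:
# backup_chart("/content/darknet/chart.png", "/content/drive/MyDrive/darknet_charts")
```

The timestamp in the file name keeps earlier copies, so a chart saved before a runtime disconnect is not overwritten by a later one.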

@DikshitV

Hi, I am new to YOLO. Where can I find the training loss values stored in darknet? Are the training loss values stored, or are they just plotted directly?

@ZaynAlk

ZaynAlk commented Jul 6, 2022

Hi, I am new to yolo. Where can I find the training loss values stored in the darknet? Is the training loss values stored or they are just directly plotted?

Go to the darknet folder and you can find it there.

@yeonmnim

yeonmnim commented Apr 2, 2023

@AlexeyAB
Hi all, I am using Colab to run YOLOv4 on my custom data, but now the mAP no longer shows up for me. I previously got the loss graph with an mAP line, but that was a long time ago. So I wrote the log to a txt file, and it seems that the mAP cannot be calculated when training reaches the 1000th iteration. It outputs the error shown below:

cuDNN status Error in: file: ./src/convolutional_kernels.cu : () : line: 543 : build time: Apr 2 2023 - 12:39:35

cuDNN Error: CUDNN_STATUS_BAD_PARAM
Darknet error location: ./src/dark_cuda.c, cudnn_check_error, line #204
cuDNN Error: CUDNN_STATUS_BAD_PARAM: Success

Also, is there any way to keep the runtime from being stopped? I have 6000 iterations to run, but it automatically stops when it reaches 3000 iterations.
