Question: Does it make sense to deploy two image classification / object detection models to handle Day (RGB) / Night (IR) cameras? Is there a penalty for combining day and night footage into one big dataset?
Teledyne FLIR Free ADAS Thermal Dataset v2: The Teledyne FLIR free starter thermal dataset provides fully annotated thermal and visible spectrum frames for development of object detection neural networks. This data was constructed to encourage research on visible + thermal spectrum sensor fusion algorithms ("RGBT") in order to advance the safety of autonomous vehicles. A total of 26,442 fully-annotated frames are included with 15 different object classes.
A modified MSCOCO label map was used, with conventions largely inspired by the Berkeley Deep Drive dataset. The following classes are included (a sketch for remapping these sparse category IDs to contiguous YOLO indices follows the list):
- Category Id 1: person
- Category Id 2: bike (renamed from "bicycle")
- Category Id 3: car (includes pick-up trucks and vans)
- Category Id 4: motor (renamed from "motorcycle" for brevity)
- Category Id 6: bus
- Category Id 7: train
- Category Id 8: truck (semi/freight truck, excluding pickup trucks)
- Category Id 10: light (renamed from "traffic light" for brevity)
- Category Id 11: hydrant (renamed from "fire hydrant" for brevity)
- Category Id 12: sign (renamed from "street sign" for brevity)
- Category Id 17: dog
- Category Id 37: skateboard
- Category Id 73: stroller (four-wheeled carriage for a child, also called a pram)
- Category Id 77: scooter
- Category Id 79: other vehicle (less common vehicles such as construction equipment and trailers)
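Because these category IDs are sparse (1 to 79 with gaps), they usually need to be remapped to contiguous, zero-based class indices before training a YOLO-style detector on the COCO-format annotations. Below is a minimal sketch of such a conversion; the JSON fields follow the standard COCO schema, while the file paths and output layout are assumptions, not the actual release structure.

```python
import json
from pathlib import Path

# Sparse FLIR category IDs (as listed above) -> contiguous zero-based YOLO indices.
FLIR_IDS = [1, 2, 3, 4, 6, 7, 8, 10, 11, 12, 17, 37, 73, 77, 79]
ID_TO_INDEX = {cid: i for i, cid in enumerate(FLIR_IDS)}

def coco_to_yolo_labels(coco_json: str, out_dir: str) -> None:
    """Convert COCO-format annotations to YOLO txt labels (one file per image)."""
    coco = json.loads(Path(coco_json).read_text())
    images = {img["id"]: img for img in coco["images"]}
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    for ann in coco["annotations"]:
        cls = ID_TO_INDEX.get(ann["category_id"])
        if cls is None:  # skip rare categories not in the label map above (e.g. "deer")
            continue
        img = images[ann["image_id"]]
        w, h = img["width"], img["height"]
        x, y, bw, bh = ann["bbox"]                   # COCO bbox: top-left x, y, width, height
        cx, cy = (x + bw / 2) / w, (y + bh / 2) / h  # YOLO bbox: normalized center x, y
        label_path = out / (Path(img["file_name"]).stem + ".txt")
        with label_path.open("a") as f:
            f.write(f"{cls} {cx:.6f} {cy:.6f} {bw / w:.6f} {bh / h:.6f}\n")
```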
| Label | Thermal Train | Thermal Val | Visible Train | Visible Val |
|---|---:|---:|---:|---:|
| person | 50,478 | 4,470 | 35,007 | 3,223 |
| bike | 7,237 | 170 | 7,560 | 193 |
| car | 73,623 | 7,133 | 71,281 | 7,285 |
| motor | 1,116 | 55 | 1,837 | 77 |
| bus | 2,245 | 179 | 1,879 | 183 |
| train | 5 | 0 | 9 | 0 |
| truck | 829 | 46 | 1,251 | 47 |
| light | 16,198 | 2,005 | 18,640 | 2,143 |
| hydrant | 1,095 | 94 | 990 | 126 |
| sign | 20,770 | 2,472 | 29,531 | 3,581 |
| dog | 4 | 0 | -- | -- |
| deer | 8 | 0 | -- | -- |
| skateboard | 29 | 3 | 412 | 4 |
| stroller | 15 | 6 | 38 | 7 |
| scooter | 15 | 0 | 41 | 0 |
| other vehicle | 1,373 | 63 | 698 | 40 |
| Total | 175,040 | 16,696 | 169,174 | 16,909 |
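Counts like those in the table can be reproduced directly from the COCO-format annotation files. A short sketch, assuming one `coco.json` index per split (the folder names are assumptions about the release layout):

```python
import json
from collections import Counter

def class_counts(coco_json: str) -> Counter:
    """Count annotations per category name in a COCO-format index file."""
    with open(coco_json) as f:
        coco = json.load(f)
    names = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(names[a["category_id"]] for a in coco["annotations"])

# Assumed split folders, each containing a COCO index named coco.json.
for split in ("images_thermal_train", "images_thermal_val",
              "images_rgb_train", "images_rgb_val"):
    print(split, class_counts(f"{split}/coco.json").most_common())
```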
I trained both a YOLOv8n and a YOLOv8s model for each case: RGB images only, IR images only, and the combined image dataset. A minimal sketch of how these runs could be launched is shown below; the results for each model and dataset, split by class, follow.
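The six runs (two model sizes times three dataset variants) can be launched with the Ultralytics API roughly as follows. The dataset YAML names (`flir_rgb.yaml`, `flir_ir.yaml`, `flir_mixed.yaml`) and the hyperparameters (epochs, image size) are assumptions, not the exact settings behind the numbers below.

```python
from ultralytics import YOLO

# Hypothetical dataset configs; the actual YAML names/paths are assumptions.
DATASETS = {
    "rgb_only": "flir_rgb.yaml",
    "ir_only": "flir_ir.yaml",
    "mixed": "flir_mixed.yaml",
}
MODELS = ["yolov8n.pt", "yolov8s.pt"]

for model_name in MODELS:
    for split_name, data_yaml in DATASETS.items():
        model = YOLO(model_name)  # start from pretrained COCO weights
        model.train(
            data=data_yaml,       # dataset definition (train/val paths, class names)
            epochs=100,           # assumed training budget
            imgsz=640,            # assumed input resolution
            name=f"{model_name.split('.')[0]}_{split_name}",  # run name for the results folder
        )
        metrics = model.val()     # mAP on the matching validation split
        print(split_name, model_name, metrics.box.map50)
```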
Note that classes that are underrepresented in the dataset perform abysmally. Personally, I only consider the well-represented classes to be representative of the outcome of this experiment.
Conclusions:
- Given the quality of the images in this dataset, objects are easier to identify in the thermal/IR images than in the RGB images.
- The S model always outperforms the N model, as expected. It would be interesting to extend this experiment to the more complex M, L, and X model variants.
- You need at least the S model to work well with the mixed (day + night) dataset. There is a penalty for a few classes, but it might not justify the added complexity of deploying two models instead of one (see the evaluation sketch below).
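One way to quantify that penalty is to evaluate each trained model on both the RGB-only and the IR-only validation sets and compare the mAP values. A minimal sketch, assuming the weight paths produced by the training runs above (Ultralytics default output layout) and the same dataset YAMLs:

```python
from ultralytics import YOLO

# Assumed weight paths from the training runs above.
RUNS = {
    "rgb_only": "runs/detect/yolov8s_rgb_only/weights/best.pt",
    "ir_only":  "runs/detect/yolov8s_ir_only/weights/best.pt",
    "mixed":    "runs/detect/yolov8s_mixed/weights/best.pt",
}
VAL_SETS = {"rgb_val": "flir_rgb.yaml", "ir_val": "flir_ir.yaml"}

for run, weights in RUNS.items():
    model = YOLO(weights)
    for val_name, data_yaml in VAL_SETS.items():
        metrics = model.val(data=data_yaml, split="val")
        # mAP@0.5 and mAP@0.5:0.95 of this model on this validation set
        print(f"{run} on {val_name}: mAP50={metrics.box.map50:.3f}, mAP50-95={metrics.box.map:.3f}")
```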