A framework for object detection and instance segmentation using models from the YOLOv8 family.
Requires Python 3.8 or greater (3.11 suggested) and PyTorch (>=1.8). On devices with CUDA-enabled graphics cards, the Nvidia CUDA toolkit (version 10.0 or higher) and the matching PyTorch build must be installed.
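The setup can be verified quickly from Python, e.g.:

import torch

print(torch.__version__)         # expected >= 1.8
print(torch.cuda.is_available()) # True if the CUDA toolkit and GPU drivers are correctly installed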
Other packages required:
- TorchVision (>=0.9.0)
- Matplotlib (>=3.2.2)
- NumPy (>=1.22.2)
- OpenCV (>=4.6.0)
- Pillow (>=7.1.2)
- PyYAML (>=5.3.1)
- Requests (>=2.23.0)
- SciPy (>=1.4.1)
- tqdm (>=4.64.0)
- Pandas (>=1.1.4)
- Seaborn (>=0.11.0)
- psutil
- py-cpuinfo
as specified in requirements.txt. They can be installed with the following command:
pip install -r requirements.txt
YOLO-FRAMEWORK
|_ configuration
   |_ training.yaml # training parameters
|_ models
   |_ data.yaml # data configuration file
   |_ model.pt # model weights file
|_ datasets
|_ inference_output
|_ training_output
|_ validation_results
|_ Code
   |_ PrepareDataset.py
   |_ Train.py
   |_ Validate.py
   |_ Preview.py
   |_ PreviewCamera.py
   ...
The framework can be used with custom YOLO models. By default, the CModelML class loads weights from the models directory (models/model.pt), but a custom path can be used as well.
from ml_model.CModelML import CModelML as Model
# Default model initialization
c_Model = Model() # weights loaded from models/model.pt
# Model initialization with custom path
c_Model = Model('example_path\\my_model.pt')
# Model initialization with official YOLOv8 weights
c_Model = Model('yolov8n.pt')
Additional parameters of the CModelML class can be tweaked:
- f_Thresh: confidence score threshold [float]
- s_ForceDevice: force device (e.g. 'cpu', 'cuda:0') [str]
- b_SAMPostProcess: enable additional post-processing with the Segment Anything Model (SAM) [bool]
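These can be combined at initialization; a minimal sketch, assuming all three are constructor keyword arguments (as f_Thresh is in the examples below):

from ml_model.CModelML import CModelML as Model

# Force CPU inference, raise the confidence threshold and refine masks with SAM
c_Model = Model(f_Thresh=0.5, s_ForceDevice='cpu', b_SAMPostProcess=True)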
The CModelML class takes as input ndarrays in the standard OpenCV format (shape=(H,W,3), dtype=np.uint8) or a string path to an image in '.jpg', '.jpeg' or '.png' format.
import cv2 as cv
from ml_model.CModelML import CModelML as Model
c_Model = Model(s_PathWeights='yolov8n.pt', f_Thresh=0.75) # Initialize model
# perform inference using ndarray
image = cv.imread('example_path\\example_image.jpg') # load image using OpenCV
results = c_Model.Detect(image)
# perform inference using path to image
results = c_Model.Detect('example_path\\example_image.jpg')
The model returns results as an instance of the ImageResults class.
When detecting or segmenting small objects in large images, tiling can be useful: it divides the input image into several smaller tiles, which are passed to the ML model separately. The results are then merged and mapped back to the full-resolution image.
from ml_model.CModelML import CModelML as Model
c_Model = Model(i_TileSize=500) # Initialize model with tiling enabled and tile shape of 500x500
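Detection itself is unchanged once tiling is enabled; the image is split and the results merged internally. A sketch (the image path is hypothetical):

from ml_model.CModelML import CModelML as Model

c_Model = Model(i_TileSize=500) # tiling enabled, 500x500 tiles
results = c_Model.Detect('example_path\\large_image.jpg') # tiled inference on a large image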
Input data structure:
input_data_folder
|_ class_names.txt # list of class names in plain text, one class per line
|_ data
   |_ file1.txt # label file should have the '.txt' extension
   |_ file1.jpg # image file should have the '.jpg', '.jpeg' or '.png' extension
   ...
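Label files are presumably in the standard YOLO annotation format used by YOLOv8: one object per line, with the class index followed by the normalized bounding-box center, width and height. A hypothetical file1.txt with two objects:

0 0.512 0.430 0.210 0.180
2 0.155 0.760 0.090 0.120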
- Run PrepareDataset.py
- Select the input folder with images and labels
- Select the output dataset folder in the desired directory, e.g. 'datasets/dataset-example'
- The system will create a new dataset with a yaml configuration file and train, test and val subsets.
Output data structure:
output_data_folder
|_ data.yaml # dataset configuration file
|_ train
   |_ file1.txt
   |_ file1.jpg
   ...
|_ val
   ...
|_ test
   ...
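For orientation, a YOLOv8-style data.yaml generally follows the Ultralytics convention sketched below; the actual file is generated by PrepareDataset.py, and the paths and class names here are hypothetical:

path: datasets/dataset-example # dataset root
train: train                   # subset folders, relative to the root
val: val
test: test
names:
  0: class_a
  1: class_b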
- Run Train.py
- Select the model size
- Select the dataset folder, e.g. 'datasets/dataset-example'
- Training output is saved to the training_output directory
Parameters in Train.py:
- i_Epochs: number of training epochs
- i_BatchSize: training batch size
- f_ConfThreshTest: confidence threshold during testing
Advanced parameters are stored in configuration/training.yaml.
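The exact keys in training.yaml depend on this framework's implementation; a hypothetical excerpt using the Ultralytics YOLOv8 training-argument names might look like:

lr0: 0.01            # initial learning rate
momentum: 0.937      # SGD momentum
weight_decay: 0.0005 # optimizer weight decay
imgsz: 640           # training image size
patience: 50         # early-stopping patience in epochs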
training_output
|_ 20230101_000000 # folder named with the training date
   |_ plots # metrics
      ...
   |_ test_inference # inference on the test subset
      ...
   |_ weights
      |_ best.pt # best weights
      |_ last.pt # last epoch weights
   |_ data.yaml # dataset configuration file
   ...
...
- Run Validate.py
- Select the dataset folder
- Validation output is saved to the validation_results directory
validation_results
|_ 20230101_000000 # folder named with the validation date
   |_ results.json # validation numeric results
   ...
...
Output file structure:
{
    "mean_ap": "mAP50:95",
    "mean_ap50": "mAP50",
    "ap50": {
        "class_name": "AP50",
        //...
    },
    "ap": {
        "class_name": "AP50:95",
        //...
    },
    "mean_precission": "MEAN_PRECISSION",
    "mean_recall": "MEAN_RECALL",
    "precission": {
        "class_name": "PRECISSION",
        //...
    },
    "recall": {
        "class_name": "RECALL",
        //...
    },
    "mean_f1": "F1",
    "f1": {
        "class_name": "F1",
        //...
    },
    "speed": "TOTAL_INFERENCE_TIME_PER_IMAGE"
}
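The results file can be consumed programmatically; a minimal sketch (the run folder name is hypothetical, keys as documented above):

import json

with open('validation_results/20230101_000000/results.json') as f:
    dc_Results = json.load(f)

print('mAP50:95:', dc_Results['mean_ap'])
print('mAP50:', dc_Results['mean_ap50'])
for s_Class, f_AP50 in dc_Results['ap50'].items():
    print(f'{s_Class}: AP50 = {f_AP50}')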
- Run Preview.py
- Select the folder with input images
- The preview will be displayed in an OpenCV GUI
- Preview output is saved to the inference_output directory as *.txt (YOLO) and *.json (COCO) results files
- Pressing 's' during the preview saves the current image to disk; 'ESC' closes the script
Local file locations:
- Weights file: models/model.pt
- Data configuration file: models/data.yaml
Parameters in Preview.py:
- f_Thresh: confidence threshold value
- Run PreviewCamera.py
- The camera feed will be displayed in an OpenCV GUI
Parameters in PreviewCamera.py:
- f_Thresh: confidence threshold value
- i_TargetFPS: target FPS value
- Run CrossEval.py
- Select the input folder with images and labels
- Select the output dataset folder in the desired directory, e.g. 'datasets/dataset-example'
- Select the model size
- The system will split the data into N segments, prepare the models and perform cross-validation.
- Cross-validation output is saved to the validation_results directory
validation_results
|_ CrossEval_20230101_000000 # folder named with the validation date
   |_ results_final.json # validation numeric results
   ...
...
Parameters in CrossEval.py:
- iNSegments: number of sub-datasets used during cross-validation
- i_Epochs: number of training epochs
- i_BatchSize: training batch size
- f_ConfThreshTest: confidence threshold during testing