distributed training of yolov5 files (#1896)
* distributed training of yolov5 files

* Add files via upload

editing connecting to workspace cells.

* Add files via upload

ran black on the notebook for formatting

* Add files via upload

edited the handle to workspace (ml_client)

* Add files via upload

renamed compute_name to "gpu-cluster"

* Delete sdk/python/jobs/single-step/pytorch/distributed-training-yolov5/yolov5/datasets directory

deleting dataset directory

* Delete sdk/python/jobs/single-step/pytorch/distributed-training-yolov5/yolov5/data directory

deleted data folder and is uploaded in https://azuremlexamples.blob.core.windows.net/datasets/yolov5/data/

* Add files via upload

Changed the input data path to azure blob storage

* Delete .pre-commit-config.yaml
jyravi committed Dec 12, 2022
1 parent 753c4ca commit 00b4d27
Showing 112 changed files with 25,132 additions and 0 deletions.
@@ -0,0 +1,94 @@
# This code is autogenerated.
# Code is generated by running custom script: python3 readme.py
# Any manual changes to this file may cause incorrect behavior.
# Any manual changes will be overwritten if the code is regenerated.

name: sdk-jobs-single-step-pytorch-distributed-training-yolov5-objectdetectionAzureML
# This file is created by sdk/python/readme.py.
# Please do not edit directly.
on:
  workflow_dispatch:
  schedule:
    - cron: "0 */8 * * *"
  pull_request:
    branches:
      - main
    paths:
      - sdk/python/jobs/single-step/pytorch/distributed-training-yolov5/**
      - .github/workflows/sdk-jobs-single-step-pytorch-distributed-training-yolov5-objectdetectionAzureML.yml
      - sdk/python/dev-requirements.txt
      - infra/**
      - sdk/python/setup.sh
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: check out repo
      uses: actions/checkout@v2
    - name: setup python
      uses: actions/setup-python@v2
      with:
        python-version: "3.8"
    - name: pip install notebook reqs
      run: pip install -r sdk/python/dev-requirements.txt
    - name: azure login
      uses: azure/login@v1
      with:
        creds: ${{secrets.AZUREML_CREDENTIALS}}
    - name: bootstrap resources
      run: |
          echo '${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}';
          bash bootstrap.sh
      working-directory: infra
      continue-on-error: false
    - name: setup SDK
      run: |
          source "${{ github.workspace }}/infra/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/init_environment.sh";
          bash setup.sh
      working-directory: sdk/python
      continue-on-error: true
    - name: setup-cli
      run: |
          source "${{ github.workspace }}/infra/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/init_environment.sh";
          bash setup.sh
      working-directory: cli
      continue-on-error: true
    - name: run jobs/single-step/pytorch/distributed-training-yolov5/objectdetectionAzureML.ipynb
      run: |
          source "${{ github.workspace }}/infra/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/init_environment.sh";
          bash "${{ github.workspace }}/infra/sdk_helpers.sh" generate_workspace_config "../../.azureml/config.json";
          bash "${{ github.workspace }}/infra/sdk_helpers.sh" replace_template_values "objectdetectionAzureML.ipynb";
          [ -f "../../.azureml/config" ] && cat "../../.azureml/config";
          papermill -k python objectdetectionAzureML.ipynb objectdetectionAzureML.output.ipynb
      working-directory: sdk/python/jobs/single-step/pytorch/distributed-training-yolov5
    - name: upload notebook's working folder as an artifact
      if: ${{ always() }}
      uses: actions/upload-artifact@v2
      with:
        name: objectdetectionAzureML
        path: sdk/python/jobs/single-step/pytorch/distributed-training-yolov5

    - name: Send IcM on failure
      if: ${{ failure() && github.ref_type == 'branch' && (github.ref_name == 'main' || contains(github.ref_name, 'release')) }}
      uses: ./.github/actions/generate-icm
      with:
        host: ${{ secrets.AZUREML_ICM_CONNECTOR_HOST_NAME }}
        connector_id: ${{ secrets.AZUREML_ICM_CONNECTOR_CONNECTOR_ID }}
        certificate: ${{ secrets.AZUREML_ICM_CONNECTOR_CERTIFICATE }}
        private_key: ${{ secrets.AZUREML_ICM_CONNECTOR_PRIVATE_KEY }}
        args: |
            incident:
              Title: "[azureml-examples] Notebook validation failed on branch '${{ github.ref_name }}' for notebook 'jobs/single-step/pytorch/distributed-training-yolov5/objectdetectionAzureML.ipynb'"
              Summary: |
                Notebook 'jobs/single-step/pytorch/distributed-training-yolov5/objectdetectionAzureML.ipynb' is failing on branch '${{ github.ref_name }}': ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
              Severity: 4
              RoutingId: "github://azureml-examples"
              Status: Active
              Source:
                IncidentId: "jobs/single-step/pytorch/distributed-training-yolov5/objectdetectionAzureML.ipynb[${{ github.ref_name }}]"
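The `if:` expression on the "Send IcM on failure" step can be read as a plain boolean predicate. The sketch below restates it in Python for illustration only; `should_send_icm` and the branch names in the calls are hypothetical, and the lower-casing reflects that string comparisons and `contains()` in GitHub Actions expressions ignore case.

```python
def should_send_icm(job_failed, ref_type, ref_name):
    """Restate the workflow's `if:` condition for the IcM step as a predicate."""
    # GitHub Actions compares strings case-insensitively, so normalize first.
    name = ref_name.lower()
    on_watched_branch = name == "main" or "release" in name
    return job_failed and ref_type.lower() == "branch" and on_watched_branch


# Hypothetical refs, chosen only to exercise each clause of the condition.
print(should_send_icm(True, "branch", "main"))             # True
print(should_send_icm(True, "branch", "release/2022-12"))  # True
print(should_send_icm(True, "branch", "feature/yolov5"))   # False
print(should_send_icm(False, "branch", "main"))            # False
```

Read this way, a failing run on a feature branch still uploads the artifact (that step uses `always()`) but does not open an incident.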
1 change: 1 addition & 0 deletions sdk/python/README.md
@@ -116,6 +116,7 @@ Test Status is for branch - **_main_**
|jobs|pipelines|[rai_pipeline_sample](jobs/pipelines/2f_rai_pipeline_sample/rai_pipeline_sample.ipynb)|Create sample RAI pipeline|[![rai_pipeline_sample](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-pipelines-2f_rai_pipeline_sample-rai_pipeline_sample.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-pipelines-2f_rai_pipeline_sample-rai_pipeline_sample.yml)|
|jobs|single-step|[debug-and-monitor](jobs/single-step/debug-and-monitor/debug-and-monitor.ipynb)|Run a Command to train a basic neural network with TensorFlow on the MNIST dataset|[![debug-and-monitor](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-debug-and-monitor-debug-and-monitor.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-debug-and-monitor-debug-and-monitor.yml)|
|jobs|single-step|[lightgbm-iris-sweep](jobs/single-step/lightgbm/iris/lightgbm-iris-sweep.ipynb)|Run **hyperparameter sweep** on a Command or CommandComponent|[![lightgbm-iris-sweep](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-lightgbm-iris-lightgbm-iris-sweep.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-lightgbm-iris-lightgbm-iris-sweep.yml)|
|jobs|single-step|[objectdetectionAzureML](jobs/single-step/pytorch/distributed-training-yolov5/objectdetectionAzureML.ipynb)|*no description*|[![objectdetectionAzureML](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-distributed-training-yolov5-objectdetectionAzureML.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-distributed-training-yolov5-objectdetectionAzureML.yml)|
|jobs|single-step|[distributed-cifar10](jobs/single-step/pytorch/distributed-training/distributed-cifar10.ipynb)|*no description*|[![distributed-cifar10](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-distributed-training-distributed-cifar10.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-distributed-training-distributed-cifar10.yml)|
|jobs|single-step|[pytorch-iris](jobs/single-step/pytorch/iris/pytorch-iris.ipynb)|Run Command to train a neural network with PyTorch on Iris dataset|[![pytorch-iris](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-iris-pytorch-iris.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-iris-pytorch-iris.yml)|
|jobs|single-step|[train-hyperparameter-tune-deploy-with-pytorch](jobs/single-step/pytorch/train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb)|Train, hyperparameter tune, and deploy a PyTorch model to classify chicken and turkey images to build a deep learning neural network (DNN) based on PyTorch's transfer learning tutorial.|[![train-hyperparameter-tune-deploy-with-pytorch](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-train-hyperparameter-tune-deploy-with-pytorch-train-hyperparameter-tune-deploy-with-pytorch.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-jobs-single-step-pytorch-train-hyperparameter-tune-deploy-with-pytorch-train-hyperparameter-tune-deploy-with-pytorch.yml)|
@@ -0,0 +1,19 @@
### yolov5-AzureML
#### Training YOLOv5 with custom data in Azure Machine Learning using Python SDK v2

This repo contains:

1) **objectdetectionAzureML.ipynb** - a notebook that implements PyTorch distributed training of YOLOv5 models using Azure ML services (Python SDK v2). The input data is described by a YAML file that gives the location of the data in the format required by the "train.py" file from https://github.com/ultralytics/yolov5.

2) The data files provided by ultralytics-yolov5 download the data and arrange it into the folders required by the "train.py" file.

3) If you have a custom dataset, or would like to work on an open dataset that is not among the data files provided by YOLOv5, the repo also contains an example data processing Python file, **dataprep_yolov5_format.py**, which prepares the data in the format required by YOLOv5.

4) The example data is downloaded from https://cvbp-secondary.z19.web.core.windows.net/datasets/object_detection/odFridgeObjects.zip

5) The zipped folder contains two folders, **annotations** and **images**. The annotations are in XML format; they are converted to the format required by YOLO and then split into train and validation folders. The flow is shown in **DataProcessingYolov5Format_example.png**.

6) A sample YAML file, "fridge.yaml", is also provided; it uses these datasets for training the model.
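
To make the conversion in step 5 concrete, here is a minimal sketch of how one XML bounding box (pixel corners) maps to the normalized center/size values that YOLO label files use. The function name `voc_to_yolo` and the box and image sizes are made up for illustration; the arithmetic follows the usual corner-to-center conversion.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Hypothetical annotation: a 100x200 px box inside a 400x800 px image.
box = voc_to_yolo(100, 200, 200, 400, img_w=400, img_h=800)
print(box)  # (0.375, 0.375, 0.25, 0.25)

# A YOLO label line is the class index followed by these four values.
print("0 " + " ".join(str(v) for v in box))  # 0 0.375 0.375 0.25 0.25
```

Because every value is divided by the image width or height, the labels stay valid if the images are later resized.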



@@ -0,0 +1 @@
["carton", "milk_bottle", "can", "water_bottle"]
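
The order of this list matters: the class index written into each label file is the position of the name in the list. As a rough sketch, a YOLOv5 dataset description needs the same names in the same order, plus the class count and the image folders. The helper `build_dataset_yaml`, the `path` value, and the folder names below are assumptions for illustration; the contents of the commit's `fridge.yaml` are not shown on this page.

```python
import json


def build_dataset_yaml(root, classes, train_dir="images/train", val_dir="images/valid"):
    """Render a minimal YOLOv5-style dataset description as YAML text."""
    lines = [
        f"path: {root}",        # dataset root (assumed location)
        f"train: {train_dir}",  # training images, relative to path
        f"val: {val_dir}",      # validation images, relative to path
        f"nc: {len(classes)}",  # number of classes
        # A JSON list is also a valid YAML flow sequence.
        f"names: {json.dumps(classes)}",
    ]
    return "\n".join(lines) + "\n"


classes = ["carton", "milk_bottle", "can", "water_bottle"]
yaml_text = build_dataset_yaml("../datasets/fridgedata", classes)
print(yaml_text)
```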
@@ -0,0 +1,173 @@
import os
import urllib.request as request
from zipfile import ZipFile
import argparse
import json
import numpy as np
import PIL.Image as Image
import xml.etree.ElementTree as ET
import glob
import random
import shutil

url = "https://cvbp-secondary.z19.web.core.windows.net/datasets/object_detection/odFridgeObjects.zip"

data_folder = "./yolov5/data"
print("data_folder found")
data_file = "odFridgeObjects.zip"
f_loc = data_folder + "/" + data_file.split(".")[0]

input_dir = "./yolov5/data/odFridgeObjects/annotations/"
output_dir = "./yolov5/data/odFridgeObjects/labels/"
image_dir = "./yolov5/data/odFridgeObjects/images/"

processed_folder = "datasets"
split_ratio = 0.8


def downloaddata(url, data_folder, data_file):
    # create the folder the archive is written to (not a hard-coded "data")
    os.makedirs(data_folder, exist_ok=True)
    fname = data_folder + "/" + data_file
    request.urlretrieve(url, filename=fname)
    with ZipFile(fname, "r") as archive:
        print("extracting files...")
        archive.extractall(path=data_folder)
        print("done")
    # os.remove(fname)  # optionally delete the archive after extraction


# ************************** xml2yolo from https://gist.github.com/wfng92/c77c822dad23b919548049d21d4abbb8#file-xml2yolo-py ****************


def xml_to_yolo_bbox(bbox, w, h):
    # bbox is [xmin, ymin, xmax, ymax] in pixels; w, h are the image size
    x_center = ((bbox[2] + bbox[0]) / 2) / w
    y_center = ((bbox[3] + bbox[1]) / 2) / h
    width = (bbox[2] - bbox[0]) / w
    height = (bbox[3] - bbox[1]) / h
    return [x_center, y_center, width, height]


def yolo_to_xml_bbox(bbox, w, h):
    # bbox is [x_center, y_center, width, height], normalized to 0..1
    w_half_len = (bbox[2] * w) / 2
    h_half_len = (bbox[3] * h) / 2
    xmin = int((bbox[0] * w) - w_half_len)
    ymin = int((bbox[1] * h) - h_half_len)
    xmax = int((bbox[0] * w) + w_half_len)
    ymax = int((bbox[1] * h) + h_half_len)
    return [xmin, ymin, xmax, ymax]


def xml2yolo(input_dir, output_dir, image_dir):
    classes = []
    # create the labels folder (output directory)
    dirExists(output_dir)
    # identify all the xml files in the annotations folder (input directory)
    files = glob.glob(os.path.join(input_dir, "*.xml"))
    # loop through each
    for fil in files:
        basename = os.path.basename(fil)
        filename = os.path.splitext(basename)[0]
        # check that the annotation has a corresponding image file
        if not os.path.exists(os.path.join(image_dir, f"{filename}.jpg")):
            print(f"{filename} image does not exist!")
            continue
        result = []
        # parse the content of the xml file
        tree = ET.parse(fil)
        root = tree.getroot()
        width = int(root.find("size").find("width").text)
        height = int(root.find("size").find("height").text)

        for obj in root.findall("object"):
            label = obj.find("name").text
            # check for new classes and append to list
            if label not in classes:
                classes.append(label)
            index = classes.index(label)
            pil_bbox = [int(x.text) for x in obj.find("bndbox")]
            yolo_bbox = xml_to_yolo_bbox(pil_bbox, width, height)
            # convert data to string
            bbox_string = " ".join([str(x) for x in yolo_bbox])
            result.append(f"{index} {bbox_string}")

        if result:
            # generate a YOLO format text file for each xml file
            with open(
                os.path.join(output_dir, f"{filename}.txt"), "w", encoding="utf-8"
            ) as f:
                f.write("\n".join(result))
    # generate the classes file as reference
    with open("classes.txt", "w", encoding="utf8") as f:
        f.write(json.dumps(classes))


# *********************** rearranging folders for yolov5 https://stackoverflow.com/questions/66238786/splitting-image-based-dataset-for-yolov3 *******************


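def dirExists(name):
    # create the directory if it does not exist yet
    if not os.path.isdir(name):
        os.mkdir(name)


def move(paths, folder):
    # despite the name, files are copied, so the originals stay in place
    for p in paths:
        shutil.copy(p, folder)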


def formatFolderStruct(f_loc, processed_folder, split_ratio):
    # Get all paths to your image files and text files.
    # glob returns files in arbitrary order, so sort both lists to keep each
    # image aligned with its label (assumes every image has a label file).
    PATH = f_loc + "/"
    img_paths = sorted(glob.glob(PATH + "images/*.jpg"))
    txt_paths = sorted(glob.glob(PATH + "labels/*.txt"))

    # Calculate number of files for training, validation
    data_size = len(img_paths)
    r = split_ratio
    train_size = int(data_size * r)

    # Now split them
    train_img_paths = img_paths[:train_size]
    train_txt_paths = txt_paths[:train_size]
    valid_img_paths = img_paths[train_size:]
    valid_txt_paths = txt_paths[train_size:]

    # Copy them to train, valid folders
    dirExists("./yolov5/datasets")
    dirExists("./yolov5/datasets/fridgedata/")
    newpath_images = "./yolov5/datasets/fridgedata/images"
    dirExists(newpath_images)
    newpath_labels = "./yolov5/datasets/fridgedata/labels"
    dirExists(newpath_labels)

    train_images = newpath_images + "/train/"
    valid_images = newpath_images + "/valid/"
    train_label = newpath_labels + "/train/"
    valid_label = newpath_labels + "/valid/"

    dirExists(train_images)
    dirExists(valid_images)
    dirExists(train_label)
    dirExists(valid_label)

    move(train_img_paths, train_images)
    move(train_txt_paths, train_label)
    move(valid_img_paths, valid_images)
    move(valid_txt_paths, valid_label)


# downloaddata(url, data_folder, data_file)
# print("data downloaded")

# xml2yolo(input_dir, output_dir, image_dir)
# print("formatted to yolo")

formatFolderStruct(f_loc, processed_folder, split_ratio)
print("formatted folder structure")
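
One thing to watch in the split step: it slices the image list and the label list separately, so the two lists must line up file for file. A possible alternative, sketched below under the assumption that an image and its label share a file stem, is to pair the files first and split the pairs. `paired_split` is a hypothetical helper, not part of this commit, and the seeded shuffle is an addition the original script does not do.

```python
import random
from pathlib import Path


def paired_split(image_paths, label_paths, ratio=0.8, seed=0):
    """Split image/label pairs so each image stays with its own label file."""
    labels_by_stem = {Path(p).stem: p for p in label_paths}
    # Keep only images that have a label with the same file stem.
    pairs = [
        (img, labels_by_stem[Path(img).stem])
        for img in sorted(image_paths)
        if Path(img).stem in labels_by_stem
    ]
    # Seeded shuffle so the split is random but reproducible.
    random.Random(seed).shuffle(pairs)
    cut = int(len(pairs) * ratio)
    return pairs[:cut], pairs[cut:]


# Hypothetical file names; the labels are deliberately listed in a different order.
images = [f"images/{i}.jpg" for i in range(1, 11)]
labels = [f"labels/{i}.txt" for i in range(10, 0, -1)]
train, valid = paired_split(images, labels)
print(len(train), len(valid))  # 8 2
```

Pairing by stem also drops images without a label file, which matters here because `xml2yolo` writes no label file for an annotation that contains no objects.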