Register custom dataset -AssertionError #1647

Closed
andresviana opened this issue Jun 20, 2020 · 5 comments

andresviana commented Jun 20, 2020

❓ How to register custom dataset in detectron2

Hi guys, I am working in Colab and would like to learn how to register my custom dataset in detectron2.
https://rosenfelder.ai/Instance_Image_Segmentation_for_Window_and_Building_Detection_with_detectron2/#prepare-the-data

Inputs
I used via.html to make the annotations and saved them in two JSON files (train and val).
Each image has street and hole labels.
The images are located in Colab in train and val folders; each folder contains the images and a JSON annotation file.

I found a function in a beginner's tutorial that converts them into a format usable by detectron2.

Expected output: the function converts the annotated images into a format that detectron2 can use.

Code implemented
import json
import os

import cv2
import numpy as np
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.structures import BoxMode


def get_street_dicts(img_dir):
    """This function loads the JSON file created with the annotator and converts it to
    the detectron2 metadata specifications.
    """
    # load the JSON file
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)

    dataset_dicts = []
    # loop through the entries in the JSON file
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        # add file_name, image_id, height and width information to the records
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]

        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width

        annos = v["regions"]

        objs = []
        # one image can have multiple annotations, therefore this loop is needed
        for annotation in annos:
            # reformat the polygon information to fit the specifications
            anno = annotation["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]

            region_attributes = annotation["region_attributes"]["class"]

            # specify the category_id to match with the class.
            if "street" in region_attributes:
                category_id = 1
            elif "hole" in region_attributes:
                category_id = 0
            else:
                # skip regions with an unknown class instead of reusing a stale id
                continue

            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": category_id,
                "iscrowd": 0,
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts

for d in ["train", "val"]:
    DatasetCatalog.register("streets_" + d, lambda d=d: get_street_dicts("/content/potholes/" + d))
street_metadata = MetadataCatalog.get("streets_train")
dataset_dicts = get_street_dicts("/content/potholes/train")
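
The polygon and bounding-box conversion in the function above can be checked on its own, without detectron2 or the image files. The following is only a minimal sketch under assumptions: the helper name `via_region_to_polygon_and_bbox` and the sample region are made up for illustration, and the box is returned as a plain list in the (x_min, y_min, x_max, y_max) order that BoxMode.XYXY_ABS expects.

```python
def via_region_to_polygon_and_bbox(shape_attributes):
    """Convert one VIA polygon region into a flat polygon and an XYXY box.

    Hypothetical helper: it mirrors the per-annotation logic of
    get_street_dicts but does not depend on detectron2.
    """
    px = shape_attributes["all_points_x"]
    py = shape_attributes["all_points_y"]
    # Shift each vertex to the pixel center, then flatten to [x0, y0, x1, y1, ...].
    poly = [coord for x, y in zip(px, py) for coord in (x + 0.5, y + 0.5)]
    # Absolute (x_min, y_min, x_max, y_max) box around the polygon.
    bbox = [min(px), min(py), max(px), max(py)]
    return poly, bbox


# Made-up triangular region, for illustration only.
region = {"all_points_x": [10, 30, 20], "all_points_y": [5, 5, 25]}
poly, bbox = via_region_to_polygon_and_bbox(region)
```

Here `poly` holds the three vertices as six flat coordinates and `bbox` spans the smallest and largest x and y values of the region.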

AssertionError
AssertionError Traceback (most recent call last)
in ()
59 from detectron2.data import DatasetCatalog, MetadataCatalog
60 for d in ["train", "val"]:
---> 61 DatasetCatalog.register("streets_" + d,lambda d=d: get_street_dicts("/content/potholes/", d))
62 street_metadata = MetadataCatalog.get("streets_train")
63 dataset_dicts = get_street_dicts("/content/potholes/train")
/content/detectron2_repo/detectron2/data/catalog.py in register(name, func)
38 assert callable(func), "You must register a function with DatasetCatalog.register!"
39 assert name not in DatasetCatalog._REGISTERED, "Dataset '{}' is already registered!".format(
---> 40 name
41 )
42 DatasetCatalog._REGISTERED[name] = func
AssertionError: Dataset 'streets_train' is already registered!

Thanks for checking it.

@ppwwyyxx
Contributor

As the error says, the dataset is already registered. Registering it again is expected to cause this error.
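
In a notebook this typically happens when the registration cell is run a second time in the same session. A minimal sketch of a guard is shown here, assuming a stand-in class: `TinyCatalog` and `register_once` are hypothetical names that only mimic the duplicate-name assertion from the traceback so the example runs without detectron2. With detectron2 itself, the same idea would be to check `DatasetCatalog.list()` before calling `DatasetCatalog.register`.

```python
class TinyCatalog:
    """Hypothetical stand-in that mimics the duplicate-name check in the traceback."""

    def __init__(self):
        self._registered = {}

    def register(self, name, func):
        assert callable(func), "You must register a function!"
        assert name not in self._registered, (
            "Dataset '{}' is already registered!".format(name)
        )
        self._registered[name] = func

    def list(self):
        return list(self._registered.keys())


def register_once(catalog, name, func):
    """Register `func` under `name` only if the name is still free."""
    if name in catalog.list():
        return False  # already registered, leave the existing entry alone
    catalog.register(name, func)
    return True


catalog = TinyCatalog()


def run_cell():
    """Simulates the notebook cell that registers the train and val splits."""
    return [
        register_once(catalog, "streets_" + d, lambda d=d: [])
        for d in ["train", "val"]
    ]


first = run_cell()   # first execution registers both names
second = run_cell()  # re-running the cell no longer raises AssertionError
```

The guard keeps the first registered function, so it is only appropriate when the loader function has not changed between runs.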

@RishiMalhotra920

How do you unregister a dataset? I'm creating a huge number of datasets because I can't unregister them.
Even better, how do I attach a new create_dataset_dicts function to an already registered dataset?

@ghazni123

@RishiMalhotra920
I am facing the same issue and there is no documentation covering it. Did you manage to find out how to unregister a dataset? Thanks.

@rogertrullo

@ghazni123 @RishiMalhotra920 I was having the same issue; after reading the code I found there is a clear() method that you can use to unregister:
DatasetCatalog.clear()

@aniruddhakal

@RishiMalhotra920 @rogertrullo @ppwwyyxx
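You may find these 3 methods helpful:
DatasetCatalog.list() - lists the names of all registered datasets.
DatasetCatalog.get('coco_instance_name') - calls the registered function and returns the dataset dicts.
DatasetCatalog.remove('coco_instance_name') - removes the registration for that name.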

You could use something like this to remove and re-register:

dataset_name = 'coco_dataset'

if dataset_name in DatasetCatalog.list():
    DatasetCatalog.remove(dataset_name)

register_coco_instances(dataset_name, ...)
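
The same remove-then-register pattern also covers the earlier question about attaching a new function to an already registered name. The sketch below is an illustration under assumptions, not detectron2 code: `TinyCatalog` is a hypothetical stand-in exposing only the method names mentioned in this thread (list, get, remove, clear, register), and `reregister` is a made-up helper. It also shows that clear() drops every registration, not just one.

```python
class TinyCatalog:
    """Hypothetical stand-in exposing the method names mentioned in this thread."""

    def __init__(self):
        self._registered = {}

    def register(self, name, func):
        assert name not in self._registered, (
            "Dataset '{}' is already registered!".format(name)
        )
        self._registered[name] = func

    def list(self):
        return list(self._registered)

    def get(self, name):
        # Call the registered loader and return whatever it produces.
        return self._registered[name]()

    def remove(self, name):
        self._registered.pop(name)

    def clear(self):
        self._registered.clear()


def reregister(catalog, name, func):
    """Replace the loader behind `name`, registering it if it is new."""
    if name in catalog.list():
        catalog.remove(name)
    catalog.register(name, func)


catalog = TinyCatalog()
reregister(catalog, "streets_train", lambda: [{"image_id": 0}])
# Swap in a different loader under the same name.
reregister(catalog, "streets_train", lambda: [{"image_id": 0}, {"image_id": 1}])
reregister(catalog, "streets_val", lambda: [])

n_after_swap = len(catalog.get("streets_train"))

catalog.clear()  # removes all registrations at once
remaining = catalog.list()
```

Removing a single name is the narrower tool when other datasets should stay registered; clear() is simpler when the whole session should start from scratch.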

github-actions bot locked this issue as resolved and limited conversation to collaborators on Sep 27, 2021