RCNN blog #27
|
Why I started working on RCNN
I have been working as an AI engineer for 5 years. Most of my projects are related to computer vision, and almost all of them involve object detection. I have collected data for, trained, fine-tuned, and deployed object detection models many times. But I never got a chance to implement an object detection model from scratch, because doing so is not efficient when open-source implementations are already available.
Is it necessary to implement?
I do not know about others, but for me, implementing a model or algorithm from scratch helps me understand it far better than just reading about it. For production, we do not need to implement it from scratch; we can reuse available implementations and libraries. But for my own learning, I find that implementing an architecture is the best way to get to know it.
Why an old algorithm like RCNN?
For several reasons,
What kind of skills might it show?
If I had implemented a SOTA algorithm, it might have shown my skills in the latest algorithms. I get that. But with limited time left after full-time work and other things (family, learning other AI topics, sports, etc.), I needed to start with an old and easy one. All in all, this implementation has taught me several things
TODO
|
TODO 2023/03/07
|
Update 2023/03/08
TODO 2023/03/08
|
Update 2023/03/09
RCNN description
The RCNN model is not an end-to-end model, i.e. we cannot feed the annotated dataset in at one end and expect the model to figure out the rest. Rather, training is a multi-step process. The steps are described below.
Dataset Preparation
For the dataset preparation, we extract regions with selective search and keep the regions whose IoU with a ground-truth box is greater than a certain threshold. A region may overlap with boxes of multiple classes. For example, in a picture with both a dog and a cat, a region may overlap both the dog box and the cat box. In that case, we assign the region the label of the box with the maximum IoU.
[Add flowchart for data preparation]
Pseudocode
for image, bboxes, labels from dataset
    regions <- selective_search(image)
    for region in regions
        max_region_iou <- 0
        for bbox, label in bboxes, labels
            region_iou <- get_iou(region, bbox)
            if region_iou > max_region_iou
                max_region_iou <- region_iou
                max_region_label <- label
        if max_region_iou > upper_iou_threshold
            save_the_region_in_the_respective_dir(region, max_region_label)
        elif max_region_iou < lower_iou_threshold
            save_the_region_as_background(region)
The code is inefficient. I hope to optimize it later.
Command
python3 /src/prepare_data.py {voc2007,voc2012} --ss_method SS_METHOD --num_rects NUM_RECTS --output OUTPUTDIRECTORY --data_batch_size DATABATCHSIZE --split {train,test,validation} --upper_iou_thresh UPPER_IOU --lower_iou_thresh LOWER_IOU --minimum_bg MINIMUM_SIZE_OF_BACKGROUND_IMAGE
TODO
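The labelling rule from the dataset preparation pseudocode can be sketched in Python roughly as follows. This is a minimal sketch, not the code in `prepare_data.py`: it assumes boxes are `(x_min, y_min, x_max, y_max)` tuples, and `label_region` is a hypothetical helper that returns the label instead of saving the crop to disk. The thresholds 0.5 and 0.3 in the example are illustrative only.

```python
from typing import Optional, Sequence, Tuple

# Assumption: boxes are (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def get_iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_region(
    region: Box,
    bboxes: Sequence[Box],
    labels: Sequence[str],
    upper_iou_threshold: float,
    lower_iou_threshold: float,
) -> Optional[str]:
    """Hypothetical helper: class label, "background", or None (discard)."""
    max_iou, max_label = 0.0, None
    for bbox, label in zip(bboxes, labels):
        iou = get_iou(region, bbox)
        if iou > max_iou:
            max_iou, max_label = iou, label
    if max_iou > upper_iou_threshold:
        return max_label
    if max_iou < lower_iou_threshold:
        return "background"
    return None  # ambiguous overlap: neither object nor background


bboxes = [(0, 0, 10, 10), (20, 20, 40, 40)]
labels = ["dog", "cat"]
print(label_region((1, 1, 10, 10), bboxes, labels, 0.5, 0.3))     # dog
print(label_region((0, 0, 4, 10), bboxes, labels, 0.5, 0.3))      # None
print(label_region((50, 50, 60, 60), bboxes, labels, 0.5, 0.3))   # background
```

Regions whose best IoU falls between the two thresholds are dropped, which keeps partially overlapping crops out of both the object classes and the background class.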
|
Update 2023/03/10
TODO
if revision done
|
Update 2023/03/13
Model
In the original RCNN paper, they used
My implementation
Dataset
The original paper trained the model on VOC2007 and VOC2012 [confirm it]. I started training on VOC2012, but the evaluation metrics were very poor at the beginning, for several reasons (I will explain the challenges later). After some time, I realized from the confusion matrix (add confusion matrix) that the model was working well only on pictures of dogs and cats. So, for simplicity, I decided to work only on those two classes for now; later I may increase the complexity.
Steps
TODO
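Restricting the dataset to dogs and cats amounts to dropping every annotation of another class before the regions are labelled. A minimal sketch, assuming annotations are parallel lists of boxes and VOC class-name strings; `filter_annotations` and `KEPT_CLASSES` are hypothetical names, not taken from the repository:

```python
from typing import List, Sequence, Set, Tuple

Box = Tuple[float, float, float, float]

# Assumption: labels use the lowercase VOC class names.
KEPT_CLASSES = {"dog", "cat"}


def filter_annotations(
    bboxes: Sequence[Box],
    labels: Sequence[str],
    kept: Set[str] = KEPT_CLASSES,
) -> Tuple[List[Box], List[str]]:
    """Keep only the (bbox, label) pairs whose label is in `kept`."""
    pairs = [(bbox, label) for bbox, label in zip(bboxes, labels) if label in kept]
    kept_bboxes = [bbox for bbox, _ in pairs]
    kept_labels = [label for _, label in pairs]
    return kept_bboxes, kept_labels


bboxes = [(10, 10, 50, 50), (60, 20, 120, 90), (5, 5, 30, 40)]
labels = ["dog", "person", "cat"]
print(filter_annotations(bboxes, labels))
# ([(10, 10, 50, 50), (5, 5, 30, 40)], ['dog', 'cat'])
```

Images left with no annotation after filtering can either be skipped or used only as a source of background regions; which of the two fits better depends on how many background crops are needed.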
|
Update 2023/03/15
TODO
|
Update 2023/03/27
Training model
Training the model is as simple as training a CNN model. We feed in the images per class.
TODO
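Feeding in images per class usually starts with turning the per-class folders into a list of (path, class index) pairs. The sketch below assumes the region crops were saved as `.jpg` files in one subdirectory per class (including a `background` folder); both the layout and the helper name `list_samples` are assumptions, not the repository's actual code.

```python
from pathlib import Path
from typing import Dict, List, Tuple, Union


def list_samples(root: Union[str, Path]) -> Tuple[List[Tuple[Path, int]], Dict[str, int]]:
    """Collect (image_path, class_index) pairs from per-class subdirectories.

    Assumed layout: root/<class_name>/<crop>.jpg
    """
    root = Path(root)
    # Sorting makes the class-to-index mapping stable across runs.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}
    samples: List[Tuple[Path, int]] = []
    for name in class_names:
        for path in sorted((root / name).glob("*.jpg")):
            samples.append((path, class_to_index[name]))
    return samples, class_to_index
```

The returned list can then be shuffled and batched by whichever training framework is in use; the class index doubles as the classification target.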
|
Update 2023/04/04
Training model
In the paper, they selected the
[Add block diagram] TODO
|
Update 2023/04/08
Block diagram
TODO
|
Update 2023/04/12
Initial result, challenges faced
After training on
If you check the confusion matrix carefully, most objects were not classified correctly! There were biases towards certain classes. This seemed problematic, so I tried to debug the issue. To make things easier, I chose only two classes,
Another problem was that, because of the single IoU threshold, many regions with an IoU close to it (like 0.45) were selected as background. To make sure background regions were really background
TODO
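A bias towards certain classes shows up in a confusion matrix as one column collecting most of the predictions while the recall of the other classes stays low. A minimal pure-Python sketch of that check follows; the labels and predictions are made-up toy values, not results from this project.

```python
from typing import Dict, List, Sequence


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], classes: Sequence[str]
) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_recall(matrix: List[List[int]], classes: Sequence[str]) -> Dict[str, float]:
    """Fraction of each true class that was predicted correctly."""
    recalls = {}
    for i, c in enumerate(classes):
        total = sum(matrix[i])
        recalls[c] = matrix[i][i] / total if total else 0.0
    return recalls


# Toy data for illustration only.
classes = ["background", "cat", "dog"]
y_true = ["cat", "cat", "dog", "dog", "background", "background"]
y_pred = ["cat", "dog", "dog", "dog", "dog", "background"]

matrix = confusion_matrix(y_true, y_pred, classes)
print(matrix)                             # [[1, 0, 1], [0, 1, 1], [0, 0, 2]]
print(per_class_recall(matrix, classes))  # {'background': 0.5, 'cat': 0.5, 'dog': 1.0}
```

In this toy example the `dog` column receives four of the six predictions, which is the kind of pattern that points to a class bias rather than random errors.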
|
Update 2023/04/17
TODO
|
Update 2023/04/19
TODO
|
Objective
This issue is to work on RCNN blog.
Tasks