
Getting started with Vision #7

Open
schristian opened this issue Jan 8, 2013 · 8 comments
@schristian
Contributor

So, up until now I've been mostly working with the AI code. I've been meaning to learn how to work with the vision code for a while, but I've been too busy working with the new software people and doing things for classes. Now that it's winter break, I can actually devote time to learning how the vision system works. Specifically, I was going to try and implement the underwater vision filter from the article one of you brought up some time ago. I've already looked into OpenCV a little bit. As for our code, Eliot told me that you were the people to go to for advice.

So, my question is, what should I do to get started with learning about the vision system? What kind of things should I look at to get a feel for how it works?

@jwonders
Contributor

jwonders commented Jan 9, 2013

The vision system as a whole can be confusing. It is entirely possible to implement the algorithm in that paper without knowing anything about the vision system. I would suggest initially writing the code in a sandbox and later integrating it into the system. It will be easier to test and play around with if it stands alone at first.

If you want to know about the architecture, I can try and describe it in some more detail later. One of the main things to note is that the vision system has a set of detectors which can individually be turned on and off. When a detector is on, it is fed images from one of the cameras. It processes this image and publishes results as events which can be picked up by the python code, the estimation code, etc. The set of available detectors is hard coded. The detectors are configurable though.
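To make that concrete, here is a rough, hypothetical sketch of the detector idea (these class and function names are made up for illustration; the real Detector, VisionRunner, and event classes in our tree have different names and signatures):

#include <opencv2/core/core.hpp>
#include <string>
#include <vector>

// A stand-in for the events the real system publishes to the python and
// estimation code; the real event objects carry much richer payloads.
struct VisionEventSketch {
    std::string name;   // e.g. "BUOY_FOUND"
    double x, y;        // image coordinates of whatever was found
};

// Each detector consumes one frame and may publish zero or more events.
class DetectorSketch {
public:
    virtual ~DetectorSketch() {}
    virtual void processImage(const cv::Mat& frame,
                              std::vector<VisionEventSketch>& eventsOut) = 0;
};

// Runner-like loop: only detectors that are currently switched on get fed
// frames from a camera; everything they find comes back as events.
void runActiveDetectors(const cv::Mat& frame,
                        std::vector<DetectorSketch*>& activeDetectors,
                        std::vector<VisionEventSketch>& eventsOut)
{
    for (size_t i = 0; i < activeDetectors.size(); ++i)
        activeDetectors[i]->processImage(frame, eventsOut);
}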

Code to look at in order to understand the architecture:

  • VisionSystem - the actual subsystem implementation. Sets up the cameras, detectors, and vision runners.
  • VisionRunner - does the grunt work for turning on and off detectors and getting images from a single camera to the detectors that are on.
  • Camera - an image source
  • Recorder - an image sink
  • Detector - an object that takes in an image and publishes output about what it finds

Code to look at to understand how we currently find things from a technical standpoint:

  • ColorFilter - filters out rectangular regions of the color space
  • BlobDetector - finds connected components in binary images (black and white)
  • BasicGladiatorDetector - uses a naive Bayesian approach to selecting the best symbol
  • BinDetector - ties together some of these components and publishes bin locations and symbol types

Keep asking questions as you run into them. There are many places where architectural improvements could be made. I personally dislike the Image interface. I think you should use the new OpenCV API i.e. cv::Mat. Somewhere, we should add a conversion function to go between cv::Mat and Image.
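For that conversion, a rough sketch of the OpenCV 2.x side is below. How our Image class exposes its pixel data is an assumption here (the asIplImage() accessor is hypothetical); the cvarrToMat call and the Mat-to-IplImage conversion are standard OpenCV 2.x:

#include <opencv2/core/core.hpp>

// Wrapping an existing IplImage in a cv::Mat is cheap (no pixel copy).
// Hypothetically: IplImage* ipl = image->asIplImage();
cv::Mat wrapIplImage(IplImage* ipl)
{
    return cv::cvarrToMat(ipl, /*copyData=*/false);
}

// Going the other way, OpenCV 2.x lets a cv::Mat produce an IplImage header
// that shares the same pixel data (the Mat must outlive the header).
IplImage toIplHeader(const cv::Mat& mat)
{
    IplImage header = mat;
    return header;
}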

@gsulliva

gsulliva commented Jan 9, 2013

If you want to look at how a detector works and uses all those together, it might be good to trace through the BuoyDetector code (start at the update() function), since that is a good standard example of how we currently use the color filter/blobs.

In general, it grabs an image, processes it for red blobs, publishes an event if it finds anything, and then goes through the same process for green and then yellow.

That process consists of first grabbing the config values for the "positive" range of pixel values (YUV in this case, but other color spaces like RGB can be used). These config values are the ones we find using the VisionTool, and are essentially user-defined pixel value ranges for what we will consider "acceptable" as the color of buoy we are looking for. With those config values, we iterate through each pixel of the image, determine whether each color channel of that pixel falls within the range, and create a binary/boolean image (an image with 1 bit corresponding to each pixel). This is done using the ColorFilter class.

We then take this binary image and toss it into the BlobDetector class, which iterates through the image and groups together any positive/white pixels that are touching. The BlobDetector will then spit out several "blobs": data for each group of positive/white pixels, such as the x-coordinate of the left-most pixel, the y-coordinate of the highest pixel, how many positive pixels there are, etc. With these blobs, the detector looks through each one and sees whether it fits the other config values (the blob must be of a certain size, and must have a certain percentage of positive pixels). If it fits all of them, we accept it as a buoy and publish information for it. (Note: if I remember correctly, we search by largest first and only publish one of each kind of buoy, so we publish the largest one we find and ignore any other blobs that may be positive.)
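For a sandbox version of that pipeline, the C++ OpenCV API already has rough equivalents of the ColorFilter and BlobDetector steps (inRange for the box in color space, findContours as a stand-in for connected components). This is just a standalone sketch, not our actual detector code; the thresholds and minimum-size value are made-up placeholders, not our real config values:

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
#include <vector>

int main(int argc, char** argv)
{
    if (argc < 2) { std::cerr << "usage: " << argv[0] << " image" << std::endl; return 1; }

    cv::Mat bgr = cv::imread(argv[1]);   // OpenCV loads images as BGR
    if (bgr.empty()) { std::cerr << "could not load image" << std::endl; return 1; }

    // ColorFilter-style step: convert to YUV and keep pixels inside a box in
    // color space, producing a binary image (ranges below are placeholders).
    cv::Mat yuv, binary;
    cv::cvtColor(bgr, yuv, CV_BGR2YUV);
    cv::inRange(yuv, cv::Scalar(0, 80, 140), cv::Scalar(255, 130, 200), binary);

    // BlobDetector-style step: group touching positive pixels into blobs.
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(binary, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // Keep the largest blob that passes a (made-up) minimum-size check.
    double bestArea = 0.0;
    cv::Rect best;
    for (size_t i = 0; i < contours.size(); ++i) {
        double area = cv::contourArea(contours[i]);
        if (area > 500.0 && area > bestArea) {
            bestArea = area;
            best = cv::boundingRect(contours[i]);
        }
    }

    if (bestArea > 0.0)
        std::cout << "candidate blob at " << best.x << "," << best.y
                  << " area " << bestArea << std::endl;
    return 0;
}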

The Bin's vision also does some similar stuff (at least in that it uses a basic color filter), but it also does some extra stuff. I'll explain that in detail too in a later post.

And yes, definitely work in the sandbox at first; you don't want the added challenge of system integration when trying something new. Also, the more C++-style OpenCV API (cv::Mat and the like) is much easier to use, but you may have trouble finding examples using it. Avoid using IplImage, in my opinion. I updated the build system to use a new API version a while ago (I think we're currently stuck on 2.1?) and a lot has changed, so when I get around to double-checking my work and committing that, be prepared for some hiccups.

@schristian
Contributor Author

Alright, sounds good. I'll give BuoyDetector as well as those other files a look over soon, and get some practice with the sandbox.

@jwonders
Contributor

I've made a wiki page for the important information that comes out of this discussion. https://github.com/robotics-at-maryland/tortuga/wiki/Vision-System

@jsternberg
Contributor

A good project for getting familiar with OpenCV, and one that would be really helpful, is updating the code base to work with a newer version of OpenCV.

Compile and work with either 2.3 or 2.4. Install them as a dependency to /opt/ram/local and upload them to the site. Modify the bootstrap script to not download the ones from the Ubuntu repository and modify CMake to use these files instead. You can just hard code this for now. I can likely help more with the CMake files at some point, but doing anything non-trivial in CMake is a pain that requires digging through documentation for 3 hours and then killing yourself, which I wasn't planning to do this week.

Honestly, familiarity with OpenCV will likely get you pretty far. Although you do need to understand some of the concepts that exist in OpenCV, you don't need to code them yourself (unless that helps you understand them).

@schristian
Contributor Author

Yeah, what I've mostly been doing is looking through the tutorials on the OpenCV site to get familiar with all the concepts. I'll give updating to OpenCV 2.4 a go at some point. Are there any other good resources that you would suggest?

@jsternberg
Contributor

I would say the steps are as follows:

  1. Compile OpenCV 2.4 with -DCMAKE_INSTALL_PREFIX=/opt/ram/local. This creates the dependency so you can start using it.
  2. Figure out how to compile a simple program using OpenCV. Maybe load an image, convert it to grayscale, and then save it (see the sketch after this list). You'll want to look for load, save, and cvtColor. The C++ interface is a lot easier to deal with than the C interface, but both will work (most of the older code uses the C interface).
  3. After you've figured out compiling on the command line, you'll need to mess with CMake to get it to pass the correct command line flags. I'll be your friend here.
  4. Start compiling the code and note wherever it breaks. That's what you'll be fixing. Make sure to fork the project and ask for code reviews as you go through this process so that we can nitpick and give you advice on style and function.
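For step 2, here is a minimal sketch of that load/convert/save program using the C++ interface (file names are placeholders):

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>   // imread/imwrite live here in 2.4
#include <iostream>

int main()
{
    cv::Mat color = cv::imread("input.png");   // load a color image
    if (color.empty()) {
        std::cerr << "failed to load input.png" << std::endl;
        return 1;
    }

    cv::Mat gray;
    cv::cvtColor(color, gray, CV_BGR2GRAY);    // convert to grayscale
    cv::imwrite("output.png", gray);           // save the result
    return 0;
}

Compiling it on the command line against the /opt/ram/local install should look something like g++ test.cpp -I/opt/ram/local/include -L/opt/ram/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui, which is roughly what you'll later get CMake to generate for you.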

@jwonders
Contributor

Compiling OpenCV 2.4 should be a breeze. You probably just want a vanilla version without worrying about adding optional libraries like Eigen or CUDA. If you have FFmpeg installed from the Ubuntu repository, it should pick it up.

git clone git://code.opencv.org/opencv.git
cd opencv
git checkout 2.4.3
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/opt/ram/local
make
make install

That should just work. It will place a FindOpenCV.cmake script into /opt/ram/local/share.

There aren't many good tutorial-like resources for the new API. The API documentation itself is pretty decent though. Please use the new API everywhere possible.

cv::imread(fileName) - this is what Sternberg means by load
cv::imwrite(fileName, mat) - this is what Sternberg means by save

I'll transfer the tutorials I've put on the old wiki to this one. If you can't figure out how to do something with the new API that you think you should be able to do, just ask. It will help me expand the tutorials.

As a note that will probably help you get the current code working with 2.4: CV_RGB can be found in "highgui.h". Make sure not to replace it with cvScalar, because cvScalar expects (b, g, r); CV_RGB is just a macro that changes the order of b and r.
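To make the ordering point concrete, here is a tiny sketch (the expansion shown in the comment is from the OpenCV 2.x C headers; the exact header CV_RGB lives in can vary between versions):

#include <opencv2/core/core.hpp>
#include <iostream>

int main()
{
    // OpenCV's native channel order is (b, g, r). CV_RGB(r, g, b) just
    // reorders its arguments into that layout, roughly:
    //   #define CV_RGB(r, g, b)  cvScalar((b), (g), (r), 0)
    // So "pure red" written by hand in OpenCV's ordering is:
    cv::Scalar red(0, 0, 255);   // b = 0, g = 0, r = 255
    std::cout << red[0] << " " << red[1] << " " << red[2] << std::endl;  // 0 0 255
    return 0;
}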
