This repository has been archived by the owner on Sep 21, 2018. It is now read-only.

Algorithm improvements to KD-Tree module: applications to ICP (Pranjal Kumar Rai) #1

Closed
jlblancoc opened this issue May 4, 2017 · 137 comments


@jlblancoc
Member

jlblancoc commented May 4, 2017

Initial description

Nanoflann is a C++ library provided by MRPT for building KD-trees. The library is a fork of the widely used FLANN C++ library and is mostly optimized for low-dimensional point clouds. Most ICP algorithms for robot SLAM and localization require finding nearest neighbors in 2D/3D point clouds. Support for dynamic point clouds and nearest-neighbour queries in non-flat spaces are two paramount features that are missing from the current nanoflann library.

See attached PDF.

5589797390254080_1491229396_MRPT_GSOC_Proposal.pdf

Final results:

@jlblancoc
Member Author

Student: @pranjal-rai

Main mentor: @jlblancoc
Backup mentors: @feroze

@pranjal-rai

Hi @jlblancoc and @feroze, thank you for giving me the opportunity to work on this project. I am planning to start reading the existing nanoflann code next week.
Please share any resources/pointers that might be helpful in understanding the library.

@jlblancoc
Member Author

👍 !!

Since this project will be all about efficiency, I would start preparing, in your repo/fork, a subdirectory with a suite of benchmarks: executable programs that repeat certain operations and measure the time elapsed, in microseconds or milliseconds. Typically, one operation is repeated N times and the total time is divided by N to get a better estimate. In our case, the tasks to be benchmarked are, at least, building a KD-tree and querying it for some random points. A couple of point-cloud datasets could be added from public repos as test cases.
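To be clear about the kind of harness I mean, a minimal sketch could look like this (the build/query calls are just placeholders for whatever we end up benchmarking, not existing nanoflann functions):

    // Minimal micro-benchmark sketch: repeat an operation N times and report
    // the average time per operation in microseconds.
    #include <chrono>
    #include <cstddef>
    #include <cstdio>

    template <typename Func>
    double avg_time_us(Func f, const size_t N = 100)
    {
        const auto t0 = std::chrono::high_resolution_clock::now();
        for (size_t i = 0; i < N; i++) f();  // repeat the operation N times
        const auto t1 = std::chrono::high_resolution_clock::now();
        // Divide by N to get a better estimate of the per-operation cost:
        return std::chrono::duration<double, std::micro>(t1 - t0).count() / N;
    }

    int main()
    {
        const double t_build = avg_time_us([] { /* build the KD-tree here */ });
        const double t_query = avg_time_us([] { /* run one kNN query here */ });
        std::printf("build: %.2f us   query: %.2f us\n", t_build, t_query);
        return 0;
    }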

You can take a look at the sources of mrpt-performance for "inspiration", but that is probably too complex for what we need here, which should be kept as simple as possible ;-)

@pranjal-rai

While playing with the code I found a possible bug, so I have created a pull request.

@jlblancoc
Member Author

Great! I've just answered you there.

@pranjal-rai

@jlblancoc

  • Suppose I need a new module to add dynamic support to the library. If this requirement can be easily met by changing an existing module, should I change the existing module itself or create a new one, so that performance does not drop when the library is used in static point-cloud mode? For example, I need a module that returns the address of the node in the KD-tree closest to a given query point; this can be done by modifying the existing findNeighbors(). So should I create another function for the task, or add another parameter to the existing function?
  • I had to implement two approaches for the rebuild threshold method and one approach for the logarithmic method. Should I code all of this in different branches and benchmark them later to keep the best method, or include all the approaches in the library and leave it up to the user to pick the required method?
  • We need executables to measure the time and validate correctness of the dynamic insertion approaches because the current tests do not handle dynamic clouds.

@jlblancoc
Member Author

I think that the current KD-tree index class should remain more or less as is, tailored to static datasets. A new class should be created for the dynamic algorithm since, if I recall correctly, it requires keeping an array/vector of KD-tree indices, so this latter class would use "composition" with the former one.

On the two approaches: I would prefer this new class to be unique, with some enum of the possible alternatives. I think this will be the best option in terms of code reuse. We can change this decision later on if it proves impractical, but in principle I always prefer having a single piece of code whose behavior can be modified simply via a parameter/variable, without code changes.
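Just to illustrate the intent (all names here are hypothetical, not existing nanoflann classes):

    // Sketch of a single dynamic-index class whose insertion strategy is
    // selected via an enum parameter instead of duplicating the class.
    enum class InsertionPolicy { RebuildThreshold, Logarithmic };

    struct DynamicIndexParams
    {
        InsertionPolicy policy = InsertionPolicy::Logarithmic;
        float rebuild_threshold = 2.0f;  // only used by RebuildThreshold
    };

    class DynamicKDTreeIndex
    {
       public:
        explicit DynamicKDTreeIndex(const DynamicIndexParams& p) : params_(p) {}

        void addPoints(/* range of new points */)
        {
            if (params_.policy == InsertionPolicy::RebuildThreshold)
            {
                // insert into the existing tree; rebuild when the fraction of
                // points inserted since the last rebuild exceeds the threshold
            }
            else
            {
                // logarithmic method: merge/rebuild only the smallest sub-trees
            }
        }

       private:
        DynamicIndexParams params_;
    };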

We need executables to measure the time and validate correctness of the dynamic insertion approaches because the current tests do not handle dynamic clouds.

Yes, sure!

@jlblancoc
Member Author

Hi @pranjal-rai !

How are you doing with the preliminary work on the datasets and/or performance tests? Do you need any help?

PS: Please keep your fork up to date with the latest accepted pull requests in the main nanoflann repo.

@pranjal-rai

pranjal-rai commented May 22, 2017

Hi @jlblancoc

I forked flann and experimented with the code a bit. I noticed that there are two codebases of interest to us at this point.

  • First is the general flann code, which is not specifically optimized for low-dimensional point clouds. This code has an existing addPoints module, which is basically an implementation of the rebuild threshold method used for dynamic insertions.

  • Second is the modified code which is specifically optimized for low-dimensional point clouds; this is also the source of nanoflann. The addPoints module in this code just rebuilds the entire tree after each insertion; it does not exploit any rebuild threshold.

I quickly looked through the tree-building procedure of both codebases. I think they use different approaches for building and searching the tree. I coded the rebuild threshold approach for nanoflann, but I found some cases where the proposed algorithm fails for nanoflann while passing for the general flann code. I am not sure whether the rebuild threshold approach can be applied to nanoflann; right now I am stuck on this part.

I also wrote a rough implementation of the logarithmic approach; its performance is much better than that of the general flann code for a large number of dynamic insertions.
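Roughly, the bookkeeping behind the logarithmic method looks like this (a simplified sketch of the idea, not the actual code in my branch):

    // Logarithmic method: keep a forest of static KD-trees whose sizes follow
    // the binary representation of the total point count; an insertion only
    // rebuilds the small trees, like a carry in binary addition.
    #include <cstddef>
    #include <utility>
    #include <vector>

    struct StaticTree  // stands in for a static KD-tree over a set of point ids
    {
        std::vector<size_t> point_ids;
        void build() { /* build a static KD-tree over point_ids */ }
    };

    struct LogForest
    {
        std::vector<StaticTree> trees;  // tree i is empty or holds ~2^i points

        void addPoint(size_t id)
        {
            std::vector<size_t> carry{id};
            size_t i = 0;
            // Merge equal-sized trees until an empty slot is found:
            while (i < trees.size() && !trees[i].point_ids.empty())
            {
                carry.insert(carry.end(), trees[i].point_ids.begin(),
                             trees[i].point_ids.end());
                trees[i].point_ids.clear();
                ++i;
            }
            if (i == trees.size()) trees.emplace_back();
            trees[i].point_ids = std::move(carry);
            trees[i].build();  // only this (small) tree is rebuilt
        }
        // A query is run against every non-empty tree and the best result kept.
    };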

I have updated my forked repository.

@pranjal-rai

Hi @jlblancoc

I have implemented the logarithmic approach for dynamic support and created a new branch with the proposed changes. I have tested it using random point clouds. For now I have added just one example file; I will add more once you review it.

@pranjal-rai

pranjal-rai commented May 31, 2017

Hi @jlblancoc

I compared the index build times of nanoflann and flann when inserting points dynamically. Flann is based on the rebuild threshold approach, whereas nanoflann uses the logarithmic approach.

[Plot: index build time vs. number of inserted points, flann (rebuild threshold) vs. nanoflann (logarithmic approach)]

@pranjal-rai

Hi @jlblancoc

Is there any comment on the logarithmic approach implementation?
I have experimented with the following four libraries to compare their performance with nanoflann in a later phase.

Out of these, only FLANN supports dynamic insertions/deletions, so I plan to illustrate the dynamic-support comparison only between FLANN and nanoflann. I will illustrate the other comparison results for all four libraries along with nanoflann.

@jlblancoc
Member Author

Hi!
Sorry for the gap... two weeks ago my second baby daughter was born so these have been really busy days! ;-)

You've made good progress, congrats! If you keep up this pace, I'm sure we'll achieve the project goals.

Some quick answers:

I compared the index build times of nanoflann and flann when inserting points dynamically. Flann is based on the rebuild threshold approach, whereas nanoflann uses the logarithmic approach.
(+ the figure)

Good results!
If you are using Octave or MATLAB for plotting, I recommend using something like:
plot(x,y,'r-'); hold on; plot(x,y,'r.');
so we can see "dots" at each data point and clearly tell how many samples those graphs contain.

Another issue is: inserting points one by one is a good test case in general, but in mobile robotics it's far more common to insert them in batches of 100s to 1000s (for 2D or 3D LiDAR scanners, respectively). Probably the general trend of the graphs won't change, but we would be reporting more realistic total time costs (many seconds seems too much; that's probably only because of the 1-by-1 approach). Give it a try and see how much the results are affected...

Is there any comment on the logarithmic approach implementation?

I haven't compiled and tested it, just visually inspected the code for now.

Firstly, the approach of defining the new classes seems reasonable. 👍
However, I can't easily tell how much of the code in the new classes is identical to the static index version. How much would you say it is?
I'm asking because code duplication is evil and a source of maintenance nightmares and bugs!
What do you think about refactoring the parts shared by the static & dynamic versions into one common shared base class/struct? Perhaps it's not possible for some reason and you discarded it; let me know.

In any case, common code could be refactored anyway via CRTP. Let me know if you want to learn more about this technique.

One more thing: the vector here is expected to grow as new points are added, right? When we move to the range of millions of points, resizing a std::vector has a significant cost due to reallocations. Consider using std::deque instead. Try running a benchmark using both options to see if it makes any significant difference but, unless std::deque ends up being worse in experimental measurements, we should prefer it over std::vector.
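Something as simple as this would already give a first number (just a sketch; adapt the element type and sizes to the real use case):

    // Quick std::vector vs. std::deque growth benchmark (illustrative only;
    // the outcome depends on the standard library and allocator used).
    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <deque>
    #include <vector>

    template <typename Container>
    double fill_time_ms(size_t n)
    {
        Container c;
        const auto t0 = std::chrono::high_resolution_clock::now();
        for (size_t i = 0; i < n; i++) c.push_back(static_cast<double>(i));
        const auto t1 = std::chrono::high_resolution_clock::now();
        return std::chrono::duration<double, std::milli>(t1 - t0).count();
    }

    int main()
    {
        const size_t N = 10000000;  // ~10M points, the range we care about
        std::printf("vector: %.1f ms\n", fill_time_ms<std::vector<double>>(N));
        std::printf("deque : %.1f ms\n", fill_time_ms<std::deque<double>>(N));
        return 0;
    }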

Also: in connection with the robotics applications, please consider adding an alternative addPoints() if that helps efficiency vs. calling N times addPoint() since, normally, maps are updated with batches of many points at once.

Out of these, only FLANN supports dynamic insertions/deletions, so I plan to illustrate the dynamic-support comparison only between FLANN and nanoflann. I will illustrate the other comparison results for all four libraries along with nanoflann.

👍

@pranjal-rai

@jlblancoc thanks for the pointers. I will start working on all of them and get them done ASAP.

@pranjal-rai

Hi @jlblancoc

Another issue is: inserting points one by one is a good test case in general, but in mobile robotics it's far more common to insert them in batches of 100s to 1000s (for 2D or 3D LiDAR scanners, respectively).
[Plot: index build time, flann vs. nanoflann, points inserted in chunks of 100]

The results change significantly when points are added in chunks of size 100. Our time stays nearly the same because, in the logarithmic approach, the points are still added one by one, but flann's time improves because it now requires far fewer rebuilds.

Consider using std::deque instead. Try running a benchmark using both options to see if it makes any significant difference but, unless std::deque ends up being worse in experimental measurements, we should prefer it over std::vector.

I compared the time taken with std::deque and std::vector. The latter performs slightly better; I am not sure of the reason for this.

Also: in connection with the robotics applications, please consider adding an alternative addPoints() if that helps efficiency vs. calling N times addPoint() since, normally, maps are updated with batches of many points at once.

I have removed addPoint and added a new function, addPoints, to insert multiple elements at a time. I have committed the changes to my branch.

However, I can't easily tell how much of the code in the new classes is identical to the static index version. How much would you say it is?

Most of the functions are common to the two classes. I duplicated the functions because I had to modify some of the existing ones to include deletion support, since this feature was missing from the static index adaptor. I am still working on this part, and I will try to create one common shared base class for the functions that are exactly the same.

Rebuild Threshold Approach

Regarding this method, is there any paper/resource that validates its correctness? I coded this part, but it fails for some cases, and the failures themselves seem genuine: when a new point is added dynamically without rebuilding the index, it is not guaranteed to end up at the same location in the tree where it would have been if the index had been rebuilt from scratch. So, when we perform a nearest-neighbour query on the tree, the branch containing the dynamically inserted point may be ignored even though that point was the closest to the query point. I came across many such examples.

@jlblancoc
Member Author

Good! Lots of results in a short time.

The results change significantly when points are added in chunks of size 100. Our time stays nearly the same because, in the logarithmic approach, the points are still added one by one, but flann's time improves because it now requires far fewer rebuilds.

What method of flann are you comparing against? The "A" in FLANN stands for approximate, so that may also be the answer to:

Rebuild Threshold Approach

Regarding this method, is there any paper/resource that validates its correctness? I coded this part, but it fails for some cases, and the failures themselves seem genuine.

I don't have right now any reference at hand, but thresholding may be valid only to generate approximate trees. In that case, we can't compare exact vs. approximate algorithms, at least not without making it clear that they are different.

I will try to create one common shared base class for the functions that are exactly the same.

👍 Remember to try CRTP to avoid virtual functions, which may come with a runtime cost (depending on the compiler and many other details).
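In case it helps, the pattern looks roughly like this (a sketch with made-up names, not the actual nanoflann classes):

    // CRTP sketch: the base class calls into the derived class through a
    // static downcast, so shared code is reused without any virtual calls.
    template <class Derived>
    struct KDTreeBaseSketch
    {
        void buildIndex()
        {
            // ... shared tree-building code ...
            derived().initSpecific();  // resolved at compile time, no vtable
        }
        Derived&       derived()       { return static_cast<Derived&>(*this); }
        const Derived& derived() const { return static_cast<const Derived&>(*this); }
    };

    struct StaticIndexSketch : KDTreeBaseSketch<StaticIndexSketch>
    {
        void initSpecific() { /* static-index specific setup */ }
    };

    struct DynamicIndexSketch : KDTreeBaseSketch<DynamicIndexSketch>
    {
        void initSpecific() { /* dynamic-index specific setup */ }
    };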

@pranjal-rai

Hi @jlblancoc

  • In FLANN, if we use flann::SearchParams(-1) instead of flann::SearchParams(128), the code performs exact searching (source); see the snippet at the end of this comment. Right now I am using this mode for the comparison. In this class, the addPoints() method uses the rebuild threshold approach.

  • Furthermore, there is another class in flann, KDTreeSingleIndex, which is also the source of nanoflann. For this class addPoints() does not use the rebuild threshold method; for every insertion it simply rebuilds the entire tree again. We can compare our results with this approach.

I don't have right now any reference at hand, but thresholding may be valid only to generate approximate trees.

I also think that the rebuild threshold method can only be applied to approximate nearest neighbour searches.
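For reference, this is roughly how I invoke the exact mode (a sketch; the knnSearch overloads and index types vary a bit across FLANN versions, so it may need small adjustments):

    // Exact nearest-neighbour search in FLANN via SearchParams(-1)
    // ("checks = -1" means unlimited checks, i.e. exact search).
    #include <flann/flann.hpp>
    #include <vector>

    void exact_query_demo(float* pts, size_t n_pts, float* query, size_t n_queries)
    {
        flann::Matrix<float> dataset(pts, n_pts, 3);      // n_pts x 3 points
        flann::Matrix<float> queries(query, n_queries, 3);

        flann::Index<flann::L2<float>> index(dataset, flann::KDTreeSingleIndexParams());
        index.buildIndex();

        std::vector<std::vector<size_t>> indices;
        std::vector<std::vector<float>>  dists;
        index.knnSearch(queries, indices, dists, 1 /*knn*/, flann::SearchParams(-1));
    }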

@pranjal-rai

pranjal-rai commented Jun 15, 2017

Hi @jlblancoc

  • I have created a new class, KDTreeBaseClass, which now holds the functions common to the static and dynamic adaptors. This class is inherited by KDTreeSingleIndexAdaptor and KDTreeSingleIndexDynamicAdaptor_. I have committed the changes to my branch. I have not used virtual functions.

  • For the rebuild threshold approach, I have coded up both the approaches but still need to clean up the code. I still think that the rebuild threshold approach works only for approximate nearest neighbour searches. What should I do about this?

@jlblancoc
Member Author

Hi @pranjal-rai !

Apart from a comment I made regarding the documentation of the classes, I'm happy with your approach! No virtual methods means efficient code... 👍

I have coded up both the approaches

Sorry, I missed something... what two methods? Logarithm method and threshold?

I still think that the rebuild threshold approach works only for approximate nearest neighbour searches. What should I do about this?

Since we aim at exact NN, just remove the method that is an approximation.

Also: how are you doing with the benchmarking application using real point-cloud datasets? I couldn't find it in your branch... I saw your new example, and it's good.

Note that since we don't want large files in Git, if we want to use a dataset it might be better to either: (a) edit the dataset so that only a small file is added to the repo, or (b) use CMake to download the dataset files from the Internet.

If you haven't found any datasets, let me know and I'll give you some.

Cheers.

@jlblancoc
Member Author

This is a kind reminder to all GSoC students and mentors

According to the GSoC FAQ, students are supposed to devote 30+ hours per week to their assigned project. This amounts to 6 h/day (Mon-Fri) or 4.3 h/day (Mon-Sun), or any equivalent distribution given the flexibility to work on your own schedule. Clearly failing to meet such a commitment may be a reason to fail the monthly evaluations, since that information was clear at the time of applying to GSoC.

It's difficult to convert hours into GitHub commits but, as a rule of thumb, there should be at least one substantial commit per working day, that is, roughly 5 substantial commits per week. This is a general rule, and mentors may be flexible depending on the difficulty and nature of each project.

The first evaluation will start June 26-30.

@pranjal-rai

Hi @jlblancoc

I have added the documentation for all the classes.

Sorry, I missed something... what two methods? Logarithm method and threshold?

I proposed two different implementations of the rebuild threshold method. I have implemented both, but it seems that the rebuild threshold approach is valid only for approximate nearest neighbour searches, so it will not work for our purpose.

Note that since we don't want large files in Git, if we want to use a dataset it might be better to either: (a) edit the dataset so that only a small file is added to the repo, or (b) use CMake to download the dataset files from the Internet.

I think it will be better to use CMake to download the dataset files from the Internet, because we need to run tests on large files and editing the datasets down to a small size would not be feasible. I will update you with the real point-cloud datasets ASAP. If you have any suggestions for datasets, please share them with me.

@jlblancoc
Member Author

@pranjal-rai

pranjal-rai commented Jun 20, 2017

Hi @jlblancoc

We now have support for dynamic point clouds. I have tested the implementation and it looks good to me. We can verify it again once you are done with the executables for testing.

Also: as the rebuild threshold approach does not work for us, should I push these changes to my repo in a new branch or just leave it?

Also: how are you doing with the benchmarking application using real point-cloud datasets?

I am now planning to create a GUI-based application to perform all the benchmarking tests. This can be done either in MATLAB or in Python with Tkinter (preferably the latter, as I don't have much experience with Tkinter and I want to learn more about it). I will first design the tool with all the basic functionality and then start the coding part after the first evaluation. We can also add an option to run the code directly for users who wish to perform intensive testing with their own parameters.
Do you have any other suggestions, or should I proceed with the proposed plan?

Also, do I need to make a report for the first evaluation?

@jlblancoc
Member Author

Also: as the rebuild threshold approach does not work for us, should I push these changes to my repo in a new branch or just leave it?

You could push it to a separate branch so we can take a look at it for this first evaluation period, and then remove it after some more weeks if it turns out to be definitively useless.

OK, go on with the GUI for benchmarking. Please let me know the exact list of benchmarks you plan to support in it. I think we've discussed this in the past, but it's good to revisit it ;-)

Also, do I need to make a report for the first evaluation?

Nope, we'll just review based on the public discussion (this thread!) and your commits. We might come back with some questions if we need to.

@pranjal-rai

Hi @jlblancoc

I have added both approaches for the rebuild threshold method as separate branches [1], [2]. For now I have added support for dynamic insertions by modifying the existing class itself, without duplicating the class. I will structure the code properly if we plan to use it in the future.

List of benchmarks:

@jlblancoc
Member Author

jlblancoc commented Jun 27, 2017 via email

@pranjal-rai

Hi @jlblancoc

Sure. I will go ahead and merge the Logarithmic Method branch into master. I will then create another branch from master for benchmarking, which can be merged later.

@pranjal-rai

Hi @jlblancoc

I have merged the Logarithmic Method branch into master and created another branch, Benchmarking, from master, in which I will start working on the benchmarking tool.

I have some doubts regarding the work now.
As we plan to compare our results with 4 other KD-tree libraries, each of these libraries needs to be set up on the user's system before running the benchmarking tool. For example, if the user asks to generate plots for the time performance of flann and nanoflann, the user has to make sure that flann is already set up on the system.
I was planning to write a script that will first check the availability of the required libraries and then download the ones that are not already installed. We can do the same thing to check the availability of the real point-cloud datasets. This script would be executed every time the tool is started.
Is this solution reasonable, or do you have any other suggestion?

@jlblancoc
Member Author

jlblancoc commented Jun 29, 2017 via email

@jlblancoc
Member Author

jlblancoc commented Aug 20, 2017

Hi @pranjal-rai !
I just added a couple of PRs to your fork, take a look at them.

Also, I noticed that the code refactoring to move all shared code down to the base class KDTreeBaseClass could go even further: these are duplicated items in the derived classes (Single vs. Dynamic) that can be (unless I'm missing something important?) moved to the base class:

  • std::vector<IndexType> vind;
  • size_t m_leaf_max_size;
  • DatasetAdaptor &dataset;
  • m_size, m_size_at_index_build, dim, ...
  • NodePtr root_node;
  • BoundingBox root_bbox;
  • PooledAllocator pool;
  • Distance distance; (which, by the way, could be renamed to a better name like metric)

In general, all functions (init_vind(), etc.) and variables shared by the two child classes should be moved to the base class, to avoid maintenance becoming a hell, and to be efficient ;-)
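Schematically, the layout I have in mind is something like this (just to illustrate the intent; names and details are up to you):

    // Sketch: shared state lives once in the base class, the derived classes
    // (static / dynamic) only keep what is really specific to them.
    #include <cstddef>
    #include <vector>

    template <class Derived, class Distance, class DatasetAdaptor, typename IndexType = size_t>
    class KDTreeBaseClassSketch
    {
       protected:
        std::vector<IndexType> vind;
        size_t                 m_leaf_max_size = 0;
        const DatasetAdaptor&  dataset;
        size_t                 m_size = 0, m_size_at_index_build = 0;
        int                    dim = 0;
        // NodePtr root_node;  BoundingBox root_bbox;  PooledAllocator pool;
        Distance               distance;  // (or "metric", as suggested above)

       public:
        explicit KDTreeBaseClassSketch(const DatasetAdaptor& ds)
            : dataset(ds), distance(ds) {}
        // init_vind(), computeBoundingBox(), divideTree(), ... would live here
        // too; "Derived" is kept so the CRTP trick can still be used.
    };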

@jlblancoc
Member Author

jlblancoc commented Aug 20, 2017

One fundamental thing seems to be missing from your contributions so far: we have a great benchmark to test the computational efficiency of the different metrics/algorithms; that's all good. But there are no unit tests that verify that the output of those methods is correct (!!).

Please consider adding new unit tests to test_main.cpp (see the existing tests there and use them as a template) for:

Comparing the nearest point according to KD-tree queries for:

  • SO2
  • SO3 (with SO3_InnerProdQuat_Adaptor only)
  • L2 with the dynamic algorithm.

against the output of the brute-force algorithm (so please use a relatively small number of points). As you can see, the current code checks this a number of times on random datasets to assert that the kd-tree always outputs the correct answer, for the L2 metric.
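For illustration, such a test could look roughly like this (a sketch using the standard nanoflann dataset-adaptor interface; the helper struct and test names are made up):

    #include <gtest/gtest.h>
    #include <nanoflann.hpp>
    #include <cstdlib>
    #include <vector>

    struct PointCloud3D
    {
        struct Pt { double x, y, z; };
        std::vector<Pt> pts;

        // Standard nanoflann dataset-adaptor interface:
        size_t kdtree_get_point_count() const { return pts.size(); }
        double kdtree_get_pt(size_t i, size_t d) const
        { return d == 0 ? pts[i].x : (d == 1 ? pts[i].y : pts[i].z); }
        template <class BBOX> bool kdtree_get_bbox(BBOX&) const { return false; }
    };

    TEST(kdtree, L2_matches_brute_force)
    {
        PointCloud3D cloud;
        for (size_t i = 0; i < 500; i++)  // small cloud: brute force stays cheap
            cloud.pts.push_back({rand() / double(RAND_MAX),
                                 rand() / double(RAND_MAX),
                                 rand() / double(RAND_MAX)});

        using kd_tree_t = nanoflann::KDTreeSingleIndexAdaptor<
            nanoflann::L2_Simple_Adaptor<double, PointCloud3D>, PointCloud3D, 3>;
        kd_tree_t index(3, cloud, nanoflann::KDTreeSingleIndexAdaptorParams(10));
        index.buildIndex();

        const double query[3] = {0.5, 0.5, 0.5};

        // Brute-force nearest neighbour:
        size_t best_bf = 0; double best_d2 = 1e30;
        for (size_t i = 0; i < cloud.pts.size(); i++)
        {
            const double dx = query[0] - cloud.pts[i].x;
            const double dy = query[1] - cloud.pts[i].y;
            const double dz = query[2] - cloud.pts[i].z;
            const double d2 = dx * dx + dy * dy + dz * dz;
            if (d2 < best_d2) { best_d2 = d2; best_bf = i; }
        }

        // KD-tree nearest neighbour:
        size_t ret_index; double out_dist_sqr;
        index.knnSearch(query, 1, &ret_index, &out_dist_sqr);

        EXPECT_EQ(ret_index, best_bf);
        EXPECT_NEAR(out_dist_sqr, best_d2, 1e-12);
    }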

Duplicated code, like the definitions of the auxiliary dataset classes and the functions that generate a random point cloud (of the different kinds: XYZ, quaternion, ...), could be refactored out of the current examples/*.cpp files into new examples/utils.h and examples/utils.cpp files, so they could be #included in both the examples and the unit tests. This can be done, naturally, by adding those files to the CMakeLists.txt files.

Again, avoid code duplication at all cost!! It's evil! 😈

@jlblancoc
Member Author

Again on code duplication:
It seems to me that SO2_Adaptor::evalMetric() can be rewritten as a call to SO2_Adaptor::accum_dist(), right? (or the other way around)

@jlblancoc
Member Author

Now, regarding the topic of accum_dist(): the good news is that, unless I missed something, it seems we have a solution, but it will have a cost...

The point is that metrics must be computable element-wise for the kd-tree to work.
You can see this in the last argument idx of:

inline DistanceType accum_dist(const U a, const V b, int idx) const

In short: summing accum_dist(a[i],b[i],i) for i=0...dim-1 should be equal to evalMetric(a,b,dim) for all metrics.

So, this unfortunately means that the "acos()" metric must be ruled out, since I can hardly think of a way to compute it element-wise...

For L1, L2, SO(2) the implementation of accum_dist() seems obvious.
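For SO(2), for instance, one plausible element-wise form is just the wrapped angular difference (sketch only, not necessarily the final implementation):

    // Possible accum_dist() for the SO(2) (angle) metric: the angular
    // difference wrapped into [0, pi], so that e.g. 0.1 rad and 6.2 rad are
    // considered close. Assumes input angles are already in [0, 2*pi).
    #include <cmath>

    inline double so2_accum_dist(double a, double b /*, int idx: single dim */)
    {
        constexpr double kPi = 3.14159265358979323846;
        double d = std::fabs(a - b);
        if (d > kPi) d = 2.0 * kPi - d;  // wrap around the circle
        return d;
    }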

For SO(3)-InnerProdQuat: I thought of this solution:

  • Require the user to always use positive (qx,qy,qz) and reflect the rotation sense via the sign of qr. This means adding a comment about this constraint in the doxygen docs; we could add a "for()" loop checking this condition and throwing an exception if the data are not compliant, but that could have a large runtime cost, so let's just forget about it for now...
  • Let's see the formula:
		inline DistanceType evalMetric(const T* a, const size_t b_idx, size_t size) const {
			DistanceType result = DistanceType();
			for (int i = 0; i < size; ++i) {
				// Note: despite its name, "diff" holds the product a[i]*b[i] being accumulated.
				const DistanceType diff = a[i] * data_source.kdtree_get_pt(b_idx, i);
				result += diff;
			}
			result = 1 - std::abs(result);
			return result;
		}

The accum_dist() for idx=0 could include the 1-std::abs(diff[0]), then for 1:3 we could now assume that all numbers will be positive, so each accum_dist() should return -a[i]*b[i] (or alike).
Please, give it a thought... it makes sense to me.

As a replacement for the acos-innerProduct metric, we could implement the Frobenius norm:

[Formula image: the Frobenius norm of the difference between the two rotation matrices, ‖R − R_i‖_F]

Please unroll the expression (with paper and pencil... or with a symbolic package like MATLAB, Mathematica or any other software) for the trace(R^t * Ri); you will see it's the sum of 3 terms, each of them a function of the 4 quaternion components. I think that this metric is suitable for being added up element by element, although some refactoring might be needed in the function parameters to pass not only the (a[i],b[i]) elements but the entire vectors (a,b).
Please, give it a thought too...

PS: If time permits, please take note of this task, which would be another great add-on: add more unit tests, one per metric class (L1, L2, SO2, ...), that, instead of building a complex dataset, etc., just verify that the sum of accum_dist() is equal to evalMetric(), as expected, for some random "points". For this, use gtest's EXPECT_NEAR().
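i.e., something in this spirit (a sketch using L2_Simple_Adaptor as the example metric and a tiny made-up dataset):

    #include <gtest/gtest.h>
    #include <nanoflann.hpp>
    #include <cstddef>

    struct TinyDataset
    {
        double pt[3] = {0.3, -1.2, 2.5};  // a single 3D "point"
        size_t kdtree_get_point_count() const { return 1; }
        double kdtree_get_pt(size_t /*idx*/, size_t d) const { return pt[d]; }
        template <class BBOX> bool kdtree_get_bbox(BBOX&) const { return false; }
    };

    TEST(metrics, accum_dist_sums_to_evalMetric)
    {
        TinyDataset data;
        nanoflann::L2_Simple_Adaptor<double, TinyDataset> metric(data);

        const double a[3] = {1.0, 0.5, -0.25};
        double sum = 0;
        for (int i = 0; i < 3; i++) sum += metric.accum_dist(a[i], data.pt[i], i);

        // Summing the per-dimension terms must reproduce the full metric:
        EXPECT_NEAR(sum, metric.evalMetric(a, 0 /*b_idx*/, 3), 1e-12);
    }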

PS2: Ask in case you have doubts... which would be more than understandable at this point! ;-)

@jbriales

Hi everyone!
I must say I've been following this thread for the last few days, curious about the accum_dist() issue and how you'd solve it. So far I haven't had any direct comment to make (especially since I don't understand too well the inner workings of the KD-tree and the role of accum_dist()), but regarding @jlblancoc's last suggestion of using the Frobenius norm: I think the actual relation is [inline formula image not preserved].

Then I wonder, must the distance be an actual distance (positive)? I ask this because, if you use this value alone, it can go from -3 to +3. A simple solution, if this is an issue, would be to take (1 - term) for each term instead, to have a distance that goes from 0 to +6.

Best,
Jesus

@pranjal-rai

Hi @jlblancoc

It seems to me that SO2_Adaptor::evalMetric() can be rewritten as a call to SO2_Adaptor::accum_dist()

Fixed

Now moving on to SO(3)-InnerProdQuat

The accum_dist() for idx=0 could include the 1-std::abs(diff[0]), then for 1:3 we could now assume that all numbers will be positive, so each accum_dist() should return -a[i]*b[i] (or alike).
Please, give it a thought... it makes sense to me.

I think there is a problem with this solution.
Consider two quaternions (w1, x1, y1, z1) and (w2, x2, y2, z2).
Now, as per our metric, dis = 1 - abs(w1 * w2 + x1 * x2 + y1 * y2 + z1 * z2).
Let us assume that x, y, z are positive for both quaternions. Now dis is not necessarily equal to
1 - abs(w1 * w2) - x1 * x2 - y1 * y2 - z1 * z2, because even though the terms involving x, y and z are positive they cannot be taken out of the abs().

Example:
q1 = (-.075, 0.12, 0.64, 0.07), q2 = (0.03, 0.80, 0.18, 0.56)
In this case eval_metric() gives 0.77 while summation of accum_dist() as per the proposed solution gives 0.72

Please correct me if I am wrong.

@pranjal-rai

I have added accum_dist() for SO(3)-InnerProdQuat, in case you need to test it.

@pranjal-rai

Also, I noticed that the code refactoring to move all shared code down to the base class KDTreeBaseClass could go even further: these are duplicated items in the derived classes (Single vs. Dynamic) that can be (unless I'm missing something important?) moved to the base class

Fixed

@jlblancoc
Member Author

@jbriales thanks a lot! 👍
Yes, in fact I realize now that we need distances to be "good metrics", so they can't be negative...

@pranjal-rai :

Example:
q1 = (-.075, 0.12, 0.64, 0.07), q2 = (0.03, 0.80, 0.18, 0.56)

I noticed your quaternions are not unitary (|q1| = 0.659 ??) and I think I found a bug in your code for generating random quaternions; please check my comment on your commit.

Giving it a second thought... I said something senseless yesterday: we CAN'T force (qx,qy,qz) to always be positive! That is the axis of rotation, so we must cover the quadrants with negative values, regardless of whether qw (the first quaternion element) is allowed to be negative or not.

This means that, apparently, there is no straightforward valid SO3 metric that can be built up element by element while still complying with the kd-tree algorithm's assumption that adding the distance increment for one more dimension always makes the distance increase.

There is a final, valid alternative for SO(3): using the L2 (squared norm) distance on the unit quaternion itself, that is, working with the quaternion as if it were a common 4-length vector in Euclidean space. This implies enforcing qw to always be positive, so there is no chance of the same rotation being described by two different quaternions.

Please rename and rewrite the SO3 metric so it works like this (yes: it would be identical to L2 for now!!). We could think of something better in the future...
Also, make sure to change the generation of random quaternions so that each component of the (x,y,z) part can be positive or negative independently, while qw is always positive (or zero).
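For the random test quaternions, something along these lines should do (just a sketch):

    // Draw a random unit quaternion and enforce qw >= 0, so every rotation has
    // a unique representative (q and -q describe the same rotation).
    #include <array>
    #include <cmath>
    #include <random>

    std::array<double, 4> random_unit_quaternion(std::mt19937& rng)
    {
        std::normal_distribution<double> gauss(0.0, 1.0);
        std::array<double, 4> q = {gauss(rng), gauss(rng), gauss(rng), gauss(rng)};
        const double n =
            std::sqrt(q[0] * q[0] + q[1] * q[1] + q[2] * q[2] + q[3] * q[3]);
        for (auto& c : q) c /= n;  // normalize: uniform on the unit 3-sphere
        if (q[0] < 0)
            for (auto& c : q) c = -c;  // flip sign so qw (= q[0]) >= 0
        return q;
    }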

If you are done with this, then the next most important thing to do before the end of GSoC would be, IMO, the unit tests I mentioned above (including the new tests and the refactoring of the code shared between the examples and the tests).

Keep pushing! ;-)

@pranjal-rai

Hi @jlblancoc

I noticed your quaternions are not unitary (|q1| = 0.659 ??)

Actually I made a typo in the comments: w in q1 is actually -0.75, not -0.075. Sorry.
I referred to this resource (page 3, Algorithm 2) to generate unit quaternions.

I have now updated the quaternion generation code to use the approach you suggested. Right now I draw a random angle in [0, PI] so that w remains positive while x, y, z can have any sign. I made sure that I am generating unit quaternions.
I have added support for SO3_Adaptor using L2_Simple_Adaptor and removed SO3_InnerProdQuat.
Related commits: 1, 2

I will now start working on the following tasks:

  • Write unit tests (earlier I wrote some tests locally to test my implementation, but they did not follow the template; I will now read the code in tests/ and write the tests accordingly).
  • Remove duplicated code.
  • Go through our previous conversation again so that we do not miss any discussed task.

@jlblancoc
Member Author

jlblancoc commented Aug 21, 2017 via email

@pranjal-rai

Hi @jlblancoc

I have added unit tests to validate the following:

I will now start working on removing the duplicated code.

@pranjal-rai

Duplicated code, like the definitions of the auxiliary dataset classes and the functions that generate a random point cloud (of the different kinds: XYZ, quaternion, ...), could be refactored out of the current examples/*.cpp files into new examples/utils.h and examples/utils.cpp files, so they could be #included in both the examples and the unit tests.

Fixed

@pranjal-rai

I have added the unit tests and also removed duplicated code from the examples and unit tests. I will clean and document all the code by tomorrow.
What should I do next? What is the submission process?

@pranjal-rai

Hi @jlblancoc

I have cleaned up the code and moved some common functions to the base class.
Related commits: 1, 2

What do we do now?

@jlblancoc
Member Author

jlblancoc commented Aug 24, 2017 via email

@pranjal-rai

Hi @jlblancoc

I have created a PR, but building flann requires administrative permissions and the tests fail for this reason. What should I do about it?

@pranjal-rai

I fixed the previous issue. All other tests except this one now complete successfully.

@jlblancoc
Member Author

Good!

I think it's a problem of the external project, so let's ignore it.
To make travis-ci clean, please add an "if" like this one to the build() function in travis.sh, so that an argument is passed to CMake instructing it NOT to build the external projects (and hence not the benchmarking app either). I mean, something like:

  cmake $SRC_DIR -DBUILD_BENCHMARKING=OFF

Obviously, you should add some "if()"s as well inside the CMake scripts to detect that new CMake variable and skip the external projects and the benchmarking app if it's set.

@pranjal-rai

Hi @jlblancoc

Thanks for the suggestion. Right now all the tests pass without any failures.

@pranjal-rai

Hi @jlblancoc

Can you please add my contribution to the initial description of the project? I need it to fill in the final evaluation.

@jlblancoc
Member Author

Done! 👍

Thanks for all the hard work during GSoC!
We mentors have some more days to fill in our evaluations, so expect more comments and/or feedback as the code is tested.
Cheers

jlblancoc added a commit to jlblancoc/nanoflann that referenced this issue Aug 28, 2017
GSOC 2017 [Algorithm improvements to KD-Tree module]

Discussion on implemented features can be reached [here](MRPT/GSoC2017-discussions#1).
@jlblancoc
Member Author

BTW: You can also create a gist (see for example this one) as a summary of your work, so you can be sure the content "belongs" to you and you can edit it at any time in the future...

It will be OK for me to use either this page, or a custom gist (in that case, please post a link to it here so we can quickly review it).

@pranjal-rai

Hi @jlblancoc

I have created a report as you suggested. Please review it. I am adding this link to the evaluation for now.

@jlblancoc
Member Author

jlblancoc commented Aug 30, 2017 via email

@jlblancoc
Member Author

Hi @pranjal-rai ,

It was a pleasure to have you in GSOC-2017! 👍

I'm now closing this issue ticket, since the changes have already been merged upstream. New issues or discussion of ideas can go either here or in the nanoflann repo, OK? I still have to do a more careful test of your recent changes, so expect feedback in the next week or so.

Cheers!

@pranjal-rai

Hi @jlblancoc

I'm now closing this issue ticket, since the changes have already been merged upstream. New issues or discussion of ideas can go either here or in the nanoflann repo, OK?

👍

It was a pleasure to have you in GSOC-2017!

Thank you very much.

Thank you for being an awesome mentor throughout and for giving me the opportunity to work on this project. I learned a lot from you while working on it, and I would love to contribute to MRPT in the future too.
