Questions about applying SPG to our own point cloud data #11
Hi, thank you for your interest in our project.
to preprocess the data set. Then
to train on Semantic3D with no ~~label~~ color. And then run your test on your own dataset (with 0 0 0 color) using
Note that in order for this command to work, you should first have partitioned your data set using an adapted partition function, and have a preprocessing function for your dataset; see issue #6. Note that as long as the format of your data is the same as Semantic3D or ply, the changes will be minimal (mostly changing directory names). I will soon code a function to help launch the code on new datasets, but not before a couple of weeks.
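For illustration, a minimal sketch of such a data-reading function (`read_custom_cloud` is a hypothetical name, not part of the repo), assuming a Semantic3D-style `.txt` layout with the color zeroed out as suggested above:

```python
import numpy as np

def read_custom_cloud(path):
    # Assumes one point per line, whitespace-separated, with x y z in
    # the first three columns, like a Semantic3D .txt file.
    xyz = np.loadtxt(path, dtype=np.float32)[:, :3]
    # The model is trained with no color, so feed 0 0 0 RGB to keep
    # the Semantic3D-style layout intact.
    rgb = np.zeros((xyz.shape[0], 3), dtype=np.uint8)
    return xyz, rgb
```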
Let us know if you encounter problems in this endeavor. loic |
Hi loic, Thank you for your prompt reply! I will give it a try in the near future. There are two things that I would like to make sure I understand correctly:
Best, |
Hi, 1 - if you have the exact same format and you just want to do inference, it should work, yes. You will still be required to partition, then preprocess, then test. Again, it will likely perform badly since the embeddings intrinsically depend on color. 2 - I made a typo, I meant no color. |
Hello loic, Thanks for your answer! I tried your code and made it run successfully. Currently I follow the full dataset structure of Semantic3D (although I only have one txt file, I copy it three times and put the copies into train/test_full/test_reduced). I only changed the first two lines in get_datasets to match our txt file name. One question I have now is: where can I find the segmentation result for the whole point cloud? I tried to run write_Semantic3d and it indeed produces a _pred.ply. However, the point count is drastically reduced (originally we have one million points, but there are only 300 points in _pred.ply). Is it possible to get the semantic label for each original point? |
Hi, good to know that it somehow worked! If you run write_Semantic3d.py you get the _pred.ply file; if you want to visualize the results (on a pruned point cloud), use visualize.py
with output_type: 'i' for the original image, 'f' for the geometric features, 'p' for the partition (I highly suggest you run this one to make sure the partition worked), and 'r' for the results file. If you don't want to work with a pruned cloud, set the --voxel_width pruning parameter accordingly. EDIT: I realize that you might have done that already. In which case, I think the problem is that you over-pruned the initial data when partitioning. Is your scan of very small extent? For now, you could try to fix it by relaunching your partition with a smaller --voxel_width. |
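If full-resolution labels are needed rather than the pruned cloud's, one simple workaround is to propagate each pruned point's prediction to the original points; a minimal sketch (hypothetical helper, not the repo's own upsampling code):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def upsample_labels(full_xyz, pruned_xyz, pruned_labels):
    # Assign every original point the label of its nearest pruned point.
    nn = NearestNeighbors(n_neighbors=1).fit(pruned_xyz)
    _, idx = nn.kneighbors(full_xyz)
    return pruned_labels[idx.ravel()]
```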
Yes, I have already got the _pred.ply file (which only contains 300 points) in my root/data folder after I ran write_Semantic3d.py. Update: another thing that confuses me is that after running write_Semantic3d.py I get one _pred.ply under $SEMA3D_DIR/data/test_full/, and after running visualize.py I get four .ply files under $SEMA3D_DIR/clouds/test_full/. All of these 5 ply files (1 under $SEMA3D_DIR/data/test_full/ and 4 under $SEMA3D_DIR/clouds/test_full/) contain 300 superpoints. Could you please tell me the difference between them? |
Hi,
All of these files are for the pruned dataset. It seems like your point cloud only contains 300 points after subsampling with a 5cm grid, so I assume it is very small??
|
Yes, I just chose a small part of our point cloud to test. Could you please tell me what the color of the points in _pred.ply means? Are you assigning each label a different color (e.g. [255 0 0] for man-made terrain)? |
SPG is designed for semantic segmentation of large scenes, and is not really well suited for objects or small scenes. You could directly use PointNet for that. For the label colors, see Figure 3 in the paper, or the label-to-color function in the code. |
Hi, first of all @loicland: great work! Really appreciate the effort. I was following the exact same approach as @sycmio to try to feed data from the KITTI odometry dataset (LiDAR) to SPG. I registered sets of 50 single LiDAR shots in order to get a 'scene' with a denser, richer representation of the environment - especially for a more 'round-up' representation of objects, using shots from different view angles as the car proceeds through the street (sample scene in the link below). I trained SPG as instructed in your comment above, withholding RGB.

1.2) Do you make use of the fourth value in Sema3D data? I guess it is intensity. In KITTI I have reflectance values, but they seem not comparable and are on a completely different scale, so I was withholding these (set all to 0) for now.

2.1) With a model trained on those attributes I get lower performance on the Semantic3D test_full dataset, but still reasonably good accuracy - can you please specify what exactly you mean by 'embeddings intrinsically depend on color'? Is this relevant to generating the SPG or to PointNet training? As far as I am educated, PointNet only optionally includes color.

2.2) With that same trained model I get extremely bad results on KITTI scenes - almost everything is classified as Building, whereas some trees are classified correctly (result .ply file in the link below), but they are the minority - all other classes are missing entirely.

3.1) Partitioning works reasonably well on KITTI data. Generally, the partitioned areas are too large and sometimes overlap different classes, but overall I guess it is quite useful (maybe with some parameter tuning) (partitioned sample also in the link below).

3.2) Partitioning quality degraded dramatically when the data was subsampled using a voxel-grid filter (tried different sizes).

At this point I am trying to investigate what causes this performance drop - some ideas are:

Since the dependency on RGB appears not to be a showstopper and the partitioning shows fair results, I am in good hope the framework will adapt to KITTI. Any comment is greatly appreciated. Best,

PS: Regarding my scenes: as mentioned, it is 50 shots registered into one scene - no downsampling or voxel-grid filter, since this harmed the partitioning performance a lot. (EDIT: Just to be clear: I didn't use a voxel-grid filter before partitioning with the SPG framework - the filter in the partitioning script of SPG is still set to 5cm.) Attached is the original input PC (with intensity and RGB set to 0) as a .txt file and all intermediate/end results from SPG. |
Hi Maximilian, very interesting stuff here.
So you retrained on Semantic3D without RGB from scratch, right? No fine-tuning on your own data set though?

1.1) Yes for xyzlpvs. e is actually the z coordinate divided by 100 (see line 91 of sema3d_dataset.py).

1.2) We do not use the intensity, as it was very noisy on Semantic3D.

2.1) By 'embeddings intrinsically depend on color' I mean that I discourage applying models trained on Semantic3D with RGB directly to data without RGB. I encourage training a model from scratch on the Semantic3D dataset without RGB. It seems to me that is what you did? The original PointNet indeed only optionally includes color, but ours does use it, as well as the lpsv values. This can be altered though, by changing which point attributes are fed to the PointNets.

2.2) The bad performance is IMO due to the elevation, and possibly the scale as well. Do you know what units x, y and z correspond to? There are some artifacts below the road which might throw off the scaling. A quick and dirty fix would be to rescale your z so that the road is around 0 and the average height of buildings is around 0.2 (= 20m/100). Ideally I should implement a RANSAC-based ground-plane extraction + smart normalization (a rough sketch of the idea follows after this list). Will try to do it next week.

3.1) Partitioning seems okay-ish, but I would try to decrease the reg_strength.

3.2) Partitioning might not mesh well with velodyne 64.
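A rough sketch of that quick-and-dirty normalization (hypothetical helper, thresholds are assumptions, not part of the repo):

```python
import numpy as np

def normalize_elevation(xyz, n_iter=200, inlier_tol=0.15, seed=0):
    # RANSAC-fit a near-horizontal plane to find the ground, take its
    # median height as the road level, then divide by 100 so ~20 m of
    # height maps to 0.2, matching the Semantic3D normalization above.
    rng = np.random.default_rng(seed)
    best_inliers, best_z0 = 0, xyz[:, 2].min()
    for _ in range(n_iter):
        p = xyz[rng.choice(len(xyz), 3, replace=False)]
        n = np.cross(p[1] - p[0], p[2] - p[0])
        if np.linalg.norm(n) < 1e-9:
            continue
        n /= np.linalg.norm(n)
        if abs(n[2]) < 0.9:            # keep near-horizontal candidate planes
            continue
        dist = np.abs((xyz - p[0]) @ n)
        inliers = int((dist < inlier_tol).sum())
        if inliers > best_inliers:
            best_inliers = inliers
            best_z0 = np.median(xyz[dist < inlier_tol, 2])
    return (xyz[:, 2] - best_z0) / 100.0   # road ~ 0, 20 m buildings ~ 0.2
```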
Could be, but the subsampling of superpoints should mostly mitigate that.
Yes, the problem most likely stems from the normalization of z. And yes, also from the lack of a ground-plane detection algorithm. Let me know if it gets better once you rescale z properly and decrease the reg_strength! loic |
Hi Loic,
Indeed interesting! I think we are on a good track here, generalizing SPG to a (much more common) velodyne LiDAR. A further step to think of might be more GPU support in the partitioning, to make the code run faster for e.g. robotics applications.
That is correct, Sir.

1.1) As far as I know, the scale is 1m = 1 unit. However, the data was in a camera coordinate frame, so the Z-axis that you were inspecting was actually pointing along the street. With your explanation of the height dependencies, the poor performance makes perfect sense. If you look at the _geof.ply on Google Drive, it actually recognizes planes normal to the driving direction as ground planes. At this point: can you explain in detail the exact color coding of the _geof.ply files?

1.2) OK

1.3) Yes, that is what I did.

2.2) I think in the first place the problem stems from the wrong coordinate frame, as mentioned above, yes. I can try to further increase the performance by rescaling, but I guess more potential lies in the other ideas mentioned, for now.
Cool.

3.1) I will try playing around with the reg_strength (had no time yet, but it seems promising).

3.2) One may observe that the partitioning works better in areas further from the center. So possible reasons for now might be:

I will try to tune the registration algorithm even further to reduce noise, and play around with how many PCs I register (maybe 10 instead of 50 works better).

Now some good news: with just the transformation of the coordinate frame (rotation into the global frame + manually setting the ground plane to z=0) and removal of artifacts below the ground plane, we have a working baseline for optimization (see picture + files). As for the partitioning, segmentation performs much better in sparse regions (see pictures). I think the large low-vegetation area in the middle is due to the high noise / poor segmentation mentioned in 3.2. Unfortunately, the segmentation of cars and road performs particularly badly. Any thoughts?

I will keep tuning parameters for now and get in touch once I have further questions/results. I guess there is high potential in this endeavour. Thank you very much for the help up to this point, though. Best,

Files: https://drive.google.com/open?id=19iYOlx5zKMHQPkkvjciREOIisZRpcUgZ |
We are working on CPU-parallelizing the cut pursuit algorithm, but it should take a few months at least. Running it on a GPU might be possible, but tricky and beyond my abilities. Other partition algorithms could be used, however.
Aah, well there it is then. The z axis plays a special role in many different steps of the algorithm: computation of the verticality feature, of the elevation, of the superedge features, as the rotation axis of the SPG random rotation augmentation scheme, etc. It will impact the feature computation, the partition, the embedding and the edge filters, i.e. everything. To be honest, I am surprised it was working at all! You absolutely should switch columns 2 and 3 (y and z) in your .txt, or in your data reading function.
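A minimal sketch of that swap (assuming an N x 3 NumPy array with columns x, y, z; note a pure swap ignores the sign conventions of the camera frame, where y points down):

```python
import numpy as np

def swap_yz(points):
    # Swap columns 2 and 3 so the vertical axis sits in the z slot again
    # (the KITTI camera frame has z pointing along the street).
    points = np.asarray(points, dtype=np.float32).copy()
    points[:, [1, 2]] = points[:, [2, 1]]
    return points
```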
Yes, I will complete the help on the subject. Red = linearity, Green = planarity, Blue = Verticality. Consequently, the road should be lime green in _geof.ply, and not the walls, I should have noticed that.
So you switched y and z already? Did you divide the elevation by 100 as well, as in sema3d_dataset.py l91? Can you post the geof.ply file so I can check the geofs are as they should be?
Maybe you should try voxelizing again. Semantic3D has 5 cm voxelization! Or you could fine-tune the model if you have some ground truth available.
I will be able to tell you more if I can see the _geof.ply and the _partition.ply |
Hey Loic,
Nice to hear; let's not get too much into detail here - just one last question: what is the most time-consuming step, in your experience? I guess it is the nearest-neighbour search? I found some works on GPU-optimized NNS already - maybe worth a shot.
Yes, definitely. To be honest, I just forgot about that transformation while dealing with tons of conversions and registrations.
Yes, I did that already.
I guess you mean Blue = Verticality? Please confirm.
I switched the axes, yes, but did not do the division by 100 yet. Can you explain in more detail what the benefit of that scaling is (does PointNet take elements on unit scale as input)? As you can see, my buildings are mostly just around 5m high due to the limitations of the velodyne LiDAR - if I understood correctly, you would try rescaling with parameter 25 (5/25 = 0.2) in line 91?
The files are attached already in the comment above (Google Drive link). Does the link work for you?
What's the thought behind this? Do you mean the raw Semantic3D data has 5cm voxelization before being fed into the SPG framework, or is it voxelized while partitioning within the framework already?
Unfortunately, I don't. It would be nice if you could leave some thoughts about the _geof and _partition! Best, |
We are about to release a new version of the article with extended studies of the computation times. The computation of the nearest neighbors and of the Voronoi neighborhood does take a significant part of the time, and is not optimized at all. Improving it would be great! The pruning and feature computation are quite fast; the partition is slowish, but will be 10x faster once parallelized. If speed is really an issue, sub-sampling works great, and tends to increase the accuracy as well by decreasing the geometric and radiometric noise.
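As a cheap intermediate step before any GPU implementation, the CPU k-NN search can already be multi-threaded; a sketch using SciPy's cKDTree (assumes SciPy >= 1.6 for the workers argument; SPG itself does not do this):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_indices(xyz, k=10):
    # Multi-threaded k-nearest-neighbor query; workers=-1 uses all cores.
    tree = cKDTree(xyz)
    _, idx = tree.query(xyz, k=k + 1, workers=-1)
    return idx[:, 1:]  # drop column 0: each point is its own nearest neighbor
```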
Confirmed and corrected above.
Our PointNet can take more than just xyz and rgb: elevation and lpsv as well.
Semantic3D has more range in height, so we divided by 100. It might not have been the smartest choice, but the value at line 91 should be the same when you train on Semantic3D and infer on your own cloud.
No, our code starts by pruning (with the --voxel_width argument). Since you didn't change it and its default is 5cm for Semantic3D, your cloud has already been pruned on a 5cm grid inside the framework.
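For reference, a minimal sketch of what such a voxel-grid pruning does (the repo's own implementation differs and is much faster):

```python
import numpy as np

def voxel_prune(xyz, voxel_width=0.05):
    # Keep one averaged point per occupied voxel of side voxel_width (5 cm here).
    ijk = np.floor(xyz / voxel_width).astype(np.int64)
    _, inv = np.unique(ijk, axis=0, return_inverse=True)
    inv = inv.ravel()
    counts = np.bincount(inv).astype(np.float64)
    pruned = np.empty((counts.size, 3))
    for d in range(3):  # per-voxel mean of x, y, z
        pruned[:, d] = np.bincount(inv, weights=xyz[:, d].astype(np.float64)) / counts
    return pruned
```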
So I've been looking at it in detail. If you check a _geof.ply file from Semantic3D and from your cloud, you can see they are quite different. Since Semantic3D uses a highly precise fixed LiDAR, its acquisition is much more precise. As a result, the road is very "flat", whereas yours is almost 40 cm "deep". Same for the buildings. Consequently, the planarity of roads and façades is too low, the scattering too high, and the verticality too high/low respectively. Hence, the algorithm thinks it is seeing low-volumetric, slightly vertical objects everywhere: bushes, the yellow class. Without any kind of ground truth for fine-tuning, the only thing I can think of would be to only use 'xyze' for the PointNets (i.e. drop the rgb and lpsv inputs). EDIT: after a quick and dirty test it works slightly better, see the ply file. However, we have the following mistakes, which will be hard to overcome: a) confusion between road and grass (no color + volumetric road = it thinks it's grass). For a) you could just merge the classes road and grass directly in the training on Semantic3D. For the rest, I think only fine-tuning will help. Of interest: the partition makes quick annotation of data sets easier. |
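To make the planarity/scattering/verticality point concrete, here is a rough sketch of the standard eigenvalue-based per-point features (the repo's exact formulas and implementation differ; the verticality here is only a crude proxy):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lpsv_features(xyz, k=20):
    # Eigen-decompose each point's local covariance; with l1 >= l2 >= l3:
    # linearity = (l1-l2)/l1, planarity = (l2-l3)/l1, scattering = l3/l1.
    _, idx = NearestNeighbors(n_neighbors=k).fit(xyz).kneighbors(xyz)
    feats = np.zeros((len(xyz), 4), dtype=np.float32)
    for i, nbr in enumerate(idx):
        pts = xyz[nbr] - xyz[nbr].mean(axis=0)
        w, v = np.linalg.eigh(pts.T @ pts)      # eigenvalues in ascending order
        l3, l2, l1 = w / max(w.sum(), 1e-12)
        normal = v[:, 0]                        # eigenvector of smallest eigenvalue
        feats[i] = ((l1 - l2) / max(l1, 1e-12),
                    (l2 - l3) / max(l1, 1e-12),
                    l3 / max(l1, 1e-12),
                    1.0 - abs(normal[2]))       # crude verticality proxy
    return feats
```

On a noisy, 40 cm "deep" road the smallest eigenvalue grows, so planarity drops and scattering rises - exactly the failure mode described above.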
Super-quick update on this one, since I have to leave now: I tried without registering point clouds at all (raw LiDAR shots), and it seems that the caveat is in the registration step (where the blurriness comes into play) - crisp ground detection and fair segmentation. The background on the registration was that PointNet seemed very robust to uniformly distributed point dropout, but its performance dropped a lot when faces were removed (e.g. the LiDAR view-angle problem - only one face of an object visible). However, I think for this particular framework that does not play much of a role (not sure though) - attached is a trial on a single LiDAR shot, with no more parameters optimized: https://drive.google.com/open?id=1JwpblgPuXixvtnESPSFzwQUKW3rQZCUI Not to mention that the performance is super fast on so few points (full pipeline incl. visualization ~20 sec on an i5 4670k). |
Great. You could try to increase |
Hello,
Thank you very much for your great work! It really impressed us, and we want to try your code on our own large-scale outdoor point cloud data. Here I have some questions:
Looking forward to your reply!