
Having some issues with running on a custom dataset #38

Closed
lostcolony opened this issue Jun 8, 2018 · 19 comments

@lostcolony

I'm trying to just get things working with a custom dataset right now, one that has no RGB values. The amount of data is small (if that matters), and so I'm not really looking for accuracy, just getting segmentation/training/testing working.

I've attempted to follow the steps listed for both custom and no-RGB data, but I'm still having an issue I'm uncertain of. Partitioning goes fine, reorganizing goes fine, but the training step fails with the following -

Traceback (most recent call last):
  File "learning/main.py", line 383, in <module>
    main()
  File "learning/main.py", line 286, in main
    acc, loss, oacc, avg_iou = train()
  File "learning/main.py", line 183, in train
    embeddings = ptnCloudEmbedder.run(model, *clouds_data)
  File "/home/ubuntu/dev/spgraph/learning/pointnet.py", line 135, in run_full_monger
    out = model.ptn(Variable(clouds, volatile=True), Variable(clouds_global, volatile=True))
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/dev/spgraph/learning/pointnet.py", line 96, in forward
    input = self.convs(input)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 176, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight[64, 8, 1], so expected input[3, 5, 128] to have 8 channels, but got 5 channels instead

Any suggestion as to what I've missed/am doing wrong?
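
For reference, the mismatch itself is easy to reproduce in isolation. A minimal standalone sketch (not repo code; the shapes and channel counts are just taken from the error above, where weight[64, 8, 1] indicates a Conv1d built for 8 input channels):

import torch
import torch.nn as nn

# a 1x1 convolution expecting 8 input channels, as in the traceback above
conv = nn.Conv1d(in_channels=8, out_channels=64, kernel_size=1)
# but the batch only carries 5 per-point features: (batch, channels, points)
clouds = torch.randn(3, 5, 128)
try:
    conv(clouds)
except RuntimeError as e:
    print(e)  # "... expected input[3, 5, 128] to have 8 channels, but got 5 channels instead"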

@lostcolony
Author

Oh, and I should add: that's with a different value of --ptn_nfeat_stn (though anything less than 6 gets set to 8).

@loicland
Owner

loicland commented Jun 8, 2018

Hi,

It is due to insufficient documentation on my part. This was solved in this issue. I will add it to the readme very soon.

Let me know if this does not fix it for you.

@lostcolony
Author

So hardcoding to what is at the end of that issue, I get -

RuntimeError: Given groups=1, weight[64, 8, 1], so expected input[69, 7, 128] to have 8 channels, but got 7 channels instead

Just poking and prodding, given how we're indexing, I figured the lpsv slice may have needed to be 4:8 (if we're to pick up 4 indices), which does indeed move me to a different error, shown below.

Traceback (most recent call last):
  File "learning/main.py", line 383, in <module>
    main()
  File "learning/main.py", line 286, in main
    acc, loss, oacc, avg_iou = train()
  File "learning/main.py", line 199, in train
    acc_meter.add(o_cpu, t_cpu)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torchnet/meter/classerrormeter.py", line 38, in add
    pred = torch.from_numpy(output).topk(maxk, 1, True, True)[1].numpy()
RuntimeError: dimension out of range (expected to be in range of [-1, 0], but got 1)

@loicland
Owner

loicland commented Jun 8, 2018

Can you run main.py in debug mode and tell me the dimensions of o_cpu and t_cpu at line 199?
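
Something along these lines, placed right before the failing call (hypothetical debug lines, around line 199 of learning/main.py):

# hypothetical debug lines just before the call that raises in train()
print("o_cpu:", getattr(o_cpu, "shape", None), "t_cpu:", getattr(t_cpu, "shape", None))
import pdb; pdb.set_trace()  # or drop into the debugger to inspect them
acc_meter.add(o_cpu, t_cpu)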

@lostcolony
Author

So they're both empty; label_mode_cpu is an array containing only -100 values.

@loicland
Owner

loicland commented Jun 8, 2018

Do you have ground truth labels available for your small dataset? Are you trying to fine-tune?

If you just want to do inference, set the option --epochs to -1.

PS: a common source of poor-quality results is the scaling of the elevation variable. Are you using the pretrained weights of semantic3d or s3dis?

@lostcolony
Author

I largely used the semantic3d dataset as a model for arranging data and such; I have a set of labeled train data, and a few sets of unlabeled test data.

Now, that said, the data itself could be -completely- inappropriate for this. I wanted to see if this would be useful for training/testing against a single full rotational sweep of a 3d lidar, i.e., object recognition in semi-real time. So I'm assuming that rather than an entire 3d scene, I only have point projections from a single location in space. While eventually those would be 360 degrees, to start with (just to get some data to play with), I used a Kinect, so point projections from a single viewport. I snapped pointclouds with that for a number of objects, across a number of scenes/angles, labeled them by hand, and that's what I'm using.

@loicland
Owner

loicland commented Jun 8, 2018

If your shots are indoors, I would advise using the s3dis model.

It seems that the labels are not being read correctly by the algorithm.
1 - is inference working fine?
2 - is the label field in the h5 file in /features correct? (a quick check is sketched below)
3 - if not, check what is happening around line 127 of /partition/partition.py
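
For point 2, a quick check along these lines should do (the path is a placeholder; the keys are the ones read_features in /partition/provider.py uses):

import h5py
import numpy as np

# placeholder path: point this at one of the h5 files under /features/
with h5py.File("features/train/some_scene.h5", "r") as data_file:
    labels = np.array(data_file["labels"]) if "labels" in data_file else np.array([])
    print("vertices:", len(data_file["linearity"]))
    print("labels shape:", labels.shape, "unique values:", np.unique(labels))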

@lostcolony
Author

Well, I'm not using either model, but looking to train one from scratch. I just meant I used the sema3d structure as a model, since it had unlabeled test data (and the s3dis used k-fold validation, splitting training into train/test sets; the former more closely matches what I'm trying to do). Setting epochs to -1 does execute (with a not unexpected 0% accuracy).

Looking into an object at random from features, I see 939 points in xyz, and 939 labels.

@loicland
Owner

Hi,

Can you post the content of the labels value in a /features/ h5 file? Or post a link to a .h5 file here?

@lostcolony
Author

Here's a zip with all the h5s (not many, again, very small data set to start with) - https://s3.amazonaws.com/kinect-toy-data/data.zip

@loicland
Owner

Ok thanks. So it seems like none of the labels of your training dataset are read correctly.

In /partition/partition.py around line 128, is has_labels equal to True for files in the training set? If so, what are the labels like?

@lostcolony
Author

I should clarify; I'm actually using the elif condition for custom dataset. Here's what I changed that implementation to -

            elif args.dataset=='custom_dataset':
                #implement in provider.py your own read_custom_format outputting xyz, rgb, labels
                #example for ply files
                if (os.path.isfile(label_file)):
                    xyz, labels = read_ply(data_file, label_file)
                    if(len(list(filter(lambda x: x!=0, labels))) == 0):
                       raise Exception("LABELS ARE ALL ZERO")
                    print("LABELS:", str(labels))
                else:
                    print("HAS NO LABELS:" + data_file)
                    if(data_file.find("test") == -1):
                       raise Exception("DATA FILE" + data_file + " HAS NO LABELS")
                    xyz = read_ply(data_file, label_file)
                    labels = []
                print("XYZ", xyz)
                #another one for las files without rgb
                #xyz = read_las(data_file)
                if args.voxel_width > 0:
                    #an example of pruning without labels
                    xyz, rgb, labels = libply_c.prune(xyz, args.voxel_width, np.zeros(xyz.shape, dtype='u1'), np.array(1,dtype='u1'), 0)
                    #another one without rgb information nor labels
                    #xyz = libply_c.prune(xyz, args.voxel_width, np.zeros(xyz.shape,dtype='u1'), np.array(1,dtype='u1'), 0)[0]
                #if no labels available simply set here labels = []
                #if no rgb available simply set here rgb = [] and make sure to not use it later on
                rgb = []

I also changed read_ply to take in both the data file and the label file, and to handle them appropriately; the raise statements above should indicate that some sort of label info is being read in at that stage.
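
Roughly, the modified read_ply looks like this (a simplified sketch; it assumes xyz lives in the ply's 'vertex' element and that labels are one integer per line in a separate text file):

import os
import numpy as np
from plyfile import PlyData

def read_ply(data_file, label_file=None):
    """read xyz from a ply file and, if present, per-point labels from a side file"""
    plydata = PlyData.read(data_file)
    vertex = plydata['vertex']
    xyz = np.stack([vertex['x'], vertex['y'], vertex['z']], axis=1).astype('float32')
    if label_file is not None and os.path.isfile(label_file):
        labels = np.loadtxt(label_file, dtype='uint8')
        return xyz, labels
    return xyz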

@loicland
Owner

Does the line xyz, labels = read_ply(data_file, label_file) allow you to retrieve nonzero labels? If so, what happens after xyz, rgb, labels = libply_c.prune(xyz, args.voxel_width, np.zeros(xyz.shape, dtype='u1'), np.array(1,dtype='u1'), 0)?

@lostcolony
Author

Ah-hah. Okay. I added some split logic to handle things differently if it was an unlabeled file, vs a labeled one (trying, again, to use the sema3d provider as an example, since that is a case which has both), and it ran.
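
For reference, the split is roughly this (simplified; n_labels is the dataset's class count, and labels is the per-point array returned by read_ply):

if args.voxel_width > 0:
    if len(labels) > 0:
        # labeled train file: let prune() aggregate the labels per voxel
        xyz, rgb, labels = libply_c.prune(xyz, args.voxel_width,
                                          np.zeros(xyz.shape, dtype='u1'), labels, n_labels)
    else:
        # unlabeled test file: prune the geometry only
        xyz = libply_c.prune(xyz, args.voxel_width,
                             np.zeros(xyz.shape, dtype='u1'), np.array(1, dtype='u1'), 0)[0]
        labels = []
rgb = []  # this dataset has no rgb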

I'm now running into a different issue. If I prevent pruning on the test data, I get some mismatched arrays when calculating the confusion matrix; if I allow it, it all works, but attempting to upsample the prediction to the original test files fails because the feature file has no RGB data, specifically -

1 / 3---> test2
reading the subsampled file...
Traceback (most recent call last):
  File "partition/write_custom.py", line 69, in <module>
    geof, xyz, rgb, graph_nn, l = read_features(fea_file)
  File "/home/ubuntu/dev/spgraph/partition/provider.py", line 382, in read_features
    rgb = data_file["rgb"][:]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/_hl/group.py", line 167, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'rgb' doesn't exist)"

@lostcolony
Author

And following that up with the obvious change, replacing the rgb line with rgb = [], I'm getting -

1 / 3---> test2
reading the subsampled file...
upsampling...
read lines 0 to 5000000
Traceback (most recent call last):
  File "partition/write_custom.py", line 76, in <module>
    labels_ups = interpolate_labels_batch(data_file, xyz, labels_full, args.ver_batch)
  File "/home/ubuntu/dev/spgraph/partition/provider.py", line 488, in interpolate_labels_batch
    , skip_header=i_rows)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 1713, in genfromtxt
    first_line = _decode_line(next(fhd), encoding)
  File "/home/ubuntu/anaconda3/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 630: invalid start byte

So...maybe it wrote out something wonky?

@loicland
Owner

Right. These are bugs; I'm looking into them now and will push a commit when they're fixed.

@loicland
Owner

loicland commented Jun 14, 2018

Hi,

I did not have time to provide a fully tested solution and I don't want to break the repo just before leaving for 10 days for a conference. Here are some of the (not fully tested) changes I have started to make:

new read_semantic3d_format in /partition/provider.py

def read_semantic3d_format(data_file, n_class, file_label_path, voxel_width, ver_batch):
    """read the format of semantic3d. 
    ver_batch : if ver_batch>0 then load the file ver_batch lines at a time.
                useful for huge files (> 5millions lines)
    voxel_width: if voxel_width>0, voxelize data with a regular grid
    n_class : the number of class; if 0 won't search for labels (test set)
    implements batch-loading for huge files and pruning""" 
    pruning = voxel_width > 0 #subsample the point cloud
    batch_loading = ver_batch>0 #load the point cloud by batch
    has_labels = n_class > 0 #will load the label file
    
    i_rows = 0
    xyz = np.zeros((0, 3), dtype='float32')
    rgb = np.zeros((0, 3), dtype='uint8')
    
    if pruning: #each voxel has an array of points encoding the labels of the points it contains
        labels = np.zeros((0, n_class+1), dtype='uint32') #first column is for unannotated points
    else: #each point has a label
        labels = np.zeros((0,), dtype='uint32')

    while True: #loop until the whole point cloud is loaded
        
        if batch_loading:
            try:
                vertices = np.genfromtxt(data_file
                         , delimiter=' ', max_rows=ver_batch
                         , skip_header=i_rows)
            except StopIteration:
                #end of file
                break
            if len(vertices)==0:
                 break
        else: #load the whole file at once
            vertices = np.genfromtxt(data_file, delimiter=' ')
                
        xyz_full = np.array(vertices[:, 0:3], dtype='float32')
        rgb_full = np.array(vertices[:, 4:7], dtype='uint8')
        del vertices
        
        if has_labels:
            if batch_loading:
                labels_full = np.genfromtxt(file_label_path, dtype="u1", delimiter=' '
                            , max_rows=ver_batch, skip_header=i_rows)
            else:
                labels_full = np.genfromtxt(file_label_path, dtype="u1", delimiter=' ')
                
        if pruning: #subsampling
            if has_labels:
                xyz_sub, rgb_sub, labels_sub = libply_c.prune(xyz_full, voxel_width
                                             , rgb_full, labels_full, n_class)
                labels = np.vstack((labels, labels_sub))
            else:
                xyz_sub, rgb_sub, l = libply_c.prune(xyz_full, voxel_width
                                    , rgb_full, np.zeros(1, dtype='uint8'), 0)
            del xyz_full, rgb_full
            
            xyz = np.vstack((xyz, xyz_sub))
            rgb = np.vstack((rgb, rgb_sub))
            
        else: #no subsampling
            xyz = np.vstack((xyz, xyz_full))
            rgb = np.vstack((rgb, rgb_full))
            if has_labels:
                labels = np.hstack((labels, labels_full))
            
        i_rows = i_rows + ver_batch
        
        if not batch_loading: break #no need to loop if no batch loading
        
    if has_labels:
        return xyz, rgb, labels
    else:
        return xyz, rgb
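
For reference, a call on a training file would look something like this (the file names are placeholders):

# placeholder paths; 8 classes, 3 cm voxels, batches of 5 million lines
xyz, rgb, labels = read_semantic3d_format("data/train/scene1.txt", 8,
                                          "data/train/scene1.labels", 0.03, 5000000)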

new read_features in /partition/provider.py

def read_features(file_name):
    """read the geometric features, clouds and labels from a h5 file"""
    data_file = h5py.File(file_name, 'r')
    #first get the number of vertices
    n_ver = len(data_file["linearity"])
    has_labels = len(data_file["labels"])
    #the labels can be empty in the case of a test set
    if has_labels:
        labels = np.array(data_file["labels"])
    else:
        labels = []
    if "rgb" in data_file:
        rgb = data_file["rgb"][:]
    else:
        rgb = []
    #---create the arrays---
    geof = np.zeros((n_ver, 4), dtype='float32')
    #---fill the arrays---
    geof[:, 0] = data_file["linearity"]
    geof[:, 1] = data_file["planarity"]
    geof[:, 2] = data_file["scattering"]
    geof[:, 3] = data_file["verticality"]
    xyz = data_file["xyz"][:]
    source = data_file["source"][:]
    target = data_file["target"][:]
    distances = data_file["distances"][:]
    #---set the graph---
    graph_nn = dict([("is_nn", True)])
    graph_nn["source"] = source
    graph_nn["target"] = target
    graph_nn["distances"] = distances
    return geof, xyz, rgb, graph_nn, labels

change the last lines of partition/vizualize.py to:

    del rgb_up
    #
    if args.ver_batch == 0:
        pred_up = interpolate_labels(xyz_up, xyz, pred_full, args.ver_batch)
    else:
        pred_up = interpolate_labels_batch(xyz_up, xyz, pred_full, args.ver_batch)
    print("writing the upsampled prediction file...")
    prediction2ply(ply_file + "_pred_up.ply", xyz_up, pred_up+1, n_labels, args.dataset)

As for your last error, I am not sure. You can try to use partition/vizualize.py with --upsample 1 to visualize the result on the full input cloud.

Let me know if this fixes some of your errors. Otherwise I will provide a new patch in about 2 weeks.

cheers,

@loicland
Owner

Hi,

did you have a chance to try those changes, and did they fix your problems?

lalital added a commit to lalital/superpoint_graph that referenced this issue Jul 28, 2018