
[Question] Detailed questions about the dataset #2

Closed
Kin-Zhang opened this issue Mar 23, 2022 · 30 comments

Comments


Kin-Zhang commented Mar 23, 2022

Thanks for providing the code, it's amazing work. 🤩

Here are some questions after reading the paper and the code README:

  1. Is the dataset provided in the repo the whole dataset from Table 7 of the paper, which has 399,776 frames?
  2. Is the expert agent the one from the LBC repo, as the link here suggests? Or is it the CARLA behavior agent, as the paper says and this link suggests? I didn't see the code for collecting the dataset.
  3. The paper says the dataset is collected in all towns, but as far as I know the official leaderboard public routes only cover Town01-06, and I didn't find any additional route files in this repo. Would you mind releasing the route files you used for data collection? If we want to compare against your method fairly, we should train on the same routes. Also, does "all towns" include Town07 and Town10HD as listed in Table 7 of the paper, or did you also build another map for training?

Looking forward to your reply, and thanks again for the paper and code.


dotchen commented Mar 23, 2022

Thank you for your interest in our project.

  1. Yes
  2. It is a slightly modified version of the behavior agent.
  3. "All towns" includes Town07 and Town10HD. You may follow these instructions to install them in CARLA.

@Kin-Zhang

Thank you for your interest in our project.

  1. Yes
  2. It is a slightly modified version of the behavior agent.
  3. "All towns" includes Town07 and Town10HD. You may follow these instructions to install them in CARLA.

Thanks for replying. Question 3 is about the route files: I know there are additional maps, but the CARLA leaderboard public routes don't include route files for these two towns. That's why I was curious about it.

Thanks again.


dotchen commented Mar 23, 2022

We use randomized routes to collect our dataset. This is similar to our previous project, World on Rails.


Kin-Zhang commented Mar 23, 2022

We use randomized routes to collect our dataset. This is similar to our previous project, World on Rails.

Oh, I see. I will check again. Thanks for answering, really appreciated.



@Kin-Zhang

Sorry to bother you, but is there any way to download the dataset through a Python script without a Box account? (I searched for a while; it seems to need the Box SDK and a login with a subscribed account.)
When I click download on the page, it cannot download everything at once.
[screenshot: Box error that the selected items exceed the download size limit]


dotchen commented Mar 23, 2022

Yes, I also just realized this issue and I am currently compressing the trajectories into .gz files. I will upload them in a few hours.

@Kin-Zhang

Ha! Thanks for replying so quickly.

By the way, I noticed that even a business Box account still has a maximum upload size, which is 150 GB here.


dotchen commented Mar 23, 2022

I will split the gz file so that each part is around 8 GB.

@Kin-Zhang

I will split the gz file so that each part is around 8 GB.

Thanks! And will a single download button be able to download everything at once, or will it show the error above about the selected items exceeding the download size limit?


dotchen commented Mar 23, 2022

Yes, that should be possible. The split-file format is going to be the same as the World on Rails dataset, except that the data is in lmdb and can be used directly with this repo. If not, separating them into something like two downloads should also work.


Kin-Zhang commented Mar 24, 2022

Yes, that should be possible. The split-file format is going to be the same as the World on Rails dataset, except that the data is in lmdb and can be used directly with this repo. If not, separating them into something like two downloads should also work.

Thanks! I will wait for the update. Thanks again!

@Watson52

Yes, that should be possible. The split-file format is going to be the same as the World on Rails dataset, except that the data is in lmdb and can be used directly with this repo. If not, separating them into something like two downloads should also work.

Hi, may I ask how to open the .mdb files? I want to have a look at the data.


Kin-Zhang commented Mar 24, 2022

Hi, may I ask how to open the .mdb files? I want to have a look at the data.

You should read the dataset code here; the repo shows how to read the data. For example:

import glob

import lmdb
import numpy as np

for full_path in glob.glob('{}/**'.format(self.data_dir)):
    # Toss a coin: randomly subsample trajectories
    if np.random.random() > self.percentage_data:
        continue
    # Open each trajectory lmdb read-only and start a transaction
    txn = lmdb.open(
        full_path,
        max_readers=1, readonly=True,
        lock=False, readahead=False, meminit=False).begin(write=False)
    n = int(txn.get('len'.encode()))            # number of frames in this trajectory
    town = str(txn.get('town'.encode()))[2:-1]  # town name stored as bytes
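
If you just want to peek at one trajectory outside the dataset classes, a minimal sketch like the following should work. It reuses the f'{tag}_{t:05d}' key pattern from basic_dataset.py; the trajectory path, the 'lidar' tag, and the float32 dtype are only assumptions for illustration.

import lmdb
import numpy as np

# Minimal sketch: inspect a single trajectory lmdb directly.
# Key pattern follows basic_dataset.py; the path, tag, and dtype below are
# assumptions for illustration, not necessarily the repo's exact layout.
trajectory_dir = 'LAV-full/some_trajectory'  # hypothetical path
txn = lmdb.open(
    trajectory_dir,
    max_readers=1, readonly=True,
    lock=False, readahead=False, meminit=False).begin(write=False)

num_frames = int(txn.get('len'.encode()))
print('frames:', num_frames)

raw = txn.get('lidar_00000'.encode())
if raw is not None:
    lidar = np.frombuffer(raw, dtype=np.float32)  # dtype assumed for illustration
    print('lidar buffer size:', lidar.shape[0])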


dotchen commented Mar 25, 2022

Hi @Kin-Zhang , the dataset is now on box: https://utexas.box.com/s/fcj52g9juilnp4mt5k5fsqcqkxae77cb

Let me know if you encounter any issues downloading or using the dataset. Thanks!


Watson52 commented Mar 25, 2022

Hi @Kin-Zhang , the dataset is now on box: https://utexas.box.com/s/fcj52g9juilnp4mt5k5fsqcqkxae77cb

Let me know if you encounter any issues downloading or using the dataset. Thanks!

Hi @dotchen, I also saw the new dataset link, thank you very much! By the way, may I ask how long it took to train LAV on 4 Titan Pascals? And how long does it take to collect the data, about 400K frames?

@penghao-wu

Hi, I wonder how I can decompress the downloaded files? I removed the postfix, but it says that they are not in gzip format.


dotchen commented Mar 25, 2022

Hi @Watson52,

Each stage takes a different amount of time, but they are all around 2-3 days with 4 Titan Pascals. It might be faster if you have better GPUs.

@dotchen dotchen reopened this Mar 25, 2022

dotchen commented Mar 25, 2022

Hi, I wonder how I can decompress the downloaded files? I removed the postfix, but it says that they are not in gzip format.

The files are a split tar.gz archive; please download them all and then decompress them together. There is no need to remove the postfix.


Kin-Zhang commented Mar 26, 2022

Here is a Stack Overflow answer on how to extract split files.

You can try the following command:

zcat LAV-full.gz.* | tar -x

[screenshot of the extraction in progress]
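
If the shell pipeline gives you trouble, a pure-Python equivalent is sketched below: it just concatenates the split parts and extracts the result with tarfile, assuming all parts named LAV-full.gz.* have been downloaded into the current directory.

import glob
import shutil
import tarfile

# Minimal sketch: join the split parts, then extract the joined archive.
# Assumes every part (LAV-full.gz.aa, LAV-full.gz.ab, ...) is present.
parts = sorted(glob.glob('LAV-full.gz.*'))

with open('LAV-full.tar.gz', 'wb') as joined:
    for part in parts:
        with open(part, 'rb') as f:
            shutil.copyfileobj(f, joined)

with tarfile.open('LAV-full.tar.gz', 'r:gz') as tar:
    tar.extractall('LAV-full')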


Kin-Zhang commented Mar 27, 2022

Hi @dotchen. I have some questions before training:

  1. [number of epochs for each step] There are four steps in the training. Do the epoch settings in this repo match the configuration used for the results in the paper? I saw that train_eg.py just uses 1 as the default number of epochs, which seems odd. I'd like to know this setting for your training, since I didn't see it in the paper appendix where the other training hyperparameters are listed.

    If possible, could you please tell me the number of epochs you set for these four training steps?

  2. [lidar sem data] The Point Painting step writes lidar_sem_ to the dataset; does the dataset provided here already have the lidar_sem data?


dotchen commented Mar 27, 2022

[number of epochs for each step] There are four steps in the training. Do the epoch settings in this repo match the configuration used for the results in the paper? I saw that train_eg.py just uses 1 as the default number of epochs, which seems odd. I'd like to know this setting for your training, since I didn't see it in the paper appendix where the other training hyperparameters are listed.

The provided weights are the 45 DS entry in the ablations. The number in the file names corresponds to the number of epochs they were trained for.

[lidar sem data] The Point Painting step writes lidar_sem_ to the dataset; does the dataset provided here already have the lidar_sem data?

Yes, it is already provided in the released dataset.
EDIT: Please use the point painting script.


Kin-Zhang commented Mar 28, 2022

The provided weights are the 45 DS entry in the ablations.

Thanks for letting me know.

Yes, it is already provided in the released dataset.

        self.seg_model = RGBSegmentationModel(self.seg_channels).to(self.device)
        self.seg_model.load_state_dict(torch.load(self.seg_model_dir, map_location=self.device))
        self.seg_model.eval()

@dotchen, does this seg model use seg_1.th, which was trained for only one epoch?


[lidar sem data] The Point Painting step writes lidar_sem_ to the dataset; does the dataset provided here already have the lidar_sem data?

I also found that the released dataset may not include all of the lidar_sem data: when I tried to train on all towns, some entries were missing (NoneType). Just checking with you to make sure I didn't miss something:

  File "/LAV/lav/utils/datasets/lidar_painted_dataset.py", line 27, in __getitem__
    lidar_painted = self.__class__.access('lidar_sem', lmdb_txn, index, 1).reshape(-1,len(self.seg_channels))
  File "/LAV/lav/utils/datasets/basic_dataset.py", line 83, in <listcomp>
    return np.stack([np.frombuffer(lmdb_txn.get((f'{tag}_{t:05d}{suffix}').encode()), dtype) for t in range(index,index+T)])
TypeError: a bytes-like object is required, not 'NoneType'
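
For anyone hitting the same error, a small scan like the sketch below can report which trajectories are missing lidar_sem frames. It reuses the lidar_sem_{t:05d} key pattern from basic_dataset.py and assumes an empty suffix; the data_dir path is hypothetical.

import glob

import lmdb

# Minimal sketch: list trajectories whose lmdb is missing lidar_sem frames.
# Key pattern follows basic_dataset.py (suffix assumed empty); data_dir is a
# hypothetical path to the extracted dataset.
data_dir = 'LAV-full'
for full_path in glob.glob('{}/**'.format(data_dir)):
    txn = lmdb.open(
        full_path,
        max_readers=1, readonly=True,
        lock=False, readahead=False, meminit=False).begin(write=False)
    n = int(txn.get('len'.encode()))
    missing = [t for t in range(n)
               if txn.get('lidar_sem_{:05d}'.format(t).encode()) is None]
    if missing:
        print('{}: {} of {} frames missing lidar_sem'.format(full_path, len(missing), n))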


dotchen commented Mar 29, 2022

I will have to take a deeper look. In the meantime, you can relabel the dataset by running the point painting script.


dotchen commented Mar 29, 2022

OK, it looks like I might have overwritten some of the lmdbs while testing the refactored code, causing some frames to be missing...
Please relabel the dataset by running the point painting script, sorry for the inconvenience.

@Kin-Zhang

Thanks for letting me know.

@keishihara

Thanks for all the discussion here. It was very helpful for catching up on the work.
I've read through it, and here are a couple of small notes for those working on this repo:

  1. Extracting the split gzip files with zcat LAV-full.gz.* | tar -x didn't work for me. Instead, just cat LAV-full.gz.* | tar -xz worked.
  2. I found that the trajectory atefbmmouv causes a core dump when loading the dataset, so I had to skip it to get rid of the error mentioned here:
File "/LAV/lav/utils/datasets/lidar_painted_dataset.py", line 27, in __getitem__
  lidar_painted = self.__class__.access('lidar_sem', lmdb_txn, index, 1).reshape(-1,len(self.seg_channels))
File "/LAV/lav/utils/datasets/basic_dataset.py", line 83, in <listcomp>
  return np.stack([np.frombuffer(lmdb_txn.get((f'{tag}_{t:05d}{suffix}').encode()), dtype) for t in range(index,index+T)])
TypeError: a bytes-like object is required, not 'NoneType'
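
A simple way to skip it is to filter the trajectory list before building the dataset; here is a minimal sketch, where the data_dir path is hypothetical and the blacklist contains only the trajectory that crashed for me.

import glob
import os

# Minimal sketch: drop known-bad trajectories before they reach the dataset loader.
BLACKLIST = {'atefbmmouv'}
data_dir = 'LAV-full'  # hypothetical path to the extracted dataset

trajectories = [
    p for p in glob.glob('{}/**'.format(data_dir))
    if os.path.basename(os.path.normpath(p)) not in BLACKLIST
]
print('kept {} trajectories'.format(len(trajectories)))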

@Kin-Zhang

I fixed the problem; I forgot to open a pull request to let others know.
Here is the commit: Kin-Zhang@fa045b6
@keishihara

@keishihara

@Kin-Zhang Thank you for your comment!
I will check it out :)

@JianLiMech

Thanks for all the discussion here. It was very helpful for catching up on the work. I've read through it, and here are a couple of small notes for those working on this repo:

1. Extracting the split gzip files with `zcat LAV-full.gz.* | tar -x` didn't work for me. Instead, just `cat LAV-full.gz.* | tar -xz` worked.

2. I found that the trajectory `atefbmmouv` causes a core dump when loading the dataset, so I had to skip it to get rid of the error mentioned [here](https://github.com/dotchen/LAV/issues/2#issuecomment-1080442621):
File "/LAV/lav/utils/datasets/lidar_painted_dataset.py", line 27, in __getitem__
  lidar_painted = self.__class__.access('lidar_sem', lmdb_txn, index, 1).reshape(-1,len(self.seg_channels))
File "/LAV/lav/utils/datasets/basic_dataset.py", line 83, in <listcomp>
  return np.stack([np.frombuffer(lmdb_txn.get((f'{tag}_{t:05d}{suffix}').encode()), dtype) for t in range(index,index+T)])
TypeError: a bytes-like object is required, not 'NoneType'

Hello, thank you for the information!

I had a problem when I tried to download and extract the dataset here.

I downloaded 2 parts of the compressed file (16 GB) and ran zcat LAV-full.gz.* | tar -x and cat LAV-full.gz.* | tar -xz, but neither worked. The error was:


gzip: LAV-full.gz.ab: not in gzip format

gzip: LAV-full.gz.ac: not in gzip format
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors

Do you know how I should extract the file?

@Kin-Zhang

Do you know how I should extract the file?

You must download all parts of the dataset; with only two of them, it cannot be extracted correctly.
