
Index out of bounds in Aggregation Query #4

Closed
InkosiZhong opened this issue Nov 26, 2022 · 23 comments

Comments

@InkosiZhong

I followed the Reproducing Experiments instructions and got an index-out-of-bounds error when executing NightStreetAggregateQuery and NightStreetAveragePositionAggregateQuery in night_street_offline.py.

Target DNN Invocations: 100%|██████████████████████| 7000/7000 [00:00<00:00, 2924606.83it/s]
Propagation:   0%|                                               | 0/973136 [00:00<?, ?it/s]/home/inkosizhong/Lab/VideoQuery/tasti/tasti/query.py:37: RuntimeWarning: invalid value encountered in true_divide
  weights = weights / weights.sum()
Propagation: 100%|███████████████████████████████| 973136/973136 [00:25<00:00, 38122.22it/s]
r 1
Traceback (most recent call last):
  File "tasti/examples/night_street_offline.py", line 276, in <module>
    query.execute_metrics(err_tol=0.01, confidence=0.05)
  File "/home/inkosizhong/Lab/VideoQuery/tasti/tasti/query.py", line 86, in execute_metrics
    res = self._execute(err_tol, confidence, y)
  File "/home/inkosizhong/Lab/VideoQuery/tasti/tasti/query.py", line 69, in _execute
    estimate, nb_samples = sampler.sample()
  File "/home/inkosizhong/Lab/VideoQuery/blazeit/blazeit/aggregation/samplers.py", line 58, in sample
    sample = self.get_sample(Y_pred, Y_true, t)
  File "/home/inkosizhong/Lab/VideoQuery/blazeit/blazeit/aggregation/samplers.py", line 105, in get_sample
    yt_samp = Y_true[nb_samples]
IndexError: index 973136 is out of bounds for axis 0 with size 973136

This error is raised in Sampler.sample() in blazeit/aggregation/samplers.py, where the index variable t increases without bound.
I guess that, under normal circumstances, EBS makes the sampling stop before reaching the upper bound (len(Y_true)), but I don't know why it keeps sampling until the last frame during the reproduction process.
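The `weights / weights.sum()` warning in the log above is a clue: if the propagated weights sum to zero, the whole vector becomes NaN, and NaN sampling weights can break the weighted sampling that the stopping rule relies on. A minimal sketch of guarding against that case (not the actual tasti code; `normalize_weights` is a hypothetical helper):

```python
import numpy as np

def normalize_weights(weights):
    """Normalize to a probability vector; fall back to uniform if the sum is invalid."""
    weights = np.asarray(weights, dtype=np.float64)
    total = weights.sum()
    if total <= 0 or not np.isfinite(total):
        # All-zero (or non-finite) weights: dividing would produce NaNs,
        # exactly the "invalid value encountered in true_divide" warning above.
        return np.full(len(weights), 1.0 / len(weights))
    return weights / total

print(normalize_weights([0.0, 0.0, 0.0]))  # uniform fallback instead of NaNs
print(normalize_weights([1.0, 3.0]))       # [0.25 0.75]
```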

@ddkang
Collaborator

ddkang commented Nov 26, 2022

Are you using the correct datasets and dates?

@ddkang
Collaborator

ddkang commented Nov 26, 2022

This branch should reproduce SIGMOD, if you're using the correct data: https://github.com/stanford-futuredata/tasti/tree/sigmod

@InkosiZhong
Author

I have switched to the sigmod branch, but it behaves the same.
Besides, I noticed that the hyper-parameters in night_street_offline.py and night_street_online.py are different. For example:

  1. nb_train=3000 and nb_buckets=7000 in NightStreetOfflineConfig, but nb_train=1000 and nb_buckets=1000 in NightStreetOnlineConfig.
  2. NightStreetAggregateQuery: err_tol=0.01 and confidence=0.05 offline, while err_tol=0.1 and confidence=0.1 online.
    I'm wondering whether that is intended?

@InkosiZhong
Author

I am using the 2017-12-14.zip and 2017-12-17.zip downloaded from here, and the DNN outputs from here, following your guidance.

@InkosiZhong
Author

My complete process is as follows:

  1. set up the environment
# here I use the master branch because the sigmod branch has no tasti.yml
git clone https://github.com/stanford-futuredata/tasti.git 
cd tasti
conda env create -f tasti.yml
conda activate tasti3
cd ..

git clone https://github.com/stanford-futuredata/swag-python.git
cd swag-python/
conda install -c conda-forge opencv
pip install -e .
cd ..

git clone https://github.com/stanford-futuredata/blazeit.git
cd blazeit/
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
conda install -c conda-forge pyclipper
pip install -e .
cd ..

git clone https://github.com/stanford-futuredata/supg.git
cd supg/
pip install pandas feather-format
pip install -e .
cd ..

git clone -b sigmod https://github.com/stanford-futuredata/tasti.git tasti_sigmod
cd tasti_sigmod/
pip install -r requirements.txt
pip install -e .
mkdir cache # will be written in night_street_offline.py
  2. download the datasets (2017-12-14.zip and 2017-12-17.zip) and the DNN outputs (jackson-town-square-2017-12-14.json and jackson-town-square-2017-12-17.json)
  3. create a folder named datasets, unzip, and move 2017-12-14, 2017-12-17, and the 2 json files into it
  4. modify ROOT_DATA_DIR in tasti/examples/night_street_offline.py to point to .../datasets
  5. run python tasti/examples/night_street_offline.py
  6. tasti performs do_mining/do_training/do_infer/do_bucketting and fails at query.execute_metrics(err_tol=0.01, confidence=0.05) (NightStreetAggregateQuery)

@ddkang
Collaborator

ddkang commented Nov 28, 2022

Try an error tolerance of 0.05

@InkosiZhong
Author

Unfortunately, it doesn't work. I even tried an error tolerance of 0.9 but it still failed.
I printed the prediction value y_pred[i] and the true value float(y_true[i]) in propagation() after line 39 of query.py, and I found them to be very different (e.g. y_pred[i]=3 while y_true[i]=0).
Besides, top_distances even contains negative values.
Does that mean something went wrong during index generation? Should I try other configurations from night_street_online.py, such as nb_buckets=1000?

@ddkang
Collaborator

ddkang commented Nov 29, 2022

Yes I'm pretty sure something is wrong.

What are the hashes of the CSV files? This is what I see

e878ca724fd42d490dcc5d4ad8aa16cc  jackson-town-square-2017-12-14.csv
a72522b880023dfafea34e81448692b2  jackson-town-square-2017-12-17.csv
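For anyone else comparing against these hashes, the digests can also be computed portably with Python's hashlib instead of md5sum; `md5sum` below is just an illustrative helper, and the filenames are the ones from this thread:

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

for name in ("jackson-town-square-2017-12-14.csv",
             "jackson-town-square-2017-12-17.csv"):
    try:
        print(name, md5sum(name))
    except FileNotFoundError:
        print(name, "not found")
```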

@InkosiZhong
Author

Oh, the MD5 values are different.

MD5 (jackson-town-square-2017-12-14.csv) = 8abae92a0ac3b9f6513ca23ab2549430
MD5 (jackson-town-square-2017-12-17.csv) = b51637ba2b45b9eea37f9cfc81b562d8

But I re-downloaded the files from Google Drive and the MD5 values are still different from yours (and the same as mine above).

@ddkang
Collaborator

ddkang commented Nov 29, 2022

Try downloading them again, I may have uploaded the wrong version

@InkosiZhong
Author

Thank you very much! Now the MD5 values are correct. I will try again.

@InkosiZhong
Author

Unfortunately, the error still exists. Here are the MD5 values of the datasets downloaded from Google Drive.
Can you please verify whether they are correct?
Besides, I noticed there is a branch called tasti-compatibility in the blazeit project. Should I switch to that branch?

MD5 (2017-12-14-001.zip) = 11e1f424127a2463d0908fedd86719fd
MD5 (2017-12-17-002.zip) = bea086f82bdcc5f40a91d0ec2fbde4dd

@ddkang
Collaborator

ddkang commented Nov 29, 2022

Try the branch

@InkosiZhong
Author

InkosiZhong commented Nov 30, 2022

I have tried all branches of blazeit, including bugfix and tasti-compatibility, but none of them work either.
Unfortunately, I suspect it might be a bug in blazeit. Here are similar issues:
Error on reproduce the aggregation experiments (step 3 of Reproducing experiments section)
README is outdated. Please update.
However, blazeit seems to be no longer maintained, and none of these issues have been resolved. Could you please share a working version of blazeit that you used in your experiments? Thank you so much.

@ddkang
Collaborator

ddkang commented Nov 30, 2022

Are you sure you used the correct package versions of all packages in the SIGMOD branch?

@InkosiZhong
Author

I'm using the tasti.yml from the master branch to create the conda environment, and the requirements.txt from the SIGMOD branch (which is the same as in the master branch).
The only thing I have modified in the requirements.txt is

numba==0.50.1 -> numba==0.51

This change was needed because of the error below:

(tasti3)$ pip install -r requirements.txt
...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts.
datashader 0.13.0 requires numba>=0.51, but you have numba 0.50.1 which is incompatible.

since datashader=0.13.0 is specified by tasti.yml.
Besides, I skipped the PyTorch installation step in the README.md,

conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

because these packages are already specified in the tasti.yml,

pytorch=1.12.1=py3.8_cuda10.2_cudnn7.6.5_0
torchaudio=0.12.1=py38_cu102
torchvision=0.13.1=py38_cu102
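To rule out version drift like this, the installed versions can be checked against expected pins with importlib.metadata. The pin list below is illustrative, taken only from the versions mentioned in this thread, not an authoritative requirements file:

```python
from importlib.metadata import version, PackageNotFoundError

def check(pkg, expected):
    """Return a status string comparing the installed version to an expected pin."""
    try:
        installed = version(pkg)
    except PackageNotFoundError:
        return f"{pkg}: not installed"
    status = "OK" if installed.startswith(expected) else "MISMATCH"
    return f"{pkg}: installed {installed}, expected {expected} -> {status}"

# Pins discussed above (illustrative)
for pkg, pin in [("numba", "0.51"), ("datashader", "0.13.0"), ("torch", "1.12.1")]:
    print(check(pkg, pin))
```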

@ddkang
Collaborator

ddkang commented Nov 30, 2022

Commit 90461d5ad6af279d12f6589a90c1a987f261e352 on tasti with commit 8c147f3a0b42e3dcc0b7718b2c078b3dafaba7bf on blazeit works for me when run from scratch

@InkosiZhong
Author

I have no idea which step went wrong.
Could you please share the following files, to help me check whether the problem is in the tasti part or the blazeit part?

cache
  |- embeddings.npy
  |- model.pt
  |- reps.npy
  |- topk_dists.npy
  |- topk_reps.npy

Thank you very much.

@InkosiZhong
Author

I have tried the following configuration:

  1. skipped the conda environment creation (from tasti.yml), and successfully installed numba==0.50.1
  2. ran python tasti/examples/night_street_online.py

Unfortunately, the same error still exists.

@InkosiZhong
Author

I rechecked the correspondence between the data and the labels, and I see an offset between them.
The code below is based on VideoDataset and the way you read the .csv files. You can see the visualization here.

import cv2
import pandas as pd
from collections import defaultdict
# VideoDataset is the class defined in tasti/examples/night_street_offline.py

# read the .csv label file
len_14 = 973489  # number of frames in the 2017-12-14 video
df = pd.read_csv('../datasets/jackson-town-square/jackson-town-square-2017-12-14.csv')
df = df[df['object_name'].isin(['car'])]
frame_to_rows = defaultdict(list)
for row in df.itertuples():
    frame_to_rows[row.frame].append(row)
labels = [frame_to_rows[frame_idx] for frame_idx in range(len_14)]

# prepare the video dataset
video = VideoDataset(
    video_fp='../datasets/jackson-town-square/2017-12-14'
)

# visualization: draw the labeled boxes on every 8th frame
cnt = 0
for i, frame in enumerate(video):
    if i % 8 != 0:
        continue
    if cnt > 60:
        break
    frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    if labels[i]:
        for label in labels[i]:
            print(label)
            # fields 4-7 of the row tuple hold the box coordinates
            frame = cv2.rectangle(frame, (int(label[4]), int(label[5])),
                                  (int(label[6]), int(label[7])), (0, 255, 0), 2)
        cnt += 1
    frame = cv2.resize(frame, None, fx=0.25, fy=0.25)
    cv2.imwrite(f'annotation/{i}.png', frame)

@ddkang
Collaborator

ddkang commented Dec 5, 2022

Oops, sorry about the video issue. Thank you for investigating

@Christosc96

Does that mean the given version of the night_street dataset is flawed? If so, is there a fix or a corrected version of the data?

@InkosiZhong
Author

For me, I corrected the data by subtracting 300 (an estimated value) from all the frame numbers in the json files. However, the dataset offset is only one of the causes. I later found that the real cause of this problem was the numba version: using numpy functions under the njit decorator with prange in numba==0.50.1 leads to abnormal results (usually all zeros). This bug is fixed in later versions. Now I use the latest numba and TASTI works well.
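The frame-number correction described above (subtracting an estimated offset of 300) can be sketched on a toy label table; `shift_labels` is a hypothetical helper, and the `frame` column name matches the CSV labels used earlier in this thread:

```python
import pandas as pd

def shift_labels(df, offset=300):
    """Shift all frame numbers earlier by `offset`; drop rows that fall
    before frame 0 (labels that no longer match any decoded frame)."""
    out = df.copy()
    out["frame"] = out["frame"] - offset
    return out[out["frame"] >= 0].reset_index(drop=True)

# Toy labels standing in for jackson-town-square-2017-12-14.csv
labels = pd.DataFrame({"frame": [100, 300, 301, 1000],
                       "object_name": ["car"] * 4})
# frame 100 shifts below 0 and is dropped; 300, 301, 1000 become 0, 1, 700
print(shift_labels(labels))
```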
