Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rwjs/session fix m11 - Fix for making sure that sessions are removed from session_map #1048

Merged
merged 13 commits into from
Mar 16, 2022

Conversation

robsimmonds
Copy link
Contributor

No description provided.

… that

a pointer to an end-of-life session can be removed from the
SessionManager session map.
Modefined the way that sessions get deleted to make sure they all get
removed from the SessionManager session map.
Made change to session deletion so that it is consistent in both cases when
a WebSocket closes and when the last OnMessageTask referencing a session
is deleted.
@confluence
Copy link
Collaborator

What's the easiest way to reproduce this bug for testing purposes?

The CI unit tests are all crashing (with a segfault in MomentTest.CheckConsistency). I will check if I can reproduce this locally.

@confluence
Copy link
Collaborator

I get a segfault in IcdTest.RegionRegister when I run the tests locally.

@robsimmonds
Copy link
Contributor Author

This problem is likely due to the SessionManager pointer not being set in the testing backend. It looked to me that only some of the tests used the SessionManager so either one will have to be created for the testing backend, or an extra test will have to be put in the code to prevent that shared_ptr having the method called on it in ~OnMessageTask. When I tried the tests locally they seemed to be working, even though I was not expecting them to. If I can get the tests to fail I would likely be able to fix the problem.

src/Session/OnMessageTask.h Outdated Show resolved Hide resolved
@ajm-ska
Copy link
Collaborator

ajm-ska commented Mar 12, 2022

@robsimmonds I know the Moment ICD tests on Ubuntu 20.04 and 21.10 appear to continuously fail on this branch and the rwjs/session_fix_m11 branch for some reason. It may not be due to any changes in these branches because I did notice an occasional failure of the Moment tests occurring in other branches recently. I'm just surprised that they are consistently failing here for some reason, plus only failing on 2 out of 7 systems. We (@acdo2002!) can investigate more on Monday. @acdc2002 recently reported that the Ubuntu contour tests on the 'dev' branch take 10% longer than before. I wonder if whatever caused that is having the same effect here. Maybe we just need to increase a timeout on the Moment ICD tests to make them less strict.

@robsimmonds
Copy link
Contributor Author

@robsimmonds I know the Moment ICD tests on Ubuntu 20.04 and 21.10 appear to continuously fail on this branch and the rwjs/session_fix_m11 branch for some reason. It may not be due to any changes in these branches because I did notice an occasional failure of the Moment tests occurring in other branches recently. I'm just surprised that they are consistently failing here for some reason, plus only failing on 2 out of 7 systems. We (@acdo2002!) can investigate more on Monday. @acdc2002 recently reported that the Ubuntu contour tests on the 'dev' branch take 10% longer than before. I wonder if whatever caused that is having the same effect here. Maybe we just need to increase a timeout on the Moment ICD tests to make them less strict.

OK, thanks it is strange. Is there a good way for me to download the complete set of test files so I can also see if they run on my ubuntu test system. I know they used to be in google, which I have problems downloading to from our research cluster, but could download them to my home machine.

@ajm-ska
Copy link
Collaborator

ajm-ska commented Mar 12, 2022

@robsimmonds Unfortunately we don't have a definitive list of the image files currently used by the ICD tests. We read from a folder called set_QA. It is ~270GB in size so it is not practical to share it. But I'm fairly sure that not all of the images are actually used, so the folder could be significantly smaller.

I will create a new issue on the ICD-test repository for us to make a list of the files that are actually used in the tests as it would be useful to know.

For now I just checked the Moment tests. I put the images they use in a folder, and temporarily uploaded it to our ASIAA webserver (we can find a better location later). It is 28GB (20GB zipped)! It unzips to set_QA.
You can download it here:
wget https://alma.asiaa.sinica.edu.tw/_downloads/ICD_Moments_images.tgz

I tried running the Moment tests on an Ubuntu 20.04 server natively and the tests pass fine using your branch. So maybe the Ubuntu Docker containers don't have enough resources for some reason, even though they have been working fine before.

If you would like to try it yourself, here is a recap on how to run the ICD tests:

  1. Prepare the ICD tests
git clone https://github.com/CARTAvis/carta-backend-ICD-test.git
cd carta-backend-ICD-test
git submodule init
git submodule update
npm install
cd protobuf
./build_proto.sh
cd ..

Modify src/test/config.json to set the correct serverURL. e.g
"serverURL": "ws://127.0.0.1:3002",

  1. Start the carta-backend pointing towards the parent folder that contains the set_QA folder. e.g
    ./carta_backend /images --port 3002 --debug_no_auth --no_frontend

  2. In the carta-backend-ICD-test folder, run the Moment ICD tests:

CI=true npm test src/test/MOMENTS_GENERATOR_CASA.test.ts
CI=true npm test src/test/MOMENTS_GENERATOR_EXCEPTION.test.ts
CI=true npm test src/test/MOMENTS_GENERATOR_FITS.test.ts
CI=true npm test src/test/MOMENTS_GENERATOR_HDF5.test.ts
CI=true npm test src/test/MOMENTS_GENERATOR_SAVE.test.ts
CI=true npm test src/test/MOMENTS_GENERATOR_CANCEL.test.ts

@robsimmonds
Copy link
Contributor Author

@ajm-asiaa thanks for looking into this. It would be good if all the ICD test files were available for download, but that might be difficult given their size. It might be good to mention on the ICD tests github page that the ICD tests need a set of files that are not currently in a public repository just so people don't try to run the tests and think they are broken.

@veggiesaurus veggiesaurus mentioned this pull request Mar 15, 2022
@confluence confluence self-assigned this Mar 16, 2022
@confluence confluence merged commit 8ac76aa into dev Mar 16, 2022
@confluence confluence deleted the rwjs/session_fix_m11 branch March 16, 2022 10:22
@robsimmonds robsimmonds mentioned this pull request Mar 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants