🐛 maintenance/flaky tests #2867

Closed

Member

@pcrespov pcrespov commented Mar 2, 2022

What do these changes do?

Tackles this flaky test:

pytest --pdb --setup-show tests/unit/with_dbs/10/test_resource_manager.py::test_interactive_services_removed_after_logout  

which sometimes produces this error:

services/web/server/tests/unit/with_dbs/10/test_resource_manager.py:439: in test_interactive_services_removed_after_logout
    mocked_director_v2_api["director_v2_core.stop_service"].assert_awaited_with(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

attr = 'assert_awaited_with', args = ()
kwargs = {'app': <Application 0x7f04b08d11c0>, 'save_state': False, 'service_uuid': 'a19df349-e903-4ce9-92f1-8f267b08fa60'}

    def wrapper(attr, /, *args, **kwargs):
>       return getattr(mock.mock, attr)(*args, **kwargs)
E       AssertionError: expected await not found.
E       Expected: stop_service(app=<Application 0x7f04b08d11c0>, service_uuid='a19df349-e903-4ce9-92f1-8f267b08fa60', save_state=False)
E       Actual: stop_service(app=<Application 0x7f04b08d11c0>, service_uuid='a19df349-e903-4ce9-92f1-8f267b08fa60', save_state=True)
E       
E       pytest introspection follows:
E       
E       Kwargs:
E       assert equals failed
E         {                                {                               
E           'app': <Application 0x7f04b08    'app': <Application 0x7f04b08 
E         d11c0>,                          d11c0>,                         
E           'save_state': True,              'save_state': False,          
E           'service_uuid': 'a19df349-e90    'service_uuid': 'a19df349-e90 
E         3-4ce9-92f1-8f267b08fa60',       3-4ce9-92f1-8f267b08fa60',      
E         }                                }

/opt/hostedtoolcache/Python/3.8.12/x64/lib/python3.8/unittest/mock.py:256: AssertionError
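
The assertion in the trace above can be reproduced in isolation. This is only a minimal sketch: `logout` and the `"some-uuid"` value are hypothetical stand-ins for the code path the test exercises, not the webserver's actual code. It shows that `AsyncMock.assert_awaited_with` compares against the last await, so a single differing keyword argument (`save_state`) is enough to fail.

```python
import asyncio
from unittest.mock import AsyncMock


async def logout(stop_service, *, save_state: bool) -> None:
    # Hypothetical stand-in for the code under test: it awaits the
    # (mocked) stop_service with whatever save_state was decided upstream.
    await stop_service(service_uuid="some-uuid", save_state=save_state)


def expectation_holds(actual_save_state: bool) -> bool:
    """Return True if the test's expectation (save_state=False) is met."""
    stop_service = AsyncMock()
    asyncio.run(logout(stop_service, save_state=actual_save_state))
    try:
        # Same kind of check as in the failing test
        stop_service.assert_awaited_with(service_uuid="some-uuid", save_state=False)
    except AssertionError:
        return False
    return True
```

If the value of `save_state` depends on timing or ordering upstream, the check passes on some runs and fails on others, which would match the flaky behavior reported here.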

Highlights on changes

  • ♻️ splits modules in the projects plugin:
    1. core functionality internal to the plugin moved to _core_* modules. This was carefully set up to reduce coupling (see new pydeps tooling added). See (1).
    2. splits the projects.projects_handler module into two submodules: one for standard methods and one for the rest (see module docs). See (3).
    3. projects.projects_api is a facade module that exposes the functions that can be used from other plugins. See (2).
      image
  • ♻️ the core of deleting a project moved to projects._core_delete and was reimplemented. NOTE that projects_handlers_crud.delete_project first marks the project as invisible and creates a fire-and-forget task that now deletes the project from the db at the end.
  • Adds scripts/pydeps tools to visualize dependencies
  • 🐛 save_state flag logic: uses a UserRole comparison instead of is_guest_user to check whether the state shall be saved or not. The logic of is_guest_user is not very clear when it raises ProgrammationError ... --> marked as deprecated.
    • [x] 🗑️ removed models_library/settings/services_common.py and ♻️ moved its settings to DirectorV2Settings:
 "WEBSERVER_DIRECTOR_V2": {
  "DIRECTOR_V2_HOST": "director-v2",
  "DIRECTOR_V2_PORT": 8000,
  "DIRECTOR_V2_RESTART_DYNAMIC_SERVICE_TIMEOUT": 60,
  "DIRECTOR_V2_STOP_SERVICE_TIMEOUT": 3610,
  "DIRECTOR_V2_STORAGE_SERVICE_UPLOAD_DOWNLOAD_TIMEOUT": 3600,
  "DIRECTOR_V2_VTAG": "v2"
 },
  • ♻️ renamed the director_v2_api functions that include "service" to use "dynamic_service", because the director-v2 service now handles other types of services (and more to come):
    • this renaming revealed that retrieve and request_retrieve_dyn_service had the same implementation: now safe_/retrieve_dynamic_service_inputs !!
  • ♻️ unified director_v2_service_responses_mock fixtures and adjusted autouse only where justified (explained with NOTE)
  • 🗑️ removed some type: ignore because of pre-commit error
  services/web/server/tests/unit/with_dbs/conftest.py:27:0 UnsupportedCase: an import statement inlined with ':'. ⛔
  • 🐛 fixes syntax of CODEOWNERS file
  • ⬆️ updates pycln in pre-commit hooks
  • 🐛 @GitHK I would like to review the logic of remove_project_interactive_service as used here with you
    • renames as remove_project_dynamic_services
    • check ProjectLockError exception policy?
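
The save_state bullet in the highlights above can be sketched as follows. This is a minimal illustration under stated assumptions: the `UserRole` enum below is a hypothetical stand-in (member names and values are assumptions, not the project's actual definition), `should_save_state` is an illustrative helper name, and the rule "save only for roles above GUEST" is an assumption consistent with the test expecting `save_state=False`.

```python
from enum import IntEnum


class UserRole(IntEnum):
    # Hypothetical stand-in: ordered roles, so they can be compared directly
    ANONYMOUS = 0
    GUEST = 10
    USER = 20
    TESTER = 30


def should_save_state(user_role: UserRole) -> bool:
    # Assumption: state is only worth saving for roles above GUEST.
    # A plain comparison cannot raise, unlike a lookup-based helper.
    return user_role > UserRole.GUEST
```

Because the decision is a pure function of the role, it needs no database access and can be unit-tested without mocks.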

Related issue/s

How to test

CI tests on postgres and webserver

Checklist

  • restore pycln --all
  • Openapi changes? make openapi-specs, git commit ... and then make version-*
  • Database migration script? cd packages/postgres-database, make setup-commit, sc-pg review -m "my changes"
  • Unit tests for the changes exist
  • Runs in the swarm
  • Documentation reflects the changes
  • New module? Add your github username to .github/CODEOWNERS

codecov bot commented Mar 2, 2022

Codecov Report

Merging #2867 (d1e9f4b) into master (6c63bef) will decrease coverage by 2.3%.
The diff coverage is 83.4%.

@@           Coverage Diff            @@
##           master   #2867     +/-   ##
========================================
- Coverage    78.2%   75.8%   -2.4%     
========================================
  Files         673     680      +7     
  Lines       27602   27707    +105     
  Branches     3218    3218             
========================================
- Hits        21596   21026    -570     
- Misses       5222    5990    +768     
+ Partials      784     691     -93     
Flag              Coverage Δ
integrationtests  48.0% <44.3%> (-17.7%) ⬇️
unittests         72.8% <77.2%> (-1.5%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files                                         Coverage Δ
...r/src/simcore_service_webserver/activity/plugin.py  100.0% <ø> (ø)
...erver/src/simcore_service_webserver/director_v2.py  100.0% <ø> (ø)
...r/src/simcore_service_webserver/director_v2_api.py  100.0% <ø> (ø)
...simcore_service_webserver/projects/project_lock.py  94.4% <ø> (-0.3%) ⬇️
...imcore_service_webserver/garbage_collector_core.py  30.3% <19.2%> (-38.8%) ⬇️
packages/service-library/src/servicelib/utils.py       75.4% <20.0%> (+1.8%) ⬆️
.../server/src/simcore_service_webserver/users_api.py  88.4% <66.6%> (-3.6%) ⬇️
.../src/simcore_service_webserver/director_v2_core.py  68.0% <71.4%> (+1.1%) ⬆️
.../simcore_service_webserver/projects/_core_nodes.py  79.2% <79.2%> (ø)
...base/src/simcore_postgres_database/models/users.py  93.1% <80.0%> (-2.6%) ⬇️
... and 99 more

@pcrespov pcrespov self-assigned this Mar 2, 2022
@pcrespov pcrespov added the t:maintenance Some planned maintenance work label Mar 2, 2022
@pcrespov pcrespov added this to the R.Schumann+1 milestone Mar 2, 2022
@pcrespov pcrespov requested review from sanderegg, GitHK and odeimaiz and removed request for sanderegg March 2, 2022 20:39
Contributor

@GitHK GitHK left a comment

Please find some questions and comments below.

In the end I did not understand where the issue with the flaky test was. Could you point it out please?

Member

@sanderegg sanderegg left a comment

Thanks for taking care of this flakiness... please just have a look at my comments.

@pcrespov pcrespov requested a review from GitHK March 3, 2022 17:58
Contributor

@GitHK GitHK left a comment

I'm not so sure about how you use suppress. Please check my comment below.

The rest seems fine. 👍

@pcrespov pcrespov requested a review from GitHK March 9, 2022 11:02
Contributor

@GitHK GitHK left a comment

Looking good. Please check my comments.


# devenv
RUN pip install \
pydeps
Contributor

Cool, this looks nice and helpful. Maybe also pin this version?

Member Author

IMO a tool (in contrast with a library) should by default always get the latest and greatest ... unless there is a problem, in which case we might add constraints.
This is the approach we follow with e.g. package managers like pip, etc.

# Examples:
# - SEE https://pydeps.readthedocs.io/en/latest/#usage
#
# pydeps services/web/server/src/simcore_service_webserver --only "simcore_service_webserver.projects" --no-show --cluster
Contributor

I guess it produces an output. Where is that file placed and how would you view it?

Member Author

It creates an SVG in the working dir (more details in the doc).
I will attach to the PR the one I generated to decouple the modules in the projects plugin.

resource_value,
err,
)
can_remove_all = True
Contributor

Why set it to True here? It's already True. What am I missing?

Member Author

Yes, the default is True, then depending on the handler it becomes True or False.

Actually the problem is that I forgot to remove the with suppress(ProjectLockError) in remove_project_dynamic_services. I agreed with @sanderegg that it should be raised so the caller can identify the error instead of silently failing.

@@ -188,37 +190,48 @@ async def remove_disconnected_user_resources(
dead_key,
)

can_remove_all = True
Contributor

For me the name is not helpful. Why not can_remove_project?

Member Author

It is not a project that you remove, but resources.
Rationale: the parent function is called remove_disconnected_user_resources and this variable is can_remove_all. I think it is pretty clear from the context that it refers to the "resources of a disconnected user".

logger.debug("Deleting user %s with %s", f"{user_id=}", f"{user_role=}")
await remove_user(app=app, user_id=user_id)

except UserNotFoundError:
Contributor

I don't understand why this is silent. Is this correct?

Member Author

Yes.

Until we review the logic in the garbage-collector, the concurrent context in which this task cleans services is pretty wide.

One of the situations is that, when this function is called, the user might not even exist. In that case either get_user_role or remove_all_projects_for_user will raise UserNotFoundError and therefore there is nothing else to do.

# Even if any of the steps below fail, the project will remain invisible
# TODO: see https://github.com/ITISFoundation/osparc-simcore/pull/2522

await db.set_hidden_flag(f"{project_uuid}", enabled=True)
Contributor

You can basically hide a project while it is locked. Is this intended? Would it not cause other issues?

Member Author

@pcrespov pcrespov Mar 9, 2022

Yes, that is the basic idea.

Yes, it is intended on a temporary basis until the trash feature is in place (as explained in the note).

I did not see any other issues, but I cannot be 100% sure. In any case it is part of the trash feature, so we can even use it as a prototype. In addition, it also partially solves the problem of data deletion, since the project is removed from the database last (instead of first, as happened until now).
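
The ordering discussed above (hide first, remove from the database last) can be sketched as follows. This is a minimal sketch under stated assumptions: `FakeProjectsDB`, `_cleanup_and_delete`, `delete_project` and `run_demo` are hypothetical names, and the intermediate steps are placeholders; only the `set_hidden_flag(..., enabled=True)` call shape is taken from the snippet under review.

```python
import asyncio
from typing import List, Set


class FakeProjectsDB:
    """Hypothetical in-memory stand-in for the projects database."""

    def __init__(self) -> None:
        self.projects: Set[str] = {"project-1"}
        self.hidden: Set[str] = set()
        self.events: List[str] = []  # records the order of operations

    async def set_hidden_flag(self, project_uuid: str, *, enabled: bool) -> None:
        if enabled:
            self.hidden.add(project_uuid)
        else:
            self.hidden.discard(project_uuid)
        self.events.append("hidden")

    async def delete_project(self, project_uuid: str) -> None:
        self.projects.discard(project_uuid)
        self.events.append("deleted from db")


async def _cleanup_and_delete(db: FakeProjectsDB, project_uuid: str) -> None:
    # Placeholders for the real intermediate steps (assumed, not from the PR)
    db.events.append("services stopped")
    db.events.append("data deleted")
    # The database row goes last, so a failure above leaves a hidden project
    await db.delete_project(project_uuid)


async def delete_project(db: FakeProjectsDB, project_uuid: str) -> asyncio.Task:
    # Step 1: hide immediately, so the project disappears for the user
    await db.set_hidden_flag(project_uuid, enabled=True)
    # Step 2: schedule the rest in the background and return right away
    return asyncio.create_task(_cleanup_and_delete(db, project_uuid))


def run_demo() -> List[str]:
    async def main() -> List[str]:
        db = FakeProjectsDB()
        task = await delete_project(db, "project-1")
        await task  # awaited here only to observe the final order
        return db.events

    return asyncio.run(main())
```

In a real handler the returned task would not be awaited by the request; the demo awaits it only so that the order of events can be inspected.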

user_name_data = await get_user_name(app, user_id)
user_role: UserRole = await get_user_role(app, user_id)

with suppress(ProjectLockError):
Contributor

I see, if the project is locked, this will prevent stopping it since an error will be raised and caught by suppress.

Member Author

@pcrespov pcrespov Mar 9, 2022

After analyzing how remove_project_dynamic_services is used (and due to its intrinsic design limitations), I decided to change this behavior in the latest version and avoid silent failures if the project is locked. This way I leave it to the caller to handle that case. SEE the doc of this function for details.

I also agreed on this with @sanderegg

SEE notes in lock_project_and_notify_state_update
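
The difference between the two policies discussed in this thread can be sketched as follows. All names here are hypothetical stand-ins (including this `ProjectLockError` class and the `"service-1"` value); the sketch only contrasts `contextlib.suppress` with letting the caller handle the error.

```python
from contextlib import suppress
from typing import List


class ProjectLockError(Exception):
    """Hypothetical stand-in for the project's lock error."""


def stop_dynamic_services(locked: bool) -> List[str]:
    # Illustrative helper: refuses to act on a locked project
    if locked:
        raise ProjectLockError("project is locked")
    return ["service-1"]


def remove_silently(locked: bool) -> List[str]:
    # Old policy: the error is swallowed, so the caller cannot tell
    # "nothing to stop" apart from "could not stop because locked"
    stopped: List[str] = []
    with suppress(ProjectLockError):
        stopped = stop_dynamic_services(locked)
    return stopped


def remove_and_report(locked: bool) -> str:
    # New policy: the error propagates and the caller decides what to do
    try:
        stopped = stop_dynamic_services(locked)
    except ProjectLockError as err:
        return f"skipped: {err}"
    return f"stopped: {', '.join(stopped)}"
```

With the first variant a locked project looks identical to a project without running services; with the second, the caller can log, retry or skip explicitly.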

@pcrespov
Member Author

@ITISFoundation/dev-team After several reviews, this PR will be split into smaller ones. The concept of project_api, or in general a "plugin api", has to be reviewed and agreed upon before submitting a new PR.

@pcrespov pcrespov closed this Mar 17, 2022
@sanderegg sanderegg removed this from the E.Shackleton milestone Apr 1, 2022
@pcrespov pcrespov deleted the maintenance/flaky-tests branch June 22, 2022 18:14