This repository has been archived by the owner on Apr 7, 2022. It is now read-only.

[1LP][RFR] Fix HTTP 503 everywhere #10141

Merged
merged 3 commits into from
Jul 1, 2020


jarovo
Contributor

@jarovo jarovo commented May 22, 2020

Introduce wait_for_miq_ready, remove wait_for_web_ui

We were hitting HTTP 503 when accessing the MIQ API right after the server
was restarted. We were waiting for Web UI availability, but the API, though
served on a similar URL, is probably handled by a separate process, so requests
to it could return HTTP 503 while the rest of the UI was already working.

This patch renames wait_for_web_ui to wait_for_miq_ready and adds a blocking
wait for the API to it.

However, no work has been done to remove stale API objects elsewhere, if
there are any!
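
A minimal sketch of the kind of blocking wait described above, using only the standard library. The name wait_until_api_ready, the get_status callable, and the rule "anything other than 503 counts as ready" are assumptions made for illustration; this is not the code from the PR, which relies on the wait_for library.

```python
import time


def wait_until_api_ready(get_status, num_sec=900, delay=5,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it stops returning 503 or num_sec elapses.

    get_status is a hypothetical callable that returns an HTTP status code;
    it stands in for a real request against the API endpoint.
    Treating every non-503 status as "ready" is an assumption of this sketch.
    """
    deadline = clock() + num_sec
    while True:
        status = get_status()
        if status != 503:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"API still returned 503 after {num_sec} seconds")
        sleep(delay)


# Usage with a fake endpoint that answers 503 twice before it comes up.
fake_responses = iter([503, 503, 200])
result = wait_until_api_ready(lambda: next(fake_responses), num_sec=10, delay=0)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.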

Test results advocacy:

  • I ran the tests I have in my FAs. I saw no problem related to this PR.
  • I ran tests related to v2v where @sshveta saw HTTP 503. I saw quite a lot of failed tests (about 50% success rate), but I don't think the failures were related. I suspect the tests are failing even without the code in this PR in effect.
  • Smoke tests were all good, except for an unrelated problem with wait_for in the test cfme/tests/test_appliance.py/test_codename_in_log
# Don't mind this. This was used to test the v2v, now deactivated to see the smoke test results.
#{#{ py#test: cfme/tests/v2v/ -v --use-provider complete --long-running --provider-limit 4 }}

Contributor

@sshveta sshveta left a comment

Looks good @JaryN.
Can we run some tests like v2v_migrations.py so that we can validate that the HTTP 503 error is gone?

@jarovo jarovo changed the title Fix HTTP 503 everywhere [RFR]Fix HTTP 503 everywhere May 27, 2020
@jarovo jarovo changed the title [RFR]Fix HTTP 503 everywhere Fix HTTP 503 everywhere May 27, 2020
@dajoRH
Contributor

dajoRH commented May 27, 2020

I detected some fixture changes in commit 6c9b997c4d819557502b633104404829322a6654

Show fixtures

The global fixture configure_auth was changed, but I didn't find where it's used.
The global fixture candu_db_restore is used in the following files:

  • cfme/tests/candu/test_graph_groupbytag.py
    • test_tagwise

The global fixture ext_appliances_with_providers is used in the following files:

  • cfme/tests/cli/test_appliance_update.py
    • test_update_distributed_webui

The global fixture setup_global_appliance was changed, but I didn't find where it's used.
The global fixture setup_remote_appliances was changed, but I didn't find where it's used.
The local fixture appliance_police is used in the following files:

  • cfme/test_framework/appliance_police.py

The local fixture get_appliances_with_providers is used in the following files:

  • cfme/tests/cli/test_appliance_console_db_restore.py
    • test_appliance_console_dump_restore_db_local

The local fixture get_appliance_with_ansible is used in the following files:

  • cfme/tests/cli/test_appliance_console_db_restore.py
    • test_appliance_console_restore_pg_basebackup_ansible

The local fixture get_ext_appliances_with_providers is used in the following files:

  • cfme/tests/cli/test_appliance_console_db_restore.py
    • test_appliance_console_restore_db_external

The local fixture get_ha_appliances_with_providers is used in the following files:

  • cfme/tests/cli/test_appliance_console_db_restore.py

The local fixture start_evmserverd_after_module is used in the following files:

  • cfme/tests/cli/test_evmserverd.py

The local fixture configured_external_appliance is used in the following files:

  • cfme/tests/configure/test_log_depot_operation.py

The local fixture db_restore is used in the following files:

  • cfme/tests/optimize/test_bottlenecks.py
    • test_bottlenecks_report_event_groups
    • test_bottlenecks_report_show_host_events
    • test_bottlenecks_report_time_zone
    • test_bottlenecks_summary_event_groups
    • test_bottlenecks_summary_show_host_events
    • test_bottlenecks_summary_time_zone

The local fixture temp_appliance_global_region is used in the following files:

  • cfme/tests/test_db_migrate.py
    • test_db_migrate_replication

The local fixture setup_replication is used in the following files:

  • cfme/tests/test_replication.py
    • test_replication_powertoggle
    • test_replication_appliance_add_single_subscription
    • test_replication_re_add_deleted_remote
    • test_replication_delete_remote_from_global
    • test_replication_remote_to_global_by_ip_pglogical
    • test_replication_global_region_dashboard
    • test_replication_global_to_remote_new_vm_from_template

Please, consider creating a PRT run to make sure your fixture changes do not break existing usage 😃

@jarovo jarovo requested a review from sshveta May 28, 2020 21:24
@sshveta
Contributor

sshveta commented May 29, 2020

No HTTP 503 failure in test run 👍

@dajoRH
Contributor

dajoRH commented Jun 5, 2020

Would you mind rebasing this Pull Request against latest master, please? :trollface:
CFME QE Bot

@jarovo jarovo changed the title Fix HTTP 503 everywhere [WIPTEST] Fix HTTP 503 everywhere Jun 9, 2020
@jarovo jarovo changed the title [WIPTEST] Fix HTTP 503 everywhere [RFR] Fix HTTP 503 everywhere Jun 15, 2020
@jarovo
Contributor Author

jarovo commented Jun 16, 2020

@sshveta Is it normal that those v2v tests are failing so much?

@jarovo jarovo requested a review from john-dupuy June 25, 2020 12:11
Contributor

@john-dupuy john-dupuy left a comment

LGTM, but it looks like no tests ran in the most recent PRT run?

@jarovo
Contributor Author

jarovo commented Jun 25, 2020

Nothing really changed since I ran this with results I was happy with.

It is not easy to do proper testing. I was asked to test with V2V tests, but many of them are failing for reasons I believe are not caused by this PR. At least we saw none of the HTTP errors this should be fixing.

This touches many tests, but it is quite a trivial change. Apart from the rename, I just added a wait_for call.

I believe the smoke tests likely did hit the changed function.

@john-dupuy john-dupuy changed the title [RFR] Fix HTTP 503 everywhere [1LP][RFR] Fix HTTP 503 everywhere Jun 26, 2020
@@ -23,6 +23,7 @@
from manageiq_client.api import APIException
from manageiq_client.api import ManageIQClient as VanillaMiqApi
from urllib3.exceptions import ConnectionError
from wait_for import _get_timeout_secs
Member

This private function is triggering an exception within a test on PRT. Why is it being imported?

It looks like it's being used to process a timeout kwarg before passing it to wait_for as num_sec. Can't we just pass the timeout directly and let wait_for process it?

>       appliance.wait_for_miq_ready()

cfme/tests/test_appliance.py:443: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
cfme/utils/log.py:177: in newfunc
    return func(*args, **kwargs)
cfme/utils/appliance/__init__.py:1486: in wait_for_miq_ready
    num_secs = _get_timeout_secs(timeout)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

kwargs = 900

    def _get_timeout_secs(kwargs):
>       if "timeout" in kwargs and kwargs["timeout"] is not None:
E       TypeError: argument of type 'int' is not iterable

.cfme_venv/lib64/python3.7/site-packages/wait_for/__init__.py:36: TypeError
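
The failure in the traceback above can be reproduced in isolation. In this sketch, get_timeout_secs_stand_in is a hypothetical, simplified stand-in for the private helper: only its first condition is copied from the traceback, and the return values are assumptions. It illustrates that the helper expects the dict of keyword arguments, not the bare timeout value.

```python
def get_timeout_secs_stand_in(kwargs):
    """Simplified stand-in for the private helper seen in the traceback.

    Only the first condition comes from the traceback; the return values
    are assumptions made so that the sketch runs.
    """
    if "timeout" in kwargs and kwargs["timeout"] is not None:
        return kwargs["timeout"]
    return kwargs.get("num_sec")


# Passing the keyword arguments as a dict works.
seconds = get_timeout_secs_stand_in({"timeout": 900})

# Passing the bare number, as in the failing call, raises TypeError,
# because the `in` operator cannot search inside an int.
try:
    get_timeout_secs_stand_in(900)
    raised = False
except TypeError:
    raised = True
```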

Contributor Author

DOH! How could I miss this?

I removed _get_timeout_secs since it is private, but I had to make wait_for_miq_ready accept num_sec instead of timeout.

I checked all usages in PyCharm. All uses of timeout should now be replaced with num_sec.
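
A small sketch of why every call site had to be checked after the rename: in Python, a caller that still passes the old keyword fails with TypeError as soon as it runs. The function wait_for_miq_ready_sketch is a hypothetical stand-in, and its default of 900 simply mirrors the value seen in the traceback.

```python
def wait_for_miq_ready_sketch(num_sec=900):
    """Hypothetical stand-in; it only reports the wait budget it was given."""
    return num_sec


# A call site updated to the new keyword works.
budget = wait_for_miq_ready_sketch(num_sec=600)

# A call site that still uses the old keyword fails immediately.
try:
    wait_for_miq_ready_sketch(timeout=600)
    stale_call_failed = False
except TypeError:
    stale_call_failed = True
```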

@mshriver mshriver merged commit b2c0140 into ManageIQ:master Jul 1, 2020

5 participants