Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DPE-4239] Warn user when deploying charm with wrong architecture #613

Merged
merged 15 commits into from
Aug 12, 2024

Conversation

lucasgameiroborges
Copy link
Member

@lucasgameiroborges lucasgameiroborges commented Aug 6, 2024

Issue

When deploying a revision that doesn't correspond to the underlying architecture (amd/arm) the charm deployment bugs out with import errors instead of properly warning the user.

Solution

Not very pretty - Try-catching the first platform specific library (psycopg2 in our case) and creating a fake charm blocked with the error message.

I personally don't like it, but that's the only working solution we have so far, since the charm code will run the import statements before everything else. Let me know if you think it's something worth adding.

Tests

Unit and Integration tests added

Copy link

codecov bot commented Aug 6, 2024

Codecov Report

Attention: Patch coverage is 81.48148% with 5 lines in your changes missing coverage. Please review.

Project coverage is 68.76%. Comparing base (e284b6c) to head (8a94e3f).
Report is 5 commits behind head on main.

Files Patch % Lines
src/arch_utils.py 84.21% 3 Missing ⚠️
src/charm.py 75.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #613      +/-   ##
==========================================
+ Coverage   68.65%   68.76%   +0.10%     
==========================================
  Files          10       11       +1     
  Lines        2983     3009      +26     
  Branches      564      568       +4     
==========================================
+ Hits         2048     2069      +21     
- Misses        818      822       +4     
- Partials      117      118       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@lucasgameiroborges lucasgameiroborges changed the title WIP testing arch check [Proposal?][DPE-4239] Warn user when deploying charm with wrong architecture Aug 7, 2024
@lucasgameiroborges lucasgameiroborges marked this pull request as ready for review August 7, 2024 13:39
src/charm.py Show resolved Hide resolved
src/charm.py Outdated
Comment on lines 30 to 37
manifest_path = f"{os.environ.get('CHARM_DIR')}/manifest.yaml"
if os.path.exists(manifest_path):
with open(manifest_path, "r") as manifest:
charm_arch = yaml.safe_load(manifest)["bases"][0]["architectures"][0]
hw_arch = os.uname().machine
if (charm_arch == "amd64" and hw_arch != "x86_64") or (
charm_arch == "arm64" and hw_arch != "aarch64"
):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to move this check at the top of the exception handling and create and run the fake charm only when we detect wrong arch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm: Let's suppose the code enters this error handling block (psycopg2 is borked) but the check doesn't pass (architecture seems to match) - Then we'll skip creating the fake charm and let the real charm fail?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this check will be redundant in 99.9% of the cases (why would psycopg2 import fail if not due to wrong arch?) but better safe than sorry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we split it in a different file we can do something like:

class PostgresqlOperatorCharm(CharmBase):
    self.unit.status = BlockedStatus(f"Cannot install: {charm_arch} charm not compatible with {hw_arch} machine")

def check_arch():
    if wrong_arch:
        main(PostgresqlOperatorCharm(), use_juju_for_storage=True)
        sys.exit(0)

And then import and run check_arch() in the ModuleNotFoundError exception handler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commited a refactored version, LMK how it looks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fake_charm is not a meaningful name for the module. Otherwise the refactor LGTM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about arch_utils? that was the best I could think of tbh, LMK if you have better suggestions

Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

src/charm.py Outdated Show resolved Hide resolved
@lucasgameiroborges lucasgameiroborges changed the title [Proposal?][DPE-4239] Warn user when deploying charm with wrong architecture [DPE-4239] Warn user when deploying charm with wrong architecture Aug 8, 2024
@lucasgameiroborges
Copy link
Member Author

Added unit tests, working on integration tests now.

src/charm.py Outdated Show resolved Hide resolved
Copy link
Member

@marceloneppel marceloneppel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for the tests.

@dragomirp dragomirp merged commit 0e7c405 into main Aug 12, 2024
89 of 90 checks passed
@dragomirp dragomirp deleted the lucas/check-arch branch August 12, 2024 14:32
@taurus-forever
Copy link
Contributor

taurus-forever commented Aug 16, 2024

I have tried to install ARM revision on AMD:

juju deploy postgresql-k8s postgresql-k8s-arm --trust --channel 14/edge --revision 357

The error:

postgresql-k8s-arm           error       1  postgresql-k8s  14/edge  357  10.152.183.142  no       hook failed: "install"


unit-postgresql-k8s-0: 23:56:32 INFO juju.worker.uniter awaiting error resolution for "install" hook
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install Traceback (most recent call last):
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/./src/charm.py", line 18, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     import psycopg2
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/venv/psycopg2/__init__.py", line 51, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     from psycopg2._psycopg import (                     # noqa
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install ModuleNotFoundError: No module named 'psycopg2._psycopg'
unit-postgresql-k8s-0: 23:56:33 ERROR juju.worker.uniter.operation hook "install" (via hook dispatching script: dispatch) failed: exit status 1
unit-postgresql-k8s-0: 23:56:33 INFO juju.worker.uniter awaiting error resolution for "install" hook

The story of installing AMD revision on ARM hardware is even worth: all stuck on installing agent because scheduler has no matching node to launch AMD pod => 0/1 nodes are available:....

CC: @dragomirp @lucasgameiroborges (on Monday!)

@dragomirp
Copy link
Contributor

I have tried to install ARM revision on AMD:

juju deploy postgresql-k8s postgresql-k8s-arm --trust --channel 14/edge --revision 357

The error:

postgresql-k8s-arm           error       1  postgresql-k8s  14/edge  357  10.152.183.142  no       hook failed: "install"


unit-postgresql-k8s-0: 23:56:32 INFO juju.worker.uniter awaiting error resolution for "install" hook
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install Traceback (most recent call last):
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/./src/charm.py", line 18, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     import psycopg2
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/venv/psycopg2/__init__.py", line 51, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     from psycopg2._psycopg import (                     # noqa
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install ModuleNotFoundError: No module named 'psycopg2._psycopg'
unit-postgresql-k8s-0: 23:56:33 ERROR juju.worker.uniter.operation hook "install" (via hook dispatching script: dispatch) failed: exit status 1
unit-postgresql-k8s-0: 23:56:33 INFO juju.worker.uniter awaiting error resolution for "install" hook

The story of installing AMD revision on ARM hardware is even worth: all stuck on installing agent because scheduler has no matching node to launch AMD pod => 0/1 nodes are available:....

CC: @dragomirp @lucasgameiroborges (on Monday!)

Not pushed to edge yet.

@dragomirp
Copy link
Contributor

I have tried to install ARM revision on AMD:

juju deploy postgresql-k8s postgresql-k8s-arm --trust --channel 14/edge --revision 357

The error:

postgresql-k8s-arm           error       1  postgresql-k8s  14/edge  357  10.152.183.142  no       hook failed: "install"


unit-postgresql-k8s-0: 23:56:32 INFO juju.worker.uniter awaiting error resolution for "install" hook
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install Traceback (most recent call last):
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/./src/charm.py", line 18, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     import psycopg2
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install   File "/var/lib/juju/agents/unit-postgresql-k8s-0/charm/venv/psycopg2/__init__.py", line 51, in <module>
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install     from psycopg2._psycopg import (                     # noqa
unit-postgresql-k8s-0: 23:56:32 WARNING unit.postgresql-k8s/0.install ModuleNotFoundError: No module named 'psycopg2._psycopg'
unit-postgresql-k8s-0: 23:56:33 ERROR juju.worker.uniter.operation hook "install" (via hook dispatching script: dispatch) failed: exit status 1
unit-postgresql-k8s-0: 23:56:33 INFO juju.worker.uniter awaiting error resolution for "install" hook

The story of installing AMD revision on ARM hardware is even worth: all stuck on installing agent because scheduler has no matching node to launch AMD pod => 0/1 nodes are available:....

CC: @dragomirp @lucasgameiroborges (on Monday!)

Checked again, it's released and I managed to replicate on Juju 3.5

@lucasgameiroborges
Copy link
Member Author

Hi guys! Just saw this issue now, currently investigating what's happening.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants