Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] SSL error: decryption failed or bad record mac during firmware upgrade #244

Open
minhng99 opened this issue Jul 12, 2023 · 6 comments

Comments

@minhng99
Copy link

I'm getting this issue during assigning firmware to a device, running on Debian 11 and installed using the ansible script, using psql 13 as db backend.
This resulted in the firmware upgrade process on OpenWISP being stuck at status in progress and never timed out.

Wed Jul 12 03:00:15 UTC 2023 upgrade: Saving config files...                                                                                                                                                                                
Wed Jul 12 03:00:15 UTC 2023 upgrade: Commencing upgrade. Closing all shell sessions.                                                                                                                                                       
                                                                                                                                                                                                                                            
[2023-07-12 03:00:16,074: INFO/ForkPoolWorker-1] # SSL error: decryption failed or bad record mac                                                                                                                                           
                                                           
[2023-07-12 03:00:16,078: ERROR/ForkPoolWorker-1] Task openwisp_firmware_upgrader.tasks.upgrade_firmware[b2648d2f-04b7-4b82-98d7-97b6ea497818] raised unexpected: InterfaceError('connection already closed')
Traceback (most recent call last):
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
psycopg2.OperationalError: SSL error: decryption failed or bad record mac
                                                                                                                      
                                                           
The above exception was the direct cause of the following exception:                                        
                                                           
Traceback (most recent call last):                                                                                    
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 554, in upgrade
    upgrader.upgrade(self.image.file)                                                                                                                                                                                                       
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/upgraders/openwrt.py", line 52, in upgrade
    self._reflash(remote_path)                                                                                                                                                                                                              
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/upgraders/openwrt.py", line 283, in _reflash
    self.upgrade_operation.refresh_from_db()                                                                                                                                                                                                
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 716, in refresh_from_db
    db_instance = db_instance_qs.get()                                                                                
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 492, in get
    num = len(clone)                                                                                                  
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 302, in __len__
    self._fetch_all()                                                                                                 
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 1507, in _fetch_all
    self._result_cache = list(self._iterable_class(self))                                                             
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 57, in __iter__
    results = compiler.execute_sql(                                                                                   
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1361, in execute_sql
    cursor.execute(sql, params)                                                                                       
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 67, in execute
    return self._execute_with_wrappers(                                                                               
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 80, in _execute_with_wrappers
    return executor(sql, params, many, context)                                                                       
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)                                                                           
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value                                             
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
django.db.utils.OperationalError: SSL error: decryption failed or bad record mac


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 256, in create_cursor
    cursor = self.connection.cursor()
psycopg2.InterfaceError: connection already closed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/celery/app/autoretry.py", line 34, in run
    return task._orig_run(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/tasks.py", line 32, in upgrade_firmware
    operation.upgrade(recoverable=recoverable)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 579, in upgrade
    self.log_line(cause)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 524, in log_line
    self.save()
  File "/opt/openwisp2/env/lib/python3.9/site-packages/openwisp_firmware_upgrader/base/models.py", line 604, in save
    result = super().save(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 806, in save
    self.save_base(
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 857, in save_base
    updated = self._save_table(
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 970, in _save_table
    updated = self._do_update(
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/base.py", line 1034, in _do_update
    return filtered._update(values) > 0
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/query.py", line 885, in _update
    return query.get_compiler(self.db).execute_sql(CURSOR)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1783, in execute_sql
    cursor = super().execute_sql(result_type)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1359, in execute_sql
    cursor = self.connection.cursor()
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 284, in cursor
    return self._cursor()
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/base/base.py", line 262, in _cursor
    return self._prepare_cursor(self.create_cursor(name))
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/utils/asyncio.py", line 26, in inner
    return func(*args, **kwargs)
  File "/opt/openwisp2/env/lib/python3.9/site-packages/django/db/backends/postgresql/base.py", line 256, in create_cursor
    cursor = self.connection.cursor()
django.db.utils.InterfaceError: connection already closed

@minhng99
Copy link
Author

this issue get fixed by adding the sslmode: disable into settings.py

...
DATABASES = {
    'default': {
        'ENGINE': 'django.contrib.gis.db.backends.postgis',
        'NAME': 'openwisp2',
        'USER': 'openwisp',
        'PASSWORD': 'openwisp',
        'HOST': 'localhost',
        'PORT': '5432',
        'OPTIONS': {
            'sslmode': 'disable',
        },
    }
}
...

@nemesifier
Copy link
Member

@minhng99 is this happening with the latest master or which version?

@nemesifier
Copy link
Member

nemesifier commented Jul 12, 2023

@minhng99
Copy link
Author

It's the released Ansible version 22.05.3

openwisp-controller         1.0.3
openwisp-firmware-upgrader  1.0.1
openwisp-ipam               1.0
openwisp-monitoring         1.0.4
openwisp-notifications      1.0.3
openwisp-users              1.0.2
openwisp-utils              1.0.4
postgresql-13-postgis-3-scripts/oldstable,now 3.1.1+dfsg-1+deb11u1 all [installed,automatic]
postgresql-13-postgis-3/oldstable,now 3.1.1+dfsg-1+deb11u1 amd64 [installed]
postgresql-13/oldstable-security,now 13.11-0+deb11u1 amd64 [installed,automatic]
postgresql-client-13/oldstable-security,now 13.11-0+deb11u1 amd64 [installed,automatic]
postgresql-client-common/oldstable,now 225 all [installed,automatic]
postgresql-common/oldstable,now 225 all [installed,automatic]

@cbeaujoin-stellar
Copy link

cbeaujoin-stellar commented Aug 29, 2024

It seems link to #244 (comment) as @nemesifier said.

Because the _call_reflash_command is called from a subprocess it causes the issue.

The workaround was to use a subprocess but in fact it never ends because of Paramiko channel recv_exit_status method never released (timeout is not used their => paramiko/paramiko#1787), the root cause seems to be linked to the sysupgrade command that return the exit code after it stops the SSH service.

I think that if we don't use a subprocess but rewrite exec_command in ssh.py of the openwisp-controller without recv_exit_status but with alternative solution like here https://stackoverflow.com/questions/35403031/python-paramiko-exec-command-timeout-doesnt-work we could achieve a real timeout and avoid using subprocess.

@cbeaujoin-stellar
Copy link

After some investigation I can confirm that sysupgrade close ssh before returning the exit code:
Fri Aug 30 09:27:45 UTC 2024 upgrade: Commencing upgrade. Closing all shell sessions.

I tried to replace exit_status = stdout.channel.recv_exit_status() by:

        stdout.channel.status_event.wait(timeout=timeout)
        exit_status = stdout.channel.exit_status
        self.log('exit status {0}'.format(exit_status))
        ssh_client.close()

And it just suceed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants