Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSL error: decryption failed or bad record mac & SSL SYSCALL error: EOF detected #127

Closed
mkrsn opened this issue Mar 10, 2021 · 23 comments · Fixed by #228
Closed

SSL error: decryption failed or bad record mac & SSL SYSCALL error: EOF detected #127

mkrsn opened this issue Mar 10, 2021 · 23 comments · Fixed by #228
Assignees
Labels
type: bug Something isn't working as expected type: documentation Improvements or additions to documentation
Milestone

Comments

@mkrsn
Copy link

mkrsn commented Mar 10, 2021

Environment

  • Python version: 3.8.5
  • Nautobot version: 1.0.0b2

Steps to Reproduce

  1. become user nautobot
  2. Download & extract 1.0.0b2
  3. Execute install.ssh (use existing config except unicorn.py; Adapt changes in systemd service files)
  4. Start nautobot
  5. Open Nautobot WebUI and surf a bit

This only happens after a fresh start of nautobot. The error disappears after a while until a nautobot restart.

Expected Behavior

no HTTP500

Observed Behavior

fyi: /opt/nautobot/current is a symlink to /opt/nautobot/1.0.0b2/

Server Error
There was a problem with your request. Please contact an administrator.

The complete exception is provided below:

<class 'django.db.utils.OperationalError'>

SSL error: decryption failed or bad record mac

Python version: 3.8.5
Nautobot version: 1.0.0b2

OperationalError at /plugins/device-onboarding/
SSL error: decryption failed or bad record mac
Request Method:	GET
Request URL:	https://<schnipp>/plugins/device-onboarding/
Django Version:	3.1.7
Exception Type:	OperationalError
Exception Value:	
SSL error: decryption failed or bad record mac
Exception Location:	/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py, line 84, in _execute
Python Executable:	/usr/bin/python3.8
Python Version:	3.8.5
Python Path:	
['.',
 '',
 '/opt/nautobot/1.0.0b2/venv/bin',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/git/ext/gitdb',
 '/usr/lib/python38.zip',
 '/usr/lib/python3.8',
 '/usr/lib/python3.8/lib-dynload',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/gitdb/ext/smmap']
Server time:	Wed, 10 Mar 2021 12:34:59 +0000
Environment:


Request Method: GET
Request URL: https://<schnipp>/plugins/device-onboarding/

Django Version: 3.1.7
Python Version: 3.8.5
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.humanize',
 'cacheops',
 'corsheaders',
 'django_filters',
 'django_tables2',
 'django_prometheus',
 'mptt',
 'rest_framework',
 'taggit',
 'timezone_field',
 'nautobot.core',
 'nautobot.circuits',
 'nautobot.dcim',
 'nautobot.ipam',
 'nautobot.extras',
 'nautobot.tenancy',
 'nautobot.users',
 'nautobot.utilities',
 'nautobot.virtualization',
 'django_rq',
 'drf_yasg',
 'graphene_django',
 'nautobot_device_onboarding.OnboardingConfig']
Installed Middleware:
['django_prometheus.middleware.PrometheusBeforeMiddleware',
 'corsheaders.middleware.CorsMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.security.SecurityMiddleware',
 'nautobot.core.middleware.ExceptionHandlingMiddleware',
 'nautobot.core.middleware.RemoteUserMiddleware',
 'nautobot.core.middleware.APIVersionMiddleware',
 'nautobot.core.middleware.ObjectChangeMiddleware',
 'django_prometheus.middleware.PrometheusAfterMiddleware']



Traceback (most recent call last):
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 233, in _get_session
    return self._session_cache

During handling of the above exception ('SessionStore' object has no attribute '_session_cache'), another exception occurred:
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

The above exception (SSL error: decryption failed or bad record mac
) was the direct cause of the following exception:
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 70, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/nautobot/utilities/views.py", line 91, in dispatch
    if not self.has_permission():
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/nautobot/utilities/views.py", line 73, in has_permission
    if user.has_perms((permission_required, *self.additional_permissions)):
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/utils/functional.py", line 240, in inner
    self._setup()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/utils/functional.py", line 376, in _setup
    self._wrapped = self._setupfunc()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 23, in <lambda>
    request.user = SimpleLazyObject(lambda: get_user(request))
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 11, in get_user
    request._cached_user = auth.get_user(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 174, in get_user
    user_id = _get_user_session_key(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 58, in _get_user_session_key
    return get_user_model()._meta.pk.to_python(request.session[SESSION_KEY])
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 65, in __getitem__
    return self._session[key]
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 238, in _get_session
    self._session_cache = self.load()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 43, in load
    s = self._get_session_from_db()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 32, in _get_session_from_db
    return self.model.objects.get(
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/query.py", line 353, in get
    return qs._no_monkey.get(qs, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 425, in get
    num = len(clone)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 269, in __len__
    self._fetch_all()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/query.py", line 273, in _fetch_all
    return self._no_monkey._fetch_all(self)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 1308, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 53, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1156, in execute_sql
    cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

Exception Type: OperationalError at /plugins/device-onboarding/
Exception Value: SSL error: decryption failed or bad record mac

Seconde error:

OperationalError at /
SSL SYSCALL error: EOF detected
Request Method:	GET
Request URL:	https://<schnipp>/
Django Version:	3.1.7
Exception Type:	OperationalError
Exception Value:	
SSL SYSCALL error: EOF detected
Exception Location:	/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py, line 84, in _execute
Python Executable:	/usr/bin/python3.8
Python Version:	3.8.5
Python Path:	
['.',
 '',
 '/opt/nautobot/1.0.0b2/venv/bin',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/git/ext/gitdb',
 '/usr/lib/python38.zip',
 '/usr/lib/python3.8',
 '/usr/lib/python3.8/lib-dynload',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages',
 '/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/gitdb/ext/smmap']
Server time:	Wed, 10 Mar 2021 13:06:33 +0000
Environment:


Request Method: GET
Request URL: https://<schnipp>/

Django Version: 3.1.7
Python Version: 3.8.5
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.humanize',
 'cacheops',
 'corsheaders',
 'django_filters',
 'django_tables2',
 'django_prometheus',
 'mptt',
 'rest_framework',
 'taggit',
 'timezone_field',
 'nautobot.core',
 'nautobot.circuits',
 'nautobot.dcim',
 'nautobot.ipam',
 'nautobot.extras',
 'nautobot.tenancy',
 'nautobot.users',
 'nautobot.utilities',
 'nautobot.virtualization',
 'django_rq',
 'drf_yasg',
 'graphene_django',
 'nautobot_device_onboarding.OnboardingConfig']
Installed Middleware:
['django_prometheus.middleware.PrometheusBeforeMiddleware',
 'corsheaders.middleware.CorsMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.security.SecurityMiddleware',
 'nautobot.core.middleware.ExceptionHandlingMiddleware',
 'nautobot.core.middleware.RemoteUserMiddleware',
 'nautobot.core.middleware.APIVersionMiddleware',
 'nautobot.core.middleware.ObjectChangeMiddleware',
 'django_prometheus.middleware.PrometheusAfterMiddleware']



Traceback (most recent call last):
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 233, in _get_session
    return self._session_cache

During handling of the above exception ('SessionStore' object has no attribute '_session_cache'), another exception occurred:
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

The above exception (SSL SYSCALL error: EOF detected
) was the direct cause of the following exception:
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 70, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/views/generic/base.py", line 98, in dispatch
    return handler(request, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/nautobot/core/views/__init__.py", line 45, in get
    ConsolePort.objects.restrict(request.user, "view")
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/nautobot/utilities/querysets.py", line 21, in restrict
    if user.is_superuser or permission_is_exempt(permission_required):
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/utils/functional.py", line 240, in inner
    self._setup()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/utils/functional.py", line 376, in _setup
    self._wrapped = self._setupfunc()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 23, in <lambda>
    request.user = SimpleLazyObject(lambda: get_user(request))
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 11, in get_user
    request._cached_user = auth.get_user(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 174, in get_user
    user_id = _get_user_session_key(request)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 58, in _get_user_session_key
    return get_user_model()._meta.pk.to_python(request.session[SESSION_KEY])
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 65, in __getitem__
    return self._session[key]
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 238, in _get_session
    self._session_cache = self.load()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 43, in load
    s = self._get_session_from_db()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 32, in _get_session_from_db
    return self.model.objects.get(
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/query.py", line 353, in get
    return qs._no_monkey.get(qs, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 425, in get
    num = len(clone)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 269, in __len__
    self._fetch_all()
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/query.py", line 273, in _fetch_all
    return self._no_monkey._fetch_all(self)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 1308, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/query.py", line 53, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1156, in execute_sql
    cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/opt/nautobot/1.0.0b2/venv/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

Exception Type: OperationalError at /
Exception Value: SSL SYSCALL error: EOF detected
@jathanism
Copy link
Contributor

jathanism commented Mar 10, 2021

Thanks for the report. Can you please also provide the following details:

  • Operating system and version (e.g. Ubuntu 20.04.2 LTS)
  • PostgreSQL version
  • System memory
  • Number of CPUs

Also, if you'd be willing, would you mind following the latest install instructions for 1.0.0b2 that no longer call for the use of install.sh and also switches from Gunicorn to uWSGI?

https://nautobot.readthedocs.io/en/latest/installation/nautobot/

I'd be very curious to see if you're still hitting this issue. But I also want to rule out that you might not be hitting an issue with system resources? We're trying to lock down some recommended system requirements.

Thanks in advance!

@mkrsn
Copy link
Author

mkrsn commented Mar 10, 2021

Type: VM (kvm)
OS: Ubuntu 20.04.2 LTS
Postgres: 12.6-0ubuntu0.20.04.1
Redis: 5:5.0.7-2
RAM: 8GiB (~1GiB used; ~3GiB cached; ~4GiB free)
vCPU: 4

I've already switched from gunicorn to uWSGI. I'll get back on you after installing nautobot via new install instructions.

fyi: Our nautobot database is currently populated with a handful of demo data (just 4 devices, some IPs, a Rack and so on).

@jathanism
Copy link
Contributor

@mkrsn Thanks. Are you running the database server on localhost, or is it on a remote system?

@mkrsn
Copy link
Author

mkrsn commented Mar 10, 2021

It's a single-instance - all localhost.

@nniehoff
Copy link
Contributor

@mkrsn it looks like you reported the issue here as well. Did you only encounter the SSL error after installing the onboarding plugin?

@jathanism
Copy link
Contributor

Great, thanks. It looks like it may be related to our currently documented uWSGI configuration.

If it comes up again, can you please try adding this to your uwsgi.ini and see if it alleviates the issue:

# Workaround for SSL errors
lazy = true
lazy-apps = true

@jathanism
Copy link
Contributor

@mkrsn it looks like you reported the issue here as well. Did you only encounter the SSL error after installing the onboarding plugin?

Oh great catch. Thanks.

@mkrsn
Copy link
Author

mkrsn commented Mar 10, 2021

@nniehoff yes but it was just a strange coincident (Murphy says hi) . I only stumbled over this error after visting the onboarding-plugin. So i thought this is related to the Plugin, that's why i reported that error there. Only my colleague saw this error in another module, so i don't think that this is related to the onboarding plugin. Unfortunately i never tried it without installing the Plugin. Only disabled it - error appeared in any case.

Do you want me to reinstall and retest without any plugins installed (currently onboarding plugin and napalm)?

@jathanism i've reinstalled nautobot via the new way. Same behavior. After a fresh nautobot start the error occurs. After some minutes it disappears...

OperationalError at /dcim/devices/
SSL error: decryption failed or bad record mac
Request Method:	GET
Request URL:	https://<schnipp>/dcim/devices/
Django Version:	3.1.7
Exception Type:	OperationalError
Exception Value:	
SSL error: decryption failed or bad record mac
Exception Location:	/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py, line 84, in _execute
Python Executable:	/usr/bin/python3.8
Python Version:	3.8.5
Python Path:	
['.',
 '',
 '/opt/nautobot/1.0.0b2/bin',
 '/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/git/ext/gitdb',
 '/usr/lib/python38.zip',
 '/usr/lib/python3.8',
 '/usr/lib/python3.8/lib-dynload',
 '/opt/nautobot/1.0.0b2/lib/python3.8/site-packages',
 '/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/gitdb/ext/smmap']
Server time:	Wed, 10 Mar 2021 16:22:03 +0000
Environment:


Request Method: GET
Request URL: https://<schnipp>/dcim/devices/

Django Version: 3.1.7
Python Version: 3.8.5
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'django.contrib.humanize',
 'cacheops',
 'corsheaders',
 'django_filters',
 'django_tables2',
 'django_prometheus',
 'mptt',
 'rest_framework',
 'taggit',
 'timezone_field',
 'nautobot.core',
 'nautobot.circuits',
 'nautobot.dcim',
 'nautobot.ipam',
 'nautobot.extras',
 'nautobot.tenancy',
 'nautobot.users',
 'nautobot.utilities',
 'nautobot.virtualization',
 'django_rq',
 'drf_yasg',
 'graphene_django',
 'nautobot_device_onboarding.OnboardingConfig']
Installed Middleware:
['django_prometheus.middleware.PrometheusBeforeMiddleware',
 'corsheaders.middleware.CorsMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware',
 'django.middleware.security.SecurityMiddleware',
 'nautobot.core.middleware.ExceptionHandlingMiddleware',
 'nautobot.core.middleware.RemoteUserMiddleware',
 'nautobot.core.middleware.APIVersionMiddleware',
 'nautobot.core.middleware.ObjectChangeMiddleware',
 'django_prometheus.middleware.PrometheusAfterMiddleware']



Traceback (most recent call last):
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 233, in _get_session
    return self._session_cache

During handling of the above exception ('SessionStore' object has no attribute '_session_cache'), another exception occurred:
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

The above exception (SSL error: decryption failed or bad record mac
) was the direct cause of the following exception:
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/core/handlers/exception.py", line 47, in inner
    response = get_response(request)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/views/generic/base.py", line 70, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/nautobot/utilities/views.py", line 91, in dispatch
    if not self.has_permission():
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/nautobot/utilities/views.py", line 73, in has_permission
    if user.has_perms((permission_required, *self.additional_permissions)):
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/utils/functional.py", line 240, in inner
    self._setup()
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/utils/functional.py", line 376, in _setup
    self._wrapped = self._setupfunc()
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 23, in <lambda>
    request.user = SimpleLazyObject(lambda: get_user(request))
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/auth/middleware.py", line 11, in get_user
    request._cached_user = auth.get_user(request)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 174, in get_user
    user_id = _get_user_session_key(request)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/auth/__init__.py", line 58, in _get_user_session_key
    return get_user_model()._meta.pk.to_python(request.session[SESSION_KEY])
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 65, in __getitem__
    return self._session[key]
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 238, in _get_session
    self._session_cache = self.load()
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 43, in load
    s = self._get_session_from_db()
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/contrib/sessions/backends/db.py", line 32, in _get_session_from_db
    return self.model.objects.get(
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/cacheops/query.py", line 353, in get
    return qs._no_monkey.get(qs, *args, **kwargs)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/query.py", line 425, in get
    num = len(clone)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/query.py", line 269, in __len__
    self._fetch_all()
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/cacheops/query.py", line 273, in _fetch_all
    return self._no_monkey._fetch_all(self)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/query.py", line 1308, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/query.py", line 53, in __iter__
    results = compiler.execute_sql(chunked_fetch=self.chunked_fetch, chunk_size=self.chunk_size)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/models/sql/compiler.py", line 1156, in execute_sql
    cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/cacheops/transaction.py", line 93, in execute
    result = self._no_monkey.execute(self, sql, params)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/opt/nautobot/1.0.0b2/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)

Exception Type: OperationalError at /dcim/devices/
Exception Value: SSL error: decryption failed or bad record mac

@mkrsn
Copy link
Author

mkrsn commented Mar 10, 2021

@jathanism I've added both parameters to the uwsgi.ini, stopped (checked that) and started nautobot but error still appears.

@jathanism
Copy link
Contributor

@mkrsn Okay thanks. Going to try to reproduce this. Thanks.

@jathanism jathanism added the type: bug Something isn't working as expected label Mar 11, 2021
@jathanism jathanism self-assigned this Mar 11, 2021
@jathanism
Copy link
Contributor

jathanism commented Mar 12, 2021

@mkrsn Can you please try replacing your uwsgi.ini with this and see if that alleviates your issue?

[uwsgi]
; The IP address (typically localhost) and port that the WSGI process should listen on
http-socket = 127.0.0.1:8001

; Fail to start if any parameter in the configuration file isn’t explicitly understood by uWSGI
strict = true

; Enable master process to gracefully re-spawn and pre-fork workers
master = true

; Allow Python app-generated threads to run
enable-threads = true

;Try to remove all of the generated file/sockets during shutdown
vacuum = true

; Do not use multiple interpreters, allowing only Nautobot to run
single-interpreter = true

; Shutdown when receiving SIGTERM (default is respawn)
die-on-term = true

; Prevents uWSGI from starting if it is unable load Nautobot (usually due to errors)
need-app = true

; By default, uWSGI has rather verbose logging that can be noisy
disable-logging = true

; Assert that critical 4xx and 5xx errors are still logged
log-4xx = true
log-5xx = true

;
; Advanced settings (disabled by default)
; Customize these for your environment if and only if you need them.
; Ref: https://uwsgi-docs.readthedocs.io/en/latest/Options.html
;

; Number of uWSGI workers to spawn. This should typically be 2n+1, where n is the number of CPU cores present.
; processes = 5

@mkrsn
Copy link
Author

mkrsn commented Mar 12, 2021

I've tried your config. Problem disappears. I've enabled the "processes" variable and the Problem reappears.

Thats why i've tested the original Config. Every time i set "processes" >1 the Problem appears.

@jathanism
Copy link
Contributor

@mkrsn That's great feedback, thanks! I'd like to get to the bottom of the root cause to this other than just trying to circumvent it. This helps narrow it down.

@jathanism
Copy link
Contributor

jathanism commented Mar 13, 2021

This only affects Ubuntu systems where PostgreSQL was installed via apt, in which SSL is enabled by default. You can confirm this by connecting to the database using psql and running \conninfo:

$ psql nautobot -U nautobot -h localhost --password
Password:
psql (12.6 (Ubuntu 12.6-0ubuntu0.20.04.1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

nautobot=> \conninfo
You are connected to database "nautobot" as user "nautobot" on host "localhost" (address "::1") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)

After more debugging, I can 100% get the issue to go away if I do any one of these:

  • Don't use processes = in uwsgi.ini or --processes CLI flag when starting uWSGI, which doesn't result in pre-forking
  • Disable SSL connection to PGSQL by setting DATABASES['default']['OPTIONS'] = {"sslmode": "disable"} in nautobot_config.py
  • Disable persistent connections by setting DATABASES['default']['CONN_MAX_AGE'] = 0 in nautobot_config.py, which is the Django default to always close the connection on every request.

Obviously the first two are not desirable for production use. The third option in my opinion should be our new default for the common case if we're going to continue to recommend uWSGI over Gunicorn.

The root cause is the aggressive pre-forking of uWSGI which performs copy-on-write on subprocesses that often duplicates the database connection objects. This issue did not arise when using Gunicorn because its default sync worker class defers to every worker handling only a single request before cycling and deliberately closing the database connection. This mirrors the default behavior of Django with CONN_MAX_AGE=0, which is why when I set that, uWSGI behaves as expected.

@jathanism
Copy link
Contributor

I thought it was also worth noting what DID NOT work so far to alleviate this issue:

@mkrsn
Copy link
Author

mkrsn commented Mar 13, 2021

Nice work, thanks :) I'll try to disable Postgres encryption on Monday and report back.

But wouldn't it be better to file a Bug at the uWSGI project instead of defining a workaround as new default?

If you really want to change the default config to workaround that problem maybe it would be better to let the user choose the best solution. This would just need a small note in the Documentation.

In a all-in-one Setup (bare metal & VM, not LXC, LXD or Docker), while just talking to localhost, the best solution would imho be to disable the encryption to the Postgres because:

  • You need priv rights (or must be in the pcap group - which is also root decision) to sniff Network traffic
  • If an attacker gets root - you have bigger Problems than a plain connection to localhost
  • Persistence DB Connections increase the Performance alot (with activated TLS even more (3-Way TCP handshake, TLS Handshake & 4-Way TCP teardown)) and decreases the Time to first Byte.

@jathanism
Copy link
Contributor

jathanism commented Mar 13, 2021

Nice work, thanks :) I'll try to disable Postgres encryption on Monday and report back.

But wouldn't it be better to file a Bug at the uWSGI project instead of defining a workaround as new default?

uWSGI gratuitously documents that their aggressive pre-forking is working as intended and that if you don't like it, try the other options. The problem with uWSGI tho is that the configuration options are overwhelming and often confusing. I feel like there has to be some combination of settings for uWSGI that makes it work properly, but I've yet to find those.

If you really want to change the default config to workaround that problem maybe it would be better to let the user choose the best solution. This would just need a small note in the Documentation.

Agreed here. My preference would be to have the defaults work for the common case, with callouts and warnings about "advanced" configurations and common pitfalls with those configurations.

Like, if you want to use SSL on the database connection, that's the right thing to do, but it's going to also require that you do not use pre-forking, or if you do use pre-forking, that you disable persistent connections.

In a all-in-one Setup (bare metal & VM, not LXC, LXD or Docker), while just talking to localhost, the best solution would imho be to disable the encryption to the Postgres because:

  • You need priv rights (or must be in the pcap group - which is also root decision) to sniff Network traffic
  • If an attacker gets root - you have bigger Problems than a plain connection to localhost
  • Persistence DB Connections increase the Performance alot (with activated TLS even more (3-Way TCP handshake, TLS Handshake & 4-Way TCP teardown)) and decreases the Time to first Byte.

This is actually great. If you don't mind, I'll use this as inspiration for the forthcoming documentation changes.

@mkrsn
Copy link
Author

mkrsn commented Mar 15, 2021

Confirmed - disabling the conn-encryption to postgres solves the issue.

... If you don't mind, I'll use this as inspiration for the forthcoming documentation changes.

sure

@jathanism
Copy link
Contributor

@mkrsn Thanks for the confirmation. I'll be revising the docs with a quickness.

@chrismarget
Copy link
Contributor

@mkrsn and @jathanism, thanks for having this conversation. I've run into the same problem (using the new install instructions, on Ubutntu 20.04, using the distro-supplied PostgreSQL), and have worked around it by setting the database MAX_CONN_AGE to zero.

@jathanism
Copy link
Contributor

@mkrsn and @jathanism, thanks for having this conversation. I've run into the same problem (using the new install instructions, on Ubutntu 20.04, using the distro-supplied PostgreSQL), and have worked around it by setting the database MAX_CONN_AGE to zero.

Thanks for confirming. More motivation to get this into the docs and have a "common issues" or "troubleshooting" section. :)

@jathanism
Copy link
Contributor

jathanism commented Mar 15, 2021

The root cause is the aggressive pre-forking of uWSGI which performs copy-on-write on subprocesses that often duplicates the database connection objects. This issue did not arise when using Gunicorn because its default sync worker class defers to every worker handling only a single request before cycling and deliberately closing the database connection. This mirrors the default behavior of Django with CONN_MAX_AGE=0, which is why when I set that, uWSGI behaves as expected.

Something else occurred to me as well. If you review the original reason that CONN_MAX_AGE was put as a default setting in NetBox, it may have everything to do with the default worker strategy of Gunicorn constantly cycling workers on each request.

So for example using the default gunicorn.py recommended by NetBox, if you've got 5 worker processes, each with 3 threads each, that's 15 database connections that need to be maintained that would be constantly thrashing and cycling.

My opinion is that for any deployment of reasonable size or complexity, CONN_MAX_AGE on its own is not a sufficient approach using Gunicorn's built-in sync workers, and pairing it with async or gevent workers and green threads would be better.

@jathanism
Copy link
Contributor

See also: benoitc/gunicorn#996

@jathanism jathanism added this to the v1.0.0 milestone Mar 23, 2021
@jathanism jathanism added this to In progress in Release v1.0.0 Mar 24, 2021
@jathanism jathanism added the type: documentation Improvements or additions to documentation label Mar 31, 2021
Release v1.0.0 automation moved this from In progress to Done Apr 1, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type: bug Something isn't working as expected type: documentation Improvements or additions to documentation
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

4 participants