This repository has been archived by the owner on Feb 19, 2021. It is now read-only.

IntegrityError on deleting document #394

Closed
kmlucy opened this issue Sep 3, 2018 · 22 comments

Comments

@kmlucy

kmlucy commented Sep 3, 2018

When I try to delete a document, I get the following error:

IntegrityError at /paperless/admin/documents/document/239/delete/
FOREIGN KEY constraint failed

Request Method: POST
Request URL: http://FQDN/paperless/admin/documents/document/239/delete/
Django Version: 2.0.8
Exception Type: IntegrityError
Exception Value: FOREIGN KEY constraint failed
Exception Location: /usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py in execute, line 303
Python Executable: /usr/bin/python3
Python Version: 3.6.5
Python Path: ['/usr/src/paperless/src',
 '/usr/lib/python36.zip',
 '/usr/lib/python3.6',
 '/usr/lib/python3.6/lib-dynload',
 '/usr/lib/python3.6/site-packages']
Server time: Mon, 3 Sep 2018 21:58:47 +0000

with traceback:

Environment:


Request Method: POST
Request URL: http://FQDN/paperless/admin/documents/document/239/delete/

Django Version: 2.0.8
Python Version: 3.6.5
Installed Applications:
['django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'corsheaders',
 'django_extensions',
 'documents.apps.DocumentsConfig',
 'reminders.apps.RemindersConfig',
 'paperless_tesseract.apps.PaperlessTesseractConfig',
 'django.contrib.admin',
 'rest_framework',
 'crispy_forms',
 'django_filters']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'corsheaders.middleware.CorsMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'paperless.middleware.Middleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware']



Traceback:

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in _execute
  85.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py" in execute
  303.         return Database.Cursor.execute(self, query, params)

The above exception (FOREIGN KEY constraint failed) was the direct cause of the following exception:

File "/usr/lib/python3.6/site-packages/django/core/handlers/exception.py" in inner
  35.             response = get_response(request)

File "/usr/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
  128.                 response = self.process_exception_by_middleware(e, request)

File "/usr/lib/python3.6/site-packages/django/core/handlers/base.py" in _get_response
  126.                 response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/usr/lib/python3.6/site-packages/django/contrib/admin/options.py" in wrapper
  575.                 return self.admin_site.admin_view(view)(*args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/utils/decorators.py" in _wrapped_view
  142.                     response = view_func(request, *args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/views/decorators/cache.py" in _wrapped_view_func
  44.         response = view_func(request, *args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/contrib/admin/sites.py" in inner
  223.             return view(request, *args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/utils/decorators.py" in _wrapper
  62.             return bound_func(*args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/utils/decorators.py" in _wrapped_view
  142.                     response = view_func(request, *args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/utils/decorators.py" in bound_func
  58.                 return func.__get__(self, type(self))(*args2, **kwargs2)

File "/usr/lib/python3.6/site-packages/django/contrib/admin/options.py" in delete_view
  1736.             return self._delete_view(request, object_id, extra_context)

File "/usr/lib/python3.6/site-packages/django/contrib/admin/options.py" in _delete_view
  1768.             self.log_deletion(request, obj, obj_display)

File "/usr/lib/python3.6/site-packages/django/contrib/admin/options.py" in log_deletion
  806.             action_flag=DELETION,

File "/usr/lib/python3.6/site-packages/django/contrib/admin/models.py" in log_action
  29.             change_message=change_message,

File "/usr/lib/python3.6/site-packages/django/db/models/manager.py" in manager_method
  82.                 return getattr(self.get_queryset(), name)(*args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/db/models/query.py" in create
  417.         obj.save(force_insert=True, using=self.db)

File "/usr/lib/python3.6/site-packages/django/db/models/base.py" in save
  729.                        force_update=force_update, update_fields=update_fields)

File "/usr/lib/python3.6/site-packages/django/db/models/base.py" in save_base
  759.             updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)

File "/usr/lib/python3.6/site-packages/django/db/models/base.py" in _save_table
  842.             result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)

File "/usr/lib/python3.6/site-packages/django/db/models/base.py" in _do_insert
  880.                                using=using, raw=raw)

File "/usr/lib/python3.6/site-packages/django/db/models/manager.py" in manager_method
  82.                 return getattr(self.get_queryset(), name)(*args, **kwargs)

File "/usr/lib/python3.6/site-packages/django/db/models/query.py" in _insert
  1125.         return query.get_compiler(using=using).execute_sql(return_id)

File "/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py" in execute_sql
  1285.                 cursor.execute(sql, params)

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in execute
  100.             return super().execute(sql, params)

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in execute
  68.         return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in _execute_with_wrappers
  77.         return executor(sql, params, many, context)

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in _execute
  85.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3.6/site-packages/django/db/utils.py" in __exit__
  89.                 raise dj_exc_value.with_traceback(traceback) from exc_value

File "/usr/lib/python3.6/site-packages/django/db/backends/utils.py" in _execute
  85.                 return self.cursor.execute(sql, params)

File "/usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py" in execute
  303.         return Database.Cursor.execute(self, query, params)

Exception Type: IntegrityError at /paperless/admin/documents/document/239/delete/
Exception Value: FOREIGN KEY constraint failed

Let me know whether you can reproduce this or if you need more information. I am running the Docker container.

@danielquinn
Collaborator

Ew. Ok well this is probably a relations thing, where deleting a document is blowing up because stuff that's linked to that document isn't being deleted automatically.

I'm on mobile right now, but I'll try to have a look in the next couple days.

@danielquinn
Collaborator

Unfortunately, I can't reproduce this. I had a document in my own instance and managed to delete it just fine by going to the admin, clicking on the name of the document and then clicking the Delete button.

When you click the Big Red Delete Button, it should show you a page of what will be deleted and ask for confirmation. Can you post what's on that page?

@kmlucy
Author

kmlucy commented Sep 9, 2018

image

I am also getting the same error if I try to rename the document, add a tag, change the correspondent, anything really. It's not just this document. I tested a few others with the same results.

@danielquinn
Collaborator

That's just too weird. I just can't reproduce this. Judging by your traceback, it looks like you're running SQLite, and not one of the more robust database servers, but have you been poking around in the database manually anyway?

@kmlucy
Author

kmlucy commented Sep 9, 2018

No, I haven't touched the database. I run it in the Docker container. The only thing I do weird is change some paths in my entrypoint file to work in a subdirectory, but I tried getting rid of that to test and got the same result.

@danielquinn
Collaborator

Naw, I can't see how that'd be a problem. This definitely reads like a database problem, but an IntegrityError is usually due to an attempted insert pointing to a row that isn't there any more. But you're deleting, so I can't see how that'd be the problem.

I'm sorry, but I'm at a loss right now. I'm going to suggest two things, both of which suck:

  1. Open /usr/lib/python3.6/site-packages/django/db/backends/sqlite3/base.py at line 303 and try printing out just what's being executed, then try to delete some stuff. Maybe something else is going on that we didn't predict.
  2. Use the document_export management command to dump your database into a directory and then just blow away your data.sqlite file, start Paperless from scratch, and re-import your dump. If indeed there's some corruption in your database, it's unlikely to follow you across an export/import.

It's pretty late here, so I have to sign off, but I hope that one of those pans out for you.
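As a lighter-weight alternative to editing base.py under site-packages, Django can report each SQL statement through its `django.db.backends` logger. The snippet below is a minimal sketch of such a logging configuration, not something taken from Paperless. In a real project only the `LOGGING` dictionary would go into the settings module, and Django emits these messages only while `DEBUG` is `True`. The `dictConfig` call at the end is there just so the fragment can be run on its own.

```python
import logging
import logging.config

# Assumed settings fragment: send Django's SQL log messages to the console.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Plain stream handler; writes to stderr by default.
        "console": {"class": "logging.StreamHandler"},
    },
    "loggers": {
        # Django logs executed SQL on this logger at DEBUG level.
        "django.db.backends": {
            "handlers": ["console"],
            "level": "DEBUG",
        },
    },
}

# Django applies LOGGING itself; this call only validates the dict standalone.
logging.config.dictConfig(LOGGING)
```

With this in place, the statement that triggers the IntegrityError should show up in the server output next to the traceback.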

@kmlucy
Author

kmlucy commented Sep 9, 2018

I don't know enough for number one, but I tried number two. I'm still getting the same error. Interestingly, this time, something is different. When I go back from the error page, there is a banner:
The document "20180903215750: 13557875_10153532147480755_7292443244723339364_n" was deleted successfully.

The file itself is gone, but the item is still there in Paperless.

@kmlucy
Author

kmlucy commented Sep 17, 2018

So it's not just documents. I get the same error when trying to edit anything, including just renaming a tag. This is after backing up and reimporting Paperless from scratch, so I don't know how this is limited to me.

@kmlucy
Author

kmlucy commented Sep 17, 2018

I can poke around in sqlite if you tell me what I am looking for.

@danielquinn
Collaborator

Here's an interesting thought: what if you set the permissions on data.sqlite to 666? It sounds like stuff can go into the db just fine via the consumer, but the webserver might be running as a different user and is therefore unable to modify the file.

I'm reaching, but I'm running out of ideas.

@kmlucy
Author

kmlucy commented Sep 17, 2018

No change with the permissions, unfortunately. I found this though: https://docs.djangoproject.com/en/2.0/releases/2.0/#foreign-key-constraints-are-now-enabled-on-sqlite

I know very little about Django or SQLite, but that looks like a change when moving to Django 2.0 is causing this. When attempting to replicate, have you tried creating a test instance with an older version of Paperless and upgrading?
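To make the linked release note concrete, here is a small standalone sketch using only Python's `sqlite3` module. It shows that SQLite accepts a row pointing at a missing parent while foreign key enforcement is off, and rejects the same row once enforcement is on. The `user` and `log_entry` tables and the `try_orphan_insert` helper are simplified stand-ins invented for this example, not Paperless's or Django's actual schema.

```python
import sqlite3


def try_orphan_insert(enforce_foreign_keys):
    """Insert a log row that points at a user id which does not exist."""
    conn = sqlite3.connect(":memory:")
    # SQLite only checks foreign keys when this pragma is switched on.
    pragma_value = "ON" if enforce_foreign_keys else "OFF"
    conn.execute("PRAGMA foreign_keys = %s" % pragma_value)
    conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE log_entry ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL REFERENCES user(id))"
    )
    try:
        # No user rows exist, so any user_id here is an orphan.
        conn.execute("INSERT INTO log_entry (user_id) VALUES (999)")
        conn.commit()
        return "inserted"
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()


print(try_orphan_insert(False))  # enforcement off
print(try_orphan_insert(True))   # enforcement on
```

With enforcement off the orphaned row is accepted silently. With it on, SQLite raises the same "FOREIGN KEY constraint failed" message that appears in the traceback above.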

@erikarvstedt
Contributor

erikarvstedt commented Sep 17, 2018

I can reproduce this with Django 2.0 + the current paperless master, I'll post instructions later today.

@danielquinn
Collaborator

Hooray for confirmation! Alright, once @erikarvstedt has a reproducible case, that'll give me (or them) something to work with!

@erikarvstedt
Contributor

erikarvstedt commented Sep 18, 2018

Here comes the repro.

I could trigger this bug by upgrading the Paperless nix package to Django 2.0.8.

1. Run Paperless

Run this Dockerfile

docker build -t paperless-bug .
docker run -p 8000:80 paperless-bug

or, even better, install Nix and run this in a shell:

paperless=$(nix-build --no-out-link -E '
with (import <nixpkgs> {});
let
  pkgs = import (fetchFromGitHub {
    owner = "erikarvstedt";
    repo = "nixpkgs";
    rev = "paperless-django2";
    sha256 = "1wcrsf7ai8m5r855dcdw434qs8zc96bc5zw8y2zpya3jxr1spc8i";
  }) {};
in
  pkgs.paperless-django2.withConfig {
    dataDir = /tmp/paperless-django2;
    config = {
      PAPERLESS_DISABLE_LOGIN = "true";
    };
  }
')

$paperless migrate
$paperless runserver --noreload localhost:8000
rm -r /tmp/paperless-django2 # remove app data

2. Trigger the bug

Browse to http://localhost:8000/admin/documents/tag/add/ and add a new tag.
Result: IntegrityError at /admin/documents/tag/add/

Curiously, the error doesn't occur when adding a tag with the REST api (http://localhost:8000/api/tags/).

You may change pkgs.paperless-django2 to pkgs.paperless in the above nix source
(or uncomment the extra section in the Dockerfile) to see how the bug disappears
with Django 1.11.

I know too little about Django to have any idea what's going on here. 😐

@erikarvstedt
Contributor

erikarvstedt commented Sep 18, 2018

If you want to experiment and run Paperless from your local source, try the following.

@danielquinn
Collaborator

I'm sorry, I can't reproduce this, and I'm not in a position to support Nix right now, so that Dockerfile doesn't really help me :-(

What I can tell you is that I ran the following commands and everything worked as expected

docker system prune --all --volumes
docker-compose up
docker-compose run --rm webserver createsuperuser
docker-compose run --rm consumer /usr/src/paperless/src/manage.py document_importer /export

This is my docker-compose.yml file:

version: '2'

services:
    webserver:
        image: danielquinn/paperless
        ports:
            - "8000:8000"
        volumes:
            - data:/usr/src/paperless/data
            - media:/usr/src/paperless/media
        env_file: docker-compose.env
        environment:
            - PAPERLESS_OCR_LANGUAGES=
        command: ["runserver", "--insecure", "0.0.0.0:8000"]

    consumer:
        image: danielquinn/paperless
        volumes:
            - data:/usr/src/paperless/data
            - media:/usr/src/paperless/media
            - /tmp/paperless/consume:/consume
            - /tmp/paperless/export:/export
        env_file: docker-compose.env
        command: ["document_consumer"]

volumes:
    data:
    media:

The import data came from another working installation, but even if you don't import anything and just start from scratch, I managed to create tags & delete them without issue.

To fix this, I need to reproduce it in the simplest way possible. Ideally, this means without even using Docker as it reduces the number of things that might be breaking stuff. At the moment, I'm going with the assumption that there's something broken on a Docker volume somewhere, but I have no proof to back that up.

@kmlucy
Author

kmlucy commented Sep 24, 2018

I really don't understand how you aren't able to replicate this. It's not my database, or my configuration, or my proxy. I'm getting the same error with a blank database, no files, no special configuration, and no proxy. I can run:
docker run -v /opt/paperless/paperless.conf:/etc/paperless.conf:ro -p 8000:8000 danielquinn/paperless runserver --insecure 0.0.0.0:8000
and try to add a tag and I will get the same error. The only line in my paperless.conf for this example is to disable the login.

@erikarvstedt
Contributor

erikarvstedt commented Sep 24, 2018

Ah, it's probably the disabled login which causes this. Enabling login (which means not setting PAPERLESS_DISABLE_LOGIN) makes the bug disappear in my Nix package.

@danielquinn, changing docker-compose.yml to

        environment:
            - PAPERLESS_OCR_LANGUAGES=
            - PAPERLESS_DISABLE_LOGIN=true

should trigger the bug in your config.

@danielquinn
Collaborator

Wow, I never would have guessed that this would be linked to the login disabling... Alright, I've looked into this (and reproduced it!) and the result is not-awesome. Basically in disabling the login requirement, anything you do as a user in the admin can't be attributed to you, because "you" is no one.

We've avoided this 'til now because most of the actions you do in the admin are read-only, and there's no action logging for reads. However, there is a log for things like tag creation/deletion.

Now the fix for this can go a few different ways, so I'm looking for some input here:

1. Change the Hack User ID

Currently login-free sessions are handled with this crazy hack of a model which has a hard-coded value of -1 for an id, apparently to avoid conflicting with existing ids in the system. Unfortunately, this means that when Django tries to attribute your addition/deletion of a tag to "you", it uses -1 as the id and explodes.

An easy fix might be to hard-code id = 1 or even make it a property method that returns the first user it finds in the database. As @matthewmoto wrote this bit, I'm curious as to what he thinks.
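The "property method that returns the first user" idea can be sketched outside Django roughly as follows. This is only an illustration under stated assumptions: `LoginFreeUser`, `log_action_as`, and the two tables are hypothetical stand-ins written against plain `sqlite3`, whereas the real Paperless model would go through Django's ORM and may look quite different.

```python
import sqlite3


class LoginFreeUser:
    """Hypothetical stand-in for the user object used when logins are disabled."""

    def __init__(self, conn):
        self._conn = conn

    @property
    def id(self):
        # Borrow the id of the first real user instead of a hard-coded -1.
        row = self._conn.execute(
            "SELECT id FROM user ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            raise LookupError("no users exist yet; run createsuperuser first")
        return row[0]


def log_action_as(conn, user):
    """Write a log row attributed to `user`; fails if user.id matches no user row."""
    conn.execute("INSERT INTO log_entry (user_id) VALUES (?)", (user.id,))


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce constraints, as Django 2.0 does
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute(
    "CREATE TABLE log_entry ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES user(id))"
)
conn.execute("INSERT INTO user (username) VALUES ('admin')")

# The log row now references an existing user, so the constraint is satisfied.
log_action_as(conn, LoginFreeUser(conn))
logged_user_ids = [row[0] for row in conn.execute("SELECT user_id FROM log_entry")]
```

The trade-off is that every action in a login-free instance is attributed to whichever account happens to come first, and the lookup fails outright if no account exists.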

2. Disable Logging if Logins are disabled.

Currently, this problem is being triggered because Django is trying to log the act of adding/deleting a tag, so if we hack around this decision-making process to choose not to log if logins are disabled, that'd work. However it would only fix the problem in the case of logging. Should we later add something that assumes a logged-in user, things would explode again.

3. Write a Proper Front-end

This is the preferred solution, because Paperless at its core is an abuse of the Django admin. Much of what's breaking is a direct result of us imposing things on the admin that it was never meant to shoulder.

However, I don't have time to write this, so while it's the best option for the project, it's not likely to happen unless someone picks up this task.


So basically we're looking at 1 or 2 I guess, but I'm curious how you guys use it. For example, Option 1 will only work if there's a user to assign to the logged-in user's process. I'm pretty sure Paperless doesn't work unless you've at least (followed the setup docs to) run createsuperuser. That user alone could be the to-be-assigned-all-things-done-in-the-admin-user if that's what makes sense.

So what do you think? I have no strong feelings either way we go, but it's important to me that this not mess up other people's working instances.

@kmlucy
Author

kmlucy commented Oct 1, 2018

Either 1 or 2 would work for how I use it, i.e. with a single user. I would think option 1 would be a better long term option, because as you said, things could break again in the future.

Would a variation of #1 be possible, where in the settings file, when you disable login you choose the 'default' user? Something like PAPERLESS_NO_LOGIN_USER="kmlucy"?
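A rough sketch of how such a setting could be read is shown below. Everything here is an assumption for illustration: `PAPERLESS_NO_LOGIN_USER` is only the name proposed in this comment, `no_login_username` is a made-up helper, and treating the string "true" (case-insensitively) as "login disabled" is a guess at how `PAPERLESS_DISABLE_LOGIN` is interpreted.

```python
import os


def no_login_username(environ=None):
    """Return the username to attribute actions to, or None if logins are enabled."""
    env = os.environ if environ is None else environ
    # Assumption: login is considered disabled only when the value is "true".
    login_disabled = env.get("PAPERLESS_DISABLE_LOGIN", "false").lower() == "true"
    if not login_disabled:
        return None
    username = env.get("PAPERLESS_NO_LOGIN_USER", "").strip()
    if not username:
        raise ValueError("PAPERLESS_NO_LOGIN_USER must be set when login is disabled")
    return username


example_env = {
    "PAPERLESS_DISABLE_LOGIN": "true",
    "PAPERLESS_NO_LOGIN_USER": "kmlucy",
}
print(no_login_username(example_env))
```

Failing loudly when the username is missing keeps the instance from silently falling back to an id that does not exist.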

@erikarvstedt
Contributor

We had a similar issue with consumer log entries in a login-free setting. The fix was to add an extra 'consumer' user for logging.
Maybe we can fix this bug in the same way: Create a 'webinterface' user and use its id in the User model.

@danielquinn
Collaborator

Ok! I think I've fixed this in the latest push, so I'm going to close it. If it's still a thing though, feel free to re-open.

For the record, I ended up going with a variation of option 1. @erikarvstedt thanks for digging up how we solved this before, but I didn't want to go too far down the road of creating a bunch of non-users to facilitate a hack. The right way will always be option 3, but until that happens, I don't want to create too much of a hack foundation to build on.
