Error: django.db.utils.IntegrityError: UNIQUE constraint failed: core_tag.slug #596

Closed
terxw opened this issue Jan 2, 2021 · 3 comments
Labels
status: done Work is completed and released (or scheduled to be released in the next version) type: bug report

terxw commented Jan 2, 2021

After running the newest Docker image (nikisweeting/archivebox:latest) I get the following bug.
Version 0.4.21 ran without this problem, although very slowly. I have 30000 links, which is why I am trying the newer version.

Steps to reproduce

docker-compose.yml

version: "3.7"
services:
    archivebox:
        container_name: archivebox
        # build: .
        image: nikisweeting/archivebox:latest
        command: server 0.0.0.0:8000
        stdin_open: true
        tty: true
        ports:
            - 8000:8000
        environment:
            - USE_COLOR=True
            - SHOW_PROGRESS=False
            - ONLY_NEW=False
            - TIMEOUT=120
            - MEDIA_TIMEOUT=3600
            - FETCH_TITLE=True
            - FETCH_WGET=True
            - FETCH_WARC=True
            - FETCH_PDF=True
            - FETCH_SCREENSHOT=True
            - FETCH_DOM=True
            - FETCH_GIT=True
            - FETCH_MEDIA=false
            - SUBMIT_ARCHIVE_DOT_ORG=True
            - USE_SINGLEFILE=True
            - CHECK_SSL_VALIDITY=False
            - FETCH_WGET_REQUISITES=True
            - RESOLUTION="1440,900"
            - WGET_USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36"
            - CHROME_USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.75 Safari/537.36"
            - CHROME_HEADLESS=True
            - SECRET_KEY=""
        volumes:
            - /etc/localtime:/etc/localtime:ro
            - /storage/data/docs/archivebox:/data

Run it with:

docker-compose run archivebox init

log output

docker-compose run archivebox init
[i] [2021-01-02 10:15:56] ArchiveBox v0.5.0: archivebox init
    > /data
[!] This folder contains a JSON index. It is deprecated, and will no longer be kept up to date automatically.
    You can run `archivebox list --json --with-headers > index.json` to manually generate it.
[*] Updating existing ArchiveBox collection in this folder...
    /data
------------------------------------------------------------------
[*] Verifying archive folder structure...
    √ /data/sources
    √ /data/archive
    √ /data/logs
    √ /data/ArchiveBox.conf
[*] Verifying main SQL index and running migrations...
    √ /data/index.sqlite3
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 573, in get_or_create
    return self.get(**kwargs), False
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 429, in get
    raise self.model.DoesNotExist(
__fake__.DoesNotExist: Tag matching query does not exist.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: core_tag.slug
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 123, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 323, in init
    for migration_line in apply_migrations(out_dir):
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/sql.py", line 102, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/archivebox/core/migrations/0006_auto_20201012_1520.py", line 20, in forwards_func
    to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 576, in get_or_create
    return self._create_object_from_params(kwargs, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 610, in _create_object_from_params
    obj = self.create(**params)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 753, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 790, in save_base
    updated = self._save_table(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 933, in _do_insert
    return manager._insert(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: UNIQUE constraint failed: core_tag.slug    

Software versions

  • OS: ubuntu 20.04
  • ArchiveBox version: latest docker 0.5.0 nikisweeting/archivebox:latest
  • Python version: 3.9 in docker
  • Chrome version: from docker

pirate commented Jan 2, 2021

Thanks for reporting, we'll investigate. In the meantime, can you switch to using archivebox/archivebox:latest? We moved away from the old nikisweeting/archivebox repo under my personal account to a new, more "official" one.


pirate commented Feb 1, 2021

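It's caused by get_or_create() using both name=... and slug=... as the lookup for a unique Tag when only one of those fields matches an existing Tag you have. The lookup finds nothing, so it tries to insert a new row, and the insert fails on the unique constraint.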

archivebox/core/migrations/0006_auto_20201012_1520.py:

def forwards_func(apps, schema_editor):
    SnapshotModel = apps.get_model("core", "Snapshot")
    TagModel = apps.get_model("core", "Tag")

    db_alias = schema_editor.connection.alias
    snapshots = SnapshotModel.objects.all()
    for snapshot in snapshots:
        tags = snapshot.tags
        tag_set = (
            set(tag.strip() for tag in (snapshot.tags_old or '').split(','))
        )
        tag_set.discard("")

        for tag in tag_set:
            to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
            snapshot.tags.add(to_add)
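
One way the loop above can hit the unique constraint on slug is when two different tag names produce the same slug, for example names that differ only in case. The sketch below is a hypothetical diagnostic, not part of ArchiveBox: `approx_slugify` is only an approximation of Django's `slugify`, and the input is assumed to be the comma-separated `tags_old` strings that the migration reads.

```python
import re
import unicodedata
from collections import defaultdict


def approx_slugify(value):
    """Approximation of Django's slugify for ASCII slugs (not the real function)."""
    value = unicodedata.normalize("NFKD", value).encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value.lower())
    return re.sub(r"[-\s]+", "-", value).strip("-_")


def find_slug_collisions(tags_old_values):
    """Group distinct tag names by slug; return slugs shared by more than one name."""
    names_by_slug = defaultdict(set)
    for raw in tags_old_values:
        # Same splitting and stripping as the migration: comma-separated, blanks dropped.
        for tag in (raw or "").split(","):
            tag = tag.strip()
            if tag:
                names_by_slug[approx_slugify(tag)].add(tag)
    return {slug: sorted(names) for slug, names in names_by_slug.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up sample values, standing in for Snapshot.tags_old.
    print(find_slug_collisions(["Python, django", "python", None, "C++,c"]))
```

Any slug reported with more than one name is a candidate for the conflict; which names actually collide in a given collection is not stated in this thread.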

This is a rare but annoying edge case. It was a simple one-line fix (aa84a7f), but I didn't manage to add it in time for the v0.5.4 release.

        for tag in tag_set:
-            to_add, _ = TagModel.objects.get_or_create(name=tag, slug=slugify(tag))
+            to_add, _ = TagModel.objects.get_or_create(name=tag, defaults={'slug': slugify(tag)})
            snapshot.tags.add(to_add)
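
The difference between the two calls can be sketched without Django. The example below uses only the standard-library `sqlite3` module; `get_or_create` here is a hypothetical stand-in that mimics the lookup-then-insert behavior, and the `core_tag` schema (unique `name`, unique `slug`) and the pre-existing row are assumptions made for illustration.

```python
import sqlite3


def get_or_create(conn, lookup, defaults=None):
    """Hypothetical stand-in for Django's get_or_create: returns (row, created)."""
    where = " AND ".join(f"{col} = ?" for col in lookup)
    select = f"SELECT id, name, slug FROM core_tag WHERE {where}"
    row = conn.execute(select, tuple(lookup.values())).fetchone()
    if row is not None:
        return row, False
    # Nothing matched the lookup: insert the lookup fields plus the defaults.
    params = {**lookup, **(defaults or {})}
    cols = ", ".join(params)
    marks = ", ".join("?" for _ in params)
    conn.execute(f"INSERT INTO core_tag ({cols}) VALUES ({marks})", tuple(params.values()))
    return conn.execute(select, tuple(lookup.values())).fetchone(), True


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE core_tag ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, slug TEXT UNIQUE NOT NULL)"
    )
    # Assumed existing tag: same name as the incoming tag, but a different slug.
    conn.execute("INSERT INTO core_tag (name, slug) VALUES ('python', 'python-1')")

    # Old call: both fields are part of the lookup, so the existing row is
    # missed and the insert that follows violates a unique constraint.
    try:
        get_or_create(conn, {"name": "python", "slug": "python"})
        old_call = "ok"
    except sqlite3.IntegrityError as exc:
        old_call = f"IntegrityError: {exc}"

    # Fixed call: only name is looked up; slug is used only when creating.
    row, created = get_or_create(conn, {"name": "python"}, defaults={"slug": "python"})
    return old_call, row, created


if __name__ == "__main__":
    print(demo())
```

With `defaults`, the slug no longer takes part in the lookup, so an existing row with the same name is returned instead of a second insert being attempted.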

You're welcome to run dev, or wait for the next release:

docker build -t archivebox:dev https://github.com/ArchiveBox/ArchiveBox.git#dev
docker run -v $PWD:/data archivebox:dev ...

Let me know if you're still having issues after that and I can reopen the ticket.

@pirate pirate added the status: done Work is completed and released (or scheduled to be released in the next version) label Feb 1, 2021
@pirate pirate closed this as completed Feb 1, 2021

pirate commented Apr 12, 2022

Note I've added a new DB/filesystem troubleshooting area to the wiki that may help people arriving here from Google: https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#database-troubleshooting

Contributions/suggestions welcome there.
