Bugfix: sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version and .output #597

Closed
coisnepe opened this issue Jan 3, 2021 · 15 comments
Labels: size: easy, status: done, status: needs followup, type: bug report

coisnepe commented Jan 3, 2021

Describe the bug

Cannot start archivebox after moving from nikisweeting/archivebox to archivebox/archivebox. A Django/SQLite error ensues.

Steps to reproduce

Might be related to #596? I killed my docker container running nikisweeting/archivebox:latest and tried recreating a container with archivebox/archivebox:latest, linking to the same old volume. I did prepend the server command with archivebox init in order to run 3 new migrations. The container fails to start, issuing the following error.

It looks like this is the migration that is not going through, which might point to a corrupted/faulty JSON index file?

Screenshots or log output

[*] Verifying main SQL index and running migrations...
    √ /data/index.sqlite3

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/archivebox/core/migrations/0007_archiveresult.py", line 33, in forwards_func
    ArchiveResult.objects.create(extractor=extractor, snapshot=snapshot, cmd=result["cmd"], cmd_version=result["cmd_version"], 
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 753, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 790, in save_base
    updated = self._save_table(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 933, in _do_insert
    return manager._insert(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 123, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 63, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 323, in init
    for migration_line in apply_migrations(out_dir):
  File "/app/archivebox/util.py", line 113, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/sql.py", line 102, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 229, in apply_migration
    migration_recorded = True
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/schema.py", line 35, in __exit__
    self.connection.check_constraints()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 324, in check_constraints
    violations = cursor.execute('PRAGMA foreign_key_check').fetchall()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 78, in _execute
    self.db.validate_no_broken_transaction()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py", line 447, in validate_no_broken_transaction
    raise TransactionManagementError(
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.

Software versions

  • OS: Docker on Synology NAS
  • ArchiveBox version: archivebox/archivebox:latest
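
The first error in the traceback above can be reproduced in isolation with Python's built-in sqlite3 module. This is only a minimal sketch: the table below is a hypothetical, stripped-down stand-in for the real core_archiveresult table, and insert_result is a made-up helper, not ArchiveBox code. It shows that passing None for a NOT NULL column is what triggers the IntegrityError.

```python
import sqlite3

# Hypothetical, stripped-down stand-in for the real core_archiveresult table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE core_archiveresult ("
    "id INTEGER PRIMARY KEY, "
    "extractor TEXT NOT NULL, "
    "cmd_version TEXT NOT NULL)"
)


def insert_result(extractor, cmd_version):
    """Insert one row; return the error message, or None on success."""
    try:
        conn.execute(
            "INSERT INTO core_archiveresult (extractor, cmd_version) VALUES (?, ?)",
            (extractor, cmd_version),
        )
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


ok = insert_result("wget", "1.20.1")
failed = insert_result("wget", None)  # None is sent to SQLite as NULL

print(ok)
print(failed)
```

In the real migration the None would come from a history entry in an old index.json that has no usable cmd_version value.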
@pirate
Member

pirate commented Jan 4, 2021

@cdvv7788 can you make that field nullable? I suspected we might have some old archive users with no cmd versions in their detail JSON files.

@cdvv7788
Contributor

cdvv7788 commented Jan 4, 2021

@pirate #599 I gave it a default. It should work unless there are cmd_version fields with explicit nulls.
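
The difference between a missing value and an explicit null can be illustrated with plain dictionaries. This is a sketch of the general pitfall, not the code from #599; both helper names are hypothetical. A default only applies when the key is absent, while an `or` fallback (the form that appears in a later traceback in this thread) also covers an explicit null.

```python
def cmd_version_with_default(result):
    # A dict default only applies when the key is missing entirely.
    return result.get("cmd_version", "unknown")


def cmd_version_or_unknown(result):
    # `or` also replaces an explicit None (JSON null) or an empty string.
    return result.get("cmd_version") or "unknown"


missing = {"cmd": ["curl", "-I"]}
explicit_null = {"cmd": ["curl", "-I"], "cmd_version": None}

print(cmd_version_with_default(missing))
print(cmd_version_with_default(explicit_null))
print(cmd_version_or_unknown(explicit_null))
```

With the default alone, the explicit null still passes through as None and would hit the NOT NULL constraint.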

@hwangeug

hwangeug commented Jan 10, 2021

I'm getting the same error, migrating from 0.4.21 to 0.5.3, using a PyPI install rather than Docker:

[*] Verifying main SQL index and running migrations...
    √ /home/eugene/Dropbox/Apps/archivebox/index.sqlite3

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/usr/lib/python3.9/site-packages/archivebox/core/migrations/0007_archiveresult.py", line 39, in forwards_func
    ArchiveResult.objects.create(extractor=extractor, snapshot=snapshot, cmd=result["cmd"], cmd_version=result["cmd_version"], 
  File "/usr/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 753, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 790, in save_base
    updated = self._save_table(
  File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/lib/python3.9/site-packages/django/db/models/base.py", line 933, in _do_insert
    return manager._insert(
  File "/usr/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox==0.5.3', 'console_scripts', 'archivebox')())
  File "/usr/lib/python3.9/site-packages/archivebox/cli/__init__.py", line 129, in main
    run_subcommand(
  File "/usr/lib/python3.9/site-packages/archivebox/cli/__init__.py", line 69, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/usr/lib/python3.9/site-packages/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/usr/lib/python3.9/site-packages/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/archivebox/main.py", line 324, in init
    for migration_line in apply_migrations(out_dir):
  File "/usr/lib/python3.9/site-packages/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/archivebox/index/sql.py", line 98, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/lib/python3.9/site-packages/django/db/migrations/executor.py", line 229, in apply_migration
    migration_recorded = True
  File "/usr/lib/python3.9/site-packages/django/db/backends/sqlite3/schema.py", line 35, in __exit__
    self.connection.check_constraints()
  File "/usr/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 324, in check_constraints
    violations = cursor.execute('PRAGMA foreign_key_check').fetchall()
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 98, in execute
    return super().execute(sql, params)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/lib/python3.9/site-packages/django/db/backends/utils.py", line 78, in _execute
    self.db.validate_no_broken_transaction()
  File "/usr/lib/python3.9/site-packages/django/db/backends/base/base.py", line 447, in validate_no_broken_transaction
    raise TransactionManagementError(
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.

@coisnepe
Author

I ended up stopping the container, downloading the database off of my NAS, running the migration manually in raw SQL, inserting the migration log, uploading the DB, and restarting the container.
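
Before attempting anything like the manual fix described above, it can help to check which migrations Django has recorded as applied. The following is a read-only sketch, assuming Django's standard django_migrations table; the in-memory database and its rows are made up for illustration, and applied_migrations is a hypothetical helper. Against a real collection, one would connect to a copy of index.sqlite3 instead.

```python
import sqlite3


def applied_migrations(conn, app):
    """Return the migration names Django has recorded for one app, in order."""
    rows = conn.execute(
        "SELECT name FROM django_migrations WHERE app = ? ORDER BY id",
        (app,),
    )
    return [name for (name,) in rows]


# Illustrative in-memory stand-in for index.sqlite3; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE django_migrations ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "app VARCHAR(255) NOT NULL, "
    "name VARCHAR(255) NOT NULL, "
    "applied DATETIME NOT NULL)"
)
conn.executemany(
    "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
    [
        ("core", "0001_initial", "2020-01-01 00:00:00"),
        ("auth", "0001_initial", "2020-01-01 00:00:00"),
    ],
)

core_names = applied_migrations(conn, "core")
print(core_names)
print("0007_archiveresult" in core_names)
```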

@pirate
Member

pirate commented Jan 11, 2021

This is fixed in the next minor patch version v0.5.4 (not yet out). If you want it early you can build it by doing git checkout dev; git pull; docker build . -t archivebox.

@coisnepe while impressive, in the future I advise against running manual migrations like that, because it may be hard to install v0.5.4 without conflicting with your manual migration ;)

You may be able to figure out how to apply v0.5.4 without breaking anything, but if not just lemme know and I can probably help you apply v0.5.4 on top of your manual fix.

@coisnepe
Author

@pirate yeah I figured it probably wasn't going to be the smartest way but I couldn't resist. Thank you so much for not holding it against me though! I'll give it a try and see how it goes.

@pirate
Member

pirate commented Jan 12, 2021

Fixed in this commit: a3008c8. It's now released in v0.5.4.

pirate added the labels type: bug report, size: easy, status: done on Jan 12, 2021
pirate closed this as completed on Feb 1, 2021
@drpfenderson

drpfenderson commented Feb 1, 2021

I am experiencing what seems to be this same issue when running the newly-released v0.5.4 using docker-compose. The archive I'm trying to upgrade is from v0.4.21.

Creating network "archive-upgrade_default" with the default driver
Creating archive-upgrade_sonic_1 ... done
[i] [2021-02-01 16:29:32] ArchiveBox v0.5.4: archivebox init
    > /data

[!] This folder contains a JSON index. It is deprecated, and will no longer be kept up to date automatically.
    You can run `archivebox list --json --with-headers > index.json` to manually generate it.
[*] Updating existing ArchiveBox collection in this folder...
    /data
------------------------------------------------------------------

[*] Verifying archive folder structure...
    √ /data/sources
    √ /data/archive
    √ /data/logs
    √ /data/ArchiveBox.conf

[*] Verifying main SQL index and running migrations...
    √ /data/index.sqlite3

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.output

The above exception was the direct cause of the following exception:


Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/archivebox/core/migrations/0007_archiveresult.py", line 39, in forwards_func
    ArchiveResult.objects.create(extractor=extractor, snapshot=snapshot, cmd=result["cmd"], cmd_version=result["cmd_version"] or 'unknown',
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 447, in create
    obj.save(force_insert=True, using=self.db)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 753, in save
    self.save_base(using=using, force_insert=force_insert,
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 790, in save_base
    updated = self._save_table(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 895, in _save_table
    results = self._do_insert(cls._base_manager, using, fields, returning_fields, raw)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/base.py", line 933, in _do_insert
    return manager._insert(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/query.py", line 1254, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
  File "/usr/local/lib/python3.9/site-packages/django/db/models/sql/compiler.py", line 1397, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
django.db.utils.IntegrityError: NOT NULL constraint failed: core_archiveresult.output

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 129, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 69, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 325, in init
    for migration_line in apply_migrations(out_dir):
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/sql.py", line 98, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 229, in apply_migration
    migration_recorded = True
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/schema.py", line 35, in __exit__
    self.connection.check_constraints()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 324, in check_constraints
    violations = cursor.execute('PRAGMA foreign_key_check').fetchall()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
    return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
    return executor(sql, params, many, context)
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/utils.py", line 78, in _execute
    self.db.validate_no_broken_transaction()
  File "/usr/local/lib/python3.9/site-packages/django/db/backends/base/base.py", line 447, in validate_no_broken_transaction
    raise TransactionManagementError(
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.

Output of --version:

ArchiveBox v0.5.4
Cpython Linux Linux-5.4.0-45-generic-x86_64-with-glibc2.28 x86_64 (in Docker)

[i] Dependency versions:
 √  ARCHIVEBOX_BINARY     v0.5.4          valid     /usr/local/bin/archivebox
 √  PYTHON_BINARY         v3.9.1          valid     /usr/local/bin/python3.9
 √  DJANGO_BINARY         v3.1.3          valid     /usr/local/lib/python3.9/site-packages/django/bin/django-admin.py
 √  CURL_BINARY           v7.64.0         valid     /usr/bin/curl
 √  WGET_BINARY           v1.20.1         valid     /usr/bin/wget
 √  NODE_BINARY           v15.7.0         valid     /usr/bin/node
 √  SINGLEFILE_BINARY     v0.1.14         valid     /node/node_modules/single-file/cli/single-file
 √  READABILITY_BINARY    v0.1.0          valid     /node/node_modules/readability-extractor/readability-extractor
 √  MERCURY_BINARY        v1.0.0          valid     /node/node_modules/@postlight/mercury-parser/cli.js
 -  GIT_BINARY            -               disabled  /usr/bin/git
 -  YOUTUBEDL_BINARY      -               disabled  /usr/local/bin/youtube-dl
 √  CHROME_BINARY         v87.0.4280.141  valid     /usr/bin/chromium
 √  RIPGREP_BINARY        v0.10.0         valid     /usr/bin/rg

[i] Source-code locations:
 √  PACKAGE_DIR           22 files        valid     /app/archivebox
 √  TEMPLATES_DIR         3 files         valid     /app/archivebox/templates

[i] Secrets locations:
 -  CHROME_USER_DATA_DIR  -               disabled
 -  COOKIES_FILE          -               disabled

[i] Data locations:
 √  OUTPUT_DIR            15 files        valid     /data
 √  SOURCES_DIR           82 files        valid     ./sources
 √  LOGS_DIR              0 files         valid     ./logs
 √  ARCHIVE_DIR           1367 files      valid     ./archive
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf
 √  SQL_INDEX             1.7 MB          valid     ./index.sqlite3

I thought old config or something similar might be interfering, so I went through and removed any user directory cruft I could find, and also ran:

docker-compose down
docker rm -f $(docker ps -a -q)
docker image prune
docker rmi `docker images -q`
docker-compose pull

Did not fix it, same error. Anywhere else something might be hiding to interfere? Or possibly a different, unrelated problem?

@pirate
Member

pirate commented Feb 1, 2021

Different issue, but related; I can fix it. Can you post one of your archive/<timestamp>/index.json files from the old version so I can check for any other null/missing fields?
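
A check like the one requested here could be scripted roughly as follows. This is a sketch under assumptions: the field list is taken from the ArchiveResult entries posted in this thread, find_null_fields is a hypothetical helper, and the sample data is made up. It walks each extractor's history and reports fields that are missing or null.

```python
import json

# Fields seen in the ArchiveResult entries posted in this thread.
EXPECTED_FIELDS = ("cmd", "cmd_version", "output", "pwd", "start_ts", "end_ts", "status")


def find_null_fields(index):
    """Yield (extractor, position, field) for each missing or null field."""
    for extractor, results in index.get("history", {}).items():
        for position, result in enumerate(results):
            for field in EXPECTED_FIELDS:
                if result.get(field) is None:
                    yield extractor, position, field


# Made-up sample; a real file would be read with json.load() instead.
sample = json.loads("""
{
  "history": {
    "wget": [
      {"cmd": ["wget"], "cmd_version": null, "output": null,
       "pwd": "/data", "start_ts": "2018-10-09T00:20:50",
       "end_ts": "2018-10-09T00:24:08", "status": "failed"}
    ],
    "dom": [
      {"cmd": [], "cmd_version": "Undefined", "output": "output.html",
       "pwd": "/data", "start_ts": "2018-10-17T22:40:55",
       "end_ts": "2018-10-17T22:40:55", "status": "skipped"}
    ]
  }
}
""")

problems = list(find_null_fields(sample))
print(problems)
```

An empty list such as "cmd": [] is not flagged, since it is neither missing nor null.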

pirate changed the title from "Bugfix: sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version" to "Bugfix: sqlite3.IntegrityError: NOT NULL constraint failed: core_archiveresult.cmd_version and .output" on Feb 1, 2021
pirate added the labels status: needs followup, status: wip on Feb 1, 2021
pirate reopened this on Feb 1, 2021
@drpfenderson

Here is one with user/drive snipped out:

{
    "archive_path": "archive/1222742107",
    "base_url": "www.cjfearnley.com/buckyrefs.html",
    "basename": "buckyrefs.html",
    "bookmarked_date": "2008-09-30 02:35",
    "canonical": {
        "archive_org_path": "https://web.archive.org/web/www.cjfearnley.com/buckyrefs.html",
        "dom_path": "output.html",
        "favicon_path": "favicon.ico",
        "git_path": "git",
        "google_favicon_path": "https://www.google.com/s2/favicons?domain=www.cjfearnley.com",
        "index_path": "index.html",
        "media_path": "media",
        "pdf_path": "output.pdf",
        "screenshot_path": "screenshot.png",
        "warc_path": "warc",
        "wget_path": "www.cjfearnley.com/buckyrefs.html"
    },
    "domain": "www.cjfearnley.com",
    "extension": "html",
    "hash": "1JN8QY6Q7KSA2970V7TB",
    "history": {
        "archive_org": [
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:24:08",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044450",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:50",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:44:43",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-L",
                    "-I",
                    "-X",
                    "GET",
                    "https://web.archive.org/save/http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T17:48:07",
                "output": "https://web.archive.org/web/20181018163921/http://www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880758",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:18",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "https://web.archive.org/web/20181018163921/http://www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:39",
                "output": "https://web.archive.org/web/20181018163921/http://www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129079",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:39",
                "status": "skipped"
            }
        ],
        "dom": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--dump-dom",
                    "http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:27:55",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044450",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:50",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:40:55",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:39:18",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880758",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:18",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "favicon": [
            {
                "cmd": [
                    "curl",
                    "https://www.google.com/s2/favicons?domain=www.cjfearnley.com"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:27:14",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044451",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:51",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:40:55",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:39:22",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880762",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:22",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "git": [
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": null,
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "media": [
            {
                "cmd": [
                    "youtube-dl",
                    "--write-description",
                    "--write-info-json",
                    "--write-annotations",
                    "--yes-playlist",
                    "--write-thumbnail",
                    "--no-call-home",
                    "--no-check-certificate",
                    "--user-agent",
                    "--all-subs",
                    "--extract-audio",
                    "--keep-video",
                    "--ignore-errors",
                    "--geo-bypass",
                    "--audio-format",
                    "mp3",
                    "--audio-quality",
                    "320K",
                    "--embed-thumbnail",
                    "--add-metadata",
                    "http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T01:18:31",
                "output": "media",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "succeded"
            }
        ],
        "pdf": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--print-to-pdf",
                    "http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:31:41",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044449",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:49",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:40:55",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:39:18",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880758",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:18",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "screenshot": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--screenshot",
                    "--window-size=1440,1200",
                    "--hide-scrollbars",
                    "http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:31:29",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044449",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:49",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:40:55",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:39:18",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880758",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:18",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "title": [
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "List of Buckminster Fuller Resources on the Int...",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ],
        "wget": [
            {
                "cmd": [
                    "wget",
                    "-N",
                    "-E",
                    "-np",
                    "-x",
                    "-H",
                    "-k",
                    "-K",
                    "-S",
                    "--restrict-file-names=unix",
                    "-p",
                    "http://www.cjfearnley.com/buckyrefs.html"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-09T00:30:01",
                "output": "www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539044448",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-09T00:20:48",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:40:55",
                "output": "www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539816055",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:40:55",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:39:18",
                "output": "www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539880758",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T16:39:18",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:32:08",
                "output": "www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544653928",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:32:08",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T00:44:37",
                "output": "www.cjfearnley.com/buckyrefs.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553129077",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T00:44:37",
                "status": "skipped"
            }
        ]
    },
    "is_archived": true,
    "is_static": false,
    "latest": {
        "archive_org": "https://web.archive.org/web/20181018163921/http://www.cjfearnley.com/buckyrefs.html",
        "dom": "output.html",
        "favicon": "favicon.ico",
        "git": null,
        "media": "media",
        "pdf": "output.pdf",
        "screenshot": "screenshot.png",
        "title": "List of Buckminster Fuller Resources on the Int...",
        "warc": null,
        "wget": "www.cjfearnley.com/buckyrefs.html"
    },
    "link_dir": "/SNIP/.archivebox-output/archive-working/archive/1222742107",
    "newest_archive_date": "2019-03-21T00:44:39",
    "num_failures": 2,
    "num_outputs": 8,
    "oldest_archive_date": "2018-10-09T00:20:48",
    "path": "/buckyrefs.html",
    "schema": "Link",
    "scheme": "http",
    "sources": [
        "/home/SNIP/pinboard_export"
    ],
    "tags": "audio knowledge research video",
    "timestamp": "1222742107",
    "title": "List of Buckminster Fuller Resources on the Int...",
    "updated": "2020-04-27T19:26:25",
    "updated_date": "2020-04-27 19:26",
    "url": "http://www.cjfearnley.com/buckyrefs.html"
}
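
Note that in the dump above the single entry under "git" has `"output": null`, while every `cmd_version` is the string "Undefined". That null may be what trips the NOT NULL constraint on `core_archiveresult.output`, though this is a guess from the error message, not something confirmed against the migration code. A minimal sketch for scanning a link's JSON for such entries follows; `find_null_fields` is a hypothetical helper, not part of ArchiveBox, and the sample is a trimmed stand-in for the real index file:

```python
import json


def find_null_fields(link, fields=("cmd_version", "output")):
    """Return (extractor, position, field) for every history entry
    where one of the given fields is null or missing."""
    problems = []
    for extractor, results in link.get("history", {}).items():
        for position, result in enumerate(results):
            for field in fields:
                if result.get(field) is None:
                    problems.append((extractor, position, field))
    return problems


# Trimmed-down stand-in for the link JSON shown above (assumed shape).
sample = json.loads("""
{
    "history": {
        "git": [
            {"cmd": [], "cmd_version": "Undefined", "output": null, "status": "skipped"}
        ],
        "title": [
            {"cmd": [], "cmd_version": "Undefined", "output": "List of ...", "status": "skipped"}
        ]
    }
}
""")

for extractor, position, field in find_null_fields(sample):
    print(f"{extractor}[{position}] has null {field}")
```

To check a real snapshot, the same function could be fed `json.load(open(path))` for each per-link index JSON file, assuming those files have the shape shown in this comment.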

Here is another:

{
    "archive_path": "archive/1398698567",
    "base_url": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers",
    "basename": "",
    "bookmarked_date": "2014-04-28 15:22",
    "canonical": {
        "archive_org_path": "https://web.archive.org/web/arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers",
        "dom_path": "output.html",
        "favicon_path": "favicon.ico",
        "git_path": "git",
        "google_favicon_path": "https://www.google.com/s2/favicons?domain=arstechnica.com",
        "index_path": "index.html",
        "media_path": "media",
        "pdf_path": "output.pdf",
        "screenshot_path": "screenshot.png",
        "warc_path": "warc",
        "wget_path": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html"
    },
    "domain": "arstechnica.com",
    "extension": "",
    "hash": "1QXZS47MR74RTXK16XF5",
    "history": {
        "archive_org": [
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-05T23:45:33",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782918",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:58",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:12:37",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:09:38",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036361",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:01",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-I",
                    "https://web.archive.org/save/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:20:01",
                "output": "Failed to find \"content-location\" URL header in Archive.org response.",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "failed"
            },
            {
                "cmd": [
                    "curl",
                    "-L",
                    "-I",
                    "-X",
                    "GET",
                    "https://web.archive.org/save/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T16:37:33",
                "output": "https://web.archive.org/web/20181018155909/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878349",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:09",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "https://web.archive.org/web/20181018155909/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:15",
                "output": "https://web.archive.org/web/20181018155909/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123895",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:15",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "https://web.archive.org/web/20181018155909/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "dom": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--dump-dom",
                    "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-06T00:58:02",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782913",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:53",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:07:21",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:06:01",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036361",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:01",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:16:40",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T15:59:09",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878349",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:09",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "output.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "favicon": [
            {
                "cmd": [
                    "curl",
                    "https://www.google.com/s2/favicons?domain=arstechnica.com"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-05T23:44:04",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782918",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:58",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:07:21",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:06:02",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036362",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:02",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:16:40",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T15:59:11",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878351",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:11",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "favicon.ico",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "git": [
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": null,
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": null,
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "media": [
            {
                "cmd": [
                    "youtube-dl",
                    "--write-description",
                    "--write-info-json",
                    "--write-annotations",
                    "--yes-playlist",
                    "--write-thumbnail",
                    "--no-call-home",
                    "--no-check-certificate",
                    "--user-agent",
                    "--all-subs",
                    "--extract-audio",
                    "--keep-video",
                    "--ignore-errors",
                    "--geo-bypass",
                    "--audio-format",
                    "mp3",
                    "--audio-quality",
                    "320K",
                    "--embed-thumbnail",
                    "--add-metadata",
                    "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:56:12",
                "output": "media",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "media",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "pdf": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--print-to-pdf",
                    "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-06T01:18:51",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782875",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:15",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:07:21",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:06:01",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036361",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:01",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:16:40",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T15:59:09",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878349",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:09",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "output.pdf",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "screenshot": [
            {
                "cmd": [
                    "chromium-browser",
                    "--headless",
                    "--screenshot",
                    "--window-size=1440,1200",
                    "--hide-scrollbars",
                    "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-06T08:30:15",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782881",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:21",
                "status": "succeded"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:07:21",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:06:01",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036361",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:01",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:16:40",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T15:59:09",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878349",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:09",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "screenshot.png",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "title": [
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "Reading human history using ancient chicken DNA and chili peppers | Ars Tec",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "Reading human history using ancient chicken DNA and chili peppers | Ars Tec",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ],
        "wget": [
            {
                "cmd": [
                    "wget",
                    "-N",
                    "-E",
                    "-np",
                    "-x",
                    "-H",
                    "-k",
                    "-K",
                    "-S",
                    "--restrict-file-names=unix",
                    "-p",
                    "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
                ],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-06T00:37:38",
                "output": "404 Not Found",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1538782872",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-05T23:41:12",
                "status": "failed"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T21:07:21",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539032841",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T21:07:21",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-08T22:06:01",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539036361",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-08T22:06:01",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-17T22:16:40",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539814600",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-17T22:16:40",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-10-18T15:59:09",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1539878349",
                "schema": "ArchiveResult",
                "start_ts": "2018-10-18T15:59:09",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2018-12-12T22:05:30",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1544652330",
                "schema": "ArchiveResult",
                "start_ts": "2018-12-12T22:05:30",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-20T23:18:13",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553123893",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-20T23:18:13",
                "status": "skipped"
            },
            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html",
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
        ]
    },
    "is_archived": true,
    "is_static": false,
    "latest": {
        "archive_org": "https://web.archive.org/web/20181018155909/http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
        "dom": "output.html",
        "favicon": "favicon.ico",
        "git": null,
        "media": "media",
        "pdf": "output.pdf",
        "screenshot": "screenshot.png",
        "title": "Reading human history using ancient chicken DNA and chili peppers | Ars Tec",
        "warc": null,
        "wget": "arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/index.html"
    },
    "link_dir": "/SNIP/.archivebox-output/archive-working/archive/1398698567",
    "newest_archive_date": "2019-03-21T16:38:45",
    "num_failures": 5,
    "num_outputs": 8,
    "oldest_archive_date": "2018-10-05T23:41:12",
    "path": "/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/",
    "schema": "Link",
    "scheme": "http",
    "sources": [
        "/home/SNIP/pinboard_export"
    ],
    "tags": null,
    "timestamp": "1398698567",
    "title": "Reading human history using ancient chicken DNA and chili peppers | Ars Tec",
    "updated": "2020-04-27T18:57:13",
    "updated_date": "2020-04-27 18:57",
    "url": "http://arstechnica.com/science/2014/04/reading-human-history-using-ancient-chicken-dna-and-chili-peppers/"
}

Let me know if you need any more, and much appreciated!

@pirate
Member

pirate commented Feb 1, 2021

That's perfect, thanks. This is what I was looking for (invalid/empty/badly-parsed fields):

            {
                "cmd": [],
                "cmd_version": "Undefined",
                "end_ts": "2019-03-21T16:38:45",
                "output": null,
                "pwd": "/SNIP/.archivebox-output/archive-working/archive/1553186325",
                "schema": "ArchiveResult",
                "start_ts": "2019-03-21T16:38:45",
                "status": "skipped"
            }
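
To find other entries like this across a collection, one option is to scan each index.json for history results whose fields are missing or null. This is only a minimal sketch: it assumes the per-extractor lists sit under a top-level "history" key (not visible in the excerpt above), the helper name find_suspect_results is hypothetical, and the sample data is made up for illustration.

```python
import json

# Field names are taken from the index.json excerpts in this thread.
REQUIRED_FIELDS = ("cmd", "cmd_version", "output", "pwd", "start_ts", "end_ts", "status")


def find_suspect_results(link: dict) -> list:
    """Return (extractor, position, field) for each missing or null field."""
    suspects = []
    # Assumption: results are grouped per extractor under a "history" key.
    for extractor, results in (link.get("history") or {}).items():
        for position, result in enumerate(results):
            for field in REQUIRED_FIELDS:
                # An empty list such as "cmd": [] is not flagged, only None/missing.
                if result.get(field) is None:
                    suspects.append((extractor, position, field))
    return suspects


# Illustrative sample only: one null "output" and one missing "cmd_version".
sample = json.loads("""
{
  "history": {
    "media": [
      {"cmd": [], "cmd_version": "Undefined", "end_ts": "2019-03-21T16:38:45",
       "output": null, "pwd": "/SNIP/archive/1553186325",
       "start_ts": "2019-03-21T16:38:45", "status": "skipped"}
    ],
    "title": [
      {"cmd": [], "end_ts": "2019-03-21T16:38:45", "output": "Some title",
       "pwd": "/SNIP/archive/1553186325",
       "start_ts": "2019-03-21T16:38:45", "status": "skipped"}
    ]
  }
}
""")

for extractor, position, field in find_suspect_results(sample):
    print(f"{extractor}[{position}]: {field} is missing or null")
```

In practice the sample would be replaced by json.load() on each archive/<timestamp>/index.json file.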

It's fixed now in dev 0aea5ed:

docker build -t archivebox:dev https://github.com/ArchiveBox/ArchiveBox.git#dev
docker run -v $PWD:/data archivebox:dev ...

Let me know if that fix works ^

@drpfenderson

drpfenderson commented Feb 1, 2021

Built and ran using the provided commands, but got a different error:

$ docker run -v $PWD:/data archivebox:dev init
[i] [2021-02-01 19:44:57] ArchiveBox v0.5.4: archivebox init
    > /data

[!] This folder contains a JSON index. It is deprecated, and will no longer be kept up to date automatically.
    You can run `archivebox list --json --with-headers > index.json` to manually generate it.
[*] Updating existing ArchiveBox collection in this folder...
    /data
------------------------------------------------------------------

[*] Verifying archive folder structure...
    √ /data/sources
    √ /data/archive
    √ /data/logs
    √ /data/ArchiveBox.conf

[*] Verifying main SQL index and running migrations...
    √ /data/index.sqlite3

Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 33, in <module>
    sys.exit(load_entry_point('archivebox', 'console_scripts', 'archivebox')())
  File "/app/archivebox/cli/__init__.py", line 129, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 69, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
  File "/app/archivebox/cli/archivebox_init.py", line 33, in main
    init(
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/main.py", line 325, in init
    for migration_line in apply_migrations(out_dir):
  File "/app/archivebox/util.py", line 112, in typechecked_function
    return func(*args, **kwargs)
  File "/app/archivebox/index/sql.py", line 98, in apply_migrations
    call_command("migrate", interactive=False, stdout=out)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/__init__.py", line 168, in call_command
    return command.execute(*args, **defaults)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 371, in execute
    output = self.handle(*args, **options)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 243, in handle
    post_migrate_state = executor.migrate(
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 117, in migrate
    state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 147, in _migrate_all_forwards
    state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/executor.py", line 227, in apply_migration
    state = migration.apply(state, schema_editor)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/usr/local/lib/python3.9/site-packages/django/db/migrations/operations/special.py", line 190, in database_forwards
    self.code(from_state.apps, schema_editor)
  File "/app/archivebox/core/migrations/0007_archiveresult.py", line 39, in forwards_func
    ArchiveResult.objects.create(extractor=extractor, snapshot=snapshot, cmd=result["cmd"], cmd_version=result["cmd_version"] or 'unknown',
KeyError: 'cmd_version'
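
The traceback ends in a KeyError rather than an IntegrityError because the `or 'unknown'` fallback only helps when the key exists with an empty value. If the key is absent altogether, the subscript raises before `or` is evaluated. A small sketch of the difference, using a hypothetical history entry rather than one from a real index.json:

```python
# Hypothetical history entry that lacks "cmd_version" entirely,
# which is the situation the KeyError above points to.
result = {"cmd": [], "output": None, "status": "skipped"}

try:
    version = result["cmd_version"] or "unknown"
except KeyError as err:
    # `or` never runs: the subscript raises before any value exists
    print("subscript raised KeyError:", err)

# .get() returns None for a missing key, so `or` can supply the fallback
version = result.get("cmd_version") or "unknown"
# the same idiom also covers keys that are present but null
output = result.get("output") or "null"
print(version, output)
```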

As an aside, I did find a few files that had more null fields using the search term you provided briefly in an edit to your last comment, though I don't know if that's needed now. Just a snippet:

    "latest": {
        "archive_org": "https://web.archive.org/web/20200424012942/https://designsystems.com/space-grids-and-layouts",
        "dom": "output.html",
        "favicon": "favicon.ico",
        "git": null,
        "media": "media",
        "pdf": "output.pdf",
        "screenshot": "screenshot.png",
        "singlefile": "/data/archive/1570818471/singlefile.html",
        "title": null,
        "warc": null,
        "wget": "designsystems.com/space-grids-and-layouts.html"
    },

@pirate
Member

pirate commented Feb 1, 2021

Added another fix for that error. Mind pulling/rebuilding from dev and trying again, @drpfenderson?

The new change just skips invalid links and prints why they failed instead of breaking the entire migration:

                try:
                    ArchiveResult.objects.create(
                        extractor=extractor,
                        snapshot=snapshot,
                        pwd=result["pwd"],
                        cmd=result.get("cmd") or [],
                        cmd_version=result.get("cmd_version") or 'unknown',
                        start_ts=result["start_ts"],
                        end_ts=result["end_ts"],
                        status=result["status"],
                        output=result.get("output") or 'null',
                    )
                except Exception as e:
                    print(
                        '    ! Skipping import due to missing/invalid index.json:',
                        out_dir,
                        e,
                        '(open an issue with this index.json for help)',
                    )
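
The same skip-and-report pattern can be tried outside Django. The following is a sketch only, not the actual migration: import_results is a hypothetical helper that builds plain dicts instead of ArchiveResult rows, and the input entries are invented for illustration.

```python
def import_results(results, out_dir):
    """Normalize valid entries; skip (and report) ones that cannot be imported."""
    imported, skipped = [], []
    for result in results:
        try:
            imported.append({
                # required fields: a missing key raises KeyError and skips the entry
                "pwd": result["pwd"],
                "start_ts": result["start_ts"],
                "end_ts": result["end_ts"],
                "status": result["status"],
                # optional fields: same fallbacks as in the snippet above
                "cmd": result.get("cmd") or [],
                "cmd_version": result.get("cmd_version") or "unknown",
                "output": result.get("output") or "null",
            })
        except Exception as e:
            skipped.append(
                f"! Skipping import due to missing/invalid index.json: {out_dir} {e!r}"
            )
    return imported, skipped


rows, problems = import_results(
    [
        # valid: optional fields are missing or null and get defaults
        {"pwd": "/data/archive/1", "start_ts": "2019-03-21T16:38:45",
         "end_ts": "2019-03-21T16:38:45", "status": "skipped", "output": None},
        # invalid: no pwd/start_ts/end_ts, so it is skipped and reported
        {"cmd": [], "status": "skipped"},
    ],
    out_dir="/data/archive/1",
)
print(len(rows), "imported;", len(problems), "skipped")
for line in problems:
    print(line)
```

Catching the exception per entry means one bad record costs only that record, which is the trade-off the change makes: the migration completes, and the printed lines say which index.json files need a closer look.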

If you see any failed links in the output (lines starting with "! Skipping import due to missing/invalid index.json"), just post a couple of those index.json files in a reply here.

@drpfenderson

Pulled dev, built, and ran init. No errors thrown, and it succeeded! I tested with docker-compose up -d and checked in the browser. There are a few oddities here and there, such as the main index showing 0 sources for a link even though that link's own index page shows all of its sources intact. I'll file a separate bug report for that, since this issue is resolved. Thanks again @pirate and @cdvv7788 for the outstanding work and support.

@pirate closed this as completed Feb 1, 2021
@pirate removed the "status: wip" (Work is in-progress / has already been partially completed) label Jun 1, 2021
@pirate
Member

pirate commented Apr 12, 2022

Note I've added a new DB/filesystem troubleshooting area to the wiki that may help people arriving here from Google: https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives#database-troubleshooting

Contributions/suggestions welcome there.
