Migration 158 Legacy Data Issues #2616

Merged 2 commits on Jan 21, 2022
31 changes: 17 additions & 14 deletions common/db/db_migrator.rb
@@ -223,9 +223,12 @@ def self.setup_database(db)
!! !!
!! !!
!! Error: !!
!! #{e.inspect} !!
!! #{e.message} !!
!! #{e.backtrace.join("\n")} !!
EOF
e.message.split("\n").each do |line|
$stderr.puts " !! #{line}"
end
$stderr.puts <<EOF
!! !!
!! !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
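
The hunk above prints `e.message` one line at a time, so a multi-line error message keeps the `!!` prefix on every line of the banner. A minimal Ruby sketch of that idea follows. `banner_lines` and `print_banner` are hypothetical names introduced for illustration, not part of `db_migrator.rb`, and the exact prefix spacing is an assumption because the scraped diff does not preserve whitespace.

```ruby
# Hypothetical helper (not in db_migrator.rb): prefix every line of a
# possibly multi-line message so each one lines up inside a "!!" banner.
def banner_lines(message, prefix: " !! ")
  message.to_s.split("\n").map { |line| "#{prefix}#{line}" }
end

# Hypothetical helper: write the prefixed lines to an IO (stderr by default).
def print_banner(message, io: $stderr)
  banner_lines(message).each { |line| io.puts(line) }
end
```

This matters for errors such as the duplicate-ARK message raised in migration 158, which joins one line per offending record with `"\n"`.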
@@ -254,40 +257,40 @@ def self.fail_if_managed_container_migration_needed!(db)

if current_version && current_version > 0 && current_version < CONTAINER_MIGRATION_NUMBER
$stderr.puts <<~EOM

=======================================================================
Important migration issue
=======================================================================

Hello!

It appears that you are upgrading ArchivesSpace from version 1.4.2 or prior. To
complete this upgrade, there are some additional steps to follow.

The 1.5 series of ArchivesSpace introduced a new data model for containers,
along with a compatibility layer to provide a seamless transition between the
old and new container models. In ArchivesSpace version 2.1, this compatibility
layer was removed in the interest of long-term maintainability and system
performance.

To upgrade your ArchivesSpace installation, you will first need to upgrade to
version 2.0.1. This will upgrade your containers to the new model and clear the
path for future upgrades. Once you have done this, you can upgrade to the
latest ArchivesSpace version as normal.

For more information on upgrading to ArchivesSpace 2.0.1, please see the upgrade
guide:

https://archivesspace.github.io/tech-docs/administration/upgrading.html

The upgrade guide for version 1.5.0 also contains specific instructions for
the container upgrade that you will be performing, and the steps in this guide
apply equally to version 2.0.1. You can find that guide here:

https://github.com/archivesspace/archivesspace/blob/master/UPGRADING_1.5.0.md

=======================================================================

EOM

raise ContainerMigrationError.new
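
The guard in this hunk fires only when the schema version is positive but below `CONTAINER_MIGRATION_NUMBER`; a `nil` or `0` version (presumably a database with no migrations applied yet) passes through. The predicate can be sketched in isolation as below. `container_migration_needed?` and the value `60` are hypothetical, chosen only for illustration; the real constant is defined in `db_migrator.rb` and is not shown in this diff.

```ruby
# Hypothetical stand-in for CONTAINER_MIGRATION_NUMBER; the real value
# lives in db_migrator.rb and is not visible in this diff.
EXAMPLE_CONTAINER_MIGRATION_NUMBER = 60

# True only for an existing database (version > 0) that has not yet
# reached the container migration. nil and 0 are treated as "nothing
# to block", mirroring the three-part condition in the hunk above.
def container_migration_needed?(current_version,
                                threshold = EXAMPLE_CONTAINER_MIGRATION_NUMBER)
  !!(current_version && current_version > 0 && current_version < threshold)
end
```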
60 changes: 40 additions & 20 deletions common/db/migrations/158_ark_generalization.rb
@@ -2,8 +2,44 @@
 require 'digest'
 require 'json'

+def delete_zombies
+  # Resources
+  zombies = self[:ark_name]
+    .left_join(:resource, Sequel.qualify(:ark_name, :resource_id) => Sequel.qualify(:resource, :id))
+    .filter(Sequel.~(Sequel.qualify(:ark_name, :resource_id) => nil))
+    .filter(Sequel.qualify(:resource, :id) => nil)
+    .select(Sequel.qualify(:ark_name, :id))
+    .map {|row| row.fetch(:id)}
+
+  self[:ark_name].filter(:id => zombies).delete
+
+  # Archival Objects
+  zombies = self[:ark_name]
+    .left_join(:archival_object, Sequel.qualify(:ark_name, :archival_object_id) => Sequel.qualify(:archival_object, :id))
+    .filter(Sequel.~(Sequel.qualify(:ark_name, :archival_object_id) => nil))
+    .filter(Sequel.qualify(:archival_object, :id) => nil)
+    .select(Sequel.qualify(:ark_name, :id))
+    .map {|row| row.fetch(:id)}
+
+  self[:ark_name].filter(:id => zombies).delete
+end
+
+def check_for_ambiguous_ark_links
+  bad_records = []
+  arks = self[:resource].group_and_count(:external_ark_url).having { count.function.* > 1 }.to_enum.map { |row| row[:external_ark_url] }.compact
+  resources = self[:resource].filter(:external_ark_url => arks).each do |row|
+    bad_records << "Resource #{row[:id]}: #{row[:title]} -- ARK: #{row[:external_ark_url]}"
+  end
+
+  unless bad_records.empty?
+    raise "These resources have duplicate ARK URLs. Please disambiguate before proceeding \n #{bad_records.join("\n")}"
+  end
+end
+
 Sequel.migration do
   up do
+    check_for_ambiguous_ark_links
+
     # New ArkName columns
     alter_table(:ark_name) do
       add_column(:ark_value, String, :null => true)
@@ -24,6 +60,9 @@

     # Migrate existing ARKs to the new layout
     self.transaction do
+
+      delete_zombies
+
       now = (Time.now.to_f * 1000).to_i

       self[:ark_name].update(:is_current => 0, :retired_at_epoch_ms => Sequel.lit("#{now} - id"))
@@ -87,26 +126,7 @@
       drop_column(:external_ark_url)
     end

-    ## delete any unlinked arks
-    # Resources
-    zombies = self[:ark_name]
-      .left_join(:resource, Sequel.qualify(:ark_name, :resource_id) => Sequel.qualify(:resource, :id))
-      .filter(Sequel.~(Sequel.qualify(:ark_name, :resource_id) => nil))
-      .filter(Sequel.qualify(:resource, :id) => nil)
-      .select(Sequel.qualify(:ark_name, :id))
-      .map {|row| row.fetch(:id)}
-
-    self[:ark_name].filter(:id => zombies).delete
-
-    # Archival Objects
-    zombies = self[:ark_name]
-      .left_join(:archival_object, Sequel.qualify(:ark_name, :archival_object_id) => Sequel.qualify(:archival_object, :id))
-      .filter(Sequel.~(Sequel.qualify(:ark_name, :archival_object_id) => nil))
-      .filter(Sequel.qualify(:archival_object, :id) => nil)
-      .select(Sequel.qualify(:ark_name, :id))
-      .map {|row| row.fetch(:id)}
-
-    self[:ark_name].filter(:id => zombies).delete
+    delete_zombies

     # We can now safely introduce foreign keys
     alter_table(:ark_name) do
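
The two data checks in this migration can be illustrated without a database. The sketch below mirrors them on plain arrays of hashes: duplicate detection corresponds to `group_and_count ... having count > 1` followed by `compact`, and zombie detection corresponds to the left join that keeps `ark_name` rows whose parent id is set but matches no parent row. All three method names are hypothetical and exist only in this example; the migration itself uses Sequel datasets.

```ruby
# Hypothetical, in-memory illustration of the migration's two checks.
# Rows are plain hashes standing in for database records.

# URLs shared by more than one resource. The nil group is ignored,
# which is what the trailing `.compact` achieves in the migration.
def duplicate_ark_urls(resources)
  resources
    .group_by { |row| row[:external_ark_url] }
    .select { |url, rows| !url.nil? && rows.length > 1 }
    .keys
end

# Resources whose external ARK URL is ambiguous (shared with another row).
def ambiguous_resources(resources)
  dupes = duplicate_ark_urls(resources)
  resources.select { |row| dupes.include?(row[:external_ark_url]) }
end

# Ids of ark_name rows that reference a parent (via the foreign-key
# column `fk`) that no longer exists: the "zombies" of the migration.
def zombie_ark_ids(ark_names, parent_ids, fk)
  ark_names
    .reject { |ark| ark[fk].nil? }                # not linked via this column
    .reject { |ark| parent_ids.include?(ark[fk]) } # parent still exists
    .map { |ark| ark.fetch(:id) }
end
```

Calling `zombie_ark_ids` once with `:resource_id` and once with `:archival_object_id` corresponds to the two passes inside `delete_zombies`. Removing these rows before the foreign keys are added is what lets the later `alter_table` succeed on legacy data.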