Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActiveStorage::Attached::Many#attach may erase files uploaded concurrently #42941

Open
smt116 opened this issue Aug 4, 2021 · 21 comments
Open

Comments

@smt116
Copy link

smt116 commented Aug 4, 2021

Steps to reproduce

First setup the sample application:

  1. gem install rails -v6.1.4

  2. rails new ConcurrentAttachTestApp --skip-action-mailer --skip-action-mailbox --skip-action-text --skip-action-cable --skip-sprockets --skip-spring --skip-listen --skip-javascript --skip-turbolinks --skip-jbuilder --skip-test --skip-system-test --skip-bootsnap

  3. apply the following diff:

    diff --git a/Gemfile b/Gemfile
    index 01a705b..25b3e19 100644
    --- a/Gemfile
    +++ b/Gemfile
    @@ -15,6 +15,8 @@ gem 'puma', '~> 5.0'
     # Use Active Storage variant
     # gem 'image_processing', '~> 1.2'
    
    +gem 'pg'
    +
     group :development, :test do
       # Call 'byebug' anywhere in the code to stop execution and get a debugger console
       gem 'byebug', platforms: [:mri, :mingw, :x64_mingw]
    @@ -26,6 +28,8 @@ group :development do
       # Display performance information such as SQL time and flame graphs for each request in your browser.
       # Can be configured to work on production as well see: https://github.com/MiniProfiler/rack-mini-profiler/blob/master/README.md
       gem 'rack-mini-profiler', '~> 2.0'
    +
    +  gem 'parallel'
     end
    
     # Windows does not include zoneinfo files, so bundle the tzinfo-data gem
    diff --git a/app/models/user.rb b/app/models/user.rb
    new file mode 100644
    index 0000000..2db75b6
    --- /dev/null
    +++ b/app/models/user.rb
    @@ -0,0 +1,3 @@
    +class User < ApplicationRecord
    +  has_many_attached :files
    +end
    diff --git a/config/database.yml b/config/database.yml
    index 4a8a1b2..78ac63a 100644
    --- a/config/database.yml
    +++ b/config/database.yml
    @@ -10,8 +10,12 @@ default: &default
       timeout: 5000
    
     development:
    -  <<: *default
    -  database: db/development.sqlite3
    +  adapter: postgresql
    +  database: concurrent_attach_development
    +  encoding: unicode
    +  host: localhost
    +  pool: 5
    +  username: postgres
    
     # Warning: The database defined as "test" will be erased and
     # re-generated from your development database when you run "rake".
    diff --git a/db/migrate/20210804080852_create_active_storage_tables.active_storage.rb b/db/migrate/20210804080852_create_active_storage_tables.active_storage.rb
    new file mode 100644
    index 0000000..8779826
    --- /dev/null
    +++ b/db/migrate/20210804080852_create_active_storage_tables.active_storage.rb
    @@ -0,0 +1,36 @@
    +# This migration comes from active_storage (originally 20170806125915)
    +class CreateActiveStorageTables < ActiveRecord::Migration[5.2]
    +  def change
    +    create_table :active_storage_blobs do |t|
    +      t.string   :key,          null: false
    +      t.string   :filename,     null: false
    +      t.string   :content_type
    +      t.text     :metadata
    +      t.string   :service_name, null: false
    +      t.bigint   :byte_size,    null: false
    +      t.string   :checksum,     null: false
    +      t.datetime :created_at,   null: false
    +
    +      t.index [ :key ], unique: true
    +    end
    +
    +    create_table :active_storage_attachments do |t|
    +      t.string     :name,     null: false
    +      t.references :record,   null: false, polymorphic: true, index: false
    +      t.references :blob,     null: false
    +
    +      t.datetime :created_at, null: false
    +
    +      t.index [ :record_type, :record_id, :name, :blob_id ], name: "index_active_storage_attachments_uniqueness", unique: true
    +      t.foreign_key :active_storage_blobs, column: :blob_id
    +    end
    +
    +    create_table :active_storage_variant_records do |t|
    +      t.belongs_to :blob, null: false, index: false
    +      t.string :variation_digest, null: false
    +
    +      t.index %i[ blob_id variation_digest ], name: "index_active_storage_variant_records_uniqueness", unique: true
    +      t.foreign_key :active_storage_blobs, column: :blob_id
    +    end
    +  end
    +end
    diff --git a/db/schema.rb b/db/schema.rb
    new file mode 100644
    index 0000000..dbb533c
    --- /dev/null
    +++ b/db/schema.rb
    @@ -0,0 +1,51 @@
    +# This file is auto-generated from the current state of the database. Instead
    +# of editing this file, please use the migrations feature of Active Record to
    +# incrementally modify your database, and then regenerate this schema definition.
    +#
    +# This file is the source Rails uses to define your schema when running `bin/rails
    +# db:schema:load`. When creating a new database, `bin/rails db:schema:load` tends to
    +# be faster and is potentially less error prone than running all of your
    +# migrations from scratch. Old migrations may fail to apply correctly if those
    +# migrations use external dependencies or application code.
    +#
    +# It's strongly recommended that you check this file into your version control system.
    +
    +ActiveRecord::Schema.define(version: 2021_08_04_080852) do
    +
    +  # These are extensions that must be enabled in order to support this database
    +  enable_extension "plpgsql"
    +
    +  create_table "active_storage_attachments", force: :cascade do |t|
    +    t.string "name", null: false
    +    t.string "record_type", null: false
    +    t.bigint "record_id", null: false
    +    t.bigint "blob_id", null: false
    +    t.datetime "created_at", null: false
    +    t.index ["blob_id"], name: "index_active_storage_attachments_on_blob_id"
    +    t.index ["record_type", "record_id", "name", "blob_id"], name: "index_active_storage_attachments_uniqueness", unique: true
    +  end
    +
    +  create_table "active_storage_blobs", force: :cascade do |t|
    +    t.string "key", null: false
    +    t.string "filename", null: false
    +    t.string "content_type"
    +    t.text "metadata"
    +    t.string "service_name", null: false
    +    t.bigint "byte_size", null: false
    +    t.string "checksum", null: false
    +    t.datetime "created_at", null: false
    +    t.index ["key"], name: "index_active_storage_blobs_on_key", unique: true
    +  end
    +
    +  create_table "active_storage_variant_records", force: :cascade do |t|
    +    t.bigint "blob_id", null: false
    +    t.string "variation_digest", null: false
    +    t.index ["blob_id", "variation_digest"], name: "index_active_storage_variant_records_uniqueness", unique: true
    +  end
    +
    +  create_table "users", force: :cascade do |t|
    +  end
    +
    +  add_foreign_key "active_storage_attachments", "active_storage_blobs", column: "blob_id"
    +  add_foreign_key "active_storage_variant_records", "active_storage_blobs", column: "blob_id"
    +end
  4. bundle install

  5. bin/rails active_storage:install

  6. bin/rails db:setup

Now try the following code in the bin/rails console:

User.create

Parallel.each(1..10) do
  ActiveRecord::Base.connection_pool.with_connection do
    User.first.files.attach(io: File.open("README.md"), filename: "README.md")
  end
end

User.first.files.count

Note that User.first.files.count may return a smaller number than 10.

Expected behavior

Each Object#attach appends the new file and does not touch existing attachments.

Actual behavior

Object#attach may purge files that are uploaded concurrently.

System configuration

Rails version: 6.1.4

Ruby version: 3.0.1

@smt116
Copy link
Author

smt116 commented Aug 4, 2021

The only workaround that I think would work is using the exclusive lock, e.g.:

Parallel.each(1..50) do
  ActiveRecord::Base.connection_pool.with_connection do
    ActiveRecord::Base.transaction do
      ActiveRecord::Base.connection.execute("LOCK #{ActiveStorage::Blob.table_name} IN ACCESS EXCLUSIVE MODE")
      User.first.files.attach(io: File.open("README.md"), filename: "README.md")
    end
  end
end

@smt116 smt116 changed the title ActiveStorage::Attached::Many#attach is not "thread safe" ActiveStorage::Attached::Many#attach may erase files uploaded concurrently Aug 4, 2021
@smt116
Copy link
Author

smt116 commented Aug 4, 2021

Is there any reason why https://github.com/rails/rails/blob/main/activestorage/lib/active_storage/attached/many.rb#L51-L52 replaces the collection instead of creating new objects for attachment(s)?

@smt116
Copy link
Author

smt116 commented Aug 4, 2021

Another workaround is to create an attachment manually:

blob = ActiveStorage::Blob.create_and_upload!(io: File.open("README.md"), filename: "README.md")
attachment = ActiveStorage::Attachment.create!(blob: blob, name: "files", record_type: "User", record_id: 1)

I did not check the consequences like error handling yet, but at least it does not require a table-wide exclusive lock.

@georgeclaghorn
Copy link
Contributor

Due to awkward implementation-sharing between ActiveStorage::Attached::Many#attach and setters. Agree attach should just append.

@intrip
Copy link
Contributor

intrip commented Aug 4, 2021

@smt116 I do confirm ActiveStorage::Attached::Many#attach doesn't support concurrent operations ATM.

@georgeclaghorn
Copy link
Contributor

We also need to consider how appends should stack on top of other changes—we can’t just keep one change per reflection anymore. If you call the setter followed by attach in a single transaction, we need to apply both the replacement and the append.

@smt116
Copy link
Author

smt116 commented Aug 4, 2021

I see. I will use #42941 (comment) as the workaround for now. Please let me know If you are aware of any bad consequences of such a solution or a better approach.

Also, I can try to help with fixing the underlying issue, but I would need some hints on where to start.

@rails-bot
Copy link

rails-bot bot commented Nov 7, 2021

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 6-1-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Nov 7, 2021
@smt116
Copy link
Author

smt116 commented Nov 8, 2021

@rails-bot still is still relevant.

@rails-bot rails-bot bot removed the stale label Nov 8, 2021
@3pns
Copy link

3pns commented Dec 20, 2021

I encountered the same concurrency issue by doing a basic multi file upload form through rails api mode. Sending 9 images at the same time results in missing images 95% of the time.

I tried setting the config below as per #35817 but it does not fix the problem.

config.active_storage.replace_on_assign_to_many = false 

@rails-bot
Copy link

rails-bot bot commented Mar 20, 2022

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Mar 20, 2022
@smt116
Copy link
Author

smt116 commented Mar 20, 2022

Still relevant.

@rails-bot rails-bot bot removed the stale label Mar 20, 2022
@rails-bot
Copy link

rails-bot bot commented Jun 28, 2022

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Jun 28, 2022
@smt116
Copy link
Author

smt116 commented Jun 28, 2022 via email

@rails-bot rails-bot bot removed the stale label Jun 28, 2022
@rails-bot
Copy link

rails-bot bot commented Sep 26, 2022

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Sep 26, 2022
@smt116
Copy link
Author

smt116 commented Sep 26, 2022

still relevant.

@rails-bot rails-bot bot removed the stale label Sep 26, 2022
@rails-bot
Copy link

rails-bot bot commented Dec 25, 2022

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Dec 25, 2022
@smt116
Copy link
Author

smt116 commented Dec 26, 2022 via email

@rails-bot rails-bot bot removed the stale label Dec 26, 2022
@rails-bot
Copy link

rails-bot bot commented Mar 26, 2023

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@rails-bot rails-bot bot added the stale label Mar 26, 2023
@skipkayhil skipkayhil removed the stale label Mar 26, 2023
@rails-bot
Copy link

rails-bot bot commented Jun 24, 2023

This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.

@adamokasha
Copy link

Still relevant

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants