Make PurgeOrphanConversations migration safer #5572

jaywink · 2015-01-17T20:00:44Z

This migration crashed when I updated iliketoast.net

In order to make 0.5 upgrades for the 100+ pods safer, I'd propose adding some safety by using rescue to catch deletion problems. In my situation, it was one conversation that failed with:

PG::ForeignKeyViolation: ERROR:  update or delete on table "conversations" violates foreign key constraint "messages_conversation_id_fkey" on table "messages"
DETAIL:  Key (id)=(36) is still referenced from table "messages".
: DELETE FROM "conversations" WHERE "conversations"."id" IN (SELECT "conversations"."id" FROM "conversations" LEFT JOIN conversation_visibilities ON conversation_visibilities.conversation_id = conversations.id GROUP BY conversations.id HAVING COUNT(conversation_visibilities.id) = 0)/home/diaspora/.rvm/gems/ruby-2.1.5/gems/activerecord-4.1.8/lib/active_record/connection_adapters/postgresql_adapter.rb:822:in `async_exec'

jhass · 2015-01-17T20:03:11Z

Don't we want to delete the dependent messages if nobody can see them anymore?

jaywink · 2015-01-17T20:14:04Z

Maybe, but tbh the most important thing is to minimize problems with pods upgrading. IMHO having a few message objects on a few pods left around is a very minor problem :)

jhass · 2015-01-17T20:49:54Z

I'm not sure I can agree when the entire point of the migration is to cleanup data, by that logic I'd rather drop it completely.

jaywink · 2015-01-17T21:23:39Z

Fair point if we were living in a "all or nothing" world. I do find it hard to accept though that it would be better to leave everything than to try and clean as much as possible.

To me:
A pod upgrade fails due to some bad data == bad
No data is cleaned == not very good
Some few rows remain due to rescue here == irrelevant

IMHO it's overkill to start optimising this to delete a message in the event of an exception (which exception?). And that would also have to be rescued due to the ddl transaction disable call. And what do we do then to keep things perfect?

I'll be happy either way, if this is merged or not, already migrated my pod. Just thought this safeguard might help some pods get to 0.5 :)

jhass · 2015-01-17T21:38:49Z

Can't we just pass the dependent option to delete_all?

jaywink · 2015-01-18T14:26:11Z

I'm not familiar enough with the RoR ORM to say whether that is a safe thing to do here. Also, I'm not able to test it, since I've migrated my pod past this. If you think that is a safe thing to set here without testing then sure, please can you either give the line and I can edit or do the change - but personally I don't like committing (data destructive) code I cannot verify myself.

Besides, we don't know whether any pods at all will have the kind of data my pod had. Or maybe many do. Leave it up to you :)

jhass · 2015-01-18T14:50:09Z

Yeah right, this definitely should be verified. Maybe @margori can help us out here :)

margori · 2015-01-18T16:59:36Z

Count on me.

I see that this issue is about Postgre. I've tested only over MySQL.

margori · 2015-01-18T19:27:27Z

First, I wonder why did migration fail. I didn't expect it to fail by foreign key.

@jaywink , Is 'messages_conversation_id_fkey' constrain set to cascade deleting?

Reading db schema, I found that that key should be named 'messages_conversation_id_fk' not 'messages_conversation_id_fkey'. Do they both constrains exists in your db?

In your opinion, how can we handle this kind of db mismatch?

jaywink · 2015-01-24T21:07:39Z

                                     Table "public.conversations"
   Column   |            Type             |                         Modifiers                          
------------+-----------------------------+------------------------------------------------------------
 id         | integer                     | not null default nextval('conversations_id_seq'::regclass)
 subject    | character varying(255)      | 
 guid       | character varying(255)      | not null
 author_id  | integer                     | not null
 created_at | timestamp without time zone | not null
 updated_at | timestamp without time zone | not null
Indexes:
    "conversations_id_pkey" PRIMARY KEY, btree (id)
    "conversations_author_id" btree (author_id)
Foreign-key constraints:
    "conversations_author_id_fkey" FOREIGN KEY (author_id) REFERENCES people(id)
Referenced by:
    TABLE "conversation_visibilities" CONSTRAINT "conversation_visibilities_conversation_id_fkey" FOREIGN KEY (conversation_id) REFERENCES conversations(id)
    TABLE "messages" CONSTRAINT "messages_conversation_id_fkey" FOREIGN KEY (conversation_id) REFERENCES conversations(id)

                                            Table "public.messages"
         Column          |            Type             |                       Modifiers                       
-------------------------+-----------------------------+-------------------------------------------------------
 id                      | integer                     | not null default nextval('messages_id_seq'::regclass)
 conversation_id         | integer                     | not null
 author_id               | integer                     | not null
 guid                    | character varying(255)      | not null
 text                    | text                        | not null
 created_at              | timestamp without time zone | not null
 updated_at              | timestamp without time zone | not null
 author_signature        | text                        | 
 parent_author_signature | text                        | 
Indexes:
    "messages_id_pkey" PRIMARY KEY, btree (id)
    "messages_author_id" btree (author_id)
    "messages_conversation_id" btree (conversation_id)
Foreign-key constraints:
    "messages_author_id_fkey" FOREIGN KEY (author_id) REFERENCES people(id)
    "messages_conversation_id_fkey" FOREIGN KEY (conversation_id) REFERENCES conversations(id)

Well, I remembered I still have the conversation that killed the upgrade, initially somehow thought I didn't have access to it, but of course it is still there. It has two entries in messages, referencing the entry in conversations. There are no rows in conversation_visibilities for this conversation.

So maybe I will do try the delete all to see if that kills the messages, as Jonne suggested.

jaywink · 2015-01-24T21:31:35Z

Meh, I can't try this out on my pod now, but will report later. To test, added a new migration which would obviously in the final pull replace the original. AFAICT the :dependent needs to be in the conversations model. Rails models are kinda hard to understand still for me 😓 😩

Oh also figured it needs to be destroy_all not delete_all.

margori · 2015-01-25T12:59:48Z

As I expected, table 'message' in your db is set to cascade deleting.

To have a smoother migration and still showing podmin wich conversation must be deleted, I offer this code:

class PurgeOrphanConversations < ActiveRecord::Migration
  disable_ddl_transaction!

  def get_orphan_ids
    Conversation.joins("LEFT JOIN conversation_visibilities ON conversation_visibilities.conversation_id = conversations.id").group('conversations.id').having("COUNT(conversation_visibilities.id) = 0").pluck(:id).join(',')
  end

  def up
    orphan_ids = get_orphan_ids
    begin
      Message.where("conversation_id in (#{orphan_ids})").delete_all
      Conversation.where("id in (#{orphan_ids})").delete_all
    rescue
    end

    orphan_ids = get_orphan_ids
    error_message = "PurgeOrphanConversations: Failed to delete orphan conversations #{orphan_ids}"
    #output error_message in red if required
    puts "\033[31m#{error_message}\033[0m" if orphan_ids != ""
  end

  def down
  end
end

I tested in MySQL and PostgreSQL, with or without cascade deleting.

What do you think?

margori · 2015-02-02T19:30:43Z

@jaywink By the way, I believe that last participant in a conversation will receive an error message when deleting conversation in iliketoast.net pod. Can you check that?

jaywink · 2015-03-23T20:02:01Z

I guess this is not needed, or really have no idea if it is needed. It helped me, but I think on IRC heard that the migrations should be safer now :)

jhass · 2015-03-23T20:09:42Z

@jaywink I did not notice so about this one. I think we got the utf8mb4 fairly stable now though. Could you clarify?

jaywink · 2015-03-23T20:13:42Z

@jhass clarify what? I think you were the one who said the migrations should be more stable so I don't think this needs to be merged in :) I just forgot to close it, just saw it in the 0.5 milestone..

jhass · 2015-03-23T20:23:25Z

Weird, I can't remember saying that about this one. The utf8mb4 one, sure.

jaywink · 2015-03-23T20:25:25Z

Maybe it was more generic towards "migrations". Anyway, I'm sure we will
see problems during wider testing of 0.5.

Regarding 0.5, could we please cut the RC soon? See Loomio discussion :)

On 23.03.2015 22:23, Jonne Haß wrote:

Weird, I can't remember saying that about this one. The utf8mb4 one, sure.

—
Reply to this email directly or view it on GitHub
#5572 (comment).

Br,
Jason Robinson
https://jasonrobinson.me

Make PurgeOrphanConversations migration safer

a6dee8f

jaywink force-pushed the safer-migration branch from fa7d257 to a6dee8f Compare January 24, 2015 21:27

jhass added this to the next-major milestone Feb 27, 2015

jaywink closed this Mar 23, 2015

jaywink removed this from the next-major milestone Mar 23, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make PurgeOrphanConversations migration safer #5572

Make PurgeOrphanConversations migration safer #5572

jaywink commented Jan 17, 2015

jhass commented Jan 17, 2015

jaywink commented Jan 17, 2015

jhass commented Jan 17, 2015

jaywink commented Jan 17, 2015

jhass commented Jan 17, 2015

jaywink commented Jan 18, 2015

jhass commented Jan 18, 2015

margori commented Jan 18, 2015

margori commented Jan 18, 2015

jaywink commented Jan 24, 2015

jaywink commented Jan 24, 2015

margori commented Jan 25, 2015

margori commented Feb 2, 2015

jaywink commented Mar 23, 2015

jhass commented Mar 23, 2015

jaywink commented Mar 23, 2015

jhass commented Mar 23, 2015

jaywink commented Mar 23, 2015