Automatically clean up after failed indexing runs #402

aldenstpage · 2020-01-13T22:32:53Z

When an indexing job fails (such as if a node in our Elasticsearch cluster has a full disk, or a bug in indexer-worker halts the process), the incomplete index is left inside of the Elasticsearch cluster, requiring someone to manually delete it. The indexer should detect this condition when the job starts and handle it.

The production index is determined by the image alias. The indexer should delete any index that follows the naming scheme image-<uuid> and is NOT pointed to by this alias.
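A minimal sketch of the selection rule described above, in Python. It is not code from the repository: the function names `is_uuid` and `find_stale_indices` are hypothetical, and the assumption that the `<uuid>` suffix parses with Python's `uuid.UUID` (hyphenated or plain hex) is mine. The function only decides which index names are stale, so it can be checked without a running cluster.

```python
import uuid


def is_uuid(text):
    """Return True if text parses as a UUID (hyphenated or plain hex)."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


def find_stale_indices(all_indices, live_indices, alias="image"):
    """Pick indices named <alias>-<uuid> that the alias does not point to.

    all_indices:  every index name in the cluster
    live_indices: index names the alias currently points to
    """
    prefix = alias + "-"
    live = set(live_indices)
    stale = []
    for name in all_indices:
        if name in live:
            # Never touch the production index.
            continue
        if name.startswith(prefix) and is_uuid(name[len(prefix):]):
            stale.append(name)
    return sorted(stale)
```

With the official Elasticsearch Python client, the two inputs could plausibly come from `es.indices.get(index="image-*")` and `es.indices.get_alias(name="image")`, and each returned name could be passed to `es.indices.delete(index=name)`. The exact call shapes depend on the client version, so treat that wiring as an assumption.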

hedonhermdev · 2020-02-22T20:40:09Z

Can I work on this issue?

CodeMonk263 · 2020-02-23T07:11:51Z

Can I work on this issue?

kgodey · 2020-02-24T21:07:23Z

@hedonhermdev go ahead. @CodeMonk263 please find another issue to work on since @hedonhermdev commented first.

DantrazTrev · 2020-02-29T14:37:07Z

@hedonhermdev are you still working on this issue?

hedonhermdev · 2020-02-29T14:37:49Z

No.

DantrazTrev · 2020-02-29T14:40:59Z

Can I take it over? @aldenstpage

kgodey · 2020-03-03T15:56:59Z

Go ahead @DantrazTrev

tushar912 · 2020-10-02T11:31:25Z

@DantrazTrev are you still working on this?

kgodey · 2020-10-02T15:44:13Z

@tushar912 it's been a few months since @DantrazTrev's post, so I think you can go ahead and work on this.

tushar912 · 2020-10-02T16:18:16Z

Ok

tushar912 · 2020-10-06T08:01:57Z

Here is how I understand this issue. The main indexing job is done by indexer.py in ingestion_server. The TableIndexer class has a method, _index_table, which checks whether the database is in sync with the index and replicates if it is not. There are two ways of indexing: reindex, which creates a new index and points the live alias at it, and update, which updates the existing index. Currently, if the new index is not created successfully during reindex, it still persists in the cluster, so the job is to delete the index when indexing fails. @kgodey or @aldenstpage, please tell me if I have understood this correctly.
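One way to picture the reindex behaviour described above, as a sketch only. The function `reindex_with_cleanup` and its four callables (`create_index`, `populate`, `go_live`, `delete_index`) are hypothetical stand-ins, not functions from indexer.py, and the `image-<uuid>` name is generated here with `uuid.uuid4()` as an assumption.

```python
import uuid


def reindex_with_cleanup(create_index, populate, go_live, delete_index,
                         alias="image"):
    """Build a fresh index and delete it again if populating it fails.

    All four callables are hypothetical hooks supplied by the caller:
      create_index(name), populate(name), go_live(name, alias),
      delete_index(name)
    """
    new_index = f"{alias}-{uuid.uuid4().hex}"
    create_index(new_index)
    try:
        populate(new_index)
    except Exception:
        # Remove the partial index so it does not linger in the cluster,
        # then re-raise so the failure is still reported.
        delete_index(new_index)
        raise
    # Only a fully populated index is promoted to the live alias.
    go_live(new_index, alias)
    return new_index
```

This covers failures that surface as exceptions. A process that is killed outright would never reach the `except` branch, which is why the issue also asks for a check when the job starts.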

tushar912 · 2020-10-06T08:14:31Z

I am also thinking of modifying the existing consistency_check method and calling it from reindex, so that the index is deleted if it was not indexed properly. Am I on the right track?

aldenstpage added the help wanted Open to participation from the community label Jan 13, 2020

aldenstpage self-assigned this Jan 13, 2020

aldenstpage added the enhancement label Jan 13, 2020

kgodey added this to To Be Prioritized in Backlog via automation Jan 13, 2020

kgodey added the in progress label Feb 24, 2020

annatuma added this to Ready for Development in Active Sprint via automation Feb 28, 2020

annatuma removed this from To Be Prioritized in Backlog Feb 28, 2020

annatuma moved this from Ready for Development to In Progress (Community) in Active Sprint Feb 28, 2020

kgodey assigned DantrazTrev Mar 3, 2020

aldenstpage removed the in progress label Apr 30, 2020

aldenstpage removed this from In Progress (Community) in Active Sprint Apr 30, 2020

kgodey added ✨ goal: improvement Improvement to an existing feature and removed enhancement labels Sep 24, 2020

dhruvkb added the Hacktoberfest Ideal for Hacktoberfest participation label Sep 25, 2020

kgodey unassigned DantrazTrev Oct 2, 2020

tushar912 mentioned this issue Oct 7, 2020

Automatically clean up after failed indexing runs #636

Closed

7 tasks

cc-open-source-bot added the 🏷 status: label work required Needs proper labelling before it can be worked on label Dec 2, 2020

kgodey added this to [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey removed this from [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey added this to Pending Review in Backlog Dec 2, 2020

kgodey added this to [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey removed this from [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey added this to [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey removed this from [TEMPORARY] Deprioritize in Active Sprint Dec 2, 2020

kgodey added the 🙅 status: discontinued Not suitable for work as repo is in maintenance label Dec 16, 2020

kgodey closed this as completed Dec 16, 2020

kgodey moved this from Pending Review to Done in Backlog Dec 16, 2020

obulat mentioned this issue Apr 17, 2023

Automatically clean up after failed indexing runs (original #402) WordPress/openverse#1756

Open
