
fix/avoid locks being unrecoverable #255

Merged
joaorafaelalmeida merged 5 commits into dev from fix/prevent_stuck_locks on Jan 5, 2023

@aspedrosa (Contributor) commented Jun 30, 2022

If a deployment upgrade was performed while some Celery tasks were still executing, some Redis locks were left held and could not be recovered.
This PR:

  1. Creates a separate Redis db for locks.
  2. When the dashboard worker starts, it:
    1. clears all locks;
    2. restarts the tasks for all pending uploads that might have been interrupted. Because those tasks run their database operations inside a transaction, any previous progress was discarded, so the upload process can simply be restarted.
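
The startup behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `recover_after_restart`, `LockStore`, `pending_upload_ids` and `enqueue_upload` are hypothetical names, and the lock store is assumed to be any object with a `flushdb()` method (for example a redis-py client bound to the dedicated lock db).

```python
from typing import Callable, Iterable, Protocol


class LockStore(Protocol):
    """Hypothetical interface: anything exposing flushdb(),
    e.g. a redis-py client connected to the dedicated lock db."""

    def flushdb(self) -> object: ...


def recover_after_restart(
    lock_store: LockStore,
    pending_upload_ids: Iterable[int],
    enqueue_upload: Callable[[int], None],
) -> list[int]:
    """Clear stale locks, then re-enqueue every interrupted upload."""
    # Flushing the whole db is only safe because it is assumed to hold
    # nothing but locks (the reason for giving locks their own db).
    lock_store.flushdb()

    restarted = []
    for upload_id in pending_upload_ids:
        # The interrupted attempt ran inside a transaction, so its partial
        # work was discarded and the upload can start again from scratch.
        enqueue_upload(upload_id)
        restarted.append(upload_id)
    return restarted
```

In a Celery deployment, a function like this would typically be called from a handler connected to the `celery.signals.worker_ready` signal; whether this PR wires it that way is an assumption.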

A possible further improvement: if several uploads to the same database are pending, resume only the most recent one and cancel the others.
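
A rough sketch of that selection rule, assuming each pending upload carries a database identifier and an upload timestamp. `PendingUpload`, its fields, and `split_resume_and_cancel` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class PendingUpload:
    # Hypothetical shape of a pending upload record.
    upload_id: int
    database_id: int
    uploaded_at: datetime


def split_resume_and_cancel(
    uploads: Iterable[PendingUpload],
) -> tuple[list[PendingUpload], list[PendingUpload]]:
    """Return (resume, cancel): the newest upload per database is resumed,
    every older pending upload to the same database is cancelled."""
    uploads = list(uploads)  # allow a second pass over generators

    latest: dict[int, PendingUpload] = {}
    for upload in uploads:
        current = latest.get(upload.database_id)
        if current is None or upload.uploaded_at > current.uploaded_at:
            latest[upload.database_id] = upload

    resume_ids = {u.upload_id for u in latest.values()}
    resume = [u for u in uploads if u.upload_id in resume_ids]
    cancel = [u for u in uploads if u.upload_id not in resume_ids]
    return resume, cancel
```

If two uploads to the same database share a timestamp, this sketch keeps the first one it sees; a real implementation would need an explicit tie-breaking rule.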

@joaorafaelalmeida (Member) commented

Until this PR is approved, the issue can be solved manually as follows:

  1. Restart the worker: `docker-compose restart dashboard_worker`
  2. Check with `docker-compose logs dashboard_worker` whether any task restarted.
  3. Access the Redis container.
  4. Open the Redis database: `redis-cli -n 0`
  5. Execute `keys *`. It should display something like "lock: lock...", "workers_updating..."
  6. Finally, execute `flushdb`.
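
Steps 4 to 6 could also be scripted. The sketch below is an assumption-laden illustration rather than part of the workaround: `inspect_and_flush` is a hypothetical helper, and the client is assumed to offer `scan_iter()` and `flushdb()` as redis-py clients do. It defaults to a dry run because `flushdb` removes every key in the selected database, not only the locks.

```python
from typing import Iterable, Protocol


class RedisLike(Protocol):
    """Hypothetical minimal interface, matching the redis-py methods used."""

    def scan_iter(self, match: str = ...) -> Iterable[object]: ...

    def flushdb(self) -> object: ...


def inspect_and_flush(client: RedisLike, dry_run: bool = True) -> list[str]:
    """List every key in the selected db and, unless dry_run, flush it."""
    keys = sorted(
        key.decode() if isinstance(key, bytes) else str(key)
        for key in client.scan_iter(match="*")
    )
    if not dry_run:
        # Removes ALL keys in this db, so review the listed keys first.
        client.flushdb()
    return keys


# Example wiring (assumed host name; requires the redis package):
#   import redis
#   client = redis.Redis(host="redis", port=6379, db=0)
#   print(inspect_and_flush(client))            # inspect only
#   inspect_and_flush(client, dry_run=False)    # inspect, then flush
```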

@joselfrias (Contributor) left a comment

Reviewed.

@joaorafaelalmeida joaorafaelalmeida merged commit 3a998ca into dev Jan 5, 2023
@joaorafaelalmeida joaorafaelalmeida deleted the fix/prevent_stuck_locks branch January 5, 2023 17:50
joaorafaelalmeida added a commit that referenced this pull request Jan 9, 2023