Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark 404'd Repos to be Ignored and Re-collect Errored Repos #2678

Merged
merged 15 commits into from Feb 13, 2024

Conversation

IsaacMilarky
Copy link
Contributor

@IsaacMilarky IsaacMilarky commented Feb 8, 2024

Description

  • Add a new CollectionState enum value called 'IGNORE'. This marks the repos to be ignored.
  • Fix some import issues where CollectionState was being defined multiple times. I have moved this class to its own file in an attempt to make sure that we only define enums once.
  • Add periodic task to set errored repos to pending daily

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
…ndby

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit

pylint

augur/tasks/util/collection_state.py|1| C0114: Missing module docstring (missing-module-docstring)
augur/tasks/util/collection_state.py|3| C0115: Missing class docstring (missing-class-docstring)
augur/tasks/util/collection_util.py|30| C0116: Missing function or method docstring (missing-function-docstring)

augur/tasks/git/facade_tasks.py Show resolved Hide resolved
augur/tasks/github/detect_move/core.py Outdated Show resolved Hide resolved
augur/tasks/init/celery_app.py Show resolved Hide resolved
augur/tasks/init/celery_app.py Show resolved Hide resolved
augur/tasks/init/celery_app.py Show resolved Hide resolved
@@ -36,6 +36,7 @@
from enum import Enum
from augur.tasks.util.redis_list import RedisList
from augur.application.db.models import CollectionStatus, Repo
from augur.tasks.util.collection_state import CollectionState
from augur.tasks.util.collection_util import *
from augur.tasks.git.util.facade_worker.facade_worker.utilitymethods import get_facade_weight_time_factor
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0413: Import "from augur.tasks.git.util.facade_worker.facade_worker.utilitymethods import get_facade_weight_time_factor" should be placed at the top of the module (wrong-import-position)

@@ -328,6 +329,22 @@ def augur_collection_update_weights():
session.commit()
#git_update_commit_count_weight(repo_git)

@celery.task
def retry_404_repos():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0116: Missing function or method docstring (missing-function-docstring)

augur/tasks/start_tasks.py Show resolved Hide resolved
session.execute_sql(query)



#Retry this task for every issue so that repos that were added manually get the chance to be added to the collection_status table.
@celery.task(autoretry_for=(Exception,), retry_backoff=True, retry_backoff_max=300, retry_jitter=True, max_retries=None)
def create_collection_status_records():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0116: Missing function or method docstring (missing-function-docstring)

@@ -36,6 +36,7 @@
from enum import Enum
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0412: Imports from package enum are not grouped (ungrouped-imports)

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
augur/tasks/github/detect_move/core.py Outdated Show resolved Hide resolved
@@ -0,0 +1,12 @@

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0114: Missing module docstring (missing-module-docstring)

augur/tasks/util/collection_state.py Show resolved Hide resolved
INITIALIZING = "Initializing"
UPDATE = "Update"
FAILED_CLONE = "Failed Clone"

def get_list_of_all_users(session):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0116: Missing function or method docstring (missing-function-docstring)

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
@@ -209,7 +201,7 @@ def setup_periodic_tasks(sender, **kwargs):
"""
from celery.schedules import crontab
from augur.tasks.start_tasks import augur_collection_monitor, augur_collection_update_weights
from augur.tasks.start_tasks import non_repo_domain_tasks
from augur.tasks.start_tasks import non_repo_domain_tasks, retry_errored_repos
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0415: Import outside toplevel (augur.tasks.start_tasks.non_repo_domain_tasks, augur.tasks.start_tasks.retry_errored_repos) (import-outside-toplevel)

@@ -328,6 +329,22 @@ def augur_collection_update_weights():
session.commit()
#git_update_commit_count_weight(repo_git)

@celery.task
def retry_errored_repos():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0116: Missing function or method docstring (missing-function-docstring)

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
@IsaacMilarky IsaacMilarky changed the title Allow 404 Repos to Automatically be Re-collected Mark 404 Errors to be Ignored and Re-collect Errored Repos Feb 8, 2024
@IsaacMilarky IsaacMilarky changed the title Mark 404 Errors to be Ignored and Re-collect Errored Repos Mark 404'd Repos to be Ignored and Re-collect Errored Repos Feb 8, 2024
Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
from augur.tasks.init.celery_app import engine
logger = logging.getLogger(create_collection_status_records.__name__)

#TODO: Isaac needs to normalize the status's to be abstract in the
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0511: TODO: Isaac needs to normalize the status's to be abstract in the (fixme)

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
@@ -33,9 +33,9 @@
from augur.tasks.init.celery_app import celery_app as celery
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0413: Import "from augur.tasks.init.celery_app import celery_app as celery" should be placed at the top of the module (wrong-import-position)

@@ -33,9 +33,9 @@
from augur.tasks.init.celery_app import celery_app as celery
from augur.application.db.session import DatabaseSession
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0413: Import "from augur.application.db.session import DatabaseSession" should be placed at the top of the module (wrong-import-position)

@@ -33,9 +33,9 @@
from augur.tasks.init.celery_app import celery_app as celery
from augur.application.db.session import DatabaseSession
from logging import Logger
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0413: Import "from logging import Logger" should be placed at the top of the module (wrong-import-position)


A special celery task that automatically retries itself and has no max retries.
"""

from augur.tasks.init.celery_app import engine
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
C0415: Import outside toplevel (augur.tasks.init.celery_app.engine) (import-outside-toplevel)

Signed-off-by: Isaac Milarsky <isaac.milarsky@hhs.cms.gov>
@IsaacMilarky IsaacMilarky marked this pull request as ready for review February 9, 2024 18:40
@IsaacMilarky IsaacMilarky added the add-feature Adds new features label Feb 9, 2024
Copy link
Contributor

@ABrain7710 ABrain7710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic in ping_github_for_repo_move doesn't appear to be quite right. It appears to not be setting the repos that return a 404 (deleted repos) to a status of IGNORE. It throwing an excpetion for 404, succeeding if it the status wasn't a 301, and setting the repos that are 301 (moved) to IGNORE. We need to be setting 404s to IGNORE and erroring repos that are 301 so they can be retried

@sgoggins
Copy link
Member

@IsaacMilarky : Is the question @ABrain7710 raised addressed?

The logic in ping_github_for_repo_move doesn't appear to be quite right. It appears to not be setting the repos that return a 404 (deleted repos) to a status of IGNORE. It throwing an excpetion for 404, succeeding if it the status wasn't a 301, and setting the repos that are 301 (moved) to IGNORE. We need to be setting 404s to IGNORE and erroring repos that are 301 so they can be retried

Copy link
Member

@sgoggins sgoggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, waiting to see if @ABrain7710 's comment is addressed before merging.

@ABrain7710
Copy link
Contributor

@sgoggins No the requested changes have not been addressed. I messaged Isaac and he agreed the logic is off

Signed-off-by: Isaac Milarsky <krabs@tilde.team>
@@ -27,7 +27,7 @@ def get_command(self, ctx, name):
try:
module = importlib.import_module('.' + name, 'augur.application.cli')
return module.cli
except ModuleNotFoundError:
except ModuleNotFoundError as e:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] reported by reviewdog 🐶
W0612: Unused variable 'e' (unused-variable)

Copy link
Contributor

@ABrain7710 ABrain7710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we need to be updating all the collection statuses to IGNORE so that we don't collect facade or ml on deleted repos

Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
@ABrain7710 ABrain7710 self-requested a review February 13, 2024 00:34
@ABrain7710 ABrain7710 merged commit 69788d1 into dev Feb 13, 2024
7 of 8 checks passed
@sgoggins sgoggins deleted the allow-repo-retry-collection-isaac branch February 20, 2024 00:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
add-feature Adds new features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants