Changes working towards quantifying repos by their metrics to not overwhelm collection #2227

Merged
merged 18 commits into from Mar 9, 2023

Conversation

IsaacMilarky
Contributor

@IsaacMilarky IsaacMilarky commented Mar 9, 2023

Description

  • Work towards scheduling a predefined collection load based on repo size rather than on the raw number of repos
  • Add functions that measure how large a repo is in terms of its pull requests and issues
  • Split start_tasks.py into two to make it less overwhelming
  • Split facade collection into two hooks: one to clone/update and one to collect data. This allows metrics similar to the PR/issue metrics to be added in the future to measure how large a repo is in terms of its commits
  • Fix issues with logic when handling dependency errors
  • Add a new facade status, 'Initializing', used while a repo is being cloned and updated
  • Change celery settings to work with newly split start_tasks.py
  • Fix issue with sqlalchemy where 'one' was used instead of 'first'.
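
The idea of scheduling by repo size rather than repo count could be sketched roughly as follows. This is only an illustration under stated assumptions: `RepoStats`, `pick_batch`, `max_load`, the example URLs, and the "issues plus pull requests" weighting are all hypothetical and are not taken from the Augur code in this PR.

```python
from dataclasses import dataclass


@dataclass
class RepoStats:
    # Hypothetical container for illustration; not Augur's actual model.
    repo_git: str
    issue_count: int
    pr_count: int

    @property
    def weight(self) -> int:
        # Assumed weighting: one unit per issue plus one unit per pull request.
        return self.issue_count + self.pr_count


def pick_batch(repos, max_load):
    """Greedily pick the lightest repos whose combined weight fits max_load."""
    batch, load = [], 0
    for repo in sorted(repos, key=lambda r: r.weight):
        if load + repo.weight > max_load:
            # Repos are sorted by weight, so nothing heavier will fit either.
            break
        batch.append(repo)
        load += repo.weight
    return batch


repos = [
    RepoStats("https://github.com/example/large", 900, 600),
    RepoStats("https://github.com/example/small", 10, 5),
    RepoStats("https://github.com/example/medium", 120, 80),
]
batch = pick_batch(repos, max_load=500)
print([r.repo_git for r in batch])
```

With a budget like this, one very large repo no longer counts the same as one small repo, which is the behavior the description is working towards.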

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
… moving towards scheduling repos based on commit count which can't be done until the repos have been cloned

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@IsaacMilarky IsaacMilarky added the add-feature Adds new features label Mar 9, 2023
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
sgoggins
sgoggins previously approved these changes Mar 9, 2023
Member

@sgoggins sgoggins left a comment

I approve as far as I can tell, but some of the changes are significant enough that I want @ABrain7710 to review before merging.

@@ -457,43 +496,13 @@ def facade_phase(repo_git):

# Figure out what we need to do
limited_run = session.limited_run
delete_marked_repos = session.delete_marked_repos
Member

I think we are getting rid of many of the different, confusing settings, yes?

Contributor Author

Yes, some have been removed, but not all. delete_marked_repos in particular should probably be removed because we never remove repos.


except NoResultFound as e:
session.logger.debug(f"Failed local login lookup with error: {e}")
if not contributors_with_matching_name:
Member

Will taking this out of a try/except cause the collection to fail when a person is not identified?

Contributor

No, because it uses first, which returns None if no row exists, whereas one expects a row to exist and raises otherwise.

Contributor Author

No, it will not.
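
The difference between first and one discussed in this thread can be illustrated with a small pure-Python stand-in. The helpers below only mimic the documented behavior of SQLAlchemy's `Query.first()` and `Query.one()` on a list of rows. They are not SQLAlchemy itself, and `lookup_contributor` and the sample row are hypothetical names used for illustration.

```python
class NoResultFound(Exception):
    """Stand-in for sqlalchemy.exc.NoResultFound."""


class MultipleResultsFound(Exception):
    """Stand-in for sqlalchemy.exc.MultipleResultsFound."""


def first(rows):
    # Mimics Query.first(): the first row, or None when there are no rows.
    return rows[0] if rows else None


def one(rows):
    # Mimics Query.one(): exactly one row is required, otherwise it raises.
    if not rows:
        raise NoResultFound("No row was found when one was required")
    if len(rows) > 1:
        raise MultipleResultsFound("Multiple rows were found when one was required")
    return rows[0]


def lookup_contributor(rows):
    # Hypothetical lookup: with first(), a missing contributor is an ordinary
    # None check, so no try/except is needed around the query.
    contributor = first(rows)
    if not contributor:
        return "no match"
    return contributor["login"]


print(lookup_contributor([]))
print(lookup_contributor([{"login": "octocat"}]))
```

With `one`, the empty case would have to be caught as an exception, which is why the original code needed the `except NoResultFound` branch.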

@@ -27,161 +27,17 @@
from augur.tasks.db.refresh_materialized_views import *
# from augur.tasks.data_analysis import *
Member

I definitely want @ABrain7710 to review this file

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@ABrain7710 ABrain7710 merged commit 1ea59ce into dev Mar 9, 2023
@IsaacMilarky IsaacMilarky deleted the weighted-user-collection-metrics branch March 9, 2023 21:51
Labels
add-feature Adds new features

3 participants