Add weight calculations for repo scheduling in repo collection #2316

Merged
IsaacMilarky merged 51 commits into dev from collection-hooks-with-weight on Apr 18, 2023

Conversation

IsaacMilarky
Contributor

@IsaacMilarky IsaacMilarky commented Apr 12, 2023

Description

  • Add fields to the collection_status table that estimate a repo's general size: the commit count for facade_weight, and the total pull requests plus total issues for core_weight. The combined pull request and issue count and the commit count are each kept in their own column, while the weight is derived from those counts and the age of the repo.
  • Add methods to calculate a repo's core_weight from the repo's URL and its facade_weight from the repo's facade directory.
  • Add a RedisScalar class to define raw scalar values in Redis in a more uniform way.
  • Add a time_factor to the weight calculation so that the longer a repo has been in the database without being collected, the lower that repo's weight becomes.
  • Note that repos that have already been collected use a separate curve for their time_factor. This curve raises the repo's weight until 30 days have passed since its last collection and lowers the weight from that point on.
  • The time_factor for new repos is x^4, where x is the number of days since the repo was added.
  • The time_factor for old repos is (x - 30)^4 when x >= 30 and -(x - 30)^4 otherwise.
  • Adjust augur stop/kill command to get the names of all processes and store them before killing them
  • Change ORM insertion into the collection_status table so that a repo's core_weight is calculated automatically when the repo is added.
  • Change core collection so that collect_issues and collect_pull_requests return the number of issues and pull requests, respectively. This saves API calls: a Celery chord passes these return values into core_task_update_weight_util, which updates the weight for that repo.
  • Add a periodic task to update repo weights at midnight on even-numbered days.
  • Add an augur db command, 'reset-repo-age', that sets the repo_added field to the current timestamp. This makes repo ages uniform after migrating repos from old databases, so that repos are not represented as much older than they actually are.
  • Add a call to 'reset-repo-age' to the install script so that the repo age is reset whenever the instance is installed.
  • Remove the Alembic script that automatically adds the repos to collection_status without calculating weight.
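
The two time_factor curves described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, x for old repos is assumed to be the number of days since the last collection, and subtracting the time_factor from the raw count is an assumption chosen only to match the statement that a growing time_factor lowers the weight.

```python
def time_factor_new(days_since_added: int) -> int:
    """x^4 curve for repos that have not been collected yet (hypothetical name)."""
    return days_since_added ** 4


def time_factor_collected(days_since_last_collection: int) -> int:
    """(x - 30)^4 when x >= 30, -(x - 30)^4 otherwise (hypothetical name).

    Assumption: x is measured in days since the last collection.
    """
    magnitude = (days_since_last_collection - 30) ** 4
    return magnitude if days_since_last_collection >= 30 else -magnitude


def adjusted_weight(raw_count: int, time_factor: int) -> int:
    """Combine a raw size estimate with a time factor.

    Assumption: the time factor is subtracted, so a larger time factor
    yields a lower weight. The actual combination in the PR may differ.
    """
    return raw_count - time_factor
```

Under these assumptions, a new repo with a raw count of 100 that was added 3 days ago gets `adjusted_weight(100, time_factor_new(3))`, which is 100 - 81 = 19. A repo collected 29 days ago has a time_factor of -1, so its weight sits slightly above its raw count, and one collected 31 days ago has a time_factor of 1, so its weight sits slightly below.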

Signed commits

  • Yes, I signed my commits.

IsaacMilarky and others added 30 commits March 17, 2023 18:13
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
…d to a new database

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
…ered days

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Contributor

@ABrain7710 ABrain7710 left a comment

Small issue

augur/tasks/start_tasks.py (resolved)
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@sgoggins sgoggins added feature-request Request for a new feature in Augur server Related to the Augur server workers Related to data workers database Related to Augur's unifed data model release Related to releasing a new version of Augur add-feature Adds new features labels Apr 15, 2023
@sgoggins sgoggins marked this pull request as draft April 15, 2023 16:32
@sgoggins
Member

I converted the PR to a draft just to make sure I don't merge it before intended.

IsaacMilarky and others added 11 commits April 17, 2023 14:51
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <krabs@tilde.team>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@IsaacMilarky IsaacMilarky marked this pull request as ready for review April 18, 2023 18:02
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@ABrain7710 ABrain7710 self-requested a review April 18, 2023 18:41
@IsaacMilarky IsaacMilarky merged commit 161733d into dev Apr 18, 2023
1 check passed
@IsaacMilarky IsaacMilarky deleted the collection-hooks-with-weight branch April 18, 2023 22:33
Labels
add-feature Adds new features database Related to Augur's unifed data model feature-request Request for a new feature in Augur release Related to releasing a new version of Augur server Related to the Augur server workers Related to data workers
3 participants