Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change format of facade directories #2396

Merged
merged 16 commits into from May 4, 2023
Merged

Change format of facade directories #2396

merged 16 commits into from May 4, 2023

Conversation

IsaacMilarky
Copy link
Contributor

@IsaacMilarky IsaacMilarky commented May 3, 2023

Description

  • Change the repo_path that facade collection used to no longer be determined by the repo's group id, now a repo_id is used which is unique to that repo.
  • Create an alembic script to automatically migrate instances to the new change
  • Fix issues associated with the aforementioned change due to changes in formatting

Signed commits

  • Yes, I signed my commits.

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
… deleted

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
… deleted

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@IsaacMilarky IsaacMilarky changed the title Change format of facade directory organization Change format of facade directories May 3, 2023
sgoggins
sgoggins previously approved these changes May 3, 2023
Copy link
Member

@sgoggins sgoggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM .. with the assumption that something in here is deleting the old clone, and making a new one ... or moving the old one. Either way, we can't afford to have the old clones and the new clones both on disk. It would be too costly for augur users, and not something they expect.

@sgoggins
Copy link
Member

sgoggins commented May 3, 2023

I want @ABrain7710 to review this before we merge it.

@IsaacMilarky
Copy link
Contributor Author

This LGTM .. with the assumption that something in here is deleting the old clone, and making a new one ... or moving the old one. Either way, we can't afford to have the old clones and the new clones both on disk. It would be too costly for augur users, and not something they expect.

The old clones are deleted and then cloned again via an alembic script.

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
Copy link
Contributor

@ABrain7710 ABrain7710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing I see that needs added is a query that updates the core and secondary statues to pending if the issue_pr_sum is NULL

…sue/pr count via alembic

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@IsaacMilarky
Copy link
Contributor Author

The only thing I see that needs added is a query that updates the core and secondary statues to pending if the issue_pr_sum is NULL

That makes sense. I have made the change

Signed-off-by: Isaac Milarsky <imilarsky@gmail.com>
@ABrain7710 ABrain7710 self-requested a review May 4, 2023 01:29
Copy link
Contributor

@ABrain7710 ABrain7710 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try to restrict these operations rather than just running them on every repo, so that if someone runs a downgrade/updgrade in the future it won't just reset everything for no reason


UPDATE augur_operations.collection_status
SET facade_status='Pending', facade_task_id=NULL, facade_weight=NULL,commit_sum=NULL,facade_data_last_collected=NULL;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only run this on repos that don't have a commit_count or have a repos that are stored in the old structure. The reason being that I would like this script to reset all of facade on a huge instane if someone accidentally downgrades the database

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they wanted to downgrade the schema it would be proper to also reset the facade repos again. We also can't check based on the commit_count because the revision before this one has commit_count without the directory change. I think it's fine how it is in my opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters if someone accidentally clears their facade repos while downgrading the schema since it's so fast to clone.

Copy link
Contributor

@ABrain7710 ABrain7710 May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reply to first part: How would it be proper? I don't see why that would be the desired behavior. Usually in my experience downgrades are to bring the database back to the prior state, but resetting the repos doesn't do anything to reset to the old state.

Reply to second part: I think we can do it based on commit_count we would run it if the facade directory was wrong or the commit_count doesn't exist since they are both valid times to do it.

Reply to third part: I do think it matters, because although it is fast to clone, each repo still has to go through the collecting phase and if there are 22,000 repos that go reset and there collections finished instantly (since the commits exist) it would take around 6 hours for all the repos to get back through to success and this is not even considering the time to add all the tasks to the queue, and the time to clone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think it matters, because although it is fast to clone, each repo still has to go through the collecting phase and if there are 22,000 repos that go reset and there collections finished instantly (since the commits exist) it would take around 6 hours for all the repos to get back through to success and this is not even considering the time to add all the tasks to the queue, and the time to clone.

The only issue with this is that it sets all the repos to new but otherwise it doesn't really change the underlying behavior of facade. I do agree that people could see that as a problem though since people want to prioritize the newly added repos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approved because it does solve the problem. I'm just a bit apprehensive about doing this without conditions because it may not do what people would expect it to do when it is run multiple times

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approved because it does solve the problem. I'm just a bit apprehensive about doing this without conditions because it may not do what people would expect it to do when it is run multiple times

I see the concern but I think any issue this alembic script might cause would be more of an inconvenience caused by not knowing facade repos are all being deleted. If we want to we can add a confirm prompt so that people are absolutely aware that their facade directories are being deleted. If we want to we can also mark the old repos or something so that they aren't re-collected as new.

@sgoggins sgoggins merged commit 9c958e0 into dev May 4, 2023
1 check passed
@IsaacMilarky IsaacMilarky deleted the facade-dir-changes branch May 4, 2023 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants