Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move user retirement code to edx-platform and drop it from Tubular #881

Closed
7 tasks done
feanil opened this issue Sep 19, 2023 · 22 comments
Closed
7 tasks done

Move user retirement code to edx-platform and drop it from Tubular #881

feanil opened this issue Sep 19, 2023 · 22 comments
Assignees

Comments

@feanil
Copy link
Contributor

feanil commented Sep 19, 2023

Context

The tubular repo mostly contains tools that are operationally relevant to the edx/2U org. However there are a few scripts that are relevant to the entire community.

Tasks

  • Move the openedx specific scripts to the edx-platform repository. They should have their own requirements file since you don't need the whole platform to run them now.
  • Make a PR to remove those scripts and relevant library code from the tubular repo.
  • Notify 2U SRE about the change and give them 2-4 weeks to schedule updating their internal tooling before we merge the Tubular PR.
  • Add how-to docs for running the scripts that get moved.
  • Add a note to the Operational section of Redwood Wiki to let people know that these scripts have moved to the edx-platform repo and will need to be executed from there.
  • After 2-4 weeks notification of 2U SRE team merge code removal PR from tubular
  • Archive the tubular repository

Relevant Scripts

Note: There will be other library code that will also need to be moved like the edx_apy.py. I didn't list everything that needs to be moved, just the top level scripts that need to be moved. You should move everything we need to make them work.

Acceptance Criteria

  • The relevant code only lives in edx-platform.
  • There is documentation for how to use it.

Other Useful Links

@ormsbee
Copy link

ormsbee commented Jan 16, 2024

Please don't bring over the mongo structure pruning code–just kill it entirely. It's potentially dangerous to run since we shifted the indexes into MySQL. 2U stopped running it a while back.

@farhan
Copy link

farhan commented Jan 25, 2024

PR is ready for code review: openedx/edx-platform#34063

  • I have not moved the structures.py script that is related to the Mongo structure pruning as per decision.
  • @feanil Do we still need to remove these scripts from the tubular repo, as we are planning to deprecate repo ahead?
  • I followed this suggested pattern but reverted it as we have two *.in files (base.in and testing.in) and need to maintain their upgrade order.
    If we add more scripts ahead, we will be in a better situation to write things generic.

Script I used

@find scripts -type f -name "*.in" -exec sh -c ' \
	echo "== Upgrading {} ==============================="; \
	pip-compile --upgrade -o $$(echo "{}" | sed "s/\.in/.txt/") "{}"; \' \;

@feanil
Copy link
Contributor Author

feanil commented Feb 2, 2024

@farhan I think we should still remove them to reduce confusion.

@jfavellar90
Copy link

jfavellar90 commented Feb 21, 2024

@ormsbee just joining the conversation late, I'm curious about the way 2U is handling the Mongo DB structures, if they stopped running the script, the structures keep growing with no control? Do they have a different mechanism to prune the old structures? I'm asking this because this tool helps us to solve issues with very large databases that contain a huge amount of old structures which make the backups difficult to run.

@ormsbee
Copy link

ormsbee commented Feb 21, 2024

Please don't run pruning code. I'm out of office now, will explain more next week, but pruning might not be a safe operation any longer.

@farhan farhan changed the title Move mongo pruning and retirement code to edx-platform and drop it from Tubular Move user retirement code to edx-platform and drop it from Tubular Feb 23, 2024
@farhan
Copy link

farhan commented Feb 27, 2024

Ticket Update:

  • The user retirement scripts have been shifted to edx-platform repository
  • The user retirement scripts have been marked deprecated in tubular repository
  • 2U SRE team has been notified about deprecation
  • Let's wait till 19th March and then merge the user-retirement scripts removal PR from tubular and archive it

cc: @feanil

@ormsbee
Copy link

ormsbee commented Feb 27, 2024

Okay, I think pruning is still safe because we're writing the index to both places: https://github.com/openedx/edx-platform/blob/fca51419f8facbcb0e6eaad7750bafee23f92509/xmodule/modulestore/split_mongo/mongo_connection.py#L756-L759

My original concern is this: The pruning code currently only looks at MongoDB. It was written at a really low level for performance reasons, and it predates the SplitModulestoreCourseIndex that is stored in MySQL and allows easy reverts using the Django admin today. As long as the two are kept in-sync, we should be okay. But the one in MySQL is now the source of truth, and it's possible for the MongoDB one drifts somehow (e.g. some persistence failure to write to the collection).

We should probably change the comment in that code above to indicate that it's still necessary for cleanup safety.

@bradenmacdonald: Does this sound right to you?

@feanil
Copy link
Contributor Author

feanil commented Feb 27, 2024

@ormsbee it sounds like that mean it's worth moving it out of here before we deprecate this repo?

@bradenmacdonald
Copy link

@ormsbee That sounds right. Also I don't know if it's happened at all recently, but we have observed some instances in the past where the two were not in sync. Since they're separate database systems, there's no way to completely guarantee consistency.

@robrap
Copy link

robrap commented Feb 28, 2024

@feanil:

  1. [question] Is the "Relevant Scripts" list in the PR description still up-to-date?
  2. [inform] 2U SRE is actively working on this. I'm not pursuing getting an estimate, and hopefully we'll hear back soon that it is done. That said, we can check in whenever you wish.

@feanil
Copy link
Contributor Author

feanil commented Feb 28, 2024

@robrap everything but structures.py which it seems like we may need to move afterall has already moved.

@ohnickmoy
Copy link

@feanil and @ormsbee, so can I nuke this mongo-pruner job within our internal jenkins jobs? I read the initial description without following up on the comments and assumed that this structures.py script would be in edx-platform and filed a PR reflecting this. However, sounds like it's either eliminate the job outright, or file a reversion on this piece of code.

@feanil
Copy link
Contributor Author

feanil commented Mar 1, 2024

@ohnickmoy if we're gonna keep it, we'll probably move it to edx-platform so you won't have to revert the job, we can just move that script. I'll check in with Dave today on it to figure out what the next steps should be.

@ormsbee
Copy link

ormsbee commented Mar 1, 2024

@ohnickmoy: I thought edx.org had stopped running this pruning a while back. It's definitely still running today?

I feel conflicted about this one because while the risk of losing sync is low, the consequences would be really bad, since it would mean data loss of actively used content.

@feanil: I guess the real thing to do here would be to make the pruner safe by having it check for and pull from the MySQL table for active version information, so that it's always has the right picture (or can at least abort if it sees a mismatch). So maybe let's bring the structures.py script into edx-platform for now, and put a task to follow up with that patch.

@ohnickmoy: I still think the recurring pruner job should be stopped until that fix happens ^.

@feanil
Copy link
Contributor Author

feanil commented Mar 1, 2024

@farhan can you mane one more PR to move over the structures.py file and update the relevant docs on the tubular side?

@ormsbee
Copy link

ormsbee commented Mar 1, 2024

Probably goes without saying, but bringing over structures.py would require bringing splitmongo.py and test_splitmongo.py for the ride as well.

Thank you for your patience.

@ohnickmoy
Copy link

ohnickmoy commented Mar 1, 2024

@ormsbee, from what I can see on tools-jenkins, there have been recent runs in the last month. I'll take a look closer if it's running on a schedule or if it's been run manually

EDIT: Yep running on a schedule. All three environments

@feanil
Copy link
Contributor Author

feanil commented Mar 11, 2024

Update deprecations are in place for most things, we're in the process of moving the pruning code: openedx/edx-platform#34328

Once that's done, we should be good to archive the tubular repo.

@ohnickmoy
Copy link

@feanil, im reading the PR. in sum, the structures.py job is not ready to migrate yet until merged?

@feanil
Copy link
Contributor Author

feanil commented Mar 12, 2024

@ohnickmoy correct, but that PR has now merged so that job should be good to migrate now.

@feanil
Copy link
Contributor Author

feanil commented Mar 12, 2024

PR that added the final deprecation warnings before we archived the repo: openedx-unsupported/tubular#744

@ohnickmoy
Copy link

thanks @feanil. Will get to it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Archived in project
7 participants