Conversation

@dwong2708
Contributor

@dwong2708 dwong2708 commented Sep 10, 2025

Resolves: #380

Summary

This PR introduces the initial implementation for restoring a learning package from a ZIP archive. It sets the foundation for handling backup and restore of course-related data.

Changes

  • Added logic to parse and restore a learning_package.toml file
  • Implemented initial handling of ZIP file contents
  • Introduced placeholders for restoring additional entities:
    • Containers
    • Components
    • Collections

Notes

  • The current implementation restores only the learning package itself
  • Other entities are defined as placeholders and will be fleshed out in follow-up work
  • All operations are wrapped in a database transaction to ensure atomicity

Next Steps

  • Implement full parsing and restore for containers, components, and collections
  • Add unit tests for each restore path
  • Extend summary reporting of restored objects

- Implement restore of the learning package
- Add initial logic to handle ZIP file contents
- Include placeholders for restoring other entities (containers, components, collections)
@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Sep 10, 2025
@openedx-webhooks

openedx-webhooks commented Sep 10, 2025

Thanks for the pull request, @dwong2708!

This repository is currently maintained by @axim-engineering.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Sep 10, 2025
@dwong2708 dwong2708 marked this pull request as ready for review September 10, 2025 22:54
@dwong2708 dwong2708 requested a review from ormsbee September 10, 2025 22:54
@dwong2708 dwong2708 self-assigned this Sep 10, 2025
@mphilbrick211 mphilbrick211 added the mao-onboarding Reviewing this will help onboard devs from an Axim mission-aligned organization (MAO). label Sep 11, 2025
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Ready for Review in Contributions Sep 11, 2025
@ormsbee
Contributor

ormsbee commented Sep 16, 2025

Apologies for the delay. I'll review this on Tues. morning.

Contributor

@ormsbee ormsbee left a comment

Some relatively minor change requests. Thank you!

# Public API
# --------------------------

@transaction.atomic
Contributor

Note: This is okay for now, but we may need to be more granular with this eventually, instead of putting it over the entire method.

Contributor Author

Understood

# --------------------------

@transaction.atomic
def extract_zip(self, path: str) -> dict[str, Any]:
Contributor

  1. This function is meant to load these contents into the database, not just extract them from the zip file. So a function name like load seems more appropriate.
  2. This will be easier to test if it takes a ZipFile as an argument, because then you'll be able to construct in-memory ZipFiles for testing purposes using BytesIO (instead of having a lot of temp files thrown around).
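A rough sketch of what point 2 makes possible, assuming a hypothetical `load(zipf)` function standing in for the real loader. Neither `load` nor `make_zip` is taken from this PR; they only show the shape of a test that never touches the filesystem.

```python
import io
import zipfile


def load(zipf: zipfile.ZipFile) -> list[str]:
    """Stand-in for the real loader: reports which files it would process."""
    return sorted(name for name in zipf.namelist() if not name.endswith("/"))


def make_zip(files: dict[str, str]) -> zipfile.ZipFile:
    """Build a readable in-memory ZipFile from {archive path: text content}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    buf.seek(0)
    return zipfile.ZipFile(buf)


def test_load_sees_archive_members() -> None:
    # No temp files: the whole archive lives in a BytesIO buffer.
    with make_zip({"learning_package.toml": "", "collections/a.toml": ""}) as zipf:
        assert load(zipf) == ["collections/a.toml", "learning_package.toml"]


test_load_sees_archive_members()
```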

Contributor Author

For point 2, what about accepting either a str or a ZipFile?

Contributor Author

I applied the approach of accepting either a str or a ZipFile. Please let me know if you are OK with that change. Thanks.

Contributor

@ormsbee ormsbee Sep 17, 2025

Please leave this parameter as only a ZipFile and not either-or. Try to avoid overloading parameters like this when possible:

  1. It sometimes results in dependent optional parameters, which can be confusing (e.g. "these optional params only apply when we pass a file path").
  2. It mixes concerns. There are more error scenarios with opening that file than what you have listed right now: permission denied, invalid zip file (due to partial/broken upload), etc. Furthermore, the file loading is going to be different and potentially need to give different feedback depending on what's doing it: it's the command prompt for a management command, but something different for an async celery task worker (where a /tmp file would be meaningless to the user).

Having the code to load the raw file and make sure it's the right-kind-of-thing is a discrete piece of logic that will likely grow more complex over time. For instance, maybe we'll want a custom loader where you point it to a directory tree, and it dynamically generates an in-memory ZipFile on the fly so that it's easier to browse and work with tests. If this class only takes a ZipFile as an init param, it's pretty clear that this custom loading logic should go outside this class. But if this init takes a string, then someone may look at that and think, "okay, well, if the string is a path to a zip file, do this, but if the string is a path to a directory tree, do this other thing...", further complicating things.
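To make the directory-tree idea concrete, here is a minimal sketch of such a loader living outside the class. `zip_from_directory` is a hypothetical name, and the file layout in the usage part is invented for illustration.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def zip_from_directory(root: Path) -> zipfile.ZipFile:
    """Pack every file under root into an in-memory ZipFile (hypothetical loader)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Store paths relative to root, using forward slashes.
                zf.write(path, path.relative_to(root).as_posix())
    buf.seek(0)
    return zipfile.ZipFile(buf)


# Usage: lay out a small tree on disk, then hand the ZipFile to the loader.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "collections").mkdir()
    (root / "learning_package.toml").write_text("")
    (root / "collections" / "demo.toml").write_text("")
    with zip_from_directory(root) as zipf:
        names = sorted(zipf.namelist())
```

The class that consumes the archive never learns whether the ZipFile came from disk, an upload, or a directory tree.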

Overall, it just simplifies the logic of the class in the longer term if it assumes it gets a well-formed, readable zip file. Then this class can worry exclusively about the contents of the archive and whether they are consistent with each other, and not worry about lower-level I/O details.
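One possible shape for that split, sketched under assumptions: `open_archive` and `ArchiveOpenError` are hypothetical names, and a management command or a celery task would each decide how to report the error to its own user.

```python
import zipfile


class ArchiveOpenError(Exception):
    """Hypothetical error type for problems that occur before loading starts."""


def open_archive(path: str) -> zipfile.ZipFile:
    """Open path as a ZipFile, translating low-level failures into one error type."""
    try:
        return zipfile.ZipFile(path)
    except FileNotFoundError as exc:
        raise ArchiveOpenError(f"No such file: {path}") from exc
    except PermissionError as exc:
        raise ArchiveOpenError(f"Permission denied: {path}") from exc
    except zipfile.BadZipFile as exc:
        # Covers truncated or otherwise corrupt uploads.
        raise ArchiveOpenError(f"Not a valid zip file: {path}") from exc
```

The caller opens the archive and handles `ArchiveOpenError`; the loader class receives only the resulting ZipFile.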

Contributor Author

I appreciate your explanations — they definitely make sense. I’ve applied the changes.

"""

def _load_component(
self, zipf: zipfile.ZipFile, component_file: str, learning_package: "LearningPackage"
Contributor

Nit: You can import LearningPackage from the publishing api.

Contributor Author

Got it. I also noticed the LearningPackage model is already imported. Thanks.

@dwong2708 dwong2708 requested a review from ormsbee September 17, 2025 18:59
@ormsbee ormsbee merged commit c8ebb89 into openedx:main Sep 18, 2025
11 checks passed
@github-project-automation github-project-automation bot moved this from Ready for Review to Done in Contributions Sep 18, 2025
Labels

mao-onboarding Reviewing this will help onboard devs from an Axim mission-aligned organization (MAO). open-source-contribution PR author is not from Axim or 2U

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Create minimal lp_load management command for Learning Core

5 participants