Suggestion: download user files #22

Open
Cyberes opened this issue Oct 28, 2023 · 4 comments
Comments

@Cyberes

Cyberes commented Oct 28, 2023

Canvas stores your submissions and other miscellaneous files on the platform, and it's pretty easy to download them through its API. Here's how I did it:

https://github.com/Cyberes/canvas-student-data-export/blob/master/module/user_files.py

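```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

import canvasapi
from tqdm import tqdm

from module.helpers import make_valid_folder_path


def do_download(task):
    """Download one (canvasapi File, destination Path) pair."""
    file, out_path = task
    out_path.parent.mkdir(parents=True, exist_ok=True)
    file.download(out_path)


def download_user_files(canvas: canvasapi.Canvas, base_path: str):
    base_path = Path(base_path)
    user = canvas.get_current_user()

    # Collect every folder below the root 'my files' folder.
    folders = []
    for folder in user.get_folders():
        # removeprefix (Python 3.9+) drops the exact prefix once. lstrip('my files/')
        # would instead strip any leading run of those characters, eating into
        # the start of some folder names.
        n = folder.full_name.removeprefix('my files').lstrip('/')
        if n:  # empty for the root folder itself, which is skipped
            c_n = make_valid_folder_path(n)
            folders.append((folder, c_n))

    # Build the list of (file, destination) download tasks.
    files = []
    for folder, folder_name in tqdm(folders, desc='Fetching User Files'):
        for file in folder.get_files():
            out_path = base_path / folder_name / file.display_name
            files.append((file, out_path))

    # Download in parallel and advance the bar as each task finishes.
    with ThreadPoolExecutor(max_workers=10) as executor:
        with tqdm(total=len(files), desc='Downloading User Files') as bar:
            futures = [executor.submit(do_download, task) for task in files]
            # Note: worker exceptions are not surfaced here; call
            # future.result() on each future to re-raise them.
            for _ in as_completed(futures):
                bar.update()
```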
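
One detail worth checking when trimming the `my files/` prefix from `folder.full_name`: `str.lstrip` takes a set of characters rather than a prefix, so it can also remove the first letters of a folder name. Below is a minimal sketch of the difference. `strip_root` is a hypothetical helper (it is not part of canvasapi or the linked repository), and `str.removeprefix` assumes Python 3.9 or newer.

```python
def strip_root(full_name: str, root: str = 'my files') -> str:
    """Return full_name relative to root; '' for the root folder itself.

    Hypothetical helper for illustration only.
    """
    # removeprefix drops the exact string once; lstrip('/') then removes
    # only the path separator that followed it.
    return full_name.removeprefix(root).lstrip('/')


# lstrip treats its argument as a character set: 'e' and 's' are both in
# 'my files/', so the start of 'essays' is stripped as well.
print('my files/essays'.lstrip('my files/'))  # ays
print(strip_root('my files/essays'))          # essays
```

Folder names that begin with none of the characters in `my files/` are unaffected, which is probably why the problem is easy to miss.
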
@davekats
Owner

@Cyberes nicely done! Do you know if your code downloads files that this tool currently does not download? I thought that maybe the files listed under User Files are a collection of files from every course the user is enrolled in. So maybe the tool currently downloads all of the same files, but just organized in a different way. I no longer have access to Canvas so I can't confirm this, but maybe you can. Thanks!

@Cyberes
Author

Cyberes commented Oct 30, 2023

User files contain every file that a user has ever submitted, even for classes that can't be accessed anymore. They also hold miscellaneous files from tools and the like.

@davekats
Owner

Very cool. This would be a nice feature to add. While it might result in a lot of duplicate data, it would ensure that no files are missed. Perhaps it could be added as an optional task to run based on a configuration parameter.
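
As a rough sketch of what such an opt-in could look like: the thread does not show how this tool is configured, so the use of `argparse`, the `--download-user-files` flag, and the task names below are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description='Export Canvas data (sketch).')
    # Hypothetical flag: off by default, so the extra (possibly duplicate)
    # user-file download only runs when explicitly requested.
    parser.add_argument(
        '--download-user-files',
        action='store_true',
        help='also download everything under the user\'s personal files',
    )
    return parser


def planned_tasks(args: argparse.Namespace) -> list[str]:
    """Return the names of the export tasks to run, in order."""
    tasks = ['courses']  # placeholder for the tool's existing export work
    if args.download_user_files:
        tasks.append('user_files')
    return tasks


print(planned_tasks(build_parser().parse_args(['--download-user-files'])))
```

Keeping the flag off by default would leave existing runs unchanged while letting users who want a complete archive accept the duplicates.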

@Cyberes
Author

Cyberes commented Jan 27, 2024

I also made sure that files embedded in HTML get downloaded. For example, a module post page might have attached files.

Find them: https://github.com/Cyberes/canvas-student-data-export/blob/master/module/api/file.py
Fetch them: https://github.com/Cyberes/canvas-student-data-export/blob/master/module/get_canvas.py#L30
And download them: https://github.com/Cyberes/canvas-student-data-export/blob/master/module/threading.py#L32C12-L33C58
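
The "find them" step can be sketched roughly as follows. This is not the code from the linked repository. It assumes that links to Canvas files contain a `/files/<numeric id>` path segment, and `find_file_ids` and `FileLinkParser` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Assumption: a link to a Canvas file contains a "/files/<numeric id>" segment.
FILE_ID_RE = re.compile(r'/files/(\d+)')


class FileLinkParser(HTMLParser):
    """Collect file IDs referenced by href/src attributes, in document order."""

    def __init__(self):
        super().__init__()
        self.file_ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ('href', 'src') and value:
                match = FILE_ID_RE.search(value)
                if match:
                    file_id = int(match.group(1))
                    # Skip IDs already seen so each file is fetched once.
                    if file_id not in self.file_ids:
                        self.file_ids.append(file_id)


def find_file_ids(html: str) -> list[int]:
    """Return the unique file IDs linked or embedded in an HTML fragment."""
    parser = FileLinkParser()
    parser.feed(html)
    parser.close()
    return parser.file_ids


page = (
    '<p><a href="https://canvas.example.edu/courses/1/files/42/download">notes</a>'
    '<img src="/courses/1/files/7/preview"/>'
    '<a href="/files/42">same file again</a></p>'
)
print(find_file_ids(page))  # [42, 7]
```

Each ID could then be resolved through the Canvas API and queued for download in the same way as the user files.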
