Add API for submitting files #6525

Open
Timothy-Gonzalez opened this issue Oct 24, 2022 · 9 comments
Labels: enhancement (a desired new feature or change, not a bug)

Comments

@Timothy-Gonzalez

Timothy-Gonzalez commented Oct 24, 2022

As a student in CS128, submitting my code can be a pain. I have to first log in to the autograder, drag the proper files in, and then wait several minutes for a result, tabbing back and forth in the meantime.

Making this workflow easier by adding Git/GitHub-based submission has been discussed since 2017 (#3695, #3197, #675). That would ease some of the pain, but it runs into problems with different Git setups and local vs. remote repos, and, more importantly, it doesn't solve the root problem.

The best solution is to avoid relying on Git or GitHub entirely and expose a public API for submitting files. Right now, it appears PrairieLearn's public API is read-only. Endpoints that do what I want exist, but they are not publicly documented, which I'm guessing means they are not supported.

A public API would cover all of these use cases, as code could be uploaded from a terminal, a VS Code extension, or a GitHub Action.

Because it is an API, PrairieLearn wouldn't have to worry about implementing GitHub authentication or managing repos; it would only handle requests as usual.

As always, there are security implications with doing this. However, this can technically already be achieved using the undocumented endpoint mentioned above. Supporting an official endpoint would allow full control over usage and ensure fair use.

Disclaimer: This is my first issue on PrairieLearn, and it's very likely I have messed something up. I could not find any other issue addressing this specifically, but it's possible I missed one. If I did, let me know.

Thank you for reading. If you have any comments or suggestions for revision, let me know.

@nwalters512
Contributor

nwalters512 commented Oct 24, 2022

I think an API for this would be reasonable, and certainly much simpler than trying to implement a per-variant Git repo. I imagine we'd limit it to files, as I can't see a reason to submit non-files programmatically.

Could you elaborate on what the student workflow might look like if such an API existed? Are you envisioning some kind of CLI to go along with this? Would auth be based on a personal access token, or something else?

I'm curious what undocumented API you found - is this just the POST endpoint that submissions go to?

Could you elaborate on the potential security issues you alluded to?

@Timothy-Gonzalez
Author

Timothy-Gonzalez commented Oct 24, 2022

@nwalters512 Limiting it to files would make sense. The API I'm talking about is the POST https://www.prairielearn.org/pl/course_instance/id/instance_question/id/ one. It seems to just base64-encode the content and send it.

I'd imagine the public API would just expose this so that it works with access tokens. I don't know much about how PrairieLearn is set up, but ideally it would be an API anyone could POST to with an access token, sending the files they wish to submit.

Ideally this API would enable someone (like myself) to make a CLI to manage uploading to PrairieLearn. Students would be able to execute two commands:
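For illustration, here is a minimal Python sketch of the base64 step mentioned above. It only shows encoding file contents for transport and decoding them on receipt. The function names and the `name`/`contents` field names are hypothetical and are not taken from PrairieLearn's actual form fields.

```python
import base64
from pathlib import Path


def encode_files(paths):
    """Return one dict per file with its name and base64-encoded contents.

    The "name" and "contents" keys are hypothetical, chosen for this sketch.
    """
    encoded = []
    for path in map(Path, paths):
        raw = path.read_bytes()
        encoded.append({
            "name": path.name,
            # b64encode returns bytes; decode to ASCII text for a text payload.
            "contents": base64.b64encode(raw).decode("ascii"),
        })
    return encoded


def decode_file(entry):
    """Recover the original bytes, as a receiving server would."""
    return base64.b64decode(entry["contents"])
```

Base64 is an encoding, not encryption, so it only makes binary content safe to carry in a text payload.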

pl-cli init

which would ask them to enter their token, specify a course and assignment, and save it to some config file for later use, and:

pl-cli upload

which would upload the files.
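The two commands described above could be wired up roughly as follows. This is only a sketch of the command-line surface using Python's `argparse`; the `--course` and `--assignment` flags and the positional `files` argument are assumptions made for illustration, and no upload logic is included.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical `pl-cli` with two subcommands."""
    parser = argparse.ArgumentParser(prog="pl-cli")
    sub = parser.add_subparsers(dest="command", required=True)

    # `pl-cli init`: record which course and assignment to submit to.
    init = sub.add_parser("init", help="save a token, course, and assignment")
    init.add_argument("--course", required=True)
    init.add_argument("--assignment", required=True)

    # `pl-cli upload`: submit one or more files for the saved assignment.
    upload = sub.add_parser("upload", help="upload files")
    upload.add_argument("files", nargs="+")
    return parser
```

The token is deliberately not a flag here: prompting for it (for example with `getpass.getpass`) keeps it out of shell history.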

It wouldn't have to be officially endorsed or controlled by PrairieLearn, as anything could use the API.

I don't think there are necessarily any security issues (other than needing to rate-limit the endpoint to prevent abuse, and making sure the files uploaded match the files expected).

The one problem might be related to FAIR and making sure each student's submission is their own, as a malicious application could in theory act as a "man in the middle" and log the code uploaded. I think the solution would be to require applications that use the API to be open source and to attach a user agent identifying the application used. The other, less fun solution is for PrairieLearn to manage the CLI while the API remains private.

I'd ideally like the API to be open so anything (GitHub Actions, a CLI, a VS Code extension) could use it.
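Both safeguards mentioned above are easy to sketch. The Python below shows one possible sliding-window rate limiter and a check that submitted file names match the expected ones. The class and function names are hypothetical, and this is not how PrairieLearn implements either safeguard.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._stamps = deque()

    def allow(self):
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            self._stamps.append(now)
            return True
        return False


def check_submission(expected, submitted):
    """Report which expected files are missing and which files are unexpected."""
    expected, submitted = set(expected), set(submitted)
    return {
        "missing": sorted(expected - submitted),
        "unexpected": sorted(submitted - expected),
    }
```

In a real server, one limiter would be kept per token or per user rather than globally.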

Let me know if there's anything else that is unclear.

@nwalters512 added the enhancement label (a desired new feature or change, not a bug) on Oct 24, 2022
@nwalters512
Contributor

Gotcha, that "API" is indeed just the normal POST handler for the instance question page. It just takes a bunch of form-encoded data, and in this case the files are indeed base64-encoded.

Any API we build would inherently be public; there's no way we could design something that would be private. Even if it were undocumented, PL is open source, so it would be trivial for someone to discover and reverse-engineer it 🙂

There's no way that we can mandate that any clients be open source. We could require a user agent, but there may not be a ton of value in that since that's also user-controlled. I'm not overly worried about malicious clients, but perhaps we could add a warning when API tokens are generated, something to the tune of "be careful which users and tools you give this token to".

If you're willing to submit a PR implementing such an API, I'd be happy to review it. Otherwise, we'll keep this feature request open, but I can't make any promises on a timeline.

@Timothy-Gonzalez
Author

@nwalters512 Alright, will do (eventually).

I understand, of course, that anything could already use the undocumented API. I just wanted to make sure it's clear that the endpoint is intended to be used and supported, and that implementing this as a feature was okay. Also, documentation makes an endpoint much more usable.

Thanks for your time!

@julianschiavo

Enjoyed reading this conversation! I had just been suggesting something similar in the 128 forums.

I think an open API is a great idea and has only minor security implications—for example, I think having a read API would be bad for numerous reasons, such as a malicious client stealing a target's submissions. I would offer to help with the PR but I haven't worked in web apps for a while; excited to see this happen and what it can be used for.

@echuber2
Collaborator

> for example, I think having a read API would be bad for numerous reasons

Fortunately, I don't think a read API is necessary for the feature to work. The response on the submission should be enough.

On Coursera, they implemented a public submission API that uses short-lived tokens students have to generate on the submission page for their specific assignment. That's kind of clunky but it removes the need to do elaborate key management on the client side.
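To make the short-lived token idea concrete, here is one way such a token could work, sketched in Python with an HMAC signature over the user, assignment, and expiry time. This is an assumption for illustration only: it is not a description of Coursera's mechanism or of anything in PrairieLearn, and the function names and token format are hypothetical.

```python
import hashlib
import hmac
import time


def issue_token(secret, user, assignment, ttl_seconds, now=None):
    """Create a token bound to one user and assignment that expires after ttl."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    message = f"{user}|{assignment}|{expires}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{expires}.{signature}"


def verify_token(secret, user, assignment, token, now=None):
    """Return True only for an unexpired token issued for this user/assignment."""
    try:
        expires_text, signature = token.split(".", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    message = f"{user}|{assignment}|{expires}".encode()
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    # compare_digest avoids leaking information through comparison timing.
    signatures_match = hmac.compare_digest(signature.encode(), expected.encode())
    return signatures_match and current < expires
```

Because the token expires quickly and is scoped to one assignment, a client has little reason to store it, which is the trade-off described above.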

If the goal here is to make a CLI tool similar to gh auth login or Git Credential Manager, you might want to refer to the strategies they used. GCM has the feature to use various keyrings depending on the OS, but it seems unnecessarily complex for PL's case.

@Timothy-Gonzalez
Author

Timothy-Gonzalez commented Oct 25, 2022

@julianschiavo I don't think we would need a read API, and leaving one out wouldn't stop a malicious application from logging the code it uploads anyway. This security issue is just an aspect of having to trust a third-party application, which is already done for thousands of development environments.

Ultimately, the responsibility falls on users to audit the software they use, as PrairieLearn can't solve this issue.


@echuber2 My hope is we can just use the pre-existing personal access tokens, since this already works for the read-only API.

[Image: screenshot from the PrairieLearn settings page. Source: https://www.prairielearn.org/pl/settings]

@echuber2
Collaborator

Ah, I was thinking more about how to store the key locally (with the ephemeral keys concept, they simply are not stored). You could use the system keychains but that adds complexity. I'd suggest keeping it out of the working tree at least so students don't accidentally commit their key somewhere.

@Timothy-Gonzalez
Author

@echuber2

The common practice for a lot of tools seems to be to create a folder in the home directory and store a .txt file in it. The reality is that if someone has already compromised the user's computer, you have greater problems. Specific settings, like course ID and assignment ID, could be stored per directory or project. If we detect a .gitignore, we can add the settings file to it if that's an issue, and emit a warning otherwise.
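A rough Python sketch of that storage scheme follows, assuming a `token.txt` file inside a tool-specific directory and a per-project settings file. The file names and function names are hypothetical, and the owner-only file mode is only fully honored on POSIX systems.

```python
import os
from pathlib import Path


def save_token(token, config_dir):
    """Write the token to config_dir/token.txt, readable only by the owner."""
    config_dir = Path(config_dir)
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / "token.txt"
    # 0o600 applies when the file is created; chmod covers a pre-existing file.
    fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    os.chmod(token_path, 0o600)
    return token_path


def ensure_gitignored(project_dir, entry):
    """Add `entry` to an existing .gitignore; return False if there is none.

    A False result means the caller should warn the user instead.
    """
    gitignore = Path(project_dir) / ".gitignore"
    if not gitignore.exists():
        return False
    text = gitignore.read_text()
    if entry not in text.splitlines():
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with gitignore.open("a") as handle:
            handle.write(prefix + entry + "\n")
    return True
```

A caller might use `save_token(token, Path.home() / ".pl-cli")` so the token stays outside any working tree, and call `ensure_gitignored` only for the per-project settings file.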

But that's getting ahead of ourselves - we need to make a working API first.


4 participants