Skip to content
This repository has been archived by the owner on Mar 12, 2020. It is now read-only.

Allow copying starter repo issues #555

Closed

Conversation

rmacklin
Copy link
Contributor

@rmacklin rmacklin commented Apr 5, 2016

I've got a basic implementation of #546 working with these changes.

I haven't finished the tests yet. I left a few inline comments on 7475523 with some questions.

The other thing I have not dealt with is error handling for the background job. I'd like to hear what you guys think we should do if errors occur during the issue copying. Automatically retrying the issue creation (from the point at which it failed) would be nice, but it could introduce undesired complexity, and there's still the possibility that errors occur again during the retry attempts. So a simpler approach would be to just alert the user of the error (which they could then manually recover from). Thoughts?

Screenshots:
screen shot 2016-04-05 at 1 56 20 pm

demoofcopyissueschanges

@johndbritton
Copy link
Contributor

The other thing I have not dealt with is error handling for the background job. I'd like to hear what you guys think we should do if errors occur during the issue copying. Automatically retrying the issue creation (from the point at which it failed) would be nice, but it could introduce undesired complexity, and there's still the possibility that errors occur again during the retry attempts.

We should handle any failure in the repository creation process like a transaction. If we fail to set the repository up exactly correctly, roll back the changes and alert the user of an error.

In the case of copying issues, this means we'd likely have to show the user a progress bar on Classroom while copying the code, creating the issues, and then finally alert if there was a success or failure via a websocket or polling.

@johndbritton
Copy link
Contributor

@rmacklin Thanks for getting this conversation started, we really appreciate your contribution.

I've left some feedback on the questions you asked, but I think we should take a step back to consider the approach. In #546 I mentioned a few other use cases, and I agree with you that creating issues at the start of the project is distinct enough that it could be it's own feature.

The concerns I have are twofold:

  • notification overload for the assignment creator, as we don't want to recommend using multiple accounts
  • How to manage issues on the source repository that aren't supposed to be copied over

Creating the issues as the assignment creator is an ok solution, but I don't think it's totally ideal. We don't want to recommend people have multiple accounts as it leads to support headaches when trying to push to multiple repositories from different accounts, contribution attribution, and it's also against the GitHub ToS to have multiple accounts.

I think we may want to consider using the account of the student or maybe a bot. I don't think the assignment creator is the right answer though.

At first glance, importing the issues from the starter code repository sounds good but how do you remove issues you don't want anymore? Does that mean it only imports open issues? What about discussing the actual starter code project, where would you do that? We're aiming to get teachers using GitHub repositories as a way to discuss coding assignments and collaborate on them, if issues are part of the assignment it hurts the ability to collaborate.

I'm not sure this is the best solution, but what about using files in the repository to generate the issues? That way teachers could collaborate on them using pull requests and they wouldn't collide with the meta space of developing the starter code project.

@rmacklin
Copy link
Contributor Author

rmacklin commented Apr 7, 2016

Thanks for clarifying the concerns around the notifications and using the assignment creator as the issue opener. I hadn't thought of opening the issues from the student's account, but that is also easy and makes sense as a way to avoid the assignment creator being auto-subscribed. I can certainly rework it to use the student account. If you'd prefer the bot account approach, I'd need some guidance there.

At first glance, importing the issues from the starter code repository sounds good but how do you remove issues you don't want anymore? Does that mean it only imports open issues?

That is indeed how I currently made it work, but I see your point with:

What about discussing the actual starter code project, where would you do that?

I don't use issues on the starter code repo for that purpose, but I agree that we don't want to prevent that.

I'm not sure this is the best solution, but what about using files in the repository to generate the issues? That way teachers could collaborate on them using pull requests and they wouldn't collide with the meta space of developing the starter code project.

I'm on board with that idea. I actually already have my issues for the assignment stored in markdown files because before I added this feature I was manually running a small script that read in those files and opened issues using Octokit. Were you thinking we'd let the user input a folder where the issues are stored relative to the repo root and open issues for each file (in lexicographic order)?

@johndbritton
Copy link
Contributor

I can certainly rework it to use the student account. If you'd prefer the bot account approach, I'd need some guidance there.

There are arguments for and against the student account and the bot account. @tarebyte What do you think about this?

@irishbryan Maybe you've got some opinions on who template issues for a student assignment should be created by.

Were you thinking we'd let the user input a folder where the issues are stored relative to the repo root and open issues for each file (in lexicographic order)?

It might make sense to create a folder .github/classroom/issues and have numbered markdown files like 100-First-issue-title.md, 200-Second-issue-title.md.

@johndbritton
Copy link
Contributor

To clarify my answer to this question:

Were you thinking we'd let the user input a folder

I think we'd be better served by having a convention and not letting the user choose the folder.

@johndbritton
Copy link
Contributor

For an example of what I mean by a transaction, see:

# Public
#
def delete_github_repository_on_failure
yield
rescue GitHub::Error
silently_destroy_github_repository
raise GitHub::Error, 'Assignment failed to be created'
end

@rmacklin
Copy link
Contributor Author

rmacklin commented Apr 9, 2016

Yep, I have seen that code so I understood what you meant 😃

Re:

I think we'd be better served by having a convention and not letting the user choose the folder.

and:

It might make sense to create a folder .github/classroom/issues

That works for me. My initial instinct was to have a convention for the folder, too. The reason I asked if we would want the folder to be user-configurable because I wasn't sure if enforcing a specific folder would be too limiting. Of course, if it did turn out that way we could follow up with an enhancement to make it configurable. So, that said, should I start down that path? Should we wait for @tarebyte to chime in?

@rmacklin
Copy link
Contributor Author

rmacklin commented Apr 9, 2016

Re:

and have numbered markdown files like 100-First-issue-title.md, 200-Second-issue-title.md.

That works for me too, but for the sake of discussion, I'll expand a bit more on:

I actually already have my issues for the assignment stored in markdown files because before I added this feature I was manually running a small script that read in those files and opened issues using Octokit.

My issue markdown files all start with YAML frontmatter (https://jekyllrb.com/docs/frontmatter) which specifies the issue title, and optionally a list of labels. The rest of the file is GitHub markdown used for the issue body. So an example would look like:

---
title: Add Authentication
labels:
- some
- labels
---

# My heading

Tasks:
- [ ] Lorem ipsum
- [ ] dolor sit amet

Resources:
- http://some.res/ource/link

The script would read all the markdown files in lexicographic order and open an issue after parsing the frontmatter data and the body. Using the frontmatter allows the issue title to be decoupled from the file name in addition to adding a standard way to (optionally) declare labels. Let me know if you think this format would be good, or if you have other ideas.

@johndbritton
Copy link
Contributor

Of course, if it did turn out that way we could follow up with an enhancement to make it configurable.

Agreed, let's move forward with a convention.

So, that said, should I start down that path? Should we wait for @tarebyte to chime in?

I think it's safe to continue down this path. @tarebyte can chime in if he has any reservations, but I think this is the ideal solution.

My issue markdown files all start with YAML frontmatter

I was going to suggest doing the same, this works really well.

Using the frontmatter allows the issue title to be decoupled from the file name in addition to adding a standard way to (optionally) declare labels.

I think this is superior to using file names for titles, so let's go with it. Filenames literally mean nothing other than the sort order. Do we want to allow ordering via YAML frontmatter? If we're decoupling from filenames, maybe that's worth doing too?

Also, is there a need to be able to ignore files for any reason? Do you think we should skip files that begin with . or that have a flag set in the frontmatter?

@johndbritton
Copy link
Contributor

Since we're going with the convention of files in .github/classroom/issues being opened as issues on import, do we still need the checkbox on the assignment creation form to enable / disable issue creation?

If a starter code repository has issues in that folder, let's just create them.

@johndbritton
Copy link
Contributor

In order to open the issues, we'll need to read the contents of files in the repository. I can see two approaches to that problem:

  1. Clone the repository and use file access to read the data
  2. Use the GitHub Git Data API: https://developer.github.com/v3/git/

I think we should prefer the GitHub Git Data API as this app runs on Heroku and we don't have a permanent file system. Additionally, cloning the entire repository to access the files in one directory seems overkill.

Rather than trying to add all of this in one big pull request, it might make sense to split the project into a few distinct steps, but I'll leave that up to @rmacklin to decide on.

@rmacklin
Copy link
Contributor Author

I think this is superior to using file names for titles, so let's go with it. Filenames literally mean nothing other than the sort order. Do we want to allow ordering via YAML frontmatter? If we're decoupling from filenames, maybe that's worth doing too?

Interesting thought. Personally, I kind of like having the order output by ls match the order of the created issues, I just didn't want my filenames to have to contain my issue titles. A downside of having the order specified in the frontmatter is that it's possible for two (or more) files to have the same value for that key (by accident), and then what would we do? Fall back to filename order?

Also, is there a need to be able to ignore files for any reason? Do you think we should skip files that begin with . or that have a flag set in the frontmatter?

Do you have a use case in mind? In any case, I was thinking we'd only use .md files in the folder so any other files would be ignored.

do we still need the checkbox on the assignment creation form to enable / disable issue creation?

If a starter code repository has issues in that folder, let's just create them.

Yes, I totally agree. We no longer need the checkbox with this approach.

I think we should prefer the GitHub Git Data API

Again, I totally agree. Before your comment I had briefly looked at the API docs and found Repository Contents in addition to Git Data. I haven't used either - so would Git Data be the better one to use?

@johndbritton
Copy link
Contributor

A downside of having the order specified in the frontmatter is that it's possible for two (or more) files to have the same value for that key (by accident), and then what would we do? Fall back to filename order?

I think you're right, let's go with simple and just use the sort order. We could add an optional field later if we want.

Do you have a use case in mind?

No, just trying to think through all the possibilities as we're essentially creating a spec by doing this.

I was thinking we'd only use .md files in the folder so any other files would be ignored.

This sounds good to me. What about sub-directories? For v0 of this let's only use the top level directory markdown files that don't begin with ..

Yes, I totally agree. We no longer need the checkbox with this approach.

🆒 We can bring it back in the future if needed, but I like that we're able to simplify the interface. If a teacher wants to skip creating the issues, they can fork the repo and delete the markdown files.

I haven't used either - so would Git Data be the better one to use?

I knew there was a way to get a file from path, but didn't see it when I was commenting before. Get Contents seems like what we want here: https://developer.github.com/v3/repos/contents/#get-contents

Looks like we'll have to pay attention to symlinks and submodules in repositories.

@johndbritton
Copy link
Contributor

@rmacklin Friendly ping to see if you're still planning to work on this.

@rmacklin
Copy link
Contributor Author

rmacklin commented May 7, 2016

Hey, sorry for going silent. I got really busy at work. I do plan to get back to this, might be able to this week.

@johndbritton
Copy link
Contributor

No worries @rmacklin, we appreciate the contributions.

@rmacklin
Copy link
Contributor Author

rmacklin commented May 26, 2016

@johndbritton I'm finally back to working on this! And...I have some new thoughts I'd like to discuss.

  1. I think I've found a drawback to the approach of storing the issues inside the starter code repo (and also with my previous approach). When the issues are just stored in the repo, it's difficult to allow a teacher to update the description of a task and have that edit automatically propagate to the corresponding issue in each student repository. And this is something I've done several times. And I don't think my case is special either. It seems pretty common for a student to ask a question about a particular issue which makes the teacher want to edit the description to clarify something, reword something, add an example, etc. So I think it would be ideal if the approach we take supports teachers being able to update a description and have the update propagate to the issues on the student repositories.

  2. Regarding the "which account should we use to open the issues" topic: Previously we had been thinking we'd use either the student's account or a bot account. I'm not using GroupAssignments, but after coming back to this after a long break I was re-familiarizing myself with the code and I realized that "the student account" only makes sense for Assignments, since GroupAssignments have multiple students. So, I'm thinking that the bot account might be the best approach. If you agree, I'd like to talk about what that would look like. That said, if the bot account poses complications, I think we could still go with opening issues from the assignment creator's account for an "initial release", and then switch to a bot account in a subsequent PR.

  3. Regarding:

    In the case of copying issues, this means we'd likely have to show the user a progress bar on Classroom while copying the code, creating the issues, and then finally alert if there was a success or failure via a websocket or polling.

    Do you have any more thoughts about this? I could probably implement a polling solution relatively easily, but if we wanted to go with websockets that'd involve changing the system architecture (unless you guys are already using websockets, but I hadn't come across any code that was). I'm certainly not against that, but I think we'd need to discuss that. Let me know what you're thinking about this issue.

I prototyped an approach that would address the drawback mentioned in (1). The gist of it is that we'd add a new model that would hold the issue data, which would allow us to push updates to the student repos if a teacher makes an update. I called this model Task because, at least for me, the terminology I naturally think of is that a large "assignment" can be broken down into smaller "tasks" (rather than "issues"). However, if you'd prefer to call it "issues" since that's what they'll end up being on GitHub, I'm open to renaming it. Anyway, Assignments and GroupAssignments can both have many Tasks. As before, when a student repo is created (after they accept an assignment invitation), we'd open an issue for each Task. But additionally, if a Task is edited, we could propagate the update to the student repos that have already been created.

You can check out what I've prototyped here: rmacklin/classroom@80f7d62~...3a3365f. The very basic functionality works, but of course I haven't addressed (2) or (3) from above. Additionally, I've played around with propagating updates but haven't pushed anything yet because I figured it's another thing we'd discuss. If we can assume (or require) that Task titles are never updated, then propagating updates to the description (i.e. the issue body) is easy: find the issue with the matching title, push an update. If titles can also be updated, then it's more involved (but certainly still doable - we can discuss some approaches for this). And we can naturally extend this to the deletion of a Task: close the corresponding issue in all student repos.

Okay, sorry for the big wall of text, but let me know your thoughts on any/all of this. Thanks!

@johndbritton
Copy link
Contributor

I'm finally back to working on this!

Awesome ✨

So I think it would be ideal if the approach we take supports teachers being able to update a description and have the update propagate to the issues on the student repositories.

Could this be solved by making it easy for teachers to add comments to all student issues at once? Keeping things up to date / in sync can get to be complicated and cumbersome.

"the student account" only makes sense for Assignments, since GroupAssignments have multiple students.

That's a good point that I hadn't considered.

So, I'm thinking that the bot account might be the best approach.

Yeah, that seems to make sense.

If you agree, I'd like to talk about what that would look like.

Maybe we create a special user account called ClassroomBot or something and use that.

That said, if the bot account poses complications, I think we could still go with opening issues from the assignment creator's account for an "initial release", and then switch to a bot account in a subsequent PR.

I think this is the best approach for now, with one additional step. Rather than releasing this widely and then changing it, we can release this feature behind a feature toggle so we can enable it on a per account basis. That way we can try it out before committing to using a specific style permanently and minimizing the number of people who would have a jarring change while we're tinkering with things.

For feature toggles we use the flipper gem.

@johndbritton
Copy link
Contributor

Forgot to respond to the other questions:

I could probably implement a polling solution relatively easily, but if we wanted to go with websockets that'd involve changing the system architecture (unless you guys are already using websockets, but I hadn't come across any code that was).

Rails 5 adds ActionCable, I bet we can use that when it becomes available.

Anyway, Assignments and GroupAssignments can both have many Tasks. As before, when a student repo is created (after they accept an assignment invitation), we'd open an issue for each Task. But additionally, if a Task is edited, we could propagate the update to the student repos that have already been created.

I think this approach could work, but I really think there's a huge benefit to making the starter code / issues a single artifact that can be modified and is versioned as a Git repo. You get all the features of GitHub to collaborate on building an assignment.

Rather than trying to sync the issues, what do you think about my idea of making it easy to comment across all the issues in student repositories?

@rmacklin
Copy link
Contributor Author

I think this is the best approach for now, with one additional step. Rather than releasing this widely and then changing it, we can release this feature behind a feature toggle so we can enable it on a per account basis. That way we can try it out before committing to using a specific style permanently and minimizing the number of people who would have a jarring change while we're tinkering with things.

👍 Sounds good to me. I haven't used Flipper before, but I'll take a look at their docs. But if you have an example of something in Classroom that was hidden behind a feature toggle in the past that would also be a helpful reference. I didn't see anything in the current code (I could've missed it though).

Rails 5 adds ActionCable, I bet we can use that when it becomes available.

Yeah, I heard about that. It sounds promising (and this would be a nice opportunity to try it out). So does that mean you guys are planning to upgrade Classroom to Rails 5 as soon as it's out?

I think this approach could work, but I really think there's a huge benefit to making the starter code / issues a single artifact that can be modified and is versioned as a Git repo. You get all the features of GitHub to collaborate on building an assignment.

Rather than trying to sync the issues, what do you think about my idea of making it easy to comment across all the issues in student repositories?

I think that idea is potentially workable but also has its own downsides. Also, wouldn't commenting on all the issues in the student repos still come with some of the same complications as keeping things in sync? Some downsides I can think of are:

  • If you make several updates, it could end up being a lot for the student to read. And given that what's stated in the comments could be corrections to something in the original description, you'd be forcing the students to read the wrong thing, and then later (maybe a lot later) see it corrected it to the right thing.

  • How do you handle students who haven't accepted the invitation yet? Say you want to clarify something in an assignment, but you know there are outstanding invites. I don't think it'd be reasonable to have to wait until all students have accepted the invitation because you'd be putting the "early birds" at a disadvantage. So, you could comment on the existing issues, but what happens when the next student accepts the invitation? Do we have to have comments queued up for them?

    With the approach of always directly updating the issue, this concern goes away completely. Existing issues will get updated and when new repos are created their issues will be opened as the "latest version". And each student sees the same thing when they're looking at the issue in their repo.

Keeping things up to date / in sync can get to be complicated and cumbersome.

I agree, but it would help if you could list the complications you have in mind. A significant one that I can think of is handling errors, but I think we'd need similar error handling with the "comment on all issues" approach.

Also, I momentarily forgot about ActiveModel::Dirty, which removes the complication I had previously mentioned with allowing Task titles to be updated.

@johndbritton
Copy link
Contributor

example of something in Classroom that was hidden behind a feature toggle in the past

There are no examples, we haven't used it in this app yet.

So does that mean you guys are planning to upgrade Classroom to Rails 5 as soon as it's out?

Yeah, we do our best to keep everything up to date. Might not be immediate, but I think we want to be on Rails 5 shortly after it comes out.

If you make several updates, it could end up being a lot for the student to read. And given that what's stated in the comments could be corrections to something in the original description, you'd be forcing the students to read the wrong thing, and then later (maybe a lot later) see it corrected it to the right thing.

I think both solutions add unneeded complexity. What do you think about skipping updating altogether for now and coming back to it in a second iteration?

I agree, but it would help if you could list the complications you have in mind.

Adding an additional Task model and related interfaces would be a lot of work that the app would have to handle. I could see that leading to people wanting ways to export Tasks or reuse them in a template as courses repeat. I think the ideal solution is to store all of that in the starter code repository.

Maybe there's a way that we can use webhooks on the starter code repository to listen for updates and trigger updates to issues in student repositories when the markdown files in the repositories are changed.

The more I think about it, the more I think we should wait and see if the updating feature is necessary. Step one would be to create issues programmatically and then we can keep them up to date if needed.

@rmacklin
Copy link
Contributor Author

Maybe there's a way that we can use webhooks on the starter code repository to listen for updates and trigger updates to issues in student repositories when the markdown files in the repositories are changed.

Yep, that would be possible. That's why I said (emphasis added):

it's difficult to allow a teacher to update the description of a task and have that edit automatically propagate to the corresponding issue in each student repository.

It's more work than if tasks are a model, but certainly not impossible. With GitHub webhooks, few things are impossible 😸

Anyway, I'm on board with delivering issue updating separately. Thanks for discussing things through with me.

Regarding:

Yeah, we do our best to keep everything up to date. Might not be immediate, but I think we want to be on Rails 5 shortly after it comes out.

Sounds good. If I'm ready before that happens, I'll go with polling and we can refactor to websockets after the Rails 5 upgrade.

@johndbritton
Copy link
Contributor

Sounds like a good plan to me, thanks for working on this.

@jeremyFreeAgent
Copy link

What about merging that PR?

@rmacklin
Copy link
Contributor Author

@jeremyFreeAgent This PR isn't ready to be merged, per the discussion above. I'll close it so it's not confusing.

@rmacklin rmacklin closed this Oct 13, 2016
@rmacklin
Copy link
Contributor Author

@jeremyFreeAgent That said, I was actually experimenting with this again recently (after another long period of not getting a chance to work on it). Out of curiosity, what are your needs for a feature like this? Do you essentially want the same thing as me where student repositories can be "preloaded" with issues, or is your real need a bit different?

@jeremyFreeAgent
Copy link

@rmacklin It is what you said: preloaded with issues. Currently I use the API to do that.

@obcode
Copy link

obcode commented Jun 2, 2017

@jeremyFreeAgent how do you use the API? I also need this preload-issues-thing, but automatically when a student repo will be created.

@nwoodthorpe
Copy link
Contributor

@obcode we're eventually looking to implement this, but we aren't sure when yet.

For now, you can use Repository Webhooks + our Issues API to preload a repository with issues.

@obcode
Copy link

obcode commented Jun 3, 2017

Okay. Thanks Olli

@jeremyFreeAgent
Copy link

Yes that is the way

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants