Skip to content

Conversation

@cthoyt
Copy link
Collaborator

@cthoyt cthoyt commented Oct 5, 2022

This PR adds the ability to encode the pull request when an ontology was added to the OBO Foundry GitHub repository. It demonstrates how this annotation looks for two ontologies: CLAO and CLYH. It also updates the ontology detail page to display this information.

Benefits of Doing This

  1. Lets you directly find public discussion about the PR when it was added. This gives practical insight into how review was done and by whom.
  2. Side benefit - lets you look up when new ontologies were added.

How to curate more

Figuring out the PR to go with each ontology would be a "good first issue". Here are the steps to figuring this out:

  1. Navigate to the ontology page on GitHub (e.g., https://github.com/OBOFoundry/OBOFoundry.github.io/blob/master/ontology/clao.md)
  2. Click "History" Screen Shot 2022-10-05 at 22 27 04
  3. Scroll to the bottom and click on the title of the oldest commit. This might be paginated, so keep pressing "older" if you see it. Screen Shot 2022-10-05 at 22 27 39
  4. Look on the left to find the pull request page. It will say something like Master #XXXX Screen Shot 2022-10-05 at 22 27 56

This PR adds the ability to encode the pull request when an ontology was added to the OBO Foundry GitHub repository.

It's true that this repository does not well-reflect ontologies that were pulled over from SVN, and therefore the oldest ontologies will not have good entries for this. Please do not let this deter the acceptance of this PR.

This PR will allow an alternate venue of identifying the date added for new repositories. It can be made a requirement for new ontologies later. It can also be a nice curation task as a "good first issue"
@cthoyt cthoyt changed the title Add new metadata field for pull request added Add new optional metadata field for pull request added Oct 5, 2022
Copy link
Contributor

@matentzn matentzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cthoyt awesome. I think this is very useful, and we should merge it. I would note that there is some minor risk involved that you make the issue tracker where the request was made implicit in the metadata, which means that an external consumer of the metadata needs to know the repo that handles the requests. I kinda would prefer, just with an eye for future use of this, using the full URL. This also alleviates the risk of the metadata registry moving to a different repo (ok, that is.. a cosmetic risk, not really realistic). What do you think? I will merge either way, just wanted to check whether you thought about this or not.

@matentzn
Copy link
Contributor

One more thing:

  • I would prefer the NOR issue linked rather than the PR, because it is much more reach with a public discussion then the usually poor metadata pr, which provides a public record of the review as well. You can use then the closing date of the issue as an accurate reflection of the joining date.

@cthoyt
Copy link
Collaborator Author

cthoyt commented Nov 15, 2022

I think it's better to keep just the issue number, because there are several ways it can get used to construct a URL, API call, etc. It would be annoying to have to do text parsing every time you want to use this field.

WRT pull request/issue, the solution I took in the Bioregistry is to track both. In theory, all pull requests should explicitly link back to the issue they close and use a magic commit message so when the PR is merge the issue is closed. It's only implicit how an issue gets matched to the PR that closes it.

Also, since the secondary goal of this field is to look up the actual work done (not the description of work) I still think we should start with tracking the PR.

@matentzn
Copy link
Contributor

I think it's better to keep just the issue number, because there are several ways it can get used to construct a URL, API call, etc. It would be annoying to have to do text parsing every time you want to use this field.

Ok, convinced. I guess when we expose the registry metadata as RDF we can construct an appropriate URL if needed.

Also, since the secondary goal of this field is to look up the actual work done (not the description of work) I still think we should start with tracking the PR.

Ok, not that convinced here, because I don't really see the value of tracking "work done" in this context (the PR is really no "work" in the same sense as other repos, its a quick copy-paste action).

I thought that you are trying to get a backdoor to get "date added"? Is this not your main interest? Given that you can use the close date of the issue for that, wouldn't that give it to you? Minus that, what is the actual value of "look up the actual work done", when "looking about the discussion that led to the acceptance" is clearly of huge value?

@cthoyt
Copy link
Collaborator Author

cthoyt commented Nov 15, 2022

If we were to do it the other way, what instructions would you give to someone to curate them? I was just looking at issues tagged with "new ontology" (see here) and there are only 70. I am worried that this field will not be complete, because it might be the case that there weren't issues for some ontologies in the past wild west times. However, this could also be true for pull requests.

I think the compromise here is still to enable tracking both (then use automated methods as good as possible to keep them complete). I will add the second field

@matentzn
Copy link
Contributor

Ok!

@cthoyt
Copy link
Collaborator Author

cthoyt commented Nov 15, 2022

added accompanying NOR field in a281a73. Here's a screenshot of how the ontology layout page looks now:

Screenshot 2022-11-15 at 14 12 59

@matentzn matentzn merged commit 5075728 into OBOFoundry:master Nov 15, 2022
matentzn added a commit that referenced this pull request Nov 15, 2022
@cthoyt cthoyt deleted the metadata-pr-added branch November 15, 2022 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants