GitHub #1296
Looks great, but too late for me today to run code. Two comments:
Yes, it is possible to look up the author's real name (or whatever is written) on their GitHub page or by calling the API. However, this can easily be a lot of requests, especially if you are thinking about large projects. Please also note that currently I add them all as
Thanks @adam3smith. @zuphilip, I agree that looking up each author takes time, and you might quickly hit the rate limit. Author vs. contributor is tricky; I think this is an area where a
@zuphilip I am fine with merging.
@adam3smith Are contributors useful with respect to CSL?
I have some questions on this. Feel free to disagree, just trying to make sense of what'd work best.
Github.js
Outdated
}

/**
Copyright (c) 2015 Martin Fenner
update date and probably copyright holder here
Github.js
Outdated
"title": "zotero/zotero",
"creators": [
{
"lastName": "dstillman",
these giant contributor lists -- for CSL these would be many hundred people in the styles directory -- don't make much sense to me. I'd tend to say we don't want them. What's the argument for? And wouldn't it make sense to list the account or the organization of the repo as the author?
"What's the argument for [these giant contributor lists]?"
Every contribution is valuable. But I agree that for referencing the software such giant lists of all contributors are often not needed.
"And wouldn't it make sense to list the account or the organization of the repo as the author?"
Well, I would then add the owner information as author for User accounts and as publisher for Organization accounts. This information, together with the full name of the owner, can be retrieved with another API request, e.g. https://api.github.com/users/datacite
Should we treat forks the same way, i.e. the owner of the fork then becomes the author? Maybe this is the best we can do, because forks can sometimes go in different directions and be interesting to cite. In other cases the user should go to the upstream repository and cite that (but I guess we don't have to solve this technically).
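A rough sketch of the owner mapping described above, for illustration only. The function name `applyOwner` is hypothetical, and the single-field creator (`fieldMode: 1`) is an assumption made to keep the example self-contained; a real translator would probably split personal names into first and last name. The `login`, `name`, and `type` properties are the ones returned by the users endpoint mentioned above.

```javascript
// Hypothetical helper (not from the PR): copy owner data, as returned by
// https://api.github.com/users/<login>, onto a computerProgram item.
function applyOwner(item, owner) {
	// Fall back to the login when the account has no display name set.
	var name = owner.name || owner.login;
	if (owner.type === "Organization") {
		// Organizations go into the publisher-like field.
		item.company = name;
	}
	else {
		// Assumption: store the name as a single-field creator.
		item.creators.push({
			lastName: name,
			creatorType: "programmer",
			fieldMode: 1
		});
	}
	return item;
}
```

Called with the record for an Organization account it fills only `company`; called with a User record it adds one creator and leaves `company` unset.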
I think we should acknowledge that it is impossible to generate an appropriate author list without additional input, whether it is a CITATION file or an extra step as in the Zenodo workflow. I think using the owner information is a good compromise, and I would also put an organization as author.
I would say that GitHub is the publisher.
We already have GitHub in the libraryCatalog field. BTW I just saw that Zotero actually names the publisher field for computerProgram items company, which IMO looks to fit well for organizations. I will just finish my new attempt to show the new output...
Github.js
Outdated
"items": [
{
"itemType": "computerProgram",
"title": "zotero/zotero",
would make more sense to take the bit after the last slash here, no? i.e. just zotero in this case? E.g. for DataCite below, imo DataCite should be the author and schema the title. What do you think?
The title could become very short and generic, e.g. schema in the other example. On the other hand, I agree that these combined titles are rather artificial. We could also add the description as title and subtitle, e.g. "zotero - Zotero is a free, easy-to-use tool to help you collect, organize, cite, and share your research sources." (maybe replace the - by : and delete the last .).
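The title idea above could be sketched roughly as follows. `buildTitle` is a hypothetical name, and the choice to return the bare repository name when there is no description is an assumption, not something settled in this thread.

```javascript
// Hypothetical helper (not from the PR): build "name: description" from a
// repository full name such as "zotero/zotero" and its description.
function buildTitle(fullName, description) {
	// Keep only the part after the last slash; with no slash the whole
	// string is kept, because lastIndexOf returns -1.
	var name = fullName.slice(fullName.lastIndexOf("/") + 1);
	if (!description) {
		// Assumption: without a description, use the short name alone.
		return name;
	}
	// Drop one trailing period so the title does not end in ".".
	var subtitle = description.trim().replace(/\.$/, "");
	return name + ": " + subtitle;
}
```

The colon follows the suggestion to replace the hyphen; the short and possibly generic name, e.g. schema, is then always accompanied by the description when one exists.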
Can an organization be an author = programmer? Or is an organization more like in the role of a publisher? Which would be in line with this suggested citation data here https://github.com/datacite/bolognese/blob/master/CITATION
Moreover, I agree, that we should save the information about the owner somewhere...
Ah, these are all difficult questions
Thank you for the comments. I will try a new version as outlined above and we can continue the discussion.
Please have a look at the new version.
Thanks @zuphilip this looks better to me (title and author). For authors that are organizations, do you use
Yeah, this looks great now. Thanks both.
@mfenner My assumption here is that organizational owners of GitHub repos are the companies/organizations the programmers (authors) are working for. AFAIK an organization in GitHub cannot itself provide contributions; it is always the members of that organization. It might be different w.r.t. the copyright of the code. Let me know if you think these assumptions are wrong or too general.
@zuphilip you are right from the perspective of GitHub and contributions to a repo. I think in terms of authorship the organization could be useful, but there are too many assumptions that might not always hold. It is great to see this merged, and thanks a lot for this. Going forward we should see how to enrich with metadata provided by the repo via codemeta.json, CITATION, etc. I think the first step here would be community agreement.
This is a continuation of @mfenner's work in PR #888 and should address all comments from the review there. It uses the usual function definitions, switches to type computerProgram, and now uses two API calls for some basic facts and all contributors. These API calls can be done without authentication, but then there is a rate limit of 60 calls per hour. @mfenner Any comments or tests are welcome. We can also continue from here to cover different methods when the repos are providing some BibTeX data (an example would be helpful for that).
#NTI17