Adding GitHub topics to repositories #5

Closed
fmichonneau opened this issue Oct 9, 2019 · 14 comments

@fmichonneau

fmichonneau commented Oct 9, 2019

Back in January 2017, GitHub introduced GitHub topics, which make repositories easier to discover across GitHub. Additionally, as the number of lessons within The Carpentries has grown, we have been looking for programmatic ways to identify which repositories within our organizations are lessons (as opposed to tools, templates, etc.). GitHub topics allow us to do this through the GitHub API.
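As a rough illustration of what identifying lessons through the API could look like, here is a minimal Python sketch. It assumes that lesson repositories carry a `lesson` topic and that repository records expose a `topics` list, as GitHub's REST API does. The search URL relies on GitHub's `org:` and `topic:` search qualifiers. The sample records and repository names are hypothetical, not real API output, and no network request is made.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.github.com"


def lesson_search_url(org, topic="lesson"):
    """Build a repository search URL for repos in `org` tagged with `topic`."""
    query = f"org:{org} topic:{topic}"
    return f"{API_ROOT}/search/repositories?" + urlencode({"q": query})


def repos_with_topic(repos, topic):
    """Return the names of repository records whose "topics" list has `topic`."""
    return [r["name"] for r in repos if topic in r.get("topics", [])]


# Hypothetical records shaped like entries of an API repository listing.
sample_repos = [
    {"name": "example-lesson", "topics": ["carpentries", "lesson"]},
    {"name": "example-template", "topics": ["carpentries", "template"]},
    {"name": "example-untagged"},  # no topics set yet
]

print(lesson_search_url("swcarpentry"))
print(repos_with_topic(sample_repos, "lesson"))
```

Filtering on a topic only works once the topics are applied consistently, which is the motivation for standardizing them.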

Some of our repositories already use them; however, we want to standardize GitHub topics across our lesson programs.

The Lesson Infrastructure committee (@carpentries/lesson-infrastructure-committee), following a proposal put together by community member Robert Zinna (@Zinnar), approved a tentative plan for topics for our repositories (carpentries/lesson-infrastructure#8).

Now that we have this repository for RFCs, we are sharing an initial draft (as a Google Spreadsheet) of the different GitHub Topics that would be added to our repositories.

The spreadsheet is organized as follows:

  • repository_url: URL of the GitHub repository
  • repository: name of the repository
  • topics_carpentries: all repositories would get a carpentries topic
  • topics_lesson_program: the lesson program for the repository
  • topics_repo_type: type of repository: lesson, template, infrastructure, etc.
  • topics_tool: the tool taught by the lesson
  • topics_curriculum: the curriculum the lesson belongs to
  • topics_skills: the skills taught in the lesson (several skills are listed for each lesson; each will become an independent topic)
  • topics_language: the human language for the lesson
  • topics_life_cyle: the life cycle for the lesson
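To make the column layout concrete, the following Python sketch turns one spreadsheet row into a flat list of topics. It is only an illustration under stated assumptions: the row is a dict keyed by the column names listed above, multiple skills in one cell are separated by commas (the actual separator in the spreadsheet is not stated here), and values are lowercased with words joined by hyphens. The sample row is hypothetical.

```python
import re

# Columns that contribute topics, named as in the list above.
TOPIC_COLUMNS = [
    "topics_carpentries",
    "topics_lesson_program",
    "topics_repo_type",
    "topics_tool",
    "topics_curriculum",
    "topics_skills",
    "topics_language",
    "topics_life_cyle",  # spelled as in the column list
]


def to_topic(value):
    """Lowercase a cell value and join its words with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def row_to_topics(row, separator=","):
    """Collect the topics of one row, keeping order and dropping duplicates."""
    topics = []
    for column in TOPIC_COLUMNS:
        # A cell may hold several values (e.g. skills); assumed comma-separated.
        for cell in row.get(column, "").split(separator):
            topic = to_topic(cell)
            if topic and topic not in topics:
                topics.append(topic)
    return topics


# Hypothetical row; the values are made up for illustration.
sample_row = {
    "repository": "example-lesson",
    "topics_carpentries": "carpentries",
    "topics_lesson_program": "Software Carpentry",
    "topics_repo_type": "lesson",
    "topics_tool": "python",
    "topics_skills": "programming, data-analysis, loops",
    "topics_language": "english",
    "topics_life_cyle": "stable",
}

print(row_to_topics(sample_row))
```

Empty or missing cells (here, `topics_curriculum`) simply contribute no topic.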

We are requesting feedback and input from Maintainers on these proposed GitHub Topics. You can leave a comment on the Google Spreadsheet, a comment on this GitHub issue, or send us an email at team@carpentries.org.

Timeline

Please comment on this proposal by Friday, 25 October midnight UTC. @fmichonneau, @ErinBecker, and members of the @carpentries/lesson-infrastructure-committee will respond here to answer questions and summarise potential paths forward.

This was referenced Oct 10, 2019
@dvanic

dvanic commented Oct 10, 2019

I'm not sure whether to tag cloud genomics with shell as a skill we teach, as it is what we use but not necessarily explicitly teach per se...

@remram44

remram44 commented Oct 10, 2019

The 'workshop' and 'workshop-website' topics already exist. I recommend using one of those rather than 'workshop-overview', which is not currently used anywhere on the site. It's a bit too descriptive and too specific to the Carpentries' workflow; let's pick something that makes sense in the wider GitHub community.

Similarly, 'spreadsheet' is more popular than 'spreadsheets'; 'image-processing' (or 'computer-vision') is more popular than 'image-analysis', and both have official descriptions. 'data-management' is also a widely used term (in academia/libraries), while 'data-organization' is only used by 1 repository.

'webscraping' has the alternatives 'scraping' (most popular) and 'web-scraping' (my preferred spelling, but less widely used); I'm not sure which should be used. Similarly, 'versioning' is more popular than 'version-control' (though 'version-control' is my preferred spelling too).

@fmichonneau
Author

@remram44 thanks for checking the popularity of the different tags. It's going to be useful to figure out which one to choose.

@fmichonneau
Author

The RFC has been edited and the period for comments has been set to last until Friday, 25 October at midnight (UTC).

@remram44

remram44 commented Oct 11, 2019

the RFC has been edited

What are the changes? Google Docs doesn't let people without edit access view the history 🙁

@ErinBecker
Contributor

ErinBecker commented Oct 11, 2019

@remram44 - The changes that @fmichonneau is talking about are to the GH issue, not to the Google Doc. He updated the "Timeline" to add a comment-by date.

@ErinBecker
Contributor

ErinBecker commented Oct 14, 2019

@dvanic - that's a good question! The same question applies to other lessons that use one technology to teach another (e.g., the Git lesson uses the shell). Any suggestions from @swcarpentry/git-novice-maintainers, @swcarpentry/git-novice-es-maintainers?

Edit: Looks like I can't tag teams from other GH organizations. ☹️

@maxim-belkin

On behalf of the maintainers of the python-novice-inflammation lesson: we'll discuss which "topics_skills" we think are appropriate and get back to you. ETA: a few days.

@ErinBecker
Contributor

Summary of discussion at 1st Maintainer meeting today:

  • When possible, use the more popular topics to help with discoverability
  • We will have a carpentries tag to help people filter to just our lessons, as well as lesson-program-specific tags.
  • Using a hyphen (software-carpentry) or a space (software carpentry) in a topic name gives the same results, but having things as a single word (softwarecarpentry) doesn't.

@ErinBecker
Contributor

Thank you everyone for your feedback on this conversation! @fmichonneau and I will be meeting tomorrow to discuss this feedback and plan a timeline for implementing topics across all of The Carpentries repos.

@ErinBecker
Contributor

Based on feedback, I've updated the Google Sheet. In cases where there were alternatives for the proposed tags, I included multiple alternatives when they were similar in frequency of usage, and included only the more popular topic when it was at least an order of magnitude more frequent. Since we're not limited in the number of topics on a repo, and our primary goal for using this feature is to enhance discoverability, I lean towards using more tags where there is evidence that the community is using both tags.

Summary of changes:

  • workshop-overview (0) --> workshop (1816) (workshop-website has only 28 repos)
  • spreadsheets --> both spreadsheets (103) and spreadsheet (489)
  • image-analysis (444) --> image-processing (6102) and computer-vision (7221)
  • data-organization (1) --> data-management (241)
  • webscraping --> webscraping (1394), scraping (1502), and web-scraping (1093)
  • version-control --> both version-control (402) and versioning (546)
  • data-visualisation --> data-visualisation (204), data-visualization (5003) (in this case, I'm including both spelling options so the repo will be discoverable by folks who use either set of English spelling conventions)
  • publication --> publication (145) and publishing (434)
  • regular-expressions --> regular-expressions (302) and regex (1603)
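The selection rule described above (keep alternatives that are similar in frequency, and drop those that the most popular one outnumbers by at least an order of magnitude) can be sketched in Python as follows. The function name and the `factor` parameter are illustrative choices, not part of the original process; the counts are the repository counts quoted in the list.

```python
def choose_topics(counts, factor=10):
    """Keep every alternative that the most popular one outnumbers by less than `factor`.

    `counts` maps each candidate topic to the number of repositories using it.
    """
    if not counts:
        return []
    most_popular = max(counts.values())
    # Drop a topic when the most popular alternative is at least `factor` times
    # more frequent; a topic with zero uses is therefore always dropped.
    return [topic for topic, n in counts.items() if n * factor > most_popular]


print(choose_topics({"workshop-overview": 0, "workshop": 1816, "workshop-website": 28}))
print(choose_topics({"spreadsheets": 103, "spreadsheet": 489}))
print(choose_topics({"image-analysis": 444, "image-processing": 6102, "computer-vision": 7221}))
print(choose_topics({"webscraping": 1394, "scraping": 1502, "web-scraping": 1093}))
```

Note that applying this rule mechanically would drop data-visualisation (204 versus 5003); the list keeps it on purpose so that both English spellings remain discoverable.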

@maxim-belkin

maxim-belkin commented Oct 30, 2019

For SWC/Python-novice-inflammation lesson, could you please add the following topics?

programming
data-analysis
data-visualization
automation
functions
loops
matplotlib
numpy
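For reference, a request like this could be applied through GitHub's REST API, whose `PUT /repos/{owner}/{repo}/topics` endpoint replaces the full set of topics on a repository. The Python sketch below only builds the request and does not send it. Because the endpoint replaces rather than appends, it first merges the requested topics with the existing ones. The repository path `swcarpentry/python-novice-inflammation` and the existing topics are assumptions made for the example.

```python
import json

REQUESTED = [
    "programming",
    "data-analysis",
    "data-visualization",
    "automation",
    "functions",
    "loops",
    "matplotlib",
    "numpy",
]


def merged_topics(existing, requested):
    """Append the requested topics to the existing ones, skipping duplicates."""
    merged = list(existing)
    for name in requested:
        if name not in merged:
            merged.append(name)
    return merged


def topics_request(owner, repo, names):
    """Describe the PUT request that would replace the repository's topics."""
    return {
        "method": "PUT",
        "url": f"https://api.github.com/repos/{owner}/{repo}/topics",
        "body": json.dumps({"names": names}),
    }


# Hypothetical existing topics on the repository.
existing = ["carpentries", "python"]
request = topics_request(
    "swcarpentry",  # assumed organization for the "SWC" lesson
    "python-novice-inflammation",
    merged_topics(existing, REQUESTED),
)
print(request["method"], request["url"])
print(request["body"])
```

Sending the request would additionally require an authenticated client with write access to the repository.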

@fmichonneau
Author

The topics have been added to the lessons.

@remram44

Did GitHub remove the pagination from the topic pages? I see 8 repos on each topic page with no way to browse more?


5 participants