Add Jsonnet language #2653
I'm not from GitHub staff but I don't think there are enough files to add Jsonnet as a language. As per the contribution guidelines:
I only count 12 repositories across 10 users in your search example ;)
Can someone from GitHub staff clarify on the requirements for adding a language? On #2348, @bkeepers said the following:
which seems to imply that the requirement is having hundreds of files for a given language. One of the accepted languages, JSONLD (added in #957), appears to have over 15k files, but I only count ~18 repositories, with a few (~3) of them containing the bulk of the .jsonld files: https://github.com/search?p=1&q=extension%3Ajsonld+NOT+nothack&ref=searchresults&type=Code&utf8=%E2%9C%93
@davidzchen Note that there's a difference between adding a new language and adding a new extension to an already supported language. I believe @bkeepers was referring to the latter. Also, the requirements have changed slightly over time, so the quote may be outdated.
@larsbrinkhoff That is a fair point. It definitely makes sense to have different requirements for adding an extension and adding a language since the implications on the classifier and computation resources will be different. In any case, it would be good to get some clarification from GitHub Staff about the requirements for both cases. There has been some discussion on this in #2657 as well, but as pointed out on both issues, having hundreds of repositories does not always seem to be a hard requirement, and other factors, such as the unlikelihood of collisions for a given file extension, are also considered. In this case, I will leave it to GitHub Staff to make the final decision on whether the Jsonnet language can be added at this point or whether we should wait until there are more files in more repositories.
Ping! It'd be good to know exactly how much adoption you're looking for. Note that the jsonnet repo has > 400 stars.
Rebased and resolved merge conflicts.
Thanks for the pull request, @davidzchen! My preference would be to wait until there is a little more usage in the wild. There are a lot of samples, but they are in just a handful of repositories. Let's revisit in 3 months and see what it looks like.
Thanks for your feedback, @bkeepers. Sounds good, let's revisit this in 3 months.
@bkeepers It has been 3 months since this PR was first reviewed. Do you think there is now sufficient usage to allow adding Jsonnet to Linguist?
It looks like there is a little more usage, but it is still only in tens of repositories instead of hundreds. My preference would be to wait longer, but I'll defer to @arfon and others who have been more active in Linguist recently.
It's now 16 repositories among 14 users.
Not that I know of. I use a script which iterates on all pages for Code results... :/
Can you share the script? :)
Feel free to submit a PR to add it to |
Bump. There are tons of results for |
These days you'd have to also count the .libsonnet files although they're probably in the minority.
I counted 46 repositories from 39 users for |
For what it's worth, I suspect that several GitHub Enterprise users would benefit from this support. Jsonnet has had a fair bit of uptake as a config language within various companies, where the results don't generally show up in public repositories.
I'm a fan of this change as well. Are private and enterprise repositories being included in the counts? Jsonnet is actually quite nice for stuff like config files, which are more likely to live in a private repo than to be open sourced.
.libsonnet and .jsonnet together now have > 1000 results (https://github.com/search?utf8=%E2%9C%93&q=extension%3Alibsonnet+NOT+djhfjdhfdhfd&type=Code). If this is sufficient usage, we'll refresh the PR.
@bkeepers Do you have any specific guidance here? If the answer is still no, then it would be nice to have a clear usage goal so that we know when we can bother you again. :)
The same thing applies as before: we need to see large scale in-the-wild usage with the CONTRIBUTING.md suggesting "[i]n most cases we prefer that extensions be in use in hundreds of repositories before supporting them in Linguist." Using a very crude script (I'll tidy it up and add it to this repo at some point) to search using the API... ... a search for the
... and for the
As you can see, this is still not popular enough.
Ooops, found a bug in my crude script which meant the |
@lildude that's good info, but what I'm really wondering is when we should come back. It sounds like you're saying to come back when there are 200 unique repositories?
We can't really put it down to precise figures. It's more by general feel based on a combination of number of repos and uniqueness of the files, spread across the users. 200 copies (as opposed to forks) of the same repo is easy to achieve with 5 people. It doesn't make it widespread usage. It's also quite common to see a lot of forks of the same repo for a new language with little variation in each fork, which is especially common in education environments. Yes, the number of repos is high, but the variation and real-world usage isn't. I'd love to be able to precisely quantify this as we could easily write a test for it, but we can't right now due to limitations of the API.
@lildude Did you ever check in your script? I'd like to see where we are now (6 months later). Thanks!
@sparkprime Nope, and I can't as it triggers GitHub's abuse controls unless I use a whitelisted token, which I obviously can't share, and I don't feel comfortable sharing an abusive script 😄.
Any chance you can abuse GitHub for me? :)
Sure.
Total files found: 2934
Total files found: 418
The API only returns 1000 results, so a few repos can dramatically affect the total number of unique repos and users by increasing the number of files they have with that extension. The totals should also be taken with a pinch of salt as they'll include files that clearly aren't the desired language but have the same extension.
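To make the caveat about the 1000-result cap concrete, here is a minimal sketch of how unique repos and users might be tallied from search results. It is not the script used in this thread (which was never shared). The item shape (`repository.full_name`, `repository.owner.login`) is assumed to resemble code search results, and `tally`, `fake_item`, and the sample data are hypothetical.

```python
from typing import Dict, List

RESULT_CAP = 1000  # the thread states the API only returns 1000 results


def tally(items: List[dict], cap: int = RESULT_CAP) -> Dict[str, int]:
    """Count files, unique repos and unique users among the first `cap` items."""
    repos = set()
    users = set()
    visible = items[:cap]  # anything past the cap is never seen
    for item in visible:
        repo = item["repository"]
        repos.add(repo["full_name"])
        users.add(repo["owner"]["login"])
    return {"files": len(visible), "repos": len(repos), "users": len(users)}


def fake_item(owner: str, repo: str, path: str) -> dict:
    """Build a made-up search result item (assumed shape, illustration only)."""
    return {
        "path": path,
        "repository": {"full_name": f"{owner}/{repo}", "owner": {"login": owner}},
    }


# One prolific repo listed first, followed by three small repos.
sample = [fake_item("alice", "big", f"f{i}.jsonnet") for i in range(5)]
sample += [
    fake_item("bob", "a", "x.jsonnet"),
    fake_item("carol", "b", "y.jsonnet"),
    fake_item("bob", "c", "z.jsonnet"),
]

print(tally(sample))         # every result visible
print(tally(sample, cap=5))  # the cap hides the three small repos
```

With a small cap, the prolific repo fills every visible slot, so the unique repo and user counts drop even though nothing was removed from the underlying data.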
Thanks! I'm surprised that the number of repos has gone down from 79 to 74. Are you doing more de-duping of repos now than last time you ran this?
Nope. I noticed this too, hence I added the extra qualifying paragraph. In short, more active users with more files can push less active users/repos out of the first 1000 search results, and that is probably what has happened here.
Ah, sorry I didn't get that the first time around :) I just did a bunch of searches using the API that split the space into portions < 1000 files, e.g. by doing one with "NOT params" and one with "params". This was fairly arduous and non-automatic but it got me the following stats:
Number of repos: 159
See you in another 6 months :)
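The manual splitting described above could be sketched roughly as follows. This only builds the query strings and merges repo names; it makes no API calls. `partition_queries` and `count_unique_repos` are hypothetical helpers, and whether the search backend accepts every term/NOT combination, or whether each slice really stays under 1000 files, is an assumption that has to be checked per query.

```python
from itertools import product
from typing import Iterable, List, Sequence


def partition_queries(base: str, terms: Sequence[str]) -> List[str]:
    """Return 2**len(terms) queries; each file matching `base` matches exactly one."""
    queries = []
    for signs in product((True, False), repeat=len(terms)):
        parts = [base]
        for term, include in zip(terms, signs):
            # Either require the term or exclude it, e.g. "params" / "NOT params".
            parts.append(term if include else f"NOT {term}")
        queries.append(" ".join(parts))
    return queries


def count_unique_repos(result_sets: Iterable[Iterable[str]]) -> int:
    """Union repo names across slices; a repo may appear in several slices."""
    repos = set()
    for names in result_sets:
        repos.update(names)
    return len(repos)


print(partition_queries("extension:jsonnet", ["params"]))
# Made-up repo names standing in for the results of two slices.
print(count_unique_repos([["a/x", "b/y"], ["b/y", "c/z"]]))
```

The slices partition files, not repos, so the merge step has to de-duplicate repo names rather than add up per-slice counts.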
Can we get this now? 😄
Another 6 months have passed. Can we reopen this yet?
I've added the |
Bump. Popularity is increasing. @pchaigno, please add .libsonnet as well and add it to the Jsonnet count.
The |
It will need updating, I think, as there is new syntax from the last 5 years. The .libsonnet extension should also be included, as probably more files are written with it. I'm not sure it would have changed the number of active repos much though, since if a repo has a libsonnet file it probably has a jsonnet file in it as well. Think of it as like
I made a new PR: #4455
Jsonnet is a functional, formally-verified configuration generation language for JSON.
Examples in the wild: https://github.com/search?utf8=%E2%9C%93&q=extension%3Ajsonnet+NOT+nothack&type=Code&ref=searchresults
Documentation: https://google.github.io/jsonnet/doc/
Corresponding issue: google/jsonnet#43