Refine definition of KiCad #3743

Alhadis · 2017-07-28T05:14:00Z

What the site currently classifies as KiCad is, according to their documentation, three entirely different formats:

Schematic files: Two different files, same format (and extension). Still in use.
- .sch: 5,501 samples collected out of ~224,968 results
Board files: Original PCB layout format, now superseded by new S-expression format.
- .brd: 2,907 samples collected out of ~107,658 results
Pcbnew layout: Primary PCB format used by KiCad suite, with a syntax based on S-expressions:
- .kicad_pcb: 16,808 results
- .kicad_mod: 284,942 results
- .kicad_wks: 306 results
- fp-lib-table: 4,482 results

Technical specifications of each format are covered in a PDF file, downloadable from KiCad's website.

I collected a fresh harvest of both sch and brd files, with the breakdown graphs included in each silo:

Notes

- KiCad formats are data formats, not programming languages. I'm not sure why @pchaigno chose the latter in "KiCad language with .sch extension" (#2309) / "*.sch is both Eagle and KiCad schematics file" (#2187), but these formats are essentially just lists of coordinates, property lists, and object descriptions. XML has more in common with programming languages than these do.
- A considerable number of sch search results were Scheme files. I ran an additional search to gauge the extension's usage better, and concluded that it's common enough to include it as a recognised Scheme extension. There were numerous Racket files as well, but I don't know the difference between Scheme and Racket, so I left the latter as-is.
- Sample files were sourced from faffing around with vanilla KiCad and gEDA-gaf installations, and me clicking anything that looked like a drawing tool. Needless to say, I couldn't find any samples released under a clear permissive license, so I hacked together my own.
- Scheme sample sboyer.sch released to the public domain (source confirmed here).
- Obligatory sidenote: The harvester.js snippet I wrote to collect public search results has been moved to a Gist. Everything else has been torched and salted, with local copies eradicated by rm -rfP. I've never felt palpable disgust over such sloppy code. NFI what I was thinking or doing when I wrote that bullcrap.

If you want to calculate a summary of unique repositories, you can use the standard Unix toolchain:

# Filter list of unique repositories
# (sort first: uniq alone only collapses adjacent duplicates)
grep -iEoe '^https?://raw\.githubusercontent\.com/([^/]+/){2}' urls.log |
sort -u | sed -Ee 's,raw\.(github)usercontent,\1,i' > unique-repos.log

Update

New samples added from recently-discovered repositories which are, thankfully, released under a permissive license:

Sources shared with #3744.

pchaigno · 2017-07-28T07:21:45Z

lib/linguist/languages.yml

@@ -4056,6 +4078,7 @@ Scheme:
  color: "#1e4aec"
  extensions:
  - ".scm"
+  - ".sch"


No need for a heuristic rule between KiCad Schematic and Scheme?

Hrm, do you think it warrants one? These were the only non-KiCad files marked as KiCad Schematics, and this was the only KiCad file identified as Scheme. All-in-all, I think the classifier's doing a good enough job. =)
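
For context, a content-based rule of the kind being asked about could look roughly like the sketch below. This is not Linguist's heuristics code, just an illustration. It assumes legacy KiCad schematics open with an `EESchema Schematic File Version` header line, and that Scheme sources contain at least one common top-level form. The function name `classify_sch` and both patterns are made up for this example.

```python
import re
from typing import Optional

# Assumption: legacy KiCad schematics begin with this header line.
KICAD_HEADER = re.compile(r"\AEESchema Schematic File Version \d+")

# Assumption: a Scheme file contains at least one of these common forms
# at the start of a line. The list is illustrative, not exhaustive.
SCHEME_FORM = re.compile(
    r"^\s*\((?:define|lambda|let|begin|display)\b", re.MULTILINE
)


def classify_sch(text: str) -> Optional[str]:
    """Guess the language of a .sch file from its content.

    Returns None when neither rule fires, leaving the decision
    to whatever classifier runs next.
    """
    if KICAD_HEADER.match(text):
        return "KiCad Schematic"
    if SCHEME_FORM.search(text):
        return "Scheme"
    return None
```

Returning `None` for anything unrecognised (a gEDA or Eagle schematic, say) keeps the rule conservative: it only overrides the classifier when the evidence is unambiguous.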

Travis shat bricks the moment I submitted this PR... any idea what's up with that? 😕

I think it's due to a recent update of Travis CI's default images. @kivikakk fixed it in her pull request.

I trust she knows what she's doing. 👍 'Cause I don't, haha. Thanks!

(I named one of the samples after her, BTW... or rather, I made a typo and decided to keep it because it was so lulzy)

> (I named one of the samples after her, BTW... or rather, I made a typo and decided to keep it because it was so lulzy)

Awwww, this is as nice as the time I got called "Purveyor of the finest kivikode" by a coworker!

The extent of my Estonian knowledge in one picture, screen-capped from Notes.app:

Also, this.

I've spent the entire day sorting files into folders, I'm getting jittery and restless. 😆

pchaigno

Thanks for the pull request and all the work you put in!

(I wish I still had the time and resolve to do this kind of pull request...)

Alhadis · 2017-07-28T07:48:25Z

Haha, merci! This is only the first half of my research; I'm currently working on the second. =) It's related, but a separate format.

pchaigno · 2017-07-28T07:53:30Z

Soon, a monopoly on GitHub's grammars?

Alhadis · 2017-07-28T07:55:40Z

I have a special project in mind for the future regarding grammars and GitHub. =) But I have an infuriating habit of announcing my intent to start projects before I actually do so, and then losing interest when something else distracts me later on. So I'm saying nothing until I actually do start working on it.

I believe I've alluded to it to you before, though. ;)

pchaigno · 2017-07-29T10:41:33Z

> I believe I've alluded to it to you before, though. ;)

This one? 😃

Alhadis · 2017-07-29T10:58:24Z

In a sense. It won't be a canonicalisation of TextMate, though. I envision it being more of an evolution: building upon every strength the TextMate grammar format has, while addressing the glaring weaknesses that've contributed to its growing fragmentation between editors - no sense of context, zero macro or variable support for grammar authors, and most of all, no support for multiline pattern-matching.

Preserving the existence of a portable, flexible and approachable grammar format is the first half of the project. The other... well, language recognition. Remember my first grammar submission to Linguist, where I mistakenly believed that grammars were needed for classification accuracy? Well, I was damn wrong, but that misconception started an interesting train of thought. If authored carefully, a grammar could reliably identify languages based on the frequency of matches, and the scopes assigned to them. The executable would be passed a file, and a list of grammar scopes to test. It'd then spit out a line-delimited list of multipliers describing how confident it is in each match. So for a start, there's your not-a-language right here:

$ syn something.sol --scopes js,tsx...
0.025 TeX
0.25 JS
...

Secondly, an important characteristic about Synapse (I may as well just spill my damn guts, now nothing will ever get started 😆 ) would be the simplicity of what it does. It doesn't output CSS, or HTML, or anything but an ordered sequence of offsets in a file tagged with human-readable descriptions ("scopes"), which are up to a higher-level implementation like Atom or GitHub to deal with. Hopefully it'll be pure Unix in spirit: think grep or ack with semantic pattern-matching.
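
To make the "ordered sequence of offsets tagged with scopes" idea concrete, here is a minimal sketch. Synapse does not exist yet, so everything in it is hypothetical: the `Span` type, the `tag` function, and the three-pattern toy grammar are all invented for illustration, and the scope names merely imitate TextMate conventions.

```python
import re
from typing import Iterator, NamedTuple


class Span(NamedTuple):
    start: int   # offset of the first matched character
    end: int     # offset one past the last matched character
    scope: str   # human-readable description of the match


# Hypothetical toy grammar: (scope, pattern) pairs, in priority order.
PATTERNS = [
    ("comment.line", re.compile(r"#[^\n]*")),
    ("string.quoted.double", re.compile(r'"[^"\n]*"')),
    ("constant.numeric", re.compile(r"\b\d+(?:\.\d+)?\b")),
]


def tag(text: str) -> Iterator[Span]:
    """Yield non-overlapping spans in the order they occur in the text."""
    pos = 0
    while pos < len(text):
        best = None
        for scope, pattern in PATTERNS:
            m = pattern.search(text, pos)
            # Keep the earliest match; ties go to the earlier pattern.
            if m and m.end() > m.start() and (
                best is None or m.start() < best[1].start()
            ):
                best = (scope, m)
        if best is None:
            break
        scope, m = best
        yield Span(m.start(), m.end(), scope)
        pos = m.end()


if __name__ == "__main__":
    for span in tag('x = 42 # answer "no"\n'):
        print(span.start, span.end, span.scope)
```

The output is deliberately nothing more than offsets and labels. Turning those into colours, HTML, or anything else is left to whatever consumes them.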

There's the idea, now I've jinxed it. 😢

Alhadis · 2017-07-29T11:04:34Z

Gotta finish this off first, though: I've been promising to do it since last September. Nearly there...

Real-time troff rendering with HTML5 canvas technology. 👍 Only taken me 3-4 months, hahahah.

pchaigno · 2017-07-29T16:49:29Z

> If authored carefully, a grammar could reliably identify languages based on the frequency of matches, and the scopes assigned to them. The executable would be passed a file, and a list of grammar scopes to test. It'd then spit out a line-delimited list of multipliers describing how confident it is in each match.

I think that was the approach taken by Pygments to classify files. The issue with this approach is that it's rather costly, in particular if you have a long list of grammars to interpret. It would be nice to have some actual benchmarks for this approach though.

Alhadis · 2017-07-29T16:59:19Z

Nope. See, Pygments uses lexical parsing (IIRC), whereas Synapse grammars use the same nested, regexp-based syntax that TextMate grammars have always used. My grammars are probably the most complex, structured and "semantically authored" grammars out there, and if anybody wants to compete with my belligerently OCD scrutiny, step up to the plate. My Roff grammar is probably the best example of how this can be done well.

There's a damn good reason I refuse to enforce an AST-based approach: artistic freedom. The flexibility of the current format is one thing I refuse to part with, and not everything that receives highlighting is, dare I say, a "language".

Besides, Python is slow as crap and sucks as a language anyway, but oh-its-so-readable-tho.

Alhadis · 2017-07-29T17:02:28Z

Oh, and grammar authors who've taken care to structure their grammars semantically can assign arbitrary "weights" to each pattern that can influence how strongly a successful match impacts language recognition.

function name ( param ) would, for example, hold much greater weight than, say, a floating point literal. Obviously, Synapse would use heuristics to draw a best guess for stuff like this, but allowing authors that degree of control over recognition makes me feel more comfortable than entrusting circuitry to make an informed decision.
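
A rough sketch of how author-assigned weights could feed a confidence score follows. None of this is a real API: the `GRAMMARS` table, the weights, and the `confidence` function are invented for illustration. Normalising the scores so they sum to one is also an assumption of this sketch; the multipliers in the `syn` example earlier in the thread are not normalised that way.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical grammars: each pattern carries an author-assigned weight.
# A function declaration is strong evidence; a float literal is weak.
GRAMMARS: Dict[str, List[Tuple[re.Pattern, float]]] = {
    "JS": [
        (re.compile(r"\bfunction\s+\w+\s*\([^)]*\)"), 5.0),
        (re.compile(r"\b\d+\.\d+\b"), 0.5),
    ],
    "TeX": [
        (re.compile(r"\\[A-Za-z]+\{"), 5.0),
        (re.compile(r"\b\d+\.\d+\b"), 0.5),
    ],
}


def confidence(text: str, scopes: Iterable[str]) -> Dict[str, float]:
    """Score each requested grammar by its weighted match count.

    Scores are normalised to sum to 1.0 (an assumption of this sketch);
    if nothing matches at all, every score is 0.0.
    """
    raw = {
        name: sum(
            weight * len(pattern.findall(text))
            for pattern, weight in GRAMMARS[name]
        )
        for name in scopes
    }
    total = sum(raw.values())
    return {
        name: (score / total if total else 0.0)
        for name, score in raw.items()
    }


if __name__ == "__main__":
    sample = "function area(r) { return 3.14 * r * r; }"
    for name, score in confidence(sample, ["JS", "TeX"]).items():
        print(f"{score:.3f} {name}")
```

In this toy setup the float literal counts equally towards both grammars, so it is the heavily weighted function declaration that tips the result towards JS.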

This is where both of my backgrounds - artist and programmer - t-bone each other in a foaming display of "control freak". Suffice to say I have this shit planned out... writing 20 CSON grammars by hand will do that to ya.

(Nobody believes me when I tell them I have no girlfriend, can you believe it? ＬＥＬ)

EDIT: Oh yeah, another responsibility of Synapse would, of course, be converting between established grammar formats for other implementations. Pygments <-> TextMate <-> Highlights …

Alhadis · 2017-08-02T11:15:59Z

@pchaigno While I'm reminded by #3751, do you feel that Eagle would be better represented as data rather than markup? I suppose there's a grey area where page-description languages overlap with image data, but Eagle seems pretty clearly composed of immutable data...

EDIT: Actually, don't worry. =) We can address this topic in #3751....

Alhadis added 2 commits July 28, 2017 04:57

Refine definition of KiCad language

a37eab0

Add ".sch" as a registered Scheme extension

8728cbc

Alhadis requested a review from pchaigno July 28, 2017 05:32

pchaigno reviewed Jul 28, 2017

pchaigno approved these changes Jul 28, 2017

Add more meaningful samples from real repositories

17a8d30

Alhadis mentioned this pull request Jul 29, 2017

Add Solidity language (take 5) #3560

Closed

6 tasks

Merge branch 'master' to restart Travis build

578a9cc

There's no way I'd do this if I knew this wasn't getting squash-merged.

Alhadis mentioned this pull request Aug 2, 2017

Fix classification of bogus "markup" languages #3751

Merged

Resolve merge conflicts with #3732

c31543a

Alhadis requested a review from lildude August 8, 2017 06:17

lildude approved these changes Aug 8, 2017

Alhadis merged commit dd3d858 into master Aug 8, 2017

Alhadis deleted the kicad branch August 8, 2017 08:47

lildude mentioned this pull request Aug 9, 2017

Release v5.2.0 #3765

Merged

valerionew mentioned this pull request Aug 18, 2017

Kicad #3784

Closed

Alhadis mentioned this pull request Aug 28, 2017

Language stats on repository landing page incorrect #3795

Closed

This was referenced Aug 29, 2017

Change KiCad Board language to KiCad Legacy Layout #3799

Merged

Improve support for non-code languages #3805

Closed

valerionew mentioned this pull request Sep 7, 2017

KiCAD project now being labelled as SourcePawn. #3812

Closed

seppestas mentioned this pull request Sep 19, 2017

Add some linguist-detectable attributes #3806

Merged

Alhadis mentioned this pull request Nov 8, 2017

LInguist is reporting my project as a Jupyter Notebook #3316

Closed

lildude mentioned this pull request Nov 24, 2017

Linguist not recognizing Eagle language #3918

Closed

Alhadis mentioned this pull request Apr 27, 2019

Visual programming language files are incorrectly not classed as programming #4508

Closed

github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024
