Improve heuristic for RUNOFF/Roff *.rno files #4513

Merged (2 commits) on May 21, 2019

Conversation

Alhadis
Collaborator

@Alhadis Alhadis commented Apr 29, 2019

This PR improves the heuristic for differentiating RUNOFF .rno files from Roff. It matches syntax unique to DSR (DIGITAL Standard Runoff), a typesetting system for VMS that descends from the original RUNOFF program and is a distant cousin of Unix troff.
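As a rough illustration of the idea (not the pattern actually merged in this PR), a content-based check might look for markers that only one of the two languages uses. The sketch below assumes two DSR markers (`.!` comment lines and `.END LITERAL`) and one Roff marker (the `.\"` comment); the function name `guess_rno_language` and both pattern sets are hypothetical.

```python
import re

# Hypothetical DSR hints, for illustration only; not the regex from this PR.
RUNOFF_HINTS = re.compile(
    r"^\.!"                       # DSR comment line starts with ".!"
    r"|^\.end\s+lit(?:eral)?\b",  # ".END LITERAL" closes a literal block
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical Roff hint: a comment request, written .\" in the source file.
ROFF_HINTS = re.compile(r'^\.\\"', re.MULTILINE)


def guess_rno_language(text):
    """Return "RUNOFF", "Roff", or None when no marker is found."""
    if RUNOFF_HINTS.search(text):
        return "RUNOFF"
    if ROFF_HINTS.search(text):
        return "Roff"
    return None


if __name__ == "__main__":
    print(guess_rno_language(".! A DSR comment\n.LITERAL\ntext\n.END LITERAL\n"))
    print(guess_rno_language('.\\" A roff comment\n.TH FOO 1\n'))
```

Checking the RUNOFF markers first mirrors the goal of the PR, which is to stop DSR documents from falling through to Roff; returning `None` leaves the decision to whatever classifier runs next.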

Description

While digging through the source code of the Infocom games that were unearthed recently, I noticed a handful of RUNOFF documents (like this one) that were misclassified as Roff rather than as the (even older) RUNOFF language.

I wanted to add the documents as samples to help our classifier, but the Infocom games were published for reference and historical interest only. As stated by the READMEs:

This collection is meant for education, discussion, and historical work, allowing researchers and students to study how code was made for these interactive fiction games and how the system dealt with input and processing. It is not considered to be under an open license.

I know there are fewer licensing restrictions around adding samples than adding grammars, but I get the impression the Infocom files are all encumbered, so I'm treating them as radioactive material.

However, I tested each of the .rno files I found locally, and can confirm they're now being classified correctly. You can test them yourself, if you prefer:

Checklist:

  • I am fixing a misclassified language
    • I have included a new sample for the misclassified language:
    • I have included a change to the heuristics to distinguish my language from others using the same extension.

@Alhadis Alhadis requested review from pchaigno and lildude and removed request for pchaigno May 10, 2019 12:01
Collaborator Author

Alhadis commented May 21, 2019

@lildude Any chance we could get this reviewed/merged before the next release?

Member

@lildude lildude left a comment

Whoops. Sorry I missed this before. I have no idea what a RUNOFF file looks like so I'm gonna take your word for it that this works. 👍

@Alhadis Alhadis merged commit 2bac792 into master May 21, 2019
@Alhadis Alhadis deleted the rno-fix branch May 21, 2019 15:48
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024