add URLs to rules #12

Open
danielnaber opened this issue Sep 16, 2013 · 12 comments

@danielnaber
Member

Any error detection rule in LanguageTool (i.e. the rules in grammar.xml) can have a URL attached that provides more information about the error. Find a rule that could benefit from such a URL and add it. The URL should point to a reputable site where we can assume the page will still be there in a few years.

Requires: knowledge of XML
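
To find candidate rules, something like the following sketch could help. It assumes that a `<url>` element sits directly inside a `<rule>` or `<rulegroup>`, and that a URL on a rulegroup covers the rules inside it. The sample XML, the IDs, and the function name `rules_without_url` are made up for illustration; a real grammar.xml is much larger and may use entities that this simple parser does not resolve.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a grammar.xml file.
SAMPLE = """\
<rules lang="en">
  <category id="DEMO" name="Demo category">
    <rule id="WITH_URL" name="Rule that has a URL">
      <pattern><token>foo</token></pattern>
      <message>Example message.</message>
      <url>https://example.org/explanation</url>
    </rule>
    <rulegroup id="GROUP_URL" name="Group with a shared URL">
      <url>https://example.org/group</url>
      <rule>
        <pattern><token>bar</token></pattern>
        <message>Example message.</message>
      </rule>
    </rulegroup>
    <rule id="NO_URL" name="Rule without a URL">
      <pattern><token>baz</token></pattern>
      <message>Example message.</message>
    </rule>
    <rulegroup id="GROUP_NO_URL" name="Group without any URL">
      <rule>
        <pattern><token>qux</token></pattern>
        <message>Example message.</message>
      </rule>
    </rulegroup>
  </category>
</rules>
"""


def rules_without_url(xml_text):
    """Return IDs of rules and rulegroups that have no URL attached."""
    root = ET.fromstring(xml_text)
    missing = []
    for category in root.iter("category"):
        for item in category:
            if item.tag == "rule":
                if item.find("url") is None:
                    missing.append(item.get("id"))
            elif item.tag == "rulegroup":
                # Assumption: a URL on the group applies to all its rules.
                if item.find("url") is not None:
                    continue
                if any(r.find("url") is None for r in item.findall("rule")):
                    missing.append(item.get("id"))
    return missing


if __name__ == "__main__":
    print(rules_without_url(SAMPLE))  # ['NO_URL', 'GROUP_NO_URL']
```

Not every ID in that list needs a URL, of course; it is only a starting point for picking rules that would benefit from an explanation.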

@Mailaender
Contributor

I wonder whether this should be a http://wiki.languagetool.org/ sub-page where the rule is explained, its limitations are listed, and the grammar/spelling rule is backed by references to reputable sites.

@milekpl
Member

milekpl commented Apr 27, 2014

Well, we haven't documented the rules in such detail because they are usually improved over the years and writing docs is also time-consuming. I think external references are enough. However, we could host explanations of rules -- including these links -- but it's then a major effort to maintain them.

@PanderOpenTaal
Contributor

The rules in the Dutch grammar.xml are well documented with urls to reputable sites. :)

Here is a proposal to make it easier to review urls in grammar.xml files. Create an XSL that allows the XML to be rendered in a web browser (directly or after XSLT) showing:

  • name or abbreviation of the language
  • datetime of the last edit
  • categories with at least name
  • rulegroups with at least id, name and clickable url
  • rules with at least id, name, messages and clickable url

Upon each change of a grammar.xml file, this HTML page should automatically get regenerated. This can be part of https://languagetool.org/development/
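
As a rough illustration of the overview described above, here is a Python sketch instead of XSL (a swapped-in technique, not the proposed stylesheet). It assumes `<url>` and `<message>` are direct children of `<rule>`, uses a made-up sample, and the names `render_overview` and `link` are hypothetical. It lists only the top-level rules and rulegroups of each category and leaves out the last-edit datetime, which would have to come from the file system or version control.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a grammar.xml file.
SAMPLE = """\
<rules lang="en">
  <category id="DEMO" name="Demo category">
    <rule id="WITH_URL" name="Rule that has a URL">
      <pattern><token>foo</token></pattern>
      <message>Example message.</message>
      <url>https://example.org/explanation</url>
    </rule>
    <rule id="NO_URL" name="Rule without a URL">
      <pattern><token>bar</token></pattern>
      <message>Another message.</message>
    </rule>
  </category>
</rules>
"""


def link(url_el):
    """Turn a <url> element into a clickable link, or '' if absent."""
    if url_el is None or not (url_el.text or "").strip():
        return ""
    url = html.escape(url_el.text.strip(), quote=True)
    return f'<a href="{url}">{url}</a>'


def render_overview(xml_text):
    """Render language, categories, rules and rulegroups as simple HTML."""
    root = ET.fromstring(xml_text)
    out = [f"<h1>Rules for {html.escape(root.get('lang', '?'))}</h1>"]
    for category in root.iter("category"):
        out.append(f"<h2>{html.escape(category.get('name', ''))}</h2>")
        out.append("<table><tr><th>Type</th><th>ID</th><th>Name</th>"
                   "<th>Message</th><th>URL</th></tr>")
        for item in category:
            if item.tag not in ("rule", "rulegroup"):
                continue
            message = item.find("message")
            text = "".join(message.itertext()).strip() if message is not None else ""
            cells = [item.tag, item.get("id", ""), item.get("name", ""), text]
            row = "".join(f"<td>{html.escape(c)}</td>" for c in cells)
            out.append(f"<tr>{row}<td>{link(item.find('url'))}</td></tr>")
        out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    print(render_overview(SAMPLE))
```

A script like this could be run whenever a grammar.xml file changes, so reviewers only need a browser to check the links.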

This is especially handy for editorial people who do not have programming skills or get lost in XML, and it is a way to get them more involved in LanguageTool. I know people who would like to review URLs and propose new ones (to a developer), but at the moment it is too difficult for them to do that on their own.

The original reason I found this issue is the following. I would like to change grammar.xsd to allow more than one URL per rule and rulegroup. Can somebody check whether this has any negative side effects for LT? If there are none, I will submit some pull requests (one for the XSD and one for the Dutch grammar.xml with additional URLs). The rationale is that the Dutch language authority itself has multiple pages per rule or piece of advice.

@danielnaber
Member Author

Just changing grammar.xsd to allow more than one URL will (probably) not work; the Java code needs to be adapted, too. We haven't done that yet because we prefer to give exactly one URL, which is easier for the user. If they want more opinions, they can always google anyway. Aren't the pages of the Dutch language authority linked, or don't they have some logical order?

@ghost

ghost commented Sep 17, 2015

There are multiple sources. I once suggested adding data to the OpenTaal site with one page per rule, showing some text and maybe some sources.

@PanderOpenTaal
Contributor

At the moment there are links to OnzeTaal, which is not the Dutch language authority (Taalunie) but has very important information. The pages of both OnzeTaal and Taalunie themselves link to other relevant pages on their websites. So we do not want to add all of these pages, but for each source we would like to add at least one page.

Instead of one link, that would be at most two or three links. It is better not to do this on the OpenTaal site, as that would introduce an extra dependency and an extra place to manage the links.

@danielnaber, if you point out where the URLs are processed, I would like to help make it possible to offer more than one URL. As an intermediate step, if the XSD allowed multiple URLs in the XML, the current implementation should at least still work with only one URL.
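
The intermediate step could look roughly like this Python sketch. It is not based on the actual Java code; it only assumes a hypothetical rule that carries several `<url>` elements (which the schema discussed here does not yet allow) and shows a consumer that keeps using only the first one. The names `all_urls` and `primary_url` and the example URLs are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule with two <url> elements, as the proposal would permit.
RULE = """\
<rule id="EXAMPLE" name="Hypothetical rule">
  <pattern><token>foo</token></pattern>
  <message>Example message.</message>
  <url>https://example.org/first</url>
  <url>https://example.org/second</url>
</rule>
"""


def all_urls(rule):
    """Return every non-empty URL attached to the rule, in document order."""
    return [u.text.strip() for u in rule.findall("url") if u.text and u.text.strip()]


def primary_url(rule):
    """Return only the first URL, mimicking a consumer that expects one."""
    urls = all_urls(rule)
    return urls[0] if urls else None


if __name__ == "__main__":
    rule = ET.fromstring(RULE)
    print(primary_url(rule))  # https://example.org/first
    print(all_urls(rule))
```

With that behaviour, existing grammar.xml files with a single URL per rule would be handled exactly as before.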

@janschreiber
Contributor

In line 2 of grammar.xml, there is a reference to a stylesheet called
print.xsl
(~\languagetool-core\src\main\resources\org\languagetool\rules\print.xsl).

It should enable users to open grammar.xml in a browser and display the
rules in a way that is much easier to read, with URLs transformed into
clickable links, wrong examples in red, replacements in green, and
correct examples in gray. For some reason, though, it only works in
Internet Explorer for me.

--Jan

@PanderOpenTaal
Contributor

See #312

@davidemdot
Contributor

Hello,

Could I ask what you are looking for at this point? Adding more than one URL per rule, showing them in a specific way (where?)... Please give me more information so I can try to improve it.

@danielnaber
Member Author

Hi @davidemdot, adding more than one URL per rule should rarely be needed. From my point of view, the task is to make sure there is a URL if the rule actually needs more explanation.

@davidemdot
Contributor

OK, thanks. I will take this into account when I add new rules, and I will also check whether any URLs are missing in the Spanish grammar.

linuxscout added a commit to linuxscout/languagetool that referenced this issue Dec 28, 2019
* add more rules

* add ignore

* Add new rules

* update rule to xml

* Add Prefix and suffix extraction

* Update tagger with stemmer for Waw and Pronouns

* Add new tagging system

- Re-build dictionary according to new tagging sets.
- update rules according to new tagset
- Add more tool words in added.
- Implement Arabic tagger to handle enclitics and proclitics for nouns.

* update tagset and dictionary description

* Add first version of disambiguation rules

* Add more rules

* update Arabic tagger

* Add more rules

* update verb dictionary

* Delete rules.css

* Delete rules.css

* Delete rules.xsd

* Delete run4.8.sh

* Delete testrules-arabic.sh
linuxscout pushed a commit to linuxscout/languagetool that referenced this issue Dec 31, 2019