Fix some HTML help output glitches #194

Merged: 3 commits, Sep 22, 2021

@olebole olebole commented Sep 17, 2021

The HTML output of the help system has a number of problems. This PR fixes them by changing the lroff2html parser:

  • Bad intro: the output should start with <!DOCTYPE html>.
  • The title is part of the header, not the body.
  • Comments are marked with <!-- comment -->, not with <! comment >.
  • The NAME attribute of <A> is deprecated and is replaced by ID, which does not have to sit on an <A> tag.
  • Paragraphs are marked with <P> tags, not with <UL> tags.
  • The logic often mixed up <P>, <PRE>, and <DL> tags.
  • Many anchors had the same name, mainly because every level of <DL> tags got an anchor. Anchors are now emitted only for the first level.
  • Tags are now written in lowercase.
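
To make the target conventions concrete, here is a minimal Python sketch of a page skeleton that follows them. It is not the lroff2html code (which this PR changes); the function name `render_help_page`, the use of `<h2>` for section headings, and the way ids are derived are all assumptions made for illustration.

```python
from html import escape


def render_help_page(title: str, sections: list[tuple[str, str]]) -> str:
    """Render a tiny HTML page that follows the conventions listed above.

    Hypothetical helper for illustration only; not part of lroff2html.
    """
    lines = [
        "<!DOCTYPE html>",                  # page starts with the doctype
        "<html>",
        "<head>",
        f"<title>{escape(title)}</title>",  # title lives in the head
        "</head>",
        "<body>",
        "<!-- generated help page -->",     # well-formed comment syntax
    ]
    for name, text in sections:
        # Assumption: the id is the lowercased section name.
        anchor = escape(name.lower().replace(" ", "_"))
        # The id attribute sits directly on the heading, no <a name=...>.
        lines.append(f'<h2 id="{anchor}">{escape(name)}</h2>')
        lines.append(f"<p>{escape(text)}</p>")  # paragraphs use <p>
    lines += ["</body>", "</html>"]
    return "\n".join(lines)


page = render_help_page("imcopy", [("USAGE", "imcopy input output")])
print(page)
```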

Checking the output with a simple validating parser (lxml.etree.HTMLParser) also helped to find a number of problems in the .hlp files:

  • forgotten double quotes
  • double quotes used as the arcsec unit symbol (they confuse the parser, which starts a <tt> at each double quote)
  • format instructions within double-quoted strings (they confuse the parser because they are copied verbatim)
  • junk at the end of file

These are all corrected as well.
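
The quote problems can also be spotted directly in the source text. The following Python sketch is a stand-in heuristic, not the lxml-based check used for this PR: it flags lines with an odd number of double quotes, assuming that quoted strings in a .hlp file open and close on the same line. The function name is hypothetical.

```python
def lines_with_unbalanced_quotes(text: str) -> list[int]:
    """Return 1-based numbers of lines holding an odd number of double quotes.

    Assumption: a quoted string never spans more than one line.
    """
    suspicious = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.count('"') % 2 == 1:
            suspicious.append(number)
    return suspicious


sample = 'The "input" image is read first.\nA seeing of 2" is assumed.\nNo quotes here.\n'
print(lines_with_unbalanced_quotes(sample))
```

In the sample, the second line uses a double quote as the arcsec symbol, so it is the one reported.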

With these changes, the output for a correct input file is (almost) correct. The only remaining systematic problem is that some <DL> tags still share the same id. This follows from the simple logic behind them: the id is just the first word of the term, which is not unique, for example, in the REVISIONS section (where the first word is always the task name).
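
The id collision is easy to reproduce. The Python sketch below mimics the "first word of the term" rule and then shows one possible way to disambiguate repeated ids with a numeric suffix. The suffix scheme is an assumption for illustration and is not part of this PR; both function names and the sample terms are hypothetical.

```python
from collections import Counter


def first_word_ids(terms: list[str]) -> list[str]:
    """Derive an id from each (non-empty) term by taking its first word."""
    return [term.split()[0] for term in terms]


def unique_ids(terms: list[str]) -> list[str]:
    """Like first_word_ids, but append -2, -3, ... to repeated ids.

    Assumption: no term's first word already looks like 'name-2'.
    """
    seen = Counter()
    ids = []
    for base in first_word_ids(terms):
        seen[base] += 1
        ids.append(base if seen[base] == 1 else f"{base}-{seen[base]}")
    return ids


# Terms as they might appear in a REVISIONS section (made-up values).
terms = ["IMCOPY V2.11", "IMCOPY V2.10", "input"]
print(first_word_ids(terms))
print(unique_ids(terms))
```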

The resulting (user oriented) HTML help pages are here: https://iraf.readthedocs.io/

@olebole force-pushed the fix_html branch 6 times, most recently from 3d92b63 to 9e17a56, September 20, 2021 15:17
 * Bad intro (should start with <!DOCTYPE html>)

 * The title is part of the header, not the body.

 * Comments are marked with <!-- comment -->, not with <! comment >

 * the NAME attribute of <A> is deprecated and should be replaced by
   ID, not necessarily of an <A> tag.

 * Paragraphs are marked with <P> tags, not with <UL> tags

 * The logic often mixed <P>, <PRE>, <DL> tags

 * Many anchors had the same name; especially since all levels of <DL>
   tags were armed with an anchor. This is reduced to only the first
   level.

 * use lowercase tags
The problems in the .hlp files are mostly:

 * forgotten double quotes

 * usage of double quotes instead of arcsec unit (they confuse the
   parser as the parser starts a <tt> here)

 * format instructions within double-quoted strings (they confuse the
   parser as they are copied verbatim)

 * junk at the end of file

 * forgotten .le, .fi, or extra .nf
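
The directive problems in the last item can be checked mechanically. This Python sketch scans a .hlp text for the three cases named there: an `.ls` block never closed by `.le`, an `.nf` never closed by `.fi`, and an `.nf` issued while no-fill mode is already active. It assumes directives start in column one and treats them case-insensitively; the function name is hypothetical and this is not the check used in the PR.

```python
def check_directive_balance(text: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for unbalanced lroff directives."""
    problems = []
    depth = 0          # number of currently open .ls blocks
    in_nofill = False  # True between .nf and .fi
    last = 0           # number of the last line seen
    for last, line in enumerate(text.splitlines(), start=1):
        if not line.startswith("."):
            continue
        directive = line.split()[0].lower()
        if directive == ".ls":
            depth += 1
        elif directive == ".le":
            depth = max(depth - 1, 0)
        elif directive == ".nf":
            if in_nofill:
                problems.append((last, "extra .nf"))
            in_nofill = True
        elif directive == ".fi":
            in_nofill = False
    # Anything still open at the end of the file was never closed.
    if in_nofill:
        problems.append((last, "forgotten .fi"))
    if depth:
        problems.append((last, f"forgotten .le ({depth} open .ls)"))
    return problems


sample = ".ls input\nThe input image.\n.nf\ncode\n.nf\nmore\n.fi\n"
for number, message in check_directive_balance(sample):
    print(number, message)
```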
@olebole olebole merged commit 0b4d3ce into iraf-community:main Sep 22, 2021
@olebole olebole deleted the fix_html branch September 22, 2021 08:57