[WIP] Static HTML output with Sphinx #154

Merged
merged 51 commits into from
Apr 6, 2023

n8willis
Owner

This is an attempt to make the document set buildable, locally, as static HTML pages, so that they can be easily used outside of the GitHub web interface.

Obviously there are some bits still to be hammered out, mainly getting the MyST Markdown parser to play nicely with the reStructuredText automatic table-of-contents magic. But, at the moment, the files in the _build/html/ directory should be correctly linked together and readable while offline.
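
For reviewers unfamiliar with the setup, a minimal conf.py sketch for reading Markdown sources through MyST might look like the following. The `myst_parser` extension name and the `exclude_patterns` option are standard Sphinx/MyST configuration; the specific file names excluded here are illustrative assumptions and may not match what this branch actually uses.

```python
# conf.py (sketch): let Sphinx read the Markdown sources through MyST.

# "myst_parser" is the Sphinx extension provided by the MyST-Parser package.
extensions = ["myst_parser"]

# Keep build output and non-documentation Markdown files out of the doc set.
# These file names are placeholders chosen for illustration only.
exclude_patterns = ["_build", "README.md", "BUILD.md"]
```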

Comments and suggestions on the CSS and HTML are welcome! Looking at it, the tables and code blocks need some attention. Other itches probably need scratching, too.

@n8willis n8willis mentioned this pull request Dec 17, 2022
n8willis and others added 16 commits February 26, 2023 15:12
…, exlcuding tt, code, and literal quotations.
…ily tweaks in-browser. Used for 'font-size-adjust' rules.
…ing and exclude-patterns for non-docs MD files to conf.py.
"Sphinx markup" branch merge.
@n8willis n8willis marked this pull request as ready for review March 14, 2023 11:42
Owner Author

n8willis commented Mar 14, 2023

Putting this open for review now!

This branch adds a Sphinx-based local HTML page-set as the output for the documents.

  • Long-term, the goal is to move away from "only usable on GitHub.com" docs.
  • To build these, you'll need Sphinx; there are more detailed instructions & links in the BUILD.md file.
  • When you build them, you end up with a local set of files.
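
As a rough sketch of the build step (BUILD.md remains the authoritative reference), the HTML output can be produced by invoking Sphinx as a module. The helper names below are hypothetical, and the source directory of `.` is an assumption; only the `_build/html` output location comes from this branch.

```python
import subprocess
import sys


def sphinx_html_command(source_dir=".", output_dir="_build/html"):
    """Return the argument list for an HTML build via `python -m sphinx`."""
    # "-b html" selects Sphinx's static HTML builder.
    return [sys.executable, "-m", "sphinx", "-b", "html", source_dir, output_dir]


def build_html(source_dir=".", output_dir="_build/html"):
    """Run the build; raises CalledProcessError if Sphinx exits non-zero."""
    subprocess.run(sphinx_html_command(source_dir, output_dir), check=True)
```

Calling `build_html()` from the repository root should leave the generated pages under `_build/html/`, assuming Sphinx and the extensions listed in BUILD.md are installed.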

Feedback wanted about output

It is important to know about everything that breaks; however, the most important thing at this stage is how well the generated HTML pages work.

There is a lot of tweaking and formatting involved in the page output, and because the formatting affects the semantics of the docs for people who read them, ambiguities and confusing things really need to get caught. So, in all seriousness, no minor nitpick is too small.

A couple of specific things to call out:

The Bengali document previews some deeper changes than the others

You can start there if you want to see those. Namely:

  • It re-titles the <bng2> "shaping stages and steps" links within the documents. Previously (and in the other docs), they just had numbers: "3.3 akhn" or "4.2 pre-base matras". This caused a big problem when trying to add generated numbering for the pages and sections within Sphinx ... which is an important goal for referencing. Internal to each doc, though, the text has always tried to say things like "stage 3, step 3"; the change is that now the HTML sections are called that, too. I think it helps, but let me know if it hurts.
  • It adds a <samp> element wrapper to all strings that represent an input or output sequence. In looking at the markup, I decided this was important. We already have several other varieties of semantically-marked-up material in the same document, like programming tokens and literals (e.g., BASE_POS_LAST or U+200D). Things like "Consonant,Halant,Ra" did not fit into those categories, but they need to be distinguishable because they're important. There are definitely some grey areas where you could make a case for one markup or the other; it might not be possible to please everyone in those cases, but definitely comment on any missed ones you see.
  • It wraps each image in an HTML <figure> element. This lets them have a caption and a number, which I think helps their overall readability, and might be useful to extend to things like tables and big code blocks as well (e.g., regular expressions). The only downside is that the markup to do it in Sphinx looks bad on GitHub's built-in repo view. It could all be search-and-replaced with inline HTML elements instead; I'd like to know which approach people prefer.
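
For context on the figure numbering in the last point: Sphinx numbers captioned figures only when `numfig` is enabled. A minimal conf.py sketch follows; whether this branch uses exactly these settings is an assumption, and the "Figure %s" label is just one possible choice.

```python
# conf.py (sketch): number captioned figures, tables, and code blocks.
numfig = True

# Label format used for figure numbers; "%s" is replaced by the number.
numfig_format = {"figure": "Figure %s"}
```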

It uses local copies of the Source fonts

This branch started out using the Source Serif/Sans/Code family from Adobe via Google Fonts, but the static versions served remotely didn't allow for tweaking the CSS well enough or using some of the features. So, instead, the variable-font files are included in the build's _static tree. They have their own license (OFL), which is included in the folder.
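
A sketch of how locally bundled fonts are typically wired into a Sphinx build is shown here. `html_static_path` and `html_css_files` are standard Sphinx options, but the stylesheet name `fonts.css` is hypothetical and stands in for whatever file holds the @font-face rules in this branch.

```python
# conf.py (sketch): serve locally bundled font files with the HTML output.

# Directories whose contents are copied into the build's _static/ folder.
html_static_path = ["_static"]

# Extra stylesheets, given relative to html_static_path.
# "fonts.css" is a hypothetical file that would hold the @font-face rules.
html_css_files = ["fonts.css"]
```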

Numbering everything is not easy

I believe that having per-section numbering of things is critical to new readers keeping track of where they are and to how people reference or talk about things. It is proving kind of tricky to get Sphinx to follow orders in that regard, however.

In particular, the "shaper" section docs themselves are nested for Indic2 and Arabic-like, and I haven't yet gotten the multitoc-numbering extension to behave. It looks like it's breaking when a double-nested subtree is the top item of a toctree, but I haven't yet tried to dig in. There is an open upstream issue about it here: executablebooks/sphinx-external-toc#89 ... which does not look like it's getting much attention. So if you want to dive into that, separate from all the HTML juggling, that'd potentially be worthwhile.

Regardless, feedback on whether or not the numbering is useful would be helpful (remember to compare Bengali against the other docs, though, because of the change mentioned above).

@n8willis n8willis merged commit 3eae1e8 into master Apr 6, 2023