Redesign of the Perl 6 Documentation System

Student: Antonio Gámiz Delgado [antoniogamiz10@gmail.com]
Project Idea: Based on A redesign of the Perl6 Documentation System

Abstract

Currently, pod6 files are processed by various scripts and modules (htmlify.p6, Pod::To::HTML, Pod::To::BigPage, ...) that have redundant functionality, a low level of testing, and tight coupling between presentation rendering and source data. The pod6 files are even compiled several times.

What do I intend to do to change that? I have three objectives:

Make the system more stable, so that changes do not provoke undesirable side effects.
Speed up the build process, so that changes can be made faster.
Bring together the work that several contributors have done on the docs during the last few years.

Deliverables

mini-docs repository:
- Find a suitable doc subset (conditions described in the Mini-doc repository section).
- Make a new repository in the Perl6 org, add the files and document it.
- If everything is correct, close issue #2529.
Link Health tool:
- Functionality to scrape recursively all links in docs.perl6.org.
- Functionality to store and classify all links in folders by status code.
- Functionality to compare links between different executions and throw appropriate warnings.
- Document all utilities made.
- Publish in Perl6 Ecosystem.
doc/lib/* Spinning-off and Cache System:
- New independent Perl6::Documentable and Perl6::Type modules, documented and tested using mini-doc.
- Perl6::Documentable::Registry:
  - in-memory-cache support.
  - Dependency tree.
  - Extensive documentation plus tests.
Pod::To::HTML and GitHub:
- Error fixing in Pod::To::HTML.
- Use templates instead of hardcoded HTML code.
- Functionality to read pod6 files using Perl6::Parser.
- Extensive documentation plus test coverage for correct HTML generation.

Timeline

May 6-27: During the Community Bonding period I will learn more about the Perl6 ecosystem (testing methods, standards and code base structure) and get to know the community itself. I may also try to make some early progress, because I have university exams during the first two weeks of June. I will make up for that time during the rest of the coding period.
May 27 - June 16 (3 weeks): Minidoc repository plus Perl6::LinkHealth development.
June 17 - July 14 (4 weeks): Spinning off lib modules plus Perl6::Documentable::Registry in-memory-cache support.
July 15 - August 11 (4 weeks): Pod::To::HTML fixing plus rendering of pod files on GitHub (hopefully).
August 12 - August 19 (1 week): accommodate delays, complete documentation, revise tests and clean up.

Note: you will be able to follow the progress of the project at all times in this repository, as I'll be keeping a journal of everything I do. It will be updated every two or three days.

Implementation

All these steps have one target: A new documentation system, which some people in the community have already started. Currently, the doc repository contains several modules that could be independent. In addition, test coverage is severely lacking, so I will try to change that, develop new tools to improve the documentation process and upgrade existing ones to make them faster.

Mini-doc repository

Currently, site-generation tools are tested by generating the entire site. This takes a long time, and if a test shows an error, you have to wait just as long again to check whether it has been fixed. Thus, a mini-doc repository will be made (as discussed in #2529).

This repository will contain a self-contained subset of the current doc folder. The mini-doc repo needs to fulfil some conditions. It has to be:

Big enough to cover most of the use cases.
Small enough to be lightweight, because this repo is expected to be downloaded by the site-generating tools to run the tests.
Self-contained: this means that a doc site can be generated from these files. For instance, Mu, Cool and Any could be chosen.

The repo structure will be something like:

doc/
  Language/
    *.pod6
  Programs/
    *.pod6
  Type/
    *.pod6

At the beginning, this repo will be an exact clone of the current doc repository. Once we have found a suitable subset and checked that a doc site can be generated from it without any problems, every tool and file except the *.pod6 files will be deleted.

Link Scraper

Link problems have been recurrent for a long time: see issues like #561 (with top priority), #1825 (404 errors) or #585 (doubled links). As a result, we need a link scraper to gather all existing links in docs.perl6.org in order to know how many links are failing and why. This scraper will be run each time an important change is made to the main doc repo, to make sure that the number of broken links goes down, or at least stays constant, between changes, and to track several errors.

I will use the checklink tool and Cro::HTTP to check link health. The process will start at the doc main page and look for new links recursively.
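The recursive discovery described above could be sketched roughly as follows. This is only an illustration in Python, not the planned Perl6 implementation with Cro::HTTP. The names `crawl`, `extract_links`, `fetch_page` and `site_prefix` are hypothetical, and `fetch_page` stands in for the real HTTP client (it returns the page HTML, or `None` when the page cannot be read).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) linked from one page."""
    parser = LinkExtractor()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.hrefs]


def crawl(start_url, fetch_page, site_prefix):
    """Breadth-first walk; returns {url: page where it was first found}."""
    found_on = {start_url: None}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if not url.startswith(site_prefix):
            continue  # external links are recorded but not followed
        html = fetch_page(url)
        if html is None:
            continue  # unreadable page: nothing to extract
        for link in extract_links(url, html):
            if link not in found_on:
                found_on[link] = url
                queue.append(link)
    return found_on
```

Recording the page where each link was first found gives the site_where_the_link_was_found field for free, and the visited dictionary keeps the walk from looping on pages that link to each other.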

The output of this tool will be stored in a directory called links, which will have the following structure:

links/
  200/
    info.csv
  404/
    info.csv
  xxx/ # whatever http error code
    info.csv
  all/
    hh_dd_mm.csv

Each info.csv file (csv format has been chosen, but support for json could be considered) will contain all links that returned the status code given by its directory name. Each line of these files will look like:

link, status_code, response_message, site_where_the_link_was_found

The fields status_code, response_message and site_where_the_link_was_found are stored to provide some debugging information about problematic links.
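As a rough sketch of the storage layout above, the results of one run could be grouped by status code and written out like this. It is written in Python for illustration only; the function name `write_by_status` and the tuple layout of `results` are assumptions that simply mirror the four fields listed above.

```python
import csv
from collections import defaultdict
from pathlib import Path


def write_by_status(results, root):
    """Write one info.csv per status code under root/<status_code>/.

    Each item of `results` is a tuple:
    (link, status_code, response_message, site_where_the_link_was_found)
    """
    grouped = defaultdict(list)
    for row in results:
        grouped[row[1]].append(row)  # row[1] is the status code
    for status, rows in grouped.items():
        folder = Path(root) / str(status)
        folder.mkdir(parents=True, exist_ok=True)
        with open(folder / "info.csv", "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
    # Names of the status folders that were written, e.g. ["200", "404"]
    return sorted(str(status) for status in grouped)
```

Using a csv writer rather than joining fields by hand keeps links that contain commas from corrupting a line.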

In addition, to keep track of all existing links, this tool will handle an extra folder: all. This directory will contain csv files whose names follow the format time_day_month. Each of these files includes all the links found in one execution. Each time a new execution finishes, its file will be compared with the previous one, and a warning will be thrown if the number of links is lower (or greater) than before, to check whether some links have been lost.
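The comparison between two executions could look something like the following sketch (Python, for illustration only). The function name `compare_runs`, the shape of the report and the wording of the warnings are all hypothetical.

```python
def compare_runs(previous_links, current_links):
    """Compare the links of two executions and describe what changed."""
    previous, current = set(previous_links), set(current_links)
    report = {
        "lost": sorted(previous - current),  # present before, missing now
        "new": sorted(current - previous),   # not seen in the previous run
        "warnings": [],
    }
    if len(current) < len(previous):
        report["warnings"].append(
            f"link count dropped from {len(previous)} to {len(current)}"
        )
    elif len(current) > len(previous):
        report["warnings"].append(
            f"link count grew from {len(previous)} to {len(current)}"
        )
    return report
```

Listing the lost and new links next to the count warning is a design choice made for this sketch: a count alone cannot show the case where one link disappears and another one appears in the same run.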

We can publish this tool as a health checker specialized in the Perl6 docs (maybe Perl6::LinkHealth would be a good name).

doc/lib/* Spinning-off and Cache System

Right now, there are several modules defined in the lib folder that can be split off into independent modules in the Perl6 Ecosystem. As issues #1937 and #2573 say, Perl6::Documentable, Perl6::Documentable::Registry and Perl6::Type need a test suite covering most of the use cases (currently there are scarcely any tests). Moreover, documentation for these modules barely exists, so new people who need to change or fix something in them (like me) have to guess what to do. Hence, detailed documentation will be written for them.

Moreover, the current cache system relies on precomp pod6 files, which are read and then used by the tools (in htmlify.p6 using Registry: line). So the question is: why handle precomp files, which need to be read each time they are used, instead of handling everything in memory?

So, this is the plan:

Take Perl6::Documentable and Perl6::Type out of the main repo into independent modules, document them, use the mini-doc repository to test them faster and lastly integrate them again.
Take Perl6::Documentable::Registry apart and add in-memory-cache support, using a dependency tree to invalidate all files affected by a change and recompile only those files instead of the whole set (maybe using Pod::Load).
Add tests:
- First set of tests: this will cover that each function behaves as expected.
- Second set of tests: checking that the dependency tree invalidates and recompiles the correct pod6 files.
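To make the invalidation idea in the plan above more concrete, here is a minimal sketch of an in-memory cache with a dependency tree. It is written in Python purely for illustration and is not the design of Perl6::Documentable::Registry. The class name `DependencyCache`, its methods and the notion that one pod6 file "depends on" another are assumptions of this sketch.

```python
from collections import defaultdict, deque


class DependencyCache:
    """In-memory cache that drops every entry affected by a change."""

    def __init__(self):
        self._dependents = defaultdict(set)  # file -> files that depend on it
        self._compiled = {}                  # file -> compiled result

    def add_dependency(self, source, depends_on):
        """Record that `source` must be rebuilt when `depends_on` changes."""
        self._dependents[depends_on].add(source)

    def store(self, name, compiled):
        self._compiled[name] = compiled

    def cached(self):
        return set(self._compiled)

    def invalidate(self, changed):
        """Drop `changed` and everything that depends on it, transitively.

        Returns the set of files that now need to be recompiled.
        """
        stale = set()
        queue = deque([changed])
        while queue:
            name = queue.popleft()
            if name in stale:
                continue
            stale.add(name)
            queue.extend(self._dependents[name])
        for name in stale:
            self._compiled.pop(name, None)
        return stale
```

The second set of tests mentioned above would then amount to checking that `invalidate` returns exactly the affected files and leaves the unrelated cache entries untouched.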

If all of these steps have been carried out correctly, integrating the new modules with the main repo should only be a matter of installing them and changing some paths.

`Pod::To::HTML` and GitHub

As you can see in issue #55, Pod6 came close to being rendered to HTML on GitHub, but this was not possible because pod files have to be compiled. However, a new parser for Perl6, developed by Jeff Goff, has been released. This parser could be used to process pod files without executing anything in them. Hence, using this new parser, we could get GitHub to render pod files!

This would mean that we will have to use Pod::To::HTML to render pod files, but this module currently has some problems. We first need to solve them, and maybe start using templates (Template::Mustache) in the rendering process, as Richard Hainsworth has done in Pod::Render.

Finally, this is the plan for this part:

Fix the most important errors in Pod::To::HTML.
Start using Mustache templates.
Create a test suite covering each render function (to avoid the HTML errors currently present in several pages of the docs).
Add necessary functionality to Pod::To::HTML to read pod6 files using Perl6::Parser.
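The template step in the plan above is about moving HTML out of the rendering code. The sketch below shows the general idea in Python, with the standard library's `string.Template` as a stand-in for Mustache (it is not Template::Mustache and not how Pod::To::HTML works). The template text and the function `render_heading` are hypothetical.

```python
from html import escape
from string import Template

# The markup lives in one place, separate from the rendering logic.
HEADING = Template('<h$level id="$anchor">$text</h$level>')


def render_heading(level, text):
    """Render a heading, escaping the text so it cannot break the HTML."""
    anchor = escape(text.replace(" ", "_"), quote=True)
    return HEADING.substitute(level=level, anchor=anchor, text=escape(text))
```

With the markup isolated like this, a test for each render function only has to compare a small input with the expected HTML string.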

Closing

At the end, in my opinion, it will be necessary to write up the state of the doc system and tooling in order to plan what to do next. Hence, I will write a post in the wiki section of the doc repo explaining the work done so far.

About me

I'm currently studying for a double degree in Computer Science and Mathematics at the University of Granada, Spain. I have experience with C++, C, Java, JavaScript, CSS, HTML, SQL, Python, Ruby and Perl6!

You can check my activity in the community here. I have participated in a couple of squashathons to complete docs! Moreover, as a future mathematician, I'm also writing my first math module in Perl6! You can check it out here.

From time to time I give some talks (like this). I'll be talking about privacy and security on the net at JASYP 2019! I will also organize a Perl, Perl5, Perl6 devroom at the esLibre congress!
