Refactor documentation generator to be modular #262

@coolharsh55

Problem: The documentation generator currently consists of three scripts: 100.py downloads CSVs, 200.py produces RDF, and 300.py produces HTML. Apart from the CSV step, each script regenerates outputs for all configured extensions, so git shows modifications to files whose content has not actually changed (RDF serialisations are not consistent in structure or blank nodes). Committing changes then requires manual work to discard the unwanted modifications and add only the items that were intended.

Problem: vocab_management.py is a large file of configuration that dictates where to find source files, the metadata for RDF and HTML, and other items. It must be edited whenever any of these details change, e.g. when creating a new extension. Because the file is large and each extension's settings are spread across multiple places, there is a high chance that something is missed.

Problem: For people who are not me, figuring out how this code works is likely to be confusing and cumbersome. Documentation is available but is at high risk of being out of date, and making any change requires highly technical knowledge that should not be necessary simply to generate or update files.

Solution: Change the way the documentation generator works to be:

  • A single executable script that takes parameters to do specific things, such as updating CSVs or producing RDF and HTML. It calls other scripts internally.
  • Modular outputs for each process, i.e. it should be possible to generate outputs for a specific extension without any other outputs also being generated.
  • Configuration should be modular for each extension, and all configuration for a given extension should reside in a single place/file. E.g. for extension X, the CSVs to download, the RDF and HTML paths, and the vocabulary metadata should all be in one file.
  • Documentation (in the wiki) should be updated with simpler instructions for how to regenerate documentation, how to fix a typo, and how to submit a PR using the above, which should result in a simpler and replicable process.
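
To make the first three points concrete, here is a rough sketch of what a single entry point with per-extension configuration could look like. Everything named here is hypothetical and only for illustration: the `docgen` program name, the `ExtensionConfig` fields, the `ext-a` / `ext-b` entries, and the paths are assumptions, not the existing code. The real steps would call into the current scripts instead of printing.

```python
import argparse
from dataclasses import dataclass, field


@dataclass
class ExtensionConfig:
    """All settings for one extension, kept in one place (hypothetical fields)."""
    name: str
    csv_sources: list[str] = field(default_factory=list)   # CSVs to download
    rdf_path: str = ""                                      # where RDF is written
    html_path: str = ""                                     # where HTML is written
    metadata: dict[str, str] = field(default_factory=dict)  # vocabulary metadata


# Hypothetical registry. Each entry could equally live in its own file
# and be discovered at start-up; the values below are placeholders.
EXTENSIONS = {
    "ext-a": ExtensionConfig(
        name="ext-a",
        csv_sources=["ext-a.csv"],
        rdf_path="out/ext-a/rdf",
        html_path="out/ext-a/html",
        metadata={"title": "Extension A"},
    ),
    "ext-b": ExtensionConfig(
        name="ext-b",
        csv_sources=["ext-b.csv"],
        rdf_path="out/ext-b/rdf",
        html_path="out/ext-b/html",
        metadata={"title": "Extension B"},
    ),
}

# Pipeline order: CSVs first, then RDF, then HTML.
STEPS = ("csv", "rdf", "html")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="docgen",  # hypothetical name
        description="Run selected generator steps for selected extensions.",
    )
    parser.add_argument("steps", nargs="+", choices=STEPS,
                        help="which steps to run")
    parser.add_argument("-e", "--extension", action="append",
                        choices=sorted(EXTENSIONS),
                        help="limit the run to this extension (repeatable)")
    return parser


def plan(argv: list[str]) -> list[tuple[str, str]]:
    """Return (step, extension) pairs in pipeline order, without running them."""
    args = build_parser().parse_args(argv)
    selected = args.extension or sorted(EXTENSIONS)  # default: every extension
    ordered = [step for step in STEPS if step in args.steps]
    return [(step, name) for step in ordered for name in selected]


def run(argv: list[str]) -> None:
    for step, name in plan(argv):
        config = EXTENSIONS[name]
        # Placeholder: the real generator would invoke the existing
        # download / RDF / HTML code here, scoped to this extension only.
        print(f"[{step}] {config.name}")
```

With this shape, `run(["rdf", "html", "-e", "ext-a"])` would touch only the outputs of `ext-a`, and adding a new extension would mean adding one `ExtensionConfig` entry rather than editing several places in a large file.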
