MichalisKamburelis

Michalis Kamburelis edited this page Jan 7, 2017 · 15 revisions

I’m one of pasdoc’s developers. I use pasdoc most of all to document my Castle Game Engine, an open-source 3D and 2D game engine for Object Pascal. My webpage is http://michalis.ii.uni.wroc.pl/~michalis/


Adding a new feature to pasdoc

When adding a new end-user visible feature, always remember to:

  • add an entry to ChangeLog

  • add a testcase to tests/

  • add documentation to the wiki

  • make sure that the mailing list is informed about it


Should documentation be placed inside the unit’s interface, or in separate files?

Stating it differently, which way is better:

  1. Mix the comments with the sources, like javadoc, ocamldoc, pasdoc and many others do. You put comments inside the source files, before each declared item, and then the documentation generator parses the source files and extracts those comments into documentation.

  2. Separate the comments from the sources, like fpdoc does. You put comments inside a separate XML file.
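To make the first approach concrete, here is a minimal sketch of a unit documented in the pasdoc style. The unit name DocDemo and the function Clamp are hypothetical, invented only for this illustration; the @param and @returns tags are regular pasdoc tags.

```pascal
{ Hypothetical unit, used only to illustrate in-source documentation. }
unit DocDemo;

{$mode objfpc}{$H+}

interface

{ Limit a value to a range.
  @param(Value The number to limit.)
  @param(ALow Lowest allowed result.)
  @param(AHigh Highest allowed result.)
  @returns(Value limited to the range ALow..AHigh.) }
function Clamp(const Value, ALow, AHigh: Integer): Integer;

implementation

function Clamp(const Value, ALow, AHigh: Integer): Integer;
begin
  if Value < ALow then
    Result := ALow
  else if Value > AHigh then
    Result := AHigh
  else
    Result := Value;
end;

end.
```

With the second approach, the interface above would contain only the function declaration, and the description text would live in a separate file.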

Below are some of my thoughts about this. If you don’t have time to read this, here’s my conclusion: I really don’t think that one way or the other is ultimately better. I prefer the "mix the comments with the sources" way, but I would also be absolutely happy using a documentation generator with the "separate the comments from the sources" approach. Well, actually I did: I used and tested fpdoc a little, and I like it very much. No doubt, pasdoc can learn a few things from fpdoc (actually, some of the features I implemented in pasdoc were inspired by fpdoc behavior).

A common argument against the "mix with sources" approach is that when the documentation gets larger, it tends to clutter the unit’s interface. I don’t think that this is a real problem because:

  1. First of all, I don’t think of that as clutter. A unit’s interface specification without any documentation is good for compilers, not for humans. This is especially important for new people who read your source code for the first time. Documentation makes the unit’s source larger, but it’s simply an essential part of a unit.

  2. That said, it’s sometimes useful to get a "higher-level" view of how the unit’s interface is laid out. The solution is: look at the documentation generated by pasdoc. pasdoc’s HtmlOutput presents a nice summary view of each unit and of each class. This summary is clear, compact and uncluttered. Moreover, the output of pasdoc presents some other summary views that are not available when looking at the unit’s source: "all-items" pages (e.g. the "all identifiers" page, "all classes", etc.), a class hierarchy diagram and even some graphviz graphs with class hierarchy and unit dependencies. So if you really prefer to forget about all these documentation strings for now and just see the list of declared identifiers, then look at the documentation generated by pasdoc. BTW, the same is true for fpdoc. Both pasdoc and fpdoc (and any doc generator for any other language) present in the final documentation both a nice summary and nice detailed descriptions of the documented units. The difference between fpdoc and pasdoc shows when you look at the unit’s source code: if pasdoc was used, the unit’s source code is cluttered with comments. If fpdoc was used, the unit’s source is compact and not cluttered, but you don’t immediately see the documentation.

Conclusion: for programmers already familiar with the unit source, it doesn’t matter. They know the unit source, and even an interface heavily mixed with large comments looks "clear and compact" to them. I know that this is the case for me and many of my units. For programmers new to the unit source, the "separate from sources" approach looks better and more compact, but comes without documentation. So is it really useful? Ultimately, the answer remains "both approaches, mix with sources and separate from sources, are equally good". In each approach, the generated output provides all the advantages: you can see both a compact summary and large detailed descriptions there.

Some additional arguments:

  • In the "mix with sources" approach, descriptions are "naturally" linked to the appropriate item. In "separate from sources" they are not. E.g. in fpdoc’s XML format you have to specify <element name="ItemName"> for each item ItemName. This means that you must specify the name of your identifier at least twice: in the unit’s interface source and in the documentation. The makeskel program exists to generate a skeleton with all such <element…​> tags, so it’s not that hard. However, with "mix with sources" you don’t need a program like makeskel at all.

  • IMO [DoDi] external files are great for producing book-style documentation, with overviews, specifications and further background information, which should not clutter the source code. Such information may also be presented to the users in various (natural) languages.

    • Also, the interface part of the units should stay readable, not cluttered by comments that are possibly outdated or have otherwise become obsolete in the actual implementation. In the interface section a one-line abstract should be sufficient for every item. The developer can find more detailed information at the implementation of a procedure or method, where it can be kept better in sync with the actual implementation. We definitely deserve a merge of descriptions from different sources/places.


Some TODO things

Some larger things that I want to implement in pasdoc. No, I’m not currently working on them; I have many smaller to-dos now. The features below are on my long-term plan.

New output formats

Plain text output

This has two purposes:

  1. To be a demo of a simple output generator, potentially useful to anyone who would like to write a new output generator and see how it can be done (the existing html and latex generators are not good examples of "simple generators").

  2. As a side-effect, we will use it to generate plain-text data that is easier for tipue to search. For now you can’t use tipue to search for special characters that are escaped in html using &-references: the search engine will just not see these characters correctly. Also, when you search for a word like class, the search engine incorrectly finds items with code like <tag class="…​">, i.e. it finds the word class inside an html tag. This will be fixed by converting RawDescription to plain text. The short description placed in the index entry will be the AbstractDescription in plain text; the long description will be DetailedDescription + Params + etc., also in plain text.

Fpdoc output

Output to fpdoc input format. The text below will be moved to the FpdocOutput page when this is implemented.

Free Pascal code documenter, fpdoc, is a program distributed and developed as part of FreePascal. Its goal is similar to pasdoc’s: parse Pascal units and generate documentation for them. But, unlike pasdoc, fpdoc does not read descriptions of items from the comments placed in the unit’s source file. Instead you put your descriptions in a separate XML file. fpdoc reads both the unit’s source file and the XML file with descriptions and generates documentation from them. See the fpdoc reference manual online for more information.

When you tell pasdoc to use fpdoc output then pasdoc will write documentation in fpdoc’s XML format. This means that after running pasdoc, you get a file docs.xml (unless you changed "docs" to something else using --name option). Then you can run fpdoc, telling fpdoc to again parse the same unit source files and additionally to take generated docs.xml file. And then you get documentation generated by fpdoc.

Why is this useful?

  • This means that you can use fpdoc generated output, while at the same time writing descriptions inside Pascal source file using pasdoc’s @tags. I’m not going to tell you whether the documentation generated directly (e.g. to html or latex) by pasdoc is better or worse than the one generated by fpdoc – the purpose of this pasdoc’s output format is to allow you to check this out yourself.

  • Remember that the things that pasdoc and fpdoc allow in their descriptions are similar but not exactly the same, so if you’re committed to using only fpdoc’s output, then you’re probably better off writing your descriptions directly in fpdoc’s XML format, instead of using pasdoc to make XML files for fpdoc. And this is the second possible use of this: if you have a large documentation set written in pasdoc style (with descriptions embedded in the units' source files, with pasdoc’s @-tags) and you want to switch to fpdoc for whatever reason, then you can do it: just run pasdoc once with the output format set to fpdoc and you’ll get all your documentation converted to fpdoc. Of course, remember that the page you’re reading right now is part of pasdoc’s documentation. This means that the people who wrote it may consider pasdoc better than fpdoc :) This is about your freedom: we are so generous that we even let you easily switch from pasdoc to another documentation generator :) Seriously: I don’t think that pasdoc is ultimately better (or worse) than fpdoc, and this output format allows you to combine some of pasdoc’s and fpdoc’s strengths.

Asciidoctor output

A great text-like format, with precise specification (unlike Markdown).

Support for groups of items

A group of items is a set of items that share a common documentation string.

The idea is that you write one documentation string for a group of items. In generated documentation, this group of items is documented as one item, e.g.

=== procedure BlahBlah1; ===

Normal doc string for procedure BlahBlah1.

=== procedure Foo and
    procedure Bar and
    procedure Xyz ===

One doc string that describes at once three procedures Foo, Bar and Xyz.

=== procedure BlahBlah2 ===

Normal doc string for procedure BlahBlah2.

So the idea is that the items in one group not only share the same documentation string, but also that the user reading this documentation clearly sees that these three items are documented in one place by one doc string. In other words: no, this can’t be implemented by simply copying the same doc string to a couple of items. This must be clear and readable, so that the user reading the documentation can immediately see that some items are grouped. So this will require special support in each final doc format.

Syntax 1:

{ One comment that describes at once three procedures
  Foo, Bar and Xyz.
  @groupbegin }
procedure Foo;
procedure Bar;
procedure Xyz;
{ @groupend }

Some rules:

  • where @groupbegin and @groupend are placed within a comment does not matter

  • you can place in one comment only one @groupbegin or one @groupend, but not both

  • groups must be properly closed: of course you can’t use @groupbegin when you haven’t ended the previous group, you can’t use @groupend when there is no current group started, and you must close all groups.

Syntax 2: An alternative syntax that produces exactly the same results. It is more troublesome to write, but also gives more possibilities to the human writing the docs:

{ One comment that describes at once three procedures
  Foo, Bar and Xyz. }
procedure Foo;

{ @groupwith(Foo) }
procedure Bar;

{ @groupwith(Foo) }
procedure Xyz;

Rules:

  • a comment that contains @groupwith() should not have anything else (only whitespace) inside. In particular, you can place only one @groupwith() inside a comment.

  • the item referenced by @groupwith() must have some comment itself (either explicit, or because it’s between @groupbegin/end, or because it has @groupwith())

The two syntaxes can be mixed, e.g. a third equivalent version of the same example is:

{ One comment that describes at once three procedures
  Foo, Bar and Xyz.
  @groupbegin }
procedure Foo;
procedure Bar;
{ @groupend }

{ @groupwith(Foo) }
procedure Xyz;

Rules not dependent on any syntax:

  • the whole group must be within the same scope, i.e. all its items are either within the global unit scope, or all its items are within the same class scope and with the same access specifier (access specifier = one of public, published, etc.) or within the same record.

  • For now, we should probably add additional constraints that can be removed in the future (but removing them now would be problematic, i.e. it’s difficult to design nice docs when you want to mix e.g. some type + some const + some procedure in one group):

  • global procedures and functions may be grouped

  • constants may be grouped

    (so you can’t e.g. mix procedures with constants in one group)

  • properties and methods of the same class within the same access specifier may be grouped. TODO: Maybe we should forbid grouping properties with methods in one group ? It would ease the task of generating docs.

  • DoDi: grouping properties together with their read/write specifiers, i.e. fields or get/set methods. This can be done (or supported) by the parser.

  • DoDi: grouping events together could be done by @groupwith-like means. Remember that such declarations do not necessarily occur in contiguous blocks, and each one consists of a field, a property, and an event handler type.

  • DoDi: When we continue to implement new syntactical features, like local types or variables in classes, or declarations of records in records, or parameter lists, then we have to face nested scopes in places where the generators currently do not expect or allow for appropriate tables or pages. Some general redesign should be done, which allows for an integration of all the wanted features in an extended model of grouping and nesting declarations and descriptions.

  • For enumerated type values, only consecutive values of the same enumerated type can be grouped. So, in practice, always use @groupBegin and @groupEnd; @groupWith is practically useless for them. For example, this should be allowed (real-world snippet from my game):

  TSoundType = ( stNone,

    { Player sounds.
      @groupBegin }
    stPlayerSuddenPain,
    stPlayerPotionDrink,
    stPlayerDies
    { @groupEnd });
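The scope rule above (all items of a group in the same class and under the same access specifier) can be sketched in the same way. This is only an illustration of the proposed, not yet implemented, @groupbegin/@groupend syntax; the unit GroupScopeDemo and the class TPlayer are hypothetical names.

```pascal
unit GroupScopeDemo;

{$mode objfpc}{$H+}

interface

type
  { Hypothetical class, used only to show group scope. }
  TPlayer = class
  public
    { Move the player by one tile in the given direction.
      @groupbegin }
    procedure MoveLeft;
    procedure MoveRight;
    { @groupend }
  protected
    { Documented on its own: it has a different access specifier,
      so under the proposed rules it could not join the group above. }
    procedure UpdatePosition; virtual;
  end;

implementation

procedure TPlayer.MoveLeft;
begin
  // Empty body: only the interface matters for this sketch.
end;

procedure TPlayer.MoveRight;
begin
end;

procedure TPlayer.UpdatePosition;
begin
end;

end.
```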

Note that multiple variables defined at once, like this:

{ Some docs for A, B, C }
A, B, C: Integer;

would be automatically grouped together. Currently this is equivalent to

{ Some docs for A, B, C }
A: Integer;
{ Some docs for A, B, C }
B: Integer;
{ Some docs for A, B, C }
C: Integer;

which means that the description "Some docs for A, B, C" is copied three times in the documentation. This is bad, because the information that items A, B and C are documented together, at once, is lost (i.e. the user reading the docs may not immediately see this).

Another advantage of this would show when we generate the "All Functions and Procedures", "All Identifiers" etc. listings. If two (or more) items from the same group are shown in successive rows of these listings (e.g. when the items are overloaded versions of the same proc, and they are wrapped in one group) then we can squish them and present them as one table row (because all these items have the same description).
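The overloaded case mentioned above could look like the sketch below. It assumes the proposed @groupbegin/@groupend tags, which pasdoc does not implement yet; OverloadGroupDemo and ToText are hypothetical names chosen for the example.

```pascal
unit OverloadGroupDemo;

{$mode objfpc}{$H+}

interface

{ Convert a value to its text form.
  @groupbegin }
function ToText(const Value: Integer): string; overload;
function ToText(const Value: Boolean): string; overload;
{ @groupend }

implementation

uses
  SysUtils;

function ToText(const Value: Integer): string;
begin
  Result := IntToStr(Value);
end;

function ToText(const Value: Boolean): string;
begin
  if Value then
    Result := 'True'
  else
    Result := 'False';
end;

end.
```

In an "All Functions and Procedures" listing, both ToText rows would sit next to each other with an identical description, so they are natural candidates for being presented as a single row.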

Sections

Support for sections that divide a unit into a couple of separate blocks but are not tied to any particular item (something in the spirit of ocamldoc’s "{1 Section title}"). The format is

{ @section(Section title) Additional comments about section. }

E.g.

{ @section(Utilities that deal with strings)
  Every string routine in this section is able to handle MBCS strings.
  Unless otherwise noted, all string comparisons are case-sensitive. }

The page of each unit should present a hyperlinked table of contents of the sections within this unit. Sections are only presented when looking at the unit’s page.

Also, LaTeX-like @subsection and @subsubsection could be nice? Or (copying ocamldoc’s idea) just add a number to each section, i.e.

@section(1 Main section title)
@section(2 Sub section title)

is used instead of

@section(Main section title)
@subsection(Sub section title)

? I think that I prefer using "sub" prefixes, but this is negotiable.

Of course, it is not mandatory, not even desirable, to divide every unit you document into sections. This feature has its best use when you have a large unit with many global procedures/functions. Then by using sections you can nicely indicate to the reader that the routines in this unit can be logically divided into separate sections, like

  • "routines that deal with strings",

  • "routines that deal with filenames",

  • "routines that deal with something-else".

Note that sections and groups (proposed in the previous point) somewhat complement each other.

  • Groups allow you to easily group together things that are very closely related, so closely that they are documented by one documentation string. Groups make it both easier to write documentation, and easier for the reader to see that these things are documented at once.

  • Sections allow you to group together more things that are somewhat loosely related, so they all deserve a common description, but every item inside a section also has its own specific documentation.

In summary, this feature is like splitting a large unit into many "sub-units" in the documentation.
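A unit split into two such sections might be written as in the following sketch. It uses the in-unit @section format proposed above, which is not implemented yet; the unit SectionDemo and both routines are hypothetical.

```pascal
unit SectionDemo;

{$mode objfpc}{$H+}

interface

{ @section(Routines that deal with strings)
  All string comparisons in this section are case-sensitive. }

{ Check whether S begins with Prefix. }
function StartsWith(const S, Prefix: string): Boolean;

{ @section(Routines that deal with filenames)
  These routines only manipulate names, they never touch the disk. }

{ Return FileName with its extension removed. }
function StripExtension(const FileName: string): string;

implementation

uses
  SysUtils;

function StartsWith(const S, Prefix: string): Boolean;
begin
  Result := Copy(S, 1, Length(Prefix)) = Prefix;
end;

function StripExtension(const FileName: string): string;
begin
  // ChangeFileExt with an empty extension drops the existing one.
  Result := ChangeFileExt(FileName, '');
end;

end.
```

Each routine keeps its own documentation string, while the section comment carries the description shared by the whole block.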

More wiki-like syntax for pasdoc descriptions

Wiki-like syntax means that you can achieve some (formatting) effect without using any @-tag. Some existing features of pasdoc descriptions are already wiki-like syntax (see WritingDocumentation):

  • Empty line creates a paragraph

  • Dash rules (--- creates an m-dash, -- creates an n-dash)

  • Automatic recognizing of URLs
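The three existing features listed above can be seen together in one small comment. This is a sketch only; the unit WikiLikeDemo and the procedure OpenLog are hypothetical, and the URL is simply the author's webpage quoted earlier on this page.

```pascal
unit WikiLikeDemo;

{$mode objfpc}{$H+}

interface

{ Open the log file --- creating it first if it does not exist.
  This line belongs to the same paragraph as the previous one,
  because a single line break is just whitespace.

  The empty line above starts a second paragraph. Only entries
  1--100 are kept. More information may be found at
  http://michalis.ii.uni.wroc.pl/~michalis/ }
procedure OpenLog;

implementation

procedure OpenLog;
begin
  // Empty body: only the comment above matters for this sketch.
end;

end.
```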

More wiki-like features are planned. The following things should be achievable with wiki-like syntax:

  • marking text italic/bold (equivalent to @bold and @italic tags)

  • marking text as simple code (@code tag)

  • making lists (@orderedList, @unorderedList, @definitionList, @item, @itemLabel; preferably some auto-detection of @itemSpacing should also be done here)

  • making tables (@table, @row, @cell; preferably also @rowHead)
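For comparison, this is roughly what the same effects cost today with plain @-tags, which is the verbosity a wiki-like syntax would shorten. The sketch uses only the tags named in the list above; the unit TagDemo and the procedure DrawLabel are hypothetical.

```pascal
unit TagDemo;

{$mode objfpc}{$H+}

interface

{ Print a label.

  @bold(Warning:) this routine is @italic(not) thread-safe.
  Call it as @code(DrawLabel).

  @unorderedList(
    @item(An empty label prints an empty line.)
    @item(The label is printed as-is, without trimming.)
  )

  @table(
    @rowHead( @cell(Argument) @cell(Meaning) )
    @row( @cell(S) @cell(The text of the label) )
  ) }
procedure DrawLabel(const S: string);

implementation

procedure DrawLabel(const S: string);
begin
  WriteLn(S);
end;

end.
```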

Notes:

  • Wiki-like syntax should be carefully chosen. Wiki-like syntax adds additional meanings to some simple constructions, so if we design the wiki-like rules badly, people will too often trigger something accidentally. At the same time, wiki-like syntax must look simple and readable in source code and use short sequences of characters to mark things (otherwise there will be no benefit of using wiki-like syntax over the traditional @-tags approach). A negative example of bad wiki-like syntax is LaTeX. There are many special rules and exceptions in LaTeX syntax, and often they are things that are very seldom used in practice. This means that LaTeX writers can easily activate some special feature by accident, and this is bad.

  • Introducing wiki-like syntax would break pasdoc compatibility badly. So it will have to be activated by --wiki-syntax. Alternatively, if many people support this decision, we can make wiki-like syntax active by default and provide a way to turn it off by --no-wiki-syntax. Wiki-like syntax is always just a shortcut for the equivalent functionality of @-tags, so if someone prefers not to use wiki-like syntax, it’s OK.

  • Note that we can’t directly borrow ideas from some wiki engines (like moinmoin). That’s because wiki engines usually say that a line-break creates a new paragraph. This means that wiki pages usually have very long lines. That’s not a problem for wiki engines, because edit boxes in WWW browsers will wrap text, and if someone uses an external editor then it’s easy to tell them "please don’t introduce line-breaks unless you want to create a new paragraph". But pasdoc can’t treat a line-break as a new paragraph. Pasdoc must treat a line-break just as some whitespace. That’s because pasdoc descriptions are used within Pascal source files, and people don’t like to have long lines in source files, and are often uncomfortable with viewing source files that have too long lines. This may seem like a small thing, but actually this means that many other rules of wiki engines must be different in pasdoc’s wiki-like syntax.

  • Note that we can’t directly borrow ideas from aft because aft uses the tab character to mark various things. But people are often uncomfortable with using tab characters in Pascal source files. So for pasdoc, the tab character must always mean "just some whitespace".

  • Auto-linking may be treated as some form of wiki-like syntax (a shortcut for @link). But it is activated by a different command-line option, --auto-link, and can be locally deactivated by the @noLink tag.

  • Programs that we could borrow some ideas from:

    • txt2tags. My favourite standalone text formatter. I practically stopped using LaTeX right now, I prefer writing in txt2tags markup. It’s also better than aft because it doesn’t make problems when you don’t write in ISO-8859-1 character set (but e.g. in ISO-8859-2 set that includes Polish chars).

    • ocamldoc. pasdoc equivalent for ocaml. ocamldoc has some nice bits of wiki-like syntax for some things, e.g. for lists.