# 0.8 - autogenerate module docs from metadata #991

Closed
opened this Issue Sep 4, 2012 · 40 comments

This will allow for module introspection, building the docs in multiple formats, and even future crazy stuff like TUIs and WUIs that can help you write playbooks.

The format should include information about what fields are required, types, values, notes, and an introduction section.

The need for markup (bold/italic) in the notes/intro section should also be considered.

we could keep the metadata in the module

this would also allow things like ansible-explain modulename on the CLI

Playing with this ATM for LaTeX conversion. The following idea:

• Module description in a YAML file
---
name: mysql_user
version: 0.6
short: Adds or removes a user from a MySQL database.
long: Requires the MySQLdb Python package on the remote host. For Ubuntu, this is as easy as apt-get install python-mysqldb.
options:
  -
    opt: name
    mandatory: true
    desc: name of the user (role) to add or remove
  -
  -
    opt: magic
    version: "0.9"
    desc: does lots of magic things...
notes: >
  Both login_password and _login_username_ are required *when you* are passing credentials. If none are present, the module will attempt to read the credentials from ~/.my.cnf, and finally fall back to using the MySQL default login of root with no password.

  Example privileges string format:

• Parse and render with Jinja2 (eating our own dogfood, so to speak :-)

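#!/usr/bin/env python

import os
import sys
import yaml
import codecs
import json

from jinja2 import Environment, FileSystemLoader

# NOTE: the loader and its 'templates' directory are assumptions; the
# opening lines of this call were missing from the snippet as posted.
env = Environment(
    loader=FileSystemLoader('templates'),
    variable_start_string="@{",
    variable_end_string="}@",
)
template = env.get_template('latex1.j2')


# NOTE: the function name is a placeholder; the original def line was lost.
def render(filename):
    print("DOING %s" % filename)
    try:
        f = codecs.open(filename, 'r', 'utf-8')
        doc = yaml.safe_load(f)
        f.close()

        if not doc:
            print("Can't load file %s" % filename)
            return

        doc['filename'] = filename
        doc['docuri'] = doc['name'].replace('_', '-')

        # Dump the parsed metadata for debugging, then the rendered LaTeX.
        print(json.dumps(doc, indent=4))

        print(template.render(doc))

    except KeyboardInterrupt:
        sys.exit(1)


dir = 'modules'

for filename in os.listdir(dir):
    if filename.lower().endswith(('.yaml', '.yml')):
        render(os.path.join(dir, filename))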

• Template for LaTeX I need:
%--- @{ name | upper }@  ----  from @{ filename }@ ---

\mods{@{name}@}{@{docuri}@}{@{short}@}

%: -- module options

{% if options is defined -%}
\begin{xlist}{abcdefghijklmno}
{% for o in options  -%}
{% if o['mandatory'] is defined -%}
\item[\man\,\C{@{ o.opt }@}]
{% else -%}
\item[\opt\,\C{@{ o.opt }@}]
{% endif -%}

@{ o.desc }@
{% if o['version'] is defined -%}
*** New in VERSION @{ o.version }@
{% endif -%}

{% endfor -%}
\end{xlist}
{% endif -%}

• Resulting LaTeX from a template:

%--- MYSQL_USER  ----  from modules/mysql_user.yml ---

\mods{mysql_user}{mysql-user}{Adds or removes a user from a MySQL database.}

%: -- module options

\begin{xlist}{abcdefghijklmno}
\item[\man\,\C{name}]
name of the user (role) to add or remove
\item[\opt\,\C{magic}]
does lots of magic things...
*** New in VERSION 0.9
\end{xlist}


Definitely need some form of markup in strings (desc, notes) to identify module names, emphasis, urls, etc.

Is the idea to put this metadata in the module itself?

Like with a docstring?

Also, could the options in the metadata be used to generate the "magic"?

ie

opts:
  - opt: name
    required: true  #(changed to required to match verbiage from magic)
    desc: name of the user (role) to add or remove
    choices:
      - foo
      - bar
      - baz
    default: foo
    aliases:
      - nom
      - nombre
      - whatchacalit


Of course we'd have to keep the old way for backwards compatibility.
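
To make the question concrete, here is a minimal sketch of how option metadata shaped like the YAML above could be turned into an argument spec. The function name `build_argument_spec` and the shape of the returned dictionary are assumptions for illustration only, not Ansible's actual module API.

```python
def build_argument_spec(opts):
    """Turn a list of option dictionaries into a name -> spec mapping.

    Hypothetical helper: the key names (opt, required, choices, default,
    aliases) follow the YAML sketch in this thread.
    """
    spec = {}
    for o in opts:
        entry = {"required": bool(o.get("required", False))}
        # Copy only the keys that were actually given in the metadata;
        # purely descriptive keys such as "desc" are left out.
        for key in ("choices", "default", "aliases"):
            if key in o:
                entry[key] = o[key]
        spec[o["opt"]] = entry
    return spec


opts = [
    {
        "opt": "name",
        "required": True,
        "desc": "name of the user (role) to add or remove",
        "choices": ["foo", "bar", "baz"],
        "default": "foo",
        "aliases": ["nom", "nombre", "whatchacalit"],
    }
]

spec = build_argument_spec(opts)
print(spec["name"]["choices"])
```

The same metadata would then drive both the docs and the argument parsing, which is the appeal of the idea; whether the real "magic" can consume such a structure is not settled here.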

Any updates on this @mpdehaan ? ;-)

A prototype of WebUI (based on AngularJS) is available from:

Each module has a JSON data format similar to:

    {
      name : 'command',
      description : 'The command module takes the command name followed by a list of arguments, space delimited. The given command will be executed on all selected nodes. It will not be processed through the shell, so variables like “$HOME” and operations like "<", ">", "|", and "&" will not work. As such, all paths to commands must be fully qualified.',
      params : [
        {
          name : '(free form)',
          description : 'the command module takes a free form command to run',
          required : false,
          default : 'N/A'
        },
        {
          name : 'creates',
          description : 'a filename, when it already exists, this step will NOT be run',
          required : false,
          default : ''
        },
        {
          name : 'removes',
          description : 'a filename, when it already exists, this step will be run',
          required : false,
          default : ''
        },
        {
          name : 'chdir',
          description : 'cd into this directory before running the command (0.6 and later)',
          required : false,
          default : ''
        }
      ],
      examples : [
        {
          code : 'command /sbin/shutdown -t now',
          description : 'Shutdown the server'
        },
        {
          code : 'command /usr/bin/make_database.sh arg1 arg2 creates=/path/to/database',
          description : 'Run the make_database.sh script with arg1 and arg2 only if file /path/to/database does not exist.'
        }
      ]
    }

Next steps:

1. Make sidebar fixed on page scroll
2. Add support for HTML in description
3. Add support for module availability version
4. Add filter (on header bar) to select ansible version (0.6, 0.7, ...) and display only modules for the specified version
5. Make header-bar fixed on page scroll

Any other suggestions?

Damn, that's gorgeous stuff, and I'd like to include it in the UI -- I'm 100% sure it's going to be bootstrap/angular and I should at least get a 'hello world' up in the next few days so you can help out. I would love to have you aboard.
What you have is pretty good, but I think the YAML needs to be tweaked so that it's a hash of option names to values (which I was wanting to wait on), like:

    {
      'service' : {
        options: {
          'foo' : {
            'required' : true,
            'choices': [ 'a', 'b', 'c'],
            'description' : '',
            'default' : '',
            'aliases' : [],
          }
        },
        examples: [],
        notes: []
      }
    }

So if someone wants to make that happen, add the documentation to just one module file, and also write the RST generator script and wire it into the docs Makefile, and send me a pull request, and if that looks ok we can start annotating all of the rest.

This would then need to be in a DOCUMENTATION doc string in each module. Because the modules are really scripts and do not have a `__name__ == "__main__"` we will probably have to parse the code out of them, but that will not be terribly hard. Just start reading at DOCUMENTATION = """ and stop reading at the next """.

doc string inside each module file, and we'd have to write a script to generate the ".rst" files for the doc site. Once we had that, full speed ahead!

Is this what the DOCUMENTATION string should look like?

    DOCUMENTATION = """
    ---
    get_url:
      description: Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource.
      options:
        - url:
            description: HTTP, HTTPS, or FTP URL
            required: true
            default: none
            aliases: none
        - dest:
            description: absolute path of where to download the file to. If dest is a directory, the basename of the file on the remote server will be used. If a directory, thirsty=yes must also be set.
            required: true
            default: basename of url
        - thirsty:
            description: (new in 0.7) if yes, will download the file every time and replace the file if the contents change. if no, the file will only be downloaded if the destination does not exist. Generally should be ‘yes’ only for small local files. prior to 0.6, acts if ‘yes’ by default.
            required: false
            choices:
              - "yes"
              - "no"
            default: "no"
        - others:
            description: all arguments accepted by the file module also work here
            required: false
      examples:
        - code: get_url url=http://example.com/path/file.conf dest=/etc/foo.conf mode=0440
          description: Obtain and install config file
      notes: >
        This module doesn't support proxies or passwords.
    """

This would produce the following JSON:

    {
        "get_url": {
            "notes": "This module doesn't support proxies or passwords.\n",
            "options": [
                {
                    "url": {
                        "default": "none",
                        "required": true,
                        "description": "HTTP, HTTPS, or FTP URL",
                        "aliases": "none"
                    }
                },
                {
                    "dest": {
                        "default": "basename of url",
                        "required": true,
                        "description": "absolute path of where to download the file to. If dest is a directory, the basename of the file on the remote server will be used. If a directory, thirsty=yes must also be set."
                    }
                },
                {
                    "thirsty": {
                        "default": "no",
                        "required": false,
                        "description": "(new in 0.7) if yes, will download the file every time and replace the file if the contents change. if no, the file will only be downloaded if the destination does not exist. Generally should be \u2018yes\u2019 only for small local files. prior to 0.6, acts if \u2018yes\u2019 by default.",
                        "choices": [
                            "yes",
                            "no"
                        ]
                    }
                },
                {
                    "others": {
                        "required": false,
                        "description": "all arguments accepted by the file module also work here"
                    }
                }
            ],
            "examples": [
                {
                    "code": "get_url url=http://example.com/path/file.conf dest=/etc/foo.conf mode=0440",
                    "description": "Obtain and install config file"
                }
            ],
            "description": "Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource."
        }
    }

Thanks to StackOverflow, found a way to parse the first docstring in a module file ...
How about this:

    #!/usr/bin/env python
    # from http://stackoverflow.com/questions/9085350/parsing-python-module-docstrings

    import ast
    import yaml

    M = ast.parse(''.join(open('modules/get_url')))

    # DOCUMENTATION = """ """ MUST be first docstring in module
    doc = yaml.safe_load(M.body[0].value.s)
    print(doc)

Marco: I forgot to say: that prototype of yours looks really beautiful! :)

Pull request at #1054

referenced this issue Sep 17, 2012

Merged
#### First attempt at standardizing DOCUMENTATION string #1054

I'm working with this experimentally in three modules (setup, raw, and get_url) for the LaTeX stuff. I think we should move the module name into the array, so instead of

    ---
    get_url:
      description: Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource.

do this:

    ---
    module: get_url
    description: Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource.

it seems more natural while fumbling with the YAML.

Further, I'd like to propose two new elements:

1. short_description (or title) with a half-liner useful e.g. in titles
2. source: core or something to allow for future e.g. source: contrib modules...

(mentioned in #1054)

Added (partial) support for ansible version and choice values. I'm trying to investigate how to display this information.

You can see my work in progress on the LaTeX Ansible reference card/cheat sheet (currently dubbed "booklet") at https://github.com/jpmens/ansible-booklet

The brunt of the work (listing the modules) is done with modules2.py (which parses module files to extract DOCUMENTATION) and a Jinja2 template.

Good.

Changed: #1054 now has DOCUMENTATION in setup, raw and get_url. (Won't do others until you merge :-) What I did (three distinct module files in a single commit) was probably blasphemous (and apologies for that), but I had them done anyway for testing purposes.

Current YAML now looks like this (e.g. for file module):

    ---
    module: file
    short_description: Sets attributes of files
    description: >
      Sets attributes of files, symlinks, and directories, or removes files/symlinks/directories. Many other modules support the same options as the file module - including 'copy', 'template', and 'assemble'.
    version_added: "0.1"
    options:
      - dest:
          description: defines the file being managed, unless when used with state=link, and then sets the destination to create a symbolic link to using 'src'
          required: true
          default: []
          aliases: []
      - state:
          description: values are 'file', 'link', 'directory', or 'absent'. If directory, all immediate subdirectories will be created if they do not exist. If 'file', the file will NOT be created if it does not exist, see the 'copy' or 'template' module if you want that behavior. If 'link', the symbolic link will be created or changed. If absent, directories will be recursively deleted, and files or symlinks will be unlinked.
          required: false
          default: file
          choices: [ file, link, directory, absent ]
      - mode:
          description: mode the file or directory should be, such as 0644 as would be fed to chmod. English modes like 'g+x' are not yet supported
    examples:
      - code: file path=/etc/foo.conf owner=foo group=foo mode=0644
        description: Example from Ansible Playbooks
    notes: >
      See also: copy, template, assemble
    requirements: [ ]

I'm having trouble wrapping the description within an option: sometimes > works, sometimes it doesn't (when read with PyYAML)...

This is back in queue for 0.8 and will be part of the new doc site, updated description

Have adopted your good idea of splitting notes and description into lists. For description this is also enabled on descriptions of individual options:

    ---
    module: get_url
    short_description: Downloads files from HTTP, HTTPS, or FTP to node
    description:
      - Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource.
    version_added: "0.6"
    options:
      - url:
          description:
            - HTTP, HTTPS, or FTP URL
          required: true
          default: null
          aliases: []
      - dest:
          description:
            - absolute path of where to download the file to.
            - If dest is a directory, the basename of the file on the remote server will be used. If a directory, thirsty=yes must also be set.
          required: true
          default: null
      - thirsty:
          description:
            - if yes, will download the file every time and replace the file if the contents change. if no, the file will only be downloaded if the destination does not exist. Generally should be 'yes' only for small local files. prior to 0.6, acts if 'yes' by default.
          version_added: "0.7"
          required: false
          choices: [ "yes", "no" ]
          default: "no"
      - others:
          description:
            - all arguments accepted by the file module also work here
          required: false
    examples:
      - code: get_url url=http://example.com/path/file.conf dest=/etc/foo.conf mode=0440
        description: Obtain and install config file
    notes:
      - This module doesn't support proxies or passwords.
      - This is para 2
      - And three
    # informational: requirements for nodes
    requirements: [ urllib2, urlparse ]

Can we maybe live with this formatting in the DOCUMENTATION strings?

    description:
      - Executes a I(low-down) and dirty SSH command, not going through the module subsystem. This is useful and should only be done in two cases.
      - The first case is installing B(python-simplejson) on older (python 2.4 and before) hosts that need it as a dependency to run modules, since nearly all core modules require it. Another is speaking to any devices such as routers that do not have any Python installed. In any other case, using the M(shell) or M(command) module is much more appropriate. Arguments given to M(raw) are run directly through the configured remote shell and only output is returned. There is no error detection or change handler support for this module

with I() being for italic (or emphasis), B() bold, and M() for a module name. Could augment that with U() for URL, say.
I've found a way of translating that to LaTeX with a custom filter in Jinja2; this is probably trivial then to make for HTML and whatever else. It looks a bit funny in above YAML, but I don't think it's too ugly :-)

Update: Examples:

    <h2>raw</h2>
    Executes a <em>low-down</em> and dirty SSH command, not going through the module subsystem. This is useful and should only be done in two cases.
    The first case is installing <b>python-simplejson</b> on older (python 2.4 and before) hosts that need it as a dependency to run modules, since nearly all core modules require it. Another is speaking to any devices such as routers that do not have any Python installed. In any other case, using the <span class='module'>shell</span> or <span class='module'>command</span> module is much more appropriate (see <a href='http://google.com'>http://google.com</a>). Arguments given to <span class='module'>raw</span> are run directly through the configured remote shell and only output is returned. There is no error detection or change handler support for this module

    \mods{raw}{raw}{
    Executes a \I{low-down} and dirty SSH command, not going through the module subsystem. This is useful and should only be done in two cases.
    The first case is installing \B{python-simplejson} on older (python 2.4 and before) hosts that need it as a dependency to run modules, since nearly all core modules require it. Another is speaking to any devices such as routers that do not have any Python installed. In any other case, using the \M{shell} or \M{command} module is much more appropriate (see \url{http://google.com}). Arguments given to \M{raw} are run directly through the configured remote shell and only output is returned. There is no error detection or change handler support for this module
    }

And from the same input: ;-) (not quite ready, but getting there ...)

    .TH GET_URL 5 "date" "version" "ANSIBLE MODULES"
    ." generated from /Users/jpm/Auto/pubgit/ansible/ansible/library/get_url
    .SH NAME
    get_url \- Downloads files from HTTP, HTTPS, or FTP to node
    ." ------ DESCRIPTION
    .SH DESCRIPTION
    .PP
    Downloads files from HTTP, HTTPS, or FTP to the remote server. The remote server must have direct access to the remote resource.
    ." ------ OPTIONS
    ."
    ."
    .SH OPTIONS
    .IP url
    HTTP, HTTPS, or FTP URL
    (required)
    .IP dest
    absolute path of where to download the file to. If dest is a directory, the basename of the file on the remote server will be used. If a directory, thirsty=yes must also be set.
    (required)
    .IP thirsty
    if yes, will download the file every time and replace the file if the contents change. if no, the file will only be downloaded if the destination does not exist. Generally should be 'yes' only for small local files. prior to 0.6, acts if 'yes' by default.
    .SS Choices
    yes,no.
    (Added in Ansible version 0.7.)
    .IP others
    all arguments accepted by the file module also work here
    ....

referenced this issue Sep 19, 2012

Merged
#### Tweak DOCUMENTATION YAML as per latest 991 #1063

Thanks Michael: updated modules as discussed above, so I can now carry on with some goodness, I hope. :)

This is a super friggin' cool idea. It's a great way to make the modules+docs self contained. I think as a side effect we'll find that the module docs are out of date much less frequently, simply due to the docs+code existing in the same place.

This will bring a lot of consistency to module docs too when they're converted over. When you look at the current module docs there's a mixup of YAML and CLI commands with and without ansible prefixes.

I'm very BIG on documentation and I tend to have a lot of ideas (some are worse than others). So please forgive the length of this comment.

I think I'm picking up on the conventions from the examples in this issue and from reading over 60e0410.
Use:

• I(foo) when referring to parameters
• M(foo) when referring to modules
• C(foo) when referring to commands

For the structure can you clarify which keys are optional and which are required?

The main structure supports these keys:

• module - Name of module
• short_description - A short string
• description - A list of strings
• version_added - Version string
• options - List of dictionaries (keys described below)
• examples - List of dictionaries with a code key and an optional description key
• requirements - List of strings
• notes - List of strings

A parameter supports these keys for description purposes:

• description - List of strings
• required - Boolean, True or False
• default - None, or a string
• aliases - List of strings
• choices - List of strings
• version_added - Version string

Other thoughts/questions/requests/feedback:

What examples I've read so far define the options key. Everywhere else currently we are referring to them as parameters.

I'm working on the docstring for the pause module now (current beta docs in RST format).

Like you mentioned, +1 to having a U('title', http://foo/) directive in the future, it would be nice for in-line links (as in how pause references the 'Rolling Updates' section).

Related to the above url directive idea, do you have any plans to add a see_also: keyword at the level where the options and notes keywords live currently?

Thoughts on supporting a tips: keyword (in the spirit of notes:)? It's not a big deal, I'll just re-write the "tip" in the pause module into a notes entry if this request is out of scope.

Looking at your example HTML output it appears that you are putting the example descriptions after the actual examples. Seems more natural to me to have them appear before the example.

The Nagios module has one of the more complex documentation entries due to it effectively having three distinct operation modes. For ease of comprehension I present the parameters in three separate tables, one for each mode.
My first thought was "I can has option_groups?" But, since there's only one module with this problem presently I'll probably rewrite the parameters part and use the second description item to specify which subcommand the parameter is usable with.

How do you feel about refactoring the syntax for giving examples: to accept the actual syntax you would use in a playbook? I think it would make writing examples more natural and flexible.

    examples:
      - name: Set 30 minutes of apache downtime.
        action: nagios action=downtime minutes=30 service=httpd host=$inventory_hostname
        delegate_to: nagios.example.com

      - name: Make sure admins (from a vars_file) are in the right groups.
        action: user name=$item groups=wheel,skynet
        with_items: $cyberdine_admins

Yielding:

Set 30 minutes of apache downtime.

    - action: nagios action=downtime minutes=30 service=httpd host=$inventory_hostname
      delegate_to: nagios.example.com

and

Make sure admins (from a vars_file) are in the right groups.

    - action: user name=$item groups=wheel,skynet
      with_items: $cyberdine_admins

The less ideal alternative for modules that would benefit from more complex examples is to wrap examples in quotes and then manage indentation manually.

Another related idea is supporting examples_playbook (takes a list of dictionaries) and examples_cli (takes a list of dictionaries) sections. This one addition by itself would totally bring consistency to how examples are formatted.

    module: file
    # ...
    examples_cli:
      - name: Create a symlink to a file.
        example: src=/file/to/link/to dest=/path/to/symlink owner=foo group=foo state=link
      - name: Set the SELinux context of a file.
        example: path=/some/path state=directory setype=httpd_sys_content_t

Yielding:

Create a symlink to a file.

    $ ansible foo_hosts -m file -a "src=/file/to/link/to dest=/path/to/symlink owner=foo group=foo state=link"

and

Set the SELinux context of a file.

    $ ansible foo_hosts -m file -a "path=/some/path state=directory setype=httpd_sys_content_t"
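
As a rough illustration of the examples_cli idea, a formatter could build the full command line from the module name and each entry. This is only a sketch: `render_cli_example` is a hypothetical helper, and `foo_hosts` is just the placeholder host pattern used in the proposal above.

```python
def render_cli_example(module, entry, hosts="foo_hosts"):
    """Format one examples_cli entry as a description plus a shell command.

    Hypothetical helper; `hosts` defaults to the placeholder pattern
    from the proposal, not to anything Ansible defines.
    """
    command = '$ ansible %s -m %s -a "%s"' % (hosts, module, entry["example"])
    # Description first, then the command indented as a code line.
    return "%s\n\n    %s" % (entry["name"], command)


doc = {
    "module": "file",
    "examples_cli": [
        {
            "name": "Set the SELinux context of a file.",
            "example": "path=/some/path state=directory setype=httpd_sys_content_t",
        },
    ],
}

rendered = [render_cli_example(doc["module"], e) for e in doc["examples_cli"]]
print("\n\n".join(rendered))
```

Because the module name comes from the surrounding metadata, each example only has to carry the argument string, which is what keeps the formatting consistent.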

In [1]: import re

In [2]: pattern = re.compile("(?<=U\()([^,)]+)(?:, ?([^)]+))?")

In [3]: text_strings = {
...: 'plain_url': "U(http://www.redhat.com)",
...: 'no_space': "U(http://redhat.com,Red Hat, Inc.)",
...: 'with_space': "U(http://redhat.com, Red Hat, Inc.)",
...: 'inline_U': "Visit the worlds leader in opensource: U(http://redhat.com,Red Hat, Inc.), they're awesome!"
...: }

In [4]:

In [4]: for k, v in text_strings.iteritems():
...:         print "Matching: " + k + " (" + v + ")"
...:         print pattern.search(v).groups()
...:         print ""
...:
Matching: no_space (U(http://redhat.com,Red Hat, Inc.))
('http://redhat.com', 'Red Hat, Inc.')

Matching: inline_U (Visit the worlds leader in opensource: U(http://redhat.com,Red Hat, Inc.), they're awesome!)
('http://redhat.com', 'Red Hat, Inc.')

Matching: with_space (U(http://redhat.com, Red Hat, Inc.))
('http://redhat.com', 'Red Hat, Inc.')

Matching: plain_url (U(http://www.redhat.com))
('http://www.redhat.com', None)
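
Building on the session above, here is a small sketch of a converter that turns the inline markers into HTML. It is written for Python 3 (the session above is Python 2), the tag choices for I(), B(), M() and U() follow the HTML sample earlier in the thread, and rendering C() as `<code>` is purely an assumption. The name `to_html` is hypothetical, and U() is taken in the `U(url, title)` order used in the session.

```python
import html
import re

# Tag mapping follows the HTML sample earlier in the thread; C() -> <code>
# is an assumption, since no HTML rendering for it was shown.
_SIMPLE = [
    (re.compile(r"\bI\(([^)]*)\)"), r"<em>\1</em>"),
    (re.compile(r"\bB\(([^)]*)\)"), r"<b>\1</b>"),
    (re.compile(r"\bM\(([^)]*)\)"), r"<span class='module'>\1</span>"),
    (re.compile(r"\bC\(([^)]*)\)"), r"<code>\1</code>"),
]

# Same idea as the pattern in the session, but matching the whole U(...)
# marker so that it can be replaced in one substitution.
_URL = re.compile(r"\bU\(([^,)]+)(?:, ?([^)]+))?\)")


def _link(match):
    url, title = match.group(1), match.group(2)
    # Fall back to showing the URL itself when no title is given.
    return "<a href='%s'>%s</a>" % (url, title or url)


def to_html(text):
    """Escape the text for HTML, then expand the inline markers."""
    out = html.escape(text)
    for pattern, repl in _SIMPLE:
        out = pattern.sub(repl, out)
    return _URL.sub(_link, out)


print(to_html("Use M(shell) or M(command), see U(http://redhat.com,Red Hat, Inc.)"))
```

A limitation worth noting: none of these patterns handle a closing parenthesis inside the marked text, so a title or URL containing `)` would need a smarter parser.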


I'm back :P

After some testing I tried to understand the requirement for version_added, but I think we can remove it. Let me explain.

Each version of a module is tagged in the git repo with the appropriate version.

If the doc is available in the specified tag, the module is available; otherwise it is not. This makes it more comfortable to have documentation specific to each version (params can be added or removed, default values can change, ...), and it is no longer necessary to specify the version on each element (e.g. param XXX was added in version 0.7, ...).

The disadvantage would be the requirement to update docs in each tag (which I think is not necessary), but the script that generates the docs can generate them only for the latest tagged version and check the older versions' information to determine when a module/param/... was added or removed.
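
To picture the proposal: given the option names documented at each tagged release, a generator could work out when each option first appeared. The sketch below assumes the per-tag docs have already been read into memory; `first_seen` is a hypothetical helper, the `docs_by_version` data is only illustrative, and actually reading docs out of git tags is left out.

```python
def first_seen(docs_by_version):
    """Map each option name to the earliest version that documents it.

    `docs_by_version` is a list of (version, option_names) pairs ordered
    from the oldest release to the newest.
    """
    added = {}
    for version, options in docs_by_version:
        for name in options:
            # setdefault keeps the first (oldest) version we saw.
            added.setdefault(name, version)
    return added


# Illustrative data only, not read from a real repository.
docs_by_version = [
    ("0.6", ["url", "dest"]),
    ("0.7", ["url", "dest", "thirsty"]),
]

added = first_seen(docs_by_version)
print(added["thirsty"])
```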

Ok everyone, I've replied to the list with a summary of what is left to do. Merged in, thanks to everyone who helped with this in code or working out what needed to be included.

@jpmens -- I probably broke a few of your templates and noticed that there was a --template-dir parameter in the Makefile that I did NOT have in the git repo, so you may want to look into fixing that. If you add that flag back, please update the Makefile too.

I am almost positive I broke your LaTeX templates from a change to the way parameters in the list of options work. The problem here is I wanted a hash, not a list of one element hashes, so I changed that.
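
For anyone updating templates, the difference between the two shapes can be sketched as a small conversion. This is an illustration only: `options_to_hash` is a hypothetical helper, not something from the repository, and the sample data is trimmed from the get_url docs earlier in the thread.

```python
def options_to_hash(options):
    """Merge a list of single-key dictionaries into one dictionary."""
    merged = {}
    for item in options:
        if len(item) != 1:
            raise ValueError("expected exactly one option per list item: %r" % (item,))
        # Unpack the only (name, spec) pair in this item.
        (name, spec), = item.items()
        if name in merged:
            raise ValueError("duplicate option: %s" % name)
        merged[name] = spec
    return merged


# Old shape: a list of one-element hashes.
old_style = [
    {"url": {"required": True, "description": ["HTTP, HTTPS, or FTP URL"]}},
    {"dest": {"required": True}},
]

# New shape: a single hash keyed by option name.
new_style = options_to_hash(old_style)
print(sorted(new_style))
```

Templates that iterate over the list and then dig out the single key would instead iterate over the hash's items directly.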

I also made some changes to the documentation for INI file so the default was marked as null, not for some parameters as that seemed to be off. Minor for sure.

I also made it so the choices are included in the docs now as an unordered list, which means there are now BETTER online docs for the things that have the new DOCUMENTATION system :)

I had previously expressed concern about how to do inline YAML in YAML for modules that have a full playbook in their docs. I found out how to do that so we don't need to worry.

Apparently it is description: | (vertical pipe) and you indent over and just include YAML inside your YAML and everything is supposed to be cool.

Anyway, I think we are very much good to go, and per the list email, our next step is to document ALL of the modules including the ones that are just action plugins (like PAUSE) and never get transferred. They can still have docstrings.
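
Since such files may never be run, the docstring has to be read without importing them, as discussed earlier in the thread. A minimal sketch of that using the standard ast module follows; `extract_documentation` is a hypothetical helper, the sample source is invented for the demonstration, and turning the returned text into a data structure (for example with PyYAML) is left out.

```python
import ast


def extract_documentation(source):
    """Return the string assigned to DOCUMENTATION, or None if absent.

    The source is only parsed; it is never imported or executed.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "DOCUMENTATION":
                # literal_eval accepts an AST node and returns the string.
                return ast.literal_eval(node.value)
    return None


# Invented sample source, standing in for a module file.
sample = '''#!/usr/bin/python
DOCUMENTATION = """
---
module: pause
short_description: Pause playbook execution
"""

import sys
sys.exit(1)  # never runs, because the source is only parsed
'''

text = extract_documentation(sample)
print(text)
```

Unlike taking the first statement in the file, this looks the assignment up by name, so the DOCUMENTATION string does not have to come first.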

Once we have that done, we can integrate the hacking/module_formatter script with the main Makefile to generate manpages too, and your PDF generator should be complete.

Let's see how we can split this stuff up (see list) and get everything documented.

GOOD STUFF!!!!

Any questions let me know. I'd look over my changes on top of your commits -- hopefully nothing too objectionable. I was able to generate the tables much more simply by jumping into HTML inside the RST.

closed this Sep 28, 2012