Separate private from public API: Underscore convention or __all__ #290

@vpratz

This is related to the documentation issue #280, where we only want to render the public API documentation, but goes a bit further.

In many cases, our current setup indicates a desired import structure (e.g. importing bayesflow.transforms.Transform instead of bayesflow.transforms.transform.Transform). This is intuitive, but it is implicit, so when generating the API docs we cannot automatically tell that the latter is not intended usage and should be private.

As far as I can tell, there are two usual approaches:
a) prefixing private modules and files with an underscore (e.g., bayesflow.transforms._transform.Transform)
b) adding public functions, classes and modules to __all__, while excluding all private ones
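
To make the two options concrete, here is a minimal sketch that builds two throwaway packages and compares what each exposes. The package names demo_a and demo_b and the helper public_api are hypothetical and used only for illustration; the helper mimics the rule that `from package import *` follows (use __all__ if present, otherwise every name without a leading underscore).

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Option a): the implementing module is renamed with a leading underscore.
pkg_a = root / "demo_a"
pkg_a.mkdir()
(pkg_a / "_transform.py").write_text("class Transform:\n    pass\n")
(pkg_a / "__init__.py").write_text("from ._transform import Transform\n")

# Option b): the module keeps its name, but __all__ lists the public API.
pkg_b = root / "demo_b"
pkg_b.mkdir()
(pkg_b / "transform.py").write_text("class Transform:\n    pass\n")
(pkg_b / "__init__.py").write_text(
    "from .transform import Transform\n\n__all__ = ['Transform']\n"
)

sys.path.insert(0, str(root))


def public_api(module_name):
    """Return the names a star import would expose (hypothetical helper)."""
    module = importlib.import_module(module_name)
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    return sorted(name for name in vars(module) if not name.startswith("_"))


api_a = public_api("demo_a")
api_b = public_api("demo_b")
print(api_a)  # ['Transform']
print(api_b)  # ['Transform']
```

Both layouts end up with the same public surface; they differ in whether the information lives in the file names (a) or in an explicit list (b).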

I think that for Sphinx to work efficiently we probably need b): either everywhere we import functions that we want to make publicly available (with autosummary_imported_members = False), or everywhere we use external modules (keras, scikit-learn, ...) (with autosummary_imported_members = True).
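
For reference, a possible conf.py fragment for the first variant could look as follows. This is a sketch, not our current configuration: autosummary_ignore_module_all is not mentioned above and is my assumption about how to make autosummary honor __all__ (it is available in recent Sphinx versions, and its exact behavior should be checked against the Sphinx version we use).

```python
# conf.py (fragment): illustrative values, not the project's actual settings
extensions = ["sphinx.ext.autosummary"]

# Generate stub pages for the entries listed in autosummary directives.
autosummary_generate = True

# Do not document members that were merely imported into a module ...
autosummary_imported_members = False

# ... but respect __all__ where a module defines it (assumption: requires a
# Sphinx version that supports this option).
autosummary_ignore_module_all = False
```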

One problem I encountered when trying this out is Python's peculiar behavior with relative imports. When doing

from .transform import Transform

we would expect one name to be added to the namespace: Transform. It turns out that two are added: transform (the submodule) and Transform (the class). The reason for this is explained in this thread.
For us, this means that unless we explicitly specify __all__ or rename transform to _transform, both are detected as public, even though our intention was to make only Transform publicly available.
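
The behavior can be reproduced in isolation. The sketch below writes a throwaway package (the name demo_pkg is hypothetical) whose __init__.py contains only the relative import from above, then lists the public names that end up in the package namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package that mirrors the layout described above.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "transform.py").write_text("class Transform:\n    pass\n")
(pkg / "__init__.py").write_text("from .transform import Transform\n")

sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")

# Every name in the package namespace that does not start with an underscore.
public_names = sorted(name for name in vars(demo_pkg) if not name.startswith("_"))
print(public_names)  # ['Transform', 'transform']
```

The submodule appears because importing it binds it as an attribute of its parent package, regardless of which names the import statement asked for.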

We can do b) semi-automatically by populating __all__ with a function like the following:

import inspect


def _add_imports_to_all(exclude=None):
    """Add all public names of the calling module to its ``__all__``.

    Names that start with an underscore or are listed in ``exclude`` are skipped.
    """
    exclude = set(exclude or [])
    # Globals of the module that called this function (one frame up the stack).
    caller_globals = inspect.stack()[1].frame.f_globals
    # Merge any existing __all__ entries with the names currently defined.
    names = set(caller_globals.get("__all__", [])) | set(caller_globals)
    # Sort so that __all__ is deterministic across runs.
    caller_globals["__all__"] = sorted(
        name for name in names if name not in exclude and not name.startswith("_")
    )

This would allow us to exclude unwanted relative imports (or, by modifying it a bit, only to add functions and classes, while manually specifying modules we want to add). If we also have a), we can exclude them automatically, as they start with an underscore.

Both solutions require a larger one-off effort, but have a comparatively low maintenance cost afterwards. As a) requires larger changes to the codebase, it would be beneficial to time the changes well to avoid large merge conflicts.

This issue is a bit technical, as both Python and Sphinx show somewhat peculiar behavior. If you have any questions, please send them so I can clarify if necessary. Also, if you have ideas for easier solutions, or think the effort is not worth the gains, please also share those considerations.

Labels: discussion, documentation, refactoring