This repository was archived by the owner on Aug 3, 2024. It is now read-only.

Namespace qualification for symbol references #667

@hvr


The Problem

Right now, we distinguish between only 3 classes of references:

  1. (by using " quotes) module names
  2. (by using ' quotes) lower-cased symbols, i.e. ordinary term level bindings
  3. (by using ' quotes) upper-cased symbols: class names, type names, data constructors, and others

The problem, however, is that 3. conflates more than one namespace. Even though H2010 forbids an identifier from referring to both a class and a type constructor within the same scope, a module can still contain separate entities that share the same identifier; e.g. in the following example,

module Bar where

class Foo t where

data Bar = Foo

data Doo = Bar

the Haddock identifier references

  • 'Bar.Bar' and
  • 'Bar.Foo'

are not uniquely determined.
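To make the ambiguity concrete, here is a minimal sketch that models the example module as a hand-written symbol table. The names `Namespace`, `barModule` and `candidates` are hypothetical and exist only for this illustration; they are not part of Haddock or GHC.

```haskell
-- | The two namespaces an upper-case identifier can live in.
data Namespace = TypeNS | ValueNS
  deriving (Eq, Show)

-- | Hypothetical symbol table for the example module @Bar@:
-- (namespace, identifier, description of the entity).
barModule :: [(Namespace, String, String)]
barModule =
  [ (TypeNS,  "Foo", "class Foo")
  , (ValueNS, "Foo", "Foo :: Bar")
  , (TypeNS,  "Bar", "data Bar")
  , (ValueNS, "Bar", "Bar :: Doo")
  , (TypeNS,  "Doo", "data Doo")
  ]

-- | Every entity a reference could mean when it carries no
-- namespace information.
candidates :: String -> [(Namespace, String)]
candidates name = [ (ns, desc) | (ns, n, desc) <- barModule, n == name ]
```

In this model, `candidates "Foo"` and `candidates "Bar"` each yield two entries (one per namespace), whereas `candidates "Doo"` yields exactly one.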

Moreover, when referring to identifiers from packages which aren't in scope, Haddock needs to make a guess.

Finally, for rendering identifier references in haddock-marked-up .cabal descriptions, hackage-server cannot perform any type-inference, so a way to decide whether to use a t: or a v: style href is needed.
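As a rough illustration of why the namespace has to be known before a link can be emitted, the sketch below builds an href from a module name, a namespace and an identifier. The `t:`/`v:` fragments follow the scheme mentioned above; the file-name layout (dots replaced by dashes plus `.html`) and the function names are assumptions made for this example.

```haskell
-- | Namespace of the referenced entity.
data Namespace = TypeNS | ValueNS
  deriving (Eq, Show)

-- | Anchor fragment in the t:/v: style.
anchor :: Namespace -> String -> String
anchor TypeNS  name = "#t:" ++ name
anchor ValueNS name = "#v:" ++ name

-- | Hypothetical href builder. The file-name scheme is an assumption.
moduleHref :: String -> Namespace -> String -> String
moduleHref modName ns name = map dotToDash modName ++ ".html" ++ anchor ns name
  where
    dotToDash '.' = '-'
    dotToDash c   = c
```

Without a namespace argument there is no principled way to choose between the two `anchor` cases, which is exactly the decision hackage-server cannot make on its own.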

UPDATE: The need for this disambiguation has increased with HiHaddock, which moves identifier resolution/lookup into the GHC compilation pipeline, at which point references in Haddock docstrings need to be resolved by GHC in a well-defined manner. Blindly guessing the category of an identifier doesn't fit well into this new scheme any more; we need to start treating docstring references in a more principled way than the stringly-typed way Haddock did previously.

Suggested solution

Variant a

Introduce a way/syntax to qualify the namespace of upper-case identifiers

Bikesheddable suggestion (which follows the way the current link-anchor generation works) for qualified identifier references:

  • 't:Foo' (or 't:Bar.Foo') refers to class Foo
  • 'v:Foo' (or 'v:Bar.Foo') refers to Foo :: Bar
  • 't:Bar' (or 't:Bar.Bar') refers to data Bar
  • 'v:Bar' (or 'v:Bar.Bar') refers to Bar :: Doo

Haddock would warn whenever it cannot uniquely disambiguate a reference, so that users know when they need to qualify a given reference with v: or t:.
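The following is a minimal sketch of how Variant a could be handled, assuming the resolver is told in which namespaces a name exists. `parseQualified` and `resolve` are hypothetical names for this illustration, not Haddock functions, and the `Left` messages merely stand in for the proposed warning.

```haskell
import Data.List (stripPrefix)

data Namespace = TypeNS | ValueNS
  deriving (Eq, Show)

-- | Split an optional @t:@ / @v:@ qualifier off a reference.
parseQualified :: String -> (Maybe Namespace, String)
parseQualified s
  | Just rest <- stripPrefix "t:" s = (Just TypeNS, rest)
  | Just rest <- stripPrefix "v:" s = (Just ValueNS, rest)
  | otherwise                       = (Nothing, s)

-- | Resolve a reference, given the namespaces in which the name exists.
-- A 'Left' stands in for the warning Haddock would emit.
resolve :: [Namespace] -> String -> Either String (Namespace, String)
resolve known ref =
  case parseQualified ref of
    (Just ns, name)
      | ns `elem` known -> Right (ns, name)
      | otherwise       -> Left ("not in scope: " ++ ref)
    (Nothing, name) ->
      case known of
        [ns] -> Right (ns, name)
        []   -> Left ("not in scope: " ++ name)
        _    -> Left ("ambiguous: " ++ name ++ "; qualify it with t: or v:")
```

For the example module, `resolve [TypeNS, ValueNS] "Foo"` produces the ambiguity message, while `resolve [TypeNS, ValueNS] "t:Foo"` resolves to the class.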

Variant b

Use export/import-list-ish syntax to mark upper-case symbols as value-level terms. Without this marking, an upper-case symbol refers to a non-value-level symbol, both by default and in case of ambiguity:

  • '(Foo)' or 'Bar(Foo)' or 'Bar.(Foo)' or 'Bar.Bar(Foo)' refers to Foo :: Bar
  • '(Bar)' or 'Bar.(Bar)' or 'Doo(Bar)' or 'Bar.Doo(Bar)' refers to Bar :: Doo

NOTE:

  • The type name before the ()s may be left off
  • We don't have a way to explicitly qualify non-value-level symbols; consequently we cannot warn about ambiguities as there's no actionable way to silence those warnings if we in fact want to refer to non-value symbols.
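A small sketch of how Variant b references might be classified, assuming any parenthesised name marks a value-level symbol. `Ref` and `parseVariantB` are hypothetical names; the sketch deliberately discards the optional module/type prefix before the parentheses.

```haskell
-- | Outcome of classifying a Variant b reference.
data Ref
  = ValueRef String    -- ^ explicitly marked value-level, e.g. @Bar(Foo)@
  | DefaultRef String  -- ^ everything else, left to the default rules
  deriving (Eq, Show)

-- | Classify a reference. Anything of the form @prefix(Name)@ is a value
-- reference to @Name@; the prefix is ignored in this sketch.
parseVariantB :: String -> Ref
parseVariantB s =
  case break (== '(') s of
    (_, '(' : rest)
      | (name, ")") <- break (== ')') rest
      , not (null name) -> ValueRef name
    _ -> DefaultRef s
```

Note that `Ref` has no constructor for an explicitly type-level reference, which mirrors the second point above: there would be nothing for a user to write in order to silence an ambiguity warning.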

Variant c

(mentioned in #667 (comment))

This is a more economical and ergonomic version of Variant a, inspired by Python-style literal prefixes:

  • t'Foo' (or t'Bar.Foo') refers to class Foo
  • v'Foo' (or v'Bar.Foo') refers to Foo :: Bar
  • t'Bar' (or t'Bar.Bar') refers to data Bar
  • v'Bar' (or v'Bar.Bar') refers to Bar :: Doo

This also avoids the awkwardness for referring to operators as mentioned in #667 (comment):

  • v':='
  • v'+'

This exploits the fact that the markup needs only a single-letter prefix for disambiguation, and it is extensible to additional qualifications should the need ever arise. Another benefit of this variant is graceful degradation with older Haddock parsers, which will still recognize the symbol reference and resolve it into a hyperlink according to the old heuristics; neither Variant a nor Variant b has this property.
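The sketch below illustrates the prefix scheme and the degradation argument with a deliberately simplified scanner. It assumes every whitespace-free '...' pair is a reference, and it does not check that the prefix letter stands on its own; `refsNew` and `refsOld` are hypothetical names, not Haddock's actual parser.

```haskell
import Data.Char (isSpace)

data Namespace = TypeNS | ValueNS
  deriving (Eq, Show)

-- | Extract references together with an optional namespace taken from
-- the character immediately preceding the opening quote.
refsNew :: String -> [(Maybe Namespace, String)]
refsNew = go Nothing
  where
    go _ [] = []
    go prev ('\'' : rest) =
      case break (== '\'') rest of
        (name, '\'' : after)
          | not (null name), all (not . isSpace) name ->
              (prev >>= prefixNS, name) : go Nothing after
        _ -> go Nothing rest
    go _ (c : rest) = go (Just c) rest

    prefixNS 't' = Just TypeNS
    prefixNS 'v' = Just ValueNS
    prefixNS _   = Nothing

-- | Model of an older parser: it knows nothing about prefixes and only
-- sees the quoted identifiers.
refsOld :: String -> [String]
refsOld = map snd . refsNew
```

On the input `"t'Foo' and v':='"`, `refsNew` yields a type-level `Foo` and a value-level `:=`, while `refsOld` still finds `Foo` and `:=` as plain references. With Variant a, the same older parser would instead see `t:Foo` inside the quotes, which is not an identifier it can resolve.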

I don't think we can do any better than Variant c; I believe it to be the most convenient markup possible in terms of editing/typing friendliness. It's also reasonably non-intrusive and doesn't steal syntax you'd otherwise use in English prose typesetting.
