

The* url space should not include scripts #176

timbunce opened this Issue · 33 comments

9 participants


Scripts, like mcpani, have a url like

That will cause a problem if a module is ever released with the same name as an existing script.

PAUSE manages the namespace for modules but there's no such namespace management for scripts.

I suggest scripts get put in a /script/ url like

MetaCPAN member

Excellent idea. I would say this is, on some level, related to this issue: CPAN-API/cpan-api#110 ie. we need to define what should or should not exist in the module namespace.

MetaCPAN member

I agree. However we have to keep in mind that there is also documentation, which consists of POD only (.pod extension). Should they live in the module namespace or the script namespace?


They can't live in the module namespace because PAUSE manages that namespace allocation and doesn't pay attention to .pod files, only to perl code with "package ...;".

I suggest they simply use source/... for now.

I can't stress enough how important it is to think through the mapping to urls. It's better to use explicit source/... urls that include the dist version for anything that hasn't been thought-through yet.

MetaCPAN member

But there are several modules that ship with Module.pm and Module.pod in the same directory. /module/Module should then show the .pod instead of the .pm, because the .pm doesn't contain any pod. That's how sco does it, and that's how we do it at the moment. That's a special case though, and I generally agree that we should have only modules listed in the PAUSE module index under the /module namespace.


@monken: But currently, when we ask to view the source of some /module/Module, we get empty content, because the documentation is stored in a .pod and the code in a .pm, and MetaCPAN does not automatically switch from one to the other.

MetaCPAN member

damn regressions, used to work in the pre-catalyst era :)

MetaCPAN member

I'm dragging @clintongormley into this.

I'd like to discuss a uri scheme that we can then implement.

  • /module/Moose::Meta::Class
  • /module/FLORA/Moose-2.0204/lib/Moose/Meta/

This will return the latest version of Moose::Meta::Class. If we find documentation for that package in the file or the corresponding .pod file, we will use that and render it as html. If we don't, we will show the source code of the package.
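A minimal sketch of that fallback, assuming the contents of the .pm file and of an optional sibling .pod file are already loaded as strings. The function name `choose_view` and the simple "line starts with `=` plus a letter" test for POD are illustrative assumptions, not how MetaCPAN detects documentation.

```python
import re


def has_pod(text):
    """Rough check: a POD command paragraph starts a line with '=' and a letter."""
    return any(re.match(r"=[a-zA-Z]", line) for line in text.splitlines())


def choose_view(pm_text, pod_text=None):
    """Return ("pod", text) when documentation is found, else ("source", pm_text).

    pm_text  -- contents of the package's .pm file
    pod_text -- contents of the corresponding .pod file, if one exists
    """
    if has_pod(pm_text):
        return ("pod", pm_text)
    if pod_text is not None and has_pod(pod_text):
        return ("pod", pod_text)
    # No documentation anywhere: fall back to showing the source code.
    return ("source", pm_text)
```

The order (inline pod first, then the .pod file) follows the sentence above; a real implementation might prefer the .pod file when both exist.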

  • /pod/cpanm
  • /pod/Catalyst::Manual::About
  • /pod/BOBTFISH/Catalyst-Manual-5.9000/lib/Catalyst/Manual/About.pod

Those files don't have a package declaration and are thus to be handled separately. Since there is no central index that makes sure that there are only unique names, we have to apply some kind of heuristic. What I do currently is to look only in releases that are flagged as "latest". Releases are flagged as latest if they contain a package that is in the 02packages.details.txt file.
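To illustrate the "latest" flag, here is a rough sketch. It assumes the body of 02packages.details.txt follows the header after a blank line, and that each body line has three whitespace-separated columns (package, version, archive path). The function name `latest_releases` is made up for this example, and treating "the archive an index line points at" as the latest release is my reading of the rule above, not a description of the actual indexer.

```python
def latest_releases(index_text):
    """Collect the archive paths referenced by the package index.

    index_text -- full text of a 02packages.details.txt style file:
                  header lines, one blank line, then the body.
    """
    _header, _sep, body = index_text.partition("\n\n")
    latest = set()
    for line in body.splitlines():
        parts = line.split()
        if len(parts) == 3:
            _package, _version, archive = parts
            # Any release that provides an indexed package counts as latest.
            latest.add(archive)
    return latest
```

A .pod-only file or a script would then only be considered if its release's archive path is in the returned set.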

People are also requesting a third case, where they want to access files independent of the release version, identified only by the distribution.

  • /distribution/Moose/README
  • /distribution/Catalyst/lib/Catalyst/Manual/About.pod

This always requires a full path to the file, but without the full release name (author + version).

Anything else?

  • /module/Moose::Meta::Class

Good. Simple, obvious, direct, short, unambiguous.

  • /module/FLORA/Moose-2.0204/lib/Moose/Meta/

I'm really not keen on this. It seems like an abuse of the /module/ url namespace. I think /pod and /file could be used to access the pod and source of specific files.

  • /pod/cpanm

Not safe. There's no namespace management for scripts. Two dists could easily contain scripts with the same name. Let's avoid a debate about which one to show.

  • /pod/Catalyst::Manual::About

Redundant given /module/Catalyst::Manual::About

  • /pod/BOBTFISH/Catalyst-Manual-5.9000/lib/Catalyst/Manual/About.pod


  • /distribution/Moose/README
  • /distribution/Catalyst/lib/Catalyst/Manual/About.pod

That's a good idea but is ambiguous about pod vs raw.


It seems to me there are two things the url is trying to achieve: a) specifying what's being referred to, and b) specifying how to display information about it.

The What:

  • module/Foo::Bar
  • release/AUTHOR/Foo-Bar-1.23/$filepath <-- filepath could be empty to refer to the release file itself
  • distribution/Foo-Bar/$filepath <-- is this robust without an AUTHOR? Maybe allow an optional AUTHOR
  • script/Foo-Bar/$scriptname <-- just a suggestion

The How to present it:

  • pod - show pod if there is, else show source
  • file - show source
  • raw - return the content with no formatting at all
  • meta - return metacpan data about the specified thing (formatted in HTML or JSON per the Accept http header)

Since the What can include an arbitrary number of slashes it makes sense to put the How in front:

  • /$how/$what


  • /pod/module/Foo::Bar
  • /file/module/Foo::Bar
  • /raw/release/AUTHOR/Foo-Bar-1.23/Changes
  • /raw/release/AUTHOR/Foo-Bar-1.23 <-- return the tarball

Your previous examples would be:

  • /pod/module/Moose::Meta::Class
  • /pod/release/FLORA/Moose-2.0204/lib/Moose/Meta/
  • /pod/script/App::cpanm/cpanm
  • /pod/module/Catalyst::Manual::About
  • /pod/release/BOBTFISH/Catalyst-Manual-5.9000/lib/Catalyst/Manual/About.pod
  • /file/distribution/Moose/README
  • /file/distribution/Catalyst/lib/Catalyst/Manual/About.pod

And the existing /requires/module/DBI fits nicely into that scheme.

And suggests interesting extensions: /requires/distribution/Foo-Bar could show all the distributions that depend on any of the modules in Foo-Bar.
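A small sketch of how the /$how/$what split might be parsed. The names `HOWS`, `KINDS` and `parse_url` are invented for illustration; the allowed values are just the ones listed in this comment, with "requires" added because the existing /requires/module/DBI url fits the pattern.

```python
HOWS = {"pod", "file", "raw", "meta", "requires"}
KINDS = {"module", "release", "distribution", "script"}


def parse_url(path):
    """Split '/$how/$kind/@args' into (how, kind, args).

    The 'how' comes first precisely because the 'what' may contain
    an arbitrary number of slashes.
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) < 3:
        raise ValueError("expected /how/kind/args: %r" % path)
    how, kind, *args = segments
    if how not in HOWS:
        raise ValueError("unknown presentation: %r" % how)
    if kind not in KINDS:
        raise ValueError("unknown kind of thing: %r" % kind)
    return how, kind, args
```

For example, `parse_url("/raw/release/AUTHOR/Foo-Bar-1.23/Changes")` yields the presentation `raw`, the kind `release`, and the remaining segments as the arguments.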

MetaCPAN member

A problem with this scheme came up that we were not able to solve.

We somehow have to define a common endpoint for documentation because documentation of modules contains links to other modules (or scripts or pods) without any indication. So it's both L<cpanm> and L<Moose>, which we have to translate to some common uri.

RE /pod/Catalyst::Manual::About being redundant to /module/...: Catalyst::Manual::About is not a module, because the documentation is in a .pod file, there is no .pm file and no package declaration. Only modules listed in the 02packages file are considered modules.


The fact is that there are two namespaces for code of scripts and modules, but a single common one for POD. But this is only a problem for single word entries (multiple words separated with '::' are always modules) without extension ("" is a script).
For the remaining ambiguous names (/\A\w+\z/), we could use these heuristics:

  • starts with a capital => look up a module; if that fails, look up a script
  • known pragma (look up with Module::CoreList) => module, ?
  • else look up a script; if that fails, look up a module
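The heuristics above could be sketched roughly like this. `KNOWN_PRAGMAS` is a tiny hard-coded stand-in for a Module::CoreList lookup, `lookup_order` is a made-up name, and treating every other non-`\w+` name as a script is an assumption on my part.

```python
import re

# Stand-in for a Module::CoreList query; only a few pragmas for illustration.
KNOWN_PRAGMAS = {"strict", "warnings", "utf8"}


def lookup_order(name):
    """Return the namespaces to try, in order, for a POD link target."""
    if "::" in name:
        return ["module"]            # multi-word names are always modules
    if not re.fullmatch(r"\w+", name):
        return ["script"]            # assumption: e.g. names with an extension
    if name[0].isupper():
        return ["module", "script"]
    if name in KNOWN_PRAGMAS:
        return ["module"]            # the fallback here is left open above
    return ["script", "module"]
```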

And note that I don't think we should distinguish URLs for POD by whether it is stored in a .pm or in a .pod, if a corresponding .pm or script exists. It is the author's choice to embed the POD in the code or not, and that should not impact how the doc is displayed: this is the current behavior of perldoc and this rule should not be broken.

@monken An endpoint has to be defined to present links in HTML view of POD. But this endpoint does not have to be the final destination. Of course, this would be better if that was. But for ambiguous (as defined above) POD links, a temporary endpoint could redirect to the final endpoint with an HTTP 302. So the (maybe costly) resolution of ambiguous names could be done only for files to be displayed, not for any link.


@dolmen The redirect is a good idea.

Automatically redirect a /pod/$foo to the resolved /pod/module/$foo or /pod/script/$dist/$foo if there's only one current possibility. If there's more than one then show a 'disambiguate' page that lists the possibilities so the user can choose.

(The non-ambiguous cases could be cached and used to present the 'right' link in the source document when it's rendered.)
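A rough sketch of that redirect-or-disambiguate step, assuming two in-memory lookups: a set of indexed module names and a mapping from distribution name to the script names it ships. The function name and the returned action strings are invented for this example.

```python
def resolve_pod_link(name, indexed_modules, scripts_by_dist):
    """Decide what /pod/$name should do.

    Returns ("redirect", url) when exactly one candidate exists,
    ("disambiguate", [urls]) when several do, ("not_found", []) otherwise.
    """
    candidates = []
    if name in indexed_modules:
        candidates.append("/pod/module/%s" % name)
    for dist in sorted(scripts_by_dist):
        if name in scripts_by_dist[dist]:
            candidates.append("/pod/script/%s/%s" % (dist, name))

    if len(candidates) == 1:
        return ("redirect", candidates[0])
    if candidates:
        return ("disambiguate", candidates)
    return ("not_found", [])
```

Caching the single-candidate results would let the renderer emit the resolved url directly instead of the /pod/$name form.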

And I agree that the form the author chooses to release module pod as shouldn't alter the endpoint url for the module pod.


Perhaps a helpful illustration: links to for the script 'p'. When App::p was uploaded, that page was the documentation that came with App::p. But now, it shows something else - the docs for a different dist that also has a script called 'p' that was uploaded later. So we're not even consistent!

MetaCPAN member

Another example where /module/XXX doesn't work e.g.
doesn't match

MetaCPAN member

@ranguard why should it? If any, it should match (but doesn't). There is no index for app documentation so metacpan makes up its own.

MetaCPAN member

@monken - works - which I appreciate is probably a side effect not a feature, I was just giving another example where this hack doesn't work.

I'd like a URL to link to these that isn't a hack see issue #805

MetaCPAN member

me too, but this issue is about something else :)

MetaCPAN member

From the perspective of a pod parser:

perlpod describes the L<> construct as a link to "a Perl manual page" and has specific examples for modules (Net::Ping) and perldocs (perlsyn). (Also man pages but that's less relevant.)

It doesn't mention scripts or other dist files, but it does use the same construct to link to indexed modules and non-indexed pod (perlsyn is not in 02packages).

perlpodspec however adds scripts to the list of what an L<> can link to:

the name of a Pod page like L<Foo::Bar> (which might be a real Perl module or program in an @INC / PATH directory, or a .pod file in those places)

So I think I'd vote for the single endpoint that would make a best guess as to where to redirect them (module, then pod, then script, then whatever else), and offer a disambiguation page if the identifier is non-unique.

I also wonder if there might be value in having that decision be made by the api, but I haven't really thought that through yet.

Of course that isn't really what this ticket was about, which was offering a different endpoint (other than module) that would show the html version of pod for non-module files.

MetaCPAN member

I'm going to take a crack at a /pod/type/path controller and we'll go from there.

MetaCPAN member

So along the lines of @timbunce's "alternative" suggestion...

  • /pod/module/Moose::Meta::Class
    • Latest version of indexed module
    • Same as /module/Moose::Meta::Class
  • /pod/module/Catalyst::Manual::About
    • If .pod exists for indexed module use it.
    • If not indexed we could look for .pod and redirect them to release or distribution url or possibly give them a 404 with suggestions.
  • /pod/release/FLORA/Moose-2.0204/lib/Moose/Meta/
    • Specific version of a module file
    • Could have a rel=canonical to /module/$1 if it's indexed
    • Non-indexed could possibly rel=canonical to /pod/distribution/$1 if #796/#797 gets worked out
  • /pod/release/BOBTFISH/Catalyst-Manual-5.9000/lib/Catalyst/Manual/About.pod
    • Specific version of pod file
    • Essentially the same as previous
  • /pod/distribution/Catalyst-Manual/lib/Catalyst/Manual/About.pod
    • File in latest version of a dist
    • This suffers from #796/#797 but may be less prone to conflicts since the full file path is used.
    • If there's a conflict, show a disambiguation page.
    • Otherwise essentially the same as /pod/release/$1
  • /pod/script/cpanm
    • Look for files with exactly that basename
    • (We could prefer things in bin/, script/, or scripts/ but I'm not sure that gains us anything).
    • Offer disambiguation page (linking to release or distribution url) if not unique.
  • /pod/find/perlsyn
    • Primarily for the sake of L<> pod links which are, by specification, ambiguous.
    • Prefer:
      • Indexed module (probably redirect to /module/$1).
      • Non-indexed pod file (probably redirect to release or distribution url).
      • Script (probably redirect to release or distribution url).
    • Show Disambiguation page if not unique.
    • We wouldn't have to redirect... (not redirecting would reduce the number of API calls being made...) but it does seem the more appropriate thing to do.

For any redirects we could prefer /pod/distribution/$1 but we may want to wait
until dist-names are unique or #796/#797 otherwise gets resolved.

We could also have a /file/ controller that finds files the same way
but shows the source instead of the pod.

Old /module namespace

Should we keep the old /module/Foo::Bar as the canonical url?
It's probably the most commonly used/desired url on the site.
We could redirect non-indexed modules to one of the other urls
(to fix the /module/scriptname behavior).


MetaCPAN member

my 2 cents

  1. /script/cpanm doesn't solve the issue that anyone can upload a cpanm script, which makes it not unique. IMO scripts should only be accessed through the version-independent distribution endpoint (/distribution/App-cpanminus/bin/cpanm). Authors can still upload an App-cpanminus distribution and take over that link. However, that's less likely than someone releasing his/her own local::lib.
  2. Rename /module/{arg} to /pod/{arg}. If {arg} is not a registered module (i.e. 02packages.txt), redirect to the version-independent distribution endpoint (/distribution/{dist}/{full path to file}). That way we save a redirect for most documentation, which belongs to modules, and provide a pretty stable url for scripts and .pod files.
  3. /pod/{release with version}/{full path to file} will behave just like /module/{...} today

Pod parsers will then use /pod/ as a base for L<>. In case the requested resource is not an indexed module, we redirect to the canonical version independent distribution endpoint (CVIDE). If there is more than one documentation with the same name, we show a page where the user can select where he wants to go.

To summarize:

  • /pod/Moose

    show latest Moose documentation

  • /pod/cpanm

    redirect to /distribution/App-cpanminus/bin/cpanm

  • /pod/Catalyst::Manual::About

    redirect to /distribution/Catalyst-Manual/lib/Catalyst/Manual/About.pod

  • /pod/cpan

    show list of documentation with title cpan to choose from (distributions WAIT, CPAN, App-Cpan)

  • /pod/ETHER/Moose-2.0802/lib/

    version specific documentation
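As a sketch of the single-argument cases in this summary: indexed modules are rendered in place, a unique non-indexed document redirects to its distribution url, and a name with several hits produces a list to choose from. The data structures, the function name `handle_pod` and the action strings are assumptions made for illustration only.

```python
def handle_pod(arg, indexed_modules, unindexed_docs):
    """Decide how /pod/{arg} is served.

    indexed_modules -- set of module names from the package index
    unindexed_docs  -- name -> list of (dist, path) for scripts and .pod files
    """
    if arg in indexed_modules:
        # Most documentation is for modules, so no redirect is needed here.
        return ("render", arg)
    hits = unindexed_docs.get(arg, [])
    if len(hits) == 1:
        dist, path = hits[0]
        return ("redirect", "/distribution/%s/%s" % (dist, path))
    if hits:
        # Several distributions ship a document with this name.
        return ("choose", sorted(dist for dist, _path in hits))
    return ("not_found", None)
```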


fwiw, I think Tim's points/proposal, rwstauner's summary and monken's points are sound. I especially like the twist of potentially ambiguous links like /pod/cpan automatically rendering a search/disambiguation page. However, there's one quirky case to consider:

  1. miyagawa introduces cpanm. Gods and men rejoice. /pod/cpanm renders the docs.
  2. Some time later, some jerk includes a cpanm script in his distribution
  3. BOOM, all anchored links to the documentation break on the shiny, new disambiguation page

Sure, you could apply the same anchors to all disambiguation results so that the next click gets you where you want to go, but it may throw some people off.

And IMHO, I think the /pod/ should be canonical, letting /module/ die out as it will.

MetaCPAN member

@jayallen if an author wants his/her module/script to be indexed, he/she should add a package cpanm; line to the file and /pod/cpanm will always link to this module/script. Not doing that is just calling for trouble. It has the nice side effect that cpan cpanm will actually work.


Agreed. Don't add complexity to cater to laziness. :)

MetaCPAN member

I'm fine with not doing 1. (/script/blah)... I probably would have left it until last and never gotten around to it anyway.

I really like the idea of the identifier (/pod/$type/@args) because it seems simple, reads well, and allows for future expansion (which would have been nice to have from the start).

I might be able to warm up to the idea of 2. because it seems to cover the cases, however the /distribution/@path endpoint bothers me:

  • It's a misnomer (which is technically how this issue began... "incorrect" urls).
  • It will be limited to showing pod even though it sounds like it ought to be showing distribution metadata (favorites, bugs, requirements, relations, rev-deps... I'm not sure what else could go here it just seems to me like "files" isn't it).
  • It would become another endpoint that's showing pod, and we still don't have a CVIDE for non-pod files. I could easily see wanting to be able to link to a script in the examples/ directory of the latest version of a dist or a certain dist's dist.ini or cpanfile or something.

Number 3. seems too limiting.

We might be able to do a combination:
A single arg could be the magic redirect of 2. (or a disambiguation page):

  • /pod/Module::Name
  • /pod/script_name
  • /pod/Documentation::Name

and we could use the qualifiers for other entries (more than one arg):

  • /pod/release/AUTHOR/NAME-VER/@path
  • /pod/distribution/NAME/@path

Then a file endpoint could have consistent urls for source code: /file/release/..., /file/distribution/...

As for the non-unique dist names, here's a stupid idea: we could let people link to files that are in the dist of a named (indexed) module... something like /pod/module/App::cpanminus/bin/cpanm. (NOTE: I'm only mildly serious on this one, it seems silly, but it would be a solution.)

MetaCPAN member

I really like the way this discussion has progressed. This last comment from @rwstauner sounds very good to me. As far as the final (silly?) solution, I think something along those lines might be a great idea. It's a memorable URL and it would give you what you expect to get from it.

MetaCPAN member

/pod/$type/@args requires a redirect for L<> pod links if we want to establish it as the canonical url. And that is something I'm not comfortable with. That will add another huge delay to page loads.
Another reason why I think /pod/$type/@args is not a good idea for a canonical url is that $type might actually change. People will realize that their script should actually be a package and that will change $type, making the old link 404.

If authors want /pod/$module to do the right thing, they just define a proper package. It's not that hard! We shouldn't encourage people to upload dists that are not properly packaged by building workarounds.

The disambiguation page is a nice way to let authors know that the namespace is not taken yet and they simply have to define that package name.

MetaCPAN member

I agree with linking to /pod/$arg and using that as the canonical where possible (no redirect).

How do you feel about using the qualifier for the other (direct) links (release, distribution, etc)?

I think that gives us the maximum flexibility and the /pod/$single_arg gives us a good (stable) url.

Best of both worlds.


Can we use a more SEO-friendly word than pod? docs? help?


I'd say POD IS more SEO friendly since when I'm searching for perl module documentation, I'm going to be looking for POD. Did you mean more non-perl-developer-human-friendly?

MetaCPAN member

@rwstauner has merged his changes (3 months ago) - so closing

@ranguard ranguard closed this