Replace die.net links with better resources where applicable #5143

Closed
mebeim opened this issue Jan 14, 2021 · 23 comments · Fixed by #5528

@mebeim
Member

mebeim commented Jan 14, 2021

There was a short discussion today on Gitter about pages which have "more information" links pointing to die.net. As @navarroaxel showed there is man.archlinux.org, which is for Arch Linux programs, and which could be better suited for Arch Linux specific utilities.

In general though, we have the same distro-generic set of manual pages that die.net offers on man7.org, which IMHO renders the pages a lot better (sometimes die.net pages have syntax errors due to special characters being treated strangely when rendered in HTML, furthermore man7.org pages are actually in monospace).

Therefore I suggest we convert all "more information" links which are pointing to die.net to point to man7.org instead, and in case the package/command is distro-specific, directly point to the distribution's own manual pages if available, for example:

@mebeim mebeim added decision A (possibly breaking) decision regarding tldr-pages content, structure, infrastructure, etc. mass changes Changes that affect multiple pages. labels Jan 14, 2021
@bl-ue bl-ue self-assigned this Jan 15, 2021
@sbrl
Member

sbrl commented Jan 16, 2021

Sounds like a good idea!

@bl-ue
Contributor

bl-ue commented Jan 31, 2021

How about https://man7.org? It seems distro-agnostic (at least the URL is), and it's updated very frequently: the last update was two days ago as of 1/31/2021.

Once we decide, I'll go through and fix everything.

@bl-ue
Contributor

bl-ue commented Jan 31, 2021

Oops sorry I barely even read your message @mebeim! man7.org it is.

@bl-ue
Contributor

bl-ue commented Feb 2, 2021

I'll tackle this now.

@sbrl
Member

sbrl commented Feb 2, 2021

Sounds good to me. Although, should a command be distribution specific (e.g. dpkg, or apt) we should reference the source distribution (in the case of that example, it would be the Debian website I'd assume).

@marchersimon
Collaborator

marchersimon commented Mar 28, 2021

@bl-ue oh, sorry I didn't see you were working on this.

@bl-ue
Contributor

bl-ue commented Mar 28, 2021

Don't be sorry—I'm glad! I have a lot of things I've assigned myself and I'm glad an energetic helper has shouldered one of the burdens! 🙂

@vladimyr
Contributor

vladimyr commented Mar 29, 2021

> As @navarroaxel showed there is man.archlinux.org, which is for Arch Linux programs, and which could be better suited for Arch Linux specific utilities.

Arch manpages are not exclusively used for Arch utilities. Take vmstat(8) for example. It is part of an independent project (procps) living at https://gitlab.com/procps-ng/procps. In other words, Arch folks host distro-generic stuff too.

> In general though, we have the same distro-generic set of manual pages that die.net offers on man7.org, which IMHO renders the pages a lot better (sometimes die.net pages have syntax errors due to special characters being treated strangely when rendered in HTML, furthermore man7.org pages are actually in monospace).

Oh, you are concerned about quirky HTML conversions and rendering, I feel your pain. Let me show you something really nice: https://gitlab.archlinux.org/archlinux/archmanweb/-/blob/275f3ee173ae658a6eedc139c49002371192728a/archmanweb/urls.py#L12
What that means is that you can switch between different formats:

This last thing is especially cool cause it enables you to do something like this: curl -sL https://man.archlinux.org/man/vmstat.8.raw | man /dev/stdin

Also, take a look at the actual HTML source code of that vmstat(8) page: no JavaScript at all! OTOH, while I do appreciate Michael's work on man7.org, it is a bit uncool to stick two (controversial) trackers into someone's face:

<!--BEGIN-SITETRACKING-->
<!-- SITETRACKING.man7.org_linux_man-pages -->

<!-- Default Statcounter code for man7.org/linux/man-pages
http://www.man7.org/linux/man-pages -->
<script type="text/javascript">
var sc_project=7422636;
var sc_invisible=1;
var sc_security="9b6714ff";
</script>
<script type="text/javascript"
src="https://www.statcounter.com/counter/counter.js"
async></script>
<noscript><div class="statcounter"><a title="Web Analytics
Made Easy - StatCounter" href="https://statcounter.com/"
target="_blank"><img class="statcounter"
src="https://c.statcounter.com/7422636/0/9b6714ff/1/"
alt="Web Analytics Made Easy -
StatCounter"></a></div></noscript>
<!-- End of Statcounter Code -->



<!-- Start of Google Analytics Code -->

<script type="text/javascript">

  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-9830363-8']);
  _gaq.push(['_trackPageview']);

  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();

</script>

<!-- End of Google Analytics Code -->

<!--END-SITETRACKING-->

taken from https://man7.org/linux/man-pages/man8/vmstat.8.html

> Therefore I suggest we convert all "more information" links which are pointing to die.net to point to man7.org instead, and in case the package/command is distro-specific, directly point to the distribution's own manual pages if available, for example:

I propose Arch manpages as the default, man7.org as a fallback (in case a page is not present in the Arch manpage set), and distro-specific man frontends where needed.

@vladimyr
Contributor

vladimyr commented Mar 29, 2021

For the record here is a complete (check)list of pages pointing to linux.die.net:

  • addpart
  • aplay
  • arecord
  • cfdisk
  • dumpe2fs
  • e2freefrag
  • e2image
  • efibootmgr
  • expect
  • filefrag
  • ifdown
  • iftop
  • ifup
  • iotop
  • isosize
  • iwconfig
  • jpegtran
  • lftp
  • logsave
  • ltrace
  • mdadm
  • named
  • pdflatex
  • pmount
  • semanage
  • snmpwalk
  • sponge
  • transmission-create
  • vipw
  • vmstat
  • xar

Generated using ripgrep:

rg -l 'linux.die.net' | xargs -n1 basename -s '.md' | sort | xargs -n1 printf '- [ ] %s\n'

@vladimyr
Contributor

I wrote a quick and dirty script to check for manpages on the Arch end:

check-manpages
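#!/bin/bash
# Check whether each man page linked via linux.die.net also exists on man.archlinux.org.
set -euo pipefail

# Run from the local tldr page cache so that rg prints paths like "linux/ifup.md".
pushd "$HOME/.tldr/cache/pages" >/dev/null 2>&1

# Each match becomes "<tldr page path>:<name>.<section>".
pages=$(rg -o 'linux\.die\.net/man/([^/])/([^>]+)' -r '$2.$1' | sort)

for page in $pages; do
  tldrpage=$(echo "$page" | cut -d':' -f1)
  manpage=$(echo "$page" | cut -d':' -f2)
  url="https://man.archlinux.org/man/$manpage"
  # Keep only the HTTP status line of the HEAD response.
  status=$(curl -sI "$url" | grep 'HTTP/')
  printf '%s:\t%s\t%s\n' "$tldrpage" "$manpage" "$status"
done

popd >/dev/null 2>&1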

And here is what I got:

❯ bash check-manpages | grep 404
common/iotop.md:        iotop.1 HTTP/2 404 
linux/ifdown.md:        ifdown.8        HTTP/2 404 
linux/ifup.md:  ifup.8  HTTP/2 404 
linux/pmount.md:        pmount.1        HTTP/2 404 
linux/semanage.md:      semanage.8      HTTP/2 404 
osx/xar.md:     xar.1   HTTP/2 404 

xar is obviously an outlier here because it is macOS specific. For the rest, it seems that only semanage(8) is available on man7.org (checked using the search query inurl:"iotop.1" OR inurl:"ifdown.8" OR inurl:"ifup.8" OR inurl:"pmount.1" OR inurl:"semanage.8" site:man7.org/linux/man-pages).
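
The "Arch first, man7.org as a fallback" lookup described above could also be scripted instead of checked by hand. Below is a minimal sketch, assuming the URL shapes quoted in this thread (man.archlinux.org/man/<name>.<section> and man7.org/linux/man-pages/man<section>/<name>.<section>.html). The helper names arch_url, man7_url and first_available are hypothetical, and the lookup itself needs network access.

```shell
#!/bin/bash
# Candidate URL builders; the URL shapes are copied from links quoted in this thread.
# Arguments: $1 = page name (e.g. vmstat), $2 = section (e.g. 8).
arch_url() { printf 'https://man.archlinux.org/man/%s.%s\n' "$1" "$2"; }
man7_url() { printf 'https://man7.org/linux/man-pages/man%s/%s.%s.html\n' "$2" "$1" "$2"; }

# Print the first candidate URL that answers a HEAD request successfully.
# Hypothetical helper: tries Arch first, then man7.org; returns 1 if neither responds.
first_available() {
  local name=$1 section=$2 url
  for url in "$(arch_url "$name" "$section")" "$(man7_url "$name" "$section")"; do
    # -s: silent, -f: fail on HTTP errors such as 404, -I: HEAD request only.
    if curl -sfI -o /dev/null "$url"; then
      printf '%s\n' "$url"
      return 0
    fi
  done
  return 1
}
```

For example, `first_available semanage 8` would try the Arch URL first and only then the man7.org one, returning a non-zero status if neither is reachable.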

@vladimyr
Contributor

ifup and ifdown are part of ifupdown suite maintained by Debian:

pmount is also maintained by Debian folks:

iotop is an independent project (http://guichaz.free.fr/iotop/), but the manpage itself says that it (the manpage, not iotop) was written for Debian:

I'm not aware of any RHEL or macOS specific online manpage viewers 🤷

@marchersimon
Collaborator

@vladimyr I now replaced all links with the arch man page where possible in my PR (#5528). Should I search for alternatives for the other man7.org pages too?

@vladimyr
Contributor

@marchersimon Well, I like that you appreciate my opinion on the subject but let's wait for others to have their say and reach a consensus.

@waldyrious
Member

> I now replaced all links with the arch man page where possible in my PR (#5528).

I'd rather we reach explicit consensus here before making the changes. I'd suggest holding off that PR for now, until we reach a decision.

@mebeim, @sbrl — you two expressed preference towards man7.org; any thoughts on @vladimyr's comments?

@mebeim
Member Author

mebeim commented Mar 30, 2021

Thanks for the ping @waldyrious.

@vladimyr I see your point and I must agree, it seems that Arch man pages are a way better alternative to man7. If we all agree, I'd suggest pointing to those if they provide arch-agnostic pages, or to the appropriate distro (e.g. Debian/Ubuntu/etc) if the tool is distro specific.

TLDR: prefer Arch / other distros to man7/die.net. Sounds good?

@waldyrious
Member

waldyrious commented Mar 31, 2021

> TLDR: prefer Arch / other distros to man7/die.net. Sounds good?

@mebeim do you also think we should prefer Ubuntu manpages over unix.com? I'm trying to understand the motivation for this comment.

@vladimyr
Contributor

Folks, just a quick heads up, I have made some recent discoveries that actually made me alter my proposal 🙃
I'm AFK at the moment but expect a proper update on this in a few hours...

@vladimyr
Contributor

vladimyr commented Apr 1, 2021

Ok, I'm back...

First I need to clear things up. I wasn't precise enough stating:

> In other words, Arch folks host distro-generic stuff too.

They do host stuff that is not necessarily Arch specific, but only if it is part of the core Arch package set or distributed through AUR, so there is some truth in @mebeim's earlier statement:

> As @navarroaxel showed there is man.archlinux.org, which is for Arch Linux programs, and which could be better suited for Arch Linux specific utilities.

What it means is that we still need to fall back to other manpage frontends in cases like the ones I described here: #5143 (comment). @mebeim summarized it well:

> I'd suggest pointing to those if they provide arch-agnostic pages, or to the appropriate distro (e.g. Debian/Ubuntu/etc) if the tool is distro specific.

Sure, that works, and it is definitely an upgrade compared to linux.die.net, and a better alternative than man7.org IMHO. However, I still wished for something that would cover all our needs without having to resort to alternatives. Basically, something like https://www.unix.com/man-page-repository.php that covers many different manpage sources, but is actually user-friendly. (I think it goes without saying, but unix.com is out due to horrendous formatting and page bloat, ranging from visual clutter straight up to ads.)

And I actually found it by reading archman's README: https://gitlab.archlinux.org/archlinux/archmanweb#similar-projects
Meet https://manned.org 🎉 It is quite an old project developed by the ncdu author: https://dev.yorhel.nl and pretty much ticks all the boxes:

It is important to note that the manned.org search algorithm is tuned in such a way that in most cases it gives you the same content as archman would, which is a great thing as AUR is both large and regularly updated. In cases where a given manpage does not belong to the Arch manpage set, it transparently switches to an appropriate fallback, which essentially means that we don't need to do it manually.
Take pmount for example: https://manned.org/pmount.1 It somehow managed to beat archman at its own game. Here is what archman gives you: https://man.archlinux.org/pmount.1 i.e. nothing, although it is clearly distributed through AUR: https://aur.archlinux.org/packages/pmount OTOH, manned.org found it there and gives you the manpage straight from the latest published version 🚀
Let's try RHEL's semanage: https://manned.org/semanage.8 Voilà, straight from the CentOS manpage set. Or the macOS-specific xar: https://manned.org/xar.1 which serves us better than the fairly old https://www.unix.com/man-page/osx/1/xar/

It is so good it feels almost like magic!

So, to summarise, my new proposal is: let's use manned.org everywhere! 🎉
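
If manned.org does end up being used everywhere, the existing links could in principle be rewritten mechanically. Here is a rough sketch, assuming die.net links have the form linux.die.net/man/<section>/<name> (as in the regex used by the check script earlier in this thread) and manned.org pages the form manned.org/<name>.<section> (as in the links above). die_to_manned is a hypothetical helper name, not part of any tldr tooling.

```shell
#!/bin/bash
# Map a linux.die.net man page URL to the manned.org URL shape seen in this thread.
# Assumption: the input looks like http(s)://linux.die.net/man/<section>/<name>.
# Input that does not match the pattern is printed unchanged.
die_to_manned() {
  printf '%s\n' "$1" |
    sed -E 's#^https?://linux\.die\.net/man/([^/]+)/([^/>]+)/?$#https://manned.org/\2.\1#'
}
```

For instance, `die_to_manned 'https://linux.die.net/man/8/vmstat'` prints `https://manned.org/vmstat.8`. Each rewritten link would still need to be checked by hand, since this only changes the URL shape and does not verify that the target page exists.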

@bl-ue
Contributor

bl-ue commented Apr 1, 2021

Wow @vladimyr, that's a great find! The formatting is nice and simple, and sure enough I found ip link with all of its full documentation and even examples at the bottom: https://manned.org/ip-link

Also this is what they say on the homepage:

Indexing 5,502,148 versions of 335,112 manual pages found in 21,062,967 files of 1,414,023 packages.

Unfortunately, we need to use some other source for Gentoo pages because, as they state towards the bottom of the about page (bold mine):

Suggestions for new (or old) systems to index are welcome.

It would be great to index a few more non-Linux systems such as other BSDs, Solaris/Illumos and Mac OS X. Unfortunately, those don't always follow a binary package based approach, or are otherwise less easy to properly index.

In general, systems that follow an entirely source-based distribution approach can't be indexed without compiling everything. Since that is both very resource-heavy and open to security issues, there are no plans to include manuals from such systems at the moment. So unless someone comes with a solution I hadn't thought of yet, there won't be any Gentoo manuals here. :-(

@waldyrious
Member

Do we even have any Gentoo-only pages?

@bl-ue
Contributor

bl-ue commented Apr 1, 2021

In fact, I have no idea! 😄 I was just giving a heads up.

@vladimyr
Contributor

vladimyr commented Apr 1, 2021

> In fact, I have no idea! 😄 I was just giving a heads up.

Which I fully appreciate as I'm known as someone who doesn't read till the end of readme like in archman's case 🙃

With that being said, I'm not aware of any online viewer for Gen/Funtoo manpages. I tried searching for the emerge manpage online and this is all I got: https://dev.gentoo.org/~zmedico/portage/doc/man/emerge.1.html It's basically the output of man2html hosted by a Gentoo dev on his personal space, and it only covers portage. Are there better sources out there?

@sbrl
Member

sbrl commented Apr 3, 2021

Great discussion here. I'm not sure it's as clear-cut though as saying "let's use the Arch Linux man website for everything", because some commands are distro-specific. For example, apt is a Debian command, and thus we should link to the Debian man page because it's unlikely to be found in other distros. dnf is a Fedora / Red Hat command, so we should link to the distro-specific docs. emerge is a Gentoo command, so we should link to the Gentoo docs.

This said, there are also plenty of commands that are distro-agnostic. Picking a good (and stable / long lasting) website here is a good idea.

I do agree that the Arch Linux help pages are very helpful, but I'm unsure about blindly linking to them as a catch-all here, because some man pages there might be distro-specific. So I'm inclined to choose a more neutral contender to solve the issue here.

I especially like the suggestion to reference https://manned.org here as our catch-all. As mentioned by @vladimyr above, it seems to be well-maintained - and according to the web archive it has been around in its present form since 2012 (before which it was a website about manned space flight, which started in 2006 and ended in 2010).

Of course, for distro-specific man pages I say that we reference the distro's website, but for a catch-all https://manned.org/ is the best thing I've seen (it even has a DuckDuckGo bang: !manned).
