Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introducing "curldown" for libcurl man page format #12730

Closed
wants to merge 5 commits into from

Conversation

bagder
Copy link
Member

@bagder bagder commented Jan 17, 2024

overview

curldown is this new file format for libcurl man pages. It is markdown inspired with differences:

  • Each file has a set of leading headers with meta-data
  • Supports a small subset of markdown
  • Uses .md file extensions for editors/IDE/GitHub to treat them nicely
  • Generates man pages very similar to the previous ones
  • Generates man pages that still convert nicely to HTML on the website
  • Detects and highlights mentions of curl symbols automatically (when
    their man page section is specified)

tools:

  • cd2nroff: converts from curldown to nroff man page
  • nroff2cd: convert an (old) nroff man page to curldown
  • cdall: convert many nroff pages to curldown versions
  • cd2cd: verifies and updates a curldown to latest curldown

This setup generates .3 versions of all the curldown versions at build time.

CI

Since the documentation is now technically markdown in the eyes of many
things, the CI runs many more tests and checks on this documentation,
including proselint, link checkers and tests that make sure we capitalize the
first letter after a period...

tasks

  • all used .XX instructions in nroff files are handled by nroff2cd
  • actually switch to curldown files in the repository (merge those, remove the nroff versions)
  • make maketgz generate the man pages from the .md sources
  • dropped the .3 files from the dist tarball
  • docs/CURLDOWN.md documents the format and involved tools
  • render the docs when building with autotools
  • supports See-also: as a yaml list
  • new nroff converted files are >99% identical to the old nroff files (excluding the cleanups)
  • use .md file extensions
  • working out-of-tree builds
  • make all CI jobs work again
  • make proselint happy about all ~500 new markdown files
  • render the docs when building with cmake

The tests that rely on the man pages to be present will be disabled for the CI jobs that use cmake on Appveyor for now.

@bagder bagder changed the title aintroducing "curldown" for libcurl man page format introducing "curldown" for libcurl man page format Jan 17, 2024
scripts/nroff2cd Outdated Show resolved Hide resolved
@github-actions github-actions bot added the CI Continuous Integration label Jan 17, 2024
@bagder
Copy link
Member Author

bagder commented Jan 17, 2024

The CURLDOWN.md file within the PR is the embryo of documentation for the format and how to use it.

@bagder

This comment was marked as resolved.

@bagder

This comment was marked as resolved.

@bagder bagder force-pushed the bagder/curldown branch 2 times, most recently from 57f7358 to ddbf3f7 Compare January 18, 2024 09:57
@bagder bagder marked this pull request as ready for review January 18, 2024 09:57
@bagder

This comment was marked as resolved.

@mxmehl

This comment was marked as resolved.

@bagder

This comment was marked as resolved.

@bagder bagder force-pushed the bagder/curldown branch 5 times, most recently from b8c1be7 to bd659c5 Compare January 18, 2024 22:31
@bagder bagder mentioned this pull request Jan 18, 2024
@bagder
Copy link
Member Author

bagder commented Jan 18, 2024

An interesting side-effect of switching to .md is that all of a sudden, the proselint CI job got a lot of more text to complain about... 😆

my $blankline = 0;
my $header = 0;

push @desc, ".\\\" generated by cd2nroff $cd2nroff from $f\n";
Copy link
Member

@vszakats vszakats Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a local test run (I used CMake) $f remains empty for all output files:

.\" generated by cd2nroff 0.1 from 
.TH CURLINFO_APPCONNECT_TIME 3 "December 14 2023" libcurl

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the full command line used when cd2nroff is invoked?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see why

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

diff --git a/docs/libcurl/CMakeLists.txt b/docs/libcurl/CMakeLists.txt
index fc1283a53..0d52a7e56 100644
--- a/docs/libcurl/CMakeLists.txt
+++ b/docs/libcurl/CMakeLists.txt
@@ -34,11 +34,11 @@ function(add_manual_pages _listname)
     else()
       string(REPLACE ".3" ".md" _mdfile "${CMAKE_CURRENT_SOURCE_DIR}/${_file}")
     endif()
 
     add_custom_command(OUTPUT "${_rofffile}"
-      COMMAND ${PROJECT_SOURCE_DIR}/scripts/cd2nroff < ${_mdfile} > ${_rofffile}
+      COMMAND ${PROJECT_SOURCE_DIR}/scripts/cd2nroff ${_mdfile} > ${_rofffile}
       DEPENDS "${_mdfile}"
       VERBATIM
     )
   endforeach()
 endfunction()

Copy link
Member

@vszakats vszakats Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They look like this:

cd ./curl/_x64-win-ucrt-cmake-llvm-bld/docs/libcurl/opts && \
  ./curl/scripts/cd2nroff \
  < ./curl/docs/libcurl/opts/CURLINFO_APPCONNECT_TIME.md \
  > ./curl/_x64-win-ucrt-cmake-llvm-bld/docs/libcurl/opts/CURLINFO_APPCONNECT_TIME.3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, my diff above changes that command line, to pass the file name instead < [filename] which is what made it blank there.

Copy link
Member

@vszakats vszakats Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! That fixed it ...but now there is another issue: The filename embedded contains the whole (absolute) path of the source filename, which breaks reproducibility and may also leak local info. (This may be CMake-specific, as CMake loves absolute paths, causing this class of issues everywhere.)

.\" generated by cd2nroff 0.1 from /<home-dir>/<path-to>/curl/docs/libcurl/opts/CURLINFO_CONNECT_TIME_T.md
.TH CURLINFO_CONNECT_TIME_T 3 "December 14 2023" libcurl

Perhaps the simplest fix is to omit this info from the generated output? Or strip all path from it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not break reproducibility, you just have to rebuild with the same paths. But yes, I'll make the script cut off the dir name. I want the file name mentioned there to help users find the source.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch to drop the source filename:

diff --git a/scripts/cd2nroff b/scripts/cd2nroff
index 08fbd768e..734cbf8f3 100755
--- a/scripts/cd2nroff
+++ b/scripts/cd2nroff
@@ -186,7 +186,7 @@ sub single {
     my $blankline = 0;
     my $header = 0;
 
-    push @desc, ".\\\" generated by cd2nroff $cd2nroff from $f\n";
+    push @desc, ".\\\" generated by cd2nroff $cd2nroff\n";
     push @desc, ".TH $title $section \"$date\" $source\n";
     while(<$fh>) {
         $line++;

To strip path:

diff --git a/scripts/cd2nroff b/scripts/cd2nroff
index 08fbd768e..d9938b6e9 100755
--- a/scripts/cd2nroff
+++ b/scripts/cd2nroff
@@ -30,6 +30,8 @@ Converts a curldown file to nroff (man page).
 =end comment
 =cut
 
+use File::Basename;
+
 my $cd2nroff = "0.1"; # to keep check
 my $dir;
 my $extension;
@@ -186,7 +188,7 @@ sub single {
     my $blankline = 0;
     my $header = 0;
 
-    push @desc, ".\\\" generated by cd2nroff $cd2nroff from $f\n";
+    push @desc, ".\\\" generated by cd2nroff $cd2nroff from " . basename($f) . "\n";
     push @desc, ".TH $title $section \"$date\" $source\n";
     while(<$fh>) {
         $line++;

@vszakats
Copy link
Member

vszakats commented Jan 22, 2024

Adding a CMake option to disable building docs:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 70ca457a9..18d8d085b 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -306,6 +306,7 @@ endif()
 
 find_package(Perl)
 
+option(BUILD_DOCS "to build manual pages" ON)
 option(ENABLE_MANUAL "to provide the built-in manual" OFF)
 
 if(ENABLE_MANUAL AND PERL_FOUND)
diff --git a/docs/libcurl/CMakeLists.txt b/docs/libcurl/CMakeLists.txt
index fc1283a53..c719c9764 100644
--- a/docs/libcurl/CMakeLists.txt
+++ b/docs/libcurl/CMakeLists.txt
@@ -54,11 +54,13 @@ add_custom_command(OUTPUT libcurl-symbols.md
   VERBATIM
 )
 
-add_manual_pages(man_MANS)
-add_custom_target(man ALL DEPENDS ${man_MANS})
-if(NOT CURL_DISABLE_INSTALL)
-  install(FILES "$<LIST:TRANSFORM,${man_MANS},PREPEND,${CMAKE_CURRENT_BINARY_DIR}/>"
-          DESTINATION ${CMAKE_INSTALL_MANDIR}/man3)
-endif()
+if(BUILD_DOCS)
+  add_manual_pages(man_MANS)
+  add_custom_target(man ALL DEPENDS ${man_MANS})
+  if(NOT CURL_DISABLE_INSTALL)
+    install(FILES "$<LIST:TRANSFORM,${man_MANS},PREPEND,${CMAKE_CURRENT_BINARY_DIR}/>"
+            DESTINATION ${CMAKE_INSTALL_MANDIR}/man3)
+  endif()
 
-add_subdirectory(opts)
+  add_subdirectory(opts)
+endif()

(Edited to be enabled by default.)

@bagder
Copy link
Member Author

bagder commented Jan 22, 2024

They are built with autotools by default and there is no option to switch that off...

@bagder
Copy link
Member Author

bagder commented Jan 22, 2024

On my machine, building all libcurl ~500 man pages takes 800 milliseconds. It is a pretty quick operation, even if my machine might be above average CPU and disk speed wise.

@vszakats
Copy link
Member

vszakats commented Jan 22, 2024

I takes almost as long as the build itself here, but I'm on fairly old gear. I'd appreciate an option to turn this off, as this will increase greatly the time for each build. When doing them by the hundreds, it adds up. (Even building curl.1/hugehelp takes a fair amount, but I can tweak that locally.) For some reason autotools never built man pages with the curl-for-win script, I don't know why, with --enable-manual set. [edit: since there was no build-phase involved, it was instantaneous and I never noticed it, but it did in fact copy the .3 files to the install target. This would make an option to disable it useful here as well after this PR.]

Edit: with this PR applied, autotools quits with:

make[3]: *** No rule to make target `curl-config.1', needed by `all-am'.  Stop.

@bagder
Copy link
Member Author

bagder commented Jan 22, 2024

Fair enough!

@vszakats
Copy link
Member

Re: autotools, it did have this working before this patch, but was unnoticably fast (being just a handful of install commands). It may be noticable after this patch. Making a --disable-docs option useful there as well.

But, for now autotools builds quit with this error (with this PR applied), so I couldn't actually test it:

make[3]: *** No rule to make target `curl-config.1', needed by `all-am'.  Stop.

@bagder
Copy link
Member Author

bagder commented Jan 22, 2024

Edit: with this PR applied, autotools quits with:

We use autotools in numerous CI builds, including windows (on appveyor) and none of them fail like that... I'm not saying there is no problem left, but I might not consider that a blocker for merging this.

@bagder
Copy link
Member Author

bagder commented Jan 22, 2024

This is the biggest PR I/we have done in a long time. I completely expect that there will be some issues popping up once we merge this, but I don't think they are worse than we can just fix them. I believe this is a good PR.

@vszakats
Copy link
Member

No worries, re-run it in a clean sandbox, and that resolved it.

@bagder bagder closed this in eefcc1b Jan 22, 2024
@gvanem
Copy link
Contributor

gvanem commented Jan 23, 2024

This scripts/cd2nroff file yet is another Windows unfriendly Perl script IMHO.
My Cygwin Perl bitches on this command:

f:/Cygwin64/bin/perl.exe ../scripts/cd2nroff ../docs/libcurl/ABI.md > ../docs/libcurl/ABI.3
../docs/libcurl/ABI.md:68:1:ERROR: no header present

But this works fine:

f:/Cygwin64/bin/perl.exe ../scripts/cd2nroff < ../docs/libcurl/ABI.md > ../docs/libcurl/ABI.3

And so does my good old StrawBerry Perl (v5.18.4 from 2014).
A newline-issue on if(/^---/) { once again. Arg!

Edit: Cygwin's Perl doesn't work at all. Hence using StrawBerry Perl.

@bagder
Copy link
Member Author

bagder commented Jan 23, 2024

It built in several Windows CI jobs, but I'm sure we can improve it.

@gvanem
Copy link
Contributor

gvanem commented Jan 23, 2024

several Windows CI jobs

Since all those CI jobs checks out with git and config.autocrlf=false presumably?
I do not know muchPerl, but this open($fh, "<:crlf", "$f") opens in text-mode?

@bagder
Copy link
Member Author

bagder commented Jan 23, 2024

Since all those CI jobs checks out with git and config.autocrlf=false presumably?

No, I can't see anything that sets config.autocrlf=false or sets it specifically for .md files. Maybe we should rather set them to eol=lf ?

I do not know muchPerl, but this open($fh, "<:crlf", "$f") opens in text-mode?

Whenever I try to guess how something works with text files on Windows, I seem to guess wrong. So I don't know.

@gvanem
Copy link
Contributor

gvanem commented Jan 23, 2024

Another thing I feel is strange. From the bottom of e.g. man CURLOPT_RESUME_FROM:
curl-man

English text, but a Norwegian date (!)

@bagder
Copy link
Member Author

bagder commented Jan 23, 2024

I bet you can fix that by changing/setting some locale environment variable(s)...

vszakats added a commit that referenced this pull request Jan 23, 2024
- cmake: enable `BUILD_DOCS` by default (this controls converting and
  installing `.3` files from `.md` sources)

- cmake: speed up generating `.3` files by using a single command per
  directory, instead of a single command per file. This reduces external
  commands by about a thousand. (There remains some CMake logic kicking
  in resulting in 500 -one per file- external `-E touch_nocreate` calls.)

- cd2nroff: add ability to process multiple input files.

- cd2nroff: add `-k` option to use the source filename to form the
  output filename. (instead of the default in-file `Title:` line.)

Follow-up to 3f08d80
Follow-up to ea0b575 #12753
Follow-up to eefcc1b #12730

Closes #12762
@gvanem
Copy link
Contributor

gvanem commented Jan 24, 2024

... some locale environment variable(s)...

Sure. LC_ALL=C or something? But man curl looks what I'd expect:
curl-man

@bagder bagder deleted the bagder/curldown branch January 24, 2024 13:59
@bagder
Copy link
Member Author

bagder commented Jan 24, 2024

That I cannot explain because to me it looks like the generate the date strings identically!

jamacku added a commit to jamacku/curl that referenced this pull request Feb 1, 2024
It was accidentally added in curl#12730

Follow-up to eefcc1b
jamacku added a commit to jamacku/curl that referenced this pull request Feb 1, 2024
It was accidentally added in curl#12730

Co-authored-by: Lukáš Zaoral <lzaoral@redhat.com>
Signed-off-by: Jan Macku <jamacku@redhat.com>

Follow-up to eefcc1b
bagder pushed a commit that referenced this pull request Feb 1, 2024
It was accidentally added in #12730

Co-authored-by: Lukáš Zaoral <lzaoral@redhat.com>
Signed-off-by: Jan Macku <jamacku@redhat.com>

Follow-up to eefcc1b
Closes #12843
vszakats added a commit that referenced this pull request Jul 9, 2024
Replace logic dealing with `.3` files to handle the Markdown syntax.

Follow-up to eefcc1b #12730
Cherry-picked from #14097
Closes #14125
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

7 participants