New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pod html refactoring 2 of 5 #18950
Pod html refactoring 2 of 5 #18950
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a lot of my comments are about style. Because this commit has moved a lot of code (which is good!) a lot of it looks like new code, and I tend to provide "the style is not consistent" to new code. But I know you didn't write all this new code, so ignoring style nits to move the structural improvements forward makes sense.
On the other hand, I did make a few comments that should be addressed, especially about $Podroot and eval.
Mostly: looks great!
if( $dirs[0] ) { | ||
unshift @dirs, $vol; | ||
} | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, another style nit: why is this not cuddled, when many/most others are? (I will comment on the style items entirely at the end of my review.)
|
||
# Find the line with the least amount of indent, as that's our "base" | ||
my @indent_levels = (sort(map { $_ =~ /^( *)./mg } @$para)); | ||
my $indent = $indent_levels[0] || ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this mechanism confusing. I suppose it relies on the fact that two spaces sorts before three spaces, which is not something that I find obvious. (I've never been good with less trivial string sorts, though.) I wonder if this would be clearer with a comment, anyway.
# Extract leading runs of spaces from each line, then sort these runs.
# Shorter strings will sort before longer strings (when they're all one
# character), so we can just take the first one as the shortest indent
# prefix. If there's none, default to an empty string, just to avoid
# uninitialized warnings.
my ($indent) = sort(map { $_ =~ /^( *)./mg } @$para);
$indent //= "";
ext/Pod-Html/t/lib/Testing.pm
Outdated
die("'run' element takes integer") unless $args->{run} =~ m/^\d+$/; | ||
|
||
my @cachelines = (); | ||
open my $IN, '<', $cache or die "Unable to open $cache for reading"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I echo my comments about capitalized variable names here.)
On 7/4/21 3:38 PM, Ricardo Signes wrote:
***@***.**** requested changes on this pull request.
Quite a lot of my comments are about style. Because this commit has
moved a lot of code (which is good!) a lot of it looks like new code,
and I tend to provide "the style is not consistent" to new code. But I
know you didn't /write/ all this new code, so ignoring style nits to
move the structural improvements forward makes sense.
On the other hand, I did make a few comments that should be addressed,
especially about $Podroot and eval.
Mostly: looks great!
Thank you for this review. I'm going to assign your comments into three
buckets:
1. Things that I will deal with right now, in this pull request.
2. Things that I won't deal with in this pull request because they will
be dealt with in one of the later, already planned pull requests.
3. Things that I won't deal with in this series of pull requests but
which we should consider once the series is applied and we're starting
from a cleaner, OO version of ext/Pod-Html/lib/Pod/Html.pm. I will keep
a running tab of such concerns at
http://thenceforward.net/perl/misc/pod-html-remaining-issues.txt.
(Note that if you want a sneak preview of what the code will look like
after the 5 planned pull requests are applied, go to
https://github.com/jkeenan/perl5/tree/pod-html-oo-20210512-02/ext/Pod-Html.)
------------------------------------------------------------------------
[snip]
+ext/Pod-Html/lib/Pod/Html/Auxiliary.pm Helper functions for Pod-Html
Not a hill I'd die on, but I believe "Util" is a bit more conventional
for this sort of thing. Consider:
Renamed in updated p.r.
------------------------------------------------------------------------
[snip]
+our $VERSION = 1.29;
+eval $VERSION;
I have no quarrel with using |eval|, but I believe if you look at the
other uses of eval for mangling |$VERSION| you will see this:
|$VERSION = eval $VERSION; |
Only evaluating the version will return a value that is then thrown
away. *You should assign the result back to |$VERSION|.*
Corrected in updated p.r.
------------------------------------------------------------------------
[snip]
use Exporter 'import';
-our $VERSION = 1.28;
-our @export = qw(pod2html htmlify);
-our @EXPORT_OK = qw(anchorify relativize_url);
+our $VERSION = 1.29;
+eval $VERSION;
+our @isa = qw(Exporter);
I believe you were recently removing inheriting from Exporter from core
code. Possibly you plan to do that in a subsequent commit. If not,
though, I can't tell why this would be here. In this case, the commit
message does /not/ clarify. *I believe this line is a mistake* given
that you're importing |import| from Exporter above.
Corrected in updated p.r.
------------------------------------------------------------------------
[snip]
In ext/Pod-Html/lib/Pod/Html.pm
<#18950 (comment)>:
>
-my($Title, $Header);
+my $Podroot;
I see that there is also Podroot in %Globals. Is this variable still
required? (It's difficult to tell from the reviewing interface, so I am
rudely punting the question back to you. If it's required, okay. If it's
now unused, let's delete it!
At this point, this $Podroot is still needed. If you look at the
current (v1.28) ext/Pod-Html/lib/Pod-Html.pm, you will see that this is
used inside _save_page() at line 689. At this point in the refactoring,
I couldn't figure out how to handle this, as _save_page() is used as a
callback inside that horrid chain of methods at line 327-8. But this
will be handled in pull request 5 of 5.
------------------------------------------------------------------------
[snip]
+ chdir($globals->{Podroot}) ||
+ die "$0: error changing to directory $globals->{Podroot}: $!\n";
I have an extremely nitpicky point to raise. Here, you write:
|chdir($x) || die "..."; |
Later, about twenty lines down, you write this:
|open ... or die "..."; |
This means sometimes you're ending with the operator and sometime
starting with the operator, when breaking at an operator. This is
inconsistent. If I was king of the universe I would put only one of
these as acceptable in the style guide (and it would be to put the
operator on the second line, not first). That said, making this coherent
across all the code sounds like a slog that should not block moving
everything else forward. So, I raise this as something you might enjoy
fixing (as I sometimes enjoy somewhat mind-numbing tasks) or might just
prefer to ignore.
This we file under: Pod-Html Nits the Picking of Which Will Be Deferred
until after Pull Request 5 (P-HNtPoWWBDuaPR5)
------------------------------------------------------------------------
[snip]
+ foreach my $key (keys %{$Pagesref}) {
+ if($_updirs_only) {
Another tiny style nit: The code is very inconsistent as to whether a
space is put between opening parens and keywords like |if| or |while|.
P-HNtPoWWBDuaPR5
------------------------------------------------------------------------
In ext/Pod-Html/lib/Pod/Html.pm
<#18950 (comment)>:
>
+sub identify_input {
+ my $globals = shift;
my $input;
unless ***@***.*** && $ARGV[0]) {
It would be nice to extract all the code that looks at ***@***.***| up into
some top layer and otherwise pass an arrayref around. That sounds
tedious, though, and like *it shouldn't block this work*.
I agree. Similar points raised by oodler in earlier (off-list) review.
Let's open an issue once this series of pull requests is applied.
------------------------------------------------------------------------
[snip]
- warn "loading directory cache\n" if $Verbose;
+ warn "loading directory cache\n" if $globals->{Verbose};
Is your plan, in the end, that these subroutines should be methods on an
object?
Yes. We will achieve that in pull request 5 of 5.
I ask because clearly all these |warn| calls predicated on
Verbose should become calls to |$self->log| or something. I know we're
only on step 2 of 5 though!
That wasn't part of my original plan, but makes sense. Again, let's
open an issue for it once we've got an OO module to look at.
------------------------------------------------------------------------
[snip]
+ my $FHOUT;
Why is this named in all caps? Convention (in the wild as well as in
perlstyle) is to use all capital names for constants. Sometimes this is
expanded to globals. *This variable is lexical in scope and duration, so
should probably be lowercase.* Possibly I have missed some reason.
Changed in updated p.r. -- although I've both seen and written a lot of
code that uppercases all filehandles, whether global or lexical.
------------------------------------------------------------------------
[snip]
+=head1 NAME
+
+Pod::Html::Auxiliary - helper functions for Pod-Html
+
+=head1 SUBROUTINES
+
+=head2 C<parse_command_line()>
+
+TK
(We're gonna get this in some future commit?)
Yes. At this point in the refactoring process the subroutines were not
yet stable.
------------------------------------------------------------------------
[snip]
+ if (defined($vol) && $vol) {
+ $vol =~ s/:$// if $^O eq 'VMS';
+ $vol = uc $vol if $^O eq 'MSWin32';
+
+ if( $dirs[0] ) {
+ unshift @Dirs, $vol;
+ }
+ else {
Sorry, another style nit: why is this not cuddled, when many/most others
are? (I will comment on the style items entirely at the end of my review.)
P-HNtPoWWBDuaPR5
------------------------------------------------------------------------
[snip]
+
+sub html_escape {
+ my $rest = $_[0];
+ $rest =~ s/&/&/g;
+ $rest =~ s/</</g;
+ $rest =~ s/>/>/g;
+ $rest =~ s/"/"/g;
+ $rest =~ s/([[:^print:]])/sprintf("&#x%x;", ord($1))/aeg;
All these operators line up but the last one, how sad!
P-HNtPoWWBDuaPR5
------------------------------------------------------------------------
In ext/Pod-Html/lib/Pod/Html/Auxiliary.pm
<#18950 (comment)>:
> +
+Remove any level of indentation (spaces or tabs) from each code block
+consistently. Adapted from:
+https://metacpan.org/source/HAARG/MetaCPAN-Pod-XHTML-0.002001/lib/Pod/Simple/Role/StripVerbatimIndent.pm
+
+=cut
+
+sub trim_leading_whitespace {
+ my ($para) = @_;
+
+ # Start by converting tabs to spaces
+ @$para = Text::Tabs::expand(@$para);
+
+ # Find the line with the least amount of indent, as that's our "base"
+ my @indent_levels = (sort(map { $_ =~ /^( *)./mg } @$para));
+ my $indent = $indent_levels[0] || "";
I find this mechanism confusing. I suppose it relies on the fact that
two spaces sorts before three spaces, which is not something that /I/
find obvious. (I've never been good with less trivial string sorts,
though.) I wonder if this would be clearer with a comment, anyway.
This was already present when I began this refactoring, so ...
P-HNtPoWWBDuaPR5
|# Extract leading runs of spaces from each line, then sort these runs.
# Shorter strings will sort before longer strings (when they're all one
# character), so we can just take the first one as the shortest indent #
prefix. If there's none, default to an empty string, just to avoid #
uninitialized warnings. my ($indent) = sort(map { $_ =~ /^( *)./mg }
@$para); $indent //= ""; |
------------------------------------------------------------------------
[snip]
+ my @cachelines = ();
+ open my $IN, '<', $cache or die "Unable to open $cache for reading";
(I echo my comments about capitalized variable names here.)
As above.
Thank you very much.
Jim Keenan
|
Per rjbs code review in Perl#18950
Remove superfluous assignment to @isa. Per rjbs code review in Perl#18950
Per rjbs code review in Perl#18950
A method to assist in debugging cache problems. Should assist in resolving Perl#12271. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Touch-up t/htmlview.t
Group all the "global" variables into a hashref $globals. This variable will be the return value of init_globals() within pod2html(), will be passed through, and augmented by, parse_command_line() and then used through the balance of pod2html(). In essence, so far this is just a renaming of variables. Remove declarations of superseded variables. Pass $globals to get_cache(), cache_key() and load_cache().
This package will hold helper subroutines used within the main package or in tests. They can be placed in a separate module and imported into Pod::Html because they won't depend on having the globals passed as an argument. They will also be potentially independently testable. Start with html_escape(). Move anchorify(), htmlify() to Auxiliary.pm. Also _unixify -- now as unixify(). Move relativize_url() to Auxiliary. Move usage() to Auxiliary. Move trim_leading_whitespace to Auxiliary. Move parse_command_line() to Auxiliary. Keep porting tests happy. Increment $VERSION. Run: ./perl -Ilib regen/lib_cleanup.pl anchorify.t, eol.t: Correct excessive corrections. Standardize setting of $VERSION.
We now start placing parts of sub pod2html() into separate subs so that we reduce the length of pod2html(), making it more readable and (ultimately) preparing for method calls. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Signed-off-by: James E Keenan <jkeenan@cpan.org>
This clears up a bit of semantic confusion. We were using one lexically scoped variable -- my $parser -- to hold two different objects: one for parsing input, one for writing output. We can encapsulate the working of the input parser to make the code more readable. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Encapsulate! Encapsulate! Signed-off-by: James E Keenan <jkeenan@cpan.org>
Encapsulate more code within pod2html(). The return value is for convenience. It's the setting inside $globals that counts. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Further encapsulation of code to improve readability. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Further encapsulation of code internal to pod2html(). At this point we face obstacles: %Pages; $output. Signed-off-by: James E Keenan <jkeenan@cpan.org>
Per rjbs code review in Perl#18950
Remove superfluous assignment to @isa. Per rjbs code review in Perl#18950
Per rjbs code review in Perl#18950
9322b29
to
ca8c8b6
Compare
Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Remove superfluous assignment to @isa. Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Remove superfluous assignment to @isa. Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Per rjbs code review in Perl#18950 Signed-off-by: James E Keenan <jkeenan@cpan.org>
Per rjbs code review in Perl#18950
Remove superfluous assignment to @isa. Per rjbs code review in Perl#18950
Per rjbs code review in Perl#18950
Per rjbs code review in Perl/perl5#18950
Remove superfluous assignment to @isa. Per rjbs code review in Perl/perl5#18950
Per rjbs code review in Perl/perl5#18950
This is the second of
twofive pull requests in support of #18894. These p.r.s are intended to make the internals of Pod-Html less messy. Ultimately Pod::Html will hold a series of methods calls on an object. In this commit we start to gather together most of the loose lexical variables holding state into an object. We also move auxiliary functions into a package of their own.