Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI application crashed instantiating MVCFactory #38524

Merged
merged 14 commits into from
Aug 20, 2022
Merged

CLI application crashed instantiating MVCFactory #38524

merged 14 commits into from
Aug 20, 2022

Conversation

nikosdion
Copy link
Contributor

@nikosdion nikosdion commented Aug 18, 2022

Pull Request for Issue #38518 .

Summary of Changes

The CLI application (cli/joomla.php) now accepts a --live-site=URL option to set up the URL of your site.

If you do not define this option it falls back to the $live_site in configuration.php. If that is empty it uses the fake URL http://www.example.com https://joomla.invalid/set/by/console/application just so that the CLI application does not crash.

Testing Instructions

  • Install a Joomla 4.2.0 site
  • Install the minimum reproduction plugin: plg_console_derp.zip
  • Go to System, Manage, Plugins and enable the Console - Derp plugin we just installed.
  • Open your command line terminal and go to your site's cli directory e.g. cd /var/www/test/cli
  • TEST 1: Run the CLI application as php joomla.php
  • TEST 2: Run the CLI application with THE FULL path to the joomla.php file, e.g. php /var/www/test/cli/joomla.php

Actual result BEFORE applying this Pull Request

Test 1 works: you get a list of the available CLI commands.

Test 2 returns an error: Could not parse the requested URI http:///var/www/test/cli/joomla.php

Expected result AFTER applying this Pull Request

Test 1 works: you get a list of the available CLI commands.

Test 2 works: you get a list of the available CLI commands.

Documentation Changes Required

Add the following to the documentation of the CLI application:

  • --live-site=URL Tells the CLI application what is the URL to your site. This is mainly used for creating URLs in articles, emails etc. If you do not specify this parameter Joomla will read the $live_site parameter in your configuration.php file. If that is empty, or if the URL you specified is invalid, the URL http://www.example.com will be used instead.

Co-authored-by: Richard Fath <richard67@users.noreply.github.com>
@roland-d
Copy link
Contributor

@nikosdion This will not be forgotten, we are putting out RC1 now and there will be an RC2 with at least this PR over the weekend.

@richard67
Copy link
Member

I have tested this item ✅ successfully on b0242c2


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/38524.

@HLeithner
Copy link
Member

2 thing I'm thinking about.

  1. is --live-site really a good name? I mean actually it's the base-uri (I know in the configuration it's called like that) but if we get multidomain support I would expect it get removed in the configuration
  2. Should we consider the TLS configuration?
  3. (it's always more then the number you write in the first line) is it good to use http://example.com maybe 1. https:// would be better 2nd something more joomla specific? like https://unknownhostname.joomla.org which we can catch and redirect to a help site?

@nikosdion
Copy link
Contributor Author

is --live-site really a good name? I mean actually it's the base-uri (I know in the configuration it's called like that) but if we get multidomain support I would expect it get removed in the configuration

Yes, to keep it consistent with configuration.php. It's been called that for 17 years. People mostly understand it intuitively. I agree it's the Base URI but this means nothing to most people and would yield confusing results if searched. Also, I was trying to find a name which does not conflict with existing 3PD commands because if we define it in the application scope and you try to also define the same option in the command scope you get an exception.

Should we consider the TLS configuration?

Good point. We can check if it's HTTPS and set the $_SERVER['HTTPS'] as well in this case.

(it's always more then the number you write in the first line) is it good to use http://example.com/ maybe 1. https:// would be better 2nd something more joomla specific? like https://unknownhostname.joomla.org/ which we can catch and redirect to a help site?

Also a good point, as long as we use the IANA RFC 2606, the same RFC I used when I chose example.com. In this case we should be using https://joomla.invalid per the aforementioned RFC.

I recommend AGAINST using a joomla.org subdomain. What if Joomla decides in the future to change its name or TLD? What if Joomla goes away and the org properties disappear? What if we somehow screw up DNS and end up with open subdomain redirection issues? This would cause security concerns. Using a joomla.invalid domain protects us since these are de facto unroutable domain names in any DNS resolver.

@HLeithner
Copy link
Member

is --live-site really a good name? I mean actually it's the base-uri (I know in the configuration it's called like that) but if we get multidomain support I would expect it get removed in the configuration

Yes, to keep it consistent with configuration.php. It's been called that for 17 years. People mostly understand it intuitively. I agree it's the Base URI but this means nothing to most people and would yield confusing results if searched. Also, I was trying to find a name which does not conflict with existing 3PD commands because if we define it in the application scope and you try to also define the same option in the command scope you get an exception.

Valid argumentation

(it's always more then the number you write in the first line) is it good to use http://example.com/ maybe 1. https:// would be better 2nd something more joomla specific? like https://unknownhostname.joomla.org/ which we can catch and redirect to a help site?

Also a good point, as long as we use the IANA RFC 2606, the same RFC I used when I chose example.com. In this case we should be using https://joomla.invalid per the aforementioned RFC.

I recommend AGAINST using a joomla.org subdomain. What if Joomla decides in the future to change its name or TLD? What if Joomla goes away and the org properties disappear? What if we somehow screw up DNS and end up with open subdomain redirection issues? This would cause security concerns. Using a joomla.invalid domain protects us since these are de facto unroutable domain names in any DNS resolver.

Yep in this case please take "https://joomla.invalid/set/by/console/application" or something which is really unique and can be easily found on google.

@laoneo
Copy link
Member

laoneo commented Aug 19, 2022

Thanks for this. Finally this issue is addressed. Just wondering when this one gets merged, if we should introduce a similar functionality in https://github.com/joomla/joomla-cms/blob/4.2-dev/libraries/src/Application/CliApplication.php, even when it is deprecated.

@alikon
Copy link
Contributor

alikon commented Aug 19, 2022

I have tested this item ✅ successfully on dad7d0c


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/38524.

@alikon
Copy link
Contributor

alikon commented Aug 19, 2022

RTC


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/38524.

@joomla-cms-bot joomla-cms-bot added the RTC This Pull Request is Ready To Commit label Aug 19, 2022
@nikosdion
Copy link
Contributor Author

I've updated this with the minor changes requested by @HLeithner.

@laoneo No, the CliApplication superclass is used to create custom CLI scripts. In this case the 3PD knows if they will need the real or a fake URL to the site and write their code accordingly. Source: I had been doing that for ten years before Joomla 4.0 was released in all of my software.

@laoneo
Copy link
Member

laoneo commented Aug 20, 2022

Have been the last commits just CS or do we need new tests?

@richard67
Copy link
Member

Have been the last commits just CS or do we need new tests?

By review I’d say the changes are ok, mainly CS and a little optimization, but maybe one quick retest could be good to make sure I haven’t missed some silly typo.

@HLeithner
Copy link
Member

Just saying your last commit b6b004b using ternary is the same as using else which should be avoided, In the end it's exactly the same as before only less readable ;-)

The 1d2f32b was much better in my opinion. Because it's more readable.

@nikosdion
Copy link
Contributor Author

nikosdion commented Aug 20, 2022 via email

@richard67
Copy link
Member

Maybe I am neurodivergent and just don’t know about it? I prefer the ternary, too.

@HLeithner
Copy link
Member

It's not only the "avoid using else" thing it's also about early return which is more or less part of avoiding else programming style.
At least for me it makes life much easier if I see a return because I know the rest of the function doesn't intressted me anymore. The same for else, I see the variable is set before and after the if it's the same was before if the if^^ doesn't evaluate to ture.

This article looks good to explains it better then me https://dev.to/darlantc/avoid-using-else-write-a-better-code-12l9

What's really hard for me to read is ternary operator using multiple lines and stacked multiple times. If I would like such unreadable code I would write perl or ecmascript.

@nikosdion
Copy link
Contributor Author

I know all about early returns — actually, the case against if-else — and I know where they come from (I've done x86 assembly programming in my youth). In this case the early assignment makes sense from both a human working memory and code micro-optimisation point of view.

        $liveSite = '';

        if ($input->hasParameterOption(['--live-site', false])) {
            $liveSite = $input->getParameterOption(['--live-site'], '');
        }

This works great because in most cases the --live-site command-line option is NOT set. If in most cases it was set the ternary would work best:

        $liveSite = $input->hasParameterOption(['--live-site', false])
            ? $input->getParameterOption(['--live-site'], '')
            : '';

Why? Because the happy path of the code is the one where the condition is true and we can save the cost of assigning an empty value.

However, we could NOT write it as:

$liveSite = $input->getParameterOption(['--live-site'], '');

if (!$input->hasParameterOption(['--live-site', false])) {
    $liveSite = '';
}

even though an axiomatic adherence to the early return rule would say we could. That's because the early assignment causes artefacts, namely it will throw an Exception if the option --live-site is not set.

That is to say, the preferred way to write the same thing depends not only on how we read the code and what our code style rule dictates but also on the abstract knowledge of how our application is used and what are the artefacts caused by each code path.

This is something most articles, including the one you linked to, fail to note, instead presenting an axiomatic point of view that if-else (and, by extent, ternaries) are inherently bad. Remember, folks, no syntax and syntactic sugar exists unless it has a valid use case.

Now, the next ternary — actually, Elvis operator — on that code block reads

$liveSite = $liveSite ?: $this->get('live_site', 'https://joomla.invalid/set/by/console/application');

This is idiomatic but it's definitely NOT the same as writing Perl, thank you very much. The whole point of introducing the Elvis operator in PHP 5.3 was exactly to avoid constructs like this:

if (empty($liveSite)) {
  $liveSite = 'https://joomla.invalid/set/by/console/application';
}

You might thing that the latter case is more readable but it actually adds one more thing in the reader's working memory (parsing a block of code adds one more level of mental work than reading a line with an Elvis operator as long as you are familiar with the Elvis operator). The problem is that according to research conducted in the late 1960s and early 1970s people can only hold about 4 items in their mind without a problem and a maximum of 7 before struggling so much they give up.

But wait, there's more! At 4 items in your working memory you still have enough mental capacity to identify unexpected situations and react to them, e.g. spotting a logic error (bug). When you are running at full capacity, holding seven items in your memory, you don't have any mental capacity to spot the unexpected.

This is actually the human side of the case against if-else constructs. Using if-else constructs leads to nested if-else constructs which bloat our working memory and make us blind to bugs inside the inner constructs. Early returns address exactly that. The fact that they also optimise the jump points of machine code and make the code faster to execute is a happy artefact and the machine side of the case against if-else constructs. The machine side of the case against if-else constructs predates the neuroscience research which is why you have only heard about this bit and not how it affects humans reading your code.

Sorry, I went on a bit of a tangent here. The point is, you must really know why you apply a rule, not apply it axiomatically. In this case we can remove the ternary but had it been the case that --live-site is specified more often than it's not then we should have left the ternary in regardless of preference and coding style.

@richard67
Copy link
Member

@nikosdion Maybe you can update the description of this PR to the right fallback URL, just for documentation purpose?

Currently it says:

If you do not define this option it falls back to the $live_site in configuration.php. If that is empty it uses the fake URL http://www.example.com just so that the CLI application does not crash.

Just replace the http://www.example.com by https://joomla.invalid/set/by/console/application for future readers.

Or maybe, if the "invalid" in the domain might be too scary for some readers, find another way to describe it.

@richard67
Copy link
Member

I have tested this item ✅ successfully on bb9cbd7


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/38524.

@nikosdion
Copy link
Contributor Author

@richard67 Done!

@richard67
Copy link
Member

I've just tested again and also debugged that the --live-site parameter - if given - works as desired. So I think RTC is still valid.

@richard67
Copy link
Member

I've restored @alikon 's test result since the changes after that were just code style.

@richard67 richard67 added this to the Joomla! 4.2.1 milestone Aug 20, 2022
@HLeithner
Copy link
Member

Sorry, I went on a bit of a tangent here. The point is, you must really know why you apply a rule, not apply it axiomatically. In this case we can remove the ternary but had it been the case that --live-site is specified more often than it's not then we should have left the ternary in regardless of preference and coding style.

Didn't wanted to start a codestyle/optimization debate here, it's also a bit offtopic^^

We can do this in it's own thread but in the end I think it will end in opinions and will not fit all so let us focus on more important topics for the moment ;-)

@nikosdion
Copy link
Contributor Author

@HLeithner I said it's a tangent 😆 Don't worry, I am not debating the code style. I am just noting that the Joomla code style is PSR-12 and section 5.1 does NOT mention early returns. That is to say, the rule about early returns and avoiding if-else blocks is a preference, not a code style, and we should know where it comes from and why we do it.

I actually DO AGREE with simplified if-blocks and I've spent a lot of time about a decade ago refactoring my own code after attending J and Beyond where Rafael Dohms did a presentation on code callisthenics. Having done that for a decade I also know the exceptions (discovered them the hard way) and learned why we do what we do. I just disseminated that knowledge just in case someone else needs it when they try to apply the simplified if-block rule and they hit a brick wall.

In the end of the day, code is a unique style of prose which has to be readable by humans and machines alike. It stands at the intersection of human and applied sciences. The more we understand about both, the better code we can all write.

@roland-d roland-d merged commit b775fdb into joomla:4.2-dev Aug 20, 2022
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Aug 20, 2022
@roland-d
Copy link
Contributor

Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants