Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3] Ensure that url alias segments correspond to real aliases of articles/categories #32879

Closed
wants to merge 3 commits into from
Closed

Conversation

PhilETaylor
Copy link
Contributor

@PhilETaylor PhilETaylor commented Mar 26, 2021

Issue #32490 and many others

Summary of Changes

This PR attempts to close a long standing issue where you can manipulate Joomla's urls and the router will still work.

This can be easily seen on the official Joomla.org site at:

https://www.joomla.org/announcements/BLAHBLAHBLAHBLAHBLAHBLAH/5834-LALALALALALALLALALALALALA

https://www.joomla.org/announcements/i/HACK/HACKED/YOUR/SITE/YOU/SUCK/5834-LALALALALALALLALALALALALA

Where BLAHBLAHBLAHBLAHBLAHBLAH is a made up category name and LALALALALALALLALALALALALA is a made up article alias.

That fake url still loads the "Joomla 3.9.25 Release" article which has id 5834. It should, IMHO, and in the opinion of others, be a 404, as LALALALALALALLALALALALALA could be Joomla-Sucks or anything else SEO that a hacker wants to use.

Testing Instructions

Enabled SEF and rewrite. (Legacy, not modern)

Create yourself a category tree of lets say three categories (as a real test, you choose how many you want, it should still work), each a child of the last, and then an article (with alias my-article) in the bottom most category

Top Most -> Middle Most -> Bottom Most -> my-article

Visit the home page and you should see the link to My Article in the Latest Articles Module, click it and get a SEF url of

https://example.com/9-top-most/middle-most/bottom-most/3-my-article

Where 9 is the id of my category "Bottom Most" and 3 is the id of My Article (Yours might be different ids)

Actual result BEFORE applying this Pull Request

Before this PR you can manipulate the url and replace one or more of aliases with HHHH like below, these urls STILL WORK:

https://example.com/9-top-most/middle-most/bottom-most/3-HAHA
https://example.com/9-top-most/middle-most/HAHA/3-HAHA
https://example.com/9-top-most/HAHA/HAHA/3-HAHA
https://example.com/9-HAHA/HAHA/HAHA/3-HAHA

Expected result AFTER applying this Pull Request

After this P, when you manipulate the url and replace one or more of aliases with HHHH like below, these urls DONT WORK: and now lead correctly to a 404 page.

https://example.com/9-top-most/middle-most/bottom-most/3-HAHA
https://example.com/9-top-most/middle-most/HAHA/3-HAHA
https://example.com/9-top-most/HAHA/HAHA/3-HAHA
https://example.com/9-HAHA/HAHA/HAHA/3-HAHA

But accessing the full generated url with aliases correct STILL WORKS

https://example.com/9-top-most/middle-most/bottom-most/3-my-article

Documentation Changes Required

This is backward compatible apparently not, people disagree with me, Cool. Not my decision to make.

All the new code is doing is running 2n more queries to validate the aliases provided against either the alias of a category or article with a known id, or in the case of nested categories, that ANY category in the db has the provided alias (nor perfect, I know, but much better than it currently is, and this is only for the Middle Most and Top Most category if there are 3 or more nested categories).

There is one more case I wanted to address but ran out of time and that is with the new MODERN routing its possible to generate a url like:

https://example.com/?view=article&id=3:my-article&catid=9

where my-article is the alias of the My Article (id:3) article. This can also be manipulated like:

https://example.com/?view=article&id=3:HAHA&catid=9

and that url will still work correctly. This still needs addressing and has been forked to #32880 as not to be forgotten

// cc @brianteeman @Ruud68

…/categories

Signed-off-by: Phil E. Taylor <phil@phil-taylor.com>
@PhilETaylor PhilETaylor changed the title Ensure that url alias segments correspond to real aliases of articles/categories [3] Ensure that url alias segments correspond to real aliases of articles/categories Mar 26, 2021
@brianteeman
Copy link
Contributor

This is backward compatible,

Not really. Someone might have been using this "bug" intentionally. I can see how the article part is b/c but not the category part.

As such I cant see how this could be introduced before at least 3.10 as a minimum

@PhilETaylor

This comment was marked as abuse.

@brianteeman
Copy link
Contributor

Tell me how someone would use this "bug" intentionally, and for what purpose

The how is by manually using the fake link as an href

The why is "who the heck knows" but as its been possible for 15 years ....

@PhilETaylor

This comment was marked as abuse.

@brianteeman
Copy link
Contributor

As you keep correctly saying "institutional memory" is important. We've fixed one of these things before and it caused uproar

@PhilETaylor

This comment was marked as abuse.

@Hackwar
Copy link
Member

Hackwar commented Mar 26, 2021

I agree with you, that this is a bug, but I also agree with Brian that it isn't B/C.

@PhilETaylor

This comment was marked as abuse.

Phil Taylor and others added 2 commits March 26, 2021 20:22
Co-authored-by: Quy <quy@fluxbb.org>
Co-authored-by: Quy <quy@fluxbb.org>
@ReLater
Copy link
Contributor

ReLater commented Mar 27, 2021

This pr would break 2 bigger sites of my customers. They use the actual behavior for shortcut urls in lectures and papers (university) and on trade fairs etc. (component dealer). The only b\c solution would be an additional setting.

Besides search engines, there is also a real user world.

@PhilETaylor

This comment was marked as abuse.

@PhilETaylor

This comment was marked as abuse.

@Hackwar
Copy link
Member

Hackwar commented Mar 27, 2021

Notice: This has been largely fixed in 4.0. It's not that this isn't an issue, but that I don't think this is solvable in a B/C way and thus has been fixed in 4.0 instead of changing this in 3.6 with the addition of modern routing.

@Hackwar
Copy link
Member

Hackwar commented Mar 27, 2021

"Largely" means, that the IDs are validated and the number of segments when IDs are still enabled both for categories and articles. Missing or additional segments will throw errors. If you disable IDs, then the aliases are all validated as well.

@PhilETaylor

This comment was marked as abuse.

@PhilETaylor

This comment was marked as abuse.

@Hackwar
Copy link
Member

Hackwar commented Mar 27, 2021

Please first talk to the 3.10 maintainer (@zero-24) if he is willing to accept such a change in 3.10. The production team decided some time ago that we wouldn't add changes to 3.10 which aren't necessary to migrate to 4.0. In the end it is in the discretion of the maintainer if he wants to add it or not.

@PhilETaylor

This comment was marked as abuse.

@Hackwar
Copy link
Member

Hackwar commented Mar 27, 2021

This would definitely be judged as a new feature and I would be really surprised if @HLeithner would accept this for a release in 3.9.

@PhilETaylor

This comment was marked as abuse.

@Hackwar
Copy link
Member

Hackwar commented Mar 27, 2021

I'm only telling you what I know is the current status. Would you be happier if I hadn't written this here? Like, after you invested another 5 hours into this issue, only to be shot down? As I said, it is in the discretion of the maintainers and those are @zero-24 and @HLeithner and I said to talk to them first before you waste time on this. Besides that, I disagree with the development on Joomla 4.0. Not only are we moving forward with Joomla 4.0, we also have 4.1 and 4.2 already in the pipeline. You can point your new features to the respective branches already and have a promise when these will be released. However, in the end, we can't keep adding features wherever we want, because we would never get finished with 4.0, have no incentive to switch to 4.0 and delay everything even further. I don't like it either, but that is the reality.

@PhilETaylor

This comment was marked as abuse.

@PhilETaylor PhilETaylor deleted the tighteupurls branch March 27, 2021 15:59
@PhilETaylor

This comment was marked as abuse.

@HLeithner
Copy link
Member

@PhilETaylor I can't see which comments are deleted, the github audit log only tells me that (you?) deleted 5 comments. But I think you didn't deleted it your self and complains that someone else deleted them^^

Can you help me which comment got deleted?

@PhilETaylor

This comment was marked as abuse.

@brianteeman
Copy link
Contributor

Sometimes the cache can be quite aggressive.

@HLeithner
Copy link
Member

Sorry, I was mistaken. Multiple GitHub tabs open, and some tabs were not being updated in realtime like normal so it seemed that comments I made in one tab, when later revisiting an already open tab to the same issue, were not there.

I did later remove some that were just off-topic and not adding anything constructive to the issue later in the day once the project had taken further action to secure its admin rights.

Thanks for following this up.

ok thanks for the confirmation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants