-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BooruBridge similar to WordPressBridge? #402
Comments
don't bother, it's nasty: I burned my eyes doing the refactoring of the .ooru. bridges :) I really don't like the fact that a full url can be used as a parameter: to many things could go wrong if the right verifications are not done, the first of them being the RSS-bridge host used as a proxy for whatever purpose. Also as said in #401, I have an upcoming patch that will add information about HTTPS status of each sites : even if they share the same content structure, their HTTPS status can be very different, which means that bridges can access data over a secure connection for some while using an insecure connection for others |
I'll take your comment from #401 here, so we can have one common platform:
There are probably hundreds of thousands of WordPress sites out there, so the possible amount of bridges that can be implemented is just impossible to maintain. This is why in the past a whole bunch of compatible WordPress bridges were removed with the strict requirement to use the general WordPress bridge instead. An array of supported URLs however is a different matter and only requires the change of one single line in order to add/remove another site. So in order to limit the input to a given set of values we just need a drop-down list.
So, for as long as we can paste custom URLs, it would be possible to request HTTP for a site that actually supports HTTPS, right? A drop-down list will limit the input to specific sites and can also enforce secure connections. Additionally, by using the two-layer approach (sub-elements), we can even add additional pages for each site (for specific forums and such).
😞 |
Except that users may not be aware of the undergoing CMS used by the site they want to see supported, therefore they will not go check in the WhateverBridge drop-down list. The current way of listing bridges according to the site they are related to should be conserved. I imagine a more generic way :
Consider the above as pulled out of my head without any real strong thinking about its feasibility or bad consequences. |
Yes @pmaziere it was actually the purpose for Xbooru, initially ^^ http://www.numerama.com/magazine/33328-des-sites-de-mangas-hentai-bloques-en-france.html Some year ago I used booru bridges in combination with Autoblogs. While boorus have the same principles, several CMS exist. Booruproject being a platform like Wordpress.com, there's also Danbooru, Gelbooru, and the more evolved Moebooru (used by Konachan, Yande.re,..) it's PHP fork MyImouto (Lolibooru), Shimmie (Milbooru)... Trying to pinpoint recurring elements to use in unified bridge could be tough, but could really help reducing the number of different bridges (and allowing easier addition of more supported websites) |
That's not what I'm talking about: I meant using any flaw in the url processing code (or lack of processing, such as what was done in the VkBridge) to download whatever data from the RSS-Bridge provider host, unrelated to the initial RSS-Bridge goal.
that's what have been done in cd361f8 |
So, we need a better way to guide the users to the correct bridge. This is a usability issue, not a bridge issue.
We can do the same by adding more information to existing bridges. Imagine something like this: class CMSBridge extends BridgeAbstract {
// We don't need this anymore (because of SITES)
//
// const NAME = "CMS Bridge";
// const URI = "https://cms.com";
// const DESCRITPION = "A bridge to cover sites generated by CMS";
const PARAMETERS = array(
"u" => array(
"name" => "username",
"required" => true
)
);
const SITES = array(
"site1" => array(
"name" => "MySite",
"uri" => "https://my-site.com",
"description" => "Returns the latest news from 'MySite'"
),
"site2" => array(
"name" => "Other site",
"uri" => "https://other-site.com",
"description" => "Returns the latest new from 'other-site.com'"
)
);
// And so on, and so forth...
} We can render this into multiple cards, where each card uses the exact same PARAMETERS and collectData implementation. Of course for sites that require additional functions we can either create a new bridge or do optional stuff based on the current site. The whitelist can then either apply to the entire CMSBridge, or just a few SITES within the bridge (like "site1" could be whitelisted like "CMSBridge_site1" or something similar. I also imagine a more generic way of actually displaying bridges: Right now we got about 120 bridges, which are rendered into a very long page. This is a very inefficient way of doing things and in my opinion already exceeded the point of being user-unfriendly. Adding more bridges/cards will just make things worse. Instead of displaying all bridges on one page, we should rather provide a simple list of domain names (with some kind of search filter), so the user can select or search the site of their choice. Next the user can choose options and select the output format like they can do now. |
For something like WordPressBridge, it's going to be a huge const SITES array.
do you mean something like an alphabetic index and a search box ? |
That would be the preferred solution. 👍 I just pushed a sandbox branch on my fork to attempt what I wrote earlier: https://github.com/LogMANOriginal/rss-bridge/tree/MultiSiteBridges That branch only has the WordPress bridge working with three sites. I tested everything except caching and whitelisting and it is most certainly possible. Still, changing all bridges to use the SITES constant (or a JSON file for that matter) is no good solution, but using another base class for CMS based bridges sounds like a good alternative.
Yes, exactly. |
I agree.
Bridge::setInputs(array $inputs, $queriedContext) becomes Bridge::setInputs(array $inputs, $queriedContext, $parameters=static::PARAMETERS) Then we could reorganize the class hierarchy as follows: Bridge
=> FeedExpander where JSON should define a feed URI in addition to the site URI as you did in your branch with the SITES constant, JSON file defines
And the alphabetical index can be implemented and the search box could be implemented independently from the above. I'll put the HTTPS status feature I implemented on hold since it would need to be adapted to all the above changes |
These are some quite heavy changes. You raise some really interesting points that got me thinking for a couple of hours. Finally I got an idea that might be easier to implement with less refactoring (sorry in advanced for the long comment 😅): First of all, right now BridgeAbstract contains a lot of functions which really should not be part of it. The purpose of BridgeAbstract is to provide a base class that covers standard implementations for all functions defined in the interface. So we should place following functions into separate and maybe already existing classes:
I have thought about it and came to one important conclusion: The purpose of BridgeAbstract is to be a base class for all bridges and it is not supposed to be a factory for bridges. Which brings me to the next point: We need a bridge factory class. Right now we build bridge cards based on the files on disk. This is a very simple solution which now hit its limits. What we need is a factory class that builds bridges based on regular bridge classes AND based on JSON-formatted data with some base class. In order to do that we have to provide some kind of indicator in order to distinguish between "normal" bridges and "multi" bridges. At the same time both types should still be able to make use of all standard functions in BridgeAbstract. One of the easiest ways to do that, is to introduce another interface for "multi" bridges. Let's call it "MultiBridge" for now: interface MultiBridge {} The interface does not necessarily need to define any functions (though I think it will). However, using that additional interface we are already able to distinguish bridges by using something like Of course there are many more details to work out, but I think this way we are very flexible and could even introduce additional bridge variations in the future.
By using the interface approach it shouldn't be a problem to continue your project, since existing bridges wouldn't be touched in any destructive way. Actually I'm pretty sure we only need to change very little of the existing code and can focus on new functions entirely... probably™ 😜 I hope I was able to convey my idea. At the weekend I should be able to write some code examples based on the real code. On a side note: I think it makes sense to have TWO interfaces and actually define the const::NAME, const::URI, etc... in the interface in order to make things more specific. (We can actually have another layer between BridgeAbstract and the bridges in order to get base classes for MultiBridge and RegularBridge and thus have more intelligent getName, getURI, etc... functions.) |
From my point of view, these do not justify to be methods of a class: simple functions should be sufficient, but I may have missed something.
Input validation methods are related to the bridge parameters and won't be used elsewhere, so I fail to see what could justify their removal from BridgeAbstract ? I may need more explanations to understand your two interface solution: I don't see how multiple interfaces would be more appropriate than a simple class hierarchy. An interface defines elements that a class must implement to be considered valid. If different classes with no hierarchical relationship shares these elements, then the interface is justified, but when the classes can be represented as a hierarchy, where is the need for this interface ? |
👍
A bridge should do input validation, but it should not implement the actual code for doing so. Right now a bridge overriding validateNumberValue would be absolutely correct, but in reality we don't want that.
It's not more or less appropriate, just a different solution I think is easier to implement, also because it has absolutely no impact on existing bridges. Both solutions should be equally fine in the long run.
I'll try to explain how I think about it: Bridges and MultiBridges are like Cars and Trucks (talking about objects). They are on the same level of the hierarchy and only share a common parent (BridgeAbstract respectively BridgeInterface, or Vehicle). However there are distinct differences between them, which are not covered by the common base. Bridge provides NAME, URI, DESCRIPTION by design and the BridgeInterface already defines functions to access them via getName, getURI and getDescription. MultiBridge on the other hand does not know NAME, URI or DESCRIPTION, but instead requires us to tell it these parameters based on the JSON-formatted file on disk. To close the circle, Bridge cannot work with any additional parameters received from a JSON-formatted file. This is where another interface comes into play: MultiBridge will still adhere to the BridgeInterface, but it will also adhere to the additional MultiBridgeInterface which requires MultiBridge to implement additional functions for the caller to be able to set the additional data. There is one very important factor here: MultiBridge is not responsible for loading data from disk. This is done by the caller.
There are many ways to do the same thing. An interface just defines functions, where an abstract class can already implement a certain set of functions. Interfaces however clearly state how the "contract" between the caller and callee works, without any distraction of implementation, where abstract classes are much harder to comprehend. For our small classes this is not a very big factor, but you can see the impact for very complex interfaces. I have pushed a new branch on my fork to show a possible interface solution: https://github.com/LogMANOriginal/rss-bridge/tree/InterfaceMultiBridge Again, this is far from being final, but it already does work with the three sites for WordPressBridge. All existing bridges are still working as before btw. Please notice that HTMLUtils is very messy as I just duplicated the original code with some adjustments for MultiBridges. This can be done in a much more cleaner way by using functions. |
That's where we diverge: I consider the MultiBridge class to be a child of the misnamed BridgeAbstract.
So you don't consider a MultiBridge class at all, but rather another interface implemented by existing bridges, while I consider a new MultiBridge class derived from Bridge Abstract from which existing bridges like WordPressBridge could derive.
I fail to understand the benefit of the above. In your implementation you will have to replicate the setSite method in each bridges implementing the MultiBridge interface, while this could be done once and for all by a MultiBridge class. Why should the caller (either index.php or HTMLUtils if I understand correctly) should be the one loading the data from disk ? Again, the loading code could be implemented once and for all in a MultiBridge class.
I agree with the clarity of interfaces over abstract classes, but in the case of the MultiBridge interface, it does not apply since the method defined in the interface will always be implemented in the same way for each bridges. All I wrote above came as long as I read your post. I guess that I should implement my solution, and then we could compare it to yours to see what would be the more interesting to finally use. |
In case you were still waiting for the implementation I proposed above, sorry but I can't find time to work on it for the moment |
Thanks for letting me know. This is not a high priority for me right now, but I still intend to make this work. I have thought a bit about your solution and I'll do some tests myself, though this will have to wait until my next holidays. |
I haven't had time to comment on this issue previously, but as I have time now, here it is : I think that implementing multi-bridge is a good idea, it could be used for bridges providing feeds from sites in different languages, but not using the same code, etc... However, I strongly disagree on the idea of removing wordpress bridge, and not allowing the use of custom urls. I think that removes a great functionality of rss-bridge, that is used by many (including me). I do understand, however, the security reasons that motivate your idea, but rss-bridge is never using any 'dangerous' php function, such as exec, etc..., and the only time some arbitrary data are not displayed immediately, they are passed through the json_encode/json_decode function. I do agree, though, that some people may not want external users to access their rss-bridge instance and fetch arbitrary feeds, which is not possible for the moment. I think we could add an optional HTTP authentication to address this issue, as it is easy to implement and supported by a great majority of feed readers. |
Me too, though once multi-bridges are implemented, generic bridges can optionally be disabled via the whitelist. That way we get both. I'll try to get this working before 2017 arrives 😅 |
This is slightly off-topic of course and the proposed solution about derivative bridges sounds great, but I do have one question: a WordPress bridge sounds rather… redundant? |
Yes, writing bridges for sites using WordPress is very redundant (as with most CMS I guess). The only thing changing is the URL and maybe a few deviations in the HTML code due to different versions. Of course having a single bridge accepting any URL is working very well but you loose control to some extend as literally any compatible site can be fetched (including ones you maybe don't want to be associated with). I still haven't found time (and the urge) to work out a working solution and due to my final exams closing in this will have to wait for quite a bit unfortunately. |
The HTML can vary a fair bit with custom themes. I was referring more to the fact that it comes with decent RSS/Atom built in. Even if not explicitly exposed, a /feed at the end of the URL tends to do the trick (or equivalent query string). Of course you could easily kill that if you really wanted to so I guess that must be an issue? :-) |
The current WordPress bridge is actually making use of the build in options, see https://github.com/RSS-Bridge/rss-bridge/blob/master/bridges/WordPressBridge.php#L70-L72 If that option is disabled nothing will show up and a custom bridge is necessary. |
I've got to say that only makes me more confused as to the purpose of this bridge. What do people use it for? |
Good question. I actually don't use it myself. It might be useful to anyone who doesn't know about the /feed option I guess. A lot of sites actually provide carefully hidden feeds with no way to reach them naturally. Being able to just paste the URL and receiving a feed is helpful. Of course now that I think about it instead of building a new feed we could just forward users to the right address without actually downloading random stuff... |
Most of wordpress websites are offering truncated RSS feeds. |
Ah right, I get it. Thanks for explaining. Incidentally, is there a generic expander bridge where you just pass a feed and a CSS class to extract? |
I started working on this, but never pushed it to RSS-bridge. |
Yes, definitely. There are actually two "bridges" I've been thinking of adding, although I wasn't completely sure to what extent they'd fit into this project:
if ($action == 'permit') {
if (preg_match("/$string/", $item->get_title()))
$display = true;
}
elseif ($action == 'block') {
if (!preg_match("/$string/", $item->get_title()))
$display = true;
} |
See the comment RSS-Bridge#402 (comment) Split off from RSS-Bridge#535
In case it is not obvious - I lost interest in this topic 😄 |
Bridge information were exposed and accessed via public constants which doesn't work if you want to generate bridges dynamically as discussed in RSS-Bridge#402
See the comment RSS-Bridge#402 (comment) Split off from RSS-Bridge#535
There are a lot of bridges that extend basically from BooruprojectBridge or DanbooruBridge:
Most bridges overwrite some constants and don't add any additional functions, so maybe it is possible to do something similar to WordPressBridge where you can insert the URL and request data for any compatible site. I'm not (yet) familiar with the site contents, but is "booru" something similar to WordPress which we can utilize with a more general approach?
The text was updated successfully, but these errors were encountered: