Search on global fediverse #824
Comments
No, you can't for now. I have a few ideas for how to do such a thing:
|
After some chit-chat about that, we discussed a few possibilities:
|
One downside of following everything, as an instance administrator, is that there's a lot less ability to moderate, curate, or reliably categorize content. This can create problems for search quality, but also for community safety or legality. Systems like the automated flood-fill discovery above would place a somewhat arbitrary moderation burden on every administrator, if I understand the description correctly. On the other hand, a centralized server centralizes that moderation/curation/abuse-response role, for better or for worse. Personally, as an administrator, I would not want this responsibility. I would disable a promiscuous federation feature like this, since I wouldn't want to be responsible both for my instance's content and for the view of the entire fediverse that my instance provides. Other admins may be more open to taking on that role, but then how will those instances deal with illegal or otherwise undesirable content that threatens the instance or its users? A central search server, or multiple central search servers, would avoid putting this extra responsibility on every instance administrator, and may end up being a more useful service for users who are new to PeerTube anyway. |
Isn't it possible to have an endpoint listing all the available instances, with each instance categorized by theme? (Themes could be "nature", "general", "politics", etc.) |
@Booteille, who would decide what the categories mean and what content to include or exclude? This is still a moderation job that has to be done; perhaps tags of some kind could help the instance admins and search admins share this work, but the work still exists. |
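To make the themed-listing idea above concrete, here is a minimal sketch of what such an endpoint could return. Everything in it is an assumption for illustration: the host names, the `themes` field, and the rule that untagged instances fall under "general" are hypothetical, and no such PeerTube endpoint is implied. Note that the tags are self-declared by each instance, so the moderation question raised above (who decides what a category means) is left open.

```python
from collections import defaultdict

# Hypothetical directory data: each instance self-declares its themes.
INSTANCES = [
    {"host": "nature.example", "themes": ["nature"]},
    {"host": "news.example", "themes": ["politics", "general"]},
    {"host": "misc.example", "themes": []},
]


def group_by_theme(instances):
    """Return {theme: sorted list of hosts}.

    Instances that declare no theme are listed under "general"
    (an assumption made for this sketch).
    """
    grouped = defaultdict(list)
    for inst in instances:
        for theme in inst["themes"] or ["general"]:
            grouped[theme].append(inst["host"])
    return {theme: sorted(hosts) for theme, hosts in sorted(grouped.items())}


listing = group_by_theme(INSTANCES)
for theme, hosts in listing.items():
    print(theme, hosts)
```

A real directory would still need someone to review or correct the self-declared tags, which is exactly the shared work the reply above describes.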
Hi all, and thanks @ballsystemlord for opening this issue! I'm quite new to PeerTube but very interested in the concept because, as a home server administrator, I want to support decentralized IT services in general. When I promoted PeerTube among my friends, the lack of a "global search" was unfortunately the "walk-away point" for them. Looking for a "global PeerTube search", I found this issue, whose list of pros and cons is extremely interesting to me. I don't know yet which solution would best preserve a strictly decentralized concept, but what I can say for now is this: the explanation video What is PeerTube? (English subtitles) is not enough to explain to users how to use PeerTube, especially when it comes to video search. People who are not used to decentralized concepts simply don't understand the problem with a central search and refuse a service with a different usability concept (even though the different concept is for their own good). Anyway, for now I would suggest this: there should be an explanatory text under every search box of a PeerTube node (or a symbol linking to it) that
Edit: Maybe the "there is no global search" explanation should be placed above any search results as well with a text like:
I understand that a sustainable solution to this issue needs a lot of time, but until then PeerTube as software needs to meet people where they currently are (and they are expecting a global search). Explaining to users why there is no global search would be the best solution: it doesn't make anything worse, just better. I'm very sad to say it, but in the current state of PeerTube's usability, most people won't accept it as long as they are not better educated in decentralized concepts (which unfortunately has to be done by PeerTube; What is PeerTube? (English subtitles) is just a (great) start for the needed "elucidation"). |
We could use Yacy to solve this issue.
then it should output only PeerTube links about "cats". Now we can integrate this into PeerTube's website and pass search results from Yacy into PeerTube's UI seamlessly, without requiring the user to type any special commands, only the keywords they need results for. Note: for this to work properly, we'll most likely need to crawl PeerTube websites ourselves (it's very simple to do), just as shown in this video. |
@Zig-03 If it's possible to plug into Yacy, we can run Yacy right alongside PeerTube. Yacy will index every node, instance, channel, and user as members of an instance explore the fediverse. When searching, we can plug into Yacy for search results (we don't have to do everything on our own! We can stand on the shoulders of other free software). @scanlime As far as I can tell, an instance owner shouldn't have to moderate content hosted on other instances. Our instance will not host the content permanently (unless you manually choose to seed it), so I don't think we have to worry about any content-related issues. |
@buoyantair @Zig-03 while doing something with Yacy outside of the PeerTube codebase is certainly fine, I don't see us requiring Yacy on instances just to bring them more global search results. It is yet another external tool and it doesn't simplify the deployment of PeerTube at all. |
@rigelk Why don't we explicitly ask the instance owner at installation time? We could also give them a CLI tool to install new plugins (say, global search in this case). This means that we would keep the current instance-follow-specific search by default, and anyone looking for global search could just install and enable the plugin. It would also mean that we don't have to integrate Yacy into our code base; we would just interface with it (we send the Yacy server search strings and it gives us back search results to display in the main PeerTube client). |
Do we really want instances to be searching among servers they aren’t normally following? For instances that want “everything”, they’ll be following as many servers as possible anyway. Many instances though don’t actually want all the content, they’re trying to be more focused. Do you as an administrator want the ability to include search results for videos that aren’t otherwise available? Do you as a user want to force all servers to include global search results? I’m not sure what the intended result is here, and it might be worth making sure that the technical capability you’re envisioning will be useful and enabled by admins. |
I agree with @scanlime |
You guys are ignoring one REALLY big thing with respect to global search engines like Yacy: evil instances of PeerTube. Let me elaborate. If I'm running Google (Heaven forbid!), I can tell my search engine not to go to certain websites, to profile which websites users visit, and to perform a fuzzy search of my web database and rule out websites containing certain keywords that should not be used together (like "C event oriented multi-threaded programming made easy", with "easy" being the operative keyword. :) ). We can't expect to accomplish this the way Google does. We can't have a global search without a set of moderators (requiring no JS and telling people's browsers to disable it when viewing PeerTube sites would set the bar for evil sites much higher, though). I recommend the following (sorry, I don't know much JS, so I can't help much): you already maintain a list of PeerTube instances; this could be decentralized so that each instance has a list and users would not have to request this information from the main site only. Advantages: Drawbacks: |
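As a rough illustration of the decentralized instance list suggested above, the sketch below merges the lists an instance receives from its peers into its own, minus a locally maintained blocklist. The function name, the host names, and the blocklist parameter are all hypothetical; the blocklist is only one assumed way of addressing the "evil instances" concern, not something the comment specifies.

```python
def merge_instance_lists(local, peer_lists, blocked=frozenset()):
    """Union the local instance list with the lists advertised by peers.

    `blocked` is a hypothetical per-instance blocklist: hosts in it are
    dropped even if a peer advertises them.
    """
    known = set(local)
    for peer_list in peer_lists:
        known.update(peer_list)
    return sorted(known - set(blocked))


merged = merge_instance_lists(
    local=["a.example", "b.example"],
    peer_lists=[["b.example", "c.example"], ["evil.example", "d.example"]],
    blocked={"evil.example"},
)
print(merged)
```

With this approach each admin keeps the final say over which hosts their instance will list, at the cost of every admin having to maintain their own blocklist.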
Hello, |
@silicium14 that is a very good idea for a first step. I bet the final solution will be something like that, with the decentralization given by Yacy, and fair and open recommendation algorithms along the way. Decoupling hosting and search seems to me an obvious improvement. |
Thanks. It is better than nothing, but we can already see the problem: the people who host that search engine have already decided to censor search results. Probably not because they wanted to, but to avoid being the sole person responsible for whatever is shared. Maybe something like that should be hosted on the Tor network to be truly uncensored. |
In the US (a "free country") Tor has the noted drawback that most places that offer internet access for free, block access to the sites from which you get tor and tails. Many block connections to the network and any other proxies that they're aware of. Some even go so far as to block access to the websites where you can download linux distros which might have tor installed. I speak with over 4 years experience hopping from one internet cafe to another. |
This is great. And definitely needed.
|
I want to take on the work to evolve this into a YouTube-style interface where videos across all instances can be viewed. I'm glad to see there is recent discussion on this topic. Censorship on YouTube continues to grow, and at an accelerated rate. The time for this is now. |
Hi, there's an idea that I don't see having already been discussed and that I think could be relevant for this global search feature: you might get some useful inspiration from the way distributed search engines such as Yacy (for instance) work. As a disclaimer, I don't know much about them nor about their inner workings, but I know they exist and it seems to me that they might be a relevant model for PeerTube. What do you think? 🙂 (sorry if what I'm bringing up is already covered in previous discussions, I honestly didn't take the time to read the detail of all the options mentioned) |
On Wed, 27 May 2020 08:15:39 -0700 Thomas Kuntz ***@***.***> wrote:
> Hi, there's an idea that I don't see having already been discussed and
> that I think could be relevant for this global search feature: you
> might get some useful inspiration from the way [distributed search
> engines](https://en.wikipedia.org/wiki/Distributed_search_engine) such
> as [Yacy](https://en.wikipedia.org/wiki/YaCy) (for instance) work. As a
> disclaimer, I don't know much about them nor about their inner
> workings, but I know they exist and it seems to me that they might be a
> relevant model for PeerTube. What do you think? 🙂
IIRC, I did think of using Yacy as a base or whole search engine for
PeerTube. I decided against it because, as I said earlier, a search engine,
even a distributed one, can be attacked by governments that favor censorship.
|
Well, I don't really get your point. What I understand is that you don't think we should use Yacy because it's a search engine, and that any search engine, even a distributed one, is vulnerable to censorship and should thus be avoided. But if you follow this logic, we would have no search engine at all, which means no search feature. Maybe when I say "search engine" you think of external services like Bing or Google, but any piece of software that looks for specific content in a larger pool of content is a search engine. That includes the search feature in a blog, on Twitter, on Mastodon, or on your local file system, for instance. So building the "search on the global fediverse" feature would definitely amount to building a search engine into PeerTube. And, on the Internet, a distributed search engine is as close as you get to being censorship-resistant. :) So let me clarify: I'm not suggesting that we use Yacy itself, nor any other existing third-party service. I'm suggesting that we build into PeerTube a mechanism similar to the one Yacy (or other distributed search engines) uses, to power a fediverse-wide search feature. That is, a mechanism in which each instance indexes a part of the content on the fediverse (making up a "local index" on each instance) and in which, when a search is performed, requests are sent to other peer instances, searches are performed on those instances' indexes, and results are combined by the instance or user who made the search request (or maybe by a centralized "ranking server"?). That's just the base idea, and in fact it's somewhat similar to the third option mentioned in this comment |
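The mechanism described above (local indexes, a query sent to peers, results combined by the requester) could be sketched roughly as follows. This is an in-memory toy under stated assumptions, not PeerTube or Yacy code: the `Instance` and `Hit` classes, the word-overlap scoring, and the de-duplication by URL are all hypothetical, and the network requests between instances are replaced by direct method calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    url: str
    title: str
    score: float


class Instance:
    """A peer holding a local index of part of the fediverse (hypothetical)."""

    def __init__(self, host, videos):
        self.host = host
        self.index = videos  # maps video URL -> title

    def search_local(self, query):
        """Score each indexed title by the fraction of query terms it contains."""
        terms = query.lower().split()
        hits = []
        for url, title in self.index.items():
            words = title.lower().split()
            matched = sum(1 for term in terms if term in words)
            if matched:
                hits.append(Hit(url, title, matched / len(terms)))
        return hits


def federated_search(query, peers, limit=10):
    """Ask every peer, keep the best-scoring hit per URL, and rank the result."""
    best = {}
    for peer in peers:  # in a real system this would be a network request
        for hit in peer.search_local(query):
            if hit.url not in best or hit.score > best[hit.url].score:
                best[hit.url] = hit
    return sorted(best.values(), key=lambda h: (-h.score, h.url))[:limit]


a = Instance("a.example", {
    "https://a.example/v/1": "Funny cats compilation",
    "https://a.example/v/2": "Dog training",
})
b = Instance("b.example", {
    "https://a.example/v/1": "Funny cats compilation",  # also indexed by b
    "https://b.example/v/9": "Cats and dogs",
})
results = federated_search("funny cats", [a, b])
for hit in results:
    print(hit.score, hit.url)
```

Here the combining step runs on the requesting side; moving `federated_search` to a separate host would correspond to the centralized ranking server mentioned as an alternative.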
Wrong. Everything can potentially be abused or attacked, but there has to be something to abuse first. |
Sorry, my bad. I should have re-read my comment above. |
This is not just a problem of malicious actors. The challenge and art of searching online is to cherry-pick valuable information and separate it from the noise. Any search engine has a lot of noise. |
Most search engines have some sort of filters (I guess YaCy has them too) and we could use them to remove all the noise. For example, in Google you can paste this OK, that's nice, YaCy has them too! https://wiki.yacy.net/index.php/En:SearchParameters If we need some custom search parameters that would fit our use case, we could submit a PR on YaCy's GitHub page! |
IMHO YaCy is great for indexing web pages and RSS feeds, but the federated universe should have its own built-in search based on a DHT, similar to aMule's Kademlia implementation. |
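For readers unfamiliar with Kademlia, its core idea is that nodes and keys share one ID space and "closeness" is the XOR of two IDs, so a search keyword can be assigned to the instances whose IDs are closest to the keyword's ID. The sketch below shows only that distance metric. Deriving 160-bit IDs with SHA-1 from host names and keywords is an assumption made for illustration (it is not necessarily what aMule does), and the host names are hypothetical; routing tables and network lookups are left out.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit ID from a name using SHA-1 (an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(key: str, hosts, k=2):
    """Return the k hosts whose IDs are closest to the key's ID."""
    target = node_id(key)
    return sorted(hosts, key=lambda h: xor_distance(node_id(h), target))[:k]


hosts = ["a.example", "b.example", "c.example", "d.example"]
responsible = closest_nodes("cats", hosts, k=2)
print(responsible)
```

In a full DHT, the instances returned by `closest_nodes` would store the index entries for that keyword, and a searching instance would contact them instead of querying every peer.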
For French speakers who want to discuss this: https://framacolibri.org/t/recherche-globale-federee/8155 |
Implemented in #2852 |
Sorry for misusing this issue, but I don't know where else to ask: I used to use https://peertube-index.net/ for federated search requests, but for some time now the website has seemed to be offline. Are there any alternatives? |
Hi. Take a look at https://sepiasearch.org/ |
Thank you very much! |
I can find no search bar, and neither your faq.md nor the online FAQ lists a method of searching PeerTube, leaving me with only the option of manually going to each site and searching it.
Is there something I am not seeing, or is there a tool, like a browser plugin or a command-line tool such as surfraw, to search all of PeerTube?
Thanks!