Standardize search categories #1410
Comments
|
Can we have a TODO list of items that should be fixed or eliminated? Like
Mark things that were fixed (or that should not be fixed) as ✓. @coke what did you do to get this list? Can you update the list in this ticket? |
|
Ping :-) |
|
The CompUnit category has disappeared. As for the rest, I think it's better to create a test for what's left there; it's not clear to me what the desired target number of categories is, if any. |
|
Still not too clear how categories are created... I'll check this out. |
But some of the others mentioned in #1410 might still be useful. Phasers and Asynchronous phasers, for example. Also some reflow.
:tick: Reflow :tick: Eliminates Unneeded Caps :tick: Eliminates unneeded categories refs #1410
|
The problem is that this generates a |
Category creation
Currently, categories are assigned using the
The problem is
How to obtain these values
You can download Perl6::Documentable and execute:
use Perl6::Documentable::Registry;
# $cache, $topdir and $v are expected to be defined by the caller
my $registry = Perl6::Documentable::Registry.new(
    :$cache,
    :$topdir,
    :dirs(["Language", "Type", "Programs", "Native"]),
    :verbose($v)
);
$registry.compose;
# JSON list containing all search entries
say $registry.generate-search-index();
Or you can go to
Solution
I do not know what I should do with these. We need to discuss how
Let me know your opinions. |
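Once the search index is available, the categories it contains can be tallied to see what needs attention. A minimal Python sketch, assuming the index is a JSON array of objects with a `category` key; the field names and the sample entries are assumptions for illustration, not the documented output of Perl6::Documentable:

```python
import json
from collections import Counter


def count_categories(index_json: str) -> Counter:
    """Count search entries per category in a JSON search index."""
    entries = json.loads(index_json)
    # "category" is an assumed field name, used here for illustration only
    return Counter(entry["category"] for entry in entries)


# Hypothetical sample data standing in for the generated index
sample = json.dumps([
    {"category": "Class", "value": "Any"},
    {"category": "Type", "value": "Int"},
    {"category": "Class", "value": "Mu"},
    {"category": "Parameter", "value": "slurpy"},
])

counts = count_categories(sample)
# Categories with a single entry are candidates for review
singletons = sorted(name for name, n in counts.items() if n == 1)
print(counts.most_common())
print(singletons)
```

Listing the single-entry categories separately makes it easier to spot the ones that probably should not be top-level categories.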
My small idea:
I suspect a lot of warnings will be produced on the first run, but once we get rid of them, it will be easy to maintain an understandable and solid set of search categories. The second important thing is, of course, to have this list documented somewhere on the docs contributing page and to mention it in the warning message, e.g. "See allowed categories at foo.bar.com", so that people will be able to select a correct one. It will put a bit of an end to all this |
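The accept-list idea with a pointer to the documentation could look roughly like this. It is a Python sketch only: the list contents, the URL and the function name are all hypothetical, and the real check would live in the documentation tooling.

```python
import warnings

# Hypothetical accept-list and documentation URL, for illustration only
ALLOWED_CATEGORIES = {"Types", "Routines", "Syntax"}
DOCS_URL = "https://example.org/allowed-categories"


def check_category(category: str, source: str) -> bool:
    """Warn and return False when a category is not on the accept-list."""
    if category in ALLOWED_CATEGORIES:
        return True
    warnings.warn(
        f"{source}: unknown search category {category!r}. "
        f"See allowed categories at {DOCS_URL}"
    )
    return False
```

Because the check only warns, the first run can report every offending entry at once instead of stopping at the first one.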
|
Just to be clear: even without an ill will intended, humans are not best when it comes to being perfect, so when someone lazy like me adds an anchor in Judging from the fact we have categories like "Buildall (Method)" it is not always even clear for people what should be in |
|
Mm, I like your idea @Altai-man, it's doable. Now the problem is to define the list of possible values for |
Fix index syntax of pointy block Probably also with #1410
|
An update: current "categories" are:
So in three years it became more messy. |
|
I can add a test to catch that we don't add any new ones, at least until we decide what the correct listing is. Any interest? |
This would be very helpful! I am working on "deciding" the correct listing at this exact moment... |
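A test that freezes the current set of categories until the final list is decided can be a plain set comparison against a stored baseline. A Python sketch, assuming the baseline and the current categories are already available as sets of strings; the function name and sample values are hypothetical:

```python
from typing import List, Set, Tuple


def diff_categories(baseline: Set[str], current: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (added, removed) category names relative to the baseline."""
    added = sorted(current - baseline)
    removed = sorted(baseline - current)
    return added, removed


# Hypothetical snapshot of categories versus a later build
baseline = {"Class", "Type", "Phasers", "Asynchronous phasers"}
current = {"Class", "Type", "Phasers", "0B (Radix Form)"}

added, removed = diff_categories(baseline, current)
print("new categories:", added)
print("vanished categories:", removed)
```

A test would fail when `added` is non-empty, while `removed` can simply be reported so the baseline shrinks as cleanup proceeds.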
|
Questions / things to note:
What categories I suggest (in parentheses I put items from what we have now that will be absorbed, if something is not present it should be removed):
What bothers me is that we currently have "Language", "Syntax", "Reference". What is there is often mis-categorized (e.g. builtin I imagine "Syntax" explains syntax bits (quoting, keywords, etc), "Reference" explains semantic bits (what twigils are, things like that) and "Language" is, uhh... I know it contains pages under "Language" category, but still as search items their titles are not always very welcome. But anyway, we probably can live with them not to over-complicate. Thanks for reading to this point... What should be done next:
|
|
I was unable to |
|
Do Then you do: and grep on lines produced by it. If you don't like lines, you can look at implementation of |
It's got its own braid and all. So yes.
Probably
I remember vaguely we've been there already. Let me search back issues.
OK
Why? And please, no redirects... Let's just try and have things that are programmed and tested and well specified.
All these categories are reasonable; but I'd like to see which ones are removed. That's probably more significant.
Hard to say. They were already there when I arrived. Would probably need to dig into the blame for those lines.
Syntax generally covers things that are pure syntax rather than functions or the like, such as if or do. But here's the thing: I think we're mixing two different things here. One is search term categorization; the other is page categorization, which is totally different and created somewhere else.
I'd rather see a list of what needs to disappear.
The thing is, page generation is tested under CircleCI, which is the only one that has Documentable. I'd like to keep it there, if possible. Documentable is not a stable module, and I'd really not like it to be a dependency of this, so if it's needed for some test, please take it to the CircleCI build.
While making sure that there's no big change in URLs. There shouldn't be, at least at the path level, but just to make it clear. |
|
Re Python, see #2355 |
A "Should we choose X or Y" question does not get on with a "Probably" answer.
Because someone has indexed https://docs.raku.org/language/functions#Blocks_and_lambdas so wrong it created a page with URL literally being This is why it should be re-indexed and this URL should point to something else, thus a redirect. "No redirects" is not an option, we have fallen soldiers, had before and will be forced to have since now. "No redirects" means "Abandon them". And it does not interfere in having a stable implementation in any way.
^ removed completely. Everything else from the list just migrates to new categories suggested (corresponding items are noted in parentheses after category names).
Yes, that's the feeling I have. But we can live with it for now, I'd say.
See above, plus what should be absorbed is already proposed above.
Why? |
Well, leaning towards "yes", but don't have a strong opinion for this.
But I would say that's an example of bad indexing, not a systemic problem. And that can be fixed now. Probably, going forward, there should be a way of banning this kind of thing (but I don't really see how we can prevent all possible mistakes).
Still, I'd rather have no redirects. Right now there are a few tweaks you have to make (mainly to serve files with no extension as HTML), as well as some special treatment for things that have a "." in their name. If we want to ban that kind of thing in the index, so be it: let's add a test, or whatever, to avoid that. But "solving" it with a redirect is simply kicking the ball down the field.
Well, I should maybe qualify that. First, expanding it to mean "don't try to solve any problem with the document generation using infrastructure". Second, qualifying it to mean "anything that needs special treatment should be well specified and dealt with within the doc generation framework".
See #2355. Please reopen it if you really have a strong opinion.
This all originated in the 101 page that was incorporated from somewhere else. It probably makes sense to eliminate them, but then again, indexing policy is not something that should be done in an ad-hoc way. And then, making an accept-list the default policy does not really solve the indexing issues that are there: #3458 and #3520, for instance. Also #2575, which was closed and probably should not have been. We don't even have a unified criterion for category naming; these above should probably be banned just on the basis of using parentheses...
And this one on the basis of using all CAPS.
We should keep the all whitespace search category. Just kidding.
Except for Python (and maybe classes? I really have no idea about that one), I mostly agree. The problem is not whether we agree on these categories; it is that we need to create a spec for categories and have all existing ones follow that spec, raising a warning if someone creates a new category, and an error if it is not up to spec. This applies to Python, for instance, and to pretty much any of them. We can discuss all the way to Mendocino and back whether Python should be in (or whether we should also add Ruby or Perl; BTW, all perl2x categories are special-treated in the documentation, IIRC), but at the end of the day it's a judgment call. Having a search category spec or rule that can be enforced will put us on different ground.
The main problem, the way I see it, is that it's not tested against what we want to achieve from it. It's unit tested (and that's a big improvement over the htmlify.p6 we had before), it's tested for build errors (in CircleCI), but there's no test that checks whether what's generated will fit what we already have in the deployed docs. That's bitten us (hard) several times in the past. It might be that the coverage of the unit tests is not really complete, and for the time being we don't have a coverage test in Raku to check that. |
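A rule-based check along these lines might separate "breaks the naming spec" (error) from "follows the spec but is new" (warning). The Python sketch below is only an illustration: the three naming rules are taken from the objections raised in this thread (parentheses, all caps, whitespace-only), while the function name, the `known` set and the severity strings are assumptions.

```python
from typing import Set


def classify_category(name: str, known: Set[str]) -> str:
    """Return "error", "warning" or "ok" for a search category name."""
    # Rule 1: empty or whitespace-only names are never valid
    if not name.strip():
        return "error"
    # Rule 2: no parentheses in category names
    if "(" in name or ")" in name:
        return "error"
    # Rule 3: no all-caps names (single letters are left alone)
    letters = [c for c in name if c.isalpha()]
    if len(letters) > 1 and all(c.isupper() for c in letters):
        return "error"
    # Well-formed but not yet in the spec: flag for discussion
    if name not in known:
        return "warning"
    return "ok"


# Hypothetical set of categories already accepted by the spec
known = {"Phasers", "Syntax"}
for candidate in ["Phasers", "Python", "0B (Radix Form)", "REFERENCE", "   "]:
    print(repr(candidate), classify_category(candidate, known))
```

Keeping the naming rules separate from the list of known names means a new category can be proposed without touching the rules, and only the judgment call about accepting it needs discussion.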
The way is, IMO: 1) Spec a list of allowed categories (what this issue is about).
The first issue you refer to relies upon us having a spec/standard of search categories and points to this issue. So this issue must be resolved first and I proposed a solution above.
So let's create it. I made a proposal above stating the categories. If something is obviously wrong with that, let's tweak it. If it sounds sane, let's go with it, document it and close this ticket after the test is done and docs are adapted.
Let's make it stable then. Having a standard for search categories is a step towards that. It does not suddenly solve every single issue, but we won't get anywhere without settling specs one by one, because right now we either don't have them or they are incomplete. I understand your worries about things getting messed up in the process; it was shaky and partially still is. Maybe it is worth working on this in a calm branch then? Potential changes there won't affect anything on master, and the new tooling will work with a branch easily. When/if it is stable enough and indeed nicer than it is now, it is not a big deal to just migrate the changes. |
|
Well, some category errors are in this branch, so they should be fixed. I've already created an issue for that. We need to be on the same page, however, regarding categories. You say accept list, but as I say, that's simply kicking down the ball. It means that we will have to discuss every single addition to the accept list. Let's try to do this: let's look at what the current list of categories has in common, and what makes them acceptable. Let's iterate until we deduce a set of rules from them. In this process, let's also take into account that documentation pages have metadata that includes categories, so we might want to have some common criteria for both. Eventually, we'll get a rule-based accept-list, but also a rule to test new categories as they are created. Also, I think that this procedure should take place in the problem-solving repo, for maximum audience. When we have that, let's get back here and solve this issue. Would that be acceptable for everyone? |
Returning to this. I don't think there should be such a bureaucratic process for such a trivial issue. What is this issue really about? We have a language. Its design has been more or less done over the last 20 years. In a language we have things. We document them. When a user searches, we want to sort things into categories for them to ease the process. So, basically, we have classes, roles, routines, methods, infix ops, etc. Things like that. There was no such list of categories from the start, and people went with ad-hoc solutions they saw fit at the time the decision was made. Alas, it resulted in various mistakes (when totally incorrect things started to count as a category) or just inconsistent styling (e.g. uppercase vs lowercase). What I suggest is to simply write a list here (there is a suggestion above; if you don't like it, comment on what exactly is wrong with it until we are all satisfied). Document it, write a test and gradually update the docs we have now to adhere to the list. That will resolve this issue, once and for all, really. There is no need for a whole process of acceptance or rules for new categories. New categories - will there be any?
Do we really need one? Now, after 20 years, we have, say, classes and roles. Maybe tomorrow someone will invent If you really insist, I will initiate a problem solving ticket and be the one who suggests a solution (a copy-paste of what is above, but still), because I agree it is hard to make a decision for everyone to adhere when you feel responsible. |
|
It's probably best to have something working. But I would still like to have something resembling a "bureaucratic process", probably in a different thread, because, although I've not been here for a long time, it's been long enough to see how unsolved problems tend to show up again a few months down the line. |
|
So we had Raku/problem-solving#250 for more than 2 weeks and no progress. |
|
So after 5 years, I believe this can be closed, as we have a standard list of categories set and clarified and reflected in docs. |
The current build generates the following categories of items for the (search) index.
Some of these have potential overlap, like Class & Type; others, like 0B (Radix Form) or Proc::Async Object shouldn't be top level categories at all. Items with only one entry are also suspect, like "Parameter"