Special pages? #9

Closed
arthurpsmith opened this issue May 18, 2020 · 28 comments

Comments

@arthurpsmith
Contributor

Hi Denny - I'm wondering if you've had a chance to think about/look at what some of the Special pages should do? I'm particularly interested in:

  • WhatLinksHere - show the ZObjects that make use of this object in any way
  • Lists by type - list all functions with labels in a particular language, for example
  • Other search functionality maybe?

I've taken a peek at the MediaWiki extension documentation on this, but it will take a bit more time to figure out how best to do it, I think, so if you have any advice I'd appreciate it!

@vrandezo
Contributor

Yes re WhatLinksHere.

For lists by type, I would list not functions with labels in a particular language, but rather ZObjects according to their Z1K1. That alone will get us quite far.

Other search functionality would be ZFunctions by return type and ZFunctions by types of input.

If we had that, it would already be a huge help!
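
A minimal sketch of the three lookups suggested above (ZObjects by Z1K1, functions by return type, functions by input type), written in Python rather than the extension's PHP. Only `Z1K1` comes from this discussion; the dictionary layout, the key names `return_type` and `arg_types`, and the sample ids are assumptions made for illustration.

```python
from collections import defaultdict


def build_indexes(zobjects):
    """Build three lookup tables from a mapping of zid -> object dict.

    The object layout is hypothetical: 'Z1K1' holds the type id, while
    'return_type' and 'arg_types' are illustrative key names only.
    """
    by_type = defaultdict(set)     # Z1K1 value -> zids of that type
    by_return = defaultdict(set)   # return type -> zids of functions
    by_arg = defaultdict(set)      # argument type -> zids of functions
    for zid, obj in zobjects.items():
        by_type[obj["Z1K1"]].add(zid)
        if "return_type" in obj:
            by_return[obj["return_type"]].add(zid)
        for arg_type in obj.get("arg_types", []):
            by_arg[arg_type].add(zid)
    return by_type, by_return, by_arg


# Placeholder data; these ids do not refer to real ZObjects.
SAMPLE = {
    "Z100": {"Z1K1": "Ztype"},
    "Z200": {"Z1K1": "Zfunction", "return_type": "Z100",
             "arg_types": ["Z100", "Z100"]},
    "Z201": {"Z1K1": "Zfunction", "return_type": "Z100", "arg_types": []},
}

by_type, by_return, by_arg = build_indexes(SAMPLE)
```

Each special page would then amount to one lookup in the matching table, which is roughly what a dedicated database table keyed on type would provide.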

@arthurpsmith
Contributor Author

Ok, pull request #11 fixes WhatLinksHere!

@arthurpsmith
Contributor Author

arthurpsmith commented May 26, 2020

Thanks for merging. I'm now working on the other Special pages by type. I have it basically working, but I'd like to put a label by each Z object id on the pages (and on WhatLinksHere too, if possible). The "Z36(zid)" method seems to produce very nice text for a label, but it's very slow to run the way AbstractTextContent.php does it (it would be nice to speed that up too!). So the alternatives I've been considering are:
(1) Add a database table with zid, language, and the Z36(zid) value as text, for quick lookup. (I've already added a table to handle the type info, so this isn't too crazy an idea, but maybe we don't want to cache these things so permanently?)
(2) Set up eneyj to run as a server (i.e. respond continuously to any request). I think this would significantly cut the time for the Z36 run compared with the command-line script, as the vast majority of the time seems to be start-up time. It would also let other things like the API and tests run more quickly, and it might fix security issues with running command-line scripts!
(3) Some sort of hybrid, since (2) may still be a little slow: a database cache whose entries expire?

Any strong feelings on any of this?
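
For reference, option (3) could look roughly like the sketch below: a table keyed on zid and language whose entries are recomputed once they are older than a time-to-live. This is Python with an in-memory SQLite database, not the extension's PHP; the table name `label_cache`, the class, and the `compute_label` callback (standing in for the slow Z36 run) are all hypothetical.

```python
import sqlite3
import time


class LabelCache:
    """Cache of (zid, lang) -> label text whose entries expire after a TTL."""

    def __init__(self, compute_label, ttl_seconds=3600, clock=time.time):
        self._compute = compute_label   # slow fallback, e.g. the Z36 run
        self._ttl = ttl_seconds
        self._clock = clock             # injectable so expiry can be tested
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE label_cache ("
            " zid TEXT NOT NULL, lang TEXT NOT NULL,"
            " label TEXT NOT NULL, stored_at REAL NOT NULL,"
            " PRIMARY KEY (zid, lang))"
        )

    def get(self, zid, lang):
        now = self._clock()
        row = self._db.execute(
            "SELECT label, stored_at FROM label_cache"
            " WHERE zid = ? AND lang = ?",
            (zid, lang),
        ).fetchone()
        if row is not None and now - row[1] < self._ttl:
            return row[0]               # fresh entry: skip the slow call
        label = self._compute(zid, lang)
        self._db.execute(
            "INSERT OR REPLACE INTO label_cache (zid, lang, label, stored_at)"
            " VALUES (?, ?, ?, ?)",
            (zid, lang, label, now),
        )
        return label
```

With a very large TTL this behaves like option (1); a short TTL trades some extra slow calls for fresher labels.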

@arthurpsmith
Contributor Author

arthurpsmith commented May 26, 2020

Actually I got a little distracted there with the alternatives - Z36 doesn't quite do what I'm looking for anyway. I'm going to try adding labels just using the getLabel approach for now, and see how bad that is. But feedback on running eneyj as a server (and how that might be done) is still of interest to me!

@arthurpsmith
Contributor Author

To follow up - getLabel is quite speedy and works well, so I'm going with that for now.

@vrandezo
Contributor

Yes, we'll probably need to run eneyj as a server anyway at some point. That would improve performance not just by avoiding the startup cost but also through caching, and the gain from caching is quite considerable.

I haven't looked into that yet.

Thank you so much for making WhatLinksHere work! I am very much looking forward to the other special pages!

@arthurpsmith
Contributor Author

Ok, see pull request #14 !

@vrandezo
Contributor

vrandezo commented May 28, 2020

I am trying out your pull request, but the DB table doesn't seem to be created, so the special pages fail. I started with a fresh docker, i.e. running

docker build --no-cache -t repo/wikilambda .

Error message:

Step 12/13 : RUN cd /var/www/html &&     php maintenance/importTextFiles.php -s "Import data" --prefix "M:" --overwrite extensions/AbstractText/eneyj/data/Z*
 ---> Running in ac97891257a3
Importing 540 pages...
Wikimedia\Rdbms\DBQueryError from line 1603 of /var/www/html/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: SELECT  att_type,att_position  FROM abstract_text_type    WHERE att_zobject = 'Z1'  
Function: AbstractText\TypesRepo::getObjectData
Error: 1 no such table: abstract_text_type

#0 /var/www/html/includes/libs/rdbms/database/Database.php(1574): Wikimedia\Rdbms\Database->getQueryExceptionAndLog('no such table: ...', 1, 'SELECT  att_typ...', 'AbstractText\\Ty...')

@arthurpsmith
Contributor Author

Uh-oh, I should have tried restarting the docker instance from scratch!
I'll have to dig into what I missed. To add the table you do need to run the db update maintenance script:

php maintenance/update.php

@arthurpsmith
Contributor Author

Hmm, maybe that just needs to be added to the docker startup. I'll try that.

@arthurpsmith
Contributor Author

Ok - see pull request #15 - I don't know if there's a better way to do that, but this works!

@vrandezo
Contributor

I am trying it again, but the Functions by Parameter and Functions by Return Type pages don't seem to work?

@vrandezo
Contributor

The objects by type works beautifully and so fast!

@arthurpsmith
Contributor Author

I am trying it again, but the Functions by Parameter and Functions by Return Type pages don't seem to work?

Are you going to http://localhost:8081/index.php/Special:FunctionsByReturnType and
http://localhost:8081/index.php/Special:FunctionsByArguments

(or whatever port you're using)
They should also be linked from the Special:SpecialPages page...

@vrandezo
Contributor

[Screenshot 2020-05-28 at 15 17 08]

Yes, I see the special pages, but there is no list.

There's also nothing unusual in the logs.

@arthurpsmith
Contributor Author

Here's what I see right now for that page:
[Screenshot: specialpage]
I'll try rebuilding from scratch with Docker again though and see if there's something missing.

@arthurpsmith
Contributor Author

By the way, adding ?uselang=de works pretty nicely (though these functions need name translations!):
[Screenshot: specialpage_de]

@arthurpsmith
Contributor Author

Ok, I did the docker rebuild and I'm seeing the same issue - digging in!
By the way, my "eneyj server" draft makes the docker build much, much faster! It still needs a bit of work to make it robust, though.

@arthurpsmith
Contributor Author

The problem appears to come from the sequence of Z object imports. If a function is imported before its types, then it doesn't get to the code that adds the return and argument types. I've tweaked this to fix this specific problem, but maybe we should adjust the import sequence to import types first? Or maybe that's not really necessary? The import process doesn't really need to do as much as it does right now...
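
A minimal sketch of the "import types first" idea, in Python for illustration only. It assumes each object is a dictionary whose `Z1K1` value marks it as a type; the marker string `"Ztype"`, the sample ids, and the function name are hypothetical, and this is not what the importer currently does.

```python
def import_order(zobjects, type_marker):
    """Return zids with type definitions first, then everything else.

    sorted() is stable, so objects keep their original relative order
    within each of the two groups.
    """
    return sorted(
        zobjects,
        key=lambda zid: zobjects[zid]["Z1K1"] != type_marker,
    )


# Placeholder data; these ids do not refer to real ZObjects.
PENDING = {
    "Z300": {"Z1K1": "Zfunction"},
    "Z100": {"Z1K1": "Ztype"},
    "Z301": {"Z1K1": "Zfunction"},
    "Z101": {"Z1K1": "Ztype"},
}

ordered = import_order(PENDING, "Ztype")
```

Ordering the input this way would only remove the dependency on file order; making the import step itself tolerate missing types is the more robust fix.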

@thadguidry

@arthurpsmith
Contributor Author

@vrandezo see pull request #17 - that should fix the issue when building from scratch with docker. An alternate workaround is to log into the docker container and run the refreshLinks.php maintenance script, which will reprocess everything and does work this time because the types have already been imported! But with #17 everything should work from scratch.

@thadguidry as far as I know the Dockerfile here will work with or without buildkit - I'm not too familiar with that though, what are the advantages?

@thadguidry

@arthurpsmith try it out, it's very simple to use: just add the environment variable as described in the docs. The advantages are:

By integrating BuildKit, users should see an improvement on performance, storage management, feature functionality, and security.

It makes repeated builds much faster, since it introduces "smarts" around caching (cache this layer, skip it, or build from scratch again) so that it knows about changes. This is akin to build caching in Maven, Buildr, etc.

@arthurpsmith
Contributor Author

@thadguidry thanks, I'll have to try it. However, the slowness in import is almost entirely due to the text rendering running 2 eneyj scripts via shell calls, so that means it's starting up node 1000+ times; otherwise the standard docker build caching is plenty fast!
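
To illustrate the start-up cost point, here is a rough sketch of keeping one long-lived worker process and sending it many requests, instead of launching a new process per call. It is Python only, and the worker is a stand-in that upper-cases each line; it is not eneyj and does not reflect eneyj's actual interface. The class name and the line-per-request protocol are assumptions.

```python
import subprocess
import sys

# Stand-in worker: answers each request line with its upper-case form.
WORKER_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.strip().upper() + '\\n')\n"
    "    sys.stdout.flush()\n"
)


class PersistentWorker:
    """Start the child process once and reuse it for every request."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes in text mode
        )

    def request(self, text):
        # One line out, one line back; the child stays alive in between.
        self._proc.stdin.write(text + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().rstrip("\n")

    def close(self):
        self._proc.stdin.close()   # end of input lets the child exit
        self._proc.wait()
        self._proc.stdout.close()
```

The process start-up is paid once in the constructor, so a thousand calls to `request` cost a thousand pipe round trips rather than a thousand interpreter launches.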

@thadguidry

@arthurpsmith Sure. This info might be useful for you later as well, if ever needed. https://www.docker.com/blog/advanced-dockerfiles-faster-builds-and-smaller-images-using-buildkit-and-multistage-builds/

@arthurpsmith
Contributor Author

@thadguidry Ah, I see - the main advantage, I think, is where you need parts of an image but want to throw away the rest of it to keep the final image small. I don't think that's an issue here, but it might be if we diverge more from the MediaWiki base image. For now I think everything we're adding in the builds (extensions, npm modules, MediaWiki updates etc.) is stuff we need to keep around.

@thadguidry

@arthurpsmith Exactly my thinking and why I mentioned it.... for your future. :-)

@vrandezo
Contributor

Thank you so much @arthurpsmith !

The special pages are really great, they make navigating the wiki so much easier!

I have tested them and think we can close this issue now. Thank you!

@arthurpsmith
Contributor Author

Great! Closing!
