Better support non-Wikimedia wikis with auto-name detection and default image support (other none wikimedia wikis) #5

Closed
pirate486743186 opened this issue Jun 27, 2015 · 26 comments

Comments

@pirate486743186

I'm actually using xowa for a wiki that is not affiliated with wikimedia.
There are some configuration problems, since the program assumes all wikis are wikimedia related.

How do i fix the configuration files for it?
Where is the configuration file?

Images:

It uses commons, but the wiki doesn't use commons.
I would like to selectively disable commons just for that one.

It doesn't use the images from the wiki itself.
Probably it assumes some naming convention about the link that of course doesn't apply here.

The image dump is only 90 MB, not like the scary terabytes of wikipedia. I would like to use offline images, but the help page says this feature is obsolete.

Minor issues:
It messed up the name of the wiki: it is listed as the name of the folder the dump was in. Also, it uses the xowa picture as the logo.

@gnosygnu
Owner

Hey, thanks for the interest in XOWA.

Out of curiosity, is this a wikia wiki? If so, I was able to help someone with configuring a Xenoblade wiki. See https://sourceforge.net/p/xowa/discussion/general/thread/91f5df9f/#ed2d .

I think this should work with other wikis as well. If you let me know which one, I can give it a try.

As for your questions:

How do i fix the configuration files for it?
Where is the configuration file?

Well, for now, your best place is home/wiki/Help:Options/Config_script . You'd have to put in lines like the following:

app.wikis.get('xenoblade.wikia.com').files.wkrs.get('fs.dir') {
        orig_dir        = '~{<>xowa_root_dir<>}wiki/xenoblade.wikia.com/file/orig/';
        thumb_dir       = '~{<>xowa_root_dir<>}wiki/xenoblade.wikia.com/file/thumbs/';
}

This would tell XOWA to use the images that are saved in /orig/. Note that they have to be saved there beforehand; XOWA doesn't know how to download non-wikimedia images

It messed up the name of the wiki: it is listed as the name of the folder the dump was in.

You should just be able to rename the folder. For example, if the folder is /xowa/wiki/messed_up_name, just rename it to /xowa/wiki/correct_name. Your history links and bookmarks will break, but XOWA doesn't store the information anywhere else

Also it uses xowa picture as the logo.

XOWA stores the logo here: /xowa/user/anonymous/wiki/messed_up_name/html/logo.png . Just make sure you rename this directory too, if you rename your wiki.

Hope this helps.

@pirate486743186
Author

It's nethackwiki
http://nethackwiki.com/wiki/Main_Page

The local images seem to work. Fixed the logo. You should put this in the help page.
It builds its own thumbnails and crushes the old ones?
Hmm, i think their image dump is a bit sloppy, their thumb folder is 120MB, while the whole thing archived is 140MB....
I put an empty thumb folder in the path, and it seems to be working correctly...

When i click on an image, in the page of the image i get the text of the page but i don't get the image itself. Not supported?

the categories are not supported yet right?

You are saying, only local images for non wikimedia wikis?

*Renaming the folder didn't work. :'( *
It appears on the list of wikis on the left, but it doesn't work.
"could not find fille:///site/nethackwiki.com"
if i try to manually copy paste the url
nethackwiki.com/wiki/Main_Page
It complains it couldn't find it.

The template
http://nethackwiki.com/wiki/Template:Monster
is a bit broken.

It can get an attribute "name", if it doesn't get it, it tries with the title of the article.
Then it calls another template {{monsym|}} to get the symbol of the monster.
monsym template works correctly.

In xowa, when the attribute name is given, the symbol is trashed.

For example here
http://nethackwiki.com/wiki/Vecna
In the infobox, the purple L is the symbol of the monster.
In xowa it appears as "{{Monsym/Template:#replace:vecna}}" (it should be a purple L)

@gnosygnu
Owner

It's nethackwiki

Cool. I actually used to play this game. :)

The local images seem to work. Fixed the logo. You should put this in the help page.

Okay. I'll add it to Help:CSS and link to it from the Wikia article.

Just so you know: you can always click "View HTML" and figure out where everything is coming from. Since everything's offline, all the files are sitting on your system somewhere.

It builds its own thumbnails and crushes the old ones?

Yes. XOWA only loads original files. Thumbs are generated on the fly. It won't use pre-existing thumbs from a dump (I don't know the file-naming convention, and they would probably vary per method-of-generation)

I put an empty thumb folder in the path, and it seems to be working correctly...

Yup. That's fine. I should have specified that it should be empty. It's basically a temp directory

When i click on an image, in the page of the image i get the text of the page but i don't get the image itself. Not supported?

Hmm... This should work, but I may have to add more code for non-WMF wikis. Which image / page?

Also, can you give me the link for the offline xml dump (images too would be great). I can pull it down and work through it on my side.

the categories are not supported yet right?

This is "harder". WMF wikis have a separate table for categories. For example, https://dumps.wikimedia.org/simplewiki/20150603/simplewiki-20150603-categorylinks.sql.gz .

The nethack wiki probably only has an XML dump, without any of the table dumps.

This table dump can be regenerated by parsing every page, but I don't have this set up yet.
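
As a rough illustration of that regeneration idea, here is a minimal Python sketch that collects category links straight from wikitext. It is not how XOWA does it: the function names and the (category, title) row shape are assumptions made for the example, and a plain regex misses categories that templates add, which is exactly why real regeneration needs a full parse of every page.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]] in raw wikitext.
# Links with a leading colon ([[:Category:Name]]) are ordinary links and are skipped.
CATEGORY_LINK = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def extract_categories(wikitext):
    """Return the category names one page links to, in order, without duplicates."""
    names = []
    for match in CATEGORY_LINK.finditer(wikitext):
        name = match.group(1).strip().replace("_", " ")
        if name not in names:
            names.append(name)
    return names


def build_category_links(pages):
    """pages: dict of page title -> wikitext. Returns (category, title) rows."""
    rows = []
    for title, text in pages.items():
        for category in extract_categories(text):
            rows.append((category, title))
    return rows
```

Feeding every page of the XML dump through build_category_links would give a first approximation of the missing table.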

You are saying, only local images for non wikimedia wikis?

Yes, no downloading of images for non-WMF wikis . I might add support for this in the future, but I'm worried that there's too much variation in terms of the absolute file location.

*Renaming the folder didn't work. :'( *

Yeah, I was wrong. I forgot I changed the code a while ago and baked the domain name into two places:

  • The actual file names within /xowa/wiki/nethackwiki.com. For example, rename the .xowa file to nethackwiki.com-text.xowa
  • The database data inside the xowa_dbs table in nethackwiki.com-text.xowa. You'll need to open up the db with the sqlite3.exe shell and update the sql
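
The two steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the column layout of xowa_dbs is not given in this thread, so the code discovers the columns at run time and rewrites any text value containing the old name; both function names are made up for the example. Work on a backup copy of the wiki folder.

```python
import os
import sqlite3


def rename_wiki_files(wiki_dir, old_name, new_name):
    """Rename files such as old_name-text.xowa to new_name-text.xowa."""
    renamed = []
    for file_name in sorted(os.listdir(wiki_dir)):
        if file_name.startswith(old_name):
            target = new_name + file_name[len(old_name):]
            os.rename(os.path.join(wiki_dir, file_name),
                      os.path.join(wiki_dir, target))
            renamed.append(target)
    return renamed


def replace_in_table(db_path, table, old_name, new_name):
    """Replace old_name with new_name in every text value of `table`.

    Returns the number of updated values. The table schema is an unknown
    here, so the columns are looked up with PRAGMA table_info.
    """
    conn = sqlite3.connect(db_path)
    changed = 0
    try:
        columns = [row[1] for row in
                   conn.execute('PRAGMA table_info("{}")'.format(table))]
        for column in columns:
            sql = (
                'UPDATE "{t}" SET "{c}" = replace("{c}", ?, ?) '
                'WHERE typeof("{c}") = \'text\' AND instr("{c}", ?) > 0'
            ).format(t=table, c=column)
            changed += conn.execute(sql, (old_name, new_name, old_name)).rowcount
        conn.commit()
    finally:
        conn.close()
    return changed
```

For example, rename_wiki_files('/xowa/wiki/nethackwiki.com', 'messed_up_name', 'nethackwiki.com') followed by replace_in_table on the renamed -text.xowa file with table 'xowa_dbs'.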

I'll add a rename wiki function in a future version. Out of curiosity, why don't you reimport the dump again under the correct name?

The template http://nethackwiki.com/wiki/Template:Monster is a bit broken.

Yeah, this too will probably need the wiki set up on my side. I looked at the wikitext now, and nothing there looks particularly strange.

@pirate486743186
Author

" Cool. I actually used to play this game. :)"
Why i'm not surprised :P

"(I don't know the file-naming convention, and they would probably vary per method-of-generation)"
That would explain why it ended up being 120MB and the admin didn't notice.

"Hmm... This should work, but i may have to add more code for non-WMF wikis. Which image / page?"
....all of them. I couldn't see anything special in the HTML either
for example this one
http://nethackwiki.com/wiki/File:Sanctum.png

dumps
http://nethackwiki.com/nethackwiki_current.xml.gz
http://nethackwiki.com/nethackwiki_current_images.tar.gz
They have a custom mediawiki extension that builds tty maps
For example that "screenshot" in the main page, is not a screenshot
it's the only thing that is normal not to work.
You'll see them trashed here and there.

"reimport the dump"
I was planning to. I'm still testing, and it would be too much of a bother to reimport within an appropriately named folder just for the name.

@gnosygnu
Owner

That would explain why it ended up being 120MB and the admin didn't notice.

Actually, the thumbs are correct and fairly standard. I should change the code to use them, but for now, I'm hoping regenerating new ones shouldn't be an issue

http://nethackwiki.com/wiki/File:Sanctum.png

Thanks for the example. This is an issue with XOWA, and I have to change the code to handle the non-WMF wikis. It's not an easy change, so it may take a few weeks.

dumps
http://nethackwiki.com/nethackwiki_current.xml.gz
http://nethackwiki.com/nethackwiki_current_images.tar.gz

Cool. Very useful.

They have a custom mediawiki extension that builds tty maps
For example that "screenshot" in the main page, is not a screenshot

Yeah, it took some time, but I see what they do. They use javascript to load /wiki/Special:RandomInCategory/Main_Page_rotation .

  jQuery( document ).ready( function( $ ) {
    $('#mainpage-ttyscreen').load('/wiki/Special:RandomInCategory/Main_Page_rotation .ttyscreen');
  } );

This isn't an easy change for me to do, primarily because I'd have to get RandomInCategory working (and Category has problems as we discussed earlier)

I know this is highly visible (being on the front page), but I probably won't be able to get around to this for a while.

I was planning to. I'm still testing, and it would be too much of a bother to reimport within an appropriately named folder just for the name.

Okay. I'll add a rename wiki in a future release, but for now your best bet is just to reimport.

Also:

The template http://nethackwiki.com/wiki/Template:Monster is a bit broken.

This is due to nethackwiki implementing #replace and other non-WMF string functions. I'd have to add support for this. I'll try to work on this first over the other items listed here.
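
For readers unfamiliar with #replace, here is a small Python sketch of what the function is expected to do. It reflects my understanding of the StringFunctions behaviour ({{#replace:string|search|replacement}}, with an empty search term falling back to a single space) and is not XOWA's implementation; the names parser_replace and expand_replace_calls are invented for the example, and only flat, non-nested calls are handled.

```python
import re

# Flat calls only: no nested templates or links inside the arguments.
REPLACE_CALL = re.compile(
    r"\{\{\s*#replace\s*:([^{}|]*)\|([^{}|]*)(?:\|([^{}|]*))?\}\}"
)


def parser_replace(text, search=" ", replacement=""):
    """Replace every occurrence of `search` in `text` with `replacement`."""
    if search == "":
        # Assumption: an empty search term means "search for a single space".
        search = " "
    return text.replace(search, replacement)


def expand_replace_calls(wikitext):
    """Expand simple {{#replace:...}} calls found in a wikitext string."""
    def expand(match):
        # Parser-function arguments are whitespace-trimmed before use.
        text = match.group(1).strip()
        search = match.group(2).strip()
        replacement = (match.group(3) or "").strip()
        return parser_replace(text, search, replacement)
    return REPLACE_CALL.sub(expand, wikitext)
```

With this sketch, expand_replace_calls("{{#replace:giant ant| |_}}") yields "giant_ant", the kind of name normalization a template can use before passing the value on to another template.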

@pirate486743186
Author

you misunderstood. I wasn't asking to implement using the thumbs or about the front page.
The way the thumbs work is good enough (apart from not being able to see the full image in the image page). About the tty maps, i meant to say that they are custom changes, so you shouldn’t bother with them.

I think you should rather have an import option for non-wikimedia wikis, that asks for the name and image configuration during import. I saw that the xml contained information about the name of the site, so you could give that as the default option.

You mean they have more custom stuff? If you try to implement all custom changes of wikis, you'll never reach the bottom of it. Can't you instead edit the template in nethackwiki, so that it works in both the site and xowa? The template almost works correctly.

@gnosygnu
Owner

gnosygnu commented Jul 1, 2015

you misunderstood. I wasn't asking to implement using the thumbs or about the front page.

Sorry. I didn't mean to imply that you were making a request. I only meant that XOWA probably should use the official thumbs. I didn't put in this code, because I didn't think that anyone ever released dumps of thumbs. Now that I know they exist, I realize I should probably use them instead of generating new ones in parallel. (even though like you said these new ones work just as well)

The tty maps is more of a wish thing. I thought the maps looked neat, and would have wanted XOWA to show a more authentic nethack front page. Otherwise, yes, it is a bit of work.

I think you should rather have an import option for non-wikimedia wikis, that asks for the name and image configuration during import. I saw that the xml contained information about the name of the site, so you could give that as the default option.

Hmm... I can definitely default to the XML contained name as opposed to the dump name. The image configuration is a bit more work, but I'll try to work on this as well
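
A minimal Python sketch of reading that name from a dump, assuming the standard MediaWiki export layout in which a siteinfo element holds a sitename element. The function names are hypothetical, and this is not XOWA's import code.

```python
import gzip
import xml.etree.ElementTree as ET


def read_sitename(xml_stream):
    """Return the text of the first <sitename> element, or None if absent."""
    for _event, element in ET.iterparse(xml_stream, events=("end",)):
        # Tags arrive as "{namespace}sitename"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "sitename":
            return element.text
    return None


def read_sitename_from_dump(path):
    """Open a plain or gzip-compressed XML dump and read its site name."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as stream:
        return read_sitename(stream)
```

Because sitename sits near the top of the file, the loop returns after reading only the header, so this stays cheap even for large dumps.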

You mean they have more custom stuff? If you try to implement all custom changes of wikis, you'll never reach the bottom of it.

Yeah, agreed. That is a danger, and generally I steer clear of them. However, the #replace function is pretty easy to implement and it does exist in MediaWiki (though it's not used in the WMF Wikipedias).

Can't you instead edit the template in nethackwiki, so that it works in both the site and xowa? The template almost works correctly.

Well, I could ask the user to add Module namespaces (Lua code), but I think that it would just be easier to implement #replace.

@pirate486743186
Author

"that anyone ever released dumps of thumbs"
Here it was "released" out of sloppiness rather than anything else.
I think you can really skip using thumb dumps

"tty maps"
I was thinking of a script to "compile" the xml dump, so that they get displayed correctly.
But this is beyond the scope of xowa.

requests:
About the name, at least: default to the name in the xml, and ask during import if it has to be changed. You could also add a delete wiki option.

About the image configuration, at least: put instructions in the Help page and a link to it from the import page. And ask during import, if it should use commons, and an option to activate deactivate commons in the config pages.

@gnosygnu
Owner

gnosygnu commented Jul 3, 2015

I think you can really skip using thumb dumps

Ok. It's on my to-do list, but I'll make it low in priority.

I was thinking of a script to "compile" the xml dump, so that they get displayed correctly.
But this is beyond the scope of xowa.

Again, a sort of low-priority to-do list item. I'm more interested in the Special:RandomInCategory item.

About the name, at least: default to the name in the xml, and ask during import if it has to be changed. You could also add a delete wiki option.

Yeah, but I usually don't like to ask for prompts. The imports were designed to be scriptable, and prompts get in the way.

I'll add an option for wiki_name_type and default this to the xml_type. The user can always change it to the dump name.

You could also add a delete wiki option.

Deleting a wiki is still very simple. Just remove the directory!

About the image configuration, at least: put instructions in the Help page and a link to it from the import page. And ask during import, if it should use commons, and an option to activate deactivate commons in the config pages.

Agreed. The instructions I gave you were really meant for power-users. I'll come up with a more automatic way (one that doesn't use a custom script). This is a higher to-do priority than the others, though it may still be a few weeks.

@pirate486743186
Author

"Yeah, but I usually don't like to ask for prompts."
yea an option like this "name_of_wiki=Default"
You change it, or you leave it.

"delete wiki"
It's not obvious that deleting the folders will also remove the links on the sidebar.
Or break the program totally and refuse to start....
You could put this in the help pages

@gnosygnu
Owner

gnosygnu commented Jul 4, 2015

yea an option like this "name_of_wiki=Default"

Okay. Cool. That's what I was thinking.

It's not obvious that deleting the folders will also remove the links on the sidebar.
Or break the program totally and refuse to start....

Yeah, XOWA is meant to be portable. So long as you don't move anything in the /bin/ folder (or the root folder, like xowa.gfs), moving anything else should just work. Content may be missing, but nothing should just break. Otherwise, it's a bug.

You could put this in the help pages

Yup, will do. Thanks!

@pirate486743186
Author

beh >:P
I reimported, but now the images i had already visited don't work.
The first time, it was trying to download images from commons; this kind of messed it up.
I think, pages i first visit now work as expected.

Some of the pages of the images, now display a thumb, but the link is to the commons folder.
So this explains why the images didn't display the first time.

I tried to delete the commons folder but it didn't do anything. simplewiki still works correctly however.

@gnosygnu
Owner

gnosygnu commented Jul 5, 2015

Hmm.. Did you try updating the option at home/wiki/Help:Options/Config_script ?

app.wikis.get('nethackwiki.com').files.wkrs.get('fs.dir') {
        orig_dir        = '~{<>xowa_root_dir<>}wiki/nethackwiki.com/file/orig/';
        thumb_dir       = '~{<>xowa_root_dir<>}wiki/nethackwiki.com/file/thumbs/';
}

@pirate486743186
Author

I already changed the paths in the script, i didn't forget that.

app.wikis.get('NetHackwiki').files.wkrs.get('fs.dir') {
        orig_dir        = '~{<>xowa_root_dir<>}/wiki/NetHackwiki/images/';
        thumb_dir       = '~{<>xowa_root_dir<>}/wiki/NetHackwiki/images/thumb/';
}

I think it's commons that messed things up.
The first time, it did download some pictures as if it depended on commons.

@gnosygnu
Owner

gnosygnu commented Jul 5, 2015

Hmm.... The only other thing I can think of: did you restart XOWA after this change? The Config_script is one of the few options that requires a restart.

Otherwise, I'm on IRC for a while if you want to try some live troubleshooting there. See http://webchat.freenode.net/?channels=#xowa

Thanks

@pirate486743186
Author

I deleted a suspicious sqlite file in the image folder, that fixed the issue. Maybe note this in the image help pages.

It's still a little bit strange, though. If i had visited the image page when it was downloading from commons, it displays the thumbnail in the image page, but tries to link to the image in the commons folder. If i hadn't visited the image page, then no thumbnail is displayed. So, commons still does something.

And one more request: to be able to set up the logo at import and in the config pages. Or add instructions in the help pages.

@gnosygnu
Owner

gnosygnu commented Jul 6, 2015

I deleted a suspicious sqlite file in the image folder, that fixed the issue.

Hmm... Out of curiosity, was it "^orig_regy.sqlite3"? If so, did you move this sqlite database over with the images?

It's still a little bit strange, though. If i had visited the image page

Well, if you're talking about a "File" page like "File:Sanctum.png" then that's possible. I still have to rework the "File" pages for non-wikimedia wikis.

And one more request: to be able to set up the logo at import and in the config pages. Or add instructions in the help pages.

Ok. Setting up the logo may be difficult, as I can't automatically download it like I do for the Wikimedia wikis. Instructions could definitely work.

Also, I'll have the {{#replace}} fixed for either next week or the week after. This fixes the Monster Box issue for Vecna.

@pirate486743186
Author

"did you move this sqlite database over with the images?"
Yep, i did that :P . I moved the whole folder.

Yea the "File" page.
Just make sure that there is an option during import if you want commons or not.

Ok

@gnosygnu
Owner

gnosygnu commented Jul 8, 2015

Yep, i did that :P . I moved the whole folder.

Ok. That looks like a bug. The "^orig_regy.sqlite3" stores absolute paths, not relative paths. Deleting it was the easiest way. I'll need to add this to the list of things needing fixes. :|
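
To illustrate why relative paths survive a folder move while absolute ones do not, here is a small Python sketch. It says nothing about how "^orig_regy.sqlite3" is actually laid out; the helper names and example paths are assumptions, and posixpath is used so the result is the same on every platform.

```python
import posixpath


def to_relative(absolute_path, wiki_root):
    """Store a file path relative to the wiki's root folder."""
    return posixpath.relpath(absolute_path, wiki_root)


def to_absolute(relative_path, wiki_root):
    """Resolve a stored relative path against wherever the wiki lives now."""
    return posixpath.normpath(posixpath.join(wiki_root, relative_path))


# Stored once, while the wiki lives in its original location (example paths).
stored = to_relative(
    "/xowa/wiki/nethackwiki.com/file/orig/Sanctum.png",
    "/xowa/wiki/nethackwiki.com",
)

# After the whole folder is moved, only the root passed in changes.
moved = to_absolute(stored, "/media/usb/xowa/wiki/nethackwiki.com")
```

A registry that stored the value in `stored` would keep working after the move, whereas one holding the original absolute path would still point at the old location.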

Yea the "File" page.

Noted.

Just make sure that there is an option during import if you want commons or not.

Yeah, it'll probably be a few releases, but it will get in there.

Thanks.

@gnosygnu
Owner

I added support for {{#replace}} and other string functions in tonight's v2.7.2. http://nethackwiki.com/wiki/Template:Monster now works. I confirmed with nethackwiki.com/wiki/Vecna .

At this point, I think these are the actionable items left in this issue before I close it

  • Option to use dump name of wiki when importing
  • Set up images through a simple option, not by a custom script (originally worded "Automatically set up images (no custom script)")
  • Fix broken File pages
  • [Added] Use relative paths for images, not absolute ones

Other items that are on my todo list, but probably won't be done for a while

  • Show tty maps on front page
  • Use thumbs from dump (instead of auto-generating them)

Let me know if I missed anything.

Thanks.

@pirate486743186
Author

The requests:

"Option to dump name"
I'm not aware of that use of the word. But setting a name at import is fine. And the name from the xml as the default value is also good.

"Automatically set up images"
You mean a gui import and configuration option
Or at least instructions in the help pages.

Ask during import, if it should use commons, and an option to activate deactivate commons in the config pages. I think this is important, it messes up the wiki by default otherwise.

like you said before, have relative paths instead of absolute paths in the sqlite database.

Well, the items on your todo list are not requests from me.
Maybe you misunderstood about the tty maps. There are tty maps all over the place.
They come from a custom plugin that takes some new wiki markup and builds a nice map. So i'm not expecting anything from you about it.
For example check sokoban and its wiki source. The tty maps are completely messed up
"xowa://nethackwiki/wiki/Sokoban"
and what it was supposed to look like
"http://nethackwiki.com/wiki/Sokoban"
I don't think that the thumbs are very important.

@gnosygnu self-assigned this on Jul 17, 2015

@gnosygnu
Owner

"Option to dump name"

I'm not aware of that use of the word. But setting a name at import is fine. And the name from the xml as the default value is also good.

Sorry: accidentally dropped a word. "Option to use dump name". Otherwise, I agree with you on using xml name as default.

"Automatically set up images"

You mean a gui import and configuration option
Or at least instructions in the help pages.

Yeah, meant "non-manual" way of setting up images. So a better config option than specifying a script. I changed the comment above.

Ask during import, if it should use commons, and an option to activate deactivate commons in the config pages. I think this is important, it messes up the wiki by default otherwise.

Understood.

like you said before, have relative paths instead of absolute paths in the sqlite database.

Thanks. Forgot to include this. Added above.

For example check sokoban and its wiki source. The tty maps are completely messed up
"xowa://nethackwiki/wiki/Sokoban"

Thanks for the example. You're right. This is something different than the front page. Nethack wiki has its own custom PHP extension, <replacecharsblock>. See http://nethackwiki.com/wiki/User:Paxed/ReplaceCharsBlock

The extension is pretty simple, but it won't work in XOWA b/c it's PHP. I may experiment with this in the future. Basically, I'd allow the user to register custom Lua handlers for <special_tags>. The user would have to rewrite the PHP code in Lua, but they should be able to achieve similar effects. Definitely a future todo.
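
The registration idea could look roughly like the sketch below. It is written in Python purely for illustration (the plan above is for Lua handlers), none of it is an existing XOWA API, and the <shout> tag with its uppercase handler is a made-up stand-in that does not reproduce what ReplaceCharsBlock does. Tags with attributes or nesting are not handled.

```python
import re

# Maps a lowercase tag name to a function taking the tag body and returning output.
TAG_HANDLERS = {}

# <name>body</name>, without attributes; the closing tag must match the opening one.
CUSTOM_TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL | re.IGNORECASE)


def register_tag(name, handler):
    """Register a handler for <name>...</name> blocks."""
    TAG_HANDLERS[name.lower()] = handler


def render_custom_tags(wikitext):
    """Run registered handlers over matching tags; leave unknown tags untouched."""
    def render(match):
        handler = TAG_HANDLERS.get(match.group(1).lower())
        if handler is None:
            return match.group(0)
        return handler(match.group(2))
    return CUSTOM_TAG.sub(render, wikitext)


# Hypothetical handler: uppercase the body of <shout> blocks.
register_tag("shout", str.upper)
```

A wiki-specific handler would then be one more register_tag call, with the logic of the PHP extension rewritten inside the handler function.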

I don't think that the thumbs are very important.

Yup. agreed here as well.

@mwang141

curious, how about http://memory-alpha.wikia.com/wiki/Portal:Main , the Wikia for Star trek fan.. cheers.

@gnosygnu changed the title from "other none wikimedia wikis" to "Better support non-Wikimedia wikis with auto-name detection and default image support (other none wikimedia wikis)" on Oct 22, 2016

@gnosygnu
Owner

Hi. Thanks for the post. I didn't realize how long this issue was open for.

curious, how about http://memory-alpha.wikia.com/wiki/Portal:Main , the Wikia for Star trek fan.. cheers.

Not sure I understand the request.

Other than that, it works like the other Wikias. The CSS / images are missing, but I'm afraid that's going to take a while for me to tackle. I have my hands full with Wikimedia wikis.

Let me know if you need more info. Thanks.

@gnosygnu
Owner

I'm going to mark this item resolved. They should be all handled by v4.2.

To repeat from above, the actionable items in this issue were:

  • Option to use dump name of wiki when importing:
    • The wiki name will come from the dump's folder, not the dump name. Using the dump folder name was easier.
  • Set up images through a simple option, not by a custom script (originally worded "Automatically set up images (no custom script)")
    • This was supported in v4.1. XOWA will now automatically assume that there are images in C:\xowa\your_wiki\file\orig. There's no need for the complicated startup script.
  • Fix broken File pages
    • This was fixed in v4.1
  • [Added] Use relative paths for images, not absolute ones
    • This was also fixed in v4.1

If there's anything else, let me know, and we'll track it in a new ticket. Thanks.

@gnosygnu
Owner

As mentioned above, these issues were handled in v4.2. I'm going to close out the ticket now. Thanks again for the report

3 participants