HTTP_Server: Handle edge cases for /fsys (Image and link problem in http_server mode) #524
Comments
Hi! Thanks for creating the issue. Unfortunately, I'm not able to reproduce the issue. When you get a chance, can you try my steps below? If they still fail, please upload the session log file and I'll review it. The session.txt file would be at Also, one random thought: since you said only en.wikipedia.org is affected, can you also navigate to this page and paste the output here? Thanks
|
Following the steps you described, the images were not displayed in either of the pages, and also I was not able to follow the link to Sun since it referred to The two files I attached to this message are: The output of
|
Thanks for the logs. I missed this bit from your earlier post:
I've been testing with 'en.wikipedia.org Wikitext (2019-05)' which does work. Unfortunately, Articles does not. This became obvious as I went through the logs. Let me go through the error logs a little more. I'll push out a fix sometime this week. Sorry for the bug, and thanks for following up. |
Having just built my own HTML for enwiki (2019-06-01), I have been examining that source and can see where @lawnowner is coming from. The file Http_server_wrk.java is supposed to cope with the "/wiki/" -> "/en.wikipedia.org/wiki/" translation in the The problem is that it is too strict, instead of
relaxing this slightly
now transposes the HTML page properly. However, there is a second (and I think more serious) bug:
Note the 'g:/xowa' inside the 'src' attribute. |
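For readers following along, the "/wiki/" -> "/en.wikipedia.org/wiki/" translation discussed above might look roughly like the sketch below. This is not the code from Http_server_wrk.java: the class name `WikiLinkRewriter`, the method `rewrite`, and the regex itself are hypothetical, and the sketch only assumes that page links appear as `href` attributes in single or double quotes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WikiLinkRewriter {
    // Hypothetical "relaxed" pattern: accepts optional whitespace around '='
    // and either quote style before the "/wiki/" prefix.
    private static final Pattern HREF_WIKI =
        Pattern.compile("(href\\s*=\\s*[\"'])/wiki/");

    // Prefixes root-relative "/wiki/..." links with the wiki domain so that
    // they resolve under the http_server URL scheme.
    public static String rewrite(String html, String wikiDomain) {
        Matcher m = HREF_WIKI.matcher(html);
        // $1 keeps the original 'href=' text and quote character intact.
        return m.replaceAll("$1/" + Matcher.quoteReplacement(wikiDomain) + "/wiki/");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<a href=\"/wiki/Sun\">Sun</a>", "en.wikipedia.org"));
    }
}
```

Links that already start with "/en.wikipedia.org/wiki/" are left alone, because the pattern requires "/wiki/" to follow the opening quote directly.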
Hi! I fixed this in tonight's release. See https://github.com/gnosygnu/xowa/releases/tag/v4.6.3.1908 Also, I like to give credit to users for finding bugs. Right now, there's a line in the Change Log like this: Thanks! |
@gnosygnu That's kind of you. I think you can credit ctd for that. Thanks for the fix! |
@lawnowner: No problem. @desb42: Oops! My email is sometimes unreliable. I missed your last comment above. I took a look at https://github.com/desb42/xowa/commits/http_server and it matches my commit except for the regex on As for the root cause, I'm still trying to figure out what it is. I had hoped I had fixed it in an earlier version, but since you're getting it now, it must still be occurring. I'll open up a separate issue for it later. Thanks! |
Added more general logic to handle /home/lnxusr as well as G:/xowa with the commit above. Thanks |
Interesting implementation of the regex. Just a small query: would it be an idea to place the call to |
Yeah, I'm not a fan of regex, and there were some rules that are hard to express (> 300). That said, I'll probably use regex more as I port over the MediaWiki parser
Honestly, there's a lot going on there which can be optimized (all the String replaces). That said, the suggestion is pretty straightforward so I added it per the above commit. Thanks! |
Unfortunately, now that I have merged the commits into my sources, a problem has arisen. There are two 'regex' conversions going on
in my case root_dir_http is
Also there is another line
which does not use quotes. My version of this routine is
the definition of root_dir_fsys needs to be changed too
|
Oops. Missed all these edge cases. Let me add this back to to-do and have a fix for the weekend. Thanks |
Okay. So I ended up generalizing the logic even more and just doing something like |
Unfortunately, there are two replacements going on. As an example, I build the world in G:\xowa. I then copy the whole wiki to G:\xowa_dev. I then need to convert both file:///g:/xowa and file:///g:/xowa_dev. I think the hard coding in the HTML build should be avoided (somehow). |
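To illustrate why two roots are awkward, here is a minimal sketch of replacing several `file:///` roots with one HTTP prefix. It is not XOWA's implementation: `FsysRootRewriter`, its `rewrite` method, and the choice of `/fsys` as the replacement prefix (taken from the issue title) are assumptions made for illustration. Case-insensitive matching is also an assumption, added because the drive letter shows up as both `g:` and `G:` in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FsysRootRewriter {
    // Replaces every occurrence of each root URL (e.g. "file:///g:/xowa")
    // with httpPrefix (e.g. "/fsys"). All names here are hypothetical.
    public static String rewrite(String html, List<String> rootUrls, String httpPrefix) {
        List<String> roots = new ArrayList<>();
        for (String root : rootUrls) {
            // A trailing '/' stops "file:///g:/xowa" from matching inside
            // "file:///g:/xowa_dev".
            roots.add(root.endsWith("/") ? root : root + "/");
        }
        // Longest root first, so a nested root is handled before its parent.
        roots.sort((a, b) -> Integer.compare(b.length(), a.length()));

        String target = httpPrefix.endsWith("/") ? httpPrefix : httpPrefix + "/";
        String result = html;
        for (String root : roots) {
            // Pattern.quote treats the root as a literal, not as a regex.
            Pattern p = Pattern.compile(Pattern.quote(root), Pattern.CASE_INSENSITIVE);
            result = p.matcher(result).replaceAll(Matcher.quoteReplacement(target));
        }
        return result;
    }
}
```

Without the trailing slash, replacing `file:///g:/xowa` first would turn `file:///g:/xowa_dev/...` into `/fsys_dev/...`, which is the kind of edge case described above. This sketch still needs the list of roots up front, so it does not remove the hard coding in the HTML build.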
Ugh. Should have realized that's why I didn't go with the simple fix before. Slip of mind at late hour last night. Speaking of late hour, I went back to the previous version and tried a more "general" version that handles both issues. In brief, I look for
Yup, created a new issue at #553. After I get through the current issues in the TODO board, I'll spend some time going through all the HTML Databases ones, since they're blockers to creating any new HTML dumps. |
Hello, thanks for the new release, much appreciated. However, it is buggy on my end for some reason. Here is the problem: I've been using Xowa version 4.5.21.1808 on Windows 10 64-bit in http_server mode with offline databases for two Wikis, one of them being en.wikipedia.org, without any problems whatsoever. After the release of version 4.6.2.1907, I downloaded the new version as well as the new databases for en.wikipedia.org, namely the 'en.wikipedia.org Articles (2019-05)' and 'en.wikipedia.org Images (2019-05)', via the URLs listed on xowa.org/home/wiki/Wiki_setup/English_wikis. After downloading and extracting, I verified md5 checksums of all databases. Now everything is fine when Xowa's built-in browser is used, but when Firefox or Chrome is used via Xowa's http_server mode, images in a page will load as empty boxes (although correctly sized), and all links in a rendered page are relative to wiki/ whereas they should be relative to the en.wikipedia.org/wiki/ (e.g. 127.0.0.1:8080/wiki/Navigable instead of 127.0.0.1:8080/en.wikipedia.org/wiki/Navigable) and thus links on a page don't work once the page is opened by typing in the address bar. The link problem exists only for en.wikipedia.org, not the other wiki, but also note that I'm using old databases for the other wiki. Thanks.