dumpgenerator.py: check if the filename actually contains a file extension (Wikia) #212
Comments
Southparkfan, 04/01/2015 20:18:
For individual wikis it's easier to visit Special:Statistics and click |
Hi, I am the admin of this wiki. |
nicolas, 04/01/2015 22:53:
The help page is a pile of lies. Check |
Thanks, okay, it’s good to know. |
For the naming issue Southparkfan found, the options could be:
so it’s something that looks possible. |
Don't give up hopes for the images.tar; from what I can see, it's
nicolas, 05/01/2015 00:21:
It sure is. :) We know the true filename, so we can just use -O to force
I wish we could just use content-disposition |
Right, using this regex (surely improvable):
and this substitution rule:
I was able to turn *images.txt into a .sh script which renames quite correctly all the misnamed pictures downloaded from Wikia. |
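The regex and substitution rule are not preserved in this thread, so the following is only a sketch of the same idea in Python 3, not the rule that was actually used. It assumes each line of *images.txt holds the saved name, the URL and the uploader separated by whitespace (as in the entries quoted in the issue), and that the files sit in an `images/` folder. The pattern `WIKIA_URL` and the function `rename_commands` are hypothetical names made up for this example.

```python
import re
import shlex

# Hypothetical pattern: captures the real file name that sits right before
# the "/revision/..." suffix in a Wikia image URL.
WIKIA_URL = re.compile(r"/images/[0-9a-f]/[0-9a-f]{2}/(?P<name>[^/]+)/revision/")


def rename_commands(lines):
    """Yield one `mv` command for every misnamed entry of an *images.txt file."""
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        saved_as, url = fields[0], fields[1]
        match = WIKIA_URL.search(url)
        if match is None:
            continue  # URL has no /revision/ part, nothing to repair
        true_name = match.group("name")
        if saved_as != true_name:
            # shlex.quote protects characters such as "?" from the shell.
            yield "mv -- {} {}".format(
                shlex.quote("images/" + saved_as),
                shlex.quote("images/" + true_name),
            )


if __name__ == "__main__":
    sample = [
        "latest?cb=20120816112532 "
        "http://vignette4.wikia.nocookie.net/donjonbd/images/8/89/"
        "Wiki-wordmark.png/revision/latest?cb=20120816112532 Nclm"
    ]
    for command in rename_commands(sample):
        print(command)
```

Names in the URL stay percent-encoded in this sketch, so files with non-ASCII names would still need decoding before the script is run.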
I have the same problem: the txt file is created, but the images aren't downloaded because the filename contains "?". I fixed this problem! I edited line 998, changing |
Thanks @brunosso for the patch. I fixed it and it works fine! |
I think this needs to be reopened. Wikia now has the following new URLs with the same /revision/ nuance:
|
Yeah, it gives errors again... (This is a fandom site)
|
A wiki founder at Orain wants the images of their wiki at Wikia (http://donjon.wikia.com) imported to their wiki at Orain. I tried to download the images with "python dumpgenerator.py --api http://donjon.wikia.com --images", but no luck. The file names in the images/ folder are like "latest?cb=XXXXXXXXXXXXXX", where the XXX-string is a timestamp in the YmdHis format.
When I looked at the donjonwikiacom-20150104-images.txt file I saw entries like these:
latest?cb=20120816112532 http://vignette4.wikia.nocookie.net/donjonbd/images/8/89/Wiki-wordmark.png/revision/latest?cb=20120816112532 Nclm
latest?cb=20120816114055 http://vignette3.wikia.nocookie.net/donjonbd/images/6/64/Favicon.ico/revision/latest?cb=20120816114055 Nclm
This is because file names include the "/revision/latest?cb=XXXXXXXXXXXXXXXXX" part:
http://donjon.wikia.com/api.php?action=query&list=allimages
To avoid this problem, I'll see if I can write a PHP script to download images from Wikia.
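As a rough illustration of what such a check could look like, here is a minimal Python 3 sketch that derives the real file name from an image URL by dropping the query string and anything from "/revision/" onwards. The function name `filename_from_url` is hypothetical and this is not the code in dumpgenerator.py. It only assumes the URL shape shown in the entries above.

```python
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the file name of an image URL, ignoring a Wikia-style
    "/revision/latest?cb=..." suffix if one is present."""
    path = urlsplit(url).path                # drops the "?cb=..." query
    head, _, _ = path.partition("/revision/")  # head == path when there is no suffix
    name = head.rsplit("/", 1)[-1]           # last path segment
    return unquote(name)                     # decode %XX escapes


if __name__ == "__main__":
    print(filename_from_url(
        "http://vignette3.wikia.nocookie.net/donjonbd/images/6/64/"
        "Favicon.ico/revision/latest?cb=20120816114055"
    ))
```

URLs without the "/revision/" part pass through unchanged apart from the decoding, so the same helper could be applied to every entry.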