Circumvent API exception when trying to use index.php
$ python dumpgenerator.py --xml --index=http://meritbadge.org/wiki/index.php
fails on at least one MediaWiki 1.12 wiki:

Trying generating a new dump into a new directory...
Loading page titles from namespaces = all
Excluding titles from namespaces = None
Traceback (most recent call last):
  File "dumpgenerator.py", line 2211, in <module>
    main()
  File "dumpgenerator.py", line 2203, in main
    createNewDump(config=config, other=other)
  File "dumpgenerator.py", line 1766, in createNewDump
    getPageTitles(config=config, session=other['session'])
  File "dumpgenerator.py", line 400, in getPageTitles
    test = getJSON(r)
  File "dumpgenerator.py", line 1708, in getJSON
    return request.json()
  File "/usr/lib/python2.7/site-packages/requests/models.py", line 892, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib64/python2.7/site-packages/simplejson/__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/site-packages/simplejson/decoder.py", line 374, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib64/python2.7/site-packages/simplejson/decoder.py", line 404, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
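
The last frame fails at the very first character, which suggests the response body was not JSON at all. A minimal sketch of that failure mode follows. It uses Python 3's standard `json` module rather than the `simplejson` build shown in the traceback, and the HTML body is a hypothetical stand-in, since the actual response from the wiki is not recorded here.

```python
import json

# Hypothetical response body: an HTML page where JSON was expected.
body = "<!DOCTYPE html><html><body>MediaWiki</body></html>"


def try_decode(text):
    """Return (parsed, error_message); exactly one of the two is None."""
    try:
        return json.loads(text), None
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        return None, str(exc)


parsed, error = try_decode(body)
print(error)
```

Catching `ValueError` here should cover both the standard library decoder and `simplejson`, since both define their decode error as a `ValueError` subclass.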
nemobis committed May 18, 2018
1 parent d4f0869 commit 27cbdfd
Showing 1 changed file with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions dumpgenerator.py
@@ -397,8 +397,11 @@ def getPageTitles(config={}, session=None):
     titles = []
     if 'api' in config and config['api']:
         r = session.post(config['api'], params={'action': 'query', 'list': 'allpages', 'format': 'json'}, timeout=30)
-        test = getJSON(r)
-        if ('warnings' in test and 'allpages' in test['warnings'] and '*' in test['warnings']['allpages']
+        try:
+            test = getJSON(r)
+        except:
+            test = None
+        if not test or ('warnings' in test and 'allpages' in test['warnings'] and '*' in test['warnings']['allpages']
                 and test['warnings']['allpages']['*'] == 'The "allpages" module has been disabled.'):
             titles = getPageTitlesScraper(config=config, session=session)
         else:
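
The decision this hunk introduces (fall back to the scraper when the API reply is unparseable, empty, or reports `allpages` as disabled) can be sketched in isolation. This is an illustrative sketch only, not the code in `dumpgenerator.py`: `FakeResponse` and `should_scrape` are hypothetical names, and it catches `ValueError` instead of using the bare `except:` from the commit, on the assumption that JSON decode errors are `ValueError` subclasses.

```python
import json

ALLPAGES_DISABLED = 'The "allpages" module has been disabled.'


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only .json() is modeled."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)


def should_scrape(response):
    """Return True when page titles should come from the scraper, not the API."""
    try:
        test = response.json()
    except ValueError:  # body is not JSON (assumed to cover decode errors)
        test = None
    if not test:
        # Unparseable or empty reply: the API is not usable for this wiki.
        return True
    # Same warning check as in the diff, unchanged in meaning.
    return ('warnings' in test and 'allpages' in test['warnings']
            and '*' in test['warnings']['allpages']
            and test['warnings']['allpages']['*'] == ALLPAGES_DISABLED)


print(should_scrape(FakeResponse("<html>not json</html>")))
```

A narrower `except` clause keeps unrelated errors, such as a `KeyboardInterrupt`, from being silently treated as a disabled API. Whether that trade-off suits the real script is a judgment call for its maintainers.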