Closed
Terms
- I have searched all open bug reports
- I agree to follow Scribe-Data's Code of Conduct
Behavior
Description
(scribedev) shashankmittal@ShashanksLaptop Scribe-Data % python3 src/scribe_data/extract_transform/wikidata/update_data.py '["German"]' '["nouns", "verbs"]'
Data updated: 0%| | 0/2 [00:00<?, ?dirs/s]Querying and formatting German nouns
Data updated: 0%| | 0/2 [01:00<?, ?dirs/s]
Traceback (most recent call last):
  File "/Users/shashankmittal/Documents/Developer/scribe/Scribe-Data/src/scribe_data/extract_transform/wikidata/update_data.py", line 141, in <module>
    results = sparql.query().convert()
  File "/opt/anaconda3/envs/scribedev/lib/python3.10/site-packages/SPARQLWrapper/Wrapper.py", line 1196, in convert
    return self._convertJSON()
  File "/opt/anaconda3/envs/scribedev/lib/python3.10/site-packages/SPARQLWrapper/Wrapper.py", line 1059, in _convertJSON
    json_str = json.loads(self.response.read().decode("utf-8"))
  File "/opt/anaconda3/envs/scribedev/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/opt/anaconda3/envs/scribedev/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/anaconda3/envs/scribedev/lib/python3.10/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 320797 column 115 (char 6713171)
Query builder Link
The query hits the time limit. As a result, the response behind `sparql.query()` contains the timeout error log instead of complete JSON, and `results = sparql.query().convert()` in `update_data.py` throws `json.decoder.JSONDecodeError: Invalid control character at: line 320797 column 115 (char 6713171)`.
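To make this failure easier to recognize, the raw response could be parsed by hand and checked for a timeout marker before the decode error propagates. The following is only a sketch: `parse_sparql_json`, `QueryTimeoutError`, and the marker strings in `TIMEOUT_MARKERS` are hypothetical names and assumptions, not part of `update_data.py` or SPARQLWrapper, and the markers should be checked against a real failed response.

```python
import json


class QueryTimeoutError(RuntimeError):
    """Hypothetical error type for a response cut short by a query timeout."""


# Assumed marker strings; verify them against an actual failed response.
TIMEOUT_MARKERS = ("TimeoutException", "Query timeout limit reached")


def parse_sparql_json(raw: bytes) -> dict:
    """Decode a SPARQL JSON response, flagging timeout logs explicitly."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # A timed-out query can leave error text inside a partial JSON body,
        # which surfaces as a confusing "Invalid control character" error.
        if any(marker in text for marker in TIMEOUT_MARKERS):
            raise QueryTimeoutError(
                "Response contains a timeout log instead of complete JSON"
            ) from err
        raise
```

Since the traceback shows SPARQLWrapper reading `self.response.read()`, one possible use is `parse_sparql_json(sparql.query().response.read())` in place of `.convert()`.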
Suggested Changes
- Considered splitting the SPARQL query into smaller queries, such as one query for nouns and another for pronouns, or querying for singular and plural forms separately.
  - Still got a `Query timeout limit reached` error, as the total number of nouns and pronouns for German is 165869. Verified here.
- Use `LIMIT` and `OFFSET` to split into multiple queries.
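The `LIMIT`/`OFFSET` idea could look roughly like the sketch below. `fetch_all`, `run_query`, and the default `page_size` are hypothetical: `run_query` stands in for whatever sends one query string and returns its list of result rows. In SPARQL, paging with `OFFSET` is generally only deterministic when the base query has an `ORDER BY`, which is assumed here and may itself add to query time.

```python
from typing import Callable, List


def fetch_all(
    base_query: str,
    run_query: Callable[[str], List[dict]],
    page_size: int = 50000,
) -> List[dict]:
    """Collect all rows by re-running base_query with LIMIT/OFFSET pages."""
    rows: List[dict] = []
    offset = 0
    while True:
        # Append the paging clauses to the (assumed ordered) base query.
        paged = f"{base_query}\nLIMIT {page_size}\nOFFSET {offset}"
        page = run_query(paged)
        rows.extend(page)
        # A short (or empty) page means the result set is exhausted.
        if len(page) < page_size:
            return rows
        offset += page_size
```

With the 165869 German results mentioned above and a page size of 50000, this would issue four requests (offsets 0, 50000, 100000, and 150000). Whether each page stays under the time limit would still need to be checked.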
andrewtavis