Document is empty #43
Comments
Seconded.
Closing due to inactivity 👍
I am getting this same issue.
I had the same error as @LiquidPrototype and @thorsummoner.
Breaking out of the while loop when the empty HTML is encountered seems to avoid the error.
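For anyone wanting to try the workaround above, here is a minimal sketch of the idea, not the library's actual code. The names `iter_pages` and `fake_fetch` are hypothetical stand-ins I made up for illustration; the real loop lives inside `gen_tweets` in twitter_scraper.py and fetches pages over the network. The assumption is that an empty HTML fragment is what the parser chokes on, so the loop stops before passing it along.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[int], Optional[str]], pages: int) -> Iterator[str]:
    """Yield HTML fragments page by page, stopping at the first empty one."""
    page = 0
    while page < pages:
        html = fetch_page(page)
        # Assumption: an empty (or whitespace-only) fragment is what leads to
        # "Document is empty", so break out before any parser sees it.
        if html is None or not html.strip():
            break
        yield html
        page += 1


def fake_fetch(page: int) -> str:
    """Hypothetical stand-in for the network request made per page."""
    responses = [
        "<html><p>tweet 1</p></html>",
        "<html><p>tweet 2</p></html>",
        "",  # the empty response that would otherwise crash the parser
        "<html><p>never reached</p></html>",
    ]
    return responses[page]


collected = list(iter_pages(fake_fetch, pages=4))
print(len(collected))
```

With this shape the caller simply receives fewer pages than requested instead of an exception. Whether silently stopping is acceptable depends on your use case; logging the early exit may be preferable.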
It's worth checking out the repo I made to get Twitter data for bot identification purposes. I was using this repo until it broke for me too. https://github.com/jamesacampbell/botrnot and it is
I'm getting this error when I try to run:
Traceback (most recent call last):
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\pyquery\pyquery.py", line 95, in fromstring
result = getattr(etree, meth)(context)
File "src\lxml\etree.pyx", line 3213, in lxml.etree.fromstring
File "src\lxml\parser.pxi", line 1876, in lxml.etree._parseMemoryDocument
File "src\lxml\parser.pxi", line 1764, in lxml.etree._parseDoc
File "src\lxml\parser.pxi", line 1126, in lxml.etree._BaseParser._parseDoc
File "src\lxml\parser.pxi", line 600, in lxml.etree._ParserContext._handleParseResultDoc
File "src\lxml\parser.pxi", line 710, in lxml.etree._handleParseResult
File "src\lxml\parser.pxi", line 639, in lxml.etree._raiseParseError
File "<string>", line 17
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 17, column 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\USER\Desktop\Twitter Stock Market\current.py", line 16, in <module>
for tweet in get_tweets('trump', pages=3):
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\twitter_scraper.py", line 78, in get_tweets
yield from gen_tweets(pages)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\twitter_scraper.py", line 26, in gen_tweets
url='bunk', default_encoding='utf-8')
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\requests_html.py", line 419, in __init__
element=PyQuery(html)('html') or PyQuery(f'{html}')('html'),
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\pyquery\pyquery.py", line 255, in __init__
elements = fromstring(context, self.parser)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\pyquery\pyquery.py", line 99, in fromstring
result = getattr(lxml.html, meth)(context)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\lxml\html\__init__.py", line 876, in fromstring
doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
File "C:\Users\USER\AppData\Local\Programs\Python\Python36\lib\site-packages\lxml\html\__init__.py", line 765, in document_fromstring
"Document is empty")
lxml.etree.ParserError: Document is empty