Before creating an issue, first upgrade cfscrape with `pip install -U cfscrape` and see if you're still experiencing the problem. Please also confirm that your Node version (`node --version` or `nodejs --version`) is 10 or higher.
Make sure the website you're having issues with is actually protected by Cloudflare's anti-bot system and not by a competitor such as Imperva Incapsula or Sucuri. If you're using an anonymizing proxy, a VPN, or Tor, be aware that Cloudflare often flags those IPs and may block you or present a captcha as a result.
Please confirm the following statements and check the boxes before creating an issue:
- [x] I've upgraded cfscrape with `pip install -U cfscrape`
- [x] I'm using Node version 10 or higher
- [x] The site protection I'm having issues with is from Cloudflare
- [x] I'm not using Tor, a VPN, or an anonymizing proxy
**Python version number**
Run `python --version` and paste the output below:

**cfscrape version number**
Run `pip show cfscrape` and paste the output below:

**Code snippet involved with the issue**
2020-06-16 18:42:03 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: scraping)
2020-06-16 18:42:03 [scrapy.utils.log] INFO: Versions: lxml 4.5.0.0, libxml2 2.9.9, cssselect 1.1.0, parsel 1.5.2, w3lib 1.21.0, Twisted 20.3.0, Python 3.7.7 (default, May 6 2020, 04:59:01) - [Clang 4.0.1 (tags/RELEASE_401/final)], pyOpenSSL 19.1.0 (OpenSSL 1.1.1g 21 Apr 2020), cryptography 2.9.2, Platform Darwin-19.5.0-x86_64-i386-64bit
2020-06-16 18:42:03 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'scraping', 'CONCURRENT_REQUESTS': 32, 'CONCURRENT_REQUESTS_PER_DOMAIN': 32, 'COOKIES_ENABLED': False, 'DOWNLOAD_DELAY': 2, 'DOWNLOAD_TIMEOUT': 600, 'DUPEFILTER_CLASS': 'scrapy_splash.SplashAwareDupeFilter', 'FEED_FORMAT': 'csv', 'FEED_URI': 'results/%(name)s_%(time)s.csv', 'HTTPCACHE_ENABLED': True, 'HTTPCACHE_EXPIRATION_SECS': 43200, 'HTTPCACHE_STORAGE': 'scrapy_splash.SplashAwareFSCacheStorage', 'NEWSPIDER_MODULE': 'scraping.spiders', 'SPIDER_MODULES': ['scraping.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'}
2020-06-16 18:42:03 [scrapy.extensions.telnet] INFO: Telnet Password: e179fe629b29425b
2020-06-16 18:42:03 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.memusage.MemoryUsage',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats']
>>>>>>>>>>>>>>>>>__init__(MODES)<<<<<<<<<<<<<<<<<
2020-06-16 18:42:03 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
'scrapy.downloadermiddlewares.retry.RetryMiddleware',
'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
'scrapy_crawlera.CrawleraMiddleware',
'scrapy_splash.SplashCookiesMiddleware',
'scrapy_splash.SplashMiddleware',
'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
'scrapy.downloadermiddlewares.stats.DownloaderStats',
'scrapy.downloadermiddlewares.httpcache.HttpCacheMiddleware']
2020-06-16 18:42:03 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
'scrapy_splash.SplashDeduplicateArgsMiddleware',
'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
'scrapy.spidermiddlewares.referer.RefererMiddleware',
'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
'scrapy.spidermiddlewares.depth.DepthMiddleware']
2020-06-16 18:42:03 [scrapy.middleware] INFO: Enabled item pipelines:
['scraping.pipelines.ScrapingPipeline']
2020-06-16 18:42:03 [urllib3.connectionpool] DEBUG: Starting new HTTPS connection (1): www.modes.com:443
2020-06-16 18:42:04 [urllib3.connectionpool] DEBUG: https://www.modes.com:443 "GET /jp/shopping/woman HTTP/1.1" 503 None
Unhandled error in Deferred:
2020-06-16 18:42:04 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/scrapy/crawler.py", line 172, in crawl
return self._crawl(crawler, *args, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/scrapy/crawler.py", line 176, in _crawl
d = crawler.crawl(*args, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
return _cancellableInlineCallbacks(gen)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
_inlineCallbacks(None, g, status)
--- <exception caught here> ---
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/scrapy/crawler.py", line 81, in crawl
start_requests = iter(self.spider.start_requests())
File "/Users/rnrnstar/github/Spiders/scraping/spiders/modes.py", line 41, in start_requests
data = scraper.get("https://www.modes.com/jp/shopping/woman").content
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/requests/sessions.py", line 543, in get
return self.request('GET', url, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 129, in request
resp = self.solve_cf_challenge(resp, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 207, in solve_cf_challenge
answer, delay = self.solve_challenge(body, domain)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 299, in solve_challenge
% BUG_REPORT
builtins.ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.
Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues.
2020-06-16 18:42:04 [twisted] CRITICAL:
Traceback (most recent call last):
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 259, in solve_challenge
javascript, flags=re.S
AttributeError: 'NoneType' object has no attribute 'groups'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
result = g.send(result)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/scrapy/crawler.py", line 81, in crawl
start_requests = iter(self.spider.start_requests())
File "/Users/rnrnstar/github/Spiders/scraping/spiders/modes.py", line 41, in start_requests
data = scraper.get("https://www.modes.com/jp/shopping/woman").content
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/requests/sessions.py", line 543, in get
return self.request('GET', url, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 129, in request
resp = self.solve_cf_challenge(resp, **kwargs)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 207, in solve_cf_challenge
answer, delay = self.solve_challenge(body, domain)
File "/Users/rnrnstar/opt/anaconda3/envs/python_modules/lib/python3.7/site-packages/cfscrape/__init__.py", line 299, in solve_challenge
% BUG_REPORT
ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.
Please read https://github.com/Anorov/cloudflare-scrape#updates, then file a bug report at https://github.com/Anorov/cloudflare-scrape/issues.
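The second traceback shows the immediate cause: cfscrape's `solve_challenge` calls `.groups()` on an unchecked `re.search` result. When Cloudflare changes its IUAM challenge page so the expected pattern no longer matches, `re.search` returns `None`, `.groups()` raises the `AttributeError` above, and cfscrape converts that into the `ValueError`. A minimal sketch of this failure mode (the regex below is illustrative, not cfscrape's exact pattern):

```python
import re

# Illustrative pattern standing in for cfscrape's IUAM-challenge regex;
# the real pattern in cfscrape/__init__.py is more involved.
CHALLENGE_RE = re.compile(r"setTimeout\(function\(\)\{\s+(var .+?)\r?\n", re.S)

def extract_challenge(html):
    match = CHALLENGE_RE.search(html)
    if match is None:
        # Without this guard, match.groups() raises
        # AttributeError: 'NoneType' object has no attribute 'groups'
        raise ValueError("Unable to identify Cloudflare IUAM Javascript on website.")
    return match.groups()

# A page whose markup Cloudflare has changed no longer matches,
# so search() returns None instead of a match object:
new_style_page = "<html><body>cf_chl challenge, new markup</body></html>"
print(CHALLENGE_RE.search(new_style_page))  # None
```

This is why the error message suggests Cloudflare "may have changed their technique": the scraper's pattern is hard-coded against a snapshot of the challenge page, and any markup change on Cloudflare's side breaks the match.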
**Complete exception and traceback**
(If the problem doesn't involve an exception being raised, leave this blank)

**URL of the Cloudflare-protected page**
[LINK GOES HERE]

**URL of Pastebin/Gist with HTML source of protected page**
[LINK GOES HERE]