Scraps forever #280

Closed
mirfan899 opened this issue May 5, 2023 · 3 comments

Here is the list of URLs I'm trying to scrape. The crawl gets stuck and never finishes.

https://www.si.com/showcase/fitness/best-boxing-gloves
https://www.verywellfit.com/best-boxing-gloves-4158917
https://www.rollingstone.com/product-recommendations/lifestyle/best-boxing-gloves-1234690811/
https://www.gearpatrol.com/fitness/g40446087/best-boxing-gloves/
https://boxingglovesreviews.com/top-ten-boxing-gloves/
https://sweetscienceoffighting.com/best-boxing-gloves/
https://www.shape.com/fitness/gear/best-boxing-gloves
https://www.t3.com/features/best-boxing-gloves
https://bleacherreport.com/articles/1286577-breaking-down-different-brands-of-boxing-gloves-worn-by-the-pros
https://www.youtube.com/watch?v=tWoucO2nIlE
https://expertboxing.com/best-boxing-gloves-review
https://thekarateblog.com/best-boxing-gloves/
https://boxupnation.com/blogs/news/my-top-5-favorite-boxing-glove-brands-and-why
https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131
https://www.tabletenniscoach.me.uk/sport-equipment-guides/best-boxing-gloves-for-beginners/
https://myboxinglife.com/best-boxing-gloves-for-beginners/
https://www.youtube.com/watch?v=rHepbZOCxfY
https://wayofmartialarts.com/best-boxing-gloves-worth-your-money/
https://www.hayabusafight.com/products/t3-boxing-gloves
https://www.dickssportinggoods.com/o/best-boxing-gloves-for-pad-work
https://revgear.com/gear/boxing-gloves/
https://blog.joinfightcamp.com/boxing-equipment/how-to-choose-the-best-boxing-gloves-for-beginners/
https://www.ebay.com/t/Boxing-Gloves/30102/bn_1943751
https://cletoreyesboxing.com/
https://www.walmart.com/c/lists/top-rated-boxing-gloves
https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1
https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves
https://m.timesofindia.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms
https://www.everlast.com/fight/boxing/gloves
https://www.msmfightshop.com/blogs/news/top-3-boxing-gloves-in-the-world
https://www.quora.com/What-companies-make-the-best-quality-boxing-gloves
https://www.titleboxing.com/gloves
https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-professionals/articleshow/97128538.cms
https://skilspo.com/gb/blog/1_how-to-choose-the-best-boxing-gloves.html
https://bravose.com/collections/training-gloves
https://sanabulsports.com/blogs/news/the-best-boxing-gloves-for-training
https://anthonyjoshua.com/blogs/news/anthony-joshua-how-to-choose-the-best-boxing-gloves
https://www.nakmuaywholesale.com/top-3-boxing-gloves-for-small-hands-2022/
https://mmagearaddict.com/best-boxing-gloves/
https://issuu.com/punchequipment/docs/get_the_best_boxing_gloves_for_a_winning_performan
https://tufwear-germany.de/en/blogs/news/was-sind-die-besten-boxhandschuhe-der-boxhandschuh-guide-fur-deinen-kauf
https://yokkao.com/pages/boxing-gloves-guide
https://topboxer.com/collections/boxing-gloves
https://warriorpunch.com/best-boxing-gloves-for-beginners/
https://nypost.com/article/best-boxing-equipment-per-experts/
https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves
https://www.infinitudefight.com/buy-the-best-boxing-gloves/
https://cashkaro.com/blog/best-boxing-gloves-in-india/201246
https://www.popsugar.com/fitness/Best-Boxing-Gloves-Women-45472473
https://kdvr.com/reviews/br/sports-fitness-br/boxing-br/best-title-boxing-gloves/
https://www.expertreviews.co.uk/health-and-grooming/1407584/best-boxing-gloves
https://branded.disruptsports.com/blogs/blog/which-boxing-gloves-to-buy-for-beginners
https://www.flipkart.com/sports/boxing/boxing-gloves/pr?sid=abc%2Cppq%2Cbb6&page=2
https://www.reddit.com/r/amateur_boxing/comments/2ykhau/the_top_15_best_boxing_gloves_ranking_the_best/
https://fightquality.com/2018/10/12/best-custom-gloves/
https://fightingadvice.com/best-boxing-gloves-under-200/
https://glovesaddict.com/best-boxing-gloves-on-amazon/
https://www.k2promos.com/best-beginner-boxing-gloves/
https://absolutelymartialarts.com/best-boxing-gloves-beginners/
https://www.healthyprinciples.co.uk/best-boxing-gloves-for-kids-review/
https://breakinggrips.com/best-kids-boxing-gloves/
https://www.proboxingequipment.com/Boxing-Gloves_c_196.html
https://www.mmahive.com/best-boxing-gloves-for-wrist-support/
https://bwsgym.com/etiquette-produit/best-boxing-gloves/
https://www.dontwasteyourmoney.com/products/hawk-sports-heavy-bag-boxing-gloves/
https://www.bestproducts.com/fitness/equipment/g1009/boxing-gloves-mitts/
https://www.wbcme.co.uk/ringside/best-boxing-gloves-for-beginners/
https://www.momjunction.com/articles/best-boxing-gloves-for-kids_00514921/
https://middleeasy.com/reviews/gear/gloves-cardio-kickboxing/
https://www.fightingking.com/boxing-gloves-brands-reviews/
https://www.mightyfighter.com/top-10-best-boxing-gloves/
https://www.stylecraze.com/articles/best-heavy-bag-gloves/
https://linealboxing.com/best-boxing-glove-brands-2022/
https://blackbeltmag.com/best-boxing-gloves
https://smartmma.com/best-boxing-gloves-for-heavy-bag/
https://www.fullcontactway.com/best-sparring-gloves/
https://www.attacktheback.com/best-cheap-boxing-gloves/
https://www.boxingear.com/shop-2/grant-gloves/lace-up/best-boxing-gloves-for-sparring-grant-gloves/
https://www.kreedon.com/best-boxing-gloves-brands/
https://bestreviews.com/sports-fitness/boxing/best-boxing-gloves
https://cletoreyesuk.com/blogs/news/what-are-the-best-boxing-gloves-for-beginners
https://www.fitnessbaddies.com/amateur-boxing-gloves/
https://www.boxingison.com/best-boxing-gloves-for-training-and-sparring/
https://boxingready.com/ringside/best-boxing-gloves-wrist-support/
https://www.msn.com/en-gb/lifestyle/rf-best-products-uk/best-boxing-gloves-for-men-12oz-reviews
https://www.pragmaticmom.com/2019/11/best-boxing-gloves-for-women/
https://thewiredshopper.com/best-boxing-gloves-to-buy/
https://www.standard.co.uk/shopping/esbest/health-fitness/fitness-wear/best-womens-boxing-gloves-for-beginners-a4272321.html
https://www.gloveworx.com/blog/how-choose-best-boxing-gloves-beginners/
https://www.lowkickmma.com/best-boxing-gloves/
https://www.sportsdirect.com/boxing/boxing-gloves
https://themmaguru.com/best-youth-boxing-gloves/
https://brawlbros.com/best-boxing-gloves-on-amazon/
https://thechamplair.com/sports/best-beginners-boxing-gloves/
https://www.dmarge.com/best-boxing-gloves
https://www.nytimes.com/video/style/1194840632119/gear-test-boxing-gloves.html
https://findbestboxinggloves.com/best-boxing-gloves-for-heavy-bag-the-complete-guide/
https://www.hungry4fitness.co.uk/post/10-best-boxing-mitts-an-ultimate-guide
https://www.gearhungry.com/best-boxing-gloves/
https://hiconsumption.com/best-boxing-gloves/
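
With a list this long, one way to find out which URL (if any single one) makes the crawl hang is to run it in small batches and see which batch never returns. The following is only a minimal sketch of the batching step: `batches` is a hypothetical helper, the batch size is arbitrary, and the actual crawl call for each batch is left out because it depends on how the script is set up.

```python
from urllib.parse import urlsplit


def batches(urls, size):
    """Yield consecutive slices of `urls`, each with at most `size` items."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Three URLs taken from the list above, used purely as sample input.
urls = [
    "https://www.si.com/showcase/fitness/best-boxing-gloves",
    "https://www.t3.com/features/best-boxing-gloves",
    "https://www.youtube.com/watch?v=tWoucO2nIlE",
]

chunks = list(batches(urls, 2))
for number, chunk in enumerate(chunks, start=1):
    # Each chunk would be crawled on its own (crawl call not shown);
    # printing the hosts makes it easy to see which batch stalled.
    hosts = sorted({urlsplit(u).netloc for u in chunk})
    print(f"batch {number}: {hosts}")
```

Halving the batch that hangs and repeating narrows the problem down in a few runs.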

Here is the log:

/home/irfan/.pyenv/versions/TES/bin/python /home/irfan/PycharmProjects/TES-SAAS/tests/scprapping.py 
2023-05-05 06:52:32 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2023-05-05 06:52:32 [scrapy.utils.log] INFO: Versions: lxml 4.9.2.0, libxml2 2.9.14, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.7.9 (default, Jan 23 2022, 07:32:51) - [GCC 7.5.0], pyOpenSSL 22.0.0 (OpenSSL 3.0.3 3 May 2022), cryptography 37.0.2, Platform Linux-5.4.0-148-generic-x86_64-with-debian-bullseye-sid
2023-05-05 06:52:32 [scrapy.crawler] INFO: Overridden settings:
{'ROBOTSTXT_OBEY': True,
 'SPIDER_LOADER_WARN_ONLY': True,
 'USER_AGENT': 'advertools/0.13.2'}
2023-05-05 06:52:32 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2023-05-05 06:52:32 [scrapy.extensions.telnet] INFO: Telnet Password: 2dcb88ca688b5e23
2023-05-05 06:52:32 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2023-05-05 06:52:33 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2023-05-05 06:52:33 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2023-05-05 06:52:33 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2023-05-05 06:52:33 [scrapy.core.engine] INFO: Spider opened
2023-05-05 06:52:33 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2023-05-05 06:52:33 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sweetscienceoffighting.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.rollingstone.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [filelock] DEBUG: Attempting to acquire lock 140227121181328 on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 06:52:33 [filelock] DEBUG: Lock 140227121181328 acquired on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 06:52:33 [filelock] DEBUG: Attempting to release lock 140227121181328 on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 06:52:33 [filelock] DEBUG: Lock 140227121181328 released on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.t3.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.si.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearpatrol.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.shape.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.verywellfit.com/robots.txt> (referer: None)
2023-05-05 06:52:33 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.si.com/showcase/fitness/best-boxing-gloves> (referer: None)
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingglovesreviews.com/robots.txt> (referer: None)
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.t3.com/features/best-boxing-gloves> (referer: None)
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bleacherreport.com/robots.txt> (referer: None)
2023-05-05 06:52:34 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.si.com/showcase/fitness/best-boxing-gloves>
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.rollingstone.com/product-recommendations/lifestyle/best-boxing-gloves-1234690811/> (referer: None)
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://expertboxing.com/robots.txt> (referer: None)
2023-05-05 06:52:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.t3.com/features/best-boxing-gloves>
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/robots.txt> (referer: None)
2023-05-05 06:52:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.rollingstone.com/product-recommendations/lifestyle/best-boxing-gloves-1234690811/>
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.verywellfit.com/best-boxing-gloves-4158917> (referer: None)
2023-05-05 06:52:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.verywellfit.com/best-boxing-gloves-4158917>
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.shape.com/fitness/gear/best-boxing-gloves> (referer: None)
2023-05-05 06:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thekarateblog.com/robots.txt> (referer: None)
2023-05-05 06:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.shape.com/fitness/gear/best-boxing-gloves>
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sweetscienceoffighting.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearpatrol.com/fitness/g40446087/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://sweetscienceoffighting.com/best-boxing-gloves/>
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.com/robots.txt> (referer: None)
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxupnation.com/robots.txt> (referer: None)
2023-05-05 06:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gearpatrol.com/fitness/g40446087/best-boxing-gloves/>
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.tabletenniscoach.me.uk/robots.txt> (referer: None)
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/watch?v=tWoucO2nIlE> (referer: None)
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bleacherreport.com/articles/1286577-breaking-down-different-brands-of-boxing-gloves-worn-by-the-pros> (referer: None)
2023-05-05 06:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingglovesreviews.com/top-ten-boxing-gloves/> (referer: None)
2023-05-05 06:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.youtube.com/watch?v=tWoucO2nIlE>
2023-05-05 06:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bleacherreport.com/articles/1286577-breaking-down-different-brands-of-boxing-gloves-worn-by-the-pros>
2023-05-05 06:52:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxingglovesreviews.com/top-ten-boxing-gloves/>
2023-05-05 06:52:36 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 1 times): 429 Unknown Status
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxupnation.com/blogs/news/my-top-5-favorite-boxing-glove-brands-and-why> (referer: None)
2023-05-05 06:52:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxupnation.com/blogs/news/my-top-5-favorite-boxing-glove-brands-and-why>
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wayofmartialarts.com/robots.txt> (referer: None)
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thekarateblog.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://myboxinglife.com/robots.txt> (referer: None)
2023-05-05 06:52:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://thekarateblog.com/best-boxing-gloves/>
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.tabletenniscoach.me.uk/sport-equipment-guides/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://expertboxing.com/best-boxing-gloves-review> (referer: None)
2023-05-05 06:52:36 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.dickssportinggoods.com/robots.txt> (referer: None)
2023-05-05 06:52:36 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 2 times): 429 Unknown Status
2023-05-05 06:52:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.tabletenniscoach.me.uk/sport-equipment-guides/best-boxing-gloves-for-beginners/>
2023-05-05 06:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://expertboxing.com/best-boxing-gloves-review>
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/watch?v=rHepbZOCxfY> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hayabusafight.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://revgear.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.youtube.com/watch?v=rHepbZOCxfY>
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.dickssportinggoods.com/o/best-boxing-gloves-for-pad-work> (referer: None)
2023-05-05 06:52:37 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.dickssportinggoods.com/o/best-boxing-gloves-for-pad-work>
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://myboxinglife.com/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 06:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://myboxinglife.com/best-boxing-gloves-for-beginners/>
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blog.joinfightcamp.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ebay.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.walmart.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ringsport.com.au/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesboxing.com/robots.txt> (referer: None)
2023-05-05 06:52:37 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 3 times): 429 Unknown Status
2023-05-05 06:52:37 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (referer: None) ['partial']
2023-05-05 06:52:38 [scrapy.core.scraper] DEBUG: Scraped from <429 https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131>
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://made4fighters.com/robots.txt> (referer: None)
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1> (referer: None)
2023-05-05 06:52:38 [seo_spider] ERROR: Invalid control character at: line 5 column 19 (char 78) 200 https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 5 column 19 (char 78)
2023-05-05 06:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1>
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blog.joinfightcamp.com/boxing-equipment/how-to-choose-the-best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 06:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://blog.joinfightcamp.com/boxing-equipment/how-to-choose-the-best-boxing-gloves-for-beginners/>
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ebay.com/t/Boxing-Gloves/30102/bn_1943751> (referer: None)
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wayofmartialarts.com/best-boxing-gloves-worth-your-money/> (referer: None)
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msmfightshop.com/robots.txt> (referer: None)
2023-05-05 06:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.ebay.com/t/Boxing-Gloves/30102/bn_1943751>
2023-05-05 06:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://wayofmartialarts.com/best-boxing-gloves-worth-your-money/>
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves> (referer: None)
2023-05-05 06:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.quora.com/robots.txt> (referer: None)
2023-05-05 06:52:38 [scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET https://www.quora.com/What-companies-make-the-best-quality-boxing-gloves>
2023-05-05 06:52:39 [seo_spider] ERROR: Invalid control character at: line 20 column 226 (char 698) 200 https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 20 column 226 (char 698)
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves>
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hayabusafight.com/products/t3-boxing-gloves> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msmfightshop.com/blogs/news/top-3-boxing-gloves-in-the-world> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.everlast.com/robots.txt> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesboxing.com/> (referer: None)
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.hayabusafight.com/products/t3-boxing-gloves>
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.msmfightshop.com/blogs/news/top-3-boxing-gloves-in-the-world>
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cletoreyesboxing.com/>
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://revgear.com/gear/boxing-gloves/> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://m.timesofindia.com/robots.txt> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.walmart.com/c/lists/top-rated-boxing-gloves> (referer: None)
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://revgear.com/gear/boxing-gloves/>
2023-05-05 06:52:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr> from <GET https://m.timesofindia.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms>
2023-05-05 06:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.walmart.com/c/lists/top-rated-boxing-gloves>
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.titleboxing.com/robots.txt> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bravose.com/robots.txt> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sanabulsports.com/robots.txt> (referer: None)
2023-05-05 06:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/robots.txt> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://anthonyjoshua.com/robots.txt> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sanabulsports.com/blogs/news/the-best-boxing-gloves-for-training> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.everlast.com/fight/boxing/gloves> (referer: None)
2023-05-05 06:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://sanabulsports.com/blogs/news/the-best-boxing-gloves-for-training>
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nakmuaywholesale.com/robots.txt> (referer: None)
2023-05-05 06:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.everlast.com/fight/boxing/gloves>
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://mmagearaddict.com/robots.txt> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://anthonyjoshua.com/blogs/news/anthony-joshua-how-to-choose-the-best-boxing-gloves> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bravose.com/collections/training-gloves> (referer: None)
2023-05-05 06:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://anthonyjoshua.com/blogs/news/anthony-joshua-how-to-choose-the-best-boxing-gloves>
2023-05-05 06:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bravose.com/collections/training-gloves>
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://issuu.com/robots.txt> (referer: None)
2023-05-05 06:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://tufwear-germany.de/robots.txt> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.titleboxing.com/gloves> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-professionals/articleshow/97128538.cms> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://yokkao.com/robots.txt> (referer: None)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.titleboxing.com/gloves>
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://tufwear-germany.de/en/blogs/news/was-sind-die-besten-boxhandschuhe-der-boxhandschuh-guide-fur-deinen-kauf> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://mmagearaddict.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://topboxer.com/robots.txt> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nakmuaywholesale.com/top-3-boxing-gloves-for-small-hands-2022/> (referer: None)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-professionals/articleshow/97128538.cms>
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://tufwear-germany.de/en/blogs/news/was-sind-die-besten-boxhandschuhe-der-boxhandschuh-guide-fur-deinen-kauf>
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://issuu.com/punchequipment/docs/get_the_best_boxing_gloves_for_a_winning_performan> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://nypost.com/robots.txt> (referer: None)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://mmagearaddict.com/best-boxing-gloves/>
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.nakmuaywholesale.com/top-3-boxing-gloves-for-small-hands-2022/>
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://issuu.com/punchequipment/docs/get_the_best_boxing_gloves_for_a_winning_performan>
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://warriorpunch.com/robots.txt> (referer: None)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr>
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://yokkao.com/pages/boxing-gloves-guide> (referer: None)
2023-05-05 06:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://topboxer.com/collections/boxing-gloves> (referer: None)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://yokkao.com/pages/boxing-gloves-guide>
2023-05-05 06:52:41 [seo_spider] ERROR: Invalid control character at: line 15 column 21 (char 385) 200 https://topboxer.com/collections/boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 15 column 21 (char 385)
2023-05-05 06:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://topboxer.com/collections/boxing-gloves>
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://nypost.com/article/best-boxing-equipment-per-experts/> (referer: None)
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://kdvr.com/robots.txt> (referer: None)
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cashkaro.com/robots.txt> (referer: None)
2023-05-05 06:52:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://nypost.com/article/best-boxing-equipment-per-experts/>
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://origympersonaltrainercourses.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.popsugar.com/robots.txt> (referer: None)
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.expertreviews.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cashkaro.com/blog/best-boxing-gloves-in-india/201246> (referer: None)
2023-05-05 06:52:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cashkaro.com/blog/best-boxing-gloves-in-india/201246>
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://warriorpunch.com/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.popsugar.com/fitness/Best-Boxing-Gloves-Women-45472473> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://kdvr.com/reviews/br/sports-fitness-br/boxing-br/best-title-boxing-gloves/> (referer: None)
2023-05-05 06:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://warriorpunch.com/best-boxing-gloves-for-beginners/>
2023-05-05 06:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.popsugar.com/fitness/Best-Boxing-Gloves-Women-45472473>
2023-05-05 06:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://kdvr.com/reviews/br/sports-fitness-br/boxing-br/best-title-boxing-gloves/>
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://branded.disruptsports.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.expertreviews.co.uk/health-and-grooming/1407584/best-boxing-gloves> (referer: None)
2023-05-05 06:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.expertreviews.co.uk/health-and-grooming/1407584/best-boxing-gloves>
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://branded.disruptsports.com/blogs/blog/which-boxing-gloves-to-buy-for-beginners> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.reddit.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightquality.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://branded.disruptsports.com/blogs/blog/which-boxing-gloves-to-buy-for-beginners>
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.flipkart.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.infinitudefight.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 10 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 14 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 16 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 35 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 42 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 44 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 45 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 46 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 47 without any user agent to enforce it on.
2023-05-05 06:52:43 [protego] DEBUG: Rule at line 69 without any user agent to enforce it on.
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://absolutelymartialarts.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.k2promos.com/robots.txt> (referer: None)
2023-05-05 06:52:43 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.infinitudefight.com/buy-the-best-boxing-gloves/> (referer: None)
2023-05-05 06:52:44 [seo_spider] ERROR: Expecting value: line 1 column 1 (char 0) 200 https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2023-05-05 06:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves>
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightingadvice.com/robots.txt> (referer: None)
2023-05-05 06:52:44 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.infinitudefight.com/buy-the-best-boxing-gloves/>
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.proboxingequipment.com/robots.txt> (referer: None)
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.proboxingequipment.com/Boxing-Gloves_c_196.html> (referer: None)
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://glovesaddict.com/robots.txt> (referer: None)
2023-05-05 06:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.proboxingequipment.com/Boxing-Gloves_c_196.html>
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://absolutelymartialarts.com/best-boxing-gloves-beginners/> (referer: None)
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.reddit.com/r/amateur_boxing/comments/2ykhau/the_top_15_best_boxing_gloves_ranking_the_best/> (referer: None)
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.healthyprinciples.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://absolutelymartialarts.com/best-boxing-gloves-beginners/>
2023-05-05 06:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.reddit.com/r/amateur_boxing/comments/2ykhau/the_top_15_best_boxing_gloves_ranking_the_best/>
2023-05-05 06:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mmahive.com/robots.txt> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bwsgym.com/robots.txt> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightquality.com/2018/10/12/best-custom-gloves/> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightingadvice.com/best-boxing-gloves-under-200/> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.k2promos.com/best-beginner-boxing-gloves/> (referer: None)
2023-05-05 06:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://fightquality.com/2018/10/12/best-custom-gloves/>
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.flipkart.com/sports/boxing/boxing-gloves/pr?sid=abc%2Cppq%2Cbb6&page=2> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dontwasteyourmoney.com/robots.txt> (referer: None)
2023-05-05 06:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://fightingadvice.com/best-boxing-gloves-under-200/>
2023-05-05 06:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.k2promos.com/best-beginner-boxing-gloves/>
2023-05-05 06:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/sports/boxing/boxing-gloves/pr?sid=abc%2Cppq%2Cbb6&page=2>
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bwsgym.com/etiquette-produit/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://middleeasy.com/robots.txt> (referer: None)
2023-05-05 06:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bwsgym.com/etiquette-produit/best-boxing-gloves/>
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.healthyprinciples.co.uk/best-boxing-gloves-for-kids-review/> (referer: None)
2023-05-05 06:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.bestproducts.com/robots.txt> (referer: None)
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.healthyprinciples.co.uk/best-boxing-gloves-for-kids-review/>
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mmahive.com/best-boxing-gloves-for-wrist-support/> (referer: None)
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.momjunction.com/robots.txt> (referer: None)
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dontwasteyourmoney.com/products/hawk-sports-heavy-bag-boxing-gloves/> (referer: None)
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.mmahive.com/best-boxing-gloves-for-wrist-support/>
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.dontwasteyourmoney.com/products/hawk-sports-heavy-bag-boxing-gloves/>
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://glovesaddict.com/best-boxing-gloves-on-amazon/> (referer: None)
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://middleeasy.com/reviews/gear/gloves-cardio-kickboxing/> (referer: None)
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://breakinggrips.com/robots.txt> (referer: None)
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://glovesaddict.com/best-boxing-gloves-on-amazon/>
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://middleeasy.com/reviews/gear/gloves-cardio-kickboxing/>
2023-05-05 06:52:46 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/robots.txt> (failed 1 times): 429 Unknown Status
2023-05-05 06:52:46 [py.warnings] WARNING: /home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/scrapy/core/engine.py:276: ScrapyDeprecationWarning: Passing a 'spider' argument to ExecutionEngine.download is deprecated
  return self.download(result, spider) if isinstance(result, Request) else result

2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.momjunction.com/articles/best-boxing-gloves-for-kids_00514921/> (referer: None)
2023-05-05 06:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.momjunction.com/articles/best-boxing-gloves-for-kids_00514921/>
2023-05-05 06:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.bestproducts.com/fitness/equipment/g1009/boxing-gloves-mitts/> (referer: None)
2023-05-05 06:52:47 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/robots.txt> (failed 2 times): 429 Unknown Status
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://breakinggrips.com/best-kids-boxing-gloves/> (referer: None)
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mightyfighter.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.bestproducts.com/fitness/equipment/g1009/boxing-gloves-mitts/>
2023-05-05 06:52:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://breakinggrips.com/best-kids-boxing-gloves/>
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.stylecraze.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.fightingking.com/robots.txt> (failed 3 times): 429 Unknown Status
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.fightingking.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [protego] DEBUG: Rule at line 2 without any user agent to enforce it on.
2023-05-05 06:52:47 [protego] DEBUG: Rule at line 6 without any user agent to enforce it on.
2023-05-05 06:52:47 [protego] DEBUG: Rule at line 10 without any user agent to enforce it on.
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://linealboxing.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.wbcme.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:47 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 1 times): 429 Unknown Status
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blackbeltmag.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mightyfighter.com/top-10-best-boxing-gloves/> (referer: None)
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://smartmma.com/robots.txt> (referer: None)
2023-05-05 06:52:47 [protego] DEBUG: Rule at line 1 without any user agent to enforce it on.
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://linealboxing.com/best-boxing-glove-brands-2022/> (referer: None)
2023-05-05 06:52:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.mightyfighter.com/top-10-best-boxing-gloves/>
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.stylecraze.com/articles/best-heavy-bag-gloves/> (referer: None)
2023-05-05 06:52:47 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 2 times): 429 Unknown Status
2023-05-05 06:52:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://linealboxing.com/best-boxing-glove-brands-2022/>
2023-05-05 06:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.wbcme.co.uk/ringside/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 06:52:48 [seo_spider] ERROR: Invalid control character at: line 28 column 64 (char 1740) 200 https://www.stylecraze.com/articles/best-heavy-bag-gloves/
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 28 column 64 (char 1740)
2023-05-05 06:52:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stylecraze.com/articles/best-heavy-bag-gloves/>
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.kreedon.com/robots.txt> (referer: None)
2023-05-05 06:52:48 [scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET https://www.kreedon.com/best-boxing-gloves-brands/>
2023-05-05 06:52:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.wbcme.co.uk/ringside/best-boxing-gloves-for-beginners/>
2023-05-05 06:52:48 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 3 times): 429 Unknown Status
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (referer: None)
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.attacktheback.com/robots.txt> (referer: None)
2023-05-05 06:52:48 [scrapy.core.scraper] DEBUG: Scraped from <429 https://www.fightingking.com/boxing-gloves-brands-reviews/>
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingear.com/robots.txt> (referer: None)
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blackbeltmag.com/best-boxing-gloves> (referer: None)
2023-05-05 06:52:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://blackbeltmag.com/best-boxing-gloves>
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesuk.com/robots.txt> (referer: None)
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.attacktheback.com/best-cheap-boxing-gloves/> (referer: None)
2023-05-05 06:52:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.attacktheback.com/best-cheap-boxing-gloves/>
2023-05-05 06:52:48 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://sites.google.com/view> from <GET https://www.boxingear.com/shop-2/grant-gloves/lace-up/best-boxing-gloves-for-sparring-grant-gloves/>
2023-05-05 06:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fullcontactway.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesuk.com/blogs/news/what-are-the-best-boxing-gloves-for-beginners> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fitnessbaddies.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bestreviews.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingison.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cletoreyesuk.com/blogs/news/what-are-the-best-boxing-gloves-for-beginners>
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://thewiredshopper.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 28 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 37 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 38 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 39 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 40 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 41 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 42 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 44 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 45 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 46 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 47 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 48 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 49 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 50 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 51 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 52 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 53 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 54 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 55 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 56 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 57 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 58 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 59 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 60 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 61 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 67 without any user agent to enforce it on.
2023-05-05 06:52:49 [protego] DEBUG: Rule at line 72 without any user agent to enforce it on.
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msn.com/robots.txt> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fullcontactway.com/best-sparring-gloves/> (referer: None)
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://thewiredshopper.com/best-boxing-gloves-to-buy/> (referer: None)
2023-05-05 06:52:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.fullcontactway.com/best-sparring-gloves/>
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://smartmma.com/best-boxing-gloves-for-heavy-bag/> (referer: None)
2023-05-05 06:52:49 [scrapy.core.scraper] DEBUG: Scraped from <403 https://thewiredshopper.com/best-boxing-gloves-to-buy/>
2023-05-05 06:52:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://smartmma.com/best-boxing-gloves-for-heavy-bag/>
2023-05-05 06:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msn.com/en-gb/lifestyle/rf-best-products-uk/best-boxing-gloves-for-men-12oz-reviews> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bestreviews.com/sports-fitness/boxing/best-boxing-gloves> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gloveworx.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.msn.com/en-gb/lifestyle/rf-best-products-uk/best-boxing-gloves-for-men-12oz-reviews>
2023-05-05 06:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bestreviews.com/sports-fitness/boxing/best-boxing-gloves>
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fitnessbaddies.com/amateur-boxing-gloves/> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.standard.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sites.google.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.fitnessbaddies.com/amateur-boxing-gloves/>
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.pragmaticmom.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lowkickmma.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.standard.co.uk/shopping/esbest/health-fitness/fitness-wear/best-womens-boxing-gloves-for-beginners-a4272321.html> (referer: None)
2023-05-05 06:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.standard.co.uk/shopping/esbest/health-fitness/fitness-wear/best-womens-boxing-gloves-for-beginners-a4272321.html>
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingready.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.sportsdirect.com/robots.txt> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lowkickmma.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:50 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://sites.google.com/view> (referer: None)
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.lowkickmma.com/best-boxing-gloves/>
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <404 https://sites.google.com/view>
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://themmaguru.com/robots.txt> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dmarge.com/robots.txt> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.pragmaticmom.com/2019/11/best-boxing-gloves-for-women/> (referer: None)
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.pragmaticmom.com/2019/11/best-boxing-gloves-for-women/>
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dmarge.com/best-boxing-gloves> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingison.com/best-boxing-gloves-for-training-and-sparring/> (referer: None)
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.dmarge.com/best-boxing-gloves>
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.sportsdirect.com/boxing/boxing-gloves> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gloveworx.com/blog/how-choose-best-boxing-gloves-beginners/> (referer: None)
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.boxingison.com/best-boxing-gloves-for-training-and-sparring/>
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thechamplair.com/robots.txt> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://brawlbros.com/robots.txt> (referer: None)
2023-05-05 06:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://themmaguru.com/best-youth-boxing-gloves/> (referer: None)
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.sportsdirect.com/boxing/boxing-gloves>
2023-05-05 06:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gloveworx.com/blog/how-choose-best-boxing-gloves-beginners/>
2023-05-05 06:52:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://themmaguru.com/best-youth-boxing-gloves/>
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nytimes.com/robots.txt> (referer: None)
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearhungry.com/robots.txt> (referer: None)
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hungry4fitness.co.uk/robots.txt> (referer: None)
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://findbestboxinggloves.com/robots.txt> (referer: None)
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://hiconsumption.com/robots.txt> (referer: None)
2023-05-05 06:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thechamplair.com/sports/best-beginners-boxing-gloves/> (referer: None)
2023-05-05 06:52:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://thechamplair.com/sports/best-beginners-boxing-gloves/>
2023-05-05 06:52:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://brawlbros.com/best-boxing-gloves-on-amazon/> (referer: None)
2023-05-05 06:52:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://brawlbros.com/best-boxing-gloves-on-amazon/>
2023-05-05 06:52:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://hiconsumption.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:53 [scrapy.core.scraper] DEBUG: Scraped from <200 https://hiconsumption.com/best-boxing-gloves/>
2023-05-05 06:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hungry4fitness.co.uk/post/10-best-boxing-mitts-an-ultimate-guide> (referer: None)
2023-05-05 06:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearhungry.com/best-boxing-gloves/> (referer: None)
2023-05-05 06:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.hungry4fitness.co.uk/post/10-best-boxing-mitts-an-ultimate-guide>
2023-05-05 06:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gearhungry.com/best-boxing-gloves/>
2023-05-05 06:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingready.com/ringside/best-boxing-gloves-wrist-support/> (referer: None)
2023-05-05 06:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxingready.com/ringside/best-boxing-gloves-wrist-support/>
2023-05-05 06:52:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nytimes.com/video/style/1194840632119/gear-test-boxing-gloves.html> (referer: None)
2023-05-05 06:52:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.nytimes.com/video/style/1194840632119/gear-test-boxing-gloves.html>
2023-05-05 06:52:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://findbestboxinggloves.com/best-boxing-gloves-for-heavy-bag-the-complete-guide/> (referer: None)
2023-05-05 06:52:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://findbestboxinggloves.com/best-boxing-gloves-for-heavy-bag-the-complete-guide/>
2023-05-05 06:53:33 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 196 pages/min), scraped 97 items (at 97 items/min)
2023-05-05 06:54:33 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 06:54:49 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://skilspo.com/robots.txt> (failed 1 times): TCP connection timed out: 110: Connection timed out.
@eliasdabbas
Owner

Thanks @mirfan899. I tried those URLs and didn't run into any problems: only 3 were forbidden by robots.txt rules, and a few pages had JSON-LD parsing errors.
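
For reference, the JSON-LD errors in the log above ("Invalid control character" and "Expecting value") are logged per page, and each affected page is still reported as scraped right after, so they are unlikely to be what stalls the crawl. The sketch below is not how advertools parses JSON-LD; it is a minimal standalone illustration, with a hypothetical helper named `parse_json_ld`, of how the standard `json` module reacts to both kinds of input and how `strict=False` tolerates raw control characters inside strings.

```python
import json


def parse_json_ld(snippets):
    """Parse JSON-LD strings, collecting error messages instead of raising.

    Hypothetical helper for illustration only.
    """
    parsed, errors = [], []
    for text in snippets:
        try:
            # strict=False allows raw control characters (tabs, newlines)
            # inside JSON strings, which the default strict mode rejects.
            parsed.append(json.loads(text, strict=False))
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return parsed, errors


# A string value containing a literal tab, and an empty <script> body.
with_tab = '{"name": "Boxing\tgloves"}'
empty_script = ''

parsed, errors = parse_json_ld([with_tab, empty_script])
print(parsed)
print(errors)
```

The empty snippet still fails, which is expected: there is nothing to parse, so the only sensible handling is to record the error and move on.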

Your logs show that 89 of the 100 responses have a 200 status code.
One of the URLs seems to have timed out.

Please check whether your code uses any special settings that some domains might be blocking (these URLs come from many different domains, and a few might have issues).
Also check the output file to see which URLs were scraped and which weren't.
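
One possible way to do that check is sketched below. It assumes the output is a JSON lines file whose rows carry `url` and `status` fields; the function name `crawl_summary` is made up for this example. Note that if a request was redirected, the `url` in the output may be the final address, so a redirected URL can show up as missing here.

```python
import json


def crawl_summary(requested_urls, jl_lines):
    """Compare requested URLs with the rows of a JSON lines crawl output.

    Returns (missing, non_200): the requested URLs with no row in the
    output, and a dict of output URLs whose status is not 200.
    Assumes each row has "url" and "status" keys.
    """
    seen = {}
    for line in jl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        row = json.loads(line)
        seen[row.get("url")] = row.get("status")

    requested = [u.strip() for u in requested_urls if u.strip()]
    missing = [u for u in requested if u not in seen]
    non_200 = {u: s for u, s in seen.items() if s != 200}
    return missing, non_200
```

Since open files iterate line by line, both arguments can be file objects, for example `crawl_summary(open("urls.txt"), open("pages.jl"))`.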

@mirfan899
Author

Here is my code.

import advertools as adv
import pandas as pd


urls = open("urls.txt").readlines()

adv.crawl(urls, "pages.jl", follow_links=False)
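
A possible variation on the script above, offered as a sketch rather than a confirmed fix: `readlines()` keeps the trailing newline on every URL, so stripping them is a harmless precaution, and the log shows the crawl sitting idle until a "TCP connection timed out" retry, which suggests slow hosts dominate the run time. Scrapy's defaults are a 180 second `DOWNLOAD_TIMEOUT` and 2 retries (`RETRY_TIMES`), so lowering both, and capping total run time with `CLOSESPIDER_TIMEOUT`, may shorten the wait. The values below are illustrative assumptions, and `load_urls` and `run_crawl` are hypothetical names.

```python
def load_urls(lines):
    """Strip whitespace, drop blank lines and duplicates, keep order."""
    cleaned = []
    for line in lines:
        url = line.strip()
        if url and url not in cleaned:
            cleaned.append(url)
    return cleaned


# Standard Scrapy setting names; the values are illustrative only.
CUSTOM_SETTINGS = {
    "DOWNLOAD_TIMEOUT": 30,      # seconds to wait per request (default 180)
    "RETRY_TIMES": 1,            # retries after the first failure (default 2)
    "CLOSESPIDER_TIMEOUT": 300,  # stop the whole crawl after 5 minutes
}


def run_crawl(url_file="urls.txt", output_file="pages.jl"):
    # Imported here so the helpers above can be used without advertools.
    import advertools as adv

    with open(url_file, encoding="utf-8") as fh:
        urls = load_urls(fh)
    adv.crawl(urls, output_file, follow_links=False,
              custom_settings=CUSTOM_SETTINGS)
```

Calling `run_crawl()` runs the same crawl as before with the tighter limits; any URL that exceeds them would be logged as a failure instead of holding up the rest.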

Okay, here is the output of running the code. It takes around 13 minutes to complete.

/home/irfan/.pyenv/versions/TES/bin/python /home/irfan/PycharmProjects/TES-SAAS/tests/scprapping.py 
2023-05-05 13:52:16 [scrapy.utils.log] INFO: Scrapy 2.6.1 started (bot: scrapybot)
2023-05-05 13:52:16 [scrapy.utils.log] INFO: Versions: lxml 4.9.2.0, libxml2 2.9.14, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 22.4.0, Python 3.7.9 (default, Jan 23 2022, 07:32:51) - [GCC 7.5.0], pyOpenSSL 22.0.0 (OpenSSL 3.0.3 3 May 2022), cryptography 37.0.2, Platform Linux-5.4.0-148-generic-x86_64-with-debian-bullseye-sid
2023-05-05 13:52:16 [scrapy.crawler] INFO: Overridden settings:
{'ROBOTSTXT_OBEY': True,
 'SPIDER_LOADER_WARN_ONLY': True,
 'USER_AGENT': 'advertools/0.13.2'}
2023-05-05 13:52:16 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.epollreactor.EPollReactor
2023-05-05 13:52:16 [scrapy.extensions.telnet] INFO: Telnet Password: 548495f04ca5182a
2023-05-05 13:52:16 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.feedexport.FeedExporter',
 'scrapy.extensions.logstats.LogStats']
2023-05-05 13:52:16 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
 'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2023-05-05 13:52:16 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2023-05-05 13:52:16 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2023-05-05 13:52:16 [scrapy.core.engine] INFO: Spider opened
2023-05-05 13:52:17 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2023-05-05 13:52:17 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
2023-05-05 13:52:17 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sweetscienceoffighting.com/robots.txt> (referer: None)
2023-05-05 13:52:17 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.rollingstone.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [filelock] DEBUG: Attempting to acquire lock 140224916848720 on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 13:52:18 [filelock] DEBUG: Lock 140224916848720 acquired on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 13:52:18 [filelock] DEBUG: Attempting to release lock 140224916848720 on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 13:52:18 [filelock] DEBUG: Lock 140224916848720 released on /home/irfan/.cache/python-tldextract/3.7.9.final__TES__f2586e__tldextract-3.3.0/publicsuffix.org-tlds/de84b5ca2167d4c83e38fb162f2e8738.tldextract.json.lock
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.t3.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.si.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearpatrol.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.verywellfit.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.shape.com/robots.txt> (referer: None)
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.si.com/showcase/fitness/best-boxing-gloves> (referer: None)
2023-05-05 13:52:18 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.si.com/showcase/fitness/best-boxing-gloves>
2023-05-05 13:52:18 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingglovesreviews.com/robots.txt> (referer: None)
2023-05-05 13:52:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.t3.com/features/best-boxing-gloves> (referer: None)
2023-05-05 13:52:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.t3.com/features/best-boxing-gloves>
2023-05-05 13:52:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.rollingstone.com/product-recommendations/lifestyle/best-boxing-gloves-1234690811/> (referer: None)
2023-05-05 13:52:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/robots.txt> (referer: None)
2023-05-05 13:52:19 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.rollingstone.com/product-recommendations/lifestyle/best-boxing-gloves-1234690811/>
2023-05-05 13:52:19 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bleacherreport.com/robots.txt> (referer: None)
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://expertboxing.com/robots.txt> (referer: None)
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.shape.com/fitness/gear/best-boxing-gloves> (referer: None)
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.verywellfit.com/best-boxing-gloves-4158917> (referer: None)
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sweetscienceoffighting.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:52:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.shape.com/fitness/gear/best-boxing-gloves>
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thekarateblog.com/robots.txt> (referer: None)
2023-05-05 13:52:20 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxupnation.com/robots.txt> (referer: None)
2023-05-05 13:52:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.verywellfit.com/best-boxing-gloves-4158917>
2023-05-05 13:52:20 [scrapy.core.scraper] DEBUG: Scraped from <200 https://sweetscienceoffighting.com/best-boxing-gloves/>
2023-05-05 13:52:21 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingglovesreviews.com/top-ten-boxing-gloves/> (referer: None)
2023-05-05 13:52:22 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxingglovesreviews.com/top-ten-boxing-gloves/>
2023-05-05 13:52:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxupnation.com/blogs/news/my-top-5-favorite-boxing-glove-brands-and-why> (referer: None)
2023-05-05 13:52:22 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxupnation.com/blogs/news/my-top-5-favorite-boxing-glove-brands-and-why>
2023-05-05 13:52:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.amazon.com/robots.txt> (referer: None)
2023-05-05 13:52:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.tabletenniscoach.me.uk/robots.txt> (referer: None)
2023-05-05 13:52:22 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearpatrol.com/fitness/g40446087/best-boxing-gloves/> (referer: None)
2023-05-05 13:52:23 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gearpatrol.com/fitness/g40446087/best-boxing-gloves/>
2023-05-05 13:52:23 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://myboxinglife.com/robots.txt> (referer: None)
2023-05-05 13:52:23 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wayofmartialarts.com/robots.txt> (referer: None)
2023-05-05 13:52:23 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 1 times): 429 Unknown Status
2023-05-05 13:52:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bleacherreport.com/articles/1286577-breaking-down-different-brands-of-boxing-gloves-worn-by-the-pros> (referer: None)
2023-05-05 13:52:24 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bleacherreport.com/articles/1286577-breaking-down-different-brands-of-boxing-gloves-worn-by-the-pros>
2023-05-05 13:52:24 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thekarateblog.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:52:24 [scrapy.core.scraper] DEBUG: Scraped from <200 https://thekarateblog.com/best-boxing-gloves/>
2023-05-05 13:52:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.tabletenniscoach.me.uk/sport-equipment-guides/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 13:52:25 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.tabletenniscoach.me.uk/sport-equipment-guides/best-boxing-gloves-for-beginners/>
2023-05-05 13:52:25 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 2 times): 429 Unknown Status
2023-05-05 13:52:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://expertboxing.com/best-boxing-gloves-review> (referer: None)
2023-05-05 13:52:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/watch?v=tWoucO2nIlE> (referer: None)
2023-05-05 13:52:25 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.dickssportinggoods.com/robots.txt> (referer: None)
2023-05-05 13:52:25 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hayabusafight.com/robots.txt> (referer: None)
2023-05-05 13:52:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://expertboxing.com/best-boxing-gloves-review>
2023-05-05 13:52:26 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://myboxinglife.com/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 13:52:26 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blog.joinfightcamp.com/robots.txt> (referer: None)
2023-05-05 13:52:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.youtube.com/watch?v=tWoucO2nIlE>
2023-05-05 13:52:26 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (failed 3 times): 429 Unknown Status
2023-05-05 13:52:26 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131> (referer: None)
2023-05-05 13:52:26 [scrapy.core.scraper] DEBUG: Scraped from <200 https://myboxinglife.com/best-boxing-gloves-for-beginners/>
2023-05-05 13:52:26 [scrapy.core.scraper] DEBUG: Scraped from <429 https://www.amazon.com/Best-Sellers-Boxing-Training-Gloves/zgbs/sporting-goods/3400131>
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://wayofmartialarts.com/best-boxing-gloves-worth-your-money/> (referer: None)
2023-05-05 13:52:27 [scrapy.core.scraper] DEBUG: Scraped from <200 https://wayofmartialarts.com/best-boxing-gloves-worth-your-money/>
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.youtube.com/watch?v=rHepbZOCxfY> (referer: None)
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://revgear.com/robots.txt> (referer: None)
2023-05-05 13:52:27 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.youtube.com/watch?v=rHepbZOCxfY>
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.dickssportinggoods.com/o/best-boxing-gloves-for-pad-work> (referer: None)
2023-05-05 13:52:27 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.dickssportinggoods.com/o/best-boxing-gloves-for-pad-work>
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blog.joinfightcamp.com/boxing-equipment/how-to-choose-the-best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 13:52:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ebay.com/robots.txt> (referer: None)
2023-05-05 13:52:28 [scrapy.core.scraper] DEBUG: Scraped from <200 https://blog.joinfightcamp.com/boxing-equipment/how-to-choose-the-best-boxing-gloves-for-beginners/>
2023-05-05 13:52:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hayabusafight.com/products/t3-boxing-gloves> (referer: None)
2023-05-05 13:52:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.walmart.com/robots.txt> (referer: None)
2023-05-05 13:52:28 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.hayabusafight.com/products/t3-boxing-gloves>
2023-05-05 13:52:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ebay.com/t/Boxing-Gloves/30102/bn_1943751> (referer: None)
2023-05-05 13:52:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.quora.com/robots.txt> (referer: None)
2023-05-05 13:52:28 [scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET https://www.quora.com/What-companies-make-the-best-quality-boxing-gloves>
2023-05-05 13:52:28 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://made4fighters.com/robots.txt> (referer: None)
2023-05-05 13:52:29 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.ebay.com/t/Boxing-Gloves/30102/bn_1943751>
2023-05-05 13:52:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://m.timesofindia.com/robots.txt> (referer: None)
2023-05-05 13:52:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.everlast.com/robots.txt> (referer: None)
2023-05-05 13:52:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://revgear.com/gear/boxing-gloves/> (referer: None)
2023-05-05 13:52:29 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr> from <GET https://m.timesofindia.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms>
2023-05-05 13:52:29 [scrapy.core.scraper] DEBUG: Scraped from <200 https://revgear.com/gear/boxing-gloves/>
2023-05-05 13:52:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.walmart.com/c/lists/top-rated-boxing-gloves> (referer: None)
2023-05-05 13:52:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves> (referer: None)
2023-05-05 13:52:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesboxing.com/robots.txt> (referer: None)
2023-05-05 13:52:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.titleboxing.com/robots.txt> (referer: None)
2023-05-05 13:52:30 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.walmart.com/c/lists/top-rated-boxing-gloves>
2023-05-05 13:52:30 [seo_spider] ERROR: Invalid control character at: line 20 column 226 (char 698) 200 https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 20 column 226 (char 698)
2023-05-05 13:52:30 [scrapy.core.scraper] DEBUG: Scraped from <200 https://made4fighters.com/blogs/default-blog/top-womens-boxing-gloves>
2023-05-05 13:52:30 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.everlast.com/fight/boxing/gloves> (referer: None)
2023-05-05 13:52:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/robots.txt> (referer: None)
2023-05-05 13:52:31 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.everlast.com/fight/boxing/gloves>
2023-05-05 13:52:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sanabulsports.com/robots.txt> (referer: None)
2023-05-05 13:52:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bravose.com/robots.txt> (referer: None)
2023-05-05 13:52:31 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msmfightshop.com/robots.txt> (referer: None)
2023-05-05 13:52:32 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sanabulsports.com/blogs/news/the-best-boxing-gloves-for-training> (referer: None)
2023-05-05 13:52:32 [scrapy.core.scraper] DEBUG: Scraped from <200 https://sanabulsports.com/blogs/news/the-best-boxing-gloves-for-training>
2023-05-05 13:52:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msmfightshop.com/blogs/news/top-3-boxing-gloves-in-the-world> (referer: None)
2023-05-05 13:52:33 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.msmfightshop.com/blogs/news/top-3-boxing-gloves-in-the-world>
2023-05-05 13:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesboxing.com/> (referer: None)
2023-05-05 13:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.titleboxing.com/gloves> (referer: None)
2023-05-05 13:52:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cletoreyesboxing.com/>
2023-05-05 13:52:34 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.titleboxing.com/gloves>
2023-05-05 13:52:34 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://anthonyjoshua.com/robots.txt> (referer: None)
2023-05-05 13:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-professionals/articleshow/97128538.cms> (referer: None)
2023-05-05 13:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-professionals/articleshow/97128538.cms>
2023-05-05 13:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bravose.com/collections/training-gloves> (referer: None)
2023-05-05 13:52:35 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bravose.com/collections/training-gloves>
2023-05-05 13:52:35 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://mmagearaddict.com/robots.txt> (referer: None)
2023-05-05 13:52:36 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://anthonyjoshua.com/blogs/news/anthony-joshua-how-to-choose-the-best-boxing-gloves> (referer: None)
2023-05-05 13:52:36 [scrapy.core.scraper] DEBUG: Scraped from <200 https://anthonyjoshua.com/blogs/news/anthony-joshua-how-to-choose-the-best-boxing-gloves>
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ringsport.com.au/robots.txt> (referer: None)
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://tufwear-germany.de/robots.txt> (referer: None)
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr> (referer: None)
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://issuu.com/robots.txt> (referer: None)
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://yokkao.com/robots.txt> (referer: None)
2023-05-05 13:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://timesofindia.indiatimes.com/most-searched-products/sports-equipment/boxing-gloves-for-beginners-best-picks/articleshow/97912567.cms?from=mdr>
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1> (referer: None)
2023-05-05 13:52:37 [seo_spider] ERROR: Invalid control character at: line 5 column 19 (char 78) 200 https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 5 column 19 (char 78)
2023-05-05 13:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.ringsport.com.au/blogs/ringsport-blog/boxing-glove-guide-part-1>
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://topboxer.com/robots.txt> (referer: None)
2023-05-05 13:52:37 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://mmagearaddict.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:52:37 [scrapy.core.scraper] DEBUG: Scraped from <200 https://mmagearaddict.com/best-boxing-gloves/>
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://tufwear-germany.de/en/blogs/news/was-sind-die-besten-boxhandschuhe-der-boxhandschuh-guide-fur-deinen-kauf> (referer: None)
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://issuu.com/punchequipment/docs/get_the_best_boxing_gloves_for_a_winning_performan> (referer: None)
2023-05-05 13:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://tufwear-germany.de/en/blogs/news/was-sind-die-besten-boxhandschuhe-der-boxhandschuh-guide-fur-deinen-kauf>
2023-05-05 13:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://issuu.com/punchequipment/docs/get_the_best_boxing_gloves_for_a_winning_performan>
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://nypost.com/robots.txt> (referer: None)
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nakmuaywholesale.com/robots.txt> (referer: None)
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://warriorpunch.com/robots.txt> (referer: None)
2023-05-05 13:52:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://yokkao.com/pages/boxing-gloves-guide> (referer: None)
2023-05-05 13:52:38 [scrapy.core.scraper] DEBUG: Scraped from <200 https://yokkao.com/pages/boxing-gloves-guide>
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://topboxer.com/collections/boxing-gloves> (referer: None)
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.infinitudefight.com/robots.txt> (referer: None)
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 10 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 14 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 16 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 35 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 42 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 44 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 45 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 46 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 47 without any user agent to enforce it on.
2023-05-05 13:52:39 [protego] DEBUG: Rule at line 69 without any user agent to enforce it on.
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://origympersonaltrainercourses.co.uk/robots.txt> (referer: None)
2023-05-05 13:52:39 [seo_spider] ERROR: Invalid control character at: line 15 column 21 (char 385) 200 https://topboxer.com/collections/boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 15 column 21 (char 385)
2023-05-05 13:52:39 [scrapy.core.scraper] DEBUG: Scraped from <200 https://topboxer.com/collections/boxing-gloves>
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.infinitudefight.com/buy-the-best-boxing-gloves/> (referer: None)
2023-05-05 13:52:39 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.infinitudefight.com/buy-the-best-boxing-gloves/>
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cashkaro.com/robots.txt> (referer: None)
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://nypost.com/article/best-boxing-equipment-per-experts/> (referer: None)
2023-05-05 13:52:39 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://kdvr.com/robots.txt> (referer: None)
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://nypost.com/article/best-boxing-equipment-per-experts/>
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nakmuaywholesale.com/top-3-boxing-gloves-for-small-hands-2022/> (referer: None)
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.popsugar.com/robots.txt> (referer: None)
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://warriorpunch.com/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.nakmuaywholesale.com/top-3-boxing-gloves-for-small-hands-2022/>
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://warriorpunch.com/best-boxing-gloves-for-beginners/>
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.expertreviews.co.uk/robots.txt> (referer: None)
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://branded.disruptsports.com/robots.txt> (referer: None)
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cashkaro.com/blog/best-boxing-gloves-in-india/201246> (referer: None)
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cashkaro.com/blog/best-boxing-gloves-in-india/201246>
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://branded.disruptsports.com/blogs/blog/which-boxing-gloves-to-buy-for-beginners> (referer: None)
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.expertreviews.co.uk/health-and-grooming/1407584/best-boxing-gloves> (referer: None)
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://branded.disruptsports.com/blogs/blog/which-boxing-gloves-to-buy-for-beginners>
2023-05-05 13:52:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightquality.com/robots.txt> (referer: None)
2023-05-05 13:52:40 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.expertreviews.co.uk/health-and-grooming/1407584/best-boxing-gloves>
2023-05-05 13:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.popsugar.com/fitness/Best-Boxing-Gloves-Women-45472473> (referer: None)
2023-05-05 13:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.popsugar.com/fitness/Best-Boxing-Gloves-Women-45472473>
2023-05-05 13:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://kdvr.com/reviews/br/sports-fitness-br/boxing-br/best-title-boxing-gloves/> (referer: None)
2023-05-05 13:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves> (referer: None)
2023-05-05 13:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://kdvr.com/reviews/br/sports-fitness-br/boxing-br/best-title-boxing-gloves/>
2023-05-05 13:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightingadvice.com/robots.txt> (referer: None)
2023-05-05 13:52:41 [seo_spider] ERROR: Expecting value: line 1 column 1 (char 0) 200 https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
2023-05-05 13:52:41 [scrapy.core.scraper] DEBUG: Scraped from <200 https://origympersonaltrainercourses.co.uk/blog/best-boxing-gloves>
2023-05-05 13:52:41 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.flipkart.com/robots.txt> (referer: None)
2023-05-05 13:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.healthyprinciples.co.uk/robots.txt> (referer: None)
2023-05-05 13:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightquality.com/2018/10/12/best-custom-gloves/> (referer: None)
2023-05-05 13:52:42 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.k2promos.com/robots.txt> (referer: None)
2023-05-05 13:52:42 [scrapy.core.scraper] DEBUG: Scraped from <200 https://fightquality.com/2018/10/12/best-custom-gloves/>
2023-05-05 13:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://fightingadvice.com/best-boxing-gloves-under-200/> (referer: None)
2023-05-05 13:52:43 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://breakinggrips.com/robots.txt> (referer: None)
2023-05-05 13:52:43 [scrapy.core.scraper] DEBUG: Scraped from <200 https://fightingadvice.com/best-boxing-gloves-under-200/>
2023-05-05 13:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.healthyprinciples.co.uk/best-boxing-gloves-for-kids-review/> (referer: None)
2023-05-05 13:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.healthyprinciples.co.uk/best-boxing-gloves-for-kids-review/>
2023-05-05 13:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.k2promos.com/best-beginner-boxing-gloves/> (referer: None)
2023-05-05 13:52:44 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.k2promos.com/best-beginner-boxing-gloves/>
2023-05-05 13:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.proboxingequipment.com/robots.txt> (referer: None)
2023-05-05 13:52:44 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mmahive.com/robots.txt> (referer: None)
2023-05-05 13:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.flipkart.com/sports/boxing/boxing-gloves/pr?sid=abc%2Cppq%2Cbb6&page=2> (referer: None)
2023-05-05 13:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bwsgym.com/robots.txt> (referer: None)
2023-05-05 13:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.flipkart.com/sports/boxing/boxing-gloves/pr?sid=abc%2Cppq%2Cbb6&page=2>
2023-05-05 13:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dontwasteyourmoney.com/robots.txt> (referer: None)
2023-05-05 13:52:45 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.proboxingequipment.com/Boxing-Gloves_c_196.html> (referer: None)
2023-05-05 13:52:45 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.proboxingequipment.com/Boxing-Gloves_c_196.html>
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bwsgym.com/etiquette-produit/best-boxing-gloves/> (referer: None)
2023-05-05 13:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bwsgym.com/etiquette-produit/best-boxing-gloves/>
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://glovesaddict.com/robots.txt> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.reddit.com/robots.txt> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mmahive.com/best-boxing-gloves-for-wrist-support/> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dontwasteyourmoney.com/products/hawk-sports-heavy-bag-boxing-gloves/> (referer: None)
2023-05-05 13:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.mmahive.com/best-boxing-gloves-for-wrist-support/>
2023-05-05 13:52:46 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.dontwasteyourmoney.com/products/hawk-sports-heavy-bag-boxing-gloves/>
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.bestproducts.com/robots.txt> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://middleeasy.com/robots.txt> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.reddit.com/r/amateur_boxing/comments/2ykhau/the_top_15_best_boxing_gloves_ranking_the_best/> (referer: None)
2023-05-05 13:52:46 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://absolutelymartialarts.com/robots.txt> (referer: None)
2023-05-05 13:52:47 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/robots.txt> (failed 1 times): 429 Unknown Status
2023-05-05 13:52:47 [py.warnings] WARNING: /home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/scrapy/core/engine.py:276: ScrapyDeprecationWarning: Passing a 'spider' argument to ExecutionEngine.download is deprecated
  return self.download(result, spider) if isinstance(result, Request) else result

2023-05-05 13:52:47 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.reddit.com/r/amateur_boxing/comments/2ykhau/the_top_15_best_boxing_gloves_ranking_the_best/>
2023-05-05 13:52:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.momjunction.com/robots.txt> (referer: None)
2023-05-05 13:52:48 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://breakinggrips.com/best-kids-boxing-gloves/> (referer: None)
2023-05-05 13:52:48 [scrapy.core.scraper] DEBUG: Scraped from <200 https://breakinggrips.com/best-kids-boxing-gloves/>
2023-05-05 13:52:48 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/robots.txt> (failed 2 times): 429 Unknown Status
2023-05-05 13:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://middleeasy.com/reviews/gear/gloves-cardio-kickboxing/> (referer: None)
2023-05-05 13:52:49 [scrapy.core.scraper] DEBUG: Scraped from <200 https://middleeasy.com/reviews/gear/gloves-cardio-kickboxing/>
2023-05-05 13:52:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://absolutelymartialarts.com/best-boxing-gloves-beginners/> (referer: None)
2023-05-05 13:52:50 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.fightingking.com/robots.txt> (failed 3 times): 429 Unknown Status
2023-05-05 13:52:50 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.fightingking.com/robots.txt> (referer: None)
2023-05-05 13:52:50 [protego] DEBUG: Rule at line 2 without any user agent to enforce it on.
2023-05-05 13:52:50 [protego] DEBUG: Rule at line 6 without any user agent to enforce it on.
2023-05-05 13:52:50 [protego] DEBUG: Rule at line 10 without any user agent to enforce it on.
2023-05-05 13:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://absolutelymartialarts.com/best-boxing-gloves-beginners/>
2023-05-05 13:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mightyfighter.com/robots.txt> (referer: None)
2023-05-05 13:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.momjunction.com/articles/best-boxing-gloves-for-kids_00514921/> (referer: None)
2023-05-05 13:52:50 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.momjunction.com/articles/best-boxing-gloves-for-kids_00514921/>
2023-05-05 13:52:50 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.wbcme.co.uk/robots.txt> (referer: None)
2023-05-05 13:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.stylecraze.com/robots.txt> (referer: None)
2023-05-05 13:52:51 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 1 times): 429 Unknown Status
2023-05-05 13:52:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://glovesaddict.com/best-boxing-gloves-on-amazon/> (referer: None)
2023-05-05 13:52:51 [scrapy.core.scraper] DEBUG: Scraped from <200 https://glovesaddict.com/best-boxing-gloves-on-amazon/>
2023-05-05 13:52:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.bestproducts.com/fitness/equipment/g1009/boxing-gloves-mitts/> (referer: None)
2023-05-05 13:52:52 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.bestproducts.com/fitness/equipment/g1009/boxing-gloves-mitts/>
2023-05-05 13:52:52 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 2 times): 429 Unknown Status
2023-05-05 13:52:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://linealboxing.com/robots.txt> (referer: None)
2023-05-05 13:52:53 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://smartmma.com/robots.txt> (referer: None)
2023-05-05 13:52:53 [protego] DEBUG: Rule at line 1 without any user agent to enforce it on.
2023-05-05 13:52:54 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (failed 3 times): 429 Unknown Status
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (429) <GET https://www.fightingking.com/boxing-gloves-brands-reviews/> (referer: None)
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.stylecraze.com/articles/best-heavy-bag-gloves/> (referer: None)
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.wbcme.co.uk/ringside/best-boxing-gloves-for-beginners/> (referer: None)
2023-05-05 13:52:54 [scrapy.core.scraper] DEBUG: Scraped from <429 https://www.fightingking.com/boxing-gloves-brands-reviews/>
2023-05-05 13:52:54 [seo_spider] ERROR: Invalid control character at: line 28 column 64 (char 1740) 200 https://www.stylecraze.com/articles/best-heavy-bag-gloves/
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 761, in parse
    response.css('script[type="application/ld+json"]::text').getall()]
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/advertools/spider.py", line 760, in <listcomp>
    ld = [json.loads(s.replace('\r', '')) for s in
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/irfan/.pyenv/versions/3.7.9/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 28 column 64 (char 1740)
2023-05-05 13:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.stylecraze.com/articles/best-heavy-bag-gloves/>
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.mightyfighter.com/top-10-best-boxing-gloves/> (referer: None)
2023-05-05 13:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.wbcme.co.uk/ringside/best-boxing-gloves-for-beginners/>
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fullcontactway.com/robots.txt> (referer: None)
2023-05-05 13:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.mightyfighter.com/top-10-best-boxing-gloves/>
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://linealboxing.com/best-boxing-glove-brands-2022/> (referer: None)
2023-05-05 13:52:54 [scrapy.core.scraper] DEBUG: Scraped from <200 https://linealboxing.com/best-boxing-glove-brands-2022/>
2023-05-05 13:52:54 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.attacktheback.com/robots.txt> (referer: None)
2023-05-05 13:52:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.kreedon.com/robots.txt> (referer: None)
2023-05-05 13:52:55 [scrapy.downloadermiddlewares.robotstxt] DEBUG: Forbidden by robots.txt: <GET https://www.kreedon.com/best-boxing-gloves-brands/>
2023-05-05 13:52:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesuk.com/robots.txt> (referer: None)
2023-05-05 13:52:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.attacktheback.com/best-cheap-boxing-gloves/> (referer: None)
2023-05-05 13:52:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fullcontactway.com/best-sparring-gloves/> (referer: None)
2023-05-05 13:52:56 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blackbeltmag.com/robots.txt> (referer: None)
2023-05-05 13:52:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.attacktheback.com/best-cheap-boxing-gloves/>
2023-05-05 13:52:56 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.fullcontactway.com/best-sparring-gloves/>
2023-05-05 13:52:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fitnessbaddies.com/robots.txt> (referer: None)
2023-05-05 13:52:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bestreviews.com/robots.txt> (referer: None)
2023-05-05 13:52:57 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingison.com/robots.txt> (referer: None)
2023-05-05 13:52:58 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://cletoreyesuk.com/blogs/news/what-are-the-best-boxing-gloves-for-beginners> (referer: None)
2023-05-05 13:52:58 [scrapy.core.scraper] DEBUG: Scraped from <200 https://cletoreyesuk.com/blogs/news/what-are-the-best-boxing-gloves-for-beginners>
2023-05-05 13:52:58 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://blackbeltmag.com/best-boxing-gloves> (referer: None)
2023-05-05 13:52:58 [scrapy.core.scraper] DEBUG: Scraped from <200 https://blackbeltmag.com/best-boxing-gloves>
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://smartmma.com/best-boxing-gloves-for-heavy-bag/> (referer: None)
2023-05-05 13:52:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://smartmma.com/best-boxing-gloves-for-heavy-bag/>
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://bestreviews.com/sports-fitness/boxing/best-boxing-gloves> (referer: None)
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msn.com/robots.txt> (referer: None)
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.pragmaticmom.com/robots.txt> (referer: None)
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.fitnessbaddies.com/amateur-boxing-gloves/> (referer: None)
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://thewiredshopper.com/robots.txt> (referer: None)
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 28 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 37 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 38 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 39 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 40 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 41 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 42 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 43 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 44 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 45 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 46 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 47 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 48 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 49 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 50 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 51 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 52 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 53 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 54 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 55 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 56 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 57 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 58 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 59 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 60 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 61 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 67 without any user agent to enforce it on.
2023-05-05 13:52:59 [protego] DEBUG: Rule at line 72 without any user agent to enforce it on.
2023-05-05 13:52:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://bestreviews.com/sports-fitness/boxing/best-boxing-gloves>
2023-05-05 13:52:59 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.fitnessbaddies.com/amateur-boxing-gloves/>
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingear.com/robots.txt> (referer: None)
2023-05-05 13:52:59 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://thewiredshopper.com/best-boxing-gloves-to-buy/> (referer: None)
2023-05-05 13:53:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.msn.com/en-gb/lifestyle/rf-best-products-uk/best-boxing-gloves-for-men-12oz-reviews> (referer: None)
2023-05-05 13:53:00 [scrapy.core.scraper] DEBUG: Scraped from <403 https://thewiredshopper.com/best-boxing-gloves-to-buy/>
2023-05-05 13:53:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lowkickmma.com/robots.txt> (referer: None)
2023-05-05 13:53:00 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.msn.com/en-gb/lifestyle/rf-best-products-uk/best-boxing-gloves-for-men-12oz-reviews>
2023-05-05 13:53:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.standard.co.uk/robots.txt> (referer: None)
2023-05-05 13:53:00 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://sites.google.com/view> from <GET https://www.boxingear.com/shop-2/grant-gloves/lace-up/best-boxing-gloves-for-sparring-grant-gloves/>
2023-05-05 13:53:00 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.pragmaticmom.com/2019/11/best-boxing-gloves-for-women/> (referer: None)
2023-05-05 13:53:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.pragmaticmom.com/2019/11/best-boxing-gloves-for-women/>
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.boxingison.com/best-boxing-gloves-for-training-and-sparring/> (referer: None)
2023-05-05 13:53:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.boxingison.com/best-boxing-gloves-for-training-and-sparring/>
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingready.com/robots.txt> (referer: None)
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.lowkickmma.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gloveworx.com/robots.txt> (referer: None)
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.sportsdirect.com/robots.txt> (referer: None)
2023-05-05 13:53:01 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.lowkickmma.com/best-boxing-gloves/>
2023-05-05 13:53:01 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.standard.co.uk/shopping/esbest/health-fitness/fitness-wear/best-womens-boxing-gloves-for-beginners-a4272321.html> (referer: None)
2023-05-05 13:53:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.standard.co.uk/shopping/esbest/health-fitness/fitness-wear/best-womens-boxing-gloves-for-beginners-a4272321.html>
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dmarge.com/robots.txt> (referer: None)
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://brawlbros.com/robots.txt> (referer: None)
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://sites.google.com/robots.txt> (referer: None)
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.dmarge.com/best-boxing-gloves> (referer: None)
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (403) <GET https://www.sportsdirect.com/boxing/boxing-gloves> (referer: None)
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://sites.google.com/view> (referer: None)
2023-05-05 13:53:02 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.dmarge.com/best-boxing-gloves>
2023-05-05 13:53:02 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://themmaguru.com/robots.txt> (referer: None)
2023-05-05 13:53:02 [scrapy.core.scraper] DEBUG: Scraped from <403 https://www.sportsdirect.com/boxing/boxing-gloves>
2023-05-05 13:53:02 [scrapy.core.scraper] DEBUG: Scraped from <404 https://sites.google.com/view>
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nytimes.com/robots.txt> (referer: None)
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gloveworx.com/blog/how-choose-best-boxing-gloves-beginners/> (referer: None)
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thechamplair.com/robots.txt> (referer: None)
2023-05-05 13:53:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gloveworx.com/blog/how-choose-best-boxing-gloves-beginners/>
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://brawlbros.com/best-boxing-gloves-on-amazon/> (referer: None)
2023-05-05 13:53:03 [scrapy.core.scraper] DEBUG: Scraped from <200 https://brawlbros.com/best-boxing-gloves-on-amazon/>
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://themmaguru.com/best-youth-boxing-gloves/> (referer: None)
2023-05-05 13:53:03 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://findbestboxinggloves.com/robots.txt> (referer: None)
2023-05-05 13:53:04 [scrapy.core.scraper] DEBUG: Scraped from <200 https://themmaguru.com/best-youth-boxing-gloves/>
2023-05-05 13:53:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearhungry.com/robots.txt> (referer: None)
2023-05-05 13:53:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://hiconsumption.com/robots.txt> (referer: None)
2023-05-05 13:53:05 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://thechamplair.com/sports/best-beginners-boxing-gloves/> (referer: None)
2023-05-05 13:53:05 [scrapy.core.scraper] DEBUG: Scraped from <200 https://thechamplair.com/sports/best-beginners-boxing-gloves/>
2023-05-05 13:53:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://findbestboxinggloves.com/best-boxing-gloves-for-heavy-bag-the-complete-guide/> (referer: None)
2023-05-05 13:53:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.nytimes.com/video/style/1194840632119/gear-test-boxing-gloves.html> (referer: None)
2023-05-05 13:53:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://hiconsumption.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:53:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://findbestboxinggloves.com/best-boxing-gloves-for-heavy-bag-the-complete-guide/>
2023-05-05 13:53:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.nytimes.com/video/style/1194840632119/gear-test-boxing-gloves.html>
2023-05-05 13:53:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://hiconsumption.com/best-boxing-gloves/>
2023-05-05 13:53:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.gearhungry.com/best-boxing-gloves/> (referer: None)
2023-05-05 13:53:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.gearhungry.com/best-boxing-gloves/>
2023-05-05 13:53:06 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://boxingready.com/ringside/best-boxing-gloves-wrist-support/> (referer: None)
2023-05-05 13:53:06 [scrapy.core.scraper] DEBUG: Scraped from <200 https://boxingready.com/ringside/best-boxing-gloves-wrist-support/>
2023-05-05 13:53:07 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hungry4fitness.co.uk/robots.txt> (referer: None)
2023-05-05 13:53:08 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.hungry4fitness.co.uk/post/10-best-boxing-mitts-an-ultimate-guide> (referer: None)
2023-05-05 13:53:08 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.hungry4fitness.co.uk/post/10-best-boxing-mitts-an-ultimate-guide>
2023-05-05 13:53:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 196 pages/min), scraped 97 items (at 97 items/min)
2023-05-05 13:54:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 13:54:40 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://skilspo.com/robots.txt> (failed 1 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 13:55:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 13:56:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 13:56:51 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://skilspo.com/robots.txt> (failed 2 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 13:57:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 13:58:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 13:59:02 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://skilspo.com/robots.txt> (failed 3 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 13:59:02 [scrapy.downloadermiddlewares.robotstxt] ERROR: Error downloading <GET https://skilspo.com/robots.txt>: TCP connection timed out: 110: Connection timed out.
Traceback (most recent call last):
  File "/home/irfan/.pyenv/versions/TES/lib/python3.7/site-packages/scrapy/core/downloader/middleware.py", line 49, in process_request
    return (yield download_func(request=request, spider=spider))
twisted.internet.error.TCPTimedOutError: TCP connection timed out: 110: Connection timed out.
2023-05-05 13:59:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:00:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:01:13 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://skilspo.com/gb/blog/1_how-to-choose-the-best-boxing-gloves.html> (failed 1 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 14:01:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:02:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:03:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:03:24 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying <GET https://skilspo.com/gb/blog/1_how-to-choose-the-best-boxing-gloves.html> (failed 2 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 14:04:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:05:17 [scrapy.extensions.logstats] INFO: Crawled 196 pages (at 0 pages/min), scraped 97 items (at 0 items/min)
2023-05-05 14:05:35 [scrapy.downloadermiddlewares.retry] ERROR: Gave up retrying <GET https://skilspo.com/gb/blog/1_how-to-choose-the-best-boxing-gloves.html> (failed 3 times): TCP connection timed out: 110: Connection timed out.
2023-05-05 14:05:35 [seo_spider] ERROR: <twisted.python.failure.Failure twisted.internet.error.TCPTimedOutError: TCP connection timed out: 110: Connection timed out.>
2023-05-05 14:05:35 [scrapy.core.scraper] DEBUG: Scraped from TCP connection timed out: 110: Connection timed out.
2023-05-05 14:05:35 [scrapy.core.engine] INFO: Closing spider (finished)
2023-05-05 14:05:35 [scrapy.extensions.feedexport] INFO: Stored jl feed (98 items) in: pages.jl
2023-05-05 14:05:35 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 8,
 'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 2,
 'downloader/exception_type_count/twisted.internet.error.TCPTimedOutError': 6,
 'downloader/request_bytes': 60432,
 'downloader/request_count': 210,
 'downloader/request_method_count/GET': 210,
 'downloader/response_bytes': 6041008,
 'downloader/response_count': 204,
 'downloader/response_status_count/200': 183,
 'downloader/response_status_count/301': 1,
 'downloader/response_status_count/302': 1,
 'downloader/response_status_count/403': 9,
 'downloader/response_status_count/404': 1,
 'downloader/response_status_count/429': 9,
 'elapsed_time_seconds': 798.663859,
 'feedexport/success_count/FileFeedStorage': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2023, 5, 5, 9, 5, 35, 694921),
 'httpcompression/response_bytes': 34027780,
 'httpcompression/response_count': 174,
 'item_scraped_count': 98,
 'log_count/DEBUG': 356,
 'log_count/ERROR': 12,
 'log_count/INFO': 24,
 'log_count/WARNING': 1,
 'memusage/max': 231342080,
 'memusage/startup': 142450688,
 'response_received_count': 196,
 'retry/count': 10,
 'retry/max_reached': 5,
 'retry/reason_count/429 Unknown Status': 6,
 'retry/reason_count/twisted.internet.error.TCPTimedOutError': 4,
 "robotstxt/exception_count/<class 'twisted.internet.error.TCPTimedOutError'>": 1,
 'robotstxt/forbidden': 2,
 'robotstxt/request_count': 100,
 'robotstxt/response_count': 99,
 'robotstxt/response_status_count/200': 94,
 'robotstxt/response_status_count/403': 4,
 'robotstxt/response_status_count/429': 1,
 'scheduler/dequeued': 108,
 'scheduler/dequeued/memory': 108,
 'scheduler/enqueued': 108,
 'scheduler/enqueued/memory': 108,
 'start_time': datetime.datetime(2023, 5, 5, 8, 52, 17, 31062)}
2023-05-05 14:05:35 [scrapy.core.engine] INFO: Spider closed (finished)
2023-05-05 14:05:36,013 | INFO | utils.py:160 | _init_num_threads | NumExpr defaulting to 6 threads.

Process finished with exit code 0

@eliasdabbas
Owner

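Thanks.
This looks fine: the log ends with `Closing spider (finished)`, so the crawl did complete. The long pause before that was Scrapy retrying skilspo.com, where every connection attempt timed out.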
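
A rough way to see where the time went, using only the timestamps from the skilspo.com retry lines in the log above. The second half is a sketch of settings that should shorten the wait: `DOWNLOAD_TIMEOUT` and `RETRY_TIMES` are standard Scrapy settings, but the values 15 and 1 are illustrative assumptions, not recommendations, and `url_list` is a hypothetical name for your list of URLs.

```python
from datetime import datetime

# Timestamps copied from the "Retrying" / "Gave up retrying" lines for skilspo.com.
timeout_events = [
    "13:54:40", "13:56:51", "13:59:02",  # robots.txt: three attempts
    "14:01:13", "14:03:24", "14:05:35",  # the page itself: three attempts
]

times = [datetime.strptime(t, "%H:%M:%S") for t in timeout_events]
# Seconds between consecutive timeouts, i.e. how long one attempt hangs.
gaps = [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]
per_attempt = gaps[0]
total_wait = per_attempt * len(timeout_events)

print(gaps)        # seconds per attempt
print(total_wait)  # compare with 'elapsed_time_seconds' in the stats dump

# Illustrative values only (assumptions): shorter timeout, fewer retries.
custom_settings = {
    "DOWNLOAD_TIMEOUT": 15,  # seconds before Scrapy abandons a request
    "RETRY_TIMES": 1,        # retries after the first failed attempt
}

# Worst case for one dead host: robots.txt plus the page, each tried
# (RETRY_TIMES + 1) times, each waiting up to DOWNLOAD_TIMEOUT seconds.
worst_case = (
    custom_settings["DOWNLOAD_TIMEOUT"]
    * (custom_settings["RETRY_TIMES"] + 1)
    * 2
)
print(worst_case)

# Assumed usage with advertools (not executed here):
# import advertools as adv
# adv.crawl(url_list, "pages.jl", custom_settings=custom_settings)
```

The six timeouts account for almost all of the elapsed time reported in the stats, so the crawl was waiting on one unreachable host rather than looping.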

If you get a specific error, feel free to open another issue.
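
On the `JSONDecodeError` tracebacks in the log: they are logged by the spider and the page is still scraped, so they do not stop the crawl. If you want to inspect such JSON-LD yourself, a minimal sketch of tolerant parsing with the standard library follows. This is not how advertools handles it internally; `parse_json_ld` and the sample strings are hypothetical.

```python
import json


def parse_json_ld(snippets):
    """Parse JSON-LD strings, collecting errors instead of raising."""
    parsed, errors = [], []
    for raw in snippets:
        try:
            # strict=False accepts control characters (such as raw newlines)
            # inside strings, which otherwise raise "Invalid control character".
            parsed.append(json.loads(raw, strict=False))
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return parsed, errors


snippets = [
    '{"@type": "Article", "headline": "Best gloves"}',
    '{"description": "line one\nline two"}',  # raw newline inside a string
    '',                                       # empty script tag
]

parsed, errors = parse_json_ld(snippets)
print(len(parsed))
print(errors)
```

The empty string reproduces the "Expecting value: line 1 column 1 (char 0)" message seen in the log, while the snippet with a raw newline parses once `strict=False` is set.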
