Commit f30f53b: Scrapinghub → Zyte

Gallaecio committed Feb 2, 2021
1 parent 28262d4 commit f30f53b

Showing 9 changed files with 38 additions and 35 deletions.
4 changes: 2 additions & 2 deletions AUTHORS
@@ -1,8 +1,8 @@
Scrapy was brought to life by Shane Evans while hacking a scraping framework
prototype for Mydeco (mydeco.com). It soon became maintained, extended and
improved by Insophia (insophia.com), with the initial sponsorship of Mydeco to
-bootstrap the project. In mid-2011, Scrapinghub became the new official
-maintainer.
+bootstrap the project. In mid-2011, Scrapinghub (now Zyte) became the new
+official maintainer.

Here is the list of the primary authors & contributors:

2 changes: 1 addition & 1 deletion CODE_OF_CONDUCT.md
@@ -55,7 +55,7 @@ further defined and clarified by project maintainers.
## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
-reported by contacting the project team at opensource@scrapinghub.com. All
+reported by contacting the project team at opensource@zyte.com. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
7 changes: 4 additions & 3 deletions README.rst
@@ -42,10 +42,11 @@ Scrapy is a fast high-level web crawling and web scraping framework, used to
crawl websites and extract structured data from their pages. It can be used for
a wide range of purposes, from data mining to monitoring and automated testing.

-Scrapy is maintained by `Scrapinghub`_ and `many other contributors`_.
+Scrapy is maintained by Zyte_ (formerly Scrapinghub) and `many other
+contributors`_.

.. _many other contributors: https://github.com/scrapy/scrapy/graphs/contributors
-.. _Scrapinghub: https://www.scrapinghub.com/
+.. _Zyte: https://www.zyte.com/

Check the Scrapy homepage at https://scrapy.org for more information,
including a list of features.
@@ -95,7 +96,7 @@ Please note that this project is released with a Contributor Code of Conduct
(see https://github.com/scrapy/scrapy/blob/master/CODE_OF_CONDUCT.md).

By participating in this project you agree to abide by its terms.
-Please report unacceptable behavior to opensource@scrapinghub.com.
+Please report unacceptable behavior to opensource@zyte.com.

Companies using Scrapy
======================
1 change: 0 additions & 1 deletion docs/intro/install.rst
@@ -266,7 +266,6 @@ For details, see `Issue #2473 <https://github.com/scrapy/scrapy/issues/2473>`_.
.. _setuptools: https://pypi.python.org/pypi/setuptools
.. _homebrew: https://brew.sh/
.. _zsh: https://www.zsh.org/
-.. _Scrapinghub: https://scrapinghub.com
.. _Anaconda: https://docs.anaconda.com/anaconda/
.. _Miniconda: https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html
.. _conda-forge: https://conda-forge.org/
32 changes: 16 additions & 16 deletions docs/topics/deploy.rst
@@ -14,7 +14,7 @@ spiders come in.
Popular choices for deploying Scrapy spiders are:

* :ref:`Scrapyd <deploy-scrapyd>` (open source)
-* :ref:`Scrapy Cloud <deploy-scrapy-cloud>` (cloud-based)
+* :ref:`Zyte Scrapy Cloud <deploy-scrapy-cloud>` (cloud-based)

.. _deploy-scrapyd:

@@ -32,28 +32,28 @@ Scrapyd is maintained by some of the Scrapy developers.

.. _deploy-scrapy-cloud:

-Deploying to Scrapy Cloud
-=========================
+Deploying to Zyte Scrapy Cloud
+==============================

-`Scrapy Cloud`_ is a hosted, cloud-based service by `Scrapinghub`_,
-the company behind Scrapy.
+`Zyte Scrapy Cloud`_ is a hosted, cloud-based service by Zyte_, the company
+behind Scrapy.

-Scrapy Cloud removes the need to setup and monitor servers
-and provides a nice UI to manage spiders and review scraped items,
-logs and stats.
+Zyte Scrapy Cloud removes the need to set up and monitor servers and provides a
+nice UI to manage spiders and review scraped items, logs and stats.

-To deploy spiders to Scrapy Cloud you can use the `shub`_ command line tool.
-Please refer to the `Scrapy Cloud documentation`_ for more information.
+To deploy spiders to Zyte Scrapy Cloud you can use the `shub`_ command line
+tool.
+Please refer to the `Zyte Scrapy Cloud documentation`_ for more information.

-Scrapy Cloud is compatible with Scrapyd and one can switch between
+Zyte Scrapy Cloud is compatible with Scrapyd and one can switch between
them as needed - the configuration is read from the ``scrapy.cfg`` file
just like ``scrapyd-deploy``.

-.. _Scrapyd: https://github.com/scrapy/scrapyd
.. _Deploying your project: https://scrapyd.readthedocs.io/en/latest/deploy.html
-.. _Scrapy Cloud: https://scrapinghub.com/scrapy-cloud
+.. _Scrapyd: https://github.com/scrapy/scrapyd
.. _scrapyd-client: https://github.com/scrapy/scrapyd-client
-.. _shub: https://doc.scrapinghub.com/shub.html
.. _scrapyd-deploy documentation: https://scrapyd.readthedocs.io/en/latest/deploy.html
-.. _Scrapy Cloud documentation: https://doc.scrapinghub.com/scrapy-cloud.html
-.. _Scrapinghub: https://scrapinghub.com/
+.. _shub: https://shub.readthedocs.io/en/latest/
+.. _Zyte: https://zyte.com/
+.. _Zyte Scrapy Cloud: https://www.zyte.com/scrapy-cloud/
+.. _Zyte Scrapy Cloud documentation: https://docs.zyte.com/scrapy-cloud.html
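
For context on the ``scrapy.cfg`` compatibility this hunk mentions: both
``scrapyd-deploy`` and ``shub`` read deploy targets from the project's
``scrapy.cfg``. A minimal sketch (the project name and endpoint URL are
placeholders, not taken from this commit)::

    [settings]
    default = myproject.settings

    [deploy]
    # a self-hosted Scrapyd endpoint; swap in your own server
    url = http://localhost:6800/
    project = myproject
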
4 changes: 2 additions & 2 deletions docs/topics/logging.rst
@@ -101,7 +101,7 @@ instance, which can be accessed and used like this::
class MySpider(scrapy.Spider):

    name = 'myspider'
-    start_urls = ['https://scrapinghub.com']
+    start_urls = ['https://scrapy.org']

    def parse(self, response):
        self.logger.info('Parse function called on %s', response.url)
@@ -117,7 +117,7 @@ Python logger you want. For example::
class MySpider(scrapy.Spider):

    name = 'myspider'
-    start_urls = ['https://scrapinghub.com']
+    start_urls = ['https://scrapy.org']

    def parse(self, response):
        logger.info('Parse function called on %s', response.url)
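
Putting the two snippets above into one runnable sketch (the module-level
``logger`` and its name are an assumption; the excerpt does not show how the
second example creates it)::

    import logging

    import scrapy

    logger = logging.getLogger('mycustomlogger')  # assumed setup; any name works

    class MySpider(scrapy.Spider):

        name = 'myspider'
        start_urls = ['https://scrapy.org']

        def parse(self, response):
            # self.logger is a stdlib Logger named after the spider
            self.logger.info('Parse function called on %s', response.url)
            # a custom logger also flows through Scrapy's logging settings
            logger.info('Parse function called on %s', response.url)
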
6 changes: 3 additions & 3 deletions docs/topics/practices.rst
@@ -63,7 +63,7 @@ project as example.
process = CrawlerProcess(get_project_settings())

# 'followall' is the name of one of the spiders of the project.
-process.crawl('followall', domain='scrapinghub.com')
+process.crawl('followall', domain='scrapy.org')
process.start() # the script will block here until the crawling is finished

There's another Scrapy utility that provides more control over the crawling
@@ -244,7 +244,7 @@ Here are some tips to keep in mind when dealing with these kinds of sites:
super proxy that you can attach your own proxies to.
* use a highly distributed downloader that circumvents bans internally, so you
can just focus on parsing clean pages. One example of such downloaders is
-  `Crawlera`_
+  `Zyte Smart Proxy Manager`_

If you are still unable to prevent your bot getting banned, consider contacting
`commercial support`_.
@@ -254,5 +254,5 @@ If you are still unable to prevent your bot getting banned, consider contacting
.. _ProxyMesh: https://proxymesh.com/
.. _Google cache: http://www.googleguide.com/cached_pages.html
.. _testspiders: https://github.com/scrapinghub/testspiders
-.. _Crawlera: https://scrapinghub.com/crawlera
.. _scrapoxy: https://scrapoxy.io/
+.. _Zyte Smart Proxy Manager: https://www.zyte.com/smart-proxy-manager/
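
As a reference for the proxy tips above: the simplest way to send a single
request through a proxy in Scrapy is the ``proxy`` key of ``Request.meta``,
which the built-in ``HttpProxyMiddleware`` picks up (the endpoint below is a
placeholder)::

    import scrapy

    class ProxiedSpider(scrapy.Spider):
        name = 'proxied'

        def start_requests(self):
            yield scrapy.Request(
                'https://scrapy.org',
                # placeholder; point this at your proxy pool or rotating service
                meta={'proxy': 'http://proxy.example.com:8080'},
            )

        def parse(self, response):
            self.logger.info('Fetched %s', response.url)
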
4 changes: 2 additions & 2 deletions docs/topics/selectors.rst
@@ -464,10 +464,10 @@ effectively. If you are not much familiar with XPath yet,
you may want to take a look first at this `XPath tutorial`_.

.. note::
-    Some of the tips are based on `this post from ScrapingHub's blog`_.
+    Some of the tips are based on `this post from Zyte's blog`_.

.. _`XPath tutorial`: http://www.zvon.org/comp/r/tut-XPath_1.html
-.. _`this post from ScrapingHub's blog`: https://blog.scrapinghub.com/2014/07/17/xpath-tips-from-the-web-scraping-trenches/
+.. _this post from Zyte's blog: https://www.zyte.com/blog/xpath-tips-from-the-web-scraping-trenches/
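
To give a flavor of the XPath usage this section discusses, here is a tiny
standalone example with ``scrapy.Selector`` (the HTML is made up for
illustration)::

    from scrapy.selector import Selector

    html = '<ul><li class="odd"><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'
    sel = Selector(text=html)

    # text of every link
    print(sel.xpath('//li/a/text()').getall())          # ['A', 'B']
    # hrefs of links inside <li> elements that carry a class
    print(sel.xpath('//li[@class]/a/@href').getall())   # ['/a']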


.. _topics-selectors-relative-xpaths:
13 changes: 8 additions & 5 deletions scrapy/core/downloader/handlers/http11.py
@@ -303,11 +303,14 @@ def _get_agent(self, request, timeout):
proxyHost = to_unicode(proxyHost)
omitConnectTunnel = b'noconnect' in proxyParams
if omitConnectTunnel:
-    warnings.warn("Using HTTPS proxies in the noconnect mode is deprecated. "
-                  "If you use Crawlera, it doesn't require this mode anymore, "
-                  "so you should update scrapy-crawlera to 1.3.0+ "
-                  "and remove '?noconnect' from the Crawlera URL.",
-                  ScrapyDeprecationWarning)
+    warnings.warn(
+        "Using HTTPS proxies in the noconnect mode is deprecated. "
+        "If you use Zyte Smart Proxy Manager (formerly Crawlera), "
+        "it doesn't require this mode anymore, so you should "
+        "update scrapy-crawlera to 1.3.0+ and remove '?noconnect' "
+        "from the Zyte Smart Proxy Manager URL.",
+        ScrapyDeprecationWarning,
+    )
if scheme == b'https' and not omitConnectTunnel:
    proxyAuth = request.headers.get(b'Proxy-Authorization', None)
    proxyConf = (proxyHost, proxyPort, proxyAuth)
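
The warning rewritten in this hunk points users at configuring scrapy-crawlera
instead of the ``?noconnect`` proxy-URL flag. A sketch of the settings that
usually entails (setting names as documented by scrapy-crawlera; the API key
is a placeholder)::

    # settings.py (assumes scrapy-crawlera >= 1.3.0 is installed)
    DOWNLOADER_MIDDLEWARES = {
        'scrapy_crawlera.CrawleraMiddleware': 610,
    }
    CRAWLERA_ENABLED = True
    CRAWLERA_APIKEY = '<your API key>'  # placeholder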
