Commit

Merge branch 'moderm-setuppy'
dangra committed Aug 8, 2014
2 parents c2497e6 + 0772201 commit 0254f58
Showing 5 changed files with 78 additions and 164 deletions.
4 changes: 1 addition & 3 deletions Makefile.buildbot
@@ -1,8 +1,5 @@
TRIAL := $(shell which trial)
BRANCH := $(shell git rev-parse --abbrev-ref HEAD)
ifeq ($(BRANCH),master)
export SCRAPY_VERSION_FROM_GIT=1
endif
export PYTHONPATH=$(PWD)

test:
@@ -11,6 +8,7 @@ test:
-s3cmd sync -P htmlcov/ s3://static.scrapy.org/coverage-scrapy-$(BRANCH)/

build:
test $(BRANCH) != master || git describe >scrapy/VERSION
python extras/makedeb.py build
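For reference, the ``test ... || ...`` idiom in the ``build`` target above only writes the version file when the branch is ``master``; a small shell sketch (with a hypothetical version string standing in for the real ``git describe`` output):

```shell
# Sketch of the short-circuit used by the build target: the right-hand
# side of `||` runs only when the `test` fails, i.e. when BRANCH == master.
write_version() {
    BRANCH="$1"
    OUT="$2"
    test "$BRANCH" != master || echo "0.24.4-30-g0254f58" > "$OUT"
}

rm -f /tmp/demo_VERSION_master /tmp/demo_VERSION_feature
write_version master /tmp/demo_VERSION_master        # writes the file
write_version somefeature /tmp/demo_VERSION_feature  # leaves no file behind
cat /tmp/demo_VERSION_master
```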

clean:
71 changes: 39 additions & 32 deletions docs/intro/install.rst
@@ -4,32 +4,31 @@
Installation guide
==================

Pre-requisites
==============
Installing Scrapy
=================

.. note:: Check :ref:`intro-install-platform-notes` first.

The installation steps assume that you have the following things installed:

* `Python`_ 2.7
* `lxml`_. Most Linux distributions ship prepackaged versions of lxml. Otherwise refer to http://lxml.de/installation.html
* `OpenSSL`_. This comes preinstalled in all operating systems except Windows (see :ref:`intro-install-platform-notes`)
* `pip`_ or `easy_install`_ Python package managers

Installing Scrapy
=================
* `pip`_ and `setuptools`_ Python packages. Recent versions of `pip`_ depend
  on `setuptools`_ and install it automatically when it is missing.

You can install Scrapy using easy_install or pip (which is the canonical way to
distribute and install Python packages).
* `lxml`_. Most Linux distributions ship prepackaged versions of lxml.
  Otherwise refer to http://lxml.de/installation.html

.. note:: Check :ref:`intro-install-platform-notes` first.
* `OpenSSL`_. This comes preinstalled on all operating systems except Windows,
  where the Python installer ships a bundled copy.

You can install Scrapy using pip (which is the canonical way to install Python
packages).

To install using pip::

pip install Scrapy

To install using easy_install::

easy_install Scrapy

.. _intro-install-platform-notes:

Platform specific installation notes
@@ -38,34 +37,33 @@ Platform specific installation notes
Windows
-------

After installing Python, follow these steps before installing Scrapy:
* Install Python 2.7 from http://python.org/download/

You need to adjust the ``PATH`` environment variable to include paths to
the Python executable and additional scripts. The following paths need to be
added to ``PATH``::

* add the ``C:\python27\Scripts`` and ``C:\python27`` folders to the system
path by adding those directories to the ``PATH`` environment variable from
the `Control Panel`_.
C:\Python27\;C:\Python27\Scripts\;

* install OpenSSL by following these steps:
To update the ``PATH`` open a Command prompt and run::

1. go to `Win32 OpenSSL page <http://slproweb.com/products/Win32OpenSSL.html>`_
c:\python27\python.exe c:\python27\tools\scripts\win_add2path.py

2. download Visual C++ 2008 redistributables for your Windows and architecture
Close the command prompt window and reopen it so the changes take effect, then
run the following command and check that it shows the expected Python version::

3. download OpenSSL for your Windows and architecture (the regular version, not the light one)
python --version

4. add the ``c:\openssl-win32\bin`` (or similar) directory to your ``PATH``, the same way you added ``python27`` in the first step
* Install `pip`_ from https://pip.pypa.io/en/latest/installing.html

* some binary packages that Scrapy depends on (like Twisted, lxml and pyOpenSSL) require a compiler to build, and their installation fails if you don't have Visual Studio installed. You can find Windows installers for those in the links below. Make sure you respect your Python version and Windows architecture.
Now open a Command prompt to check that ``pip`` is installed correctly::

* pywin32: http://sourceforge.net/projects/pywin32/files/
* Twisted: http://twistedmatrix.com/trac/wiki/Downloads
* zope.interface: download the egg from `zope.interface pypi page <http://pypi.python.org/pypi/zope.interface>`_ and install it by running ``easy_install file.egg``
* lxml: http://pypi.python.org/pypi/lxml/
* pyOpenSSL: https://launchpad.net/pyopenssl
pip --version

Finally, this page contains many precompiled Python binary libraries, which may
come in handy to fulfill Scrapy dependencies:
* At this point Python 2.7 and the ``pip`` package manager should be working;
  now install Scrapy::

http://www.lfd.uci.edu/~gohlke/pythonlibs/
pip install Scrapy

Ubuntu 9.10 or above
~~~~~~~~~~~~~~~~~~~~
@@ -77,10 +75,19 @@ Instead, use the official :ref:`Ubuntu Packages <topics-ubuntu>`, which already
solve all dependencies for you and are continuously updated with the latest bug
fixes.

Archlinux
~~~~~~~~~

You can follow the generic instructions or install Scrapy from the `AUR Scrapy package`_::

yaourt -S scrapy


.. _Python: http://www.python.org
.. _pip: http://www.pip-installer.org/en/latest/installing.html
.. _easy_install: http://pypi.python.org/pypi/setuptools
.. _Control Panel: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/sysdm_advancd_environmnt_addchange_variable.mspx
.. _lxml: http://lxml.de/
.. _OpenSSL: https://pypi.python.org/pypi/pyOpenSSL
.. _setuptools: https://pypi.python.org/pypi/setuptools
.. _AUR Scrapy package: https://aur.archlinux.org/packages/scrapy/
4 changes: 0 additions & 4 deletions extras/scrapy.bat

This file was deleted.

2 changes: 1 addition & 1 deletion scrapy/core/downloader/handlers/ftp.py
@@ -83,7 +83,7 @@ def gotClient(self, client, request, filepath):
callbackArgs=(request, protocol),
errback=self._failed,
errbackArgs=(request,))

def _build_response(self, result, request, protocol):
self.result = result
respcls = responsetypes.from_args(url=request.url)
161 changes: 37 additions & 124 deletions setup.py
@@ -1,135 +1,48 @@
# Scrapy setup.py script
#
# It doesn't depend on setuptools, but if setuptools is available it'll use
# some of its features, like package dependencies.

from distutils.command.install_data import install_data
from distutils.command.install import INSTALL_SCHEMES
from subprocess import Popen, PIPE
import os
import sys

class osx_install_data(install_data):
# On MacOS, the platform-specific lib dir is /System/Library/Framework/Python/.../
# which is wrong. Python 2.5 supplied with MacOS 10.5 has an Apple-specific fix
# for this in distutils.command.install_data#306. It fixes install_lib but not
# install_data, which is why we roll our own install_data class.

def finalize_options(self):
# By the time finalize_options is called, install.install_lib is set to the
# fixed directory, so we set the installdir to install_lib. The
# install_data class uses ('install_data', 'install_dir') instead.
self.set_undefined_options('install', ('install_lib', 'install_dir'))
install_data.finalize_options(self)

if sys.platform == "darwin":
cmdclasses = {'install_data': osx_install_data}
else:
cmdclasses = {'install_data': install_data}

def fullsplit(path, result=None):
"""
Split a pathname into components (the opposite of os.path.join) in a
platform-neutral way.
"""
if result is None:
result = []
head, tail = os.path.split(path)
if head == '':
return [tail] + result
if head == path:
return result
return fullsplit(head, [tail] + result)
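For reference, the ``fullsplit`` helper removed here splits a relative pathname into its components; a standalone illustration:

```python
import os

def fullsplit(path, result=None):
    """Split a pathname into components (the opposite of os.path.join)
    in a platform-neutral way, as in the removed setup.py."""
    if result is None:
        result = []
    head, tail = os.path.split(path)
    if head == '':
        return [tail] + result
    if head == path:
        return result
    return fullsplit(head, [tail] + result)

print(fullsplit(os.path.join('scrapy', 'core', 'downloader')))
# → ['scrapy', 'core', 'downloader']
```

Joining the result with '.' turns a package directory path into a dotted package name, which is how the old setup.py built its ``packages`` list.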

# Tell distutils to put the data_files in platform-specific installation
# locations. See here for an explanation:
# http://groups.google.com/group/comp.lang.python/browse_thread/thread/35ec7b2fed36eaec/2105ee4d9e8042cb
for scheme in INSTALL_SCHEMES.values():
scheme['data'] = scheme['purelib']

# Compile the list of packages available, because distutils doesn't have
# an easy way to do this.
packages, data_files = [], []
root_dir = os.path.dirname(__file__)
if root_dir != '':
os.chdir(root_dir)

def is_not_module(filename):
return os.path.splitext(filename)[1] not in ['.py', '.pyc', '.pyo']

for scrapy_dir in ['scrapy']:
for dirpath, dirnames, filenames in os.walk(scrapy_dir):
# Ignore dirnames that start with '.'
for i, dirname in enumerate(dirnames):
if dirname.startswith('.'): del dirnames[i]
if '__init__.py' in filenames:
packages.append('.'.join(fullsplit(dirpath)))
data = [f for f in filenames if is_not_module(f)]
if data:
data_files.append([dirpath, [os.path.join(dirpath, f) for f in data]])
elif filenames:
data_files.append([dirpath, [os.path.join(dirpath, f) for f in filenames]])
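The package/data walk being deleted can be exercised standalone on a toy tree (hypothetical names; note the hidden-directory pruning here uses slice assignment, avoiding the in-place ``del`` during iteration in the original):

```python
import os
import tempfile

def is_not_module(filename):
    return os.path.splitext(filename)[1] not in ['.py', '.pyc', '.pyo']

def collect(root):
    """Re-creation of the removed walk: directories containing
    __init__.py become packages, other files become data_files."""
    packages, data_files = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories (safe rewrite of the original loop).
        dirnames[:] = [d for d in dirnames if not d.startswith('.')]
        if '__init__.py' in filenames:
            packages.append('.'.join(dirpath.split(os.sep)))
            data = [f for f in filenames if is_not_module(f)]
            if data:
                data_files.append(
                    [dirpath, [os.path.join(dirpath, f) for f in data]])
        elif filenames:
            data_files.append(
                [dirpath, [os.path.join(dirpath, f) for f in filenames]])
    return packages, data_files

# Toy project tree (hypothetical names):
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'pkg', 'sub'))
for rel in ['pkg/__init__.py', 'pkg/mod.py',
            'pkg/sub/__init__.py', 'pkg/data.txt']:
    open(os.path.join(tmp, *rel.split('/')), 'w').close()
os.chdir(tmp)
packages, data_files = collect('pkg')
print(packages)  # → ['pkg', 'pkg.sub']
```

The new setup.py replaces this whole walk with setuptools' ``find_packages()`` plus ``include_package_data``.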

# Small hack for working with bdist_wininst.
# See http://mail.python.org/pipermail/distutils-sig/2004-August/004134.html
if len(sys.argv) > 1 and sys.argv[1] == 'bdist_wininst':
for file_info in data_files:
file_info[0] = '\\PURELIB\\%s' % file_info[0]

scripts = ['bin/scrapy']
if os.name == 'nt':
scripts.append('extras/scrapy.bat')

if os.environ.get('SCRAPY_VERSION_FROM_GIT'):
v = Popen("git describe", shell=True, stdout=PIPE).communicate()[0]
with open('scrapy/VERSION', 'w+') as f:
f.write(v.strip())
with open(os.path.join(os.path.dirname(__file__), 'scrapy/VERSION')) as f:
version = f.read().strip()


setup_args = {
'name': 'Scrapy',
'version': version,
'url': 'http://scrapy.org',
'description': 'A high-level Python Screen Scraping framework',
'long_description': open('README.rst').read(),
'author': 'Scrapy developers',
'maintainer': 'Pablo Hoffman',
'maintainer_email': 'pablo@pablohoffman.com',
'license': 'BSD',
'packages': packages,
'cmdclass': cmdclasses,
'data_files': data_files,
'scripts': scripts,
'include_package_data': True,
'classifiers': [
'Programming Language :: Python',
'Programming Language :: Python :: 2.7',
'License :: OSI Approved :: BSD License',
'Operating System :: OS Independent',
from os.path import dirname, join
from setuptools import setup, find_packages


with open(join(dirname(__file__), 'scrapy/VERSION'), 'rb') as f:
version = f.read().decode('ascii').strip()


setup(
name='Scrapy',
version=version,
url='http://scrapy.org',
description='A high-level Python Screen Scraping framework',
long_description=open('README.rst').read(),
author='Scrapy developers',
maintainer='Pablo Hoffman',
maintainer_email='pablo@pablohoffman.com',
license='BSD',
packages=find_packages(exclude=('tests', 'tests.*')),
include_package_data=True,
zip_safe=False,
entry_points={
'console_scripts': ['scrapy = scrapy.cmdline:execute']
},
classifiers=[
'Framework :: Scrapy',
'Development Status :: 5 - Production/Stable',
'Environment :: Console',
'Intended Audience :: Developers',
'License :: OSI Approved :: BSD License',
'Operating System :: OS Independent',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.7',
'Topic :: Internet :: WWW/HTTP',
'Topic :: Software Development :: Libraries :: Application Frameworks',
'Topic :: Software Development :: Libraries :: Python Modules',
]
}

try:
from setuptools import setup
except ImportError:
from distutils.core import setup
else:
setup_args['install_requires'] = [
],
install_requires=[
'Twisted>=10.0.0',
'w3lib>=1.8.0',
'queuelib',
'lxml',
'pyOpenSSL',
'cssselect>=0.9',
'six>=1.5.2',
]

setup(**setup_args)
],
)
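The new setup.py reads ``scrapy/VERSION`` as bytes and decodes it as ASCII; a minimal round-trip of that pattern (hypothetical version string):

```python
import tempfile

# The VERSION file holds a single ASCII version string, possibly with a
# trailing newline that .strip() removes.
with tempfile.NamedTemporaryFile(mode='wb', suffix='VERSION',
                                 delete=False) as f:
    f.write(b'0.24.4\n')

with open(f.name, 'rb') as fh:
    version = fh.read().decode('ascii').strip()

print(version)  # → 0.24.4
```

Reading in binary mode and decoding explicitly keeps the result identical on Python 2 and 3, regardless of platform default encoding.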
