This repository has been archived by the owner on May 30, 2020. It is now read-only.

Implement PEP 503 data-requires-python for simple repository.
This exposes the data-requires-python field when the package registers it.

This also expands the README with some basic instructions on how to connect
pypi-legacy to a running warehouse setup.

The SQL query seems to be ~2x to 5x slower, but likely still runs in reasonable time.

SQL Before :

	explain analyse select filename, python_version, md5_digest from release_files where name='ipython_sql'

	Index Scan using release_files_name_idx on release_files  (cost=0.29..15.60 rows=3 width=66) (actual time=0.028..0.028 rows=0 loops=1)
	  Index Cond: (name = 'ipython_sql'::text)
	Planning time: 0.224 ms
	Execution time: 0.058 ms

SQL After :

	explain analyse select filename, releases.requires_python, md5_digest
	from release_files
	inner join releases
	    on release_files.version=releases.version
	    and release_files.name=releases.name
	where release_files.name='ipython_sql'

	Nested Loop  (cost=4.60..31.74 rows=1 width=61) (actual time=0.013..0.013 rows=0 loops=1)
	  Join Filter: (release_files.version = releases.version)
	  ->  Index Scan using release_files_name_idx on release_files  (cost=0.29..15.60 rows=3 width=77) (actual time=0.012..0.012 rows=0 loops=1)
		Index Cond: (name = 'ipython_sql'::text)
	  ->  Materialize  (cost=4.31..16.01 rows=3 width=18) (never executed)
		->  Bitmap Heap Scan on releases  (cost=4.31..16.00 rows=3 width=18) (never executed)
		      Recheck Cond: (name = 'ipython_sql'::text)
		      ->  Bitmap Index Scan on release_name_idx  (cost=0.00..4.31 rows=3 width=0) (never executed)
			    Index Cond: (name = 'ipython_sql'::text)
	Planning time: 0.668 ms
	Execution time: 0.092 ms

It will likely take longer for projects with more releases.
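
To illustrate the shape of the new query, here is a minimal sketch, not the code from this commit. It assumes an in-memory SQLite database in place of PostgreSQL, cut-down versions of the `releases` and `release_files` tables, and a hypothetical `demo` project and `file_rows` helper. SQLite uses `?` placeholders where the real code uses `%s`.

```python
import sqlite3

# Cut-down stand-ins for the two tables (assumption: the real schema
# has many more columns and indexes).
SCHEMA = """
create table releases (name text, version text, requires_python text);
create table release_files (name text, version text, filename text, md5_digest text);
"""

# Same join as in the commit, with SQLite's "?" placeholder.
QUERY = """
select filename, releases.requires_python, md5_digest
from release_files
inner join releases
    on release_files.version = releases.version
    and release_files.name = releases.name
where release_files.name = ?
"""


def file_rows(conn, name):
    """Return (filename, requires_python, md5_digest) for each file of a project."""
    return conn.execute(QUERY, (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "insert into releases values (?, ?, ?)",
    [("demo", "1.0", None), ("demo", "2.0", ">=3.3")],
)
conn.executemany(
    "insert into release_files values (?, ?, ?, ?)",
    [
        ("demo", "1.0", "demo-1.0.tar.gz", "aaa"),
        ("demo", "2.0", "demo-2.0.tar.gz", "bbb"),
    ],
)

# Sort on the filename only, since requires_python may be None.
rows = sorted(file_rows(conn, "demo"), key=lambda row: row[0])
for row in rows:
    print(row)
```

Each file picks up the `requires_python` value of the release it belongs to; a release that never registered the field yields `None` for its files.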
Carreau committed Aug 19, 2016
1 parent 04a98aa commit 09b4978
Showing 4 changed files with 220 additions and 83 deletions.
75 changes: 0 additions & 75 deletions README

This file was deleted.

203 changes: 203 additions & 0 deletions README.rst
@@ -0,0 +1,203 @@
Required packages
-----------------

To run the PyPI software, you need Python 2.7+ and PostgreSQL


Quick development setup
-----------------------

Make sure you read http://wiki.python.org/moin/CheeseShopDev#DevelopmentEnvironmentHints
and you have a working PostgreSQL DB.

Make sure your config.ini is up-to-date, initially copying from
config.ini.template. Change CONFIG_FILE at the beginning of pypi.wsgi,
so it looks like this::

    CONFIG_FILE = 'config.ini'

Then, you can create a development environment like this, if you have
virtualenv installed::

    $ virtualenv --no-site-packages .
    $ pip install -r requirements.txt

Then you can launch the server using the pypi.wsgi script::

    $ python pypi.wsgi
    Serving on port 8000...

PyPI will be available in your browser at http://localhost:8000

Database Setup
--------------


Postgres
~~~~~~~~

Connect Legacy-PYPI to warehouse
````````````````````````````````

It is highly recommended, and simpler, to connect legacy-pypi to an already
working `warehouse <https://github.com/pypa/warehouse>`_ setup.

Once you have a working warehouse setup, modify the ``docker-compose.yml`` file
to expose a port, by adding a ``ports`` section like so::

    db:
      image: postgres:9.5
      ports:
        - "5432:5432"


Modify the pypi-legacy ``config.ini`` ``[database]`` section to connect to this
database. You can find the required information as follows. In the
``docker-compose.yml`` file, find the line that sets the ``DATABASE_URL``::

    DATABASE_URL: postgresql://postgres@db/warehouse

It is structured as follows: ``DATABASE_URL: postgresql://<user_name>@<host>/<database_name>``

Use ``docker-machine env`` to find the Docker IP::

    $ docker-machine env
    export DOCKER_TLS_VERIFY="1"
    export DOCKER_HOST="tcp://192.168.99.100:2376"
    export DOCKER_CERT_PATH="$HOME/.docker/machine/machines/default"
    export DOCKER_MACHINE_NAME="default"

Here the docker-ip is ``192.168.99.100``.

The final ``config.ini`` will look like this::

    [database]
    ;Postgres Database using
    ;warehouse's docker-compose
    host = 192.168.99.100
    port = 5432
    name = warehouse
    user = postgres

Start warehouse as usual, then start pypi-legacy, which should now connect
to the local warehouse database.
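
The mapping from ``DATABASE_URL`` to the ``[database]`` section can be sketched in a few lines of Python. This is only an illustration, written for Python 3's standard library; the ``database_section`` helper is hypothetical and not part of pypi-legacy, and the port is assumed to be the one published in ``docker-compose.yml``.

```python
from urllib.parse import urlsplit


def database_section(database_url, docker_ip, port=5432):
    """Build the [database] settings from warehouse's DATABASE_URL.

    The host in the URL ("db") only resolves inside the compose network,
    so it is replaced by the Docker machine's IP.
    """
    parts = urlsplit(database_url)
    return {
        "host": docker_ip,
        "port": str(port),
        "name": parts.path.lstrip("/"),
        "user": parts.username,
    }


settings = database_section("postgresql://postgres@db/warehouse", "192.168.99.100")

# Print the values in config.ini form.
print("[database]")
for key, value in settings.items():
    print("{} = {}".format(key, value))
```

Only the user and database name come from the URL; the host and port depend on how Docker exposes the container on your machine.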


Run a local Postgres Database
`````````````````````````````

To fill a database, run ``pkgbase_schema.sql`` on an empty Postgres database.
Then run ``tools/demodata`` to populate the database with dummy data.

To initialize an empty Postgres Database::

    mkdir tmp
    chmod 700 tmp
    initdb -D tmp

The ``initdb`` step will likely tell you how to start a database server, with
something along the lines of::

    $ pg_ctl -D tmp -l logfile start

You probably want to start that in a separate terminal, in the folder where you
created the ``tmp`` directory.



Use the following to list all available Postgres databases::

    $ psql -l
     Name      | Owner    | Encoding | Collate     | Ctype       | Access privileges
    -----------+----------+----------+-------------+-------------+----------------------------
     postgres  | guido_vr | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
     template0 | guido_vr | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/guido_vr               +
               |          |          |             |             | guido_vr=CTc/guido_vr
     template1 | guido_vr | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/guido_vr               +
               |          |          |             |             | guido_vr=CTc/guido_vr

Note the *name* of the database (``postgres`` in the example above) and the *user*
name (``guido_vr`` here); both are needed to configure the database
in the ``config.ini`` file.


Populate the database with an example SQL file, for example the ``example.sql`` file that
can be found in the warehouse repository::

    psql -d postgres -f /path/to/example/file.sql

Where ``postgres`` is the *name* of the database noted above.


Set up the ``[database]`` section of the ``config.ini`` file to connect to the Postgres
instance we just started::

    [database]
    ;Postgres Database
    host = localhost
    port = 5432
    name = postgres
    user = guido_vr


The default *host* is likely ``localhost`` and the default *port* ``5432``.
Adapt ``name`` and ``user`` to the values noted before.


Sqlite
~~~~~~

For testing purposes, run the following to create a ``packages.db`` file at the
root of the repository::

    python2 tools/mksqlite.py

Set ``[database]driver`` to ``sqlite3`` in ``config.ini``, and
``[database]name`` to ``packages.db``::

    [database]

    driver = sqlite3
    name = packages.db

Then run ``tools/demodata`` to populate the database.

PyPI requires the ``citext`` extension to be installed.

TestPyPI Database Setup
-----------------------

testpypi runs under postgres; because I don't care to fill my head with such
trivialities, the setup commands are::

    createdb -O testpypi testpypi
    psql -U testpypi testpypi <pkgbase_schema.sql


Restarting PyPI
---------------

PyPI has two pieces that need to be started, the web server and the task runner::

    # Restart the web server
    $ /etc/init.d/pypi restart
    # Restart the task runner
    $ initctl restart pypi-worker

Clearing a stuck cache
----------------------

Users reporting stale data being displayed? Try::

    curl -X PURGE https://pypi.python.org/pypi/setuptools

(where the URL is the relevant one to the issue, I presume)

To see what Fastly thinks it knows about a page (or how it's getting to you) try::

    curl -I -H 'Fastly-Debug: 1' https://pypi.python.org/pypi/setuptools
18 changes: 12 additions & 6 deletions store.py
@@ -718,15 +718,21 @@ def get_package_urls(self, name, relative=None):
         file_urls = []

         # uploaded files
-        safe_execute(cursor, '''select filename, python_version, md5_digest
-            from release_files where name=%s''', (name,))
-        for fname, pyversion, md5 in cursor.fetchall():
+        safe_execute(cursor,
+        '''
+        select filename, releases.requires_python, md5_digest
+        from release_files
+        inner join releases
+            on release_files.version=releases.version
+            and release_files.name=releases.name
+        where release_files.name=%s
+        ''', (name,))
+        for fname, requires_python, md5 in cursor.fetchall():
             # Put files first, to have setuptools consider
             # them before going to other sites
-            url = self.gen_file_url(pyversion, name, fname, relative) + \
+            url = self.gen_file_url('<not used arg>', name, fname, relative) + \
                 "#md5=" + md5
-            file_urls.append((url, "internal", fname))
-
+            file_urls.append((url, "internal", fname, requires_python))
         return sorted(file_urls)

     def get_uploaded_file_urls(self, name):
7 changes: 5 additions & 2 deletions webui.py
@@ -1039,7 +1039,7 @@ def simple_body(self, path):
         html.append("""<html><head><title>Links for %s</title><meta name="api-version" value="2" /></head>"""
             % cgi.escape(path))
         html.append("<body><h1>Links for %s</h1>" % cgi.escape(path))
-        for href, rel, text in urls:
+        for href, rel, text, requires_python in urls:
             if href.startswith('http://cheeseshop.python.org/pypi') or \
                     href.startswith('http://pypi.python.org/pypi') or \
                     href.startswith('http://www.python.org/pypi'):
@@ -1051,7 +1051,10 @@ def simple_body(self, path):
                 rel = ''
             href = cgi.escape(href, quote=True)
             text = cgi.escape(text)
-            html.append("""<a href="%s"%s>%s</a><br/>\n""" % (href, rel, text))
+            data_attr = ''
+            if requires_python:
+                data_attr = " data-requires-python='{}'".format(cgi.escape(requires_python))
+            html.append("""<a%s href="%s"%s>%s</a><br/>\n""" % (data_attr, href, rel, text))
         html.append("</body></html>")
         html = ''.join(html)
         return html
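
As a standalone illustration of the anchor this change emits, here is a minimal sketch in Python 3. It is not the code from `webui.py`: the `simple_link` helper is hypothetical, it leaves out the `rel` attribute, and it swaps in `html.escape` with double-quoted attributes so that quote characters in the version specifier are escaped as well as `<` and `>`.

```python
from html import escape


def simple_link(href, text, requires_python=None):
    """Render one anchor of a simple-index page.

    The data-requires-python attribute is only emitted when the release
    registered a value, and its content is HTML-escaped.
    """
    data_attr = ""
    if requires_python:
        data_attr = ' data-requires-python="{}"'.format(
            escape(requires_python, quote=True)
        )
    return '<a{} href="{}">{}</a><br/>\n'.format(
        data_attr, escape(href, quote=True), escape(text)
    )


print(simple_link("demo-2.0.tar.gz#md5=bbb", "demo-2.0.tar.gz", ">=3.3"))
print(simple_link("demo-1.0.tar.gz#md5=aaa", "demo-1.0.tar.gz"))
```

A specifier such as `>=3.3` ends up as `&gt;=3.3` inside the attribute, and files of releases without the field get a plain anchor.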