Skip to content

Commit

Permalink
Adding a doc on upgrading
Browse files Browse the repository at this point in the history
  • Loading branch information
coleifer committed Oct 6, 2012
1 parent ab1579a commit d9c6371
Show file tree
Hide file tree
Showing 2 changed files with 227 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/index.rst
Expand Up @@ -15,6 +15,8 @@ peewee
For flask integration, including an admin interface and RESTful API, check
out `flask-peewee <https://github.com/coleifer/flask-peewee/>`_.

See notes on :ref:`notes on upgrading and changes from 1.0 <upgrading>`

Contents:
---------

Expand All @@ -24,6 +26,7 @@ Contents:

peewee/overview
peewee/installation
peewee/upgrading
peewee/cookbook
peewee/example
peewee/models
Expand Down
224 changes: 224 additions & 0 deletions docs/peewee/upgrading.rst
@@ -0,0 +1,224 @@
.. _upgrading:

Upgrading peewee
================

Peewee went from 2319 SLOC to 1666.

Goals for the new API
---------------------

* consistent: there is one way of doing things
* expressive: things can be done that I never thought of


.. _changes:

Changes from version 1.0
------------------------

The biggest changes between 1.0 and 2.0 are in the syntax used for
constructing queries. The first iteration of peewee I threw up on github
was about 600 lines. I was passing around strings and dictionaries and
as time went on and I added features, those strings turned into tuples and
objects. This meant, though, that I needed code to handle all the possible
ways of expressing something. Look at the code for `parse_select <https://gist.github.com/a957dbbff0310fd88d5c>`_.

I learned a valuable lesson: keep data in datastructures until the
*absolute* last second.

With the benefit of hindsight and experience, I decided to rewrite and unify
the API a bit. The result is a tradeoff. The newer syntax may be a bit more
verbose at times, but at least it will be consistent.

Since seeing is believing, I will show some side-by-side comparisons. Let's
pretend we're using the models from the cookbook, good ol' user and tweet:

.. code-block:: python
class User(Model):
username = CharField()
class Tweet(Model):
user = ForeignKeyField(User, related_name='tweets')
message = TextField()
created_date = DateTimeField(default=datetime.datetime.now)
is_published = BooleanField(default=True)
Get me a list of all tweets by a user named "charlie":

.. code-block:: python
# 1.0
Tweet.select().join(User).where(username='charlie')
# 2.0
Tweet.select().join(User).where(User.username == 'charlie')
Get me a list of tweets ordered by the authors username, then newest to oldest:

.. code-block:: python
# 1.0
Tweet.select().order_by(('created_date', 'desc')).join(User).order_by('username')
# 2.0
Tweet.select().join(User).order_by(User.username, Tweet.created_date.desc())
Get me a list of tweets created by users named "charlie" or "peewee herman", and
which were created in the last week.

.. code-block:: python
last_week = datetime.datetime.now() - datetime.timedelta(days=7)
# 1.0
Tweet.select().where(created_date__gt=last_week).join(User).where(
Q(username='charlie') | Q(username='peewee herman')
)
# 2.0
Tweet.select().join(User).where(Tweet.created_date > last_week & (
(User.username == 'charlie') | (User.username == 'peewee herman')
))
Get me a list of users and when they last tweeted (if ever):

.. code-block:: python
# 1.0
User.select({
User: ['*'],
Tweet: [Max('created_date', 'last_date')]
}).join(Tweet, 'LEFT OUTER').group_by(User)
# 2.0
User.select(
User, fn.Max(Tweet.created_date).alias('last_date')
).join(Tweet, JOIN_LEFT_OUTER).group_by(User)
Lets do an atomic update on a counter model (you'll have to user your imagination):

.. code-block:: python
# 1.0
Counter.update(count=F('count') + 1).where(url=request.url)
# 2.0
Counter.update(count=Counter.count + 1).where(Counter.url == request.url)
Let's find all the users whose username starts with 'a' or 'A':

.. code-block:: python
# 1.0
User.select().where(R('LOWER(SUBSTR(username, 1, 1)) = %s', 'a'))
# 2.0
User.select().where(fn.Lower(fn.Substr(User.username, 1, 1)) == 'a')
I hope a couple things jump out at you from these examples. What I see is
that the 1.0 API is sometimes a bit less verbose, but it relies on strings in
many places (which may be fields, aliases, selections, join types, functions, etc). In the
where clause stuff gets crazy as there are args being combined with bitwise
operators ("Q" expressions) and also kwargs being used with django-style "double-underscore"
lookups. The crazy thing is, there are so many different ways I could have expressed
some of the above queries using peewee 1.0 that I had a hard time deciding which
to even write.

The 2.0 API is hopefully more consistent. Selections, groupings, functions, joins
and orderings all pretty much conform to the same API. Likewise, where and having
clauses are handled the same way (in 1.0 the where clause is simply a raw string).


Changes in fields and columns
-----------------------------

Well, for one, columns are gone. They were a shim that I used to hack in non-integer
primary keys. I always thought the field SQL generation was one of the grosser
parts of the module and even worse was the back-and-forth that happened between the
field and column classes. So, columns are gone - its just fields - and they're
hopefully a bit smaller and saner. I also cleaned up the primary key business.
Basically it works like this:

* if you don't specify a primary key, one will be created named "id"
* if you do specify a primary key and it is a PrimaryKeyField (or subclass),
it will be an automatically incrementing integer
* if you specify a primary key and it is anything else peewee assumes you are
in control and will stay out of the way.

The API for specifying a non-auto-incrementing primary key changed:

.. code-block:: python
# 1.0
class OldSchool(Model):
uuid = PrimaryKeyField(column_class=VarCharColumn)
# 2.0
class NewSchool(Model):
uuid = CharField(primary_key=True)
The kwargs for the Field constructor changed slightly, the biggest probably
being that ``db_index`` was renamed to ``index``.


Changes in database and adapter
-------------------------------

In peewee 1.0 there were two classes that controlled access to the database --
the Database subclass and an Adapter. The adapter's job was to say what features
a database backend provided, what operations were valid, what column types were
supported, and how to open a connection. The database was a bit higher-level and
its main job was to execute queries and provide metadata about the database, like
lists of tables, last insert id, etc.

I chose to consolidate these two classes, since inevitably they always went in
pairs (e.g. SqliteDatabase/SqliteAdapter). The database class now encapsulates
all this functionality.


How the SQL gets made
---------------------

The first thing I started with is the QueryCompiler and the data structures it
uses. It takes the data structures from peewee and spits out SQL. It works
recursively and knows about a few types of expressions:

* the query tree
* comparison statements like '==', 'IN', 'LIKE' which comprise the leaves of the tree
* expressions like addition, substraction, bitwise operations
* sql functions like ``substr`` and ``lower``
* aggregate functions like ``count`` and ``max``
* columns, which may be selected, joined on, grouped by, ordered by, used as parameters
for functions and aggregates, etc.
* python objects to use as query parameters

At the heart of it is the ``Expr`` object, which is for "expression". It
can be anything that can validly be translated into part of a SQL query.

Expressions can be nested, giving way to interesting possibilities like
the following example I love which selects users whose username starts with "a":

.. code-block:: python
User.select().where(fn.Substr(fn.Lower(User.username, 1, 1)) == 'a')
The "where" clause now contains a tree with one leaf. The leaf represents the
nested function expression on the left-hand-side and the scalar value 'a' on the
right hand side. Peewee will recursively evaluate the expressions on either side
of the operation and generate the correct SQL.

Another aspect is that :py:class:`Field` objects are also expressions, which
makes it possible to write things like:

.. code-block:: python
Employee.select().where(Employee.salary < (Employee.tenure * 1000) + 40000)
.. note:: I totally went crazy with operator overloading.

If you're interested in looking, the ``QueryCompiler.parse_expr`` method is where
the bulk of the code lives.

0 comments on commit d9c6371

Please sign in to comment.