Adding a doc on upgrading

coleifer · Oct 6, 2012 · d9c6371 · d9c6371
1 parent ab1579a
commit d9c6371
Show file tree

Hide file tree

Showing 2 changed files with 227 additions and 0 deletions.
diff --git a/docs/index.rst b/docs/index.rst
@@ -15,6 +15,8 @@ peewee
 For flask integration, including an admin interface and RESTful API, check
 out `flask-peewee <https://github.com/coleifer/flask-peewee/>`_.
 
+See notes on :ref:`notes on upgrading and changes from 1.0 <upgrading>`
+
 Contents:
 ---------
 
@@ -24,6 +26,7 @@ Contents:
 
    peewee/overview
    peewee/installation
+   peewee/upgrading
    peewee/cookbook
    peewee/example
    peewee/models

diff --git a/docs/peewee/upgrading.rst b/docs/peewee/upgrading.rst
@@ -0,0 +1,224 @@
+.. _upgrading:
+
+Upgrading peewee
+================
+
+Peewee went from 2319 SLOC to 1666.
+
+Goals for the new API
+---------------------
+
+* consistent: there is one way of doing things
+* expressive: things can be done that I never thought of
+
+
+.. _changes:
+
+Changes from version 1.0
+------------------------
+
+The biggest changes between 1.0 and 2.0 are in the syntax used for
+constructing queries.  The first iteration of peewee I threw up on github
+was about 600 lines.  I was passing around strings and dictionaries and
+as time went on and I added features, those strings turned into tuples and
+objects.  This meant, though, that I needed code to handle all the possible
+ways of expressing something.  Look at the code for `parse_select <https://gist.github.com/a957dbbff0310fd88d5c>`_.
+
+I learned a valuable lesson:  keep data in datastructures until the
+*absolute* last second.
+
+With the benefit of hindsight and experience, I decided to rewrite and unify
+the API a bit.  The result is a tradeoff.  The newer syntax may be a bit more 
+verbose at times, but at least it will be consistent.
+
+Since seeing is believing, I will show some side-by-side comparisons.  Let's
+pretend we're using the models from the cookbook, good ol' user and tweet:
+
+.. code-block:: python
+
+    class User(Model):
+        username = CharField()
+
+    class Tweet(Model):
+        user = ForeignKeyField(User, related_name='tweets')
+        message = TextField()
+        created_date = DateTimeField(default=datetime.datetime.now)
+        is_published = BooleanField(default=True)
+
+
+Get me a list of all tweets by a user named "charlie":
+
+.. code-block:: python
+
+    # 1.0
+    Tweet.select().join(User).where(username='charlie') 
+
+    # 2.0
+    Tweet.select().join(User).where(User.username == 'charlie')
+
+Get me a list of tweets ordered by the authors username, then newest to oldest:
+
+.. code-block:: python
+
+    # 1.0
+    Tweet.select().order_by(('created_date', 'desc')).join(User).order_by('username')
+
+    # 2.0
+    Tweet.select().join(User).order_by(User.username, Tweet.created_date.desc())
+
+Get me a list of tweets created by users named "charlie" or "peewee herman", and
+which were created in the last week.
+
+.. code-block:: python
+
+    last_week = datetime.datetime.now() - datetime.timedelta(days=7)
+
+    # 1.0
+    Tweet.select().where(created_date__gt=last_week).join(User).where(
+        Q(username='charlie') | Q(username='peewee herman')
+    )
+
+    # 2.0
+    Tweet.select().join(User).where(Tweet.created_date > last_week & (
+        (User.username == 'charlie') | (User.username == 'peewee herman')
+    ))
+
+Get me a list of users and when they last tweeted (if ever):
+
+.. code-block:: python
+
+    # 1.0
+    User.select({
+        User: ['*'], 
+        Tweet: [Max('created_date', 'last_date')]
+    }).join(Tweet, 'LEFT OUTER').group_by(User)
+
+    # 2.0
+    User.select(
+        User, fn.Max(Tweet.created_date).alias('last_date')
+    ).join(Tweet, JOIN_LEFT_OUTER).group_by(User)
+
+Lets do an atomic update on a counter model (you'll have to user your imagination):
+
+.. code-block:: python
+
+    # 1.0
+    Counter.update(count=F('count') + 1).where(url=request.url)
+
+    # 2.0
+    Counter.update(count=Counter.count + 1).where(Counter.url == request.url)
+
+Let's find all the users whose username starts with 'a' or 'A':
+
+.. code-block:: python
+
+    # 1.0
+    User.select().where(R('LOWER(SUBSTR(username, 1, 1)) = %s', 'a'))
+
+    # 2.0
+    User.select().where(fn.Lower(fn.Substr(User.username, 1, 1)) == 'a')
+
+
+I hope a couple things jump out at you from these examples.  What I see is
+that the 1.0 API is sometimes a bit less verbose, but it relies on strings in 
+many places (which may be fields, aliases, selections, join types, functions, etc).  In the 
+where clause stuff gets crazy as there are args being combined with bitwise 
+operators ("Q" expressions) and also kwargs being used with django-style "double-underscore" 
+lookups. The crazy thing is, there are so many different ways I could have expressed
+some of the above queries using peewee 1.0 that I had a hard time deciding which 
+to even write.
+
+The 2.0 API is hopefully more consistent.  Selections, groupings, functions, joins
+and orderings all pretty much conform to the same API.  Likewise, where and having
+clauses are handled the same way (in 1.0 the where clause is simply a raw string).
+
+
+Changes in fields and columns
+-----------------------------
+
+Well, for one, columns are gone.  They were a shim that I used to hack in non-integer
+primary keys.  I always thought the field SQL generation was one of the grosser
+parts of the module and even worse was the back-and-forth that happened between the
+field and column classes.  So, columns are gone - its just fields - and they're
+hopefully a bit smaller and saner.  I also cleaned up the primary key business.
+Basically it works like this:
+
+* if you don't specify a primary key, one will be created named "id"
+* if you do specify a primary key and it is a PrimaryKeyField (or subclass),
+  it will be an automatically incrementing integer
+* if you specify a primary key and it is anything else peewee assumes you are
+  in control and will stay out of the way.
+
+The API for specifying a non-auto-incrementing primary key changed:
+
+.. code-block:: python
+
+    # 1.0
+    class OldSchool(Model):
+        uuid = PrimaryKeyField(column_class=VarCharColumn)
+
+    # 2.0
+    class NewSchool(Model):
+        uuid = CharField(primary_key=True)
+
+The kwargs for the Field constructor changed slightly, the biggest probably
+being that ``db_index`` was renamed to ``index``.
+
+
+Changes in database and adapter
+-------------------------------
+
+In peewee 1.0 there were two classes that controlled access to the database --
+the Database subclass and an Adapter.  The adapter's job was to say what features
+a database backend provided, what operations were valid, what column types were
+supported, and how to open a connection.  The database was a bit higher-level and
+its main job was to execute queries and provide metadata about the database, like
+lists of tables, last insert id, etc.
+
+I chose to consolidate these two classes, since inevitably they always went in
+pairs (e.g. SqliteDatabase/SqliteAdapter).  The database class now encapsulates
+all this functionality.
+
+
+How the SQL gets made
+---------------------
+
+The first thing I started with is the QueryCompiler and the data structures it
+uses.  It takes the data structures from peewee and spits out SQL.  It works 
+recursively and knows about a few types of expressions:
+
+* the query tree
+* comparison statements like '==', 'IN', 'LIKE' which comprise the leaves of the tree
+* expressions like addition, substraction, bitwise operations
+* sql functions like ``substr`` and ``lower``
+* aggregate functions like ``count`` and ``max``
+* columns, which may be selected, joined on, grouped by, ordered by, used as parameters
+  for functions and aggregates, etc.
+* python objects to use as query parameters
+
+At the heart of it is the ``Expr`` object, which is for "expression".  It
+can be anything that can validly be translated into part of a SQL query.
+
+Expressions can be nested, giving way to interesting possibilities like
+the following example I love which selects users whose username starts with "a":
+
+.. code-block:: python
+
+    User.select().where(fn.Substr(fn.Lower(User.username, 1, 1)) == 'a')
+
+The "where" clause now contains a tree with one leaf.  The leaf represents the
+nested function expression on the left-hand-side and the scalar value 'a' on the
+right hand side.  Peewee will recursively evaluate the expressions on either side
+of the operation and generate the correct SQL.
+
+Another aspect is that :py:class:`Field` objects are also expressions, which
+makes it possible to write things like:
+
+.. code-block:: python
+
+    Employee.select().where(Employee.salary < (Employee.tenure * 1000) + 40000)
+
+.. note:: I totally went crazy with operator overloading.
+
+If you're interested in looking, the ``QueryCompiler.parse_expr`` method is where
+the bulk of the code lives.