Merge c860917 into 2b1caad
havok2063 committed Apr 3, 2020
2 parents 2b1caad + c860917 commit 3a16b4e
Showing 31 changed files with 1,317 additions and 53 deletions.
7 changes: 5 additions & 2 deletions .travis.yml
Original file line number Diff line number Diff line change
@@ -6,6 +6,9 @@ cache:

sudo: false

services:
- postgresql

python:
- '2.7'
- '3.5'
@@ -34,10 +37,10 @@ install:
- pip install pytest
- pip install pytest-coverage
- pip install coveralls
- python setup.py install
- pip install .[all,dev]

script:
- pytest python/sdssdb/tests --cov python/sdssdb --cov-report html
- pytest -p no:sugar python/sdssdb/tests --cov python/sdssdb --cov-report html

after_success:
- coveralls
6 changes: 6 additions & 0 deletions CHANGELOG.rst
@@ -5,6 +5,12 @@ Changelog

This document records the main changes to the ``sdssdb`` code.

* Test suite only runs where existing local databases are found. Optionally run only ``peewee`` or ``sqlalchemy`` tests.
* Adds ability to generate fake data based on real database models for tests
* Adds ability to test against real or fake databases
* Write tests either for ``peewee`` or ``sqlalchemy`` databases
* :feature:`-` New framework for writing tests against databases

* :release:`0.3.2 <2020-03-10>`
* Change ``operations-test`` profile to ``operations`` using the new machine hostname.
* New schema and models for ``sdss5db.targetdb``.
257 changes: 254 additions & 3 deletions docs/sphinx/contributing.rst
@@ -32,7 +32,11 @@ In addition to improvements to the code, you can contribute database connections
| |__ schema1.py
| |__ schema2.py
Let's imagine you want to create files for a database called ``awesomedb`` which has two schemas: ``amazing`` and ``stupendous``. Depending on whether you are creating model classes for peewee or sqlalchemy (or both), you will need to create a directory called ``awesomedb`` under the correct library directory with a ``__init__.py`` file and two ``amazing.py`` and ``stupendous.py`` files. The following sections will show you how to fill out those files depending on the library used.
Let's imagine you want to create files for a database called ``awesomedb`` which has two schemas: ``amazing``
and ``stupendous``. Depending on whether you are creating model classes for peewee or sqlalchemy (or both),
you will need to create a directory called ``awesomedb`` under the correct library directory with a
``__init__.py`` file and two ``amazing.py`` and ``stupendous.py`` files. The following sections will show you
how to fill out those files depending on the library used.


Peewee
@@ -131,10 +135,257 @@ For the model classes you will need to write the files manually but there is no
database.add_base(Base)


In this example we have two tables, ``user`` and ``address`` that we model as ``User`` and ``Address`` respectively. Note that we don't need to specify any column at this point, just the ``__tablename__`` metadata property. All model classes need to subclass from ``Base``, which in turn subclasses from `~sqlalchemy.ext.declarative.AbstractConcreteBase` and ``AwesomedbBase``. We can use the special attribute ``print_fields`` to define a list of fields that will be output in the standard representation of the model instances (primary keys and ``label`` fields are always output).
In this example we have two tables, ``user`` and ``address`` that we model as ``User`` and ``Address``
respectively. Note that we don't need to specify any column at this point, just the ``__tablename__``
metadata property. All model classes need to subclass from ``Base``, which in turn subclasses from
`~sqlalchemy.ext.declarative.AbstractConcreteBase` and ``AwesomedbBase``. We can use the special attribute
``print_fields`` to define a list of fields that will be output in the standard representation of the model
instances (primary keys and ``label`` fields are always output).

The ``define_relations`` function must contain all the foreign key relationships for this model. In this case there is only one relationship, which allows us to retrieve the address for a given ``User`` (and its back reference). We need to encapsulate the relationships in a function so that they can be recreated if we change the database connection to point to a different database. Finally, we add the ``database.add_base(Base)`` statement to bind the base to the database connection.
The ``define_relations`` function must contain all the foreign key relationships for this model. In this
case there is only one relationship, which allows us to retrieve the address for a given ``User`` (and its
back reference). We need to encapsulate the relationships in a function so that they can be recreated if
we change the database connection to point to a different database. Finally, we add the
``database.add_base(Base)`` statement to bind the base to the database connection.

Testing Your New Database
-------------------------

After creating your database, you will want to ensure its stability and robustness as you expand its
capabilities over time. This can be done by writing tests against your database. The testing directory layout
is very similar to the ``sdssdb`` database directory, with test database files located within separate library
folders for ``peewee`` databases (``pwdbs``) or ``sqlalchemy`` databases (``sqladbs``).

.. code-block:: none

    tests
    |
    |__ pwdbs
    |   |
    |   |__ __init__.py
    |   |__ conftest.py
    |   |__ models.py
    |   |__ factories.py
    |   |
    |   |__ test_database1.py
    |   |__ test_database2.py
    |
    |__ sqladbs
    |   |
    |   |__ __init__.py
    |   |__ conftest.py
    |   |__ models.py
    |   |__ factories.py
    |   |
    |   |__ test_database1.py
    |   |__ test_database2.py
    |
    |__ conftest.py
    |__ test_generic_items.py

Most Python testing frameworks look for tests in files named ``test_xxxx.py``. Under each library we create a
``test_xxxx`` file for each new database we want to test. Since we've created a new ``awesomedb`` database, our
testing file will be ``test_awesomedb.py``. This file gets placed under ``pwdbs``, ``sqladbs``, or both,
depending on whether your database uses ``peewee`` or ``sqlalchemy``.

``sdssdb`` uses `pytest <https://docs.pytest.org/en/latest/>`_ as its testing framework, and assumes user
familiarity with pytest. The test directories contain ``conftest.py`` files which are files used for sharing
fixture functions between tests. See `here <https://docs.pytest.org/en/latest/fixture.html#conftest-py-sharing-fixture-functions>`_
for more details. You will also see files called ``models`` and ``factories``. We will come back to these later.

Peewee
^^^^^^

Let's see what an example ``test_awesomedb.py`` might look like::

    import pytest
    from sdssdb.peewee.awesomedb import database, stupendous


    @pytest.mark.parametrize('database', [database], indirect=True)
    class TestStupendous(object):

        def test_user_count(self):
            ''' test that count of user table returns results '''
            user_ct = stupendous.User.select().count()
            assert user_ct > 0

We follow pytest's `test naming convention <https://docs.pytest.org/en/latest/goodpractices.html#test-discovery>`_
for naming test files as well as tests within files. In our ``test_awesomedb`` file, we group similar tests
by schema together into ``Test`` classes, i.e. for the ``stupendous`` schema, we create a ``TestStupendous`` class.
All tests related to the ``stupendous`` schema will be defined in this class. Individual tests within each class
are defined as methods on the class, named with ``test_xxxx``.

In order for our test class to understand that we wish to use the ``awesomedb`` database for all defined tests, we
use the provided ``database`` fixture function and parametrize it with the ``awesomedb`` database. See
`fixture parametrization <https://docs.pytest.org/en/latest/fixture.html#parametrizing-fixtures>`_ to learn more
about how to parametrize tests or fixtures.
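
As a rough illustration of how indirect parametrization reaches a fixture, the sketch below defines a stand-in ``database`` fixture. It is not the fixture shipped with ``sdssdb``: the names ``resolve_database`` and ``TEMPORARY_DB`` are hypothetical, and plain strings stand in for real database objects. The only pytest behaviour relied on is that, with ``indirect=True``, each parametrized value is handed to the fixture as ``request.param``.

```python
import pytest

# Hypothetical placeholder for whatever the real fixture returns by default.
TEMPORARY_DB = "temporary test database"


def resolve_database(param=None):
    """Return the parametrized database, or a temporary stand-in when none is given."""
    return param if param is not None else TEMPORARY_DB


@pytest.fixture
def database(request):
    # With indirect=True, pytest passes each parametrize value here as request.param.
    # Without parametrization the attribute is missing, so fall back to None.
    return resolve_database(getattr(request, "param", None))


@pytest.mark.parametrize("database", ["awesomedb"], indirect=True)
class TestStupendous:

    def test_uses_parametrized_database(self, database):
        assert database == "awesomedb"


def test_falls_back_to_temporary_database(database):
    assert database == TEMPORARY_DB
```

Keeping the decision in a plain function such as ``resolve_database`` makes the fallback logic easy to check without running pytest itself.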

We've defined a simple test, ``test_user_count``, that checks that our ``user`` table returns
more than zero results. In this case, we are performing a simple select statement that does not modify the
database. If we are writing tests that perform write operations on the database, we can use the provided
``transaction`` fixture to ensure all changes are rolled back.
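
The rollback idea can be sketched without ``sdssdb`` at all. The example below uses the standard-library ``sqlite3`` module as a stand-in for a real database, and the ``rolled_back`` helper is a hypothetical name, not the ``transaction`` fixture itself. It only shows the pattern: write inside a transaction, inspect the result, then discard the changes.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(connection):
    """Yield the connection, then discard any uncommitted changes."""
    try:
        yield connection
    finally:
        connection.rollback()


def count_users(connection):
    """Return the number of rows currently visible in the user table."""
    return connection.execute("SELECT COUNT(*) FROM user").fetchone()[0]


# An in-memory database with an empty table, committed before the test starts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (pk INTEGER PRIMARY KEY, first TEXT)")
conn.commit()

with rolled_back(conn) as c:
    c.execute("INSERT INTO user (first) VALUES ('New Bob')")
    count_inside = count_users(c)  # the new row is visible inside the transaction

count_after = count_users(conn)  # the insert has been rolled back
```

A test written this way can freely insert or modify rows, because nothing it does survives past the end of the ``with`` block.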

SQLAlchemy
^^^^^^^^^^

The example ``test_awesomedb.py`` file for a ``sqlalchemy`` database will look very similar to the
``peewee`` version::

    import pytest
    from sdssdb.sqlalchemy.awesomedb import database
    if database.connected:
        from sdssdb.sqlalchemy.awesomedb import stupendous


    @pytest.mark.parametrize('database', [database], indirect=True)
    class TestStupendous(object):

        def test_user_count(self, session):
            ''' test that count of user table returns results '''
            user_ct = session.query(stupendous.User).count()
            assert user_ct > 0

There are two main differences between this file and the ``peewee`` version. The first is that we must wrap the
import of the ``stupendous`` models inside a conditional that checks whether the database connection
succeeded. This is needed because importing ``sqlalchemy`` models when the database does not exist, or
cannot be reached, breaks other successful database imports. The second change is the use of the ``session``
fixture inside the test. Since ``sqlalchemy`` needs a db session to perform queries, we use the
provided ``session`` pytest fixture. This fixture ensures that all changes made to the database
are rolled back and not made permanent.

Generating and Inserting Fake Data into Your Database Tables
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you are only interested in writing simple tests that test real data in your database
tables, then you can stop here and start writing your tests. Sometimes, however, you may want to write
tests for special database queries or model functions where you don't quite have the right data, or enough
of it, loaded. In these cases, we can generate fake data and insert it dynamically into our database tables.
To do so, we have to create a "model factory". This factory creates fake data based on a database Model.

The following examples use the following resources to generate fake data:

- `factory_boy <https://factoryboy.readthedocs.io/en/latest/>`_ - creates db model factories to generate fake entries
- `faker <https://faker.readthedocs.io/en/master/index.html>`_ - creates fake data as needed by models
- `pytest-factoryboy <https://pytest-factoryboy.readthedocs.io/en/latest/>`_ - turns model factories into pytest fixtures

Let's see how to create factories to generate fake Users and Addresses, inside the ``factories.py`` file,
using the ``peewee`` library implementation as an example::

    import factory
    from sdssdb.peewee.awesomedb import database as awesomedb, stupendous
    from .factoryboy import PeeweeModelFactory

    class AddressFactory(PeeweeModelFactory):
        # define a Meta class with the associated model and database
        class Meta:
            model = stupendous.Address
            database = awesomedb

        # define fake data generators for all columns in the table
        pk = factory.Sequence(lambda n: n)
        street = factory.Faker('street_address')
        city = factory.Faker('city')
        state = factory.Faker('state_abbr')
        zipcode = factory.Faker('zipcode')
        full = factory.LazyAttribute(lambda a: f'{a.street}\n{a.city}, {a.state} {a.zipcode}')

    class UserFactory(PeeweeModelFactory):
        class Meta:
            model = stupendous.User
            database = awesomedb

        pk = factory.Sequence(lambda n: n)
        first = factory.Faker('first_name')
        last = factory.Faker('last_name')
        name = factory.LazyAttribute(lambda u: f'{u.first} {u.last}')

        # establishes the one-to-one relationship
        address = factory.SubFactory(AddressFactory)

Assuming the ``User`` and ``Address`` models created previously have the columns listed in the factories
above, we use the `factory_boy declarations <https://factoryboy.readthedocs.io/en/latest/reference.html#declarations>`_
and `factory.Faker providers <https://faker.readthedocs.io/en/master/providers.html>`_ to assign each column
a fake data generator. Each factory needs a ``Meta`` class that defines the database
model associated with it, as well as the database it belongs to.

These factories allow us to create fake instances of data that are automatically inserted into the
designated database table. To create an instance locally without database insertion, you can use
``UserFactory.build``; to create instances in bulk, use ``UserFactory.create_batch``::

    >>> user = UserFactory()
    >>> user
    <User: pk=1, name='Walter Brown'>
    >>> user.address
    <Address: pk=1>
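
To make the ideas behind these declarations concrete, here is a minimal plain-Python sketch that does not use ``factory_boy`` at all. The ``User`` dataclass, ``build_user`` and ``build_user_batch`` are hypothetical stand-ins: they only mimic a sequence-driven primary key, a ``name`` derived from other fields, and batch creation. Nothing is written to a database.

```python
import itertools
from dataclasses import dataclass


@dataclass
class User:
    pk: int
    first: str
    last: str
    name: str


# Plays the role of a sequence declaration: each new object gets the next integer.
_pk_counter = itertools.count(1)


def build_user(first="Walter", last="Brown"):
    """Create one in-memory user; ``name`` is derived from the other fields."""
    return User(pk=next(_pk_counter), first=first, last=last, name=f"{first} {last}")


def build_user_batch(size, **overrides):
    """Create ``size`` users that share the same overrides but get distinct keys."""
    return [build_user(**overrides) for _ in range(size)]


single = build_user()
batch = build_user_batch(3, first="New Bob")
```

Overrides passed as keyword arguments replace the default value for that field only, which is the same way a value such as ``first='New Bob'`` is supplied to a real factory.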

The more common use, however, will be in tests. These factories automatically get converted into pytest
fixture functions using ``pytest-factoryboy``. Let's see how we would use this in ``test_awesomedb.py``::

    @pytest.mark.parametrize('database', [database], indirect=True)
    class TestStupendous(object):

        def test_new_user(self, user_factory):
            ''' test that we add a new user '''
            user_factory.create(first='New Bob')
            user = stupendous.User.get(stupendous.User.first == 'New Bob')
            assert user.first == 'New Bob'

Notice the lowercase, underscore-separated name ``user_factory``. This is the fixture name of the
``UserFactory``. The above examples were written using the ``peewee`` implementation. For real examples,
see the sdss5db tests in ``tests/pwdbs/test_sdss5db.py`` and associated factories in
``tests/pwdbs/factories.py``. The ``sqlalchemy`` version of defining a factory is very similar::

    import factory
    from sdssdb.tests.sqladbs import get_model_from_database
    from sdssdb.sqlalchemy.awesomedb import database as awesomedb
    stupendous = get_model_from_database(awesomedb, 'stupendous')

    if stupendous:
        class UserFactory(factory.alchemy.SQLAlchemyModelFactory):
            ''' factory for stupendous user table '''
            class Meta:
                model = stupendous.User
                sqlalchemy_session = awesomedb.Session   # the SQLAlchemy session object

            # column definitions as before
            pk = factory.Sequence(lambda n: n)
            ...

Because ``sqlalchemy`` models cannot be imported when no database exists locally, we must use
``get_model_from_database`` to conditionally import the models we need, and place the factory class inside
a conditional. Additionally, the factory ``Meta`` class needs the ``sqlalchemy`` Session rather than the database itself.
All other behaviours and definitions are the same. For examples of ``sqlalchemy`` factories and their uses, see
``tests/sqladbs/factories.py`` and the mangadb tests in ``tests/sqladbs/test_mangadb.py``.
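
The conditional-import pattern described above can be sketched as follows. This is not the implementation of ``get_model_from_database``. The helper ``load_schema_if_connected`` and the ``FakeDatabase`` class are hypothetical, and the standard-library ``json`` module stands in for a schema module so that the example runs anywhere.

```python
import importlib


def load_schema_if_connected(database, module_path):
    """Import and return a schema module only when the database is connected."""
    if not getattr(database, "connected", False):
        return None
    return importlib.import_module(module_path)


class FakeDatabase:
    """Stand-in exposing only the ``connected`` flag used above."""

    def __init__(self, connected):
        self.connected = connected


# 'json' is only a placeholder for a real schema module path.
offline = load_schema_if_connected(FakeDatabase(False), "json")
online = load_schema_if_connected(FakeDatabase(True), "json")
```

Because the helper returns ``None`` when the database is unavailable, a guard such as ``if stupendous:`` around the factory definitions is enough to skip them cleanly.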

Using a Generic Test Database
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes you may want to test a function common to many databases, or a generic database connection, or you
may simply not want to touch real databases. In these cases, a temporary test postgres database is available to use.
By default, when no real database is passed into the ``database`` fixture function, the test database is generated.
For example, the ``peewee`` test example case from earlier would now be the following, with the pytest
parametrization line removed::

    class TestStupendous(object):

        def test_user_count(self):
            ''' test that count of user table returns results '''
            user_ct = stupendous.User.select().count()
            assert user_ct > 0

This test would now use the temporary database, which is set up and destroyed for each test module. Because
the test database is created as a blank slate, all database models must be created as well, in addition to any
model factories. These models can be stored in the ``models.py`` file under the respective library directories.
See any of the ``models.py`` files for examples of creating test database models, and ``factories.py`` for their
associated factories. See any of the tests defined in ``test_factory.py`` for examples of how to write tests
against temporary database models defined in ``models.py``.

Should I use Peewee or SQLAlchemy?
----------------------------------
24 changes: 23 additions & 1 deletion python/sdssdb/connection.py
@@ -59,18 +59,25 @@ class DatabaseConnection(six.with_metaclass(abc.ABCMeta)):
autoconnect : bool
Whether to autoconnect to the database using the profile parameters.
Requires `.dbname` to be set.
dbversion : str
A database version. If specified, appends to dbname as "dbname_dbversion"
and becomes the dbname used for connection strings.
"""

#: The database name.
dbname = None
dbversion = None

def __init__(self, dbname=None, profile=None, autoconnect=True):
def __init__(self, dbname=None, profile=None, autoconnect=True, dbversion=None):

#: Reports whether the connection is active.
self.connected = False
self.profile = None
self.dbname = dbname if dbname else self.dbname
self.dbversion = dbversion or self.dbversion
if self.dbversion:
self.dbname = f'{self.dbname}_{self.dbversion}'

self.set_profile(profile=profile, connect=autoconnect)

@@ -303,6 +310,18 @@ def become_user(self):

self.become(user)

def change_version(self, dbversion=None):
''' Change database version and attempt to reconnect
Parameters:
dbversion (str):
A database version
'''
self.dbversion = dbversion
dbname, *dbver = self.dbname.split('_')
self.dbname = f'{dbname}_{self.dbversion}' if dbversion else dbname
self.connect(dbname=self.dbname, silent_on_fail=True)


if _peewee:
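
The naming rule used by ``dbversion`` and ``change_version`` in this hunk can be illustrated in isolation. The two functions below are hypothetical helpers written for this note, not part of ``sdssdb``, and the database names are made up. They reproduce the ``f'{dbname}_{dbversion}'`` composition and the ``split('_')`` step, which keeps only the text before the first underscore.

```python
def versioned_name(dbname, dbversion=None):
    """Return 'dbname_dbversion' when a version is given, else dbname unchanged."""
    return f"{dbname}_{dbversion}" if dbversion else dbname


def strip_version(full_name):
    """Mirror the split('_') step: keep only the text before the first underscore."""
    base, *_rest = full_name.split("_")
    return base


with_version = versioned_name("awesomedb", "v2")
without_version = versioned_name("awesomedb")
recovered = strip_version(with_version)

# A base name that itself contains an underscore is truncated by the split.
truncated = strip_version("my_db_v2")
```

The last line suggests the approach assumes base database names never contain an underscore. That assumption is worth confirming against the databases actually in use.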

@@ -426,6 +445,8 @@ def _conn(self, dbname, silent_on_fail=False, **params):
self.engine.dispose()
self.engine = None
self.connected = False
self.Session = None
self.metadata = None
else:
self.connected = True
self.dbname = dbname
@@ -436,6 +457,7 @@ def _conn(self, dbname, silent_on_fail=False, **params):
def reset_engine(self):
''' Reset the engine, metadata, and session '''

self.bases = []
if self.engine:
self.engine.dispose()
self.engine = None
6 changes: 6 additions & 0 deletions python/sdssdb/etc/sdssdb.yml
@@ -51,3 +51,9 @@ utahdb:
host: db.sdss.utah.edu
port: 5432
domain: db.sdss.utah.edu

slore:
user: sdss
host: lore.sdss.utah.edu
port: 5432
domain: lore.sdss.utah.edu
4 changes: 2 additions & 2 deletions python/sdssdb/sqlalchemy/__init__.py
@@ -99,11 +99,11 @@ def cone_search(self, ra, dec, a, b=None, pa=None, ra_col='ra', dec_col='dec'):
dec_attr = getattr(self, dec_col)

if b is None:
return fn.q3c_radial_query(ra_attr, dec_attr, ra, dec, a)
return func.q3c_radial_query(ra_attr, dec_attr, ra, dec, a)
else:
pa = pa or 0.0
ratio = b / a
return fn.q3c_ellipse_query(ra_attr, dec_attr, ra, dec, a, ratio, pa)
return func.q3c_ellipse_query(ra_attr, dec_attr, ra, dec, a, ratio, pa)

@cone_search.expression
def cone_search(cls, ra, dec, a, b=None, pa=None, ra_col='ra', dec_col='dec'):