adds a testing framework for databases #25

havok2063 · 2020-03-24T19:44:39Z

This PR adds an initial testing framework that supports testing against databases in either peewee or sqlalchemy. You can write tests against real databases, or against a general-purpose test database. You can create fake tables, or temporarily insert fake data into real ones.
It uses factory_boy, faker, and pytest-factoryboy to build customizable fake-data factories for each db model, and pytest-postgresql to generate a test postgres database instance.

Full list:

tests against existing real databases, with change rollbacks
generate and insert fake data into real tables, with change rollbacks
test generic database code on a test database
ignores database tests when there is no local database
options to run tests only for peewee or sqlalchemy databases
option to switch session/transactions from function to module scope
adds example tests against the generic sdss database connection
adds example tests for peewee and sqla using the test database + fake data
adds example tests for peewee and sqla using fake data on real dbs
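
As an illustration of the "change rollbacks" items in the list above, here is a minimal sketch of the idea: open a transaction, let the test insert fake data, then roll everything back. It is not the PR's fixture code. It swaps in the standard-library sqlite3 module in place of postgres, peewee, or sqlalchemy, and the `rolled_back` helper and `target` table are hypothetical names.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(connection):
    """Yield a cursor whose changes are undone when the block exits.

    Hypothetical helper: a stand-in for a transaction-scoped test fixture.
    """
    cursor = connection.cursor()
    cursor.execute("BEGIN")  # start an explicit transaction
    try:
        yield cursor
    finally:
        cursor.execute("ROLLBACK")  # discard everything done inside the block
        cursor.close()


# isolation_level=None puts sqlite3 in autocommit mode, so the only
# transactions are the ones opened explicitly with BEGIN.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO target (name) VALUES ('real row')")

with rolled_back(conn) as cur:
    # Fake data is visible inside the transaction...
    cur.execute("INSERT INTO target (name) VALUES ('fake row')")
    rows_inside = cur.execute("SELECT COUNT(*) FROM target").fetchone()[0]

# ...and gone again once the block has rolled back.
rows_after = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

In a pytest suite the same pattern would typically live in a fixture that yields the session or transaction and rolls back during teardown, which is what makes it safe to point tests at a real table.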

Still to do:

update changelog
add documentation (see here)

Optionals:

look into easier factory creation for a given ModelClass
look into mechanism for mapping existing Models onto the test database
better test organization?
look into easier fake test data generation

coveralls · 2020-03-24T20:09:20Z

Pull Request Test Coverage Report for Build 211

0 of 83 (0.0%) changed or added relevant lines in 7 files are covered.
761 unchanged lines in 15 files lost coverage.
Overall coverage decreased (-13.08%) to 0.0%

Changes Missing Coverage	Changed/Added Lines	%
python/sdssdb/sqlalchemy/__init__.py	2	0.0%
python/sdssdb/sqlalchemy/archive/__init__.py	3	0.0%
python/sdssdb/sqlalchemy/mangadb/datadb.py	3	0.0%
python/sdssdb/sqlalchemy/mangadb/dapdb.py	4	0.0%
python/sdssdb/connection.py	13	0.0%
python/sdssdb/utils/ingest.py	25	0.0%
python/sdssdb/sqlalchemy/archive/sas.py	33	0.0%

Files with Coverage Reduction	New Missed Lines	%
python/sdssdb/misc/__init__.py	1	0%
python/sdssdb/sqlalchemy/archive/__init__.py	1	0%
python/sdssdb/utils/__init__.py	3	0%
python/sdssdb/peewee/sdss5db/__init__.py	5	0%
python/sdssdb/utils/internals.py	9	0%
python/sdssdb/utils/schemadisplay.py	11	0%
python/sdssdb/__init__.py	18	0%
python/sdssdb/misc/color_print.py	19	0%
python/sdssdb/core/exceptions.py	20	0%
python/sdssdb/utils/ingest.py	24	0%

Totals
Change from base Build 210:	-13.08%
Covered Lines:	0
Relevant Lines:	5154

💛 - Coveralls

albireox

Overall I like this framework very much. I haven't really tested it so I'm talking mostly from a look at the code and reading the documentation, but it seems robust. A few questions:

How do the tests run in Travis if they expect a real database (i.e., when not using factory-boy)?
Can "fake" data be somehow loaded from CSV files or such? I can see how creating totally fake data can require a lot of effort or be ultimately not useful if you're expecting certain data.
Can we create a fixture that runs some sanity checks on models automatically? Something very simple such as making sure they import, that they connect to the database, etc. My guess is that something like that would cover 90% of the relevant checks.
We probably need some quite thorough testing of SQLADatabaseConnection and PeeweeDatabaseConnection.

The last two items are general comments, not really intended for this PR.

I've added a few comments about style and linting. In general, can you enable the option to remove trailing whitespaces?
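
For the automatic sanity checks raised in the third question above, the simplest piece ("making sure they import") could look something like the sketch below. This is only an illustration: `check_imports` is a hypothetical helper, not something in sdssdb, and it says nothing about whether a model can actually connect to its database.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and return {name: error} for the failures.

    Hypothetical helper: the module names are supplied by the caller.
    """
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # a broken model module can raise anything
            failures[name] = repr(exc)
    return failures


# Standard-library names stand in for model modules here.
failures = check_imports(["json", "no_such_module_xyz"])
```

In a pytest suite the same loop would more naturally become one parametrized test per model module, so that each failure is reported separately.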

albireox · 2020-04-14T05:02:12Z

python/sdssdb/utils/ingest.py

+from sqlalchemy.ext.automap import automap_base
+from sqlalchemy.ext.declarative import declarative_base, DeferredReflection
+from sdssdb import log
+from sdssdb.connection import SQLADatabaseConnection
+from sdssdb.sqlalchemy import BaseModel


Imports in incorrect order.

I updated this file using the isort extension, with default settings. It didn't change these imports much. I think it moved the inflect import down into a new block. If there's a preferred setting or method you're using, I'm happy to adopt it. Just let me know what I need to change.

albireox · 2020-04-14T05:05:18Z

python/sdssdb/etc/sdssdb.yml

@@ -51,3 +51,9 @@ utahdb:
    host: db.sdss.utah.edu
    port: 5432
    domain: db.sdss.utah.edu
+
+slore:


I needed a profile for the sdss user on the lore host machine. I already had a lore profile for the read-only marvin database user which is only relevant for the manga db. I wasn't really sure what to call this. And what's the policy here on what actually goes in this file? Are we supposed to put one new profile per host machine? Or one new profile per database user per host machine?

albireox · 2020-04-14T05:08:48Z

python/sdssdb/connection.py

+        self.dbversion = dbversion or self.dbversion
+        if self.dbversion:
+            self.dbname = f'{self.dbname}_{self.dbversion}' 


This dbversion thing seems a bit ad hoc and assumes a given format for the versions of the databases. I don't love it.

How is this different from having a default version for a database (supposedly, the latest) and if you want a different one you call connect with the new database name?

It is a bit wonky I know. I added this to deal with the archive database. Each data release, a new database is created with names like archive_20200203, or archive_20190711 but the models and connection don't change. In principle there's nothing wrong with fixing the database name to the one with the latest version but I didn't like the idea of always having to commit a new change for that and potentially tag a new release.

We don't have a strong policy yet about versioning databases and/or version naming schemes but it might be a good idea to make one. It makes sense to me to somehow separate db names and versions. But I'm open to suggestions.
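
One way to picture "separating db names and versions" is sketched below. It is not what the PR implements: all three function names are hypothetical, and the code assumes versions are trailing all-digit date stamps such as `20200203`, which is exactly the kind of naming policy that has not been agreed on yet.

```python
from typing import Iterable, Optional, Tuple


def resolve_dbname(base: str, version: Optional[str] = None) -> str:
    """Join a base name and an optional version: archive + 20200203."""
    return f"{base}_{version}" if version else base


def split_dbname(dbname: str) -> Tuple[str, Optional[str]]:
    """Split 'archive_20200203' into ('archive', '20200203').

    Assumption: a version is a trailing, all-digit suffix after '_'.
    """
    base, sep, suffix = dbname.rpartition("_")
    if sep and suffix.isdigit():
        return base, suffix
    return dbname, None


def latest_dbname(available: Iterable[str], base: str) -> str:
    """Pick the highest version of `base` among the available databases."""
    versions = [v for b, v in map(split_dbname, available) if b == base and v]
    # YYYYMMDD stamps sort correctly as plain strings.
    return resolve_dbname(base, max(versions)) if versions else base


available = ["archive_20190711", "archive_20200203", "mangadb"]
newest = latest_dbname(available, "archive")
```

Resolving the latest version at connection time, rather than hard-coding it, would avoid committing a change for every data release, at the cost of tying the code to one naming scheme.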

albireox · 2020-04-14T05:37:44Z

Also, could we move the tests outside the package? I've come to realise it's better to not have them as part of the package because they're not code you want to ship your package with, and it ends up being painful to exclude them when packaging.

havok2063 · 2020-04-20T16:26:31Z

@albireox I've moved the tests out into the top-level directory. I've also added some responses to your individual comments. Thanks for pointing out the setting for removing trailing whitespace. I've been manually doing that and those little yellow tildes are quite annoying. Regarding your questions:

Currently all tests with real databases are skipped when no database is detected, including on Travis. So Travis runs any tests that don't need real databases (e.g. for general database connection code) or for any tests using the temporary test database. I set it up so people could run tests for databases but it wouldn't outright fail for everyone else when they don't have all databases set up locally for example.
I don't think factoryboy lets you load fake data from files. It works by building a "fake" class or ORM Model that maps to a real one. For catalogs in catalogdb I can see how creating fake models would be a lot of work since you'd need to specify every column. We might be able to come up with a factory to generate fake models based on data defined in a file or perhaps a schema file.
Yeah I think that should be possible. What do you mean by "check that the models connect to the database"? Like that it can run a simple query? How is that different than writing a simple test?
Yeah I agree on the tests for SQLADatabaseConnection and PeeweeDatabaseConnection. I started some in the test_connection.py files.
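
The skipping behaviour described in the first answer above can be sketched roughly as follows. This is an illustration under assumptions, not the framework's code: `port_open` and `partition_tests` are hypothetical helpers, and the test and database names in the usage lines are only examples.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def partition_tests(required_db, is_available):
    """Split test names into (run, skipped).

    `required_db` maps each test name to the database it needs, or None
    for tests that need no real database. `is_available` is a callable
    that takes a database name and returns True or False.
    """
    run, skipped = [], []
    for test_name, dbname in required_db.items():
        if dbname is None or is_available(dbname):
            run.append(test_name)
        else:
            skipped.append(test_name)
    return run, skipped


# Pretend only the archive database exists on this machine.
run, skipped = partition_tests(
    {"test_connection": None, "test_allwise": "catalogdb", "test_sas": "archive"},
    lambda dbname: dbname == "archive",
)
```

With pytest, the same decision is usually expressed with a `pytest.mark.skipif` marker or a call to `pytest.skip` inside a fixture, so that missing databases show up as skips rather than failures; a probe like `port_open("localhost", 5432)` is one possible availability check.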

havok2063 · 2020-04-22T18:50:10Z

@albireox I've added some code to more dynamically create model factories to generate fake data. It will try to auto-generate fake data for every column on a database table so one doesn't have to do it manually. This can be customized either when you create the factory or from a file definition. See lines 61-65 at https://github.com/sdss/sdssdb/blob/archive/tests/pwdbs/factories.py or the file at https://github.com/sdss/sdssdb/blob/archive/tests/data/models.yml. It currently can only auto-generate fake data for simple column definitions. I haven't yet implemented anything for columns that are actually foreign keys that point to other models. But you can still define those manually.
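
To give a feel for what auto-generating fake data per column involves, here is a minimal sketch. It does not use factory_boy or faker and is not the code in the linked factories.py: `GENERATORS` and `fake_row` are hypothetical, and columns are described by plain Python types instead of peewee or sqlalchemy field classes.

```python
import random
import string

# One generator per simple column type; each takes a random.Random instance.
GENERATORS = {
    int: lambda rng: rng.randint(0, 10_000),
    float: lambda rng: rng.uniform(0.0, 1.0),
    bool: lambda rng: rng.random() < 0.5,
    str: lambda rng: "".join(rng.choice(string.ascii_lowercase) for _ in range(8)),
}


def fake_row(columns, overrides=None, seed=None):
    """Build one fake row for a table described as {column_name: type}.

    `overrides` maps column names to fixed values, or to callables that
    take the random generator, for columns that need customizing.
    """
    rng = random.Random(seed)
    overrides = overrides or {}
    row = {}
    for name, column_type in columns.items():
        if name in overrides:
            value = overrides[name]
            row[name] = value(rng) if callable(value) else value
        else:
            # Raises KeyError for unsupported types, e.g. foreign keys.
            row[name] = GENERATORS[column_type](rng)
    return row


columns = {"pk": int, "ra": float, "designation": str}
row = fake_row(columns, overrides={"designation": "fake-source"}, seed=1)
```

The `overrides` argument plays the role of the per-factory or per-file customization mentioned above, and the `KeyError` on unknown types mirrors the current limitation that only simple column definitions are handled.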

The test suite passes locally but fails on Travis due to some strange issue with importing catalogdb, see #29. I also can't write some sqlalchemy tests for targetdb because of #28.

albireox · 2020-05-04T23:19:25Z

This sounds good to me. I think both blocking issues are now fixed.

havok2063 added 21 commits February 26, 2020 14:52

bugfixing sqlalchemy func call

e51855b

adding sdss lore profile

43c589c

adding option for specifying a database version

b8743fc

adding option to change database version

11b464f

making sas.File.name property a hybrid one

54a93ae

adding primary key to new catalogdb table

2b247d9

making default archive latest version

2335852

fixing archive db version

92737a6

setting up initial infrastructure for testing databases

f8d58b3

adding initial peewee tests

43d83eb

rejiggering general connection tests

20fbcff

moving session/transaction fixtures; adding dynamic scope to session/…

5e5ca24

…transaction

adding transactions to peewee tests

c083406

cleaning up a bit

16755af

adding option to only run peewee or sqla tests

7c14cb3

Merge branch 'master' into archive

ad28a4f

cleaning up a bit more

6da88de

cleaning up a bit more

809df76

updating reqs

35be7a8

updating sdss5db example from old target_type to category; updating s…

79cb560

…ome text

adding some text and cleaning up

47c27a9

havok2063 added the enhancement New feature or request label Mar 24, 2020

havok2063 requested a review from albireox March 24, 2020 19:44

havok2063 added 5 commits March 24, 2020 15:50

testing new travis install

670ed9e

bugfix on travis pip install extras

3958289

updating pytest req to >5.2

49cd918

turning off pytest-sugar in travis for now

d139c60

adding postgresql to travis services

83d5ce4

removing unnecessary fixture

18d3f42

havok2063 added 2 commits April 13, 2020 14:32

turning off tests in action

d8bc7df

adding action badge

c7faa97

albireox reviewed Apr 14, 2020

View reviewed changes

havok2063 added 4 commits April 20, 2020 10:56

Merge branch 'master' into archive

ac02487

fixing docstrings to match rest of file

77bce32

running isort on ingest.py

f2f892f

moving tests directory to top level

e16cbc4

havok2063 added 11 commits April 20, 2020 18:02

adding ability to autogenerate factory given a model

0d698aa

test tweak to test for travis

809f665

test tweak to test for travis

7a68a31

test tweak to test for travis

1fbc3b0

testing travis

0c06479

correcting obsinfo expnum hybrid prop

18cf638

commenting out all new code to test travis

45e8579

triggering new build

1ca9ea8

testing travis catalogdb removal

b7c6fc2

turning on sqla obsinfo test

fd7c106

turning on pwdb allwise test

bbe2416

havok2063 requested a review from albireox April 22, 2020 18:50

havok2063 added 2 commits April 30, 2020 12:26

Merge branch 'master' into archive

1593a4b

Merge branch 'master' into archive

a027221

albireox approved these changes May 4, 2020

View reviewed changes

havok2063 added 2 commits May 12, 2020 11:59

merging master

b0de1da

removing tests for sqlalchemy sdss5db

94d13f7

havok2063 merged commit 6312337 into master May 12, 2020

havok2063 deleted the archive branch May 12, 2020 16:04
