DM-15096: Cannot reuse association database with ApPipeTask reruns #29

kfindeisen · 2018-07-31T19:12:44Z

This PR removes the database-in-repository model from ap_pipe as well as all the code that was necessary to enforce it. SQLite configs are no longer allowed to specify :memory:, as testing revealed that in-memory databases are incompatible with the requirement that the database be fully initialized before AssociationTask is created. The documentation has been updated to emphasize that the user must always choose a database location in addition to the usual output repository.

mrawls

I only reviewed the documentation updates. With a couple clarifications I think it will be good. Thank you for keeping the docs in sync with the code changes!

mrawls · 2018-08-01T23:23:17Z

doc/lsst.ap.pipe/pipeline-tutorial.rst

@@ -64,7 +64,7 @@ To process your ingested data, run

 .. prompt:: bash

-   ap_pipe.py repo --calib repo/calibs --rerun processed --id visit=123456 ccdnum=42 filter=g --template templates
+   ap_pipe.py repo --calib repo/calibs --rerun processed -c associator.level1_db.db_name=association.db --id visit=123456 ccdnum=42 filter=g --template templates



I believe @morriscb said over on Jira that the config field should be db_url, not db_name. Unless I'm misinterpreting, please update this accordingly.

No, the config field is db_name. It we for some reason had to configure l1dbproto then that would be relevant, but AFAIK we don't. (I can't test it because AssociationDBSqliteTask and AssociationL1DBProtoTask have incompatible APIs, and ap_pipe assumes the former.)

mrawls · 2018-08-01T23:26:17Z

doc/lsst.ap.pipe/pipeline-tutorial.rst

+.. note::
+
+   You must :ref:`configure <command-line-task-config-howto>` the association database location.
+   In the examples above, this is done with the ``-c`` option, but a personal config file may be more convenient if you intend to run ``ap_pipe`` many times.



What happens if the user fails to configure the database location and tries to run ap_pipe anyway? Please add a sentence about what will happen in this situation.

Perhaps more importantly, it's not clear where (path?) this Must-Be-Named database will be created (or is expected to already exist). Please clarify this and make sure it is consistent with the pseudo file-tree in the Expected Outputs section. Can a path be given for the db_name/db_url or just a name?

You say the database name must be specified, but please explain what it does with the name. Something like, "ap_pipe will look for a database that already exists at the location specified. If it finds one, it will connect to it and continue associating, and if it doesn't find one, it will create one."

mrawls · 2018-08-01T23:47:01Z

doc/lsst.ap.pipe/pipeline-tutorial.rst

@@ -104,11 +108,11 @@ something like

 .. code-block:: none

+   association.db   <--- the Prompt Products Database with DIAObjects
   repo/


What I take away from this is that the database with whatever name the user chooses will be created in the same directory which contains repo, assuming there is not already a file there with that name. Is that right?

No, it's created relative to the working directory. Is there some obvious other place to put it, given that spelling out a full rerun path kind of defeats the point of reruns?

Ah I see. I think I tend to assume that "repo" is the working directory even though that's not a requirement and the examples do spell out repo rather than . as the input.

Please replace "It should look something like" with the following, or something close to it:

"At runtime, if you do not provide a path to an existing Prompt Products Database, a new database with the name you provide will be created (in your working directory or at the path you specify). The result from running ap_pipe should look something like"

That sentence sounds too much like saying there's a default location. How about changing the example to -c associator.level1_db.db_name=ppdb/association.db and modifying the "after" directory layout to match? Will that make things unambiguous?

Yes, that sounds good. I understand you want to avoid hinting at any kind of default. Thanks!

Because ApPipeTask can't work with in-memory databases, users must provide a location themselves. The documentation has been updated accordingly.

kfindeisen requested review from morriscb and mrawls July 31, 2018 19:12

kfindeisen mentioned this pull request Jul 31, 2018

DM-15096: Cannot reuse association database with ApPipeTask reruns lsst/ap_verify#41

Merged

morriscb approved these changes Aug 1, 2018

View reviewed changes

mrawls reviewed Aug 1, 2018

View reviewed changes

kfindeisen force-pushed the tickets/DM-15096 branch from 4cdf04a to 450df32 Compare August 3, 2018 18:45

Add explicit config options to validation error messages.

fbf6a6d

kfindeisen force-pushed the tickets/DM-15096 branch from 450df32 to e28c1e7 Compare August 6, 2018 23:52

Remove hardcoded database location.

285599e

Because ApPipeTask can't work with in-memory databases, users must provide a location themselves. The documentation has been updated accordingly.

kfindeisen force-pushed the tickets/DM-15096 branch from e28c1e7 to 285599e Compare August 6, 2018 23:54

kfindeisen merged commit 285599e into master Aug 6, 2018

kfindeisen deleted the tickets/DM-15096 branch February 25, 2019 19:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM-15096: Cannot reuse association database with ApPipeTask reruns #29

DM-15096: Cannot reuse association database with ApPipeTask reruns #29

kfindeisen commented Jul 31, 2018

mrawls left a comment

mrawls Aug 1, 2018

kfindeisen Aug 3, 2018

mrawls Aug 1, 2018

mrawls Aug 1, 2018

mrawls Aug 1, 2018

kfindeisen Aug 3, 2018

mrawls Aug 6, 2018

kfindeisen Aug 6, 2018

mrawls Aug 6, 2018

DM-15096: Cannot reuse association database with ApPipeTask reruns #29

DM-15096: Cannot reuse association database with ApPipeTask reruns #29

Conversation

kfindeisen commented Jul 31, 2018

mrawls left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment