Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-15096: Cannot reuse association database with ApPipeTask reruns #29

Merged
merged 2 commits into from Aug 6, 2018

Conversation

kfindeisen
Copy link
Member

This PR removes the database-in-repository model from ap_pipe as well as all the code that was necessary to enforce it. SQLite configs are no longer allowed to specify :memory:, as testing revealed that in-memory databases are incompatible with the requirement that the database be fully initialized before AssociationTask is created. The documentation has been updated to emphasize that the user must always choose a database location in addition to the usual output repository.

Copy link
Collaborator

@mrawls mrawls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only reviewed the documentation updates. With a couple clarifications I think it will be good. Thank you for keeping the docs in sync with the code changes!

@@ -64,7 +64,7 @@ To process your ingested data, run

.. prompt:: bash

ap_pipe.py repo --calib repo/calibs --rerun processed --id visit=123456 ccdnum=42 filter=g --template templates
ap_pipe.py repo --calib repo/calibs --rerun processed -c associator.level1_db.db_name=association.db --id visit=123456 ccdnum=42 filter=g --template templates

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe @morriscb said over on Jira that the config field should be db_url, not db_name. Unless I'm misinterpreting, please update this accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the config field is db_name. It we for some reason had to configure l1dbproto then that would be relevant, but AFAIK we don't. (I can't test it because AssociationDBSqliteTask and AssociationL1DBProtoTask have incompatible APIs, and ap_pipe assumes the former.)

.. note::

You must :ref:`configure <command-line-task-config-howto>` the association database location.
In the examples above, this is done with the ``-c`` option, but a personal config file may be more convenient if you intend to run ``ap_pipe`` many times.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the user fails to configure the database location and tries to run ap_pipe anyway? Please add a sentence about what will happen in this situation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps more importantly, it's not clear where (path?) this Must-Be-Named database will be created (or is expected to already exist). Please clarify this and make sure it is consistent with the pseudo file-tree in the Expected Outputs section. Can a path be given for the db_name/db_url or just a name?

You say the database name must be specified, but please explain what it does with the name. Something like, "ap_pipe will look for a database that already exists at the location specified. If it finds one, it will connect to it and continue associating, and if it doesn't find one, it will create one."

@@ -104,11 +108,11 @@ something like

.. code-block:: none

association.db <--- the Prompt Products Database with DIAObjects
repo/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I take away from this is that the database with whatever name the user chooses will be created in the same directory which contains repo, assuming there is not already a file there with that name. Is that right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's created relative to the working directory. Is there some obvious other place to put it, given that spelling out a full rerun path kind of defeats the point of reruns?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I think I tend to assume that "repo" is the working directory even though that's not a requirement and the examples do spell out repo rather than . as the input.

Please replace "It should look something like" with the following, or something close to it:

"At runtime, if you do not provide a path to an existing Prompt Products Database, a new database with the name you provide will be created (in your working directory or at the path you specify). The result from running ap_pipe should look something like"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sentence sounds too much like saying there's a default location. How about changing the example to -c associator.level1_db.db_name=ppdb/association.db and modifying the "after" directory layout to match? Will that make things unambiguous?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that sounds good. I understand you want to avoid hinting at any kind of default. Thanks!

Because ApPipeTask can't work with in-memory databases, users must
provide a location themselves. The documentation has been
updated accordingly.
@kfindeisen kfindeisen merged commit 285599e into master Aug 6, 2018
@kfindeisen kfindeisen deleted the tickets/DM-15096 branch February 25, 2019 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants