Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-15588: Remove home-brewed SQLite PPDB #34

Merged
merged 2 commits into from Nov 30, 2018
Merged

Conversation

morriscb
Copy link
Contributor

No description provided.

Copy link
Member

@kfindeisen kfindeisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I have some suggestions for pushing responsibilities down to lower-level code, and found a few places where the documentation needs to be updated.

ppdb = pexConfig.ConfigField(
dtype=daxPpdb.PpdbConfig,
doc="Ppdb database configuration.",
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this a ConfigurableField? You'd have to provide an explicit ConfigClass keyword in the field definition, but it would let you replace daxPpdb.Ppdb(config=config) with config.apply() elsewhere, so it's more flexible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem is that Ppdb isn't a task (it's base class is object) so I can't use the config.apply() I think. Correct me if I'm wrong though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested it, but the documentation doesn't say anything about the target having to be a task. It should work on any class that takes a config as its first constructor parameter:

Must be callable, and the first parameter will be the value of this field

Targets usually have a ConfigClass member (which Ppdb doesn't), but you can override that when creating the field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll give it a shot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so it worked. The apply function seems to do what is expected. I've moved the ppdb.makeSchema() call into init and deleted the "_connectToDB" as it was now completely handled by the apply call.

self.ppdb.extra_schema_file = os.path.join(
getPackageDir("ap_association"),
"data",
"ppdb-ap-pipe-schema-extra.yaml")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If all this stuff is needed any time you create a Ppdb for ap_association (and this seems to be exactly the same config you use in test_association_task.py), maybe it should go in a config file in ap_association? It will make this code cleaner (though you'd still have to explicitly load the config), and you won't have to touch ap_pipe whenever you change the ap_association defaults.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed some of the superfluous settings as the schema for Association has moved closer to that of the default Ppdb and most of the defaults are held there. Once we need to start using the fancier features of the Ppdb I'll consider using ap_association specific config.

if hasattr(self.associator.level1_db, "db_name") and self.associator.level1_db.db_name == ":memory:":
raise ValueError("Source association needs a persistent database [associator.level1_db.db_name]. "
"Please provide a DB file (need not exist) or use a different level1_db Task.")
if self.ppdb.isolation_level == "READ_COMMITTED" and self.ppdb.db_url[:6] == "sqlite":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think either of the following would be more idiomatic and less brittle:

Suggested change
if self.ppdb.isolation_level == "READ_COMMITTED" and self.ppdb.db_url[:6] == "sqlite":
if self.ppdb.isolation_level == "READ_COMMITTED" and self.ppdb.db_url.startswith("sqlite"):
Suggested change
if self.ppdb.isolation_level == "READ_COMMITTED" and self.ppdb.db_url[:6] == "sqlite":
if self.ppdb.isolation_level == "READ_COMMITTED" and "sqlite" in self.ppdb.db_url:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, does the new check belong in ap_pipe, or in dax_ppdb? Is this something nobody running PPDB should do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to Ppdb instead.

_setupDatabase(self.config.associator.level1_db)
self.ppdb = _connectToDatabase(self.config.ppdb)
self.makeSubtask("diaSourceDpddifier",
inputSchema=self.differencer.schema)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice delegation to ImageDifferenceTask. 👍

catalog = sensorRef.get(diffType + "Diff_diaSrc")
exposure = sensorRef.get(diffType + "Diff_differenceExp")

dia_sources = self.diaSourceDpddifier.run(catalog, exposure)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume (hope) the Dpddifier is a temporary measure; is there some ticket we can TODO here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not currently. Currently the DPDD-ification process Yusra is working on is still a prototype. At some point we may switch over to that but there will always be some process in between the output of ip_diffim and ap_association/pipe.


return pipeBase.Struct(
l1Database=self.associator.level1_db,
l1Database=self.ppdb,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the documentation for runAssociation's return value.

@@ -278,12 +307,16 @@ def _setupDatabase(configurable):
A ConfigurableInstance with a database-managing class in its ``target``
field. The API of ``target`` must expose a ``create_tables`` method
taking no arguments.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation is out of date. See my proposed change to ApPipeConfig.ppdb before updating, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the config.apply method I've removed this part of the code.

ppdb = daxPpdb.Ppdb(config=configurable,
afw_schemas=dict(DiaObject=make_dia_object_schema(),
DiaSource=make_dia_source_schema()))
ppdb._schema.makeSchema()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the discussion on lsst/dax_ppdb#5, this line should be ppdb.makeSchema() but otherwise doesn't need to be changed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a TODO comment referring to DM-16606? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I should do both correct? (add the comment and change it to makeSchema).

Fix linting

Fix typos

Fix getPackageDir usage.

Add output documention to db connection method.

Fix linting.
Change runAssociation docstring.

Remove ppdb, sqlite validation.

change ppdb config to ConfigurableField.

Remove unnecessary Ppdb schema settings.
@morriscb morriscb merged commit 94235ce into master Nov 30, 2018
@kfindeisen kfindeisen deleted the tickets/DM-15588 branch April 13, 2022 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants