
DM-15588: Remove home-brewed SQLite PPDB #32

Merged: 4 commits, Nov 30, 2018

Conversation

morriscb (Contributor)

No description provided.

@kfindeisen (Member) left a comment:

Looks good, just some style comments (in particular, code duplication in the test).

Please try to clean up the commit history by e.g. squashing the "debug" and "fix" commits, though. I find it hard to tell what the changes were from the current list.

- name: nDiaSources
  type: INT
  nullable: false
  description: Total number of DiaSources associated with this DiaObject.
kfindeisen (Member):

I'm a bit confused by this. You add the same thing to dax_ppdb/data/ppdb-schema-extra.yaml; why do you need two more copies here? Which file is actually being used?

morriscb (Contributor, Author), Nov 27, 2018:

The one being used in ap_pipe is from ap_association. The duplication exists because the ap_association schema has extra columns for the observed DiaSource filter. It also doesn't have the DiaObjectLast table, as we don't currently use the Ppdb concept of "dailyJob". The current default config is baseline.

doc='Calibrated scatter in flux in %s band.' %
filter_name)

# Generated automatically from ppdb-schema.yaml in dax_ppdb/data.
kfindeisen (Member):

Please tell me the source code for afwUtils.py isn't auto-generated...

Why not use one of the .yaml files to populate schema, instead of having it (pseudo-?)hard-coded like this?

morriscb (Contributor, Author):

My understanding is that the underlying cat schema format (which the Ppdb schema is auto-generated from) is changing in the near future. I didn't want to completely auto-generate the afw schema until the new cat schema format is finalized.

kfindeisen (Member):

Ok, but what do you mean by the comment? Is afwUtils.py being auto-generated somehow?

morriscb (Contributor, Author):

The full file isn't, but the afw columns are.
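
A minimal sketch of what the suggestion above (populating the afw schema directly from one of the .yaml files) might look like, assuming PyYAML is available and that the schema file is a list of table entries each with a columns list, as the nDiaSources snippet suggests; the function name and type mapping are illustrative, not the actual afwUtils.py code:

import yaml
import lsst.afw.table as afwTable

# Illustrative mapping from schema-file column types to afw field types.
_TYPE_MAP = {"INT": "I", "BIGINT": "L", "FLOAT": "F", "DOUBLE": "D"}


def make_dia_object_schema_from_yaml(yaml_path):
    """Sketch: build the DiaObject afw schema from a ppdb-schema-style file."""
    schema = afwTable.SourceTable.makeMinimalSchema()
    with open(yaml_path) as stream:
        tables = yaml.safe_load(stream)
    for table in tables:
        if table.get("table") != "DiaObject":
            continue
        for column in table.get("columns", []):
            schema.addField(column["name"],
                            type=_TYPE_MAP.get(column["type"], "D"),
                            doc=column.get("description", ""))
    return schema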

# Mapping of arrays and BLOBs is not currently supported by the PPDB, so we
# comment these columns out.
"""
schema.addField('uLcPeriodic', type='ArrayF',
kfindeisen (Member):

Okay, that's a clever way to emulate block comments in Python. I'm surprised none of the style-checking software complains, though.

morriscb (Contributor, Author):

Same. I'll change these to normal comments.

morriscb (Contributor, Author):

It was a quick and dirty solution to comment out the block because Ppdb doesn't currently support array types or BLOBs. Forgot to change it for the PR.
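
For reference, the pattern under discussion looks roughly like this: a bare triple-quoted string silently disables the block, while ordinary hash comments make the intent visible. The field doc text here is illustrative.

# Before: a bare string literal "comments out" the block.
"""
schema.addField('uLcPeriodic', type='ArrayF',
                doc='Periodic features for the u-band light curve.')
"""

# After: plain hash comments, as agreed above.
# schema.addField('uLcPeriodic', type='ArrayF',
#                 doc='Periodic features for the u-band light curve.')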

@@ -130,22 +143,91 @@ def run(self, dia_sources, exposure):
            Input exposure representing the region of the sky the dia_sources
            were detected on. Should contain both the solved WCS and a bounding
            box of the ccd.
        ppdb : `lsst.dax.ppdb.Ppdb`
            Ppdb connection object to retrieve DIASources/Objects from and
            write to.
kfindeisen (Member):

I think this means DM-13602 can be marked as obsolete.

morriscb (Contributor, Author):

Seems so? ap_association doesn't set the db location anymore but instead receives a Ppdb object. The settings for the Ppdb are a PexConfig object though and currently ap_verify and ap_pipe set these configs. Not sure if that completely answers the question.

output_dia_objects.append(cov_dia_object)

# Return deep copy to enforce contiguity.
return output_dia_objects.copy(deep=True)
kfindeisen (Member):

If contiguity is something you need to enforce, consider documenting that the return value is guaranteed contiguous.
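
One way to record that guarantee is a numpydoc Returns entry along these lines; the catalog type is assumed from the copy(deep=True) call and the wording is illustrative:

    Returns
    -------
    output_dia_objects : `lsst.afw.table.SourceCatalog`
        Updated DiaObjects. Returned as a deep copy, so the catalog is
        guaranteed to be contiguous in memory.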

dia_source["filterId"] = self.exposure.getFilter().getId()
dia_source["x"] = 0
dia_source["y"] = 0
dia_source["snr"] = 10
kfindeisen (Member):

Are these all mandatory columns? 😨 I'm a bit worried about code duplication with the code you're supposed to be testing...

morriscb (Contributor, Author):

"x" and "y" are non-Nullable in the Ppdb thus I set them here. The association step takes place in RA/DEC and these columns are not touched.

dia_object['%sPSFluxSigma' % filter_name] = 1
dia_object['%sPSFluxNdata' % filter_name] = 1

dateTime = dafBase.DateTime(nsecs=1400000000 * 10**9 - 1000)
kfindeisen (Member):

Suggested change:
-    dateTime = dafBase.DateTime(nsecs=1400000000 * 10**9 - 1000)
+    dateTime = dafBase.DateTime(nsecs=1400000000 * 1e9 - 1000)

This is kind of unusual notation anyway; why not just write out the number?

morriscb (Contributor, Author):

Changed to a string input.
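
A sketch of what the string form might look like, assuming dafBase.DateTime accepts an ISO8601 string with an explicit timescale; the timestamp and scale are illustrative and not claimed to be the exact instant encoded by the nanosecond value above:

import lsst.daf.base as dafBase

# Readable ISO8601 input instead of a hand-computed nanosecond count.
dateTime = dafBase.DateTime("2014-05-13T17:00:00.000000000",
                            dafBase.DateTime.TAI)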

dia_source["apFlux"] = 10000 / self.flux0
dia_source["apFluxErr"] = \
    np.sqrt((100 / self.flux0) ** 2 + (10000 * self.flux0_err / self.flux0 ** 2) ** 2)
dia_source["snr"] = 10
kfindeisen (Member):

Some code duplication with _run_association_and_retrieve_objects and test_update_dia_objects; can you combine how the three methods initialize sources?
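
A rough sketch of the kind of helper being asked for, along the lines of the _set_dia_source_values function that appears in the commit history below; the argument names and defaults are illustrative, and it assumes the test fixture's numpy import and self.exposure / self.flux0 / self.flux0_err attributes:

def _set_dia_source_values(self, dia_source, flux=10000., flux_err=100.):
    """Fill the fields every test DiaSource needs, so individual tests
    only override the values they actually care about (sketch)."""
    dia_source["filterId"] = self.exposure.getFilter().getId()
    dia_source["x"] = 0
    dia_source["y"] = 0
    dia_source["apFlux"] = flux / self.flux0
    dia_source["apFluxErr"] = np.sqrt(
        (flux_err / self.flux0) ** 2
        + (flux * self.flux0_err / self.flux0 ** 2) ** 2)
    dia_source["snr"] = 10
    return dia_source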

assoc_db.create_tables()
ppdb = Ppdb(config=self.ppdbConfig,
            afw_schemas=dict(DiaObject=make_dia_object_schema(),
                             DiaSource=make_dia_source_schema()))
kfindeisen (Member):

Code duplication with _store_dia_objects_and_sources; maybe have a makePpdb function?

-        'psFluxMean_g', 'psFluxMeanErr_g', 'psFluxSigma_g',
-        'psFluxMean_r', 'psFluxMeanErr_r', 'psFluxSigma_r']
+        'id', 'gPSFluxMean', 'gPSFluxMeanErr', 'gPSFluxSigma',
+        'rPSFluxMean', 'rPSFluxMeanErr', 'rPSFluxSigma']
kfindeisen (Member):

Does this depend on the content of the schema files or some other external source? I'm worried about it getting out of sync.

kfindeisen (Member):

Actually, why not just use test_dia_object_values[0].keys()? You'd have one fewer copy to keep up to date.

morriscb (Contributor, Author):

Realized I can delete this specific list and just use the keys from the dictionary. Does that simplification satisfy?

kfindeisen (Member):

Sounds good.
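
The agreed simplification reduces to something like the following; the variable name is illustrative:

# Derive the expected DiaObject columns from the test values themselves,
# instead of maintaining a second hard-coded list.
expected_dia_object_columns = list(test_dia_object_values[0].keys())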

ppdb = Ppdb(config=self.ppdbConfig,
            afw_schemas=dict(DiaObject=make_dia_object_schema(),
                             DiaSource=make_dia_source_schema()))
ppdb._schema.makeSchema()
kfindeisen (Member):

Given the discussion on lsst/dax_ppdb#5, this line should be ppdb.makeSchema(drop=True) but otherwise doesn't need to be changed.

morriscb (Contributor, Author):

Does the new make_schema function satisfy this?

kfindeisen (Member):

If you really meant _make_ppdb, then mostly yes. Did you mean to leave off the drop=True bit?

morriscb (Contributor, Author):

I did. I end up using the function not just to create a database but to connect to an existing one.
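
A sketch of the helper shape this thread converges on: a single function that either creates a fresh set of tables or simply connects to an existing database. The name and flag are illustrative, not the function actually added on the ticket:

def _make_ppdb(self, create_tables=False):
    """Build a Ppdb from the test config; optionally (re)create the schema."""
    ppdb = Ppdb(config=self.ppdbConfig,
                afw_schemas=dict(DiaObject=make_dia_object_schema(),
                                 DiaSource=make_dia_source_schema()))
    if create_tables:
        # Mirrors the ppdb.makeSchema(drop=True) call suggested above.
        ppdb.makeSchema(drop=True)
    return ppdb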

ppdb = Ppdb(config=self.ppdbConfig,
            afw_schemas=dict(DiaObject=make_dia_object_schema(),
                             DiaSource=make_dia_source_schema()))
ppdb._schema.makeSchema()
kfindeisen (Member):

Given the discussion on lsst/dax_ppdb#5, this line should be ppdb.makeSchema(drop=True) but otherwise doesn't need to be changed (or it could be merged with the similar code in the test case just above it).

morriscb (Contributor, Author):

Ditto with above.

Commits

Rename yaml schemas.
Simplify afwUtils and add DPDD columns.
Remove tests for previous DB wrappers.
Add Ppdb as input to AssociationTask API.

Debug imports and column names in unittests.

Debug initial problems with test_update_dia_object.

Add afw mappings.
Comment out columns with unsupported datatypes.
Fix column names in association flux calculations.
Add radecTai to DiaObject values.
Add values to DiaSources and Objects.

Fixed unittests for new API.

Added column mappings.
Fixed bugs in AssociationTask regarding flux MeanErrors.
Fixed values and bugs in test_association.
Remove unnecessary afw mappings.

Add nDiaSources back to DiaObject schemas.

Remove deprecated afw->sql code.
Fix schema function names and change Sigma to Err

Remove PPDB internal "Validity" columns

Change psFlux and Err to type D.

Add nDiaSources to afw schema.

Change afw float types to double

Bug fixes for use in ap_verify.

Change schema error fields to float.
Compute and store total diaSources.
Add centroid mapping.
Add filter name and id to MapDiaSource
Reduce default MapDiaSource calibration fields.
Add make_schema method.

Debug unittest.

Change date input to string.

Remove association config class.

Change to getPackageDir

Remove schemaMapper.

Fix docstring

Add docstring on contiguity.

Change doc comment to hash comment.

Add nDiaSources to test_run.

Add nDiaSource value set.

Create _set_dia_source_values function.

Debug _set_source_values