Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-30497: Bug: DiaSource table cannot find Ixx column #115

Merged
merged 3 commits into from Jun 9, 2021

Conversation

morriscb
Copy link
Contributor

@morriscb morriscb commented Jun 4, 2021

No description provided.

@parejkoj
Copy link
Contributor

parejkoj commented Jun 4, 2021

Where can we put a consistency check for these values against those in dax_apdb/data/apdb-schema.yaml? This package depends on dax_apdb, so it's the obvious place for it. We need to validate that these column names match those in the official database schema.

@morriscb
Copy link
Contributor Author

morriscb commented Jun 4, 2021

Sure, it could work there. As far as I know there is supposed to eventually be a central place to test against for schemas derived from the DPDD. Colin would be the best person to ask about that I would guess.

Copy link
Contributor

@mrawls mrawls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried it and got an error on ixxPsf, because it's ixxPSF. I strongly agree with John's comment that we need a consistency check 😱

data/DiaSource.yaml Outdated Show resolved Hide resolved
data/DiaSource.yaml Outdated Show resolved Hide resolved
data/DiaSource.yaml Outdated Show resolved Hide resolved
@parejkoj
Copy link
Contributor

parejkoj commented Jun 4, 2021

Why can't we test it here in this package?

@mrawls mrawls requested a review from parejkoj June 5, 2021 01:14
@mrawls mrawls changed the title DM-30497: Change case on variable names for consitency with dax_apdb. DM-30497: Bug: DiaSource table cannot find Ixx column Jun 5, 2021
@@ -295,6 +296,10 @@ def setUp(self):
self.apdb.storeDiaObjects(self.diaObjects,
self.dateTime)

# These columns intentionally appear in DiaSource.yaml,
# but not in the APDB schema
self.ignoreColumns = ["filterName", "bboxSize", "isDipole"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are specified in the schema-extra file in ap_association.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I figured there was someplace these not-DPDD-yet-important columns lived.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I see filterName and isDipole in there, but where does bboxSize come from?

@@ -390,6 +395,25 @@ def testGetPixelRanges(self):
self.assertEqual(normPix, testPix)
self.assertEqual(flipPix, testPix)

def test_apdbSchema(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only tests if the default schema for adpb and the functors run by default match. Also, it is safe to have extra columns in the apdb schema not filled by the DiaSource functor file. The main testing I've been doing to make sure things work has been running ap_verify as an integration test. That only just failed, as we've discovered, as sqlite is not case sensitive in it's column names but Postgres is. We can leave these here for now, but I will point out that it does not protect against mismatches in the schema files.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this not protect against mismatches in the schema files? This test ensures that everything is case-consistent as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schemas can be set at run time. This only tests for the default ones. It doesn't even pull in all columns that are currently being used and written to the apdb.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think testing default schema agreement is better than not testing at all, but that's a good point about the limitation. I agree with John that it would be better to test against an actually instantiated database, but that's more than I'm willing to do for this ticket.

Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test; I have some suggestions about reworking it.

I would prefer if this could be a test against an actually instantiated database, instead of just testing yaml file contents, but that might be too tricky to implement.

@@ -295,6 +296,10 @@ def setUp(self):
self.apdb.storeDiaObjects(self.diaObjects,
self.dateTime)

# These columns intentionally appear in DiaSource.yaml,
# but not in the APDB schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Period at the end of sentence, please.

table_list = list(yaml.safe_load_all(yaml_stream))
for table in table_list:
if table['table'] == 'DiaSource':
apdbSchemaColumns = [column['name'] for column in table['columns']]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't you then want a break after this, since there should only be one DiaSource table, so you no longer need to loop over the table_list?

diaSourceFunctor = yaml.safe_load_all(yaml_stream)
for functor in diaSourceFunctor:
diaSourceColumns = list(functor['funcs'].keys())
for column in diaSourceColumns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this loop, could you do self.assertEqual(set(diaSourceColumns), set(apdbSchemaColumns)), or maybe self.assertLess(set(diaSourceColumns), set(apdbSchemaColumns)) since I think the one is a proper subset of the other?

@@ -390,6 +395,25 @@ def testGetPixelRanges(self):
self.assertEqual(normPix, testPix)
self.assertEqual(flipPix, testPix)

def test_apdbSchema(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this not protect against mismatches in the schema files? This test ensures that everything is case-consistent as well.

for functor in diaSourceFunctor:
diaSourceColumns = [column for column in list(functor['funcs'].keys())
if column not in self.ignoreColumns]
self.assertLess(set(diaSourceColumns), set(apdbSchemaColumns))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert should go outside the for loop: you want to build up the list of column names from the DiaSource.yaml file, and then check against that whole list.

@mrawls mrawls merged commit 65ce80c into master Jun 9, 2021
@mrawls mrawls deleted the tickets/DM-30497 branch June 9, 2021 21:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants