Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add auto spectra #640

Merged
merged 25 commits into from
Nov 3, 2022
Merged

Add auto spectra #640

merged 25 commits into from
Nov 3, 2022

Conversation

bhazelton
Copy link
Member

@bhazelton bhazelton commented Oct 27, 2022

Description

Add a new table (hera_auto_spectrum) with the full autocorrelation spectra (not just the medians which are currently recorded in the hera_autos table).

This new table uses an Array type column in postgres and a character type column in SQLite. In order to make sure this works properly I added testing related to spinning up SQLite databases, which we never did before in our tests. In the process I found and fixed some bugs and made some significant changes to cm_gen_sqlite.py.

I also changed the CI to use mamba instead of conda because I was annoyed with how slow it was.

Motivation and Context

I think this closes an issue somewhere, but I can't find it.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Schema change (any change to the SQL tables)
  • New feature without schema change (non-breaking change which adds functionality)
  • Change associated with a change in redis structure
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Version change
  • Build or continuous integration change
  • Other

Checklist:

Schema change:

  • My code follows the code style of this project.
  • I have added or updated the docstrings associated with my feature using the numpy docstring format.
  • I have created an alembic version file to produce the schema change.
  • I have tested looping upgrading and downgrading the alembic version and tests pass consistently.
  • I have updated or created monitoring scripts and daemons so any new tables will be automatically populated (if appropriate).
  • I understand the updates required onsite (detailed in the readme) and I will make those
    changes when this is merged.
  • I have updated the schema documentation document with my changes (under docs/mc_definition.tex)
  • I have added tests to cover my new feature.
  • Unit tests pass on site (This is a critical check, CI can differ from site).
  • I have updated the CHANGELOG.

@codecov
Copy link

codecov bot commented Oct 27, 2022

Codecov Report

Base: 98.26% // Head: 98.36% // Increases project coverage by +0.09% 🎉

Coverage data is based on head (2a32ff1) compared to base (dbb7b7e).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #640      +/-   ##
==========================================
+ Coverage   98.26%   98.36%   +0.09%     
==========================================
  Files          34       34              
  Lines        5080     5131      +51     
==========================================
+ Hits         4992     5047      +55     
+ Misses         88       84       -4     
Impacted Files Coverage Δ
hera_mc/autocorrelations.py 98.57% <100.00%> (+0.65%) ⬆️
hera_mc/cm_gen_sqlite.py 100.00% <100.00%> (ø)
hera_mc/mc_session.py 99.77% <100.00%> (+<0.01%) ⬆️
hera_mc/mc.py 92.56% <0.00%> (+3.30%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@bhazelton
Copy link
Member Author

@david-deboer do you have ideas around testing that it doesn't error with sqlite? 2 uncovered lines here associated with the support for sqlite.

Antenna number. Part of primary_key.
antenna_feed_pol : String Column
Feed polarization, either 'e' or 'n'. Part of primary_key.
spectrum : Float Columnn
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumption made here about channel/frequency mapping. I'm fine with this if you are.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you're getting at. That the frequencies aren't in their own column? That would be pretty redundant because they'd be the same for all rows.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly agree with bryna here, but we have done 8k and 16k channel autos before (as a test if I remember but we still did it). But overall it seems like a bit of wasted space to save it all the time. Even then we don't spit out the frequencies in redis anywhere, I'm pretty sure we'd always have to reconstruct them using a linspace and an average (which is what we'd ask a user to do anyway). We could always just write down the algorithm somewhere convenient for anyone who is interested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add a column for frequency resolution if that helped. I don't actually know where to find that in redis...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added something to the table definition to describe how to get the frequencies. The basic answer is that the bandwidth (0 -> 250 MHz) is fixed, so the frequencies can be computed from that and from the number of entries in the spectrum.

dannyjacobs
dannyjacobs previously approved these changes Oct 27, 2022
Copy link
Contributor

@dannyjacobs dannyjacobs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on the finer points of the sql array definition, but it looks reasonable to me. We continue here the practice of leaving channel mapping to the wits of the end consumer. But in the interest of time I suggest we test now and debate later.

Copy link
Member

@mkolopanis mkolopanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only real question I have is if we want to specifically specify the precision of the Float column? Currently we know these autos will always be a float32 and the default Float in sqlalchemy does not seem to have a specified precision. From a little bit of research, it seems like this will default to double in psql.
https://stackoverflow.com/questions/62938757/how-to-force-sqalchemy-float-type-to-real-in-postgres

Assuming this stackoverflow's assertion is correct about casting to double, we're not really gaining any information by upcasting it to 64bit, just using double the space. I think I would argue for the type of this Array to be Real.

@david-deboer
Copy link
Contributor

Isn't the only way to hit those two lines is to have a non-postgresql data base (e.g. a test sqlite version in data/test_data) that gets used as the database in the tests?

Copy link
Member

@mkolopanis mkolopanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few extra questions now

hera_mc/cm_gen_sqlite.py Show resolved Hide resolved
hera_mc/cm_gen_sqlite.py Outdated Show resolved Hide resolved
@bhazelton bhazelton marked this pull request as ready for review November 2, 2022 23:58
Copy link
Member

@mkolopanis mkolopanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extra work on this. Looks good from my end. I don't mind a little hand waving on the sqlite coverage.

@mkolopanis mkolopanis merged commit 383569c into main Nov 3, 2022
@mkolopanis mkolopanis deleted the add_auto_spectra branch November 3, 2022 17:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants