Add auto spectra #640
Conversation
Codecov Report: Base 98.26% // Head 98.36% // Increases project coverage by +0.09%.
Additional details and impacted files:
@@ Coverage Diff @@
## main #640 +/- ##
==========================================
+ Coverage 98.26% 98.36% +0.09%
==========================================
Files 34 34
Lines 5080 5131 +51
==========================================
+ Hits 4992 5047 +55
+ Misses 88 84 -4
@david-deboer do you have ideas around testing that it doesn't error with sqlite? 2 uncovered lines here associated with the support for sqlite.
hera_mc/autocorrelations.py
Antenna number. Part of primary_key.
antenna_feed_pol : String Column
    Feed polarization, either 'e' or 'n'. Part of primary_key.
spectrum : Float Column
An assumption is made here about the channel/frequency mapping. I'm fine with this if you are.
I'm not sure what you're getting at. That the frequencies aren't in their own column? That would be pretty redundant because they'd be the same for all rows.
I mostly agree with bryna here, but we have done 8k and 16k channel autos before (as a test, if I remember correctly, but we still did it). Overall, though, it seems like a bit of wasted space to save it all the time. Even then, we don't spit out the frequencies in redis anywhere; I'm pretty sure we'd always have to reconstruct them using a linspace and an average (which is what we'd ask a user to do anyway). We could always write down the algorithm somewhere convenient for anyone who is interested.
We could add a column for frequency resolution if that helped. I don't actually know where to find that in redis...
I added something to the table definition to describe how to get the frequencies. The basic answer is that the bandwidth (0 -> 250 MHz) is fixed, so the frequencies can be computed from that and from the number of entries in the spectrum.
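The reconstruction described above (and the linspace-and-average recipe mentioned earlier in the thread) can be sketched as follows. The fixed 0 → 250 MHz bandwidth comes from the discussion here; the assumption that frequencies sit at channel centers is mine, not something the table definition pins down:

```python
import numpy as np


def spectrum_frequencies(n_channels, bandwidth_hz=250e6):
    """Reconstruct channel frequencies for an auto spectrum.

    Assumes the band runs 0 -> bandwidth_hz and that each frequency is
    the center of its channel (the center convention is an assumption).
    """
    # n_channels + 1 edges spanning the band, then average adjacent edges
    edges = np.linspace(0.0, bandwidth_hz, n_channels + 1)
    return 0.5 * (edges[:-1] + edges[1:])


freqs = spectrum_frequencies(6144)
```

Because the bandwidth is fixed, the number of entries in the stored spectrum is the only input needed.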
I can't comment on the finer points of the sql array definition, but it looks reasonable to me. We continue here the practice of leaving channel mapping to the wits of the end consumer. But in the interest of time I suggest we test now and debate later.
The only real question I have is whether we want to specify the precision of the `Float` column. Currently we know these autos will always be a float32, and the default `Float` in sqlalchemy does not seem to have a specified precision. From a little bit of research, it seems like this will default to double in psql:
https://stackoverflow.com/questions/62938757/how-to-force-sqalchemy-float-type-to-real-in-postgres
Assuming this stackoverflow's assertion about casting to double is correct, we're not really gaining any information by upcasting to 64-bit, just using double the space. I think I would argue for the type of this `Array` to be `Real`.
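The space argument can be illustrated with nothing but the stdlib: a float32 already carries all the precision the correlator produces, so widening it to a postgres double just doubles the bytes per sample without recovering any information. A small sketch (the sample value is made up):

```python
import struct

value = 3.14159265  # a made-up auto-spectrum amplitude

# REAL / float32: 4 bytes per sample
as_real = struct.pack("<f", value)
# DOUBLE PRECISION / float64: 8 bytes per sample
as_double = struct.pack("<d", value)

# Round-tripping through float32 keeps ~7 significant digits; upcasting
# the float32 result to float64 afterwards adds no information back.
roundtrip = struct.unpack("<f", as_real)[0]
```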
Isn't the only way to hit those two lines to have a non-postgresql database (e.g. a test sqlite version in data/test_data) that gets used as the database in the tests?
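For the sqlite coverage question, a minimal sketch of the kind of test that would exercise a sqlite-backed path, using the stdlib `sqlite3` module with an in-memory database (the table and column names here are illustrative, not the actual hera_mc schema):

```python
import sqlite3

# In-memory sqlite database standing in for the test database
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auto_spectrum ("
    "  antenna_number INTEGER,"
    "  antenna_feed_pol TEXT,"
    "  spectrum TEXT)"  # sqlite has no array type, so spectra go in as text
)
# A spectrum serialized to a comma-separated string for the text column
spectrum = ",".join(str(v) for v in [1.0, 2.5, 3.25])
conn.execute(
    "INSERT INTO auto_spectrum VALUES (?, ?, ?)", (10, "e", spectrum)
)
row = conn.execute("SELECT spectrum FROM auto_spectrum").fetchone()
values = [float(v) for v in row[0].split(",")]
conn.close()
```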
Force-pushed from 67cc95e to eba1eaf
Force-pushed from 9ca27b9 to cd4c463
I have a few extra questions now
+ more debugging cleanup
Thanks for the extra work on this. Looks good from my end. I don't mind a little hand waving on the sqlite coverage.
Description
Add a new table (`hera_auto_spectrum`) with the full autocorrelation spectra (not just the medians, which are currently recorded in the `hera_autos` table).
This new table uses an Array type column in postgres and a character type column in SQLite. In order to make sure this works properly, I added testing related to spinning up SQLite databases, which we never did before in our tests. In the process I found and fixed some bugs and made some significant changes to `cm_gen_sqlite.py`.
I also changed the CI to use mamba instead of conda because I was annoyed with how slow it was.
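The Array-in-postgres / text-in-SQLite split implies some serialization on the SQLite side. A hedged sketch of round-tripping a spectrum through a string (the helper names and the comma-separated format are made up for illustration, not taken from `cm_gen_sqlite.py`):

```python
import numpy as np


def spectrum_to_text(spectrum):
    """Serialize a float spectrum for a character-type sqlite column."""
    return ",".join(repr(float(v)) for v in spectrum)


def text_to_spectrum(text):
    """Recover the spectrum from its text representation as float32."""
    return np.array([float(v) for v in text.split(",")], dtype=np.float32)


spec = np.array([0.5, 1.25, 2.0], dtype=np.float32)
recovered = text_to_spectrum(spectrum_to_text(spec))
```

Whatever format is actually used, a round-trip test like this is the property the new SQLite testing needs to hold.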
Motivation and Context
I think this closes an issue somewhere, but I can't find it.
Types of changes
Checklist:
Schema change:
changes when this is merged.