-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Partitioned nibrs count views into ORI and State/national counts + Simplified DB JSON serialization #573
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Did you want to merge #571 first?
crime_data/common/cdemodels.py
Outdated
@@ -741,8 +764,14 @@ def base_query(self, field): | |||
query += ' GROUP by b.year, b.:field, offense_name' | |||
|
|||
query += ' ORDER by b.year, b.:field' | |||
query = 'SELECT array_to_json(array_agg(row_to_json(m))) as json_data from ( ' + query + ') m' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to use the AsIs parameters again to avoid any hint of SQL Injection worries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this AsIs() thing is a bit problematic, as it doesn't render the parameters. - The query that's being wrapped here is already parameterized.
@harrisj - Yep - We'll need to merge in the other stuff, and sort out the test slice DB changes before merging this one (so the tests will pass). |
Sorry, those merge conflicts are my bad. I needed to add something when I reduced the slice DB so it would create empty year tables to run those materialized views, but I keep it commented out otherwise. |
@harrisj - No problem - I'm going to sort through these today, and upload the new db slice with these mods. |
Let me know when you want me to review again |
@harrisj - This is gtg for review. The db changes are already live. - Bandit is complaining about sql injection, but the code added simply wraps an already parameterized query... so if it's complaining now, then I'm not sure why it wasn't doing it before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things here and there, but I really want to test this out, so I'd like to merge
crime_data/common/cdemodels.py
Outdated
@@ -347,6 +347,8 @@ def base_query(self, field): | |||
query_gap_fill = ' RIGHT JOIN (SELECT DISTINCT ' + join_table + '.' + join_field + ' AS :field, c.year from ' + join_table + ' CROSS JOIN (SELECT year::text from nibrs_years) c) b ON (a.:field = b.:field AND a.year = b.year)' | |||
query = query + query_gap_fill | |||
query += ' ORDER by a.year, a.:field' | |||
if self.as_json: | |||
query = 'SELECT array_to_json(array_agg(row_to_json(m))) as json_data from ( ' + query + ') m' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably silence Bandit if you change this to use the AsIs wrapper and the parameterized thing, but that is a bit complicated I know. I do wonder if psycopg doesn't have a better means for doing that wrapping. Anyhow for now you can suppress bandit by putting a #noop comment on the end of that line
LGTM |
Welp.
Turns out Marshmallow wasn't responsible for serializing the results of the NIBRS count endpoints anyways DOH
However, the regular ole' serialization could be problematic because there just isn't any reason to load a potentially huge result into Python data structures. That having been said, I doubt the performance boost will be super meaningful.
Also in this I broke out the NIBRS count views into views containing counts for ORI's, and counts for state/national level. This partition is pretty lopsided, but it does make the state/national level queries trivial again.