-
Notifications
You must be signed in to change notification settings - Fork 18
Access log query enhancements #815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6879e07
to
6eccabb
Compare
6eccabb
to
2a8057a
Compare
Nice work, Harsha. I worry that this endpoint would tip over an instance if asked to create a CSV or return everything in a very large access_log. We'll have to discuss with @ryansanford what limitations for safety seem reasonable and the possibility of a command line tool to generate the CSV when you really do want to have millions of logs returned. |
2a8057a
to
2cad0fd
Compare
722c295
to
795d2c0
Compare
795d2c0
to
6ff3ef7
Compare
Somewhere between 70-80k results, we end up with the below error. That was indeed resolved by adding the index
|
Next breakage is happening between 300-400k. uwsgi worker is dying. This is happening after 17 seconds.
|
@hkethi002 Would putting a python script in |
Regarding the uwsgi exiting, it's due to running out of memory. It reaches about 2.5 GB at the point of processing 360k results |
c4b0e03
to
66054cd
Compare
bin/database.py
Outdated
""" | ||
Indexes the timestamps of access logs | ||
""" | ||
config.log_db.access_log.createIndex({'timestamp': -1}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can put this in config.py and it will get added on startup if it's not already there.
cc4a208
to
9e1dc99
Compare
Ran into Unicode Errors caused by some labels having emojis in them, and it seems like csv library being used uses ASCII so should they just be skipped for now? |
We're going to have to support unicode, especially if we are going to support users whose languages don't use the ASCII char set. Looks like there is a unicode replacement for the csv lib: |
d7ab4bd
to
07de208
Compare
raise APIReportParamsException('Limit must be an integer greater than 0.') | ||
for access_type in access_types: | ||
if access_type not in AccessTypeList: | ||
raise APIReportParamsException('Not a valid access type') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should decide on a "safe" limit and reject requests that go over that limit. The rejection 400 could mention that an admin should run the script instead. @ryansanford what would you say the safe uwsgi log limit would be after your testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then make sure we add a test for failing when attempting to go over that limit.
a20ff12
to
1eae317
Compare
12a8976
to
52f6e63
Compare
52f6e63
to
21c33f0
Compare
api/handlers/reporthandler.py
Outdated
if limit < 1: | ||
raise APIReportParamsException('Limit must be an integer greater than 0.') | ||
elif limit > 10000: | ||
raise APIReportParamsException('Limit exceeds 1000 entries, please contact admin to run script.') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need one more zero on the error :)
After error message change, LGTM! Feel free to merge. |
21c33f0
to
dbd6da1
Compare
Fixes #811
Review Checklist
Requirements:
subject
query param that only returns records wherecontext.subject.label
matches that inputaccess_type
query params that only return records whereaccess_type
matches one of those inputscsv
query param that, whenTrue
, returns a csv file of all matching records flattened into rows/api/report/accesslog/types
?) that returns the available access types for frontend use.