Simplify analytics logging code #334

merged 3 commits into from Feb 5, 2017

@GUI GUI commented Feb 5, 2017

These changes started as experimental work while debugging some analytics issues over the past couple of weeks. We're still working on further changes in a separate branch, but the changes here cover the basic cleanup parts of that effort, so we can go ahead and merge them for the next release.

  • Don't log requests to the website backends and web app in the analytics database. Only log API requests. This can help eliminate a lot of unnecessary log data and 404 bot traffic from crawlers.
  • Get rid of several fields that we were collecting but not using. Also remove the logging complexity caused by the various timers we were trying to gather. Those were really for debugging purposes, and they don't seem worth all the extra hoops to jump through.
  • Shift the timing and debugging details into the nginx access logs. We still have detailed timing information on the servers for debugging; it's just not part of the analytics database.
  • Go back to non-gzipped nginx access logs, so it's easier to tail these files.
  • Make the fields we're logging to Elasticsearch a bit more explicit. This all started with some work to possibly change the Elasticsearch schema. For now, we're keeping the schema mostly the same for backwards compatibility, but the code is now set up to make the schema more explicit, so it will be easier to change in the future.
  • Make the Lua logging code a bit more modular and reusable, so it's easier to use in other scripts (for example, the recent script we created to reprocess nginx access logs would have been easier to write if more of the internal Lua functions had been readily available).
  • Remove duplicative UTF-8 handling for log data. Since rsyslog has its own way of stripping invalid UTF-8 data to ensure things go into Elasticsearch, we don't need to perform the same action in Lua code.
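
To make the "only log API requests" and "explicit fields" points above more concrete, here is a minimal sketch of the general idea. It is written in Python purely for illustration and is not the project's Lua code. The field names, the `backend_type` key, and the function names are all hypothetical assumptions, since the actual schema is not shown in this pull request.

```python
# Illustrative sketch only: every name here is hypothetical and is not taken
# from the project's Lua logging code or its Elasticsearch schema.

# An explicit allowlist of fields sent to the analytics database. Anything not
# listed (for example, internal debugging timers) is simply never copied.
ANALYTICS_FIELDS = (
    "request_at",
    "request_method",
    "request_path",
    "response_status",
    "user_id",
)

# Assumed label for requests routed to an API backend.
API_BACKEND = "api"


def should_log(request):
    """Return True only for API requests (not website or web app requests)."""
    return request.get("backend_type") == API_BACKEND


def build_record(request):
    """Copy only the explicitly listed fields, skipping any that are absent."""
    return {name: request[name] for name in ANALYTICS_FIELDS if name in request}


def analytics_record(request):
    """Return the record to index, or None if the request is not logged."""
    if not should_log(request):
        return None
    return build_record(request)
```

Keeping the field list in one place is one way to make a schema explicit: changing what gets logged becomes an edit to a single tuple rather than a hunt through the logging code.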

@GUI GUI merged commit ae954de into master Feb 5, 2017
@GUI GUI added this to the v0.14.0 milestone Feb 6, 2017
@GUI GUI deleted the simplify-logging branch February 20, 2017 14:48