-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User counters using PostgreSQL JSON #5373
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments :)
@@ -148,6 +149,16 @@ class User < ApplicationRecord | |||
|
|||
scope :dev_account, -> { find_by(id: ApplicationConfig["DEVTO_USER_ID"]) } | |||
|
|||
scope :with_this_week_comments, lambda { |number| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included a few scopes as an example on how to use JSONB operators for queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a file length perspective, how do you feel about maybe moving all scopes to a separate file one way or another? If we lean in here I could see this getting long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I personally like the idea of at least counter scopes and even related methods being separate! We did something similar at Kenna with a model that had extensive status options and it worked out really well. Very easy to track down anything status-related bc it was all contained in a single file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also down for organizing files in a sane way, some of our models are way too big!
class UserCounter < ApplicationRecord | ||
belongs_to :user | ||
|
||
serialize :data, HashSerializer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This takes care of the conversion between hash and JSON
belongs_to :user | ||
|
||
serialize :data, HashSerializer | ||
store_accessor :data, :comments_these_7_days, :comments_prior_7_days |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This creates the accessor that allows us to do:
user.counters.comments_these_7_days
in addition to
user.counters.data[:comments_these_7_days]
validates :comments_these_7_days, numericality: { only_integer: true } | ||
validates :comments_prior_7_days, numericality: { only_integer: true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can add basic validators to the json fields
class CreateUserCounters < ActiveRecord::Migration[5.2] | ||
def change | ||
create_table :user_counters do |t| | ||
t.references :user, index: { unique: true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unique to make sure that we only have 1 row per user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely the kind of data I would store in Elasticsearch but I love this postgres approach. We will see how it scales and once ES is up and running if we want to do performance tests and comparisons we can.
task update_users: :environment do | ||
ActiveRecord::Base.transaction do | ||
# NOTE: we could bypass Rails by using SQL directly to compute the counts and update these columns | ||
User.includes(:counters).find_each do |user| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should log the duration of this in Datadog so that we can track how long it takes and if it starts taking a really long time, then we might want to skip the Rails interface in an attempt to speed it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often do you plan on running this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know yet, I copied the idea of the task from #5310 :D
It really is a POC for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be run daily, or maybe split it up into sections and run it more often.
I see this as only having to be as accurate as we need it to be so we can play around with this.
It immediately enables us to ask the question "how many people had more comments this week than last?" and I think we can flexibly tie the functionality to the questions we want to ask and how often we want to ask them.
As a proof of concept this all looks good to me @rhymes.
@mstruve when we get up and running with Elastic we can start defining the useful borders of where to use each, and maybe we won't need this at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It immediately enables us to ask the question "how many people had more comments this week than last?" and I think we can flexibly tie the functionality to the questions we want to ask and how often we want to ask them.
This can probably be asked with a single SQL query :) We should probably start thinking about storing these questions in dedicated query objects outside the perimeter of the model, since they will mostly likely be used only for reports or dashboards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mstruve how do we log the duration? :-) I'm not familiar with the DD library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to a Sidekiq worker when we have a chance so that we can track how long it takes and if we want send some of the count data to Datadog.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know what we want to do with this PR, it was meant as a POC but if we like this approach we can get it merged and then add the fields to it as we review the user counters code around the codebase
task update_users: :environment do | ||
ActiveRecord::Base.transaction do | ||
# NOTE: we could bypass Rails by using SQL directly to compute the counts and update these columns | ||
User.includes(:counters).find_each do |user| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mstruve how do we log the duration? :-) I'm not familiar with the DD library
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm game to give this a shot! Want to pull the WIP off the title?
task update_users: :environment do | ||
ActiveRecord::Base.transaction do | ||
# NOTE: we could bypass Rails by using SQL directly to compute the counts and update these columns | ||
User.includes(:counters).find_each do |user| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move this to a Sidekiq worker when we have a chance so that we can track how long it takes and if we want send some of the count data to Datadog.
What type of PR is this? (check all applicable)
Description
This PR is related to a discussion we had in #5310 about creating a proof of concept of how counters could work in a JSONB column in PostgreSQL.
A few details:
users
has a 1:1 relationship withuser_counters
user.counters.data[:comments_these_7_days]
oruser.counters.comments_this_7_days
both in read and writeExample of the column content (after running the included task):
Related Tickets & Documents
#5310