Skip to content

Add support for running client in test mode#15

Merged
pauldambra merged 2 commits intoPostHog:masterfrom
thelookoutway:test-mode
Jun 24, 2022
Merged

Add support for running client in test mode#15
pauldambra merged 2 commits intoPostHog:masterfrom
thelookoutway:test-mode

Conversation

@tatey
Copy link
Copy Markdown
Contributor

@tatey tatey commented Jun 18, 2022

Hey there 👋,

I've been integrating PostHog into our environment. We have a backend app that drains logs from Heroku and turns them into page views which eventually make their way into our data warehouse. What I've done is take those page views and forward them into PostHog.

I've been trying to write tests that place assertions on payloads being sent, but it's been hard. Initially I tried WebMock, but it was verbose and didn’t play nicely with the worker thread. I've used other clients before, like the StatsD client, which has a test mode for solving this problem.

When I poked around the code for this client, I noticed instance variables were being set for placing assertions on the queue. This is exactly what I wanted to do. What this patch does, is take that idea and sanction it through a public API.

I'm running the fork in our backend app in production right now and this is how it looks (I really like it!):

# config/initializers/post_hog.rb
$post_hog = PostHog::Client.new(api_key: ENV.fetch("POST_HOG_API_KEY"), test_mode: Rails.env.test?)
# app/jobs/forward_page_view_to_post_hog_job.rb
class ForwardPageViewToPostHogJob < ApplicationJob
  def perform(page_view)
    $post_hog.capture({
      event: "$pageview",
      distinct_id: page_view.uuid,
      timestamp: page_view.timestamp,
      properties: {
        "$pathname" => page_view.path,
        "HTTP Method" => page_view.method,
        "HTTP Status" => page_view.status,
      },
    }.deep_merge(group_properties(page_view)))
  end

  private

  def group_properties(page_view)
    if page_view.company_id.present?
      {properties: {"$groups" => {"company" => page_view.company_id}}}
    else
      {}
    end
  end
end
# spec/jobs/forward_page_view_to_post_hog_spec.rb
require "rails_helper"

RSpec.describe(ForwardPageViewToPostHogJob) do
  it "performs" do
    ForwardPageViewToPostHogJob.perform_now(page_views(:admin))
    expect($post_hog.dequeue_last_message).to include(
      event: "$pageview",
      distinct_id: "c239e246-4e59-46fb-8978-15205bbe2071",
      timestamp: "2019-07-08T03:52:45.000Z",
      properties: {
        "$groups" => {"company"=>"1"},
        "$pathname" => "/admin",
        "HTTP Method" => "GET",
        "HTTP Status" => "200",
        "$lib" => "posthog-ruby",
        "$lib_version" => "1.2.4",
      },
    )
  end
end

I wasn't sure how receptive you'd be to this patch so I've tried to be true to what's there and keep it minimal. If you like the idea, I'm happy to take feedback and extend it further.

Some ideas I have:

  • Rename Worker to ThreadWorker so it pairs nicely with NoopWorker. Potentially rename NoopWorker to TestWorker.
  • Make #flush and #ensure_worker_running not block when using the NoopWorker.
  • Stretch goal: Make the queue accessible via something like PostHost::Client.test_queue so it doesn't matter how the client is initialised, you can easily get access to the queue in your tests.

Thank you! 🙏

@pauldambra
Copy link
Copy Markdown
Member

Looks good to me...

Rename Worker to ThreadWorker so it pairs nicely with NoopWorker.

Yep, nice change...

Potentially rename NoopWorker to TestWorker

I quite like NoopWorker. I think it's clearer (a worker that does nothing) than TestWorker (a worker that is used in tests and does......). But feel free to argue the case :)

Make #flush and #ensure_worker_running not block when using the NoopWorker

Yep, should make NoopWorker complete 👍

Stretch goal: Make the queue accessible via something like PostHost::Client.test_queue so it doesn't matter how the client is initialised, you can easily get access to the queue in your tests.

Feels like this should be a follow-up PR to help keep this one focussed?

@pauldambra
Copy link
Copy Markdown
Member

Also, thank you for the contribution 🙌 💖

@tatey
Copy link
Copy Markdown
Contributor Author

tatey commented Jun 20, 2022

Thanks for the feedback @pauldambra 🙏.

I've pushed up some changes for you to review. I'm using fixups to patch commits and will rebase once you're satisfied with the changes.


Yep, nice change...

I quite like NoopWorker. I think it's clearer (a worker that does nothing) than TestWorker (a worker that is used in tests and does......). But feel free to argue the case :)

I prefer NoopWorker too.

I ended up renaming Worker to SendWorker. I think SendWorker and NooWorker better reflect the dequeuing strategy.

Make #flush and #ensure_worker_running not block when using the NoopWorker

Yep, should make NoopWorker complete

This was a bit trickier to do than I first thought. SendWorker effectively uses the same behaviour to flush the queue as it does to run the queue, where as in the NoopWorker we want these to be distinct.

I came up with a way of doing it by passing an argument. I think this is satisfactory, but it's not amazing. In the medium term I'd look to push the entire dequeuing strategy into the workers themselves, rather than splitting it between the client and the worker.

Stretch goal: Make the queue accessible via something like PostHost::Client.test_queue so it doesn't matter how the client is initialised, you can easily get access to the queue in your tests.

Feels like this should be a follow-up PR to help keep this one focussed?

I agree. I won't pursue the stretch goal in this PR.

Comment thread spec/spec_helper.rb
false
end
end

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a positive note we've been able to remove both NoopWorker and DummyWorker 😄

Comment thread lib/posthog/noop_worker.rb Outdated
end

def flush
@queue.clear
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the worker is no-op it feels like this should this be no-op too...

Maybe that pushes us to a different naming 🤔

If we had EventSendingWorker and EventSwallowingWorker is that clearer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting!

Look, I wrestled with this one and ultimately went the other way, but it's not a strongly held opinion.

What ultimately made me think flushing was OK is:

  1. You can call Client#dequeue_last_message which pops the message off the queue
  2. When writing tests it's not uncommon to reset state between tests. I am not personally doing this.
  3. We can't write a flush test for the client in this project unless we actually clear the queue. We could potentially manipulate the queue directly in the test to achieve the desired effect though.

Copy link
Copy Markdown
Contributor Author

@tatey tatey Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I thought of another possibility. I definitely felt tension when I was adding an argument to #ensure_worker_running.

What if clearing and flushing are being conflated? The client is the owner of the queue, not the worker, so what if we exposed a clear method on the client itself?

You could imagine doing the following:

client.flush # Blocks and waits until the queue is cleared
client.clear # Clears the queue without waiting

That would mean the NoopWorker could go back to being literally a noop, we could reach for Client#clear in the test, and consumers of the client, like my project, could reach for Client#clear in tests.

It would also solve another problem I've been wrestling with which is how would I write a test for this....

def perform
  client.capture(...)
  client.flush
end

By the time I put an assertion the queue would be empty with the code as it is at the time of writing. If we changed it based on the above then #flush would noop, so I could do my assertion, and #clear would empty the queue so I could reset state between tests. It's 100% opt-in which feels more respectful too.

How does that sound to you?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I like the separation of arranging items onto the queue and working to pull them off... And hides the worker inside the client nicely 👍

Thanks for bearing with my nit-picking! ❇️

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all! This was a bit of a brain itch for me so I'm glad we persevered. I'm grateful for the feedback 😄

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pauldambra OK, I've made the change, rebased to squash the fixups, and pushed. I'm really happy with this API because it doesn't have the awkwardness we felt in the previous iteration.

The only caveat is that we couldn't get away from monkey patching the client in the #flush spec, but I think this is an acceptable trade off as it's completely isolated to that spec. I feel like this is cleaner than having DummyWorker in spec_helper.rb, but will ultimately support whatever position you have.

All feedback welcome.

tatey added 2 commits June 21, 2022 22:33
- This nicely pairs with NoopWorker.
- Fix before block leaking state to subsequent specs.
@tatey tatey requested a review from pauldambra June 21, 2022 12:37
@tatey
Copy link
Copy Markdown
Contributor Author

tatey commented Jun 23, 2022

@pauldambra Hey there. Anything I can do to help get this merged? I'm grateful for the feedback I've gotten to date, and always open to more.

@pauldambra
Copy link
Copy Markdown
Member

This looks good to me...

@liyiy since you all are working on server side libraries soon do you want to take a look too?

Copy link
Copy Markdown
Member

@pauldambra pauldambra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@pauldambra pauldambra merged commit 0c29405 into PostHog:master Jun 24, 2022
@pauldambra
Copy link
Copy Markdown
Member

Thanks again @tatey .

You should get an email from our contributor bot. You can find out what to do if that doesn't arrive here https://posthog.com/docs/contribute/recognizing-contributions

@tatey
Copy link
Copy Markdown
Contributor Author

tatey commented Jun 24, 2022

No worries. It's been a delight. Thanks for cutting a new release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants