Sprint - Aug 5 to Aug 16 #24105

Open
daibhin opened this issue Jul 31, 2024 · 9 comments
Labels
sprint Sprint planning

Comments

@daibhin
Contributor

daibhin commented Jul 31, 2024

Global Sprint Planning

3 things that might take us down

  1. nothing

Urgent incident follow-ups

https://github.com/orgs/PostHog/projects/103/views/2

Team sprint planning

For your team sprint planning copy this template into a comment below for each team.

# Team ___

**Support hero:** ___

## Retro

<!-- Grab the high and low priority items from last time and add whether that item was completed or not -->

- 

## Hang over items from previous sprint

<!-- For each item, decide to re-prioritise (and add below) or deprioritise -->

- Item 1. prioritised/deprioritised

## OKR

1. OKR, status (red/yellow/green) and action points if yellow/red


### High priority

-

### Low priority / side quests

-

@daibhin daibhin added the sprint Sprint planning label Jul 31, 2024
@daibhin daibhin pinned this issue Jul 31, 2024
@daibhin
Contributor Author

daibhin commented Jul 31, 2024

Team Replay David

Support hero: @daibhin

Retro

Note: missed a lot of goals due to unexpected time off in the team

  • @annikaschmid pricing changes were unexpectedly complicated. Almost over the line
    • retro scheduled
  • @daibhin support load is ticking up again
    • look into support engineers handling proxy issues
    • update feature ownership for proxy related issues

High priority

✅ done
🟢 in progress, on track
🟡 in progress, off track
🔴 missed

  • 🟢 make sure pricing changes have landed @pauldambra
    • will be done by the end of the week, Raquel running last 3k customers today
  • 🟡 error tracking
    • 🔴 if/when new ingestion pipeline
      • we do have a new ingestion pipeline but no work done here
      • paused until Paul is back and can catch up with Ben
    • 🟡 alerting/notifications? @daibhin
  • 🔴 heatmap ingestion separation? @pauldambra

Low priority / side quests

  • 🔴 replay capture "large messages" @pauldambra
  • 🔴 feels like there is a person filtering bug where we don't get consistent results in replay @daibhin / @pauldambra
    • doesn't seem widespread, no other reports
  • 🔴 replay bug investigation tooling @pauldambra / @daibhin
    • playback snapshot by snapshot
    • chrome extension avoidance
  • 🟢 session replays & survey following up on the new filtering experience @daibhin
    • replays looked positive, people filtered well
    • survey running, waiting on replies

OKR

  1. OKR, status (red/yellow/green) and action points if yellow/red
  • 🟡 📱Goal 1: People think of PostHog as a mobile solution
  • 🟢 🪲 Goal 2: Error tracking in people's hands
  • 🟡 ⁉️ Goal 3: Hiring
    • Neil's joining the team!

High priority

  • Onboard @neilkakkar into the team
  • Reduce support backlog (<=10 tickets, currently 25) @daibhin
  • Error tracking alpha @daibhin
    • no stack traces or alerting
    • JS SDK support only
    • convert fingerprints to arrays (currently strings)
    • Add alpha tags in UI and feedback banner
    • share with customers previously interviewed (no feature preview)
  • Build out SDK support
    • Manual capture methods for JS and Python SDKs
  • iOS session recording performance improvements @marandaneto
    • Multiple people are complaining about a performance hit on screens that constantly redraw (animations, transitions, maps, etc.), so this is a priority.
  • React Native session recording @marandaneto

Low priority / side quests

  • replay bug investigation tooling @pauldambra
    • playback snapshot by snapshot

@benjackwhite
Contributor

benjackwhite commented Jul 31, 2024

Team Infra

Retro / hangover

  • 🟢 Get PoC of Warpstream for both Replay and events in place @frankh
    • It's fully in production for all Replay things!
  • 🔴 Actually implement reverse proxy sharding idea @frankh
    • Warpstream took prio here (keeping things focused)
  • 🟡 Helmfile is starting to hit its scaling limits and has random failures @frankh
    • Made a change to improve some failures but it's still not great
    • Also removed some other dependencies that were causing issues
  • 🟢 Postgres EU schema maintenance @danielxnj
    • Couple of bumps but we got there
  • 🟢 K8s upgrades maintenance @danielxnj
    • Karpenter upgrade still outstanding
  • 🟢 RDS maintenance @danielxnj
  • 🟢 Security group roll outs for locking down services @danielxnj
    • Dev rollout being tested - looking good. Will carefully roll out to prod envs
  • 🟢 Roll out celery keda to EU @benjackwhite
  • 🟡 Mr blobby autoscaling based on batch size utilization @benjackwhite
  • 🟡 Infisical decision and plan @benjackwhite @danielxnj
    • Cloud test worked fine
    • Deprioritized

OKR

  1. 🦹 Zero-trust security 🟢
  2. 🤓 10x Developer Experience 🟡
  3. 💪 Every service lives and dies alone
  4. 💰 Save big on cost 🟡

High priority

@Phanatic
Contributor

Phanatic commented Jul 31, 2024

Team Feature Success

Support hero: @Phanatic
Days off:
Juraj: 0 days
Phani: 2 days
Dylan: 0 days

Retro

  • Customer support:
    • TTFR was lower this time around.
    • Easier than last time.
    • Customers were confused about why feature flags were evaluated as they were.
    • Multiple questions about overriding feature flags on the client.
  • ⌛ Experiment confidence intervals @jurajmajerik : Juraj on parental leave
  • ✅ verification framework/service to ensure that rust flags service works the same as decide. @dmarticus
  • no-code experiments RFC - @Phanatic

Hang over items from previous sprint


OKRs

  1. Make sure feature flags can handle 10x current scale
  2. No-code experiments
  3. Split out experiments into its own product

High priority

Low priority / side quests / maybe the team will get to this next year

  • Temporal queues for feature success - @Phanatic

@EDsCODE
Member

EDsCODE commented Jul 31, 2024

Team Data <->

OKR Q2 2024

Objective

Query 3000

  • Key Results:
    • Autocomplete
    • Increase general BI experience/product BI meta#157
    • Declutter the data warehouse UI and make the features intuitive to find

Data Modeling MVP

  • Key Results:
    • Infrastructure decided and implemented
    • Integrating external data with feature flags
    • External data everywhere in insights/persons/cohorts
    • Get billing team to use modeling in PostHog for their invoices_with_annual table

Retro

High priority

  • Data Modeling MVP/experimentation. We have some ideas that need to be tested @tomasfarias
  • Data warehouse design refresh @EDsCODE
  • Get a better view of adoption, both absolute and relative to broad PostHog sign-ups @EDsCODE
  • BI continued BI meta#157 @Gilbert09

@raquelmsmith
Member

Team Growth

Retro

Retro items
  • @raquelmsmith (was out most of this week, didn't )
  • @zlwaterfield (on call first week - support second week)
    • Complete the subscribe to all products backfill (left over - ran into data issues that have been resolved)
    • Complete the startup plan metadata clean and dashboard (left over)
    • Add at least one E2E SAML test (left over)
    • Improve activation error redirects in billing
    • Block customers from resubscribing if they've previously had their sub canceled from failed payments
    • Update support response time copy to use "target response time"
    • Improve billing limits - store as number, improve validation / error handling in client and server
    • Startup customer events for customer.io emails when rolling off
    • Misc plan issues - users on enterprise when they shouldn't be, mismatched tiers, free session replay plans, etc.

Q3 Goals

✅=finished 🟡=in progress 🔴=won't finish ⚪=not started

  1. 🟡 Make onboarding awesome for Product analytics and Data warehouse (Raquel)
  2. ⚪ Support self-serve annual commitments (Zach)
  3. 🟡 Dive into the data to understand our billing metrics and customers better (Zach)
  4. ✅ Launch pricing for data warehouse (Raquel)
  5. 🟡 Hire 2 people (one for billing, one for auth/permissions focus)

This sprint

Time off: @raquelmsmith (August 5-12)

  • @zlwaterfield
    • teams annual plans
    • sentry profile / perf clean up
    • subscription cancellation reasoning dropdown
    • ICP score research - how is it calculated, where is it stored, when is it accessible, does it align with current views of ICP?
    • startup plan credits linked to infra cost
    • billing limit to numbers
    • SAML tests
  • @raquelmsmith

@Twixes
Collaborator

Twixes commented Jul 31, 2024

Team Product Analytics

Support hero: @webjunkie (no secondary support)

Retro

  • 🍋‍🟩 @thmsobrmlr Remove filters from the frontend, allowing us to save new insights with query only.
  • 🍋‍🟩 @skoob13 "First time this event was done by user" filter in Trends. Then, LLM-based insight generation MVP.
  • 🍋‍🟩 @Twixes 1 week of support. Project environments rolled out internally. Onboarding @annaszell.
  • 🟢 @aspicer Just 1 week of support.

Goals

  1. Rock-solid analytics (@thmsobrmlr + @webjunkie + @aspicer + @anirudhpillai)
    1. 🟢 Legacy Minus – removing legacy insights code so that we can move fast
    2. 🟠 Tests Plus – shipping fewer bugs in the first place.
    3. 🔴 Metrics Plus – catching issues before users report them
    4. 🟡 Performance Plus - eliminating UX pain via maximum query performance/reliability, based on Metrics Plus data
    5. 🟢 Support Plus – sparking joy for users when they’re led to report a bug
  2. Answering more product questions, deeper (@thmsobrmlr + @webjunkie + @aspicer + @anirudhpillai)
    1. 🔴 Growth Plus - increasing ease of onboarding, and subsequent retention
    2. 🟡 Analysis Plus - answering more product questions, more deeply
  3. 🟡 ArtificialHog (@Twixes + @skoob13) – an LLM-based chat-like interface for answering product questions.

High priority

Offsite week, including hackathon + hackathon followup!

  • @thmsobrmlr Refactoring the backend from Insight.filters to Insight.query (dry run of insights table migration)

Low priority / side quests

@fuziontech
Member

fuziontech commented Jul 31, 2024

Team Click Haus, Haus of the Hogs

OKR Q2 2024

Objective

James as a Service -> ClickHouse as a Service

  • P0 tasks such as
    • 🟢 Deletes
    • 🟢 Keeping clusters happy
    • 🟢 Provisioning more disks
    • 🟢 Schema Reviews
    • 🟢 Debugging
    • 🟢 Performance < Thanks @tkaemming
    • 🟢 Backups/Restores
  • Decide whether ByConity is the way forward
    • 🟢 Load it with data, set up
    • 🟡 Test performance, test the functionality/compatibility gaps
  • IF ByConity works, migrate over to it
    • 🟡 Enumerate all functionality that doesn’t work and update the functions/contribute to ByConity
    • 🟡 Syntax
    • 🟡 If it works on metal, put it in k8s with Karpenter
    • 🟡 Evaluate which nodes we should use
  • IF ByConity doesn’t work, reshard US to look like EU cluster
    • 🟡 All clusters (Dev, US, EU) should be consistent in shape and topology. This will make it easier to manage and maintain the clusters and apply learnings from one cluster to another.
    • 🟡 We want all cluster operations to be automated and managed through some form of infra as code that is available in source control.
    • 🟡 Schema management on ClickHouse should be entirely automated and managed through source control with no exceptions. This includes Coordinator schemas.
    • 🟡 We should be able to spin up and down replicas of any cluster with no manual intervention.
    • 🟢 We should be able to upgrade ClickHouse versions with no manual intervention.
    • 🟡 We should have tooling / runbooks for resharding (if we continue down the current coordinator path)

Board

https://github.com/orgs/PostHog/projects/85/views/2

Retro

@Daesgar - Somewhat hard emotionally for me because of everything going on (being the only person in EU for CH). The incident we had was pretty awesome in terms of facing this kind of thing and dealing with it/recovering from it. Not the best to have this kind of incident, but you learn a lot from them. Validated connection at the pueblo ✅

@fuziontech - It was good to have an incident like last week's so that we know that we're never losing data. We understand the problem better and how to address it in the future. Basically the system is Anti-Fragile. Slowly our systems will become more robust. Overall good week. I wish we still had more time because distractions like this do slow down progress against the planned work. Sudden reprioritization of work.

Board Snapshot

image

@robbie-c
Collaborator

robbie-c commented Jul 31, 2024

Team web analytics

Support hero: @robbie-c

Retro

This sprint ended up being quite different from what was planned, and I spent a lot of it in Zendesk.

The session table v2 rollout had to be reverted due to a problem with the backfill, but this is now fixed, rolled out again, and looks good.

Ended up fixing a few long-standing bugs with the help of some customers.

I told Joe that if we had to do a marketing launch and come out of beta right now, I'd be comfortable with that. He's on holiday next week but we'll start planning when he's back.

Planned Tasks

🟡 Figure out and fix #23690 Person profiles not being merged, using recent versions of the JS SDK (maybe 2 different issues; the customer-facing one is more urgent than the PostHog-facing one)
🟢 Get everyone in EU onto sessions v2, iron out any problems
🔴 Come up with a solution for a few linked issues: people want a default set of filters, and people want it to be easier to separate their sites. There's been a lot of discussion on this in the past, see #18863 #12181 #20314 <-- I'm leaning towards the first link here but figuring this out is part of the work

OKR

  1. Make querying fast enough for large customers
  2. Heavily requested features
  3. Work better with other products
  4. Product and growth

High priority

Hangover

Stretch goals

  • Add some visual regression tests for web analytics
  • Docs changes for the improvements to attribution

Ongoing

  • In the background, continue to backfill the sessions table (EU is done, need to do US)

@benjackwhite
Contributor

benjackwhite commented Jul 31, 2024

Team Customer Data Platform (new and improved)

Off: Brett 1 week
Support: Oliver

Retro

High priority

  • 🔴 Phase out final uses for jobs and scheduler deployments
  • 🟢 Migrate all semver flattener users to use the inlined plugin and inline some more plugins (user agent done too)
  • 🟢 Rusty hook to use app metrics v2 for debuggability (Brett)
  • 🟢 New Pipelines UI is fully out
  • 🟢 Support for AWS kinesis destination (Hog STL)
  • 🟢 Throughput estimation graph
  • 🟢 App metrics for hog functions
  • 🟢 Native oauth flows for
  • 🟢 Hook up to an actual webhook service
  • 🟡 mvp of event and property definitions as a separate service (Oliver)

OKR

(to be refactored)

High priority

  • Property/Event definitions off the critical path @oliverb123
  • Exception ingestion @bretthoerner @benjackwhite
  • Delivery service V0.2
    • Stress test the existing rusty hook system and any improvements for a V0.1
    • "bad function" rate limiting system
    • Architect V0.2 ("Codename Cyclotron")
  • Final UX stuff to enable releasing this in some form @benjackwhite

Side quests

  • Warpstream for events testing (Mostly done by infra)


8 participants