Blog - How we built a scalable, arbitrary code app server (from MVP to millions of events) #4505

Closed
ivanagas opened this issue Oct 21, 2022 · 1 comment · Fixed by #4612
@ivanagas
Contributor

Strapline

Explain the idea in a sentence or two

How we built a scalable, arbitrary code app server (from MVP to millions of events)

Why should we do it?

How will it be useful or interesting for readers/viewers

I think it is very interesting. Marius, Yakko, and Michael have already done a solid amount of work on it, and it could be very popular with our target demographic. There is plenty there for a technical deep dive.

#1955

https://www.youtube.com/watch?v=3_yH24Bh0HE

Was talked about but never finished: https://github.com/PostHog/product-internal/issues/152#issuecomment-906544963

Outline

Bullet point outline of structure / questions / topics to be covered

  • We wanted the ability for anyone to run code that modifies their events
    • Because PostHog is product analytics, an endless range of companies could use us, each with its own data requirements, and we might not solve them all ourselves.
    • To make sure we are capturing as much data as possible for users, we can integrate with other services.
  • The beginning
    • Hackathon, adding a world map
    • Failed beginnings
      • Python
    • MVP built in 3 days
  • App structure summary
    • Each app has two files: index.js (or index.ts) and plugin.json
      • index contains logic
      • plugin contains configuration for user input
    • Also includes Attachment, Storage, and LogEntry
      • Aka extensions
    • Within index, functions like processEvent and onEvent take a single event and the meta object (basically the configuration values)
    • Exports a function for use in a VM
  • Integrating with PostHog, focusing on areas most relevant to apps.
    • https://posthog.com/docs/how-posthog-works/ingestion-pipeline
    • Need to maintain scalability (millions of events+)
    • Integrating with our ingestion pipeline
    • Plugin server, focus on areas most relevant to apps.
      • Main thread (control plane)
        • Takes tasks, starts services
        • Manages threads with Piscina
          • Abstracts a lot away
        • Scheduler
        • Job queue
      • Worker threads (receive and do tasks)
        • Managed by Piscina
        • Each thread can run up to 10 tasks at the same time
        • Structured with ingestion logic, connections object, and VMs (see App structure above)
        • Tasks relevant to apps
  • Building more functionality
  • How did we make sure this doesn’t break things?
  • What does this allow?
    • You can create and upload your own app.
      • Many plugins have been created by our community
      • Large customers writing their own apps for their needs
    • Anyone can run what they want when self-hosting.
    • Expand the functionality of PostHog and allow people to customize their ingestion and data processing themselves
  • Summary
    • Build an MVP, limit scope
    • Integrate with what you have and what is most useful for users
    • Improve robustness and stability
    • Expand functionality
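
To make the "App structure summary" item above more concrete, here is a minimal sketch of what an app's index.ts could look like. It only illustrates the shape described in the outline (processEvent and onEvent each taking a single event and the meta object). The PluginEvent and Meta types, the "greeting" config key, and the inlined pluginJson object are all hypothetical stand-ins for illustration, not the real PostHog types or a real plugin.json.

```typescript
// Simplified, hypothetical stand-ins for the event and meta shapes.
interface PluginEvent {
  event: string;
  distinct_id: string;
  properties: Record<string, unknown>;
}

interface Meta {
  // Values a user would enter through the fields declared in plugin.json.
  config: Record<string, string>;
}

// Hypothetical plugin.json contents, inlined so the sketch is self-contained.
const pluginJson = {
  name: "Greeting tagger",
  config: [
    { key: "greeting", name: "Greeting to attach", type: "string", default: "hello" },
  ],
};

// processEvent takes one event plus the meta object and returns the
// (possibly modified) event. In a real index.ts it would be exported.
function processEvent(event: PluginEvent, meta: Meta): PluginEvent {
  const greeting = meta.config.greeting ?? pluginJson.config[0].default;
  return { ...event, properties: { ...event.properties, greeting } };
}

// onEvent takes the same arguments but only observes the event.
// Here it just records the event name in memory.
const seenEvents: string[] = [];
function onEvent(event: PluginEvent, _meta: Meta): void {
  seenEvents.push(event.event);
}
```

In this sketch, the split between the two files mirrors the outline: the logic lives in the functions, while the user-facing configuration is declared separately and arrives at runtime through meta.config.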
@ivanagas ivanagas self-assigned this Oct 21, 2022
@andyvan-ph
Contributor

Go for it. This has Hacker News written all over it.
