Blitzy-Sandbox/blitzy-RudderStack

📖 Just launched Data Learning Center - Resources on data engineering and data infrastructure

The Customer Data Platform for Developers

Website · Documentation · Docs · Changelog · Blog · Slack · Twitter


As the leading open source Customer Data Platform (CDP), RudderStack provides data pipelines that make it easy to collect data from every application, website and SaaS platform, then activate it in your warehouse and business tools.

With RudderStack, you can build customer data pipelines that connect your whole customer data stack, then make them smarter by triggering enrichment and activation in customer tools based on analysis in your data warehouse. Its easy-to-use SDKs and event source integrations, Cloud Extract integrations, transformations, and expansive library of destination and warehouse integrations make building customer data pipelines for both event streaming and cloud-to-warehouse ELT simple.

Try RudderStack Cloud Free, a free tier of RudderStack Cloud, and start building a smarter customer data pipeline today.

Key features

  • Warehouse-first: RudderStack treats your data warehouse as a first-class citizen among destinations, with advanced features and configurable, near real-time sync. Warehouse capabilities include configurable backfill with date-range support for historical data re-sync; selective sync with per-table and per-column filtering; warehouse replay from archived events for targeted re-processing; enhanced health monitoring with Prometheus metrics, per-upload tracking, and alerting thresholds; and idempotent sync validation across all 9 warehouse connectors (Snowflake, BigQuery, Redshift, ClickHouse, Delta Lake, PostgreSQL, MSSQL, Azure Synapse, and Datalake).

  • Developer-focused: RudderStack is built API-first. It integrates seamlessly with the tools that developers already use and love.

  • High Availability: RudderStack comes with at least 99.99% uptime. Its error handling and retry system ensures that your data is delivered even in the event of network partitions or destination downtime.

  • Privacy and Security: You can collect and store your customer data without sending everything to a third-party vendor. With RudderStack, you get fine-grained control over what data to forward to which analytical tool.

  • Unlimited Events: The event volume-based pricing of most commercial systems is broken. With RudderStack Open Source, you can collect as much data as you need without worrying about overrunning your event budget.

  • Segment API-compatible: RudderStack is fully compatible with the Segment API and achieves 100% field-level parity with the Twilio Segment Event Specification across all six core event types (identify, track, page, screen, group, alias), including structured Client Hints pass-through (context.userAgentData) and semantic event category support. RudderStack has also validated drop-in SDK compatibility with Segment's JavaScript (analytics.js / Analytics 2.0), iOS (analytics-ios), Android (analytics-android), and server-side SDKs (Node.js, Python, Go, Java, Ruby). Existing Segment SDK users can migrate by swapping the endpoint URL and Write Key — no code changes required. See the SDK Compatibility Migration Guides for per-SDK instructions.

  • Production-ready: Companies like Mattermost, IFTTT, Torpedo, Grofers, 1mg, Nana, OnceHub, and dozens of large companies use RudderStack for collecting their events.

  • Seamless Integration: RudderStack currently supports integration with over 90 popular tool and warehouse destinations.

  • User-specified Transformation: RudderStack offers a powerful JavaScript-based event transformation framework which lets you enhance or transform your event data by combining it with your other internal data. Furthermore, as RudderStack runs inside your cloud or on-premise environment, you can easily access your production data to join with the event data.

Get started

The easiest way to experience RudderStack is to sign up for RudderStack Cloud Free - a completely free tier of RudderStack Cloud.

You can also set up RudderStack on your platform of choice with these two easy steps:

Step 1: Set up RudderStack

Note: If you are planning to use RudderStack in production, we STRONGLY recommend using our Kubernetes Helm charts. We update our Docker images with bug fixes much more frequently than our GitHub repo.

Step 2: Verify the installation

Once you have installed RudderStack, send test events to verify the setup.
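As a rough illustration of what a test event could look like, the sketch below builds (but does not send) a single track request. The data plane URL, the write key placeholder, the `/v1/track` path, and the helper name `build_track_request` are assumptions for illustration, following the Segment-style HTTP API convention of Basic auth with the write key as the username; check the documentation for the exact steps for your installation.

```python
import base64
import json
import urllib.request

# Assumptions (not from this README): placeholder data plane URL and write key,
# and a Segment-style /v1/track endpoint that accepts HTTP Basic auth.
DATA_PLANE_URL = "http://localhost:8080"
WRITE_KEY = "<your-write-key>"


def build_track_request(data_plane_url, write_key, user_id, event, properties):
    """Build (but do not send) a POST request carrying one track event."""
    payload = {
        "userId": user_id,
        "event": event,
        "properties": properties,
    }
    # Basic auth: the write key is the username and the password is empty.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=data_plane_url.rstrip("/") + "/v1/track",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_track_request(
        DATA_PLANE_URL, WRITE_KEY, "test-user-1", "Test Event", {"source": "setup-check"}
    )
    print(req.get_method(), req.full_url)
    # To actually send the event: urllib.request.urlopen(req, timeout=5)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before pointing it at a live data plane.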

Architecture

RudderStack is an independent, stand-alone system with a dependency only on the database (PostgreSQL). Its backend is written in Go with a rich UI written in React.js.

A high-level view of RudderStack’s architecture is shown below:

[Figure: RudderStack architecture diagram]

For more details on the various architectural components, refer to our documentation.

For detailed architecture documentation, see the Architecture Overview. See also: Data Flow | Pipeline Stages | Deployment Topologies | Warehouse State Machine

📚 Documentation

Comprehensive documentation is available in the docs/ directory, covering architecture, API references, integration guides, operational runbooks, and Segment parity analysis.

| Category | Description |
|---|---|
| Gap Report | Segment parity gap analysis and sprint roadmap |
| Architecture | System architecture, data flows, deployment topologies |
| API Reference | HTTP API, Event Spec, gRPC API, error codes |
| Getting Started | Installation, configuration, first events |
| Migration Guide | Segment-to-RudderStack migration |
| Source SDKs | JavaScript, iOS, Android, server-side SDK guides |
| Destinations | Stream, cloud, and warehouse destination guides |
| Transformations | Custom transforms and Functions |
| Governance | Tracking plans, consent, event filtering |
| Identity | Identity resolution and profiles |
| Operations | Warehouse sync, warehouse replay, backfill, capacity planning |
| Warehouse Connectors | Per-warehouse setup and configuration guides |
| Backfill API | Warehouse backfill with configurable date ranges |
| Health Monitoring | Warehouse sync health metrics, Prometheus integration, alerting |
| Selective Sync | Per-table and per-column warehouse sync filtering |
| Warehouse Replay | Replay archived events through the warehouse pipeline |
| Reference | Configuration, environment variables, glossary |
| Contributing | Development setup, destination onboarding, testing |
| SDK Compatibility | Segment SDK migration guides for JavaScript, iOS, Android, and server-side SDKs |
| Cloud Source Framework | Cloud source ingestion architecture design for polling/webhook-based SaaS integrations |

Segment Parity Gap Report

A comprehensive gap analysis comparing RudderStack capabilities against Twilio Segment features is available in the Gap Report. The analysis also covers destination catalog coverage, transformation/Functions, Protocols enforcement, identity resolution, and warehouse sync.

The Event Spec Parity dimension has achieved 100% field-level parity with the Twilio Segment Event Specification, covering all six core event types (identify, track, page, screen, group, alias), all 18 standard context fields, structured Client Hints (context.userAgentData), 17 reserved identify traits, 12 reserved group traits, and seven semantic event categories (E-Commerce v2, Video, Mobile, B2B SaaS, Email, Live Chat, A/B Testing).

Source SDK Compatibility has been validated across JavaScript, iOS, Android, and five server-side SDKs, raising the Source Catalog parity score from ~60% to ~85%. A Cloud Source Framework design has been produced to address the gap of 140 cloud app sources through a polling/webhook-based ingestion architecture.

RudderStack extensions beyond the Segment spec, including /v1/replay, /internal/v1/retl, /beacon/v1/*, /pixel/v1/*, and the merge call type, are documented in the Event Spec API Reference.
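To make the event shape concrete, here is a minimal sketch of a page event carrying structured Client Hints under context.userAgentData, plus a small check against the six core event types. The helper name `is_core_event`, the sample values, and the `brands`/`mobile`/`platform` keys (taken from the browser User-Agent Client Hints API) are assumptions for illustration, not a statement of RudderStack's internal validation.

```python
# The six core event types named in the Segment Event Specification.
CORE_EVENT_TYPES = {"identify", "track", "page", "screen", "group", "alias"}


def is_core_event(event):
    """Return True if the event's type is one of the six core event types."""
    return event.get("type") in CORE_EVENT_TYPES


# Hypothetical sample event; all values are placeholders.
example_event = {
    "type": "page",
    "anonymousId": "anon-123",
    "name": "Home",
    "context": {
        # Structured Client Hints, passed through as-is (assumed key names).
        "userAgentData": {
            "brands": [{"brand": "Chromium", "version": "120"}],
            "mobile": False,
            "platform": "macOS",
        }
    },
}

if __name__ == "__main__":
    print(is_core_event(example_event))
    # The merge call type is a RudderStack extension, not a core type.
    print(is_core_event({"type": "merge"}))
```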

Warehouse sync parity has been improved from ~80% to ~95% through the Sprint 7–9 Warehouse Feature Enhancement, which delivered idempotent sync validation across all 9 warehouse connectors (Snowflake, BigQuery, Redshift, ClickHouse, Delta Lake, PostgreSQL, MSSQL, Azure Synapse, and Datalake), configurable backfill with date-range support, enhanced health monitoring with Prometheus metrics and alerting, selective sync with per-table and per-column filtering, and warehouse replay from archived events. See the Backfill API, Health Monitoring, Selective Sync, and Warehouse Replay documentation for details.

Note: Segment Engage/Campaigns and Reverse ETL are planned for Phase 2.

Segment SDK Compatibility

RudderStack Gateway supports drop-in compatibility with Segment SDK client libraries. Existing Segment SDK users can migrate to RudderStack by replacing the endpoint URL (api.segment.io → <your-rudderstack-data-plane-url>) and substituting a RudderStack Write Key; no application code changes are required.
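The endpoint swap can be sketched as a small URL rewrite that keeps the request path and replaces only the host. The helper name `to_rudderstack_url` and the example data plane URL are hypothetical; in practice each SDK exposes its own configuration option for the endpoint, covered in the per-SDK migration guides.

```python
from urllib.parse import urlsplit, urlunsplit

SEGMENT_HOST = "api.segment.io"


def to_rudderstack_url(segment_url, data_plane_url):
    """Rewrite a Segment API URL so it targets a RudderStack data plane.

    Only the scheme and host change; the path (for example /v1/batch) and
    query string are preserved. Any path on data_plane_url is ignored.
    """
    src = urlsplit(segment_url)
    if src.netloc != SEGMENT_HOST:
        raise ValueError(f"not a Segment endpoint: {segment_url}")
    dst = urlsplit(data_plane_url)
    return urlunsplit((dst.scheme, dst.netloc, src.path, src.query, ""))


if __name__ == "__main__":
    # Hypothetical data plane URL used purely for illustration.
    print(to_rudderstack_url(
        "https://api.segment.io/v1/batch",
        "https://dataplane.example.com",
    ))
```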

The following SDKs have been validated for full compatibility:

| SDK | Library | Validated Capabilities |
|---|---|---|
| JavaScript | analytics.js / Analytics 2.0 | All 6 event types, batch (/v1/batch), beacon (/beacon/v1/batch), pixel (/pixel/v1/track, /pixel/v1/page) |
| iOS | analytics-ios (Swift) | All event types, mobile context auto-collection (device, os, app, network, screen), lifecycle events |
| Android | analytics-android (Kotlin) | All event types, mobile context auto-collection (device, os, app, network, screen), lifecycle events |
| Node.js | analytics-node | Batch endpoint, retry behavior |
| Python | analytics-python | Batch endpoint, flush behavior |
| Go | analytics-go | Batch endpoint |
| Java | analytics-java | Batch endpoint |
| Ruby | analytics-ruby | Batch endpoint, retry behavior |

Migration guides:

Contribute

We would love to see you contribute to RudderStack. For more information on how to contribute, see the Contributing documentation.

License

RudderStack server is released under the Elastic License 2.0.
