
Burst

The Burst Behavioral Analysis Engine

Burst is designed to support fast, rich, and flexible behavioral study of enormous, noisy real-world event datasets generated by mobile applications as they are used day to day by their end users over long periods of time. It was developed at a small mobile-application analytics startup called Flurry, later acquired by Yahoo, and has been serving production-scale request workloads (for free!) 24x7 to its customers for many years. It is very good at what it does and is now available to you as an open source platform.

Is Burst For You?

We suggest you start any effort to understand Burst by first taking a look at how we define Behavioral Analysis. That overview lets you quickly determine whether your data, and the questions you want to ask of it, match what Burst does well. Then we suggest turning to the overviews of the Burst Data Model, the Burst Execution Model, and finally the Burst Runtime Model. These high-level presentations should give you a cleaner and deeper sense of what Burst is, how it works, and how you might envision it working for you. For extra credit, dig into a unique approach Burst takes called the Single Pass Scan, as well as the high-level discussions of Performance, Security, and Sampling.

What Burst Is Not

It is equally important to spend a moment clarifying what Burst is not...

a database

Burst is not a general purpose query engine, nor is it a persistent, authoritative, or transactional database. It is an online analysis engine that scans imported data snapshots. To analyze your data, you must first import a dataset from your data storage system into the memory/disk cache of a suitable Burst compute cell, where you can then run one or more analysis requests across that data snapshot.
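To make that import-then-analyze workflow concrete, here is a minimal sketch. Everything Burst-specific in it is a hypothetical stand-in for illustration: BurstClient, importView, executeAnalysis, and the host/port are assumptions, not the actual Burst client API. See the subsystem documentation for the real interfaces.

```java
// Sketch of the Burst workflow: import a data snapshot, then analyze it.
// BurstClient and its methods are HYPOTHETICAL names used for illustration;
// they are not the real Burst client API.
interface BurstClient {
    void importView(String domain, String view);       // pull a snapshot into the cell cache
    String executeAnalysis(String domain, String view, String eql); // scan the snapshot

    static BurstClient connect(String host, int port) {
        throw new UnsupportedOperationException("illustrative sketch only");
    }
}

public final class BurstWorkflowSketch {
    public static void main(String[] args) {
        // 1. Connect to the supervisor of a Burst compute cell (host/port are placeholders).
        BurstClient client = BurstClient.connect("burst-supervisor.example.com", 4443);

        // 2. Import (or refresh) the dataset snapshot from the remote datasource.
        //    Analyses never read the source system directly, only this cached snapshot.
        client.importView("my-domain", "my-view");

        // 3. Run one or more analyses across the cached snapshot.
        System.out.println(client.executeAnalysis("my-domain", "my-view",
                "select count(user) as users from schema unity"));
    }
}
```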

real time

Burst does not support what can be considered real-time or streaming data access, but it does provide services and protocols that can be used to build efficient, massively parallel import pipelines that fetch up-to-date data quickly. The data Burst analyzes is only as current as the last import. Burst also has features that let you control the time window (lookback) of the 'view' you import.
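As a concrete illustration of the lookback idea, this sketch computes the epoch-millisecond bounds of a 30-day window ending now, the kind of time bounds an import pipeline might apply when fetching a view's data. It uses only standard java.time; no Burst APIs are involved, and the 30-day figure is an arbitrary example.

```java
import java.time.Duration;
import java.time.Instant;

// Computes the boundaries of a 30-day "lookback" window ending now.
// Plain java.time; nothing here is Burst-specific.
public final class LookbackWindow {
    public static void main(String[] args) {
        Duration lookback = Duration.ofDays(30);   // example lookback for a view
        Instant end = Instant.now();               // data is only as current as the last import
        Instant start = end.minus(lookback);

        System.out.printf("import window: [%d, %d) epoch millis%n",
                start.toEpochMilli(), end.toEpochMilli());
    }
}
```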

conformant SQL

Burst has a rich front-end language called EQL, and where possible we have tried to make that language conform to, and read like, SQL. However, though the world of behavioral data and questions significantly overlaps the world of relational data models and relational calculus, EQL and its underlying semantics are simply not the same, nor are they intended to be the same, as SQL and its underlying semantics.
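To give a feel for both the overlap and the divergence, the snippet below pairs a flat, set-oriented SQL aggregate with an EQL-flavored query that walks each user's object tree (user to sessions). Both query strings are illustrative approximations, not verified EQL grammar, and the schema and field names are assumptions; consult the EQL documentation for real syntax.

```java
// Contrast: a SQL-style aggregate vs. an EQL-flavored behavioral query.
// Both strings are ILLUSTRATIVE ONLY: approximate syntax, not verified
// against the EQL grammar; the schema and field names are assumptions.
public final class EqlVsSql {
    // Relational style: flat rows, set-oriented semantics.
    static final String SQL =
            "SELECT COUNT(*) FROM sessions WHERE os = 'ios'";

    // Behavioral style: a scan over each user's object tree
    // (user -> sessions), aggregating per entity.
    static final String EQL =
            "select count(user.sessions) as sessions " +
            "from schema unity " +
            "where user.sessions.osVersionId == 1";

    public static void main(String[] args) {
        System.out.println("SQL: " + SQL);
        System.out.println("EQL: " + EQL);
    }
}
```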

Prerequisites

In order to stand up a Burst compute cell and use it to analyze your data, you will need these basics:

  1. A Burst Compute Cell: one or more nodes with a Java runtime environment, on which you set up a Burst Supervisor/Worker process or container topology (e.g. Kubernetes). The Burst runtime is distributed either as a Docker container or as an executable jar that can be placed into virtual containers or any other packaging/deployment environment appropriate to your needs. Within a few reasonable limits, you can scale Burst horizontally to service larger datasets and vertically to provide faster computations.
  2. Metadata Catalog: Burst uses a MySQL database as a Catalog that stores metadata. For most scenarios this database does not need to be particularly high performance, though for high/concurrent analysis request rates it should be able to provide low-latency indexed table lookups (a minimal connectivity check is sketched after this list).
  3. Remote Datasource: a datasource system/cluster, with access to your data, where the Burst Java remote data-import endpoint can be stood up. This can be colocated on the Burst compute cell. If you have a parallel (multi-node) data storage system such as HBase, the Burst data import system is quite good at spreading remote data-feed endpoints across numerous data nodes.
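As a minimal sanity check for prerequisite 2, the sketch below times a trivial round trip to the MySQL instance that will back the Catalog, using plain JDBC (it assumes the MySQL Connector/J driver is on the classpath). The host, database name, and credentials are placeholders; the real catalog schema and connection settings come from your Burst configuration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Times a trivial round trip to the MySQL instance that will back the
// Burst Catalog. Host, database, and credentials below are PLACEHOLDERS.
public final class CatalogPing {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://catalog-host.example.com:3306/burst_catalog";
        long start = System.nanoTime();
        try (Connection conn = DriverManager.getConnection(url, "burst", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            rs.next();
        }
        long micros = (System.nanoTime() - start) / 1_000;
        // High/concurrent request rates want this (and indexed lookups) to stay low.
        System.out.println("catalog round trip: " + micros + " us");
    }
}
```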

Next Steps

If you want to get up close and personal, we have a few more steps for you to take...

Digging Deeper

If you are still with us, and you want to understand and/or vet the implementation, we suggest you take a look at the individual subsystem documentation and become familiar with our external dependencies.

