Make shell component serialisation plugable #697

Open
wants to merge 417 commits into
base: master

jsgilmore

This pull request makes multilang components pluggable (issue #373). It also relates to issue #654, since it is now possible to implement protocol buffer serialisation.

Changes in this pull request:
Moved the multilang classes to the multilang directory.
Added BoltMsg, ShellMsg and SpoutMsg objects. These objects are used when communicating with a serialiser, instead of raw JSON.
All JSON serialisation has been moved to a new JsonSerializer.
A serialiser is selected when creating a new ShellBolt or ShellSpout by passing it to the constructor. If no serialiser is passed, the built-in JsonSerializer is selected by default.
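
The constructor-based selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: `SerializerSketch`, `BoltMsgSketch`, `JsonSerializerSketch` and `ShellBoltSketch` are hypothetical stand-ins, and the hand-rolled JSON output assumes values that need no escaping.

```java
import java.util.List;

// Hypothetical, simplified stand-in for a message object such as BoltMsg.
final class BoltMsgSketch {
    final String id;
    final List<String> tuple;

    BoltMsgSketch(String id, List<String> tuple) {
        this.id = id;
        this.tuple = tuple;
    }
}

// Hypothetical serialiser contract: components hand over message objects,
// never raw JSON, so the wire format can be swapped.
interface SerializerSketch {
    String write(BoltMsgSketch msg);
}

// Stand-in for the built-in JSON serialiser.
// Assumption: ids and tuple values contain no characters that need escaping.
final class JsonSerializerSketch implements SerializerSketch {
    @Override
    public String write(BoltMsgSketch msg) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"id\":\"").append(msg.id).append("\",\"tuple\":[");
        for (int i = 0; i < msg.tuple.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(msg.tuple.get(i)).append('"');
        }
        return sb.append("]}").toString();
    }
}

// Stand-in for a shell component: the serialiser is chosen at construction time.
final class ShellBoltSketch {
    private final SerializerSketch serializer;

    // No serialiser given: fall back to the JSON implementation.
    ShellBoltSketch() {
        this(new JsonSerializerSketch());
    }

    // Explicit serialiser, e.g. a protocol buffer based one.
    ShellBoltSketch(SerializerSketch serializer) {
        this.serializer = serializer;
    }

    String encode(BoltMsgSketch msg) {
        return serializer.write(msg);
    }
}
```

With this shape, existing callers that use the no-argument constructor keep the JSON behaviour, while a different wire format only requires a new implementation of the serialiser interface.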

Nathan Marz and others added 30 commits November 13, 2012 16:52
The new implementation separates the various concerns formerly mixed up in the
example code.  Also, we are now using tick tuples (introduced in Storm 0.8)
instead of manually spawning threads for carrying out periodic tasks.

Lastly, we add 192 unit tests for the new implementation, bringing the test
coverage for the Rolling Count example from 0% to almost 100%.

Note: Adding those unit tests required changes to the build (m2-pom.xml),
notably new test dependencies and moving the existing Java code from src/jvm/*
to src/jvm/main/*.  The latter was required so that the test runner triggered
by Maven can tell code (src/jvm/main) and tests (src/jvm/test) apart.
Complete refactoring of the Rolling Count example
SlidingWindowCounterTest: add missing @Test annotation to unit test
Yes, we iterate through the list twice, but we no longer have
a function that does two things.
This commit is best explained by describing the behavior of
RollingTopWords in its absence.

Let's say the word "Bieber" appears in the tweet stream at a very high
rate for half an hour, and then is *never heard again*.

This will insert the word "Bieber" into our rankings object downstream.

Note that there is nothing to *remove* the word "Bieber" from the
rankings object downstream. Let us assume that, during its half-hour,
the word "Bieber" appears *more than five times as often* as other
highly-ranked words. Then the very *last* "Bieber" report (in which
"Bieber" has only appeared for the first minute of the five-minute
window) will still be able to hold a place in the rankings. Objects are
only lowered in the rankings by being outranked, or by *appearing again*
with a lower score.

*With* this change, we wipe zeros, and *then* wipe a slot. This means
that when we wipe the slot containing the last Bieber mentions, we
don't at that moment remove "Bieber" from our map. The *next* report
will include "Bieber" with a score of zero, removing "Bieber" from the
ranking object downstream.
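
The ordering described in this commit message (wipe zeros first, then wipe a slot) can be sketched as below. This is a simplified illustration under stated assumptions, not the storm-starter code: `SlidingCounterSketch` and its method names are hypothetical, keys are plain strings, and the window is a fixed ring of slots.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sliding-window counter: each word owns a ring of per-slot counts.
final class SlidingCounterSketch {
    private final Map<String, long[]> slotsByWord = new HashMap<>();
    private final int numSlots;
    private int headSlot = 0;

    SlidingCounterSketch(int numSlots) {
        this.numSlots = numSlots;
    }

    // Counts one occurrence of the word in the current (head) slot.
    void increment(String word) {
        slotsByWord.computeIfAbsent(word, w -> new long[numSlots])[headSlot]++;
    }

    // Reports the totals over the window, then advances the window by one slot.
    Map<String, Long> getCountsThenAdvanceWindow() {
        Map<String, Long> counts = new HashMap<>();
        for (Map.Entry<String, long[]> e : slotsByWord.entrySet()) {
            long total = 0;
            for (long c : e.getValue()) {
                total += c;
            }
            counts.put(e.getKey(), total);
        }
        // Step 1: wipe zeros. Only words whose total was already zero in this
        // report are dropped, so a zero score is always reported once.
        slotsByWord.keySet().removeIf(w -> counts.get(w) == 0L);
        // Step 2: wipe the oldest slot, which becomes the new head. A word whose
        // last mentions lived in that slot stays in the map with a total of zero.
        int tailSlot = (headSlot + 1) % numSlots;
        for (long[] slots : slotsByWord.values()) {
            slots[tailSlot] = 0;
        }
        headSlot = tailSlot;
        return counts;
    }
}
```

In this sketch, a word that stops appearing is reported with its old count until its slot is wiped, then reported once with a score of zero, and only then disappears from the reports. That single zero report is what lets a downstream rankings object remove the word.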
added kafkaspout lag to transactionaltridentkafkaspout
ptgoetz and others added 30 commits April 9, 2014 15:49
…or-storm

STORM-196: When JVM_OPTS are set, storm jar fails to detect storm.jar from environment
…args

Conflicts:
	pom.xml
	storm-core/src/clj/backtype/storm/ui/core.clj
Utils#readCommandLineOpts (STORM-173)
…bator-storm

STORM-194: Support list of strings in *.worker.childopts, handle spaces
…com/dschiavu/incubator-storm

STORM-173: Treat command line "-c" option number config values as such
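
As a rough illustration of what treating numeric "-c" values as numbers can mean, the sketch below coerces `key=value` strings into typed config values. It is an assumption-laden example, not the implementation of `Utils#readCommandLineOpts`: the class `CommandLineOptsSketch`, its methods and the coercion rules are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns "key=value" option strings into typed config values.
final class CommandLineOptsSketch {

    // Parses each "key=value" pair; entries without a key or '=' are skipped.
    static Map<String, Object> parse(String... opts) {
        Map<String, Object> conf = new LinkedHashMap<>();
        for (String opt : opts) {
            int eq = opt.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            conf.put(opt.substring(0, eq), coerce(opt.substring(eq + 1)));
        }
        return conf;
    }

    // Assumed rules: plain integers become Long, plain decimals become Double,
    // everything else stays a String.
    static Object coerce(String raw) {
        if (raw.matches("-?\\d{1,18}")) {
            return Long.valueOf(raw);
        }
        if (raw.matches("-?\\d+\\.\\d+")) {
            return Double.valueOf(raw);
        }
        return raw;
    }
}
```

For example, `parse("topology.workers=4")` in this sketch stores the value as a `Long` rather than the string `"4"`, so code that later reads the setting as a number does not fail on a type mismatch.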