Skip to content
This repository has been archived by the owner on May 12, 2021. It is now read-only.

METRON-1704 Message Timestamp Logic Should be Shared [Feature Branch] #1146

Conversation

nickwallen
Copy link
Contributor

@nickwallen nickwallen commented Aug 7, 2018

The Profiler can operate using either processing time or event time. This is controlled by the user by defining the "timestampField" option in their Profiler configuration.

There is logic that determines the timestamp of a message. If the Profiler is configured to use processing time, then system time is returned by this logic. If the Profiler is configured to use event time, then the timestamp is extracted from a message field.

This logic is currently duplicated across both ports of the Profiler; the REPL and Storm. This should be pulled into metron-profiler-common/ so that the logic can be shared and also used by the Spark port.

This is a pull request against the METRON-1699-create-batch-profiler feature branch.

Testing

  1. Launch the development environment.
  2. Ensure alerts are created in the Alerts UI.
  3. Ensure the Service Check in Ambari passes.
  4. Follow the instructions in the Profiler README to create a basic profile in the REPL.
  5. Follow the instructions in the Profiler README to create a simple profile using the Profiler topology in Storm.

Pull Request Checklist

  • Have you included steps to reproduce the behavior or problem that is being changed or addressed?
  • Have you included steps or a guide to how the change may be verified and tested manually?
  • Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
  • Have you written or updated unit tests and or integration tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

* @param route The message route.
* @param context The Stellar execution context.
*/
@Override
public void distribute(JSONObject message, long timestamp, MessageRoute route, Context context) {
public void distribute(MessageRoute route, Context context) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MessageRoute abstraction now contains both the message (a JSONObject) and the timestamp (a Long). Previously this was passed along separately. It is much simpler for all of the ports (Spark included) to wrap this all into the MessageRoute.

/**
* Responsible for creating the {@link Clock}.
*/
private ClockFactory clockFactory;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ClockFactory helps with the timestamp logic. This used to live in the ProfileSplitterBolt as that is where the timestamp extract logic lived. Now the MessageRouter does this for all Profiler ports.

Optional<Long> timestamp = clock.currentTimeMillis(message);

// can only route the message, if we have a timestamp
if(timestamp.isPresent()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I removed the timestamp logic from the REPL. The MessageRouter does this for us now.

@nickwallen nickwallen changed the title METRON-1704 Message Timestamp Logic Should be Shared METRON-1704 Message Timestamp Logic Should be Shared [Feature Branch] Aug 7, 2018

// what time is it?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I removed the timestamp logic from Storm. The MessageRouter does this for us now.

// what is the name of the entity in this message?
String entity = executor.execute(profile.getForeach(), state, String.class);
route = Optional.of(new MessageRoute(profile, entity));
// what time is is? could be either system or event time
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timestamp logic now lives here. This allows it to be shared by all of the Profiler ports; Storm, REPL, and Spark.

@mmiklavc
Copy link
Contributor

lgtm @nickwallen. +1 via inspection.

@nickwallen
Copy link
Contributor Author

Thanks for the review. Merged.

@nickwallen nickwallen closed this Aug 16, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants