Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed Events Playback on re-connect #213

Merged
merged 1 commit into from May 11, 2015

Conversation

Projects
None yet
5 participants
@scoheb
Copy link
Contributor

commented Mar 12, 2015

The missed events playback manager is now able to:

  • maintain a last known alive timestamp of events that were received by the
    Gerrit Server connection.
  • Upon re-connect, a request is made to the Gerrrit Events-log plugin installed on the Gerrit
    Server to determine which events may have been missed while the connection was down.
  • The events are then added to the Gerrit Trigger event queue to be processed.

This feature is only enabled if:

[JENKINS-23871]

@scoheb scoheb force-pushed the scoheb:missed-events-playback branch from 33d2799 to aad3a17 Mar 12, 2015

@jenkinsadmin

This comment has been minimized.

Copy link
Member

commented Mar 13, 2015

Thank you for a pull request! Please check this document for how the Jenkins project handles pull requests

@hugares

This comment has been minimized.

Copy link
Member

commented Mar 13, 2015

In the commit, your refer to the Audit plugin, you should change that to Events-log plugin

@rinrinne

This comment has been minimized.

Copy link
Member

commented Mar 13, 2015

In event-log, event visibility check is done per project. But Gerrit can handle access control per branch. So that plugin has the possibility to playback unnecessary events.

In this pluigin, playback process is started just after Gerrit connection is established. It means that events from both are received in the same duration. If "Build Current Patchset Only" feature is enabled, it might not work correctly because event order based on timestamp is broken.

@hugares

This comment has been minimized.

Copy link
Member

commented Mar 13, 2015

@rinrinne You are right, currently, the visibility check is done per project in the events-log plugin but we could easily modify it to do it per branch also.

* @return true if so, false otherwise.
*/
@JavaScriptMethod
public boolean isGerritMissedEventsNotSupported() {

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

The javadoc seems to say something different from what the name of the method and its implementation suggests.

And inverse logic like this is usually hard to grok when you read the code, so try to be careful with the naming to not confuse too much.


/**
* get DateRange from current and last known time.
* @return diff between current time and last known

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

This doesn't seem to produce a diff, only the last or current Date.

*/
public class GerritMissedEventsPlaybackManager implements ConnectionListener, GerritEventListener {

private static final Logger logger = LoggerFactory.getLogger(GerritMissedEventsPlaybackManager.class);

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

I think many the log levels in this class needs to be lowered.

logger.debug("found line: {}", line);
JSONObject jsonObject = JSONObject.fromObject(line);
GerritEvent evt = GerritJsonEventFactory.getEvent(jsonObject);
if (evt instanceof ChangeBasedEvent) {

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

so why only ChangeBasedEvents? why not ref-updated as well?

/**
* Vector that contains received Gerrit Events.
*/
protected Vector<ChangeBasedEvent> receivedEventCache = new Vector<ChangeBasedEvent>();

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

I would prefer you use List l = Collections.synchronizedList(new SomeListImplementation()) as it doesn't expose any specific implementation as well as clearly state synchronization intent instead of giving Java 1.1 flashbacks.

persist(triggeredEvent);
//add to cache
if (!playBackComplete) {
if (!receivedEventCache.contains(event)) {

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

Using a synchronizedSet here instead could maybe lower the complexity.

logger.debug("Event {} ALREADY in received cache for server: {}", event, serverName);
}
} else {
receivedEventCache = new Vector<ChangeBasedEvent>();

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

This will create a new object for every event coming in during normal operations, it seems a bit unnecessary exercising of the heap.

*/
protected Vector<ChangeBasedEvent> getEventsFromDateRange(Date lowerDate) throws IOException {

GerritServer server = PluginImpl.getInstance().getServer(serverName);

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

Findbugs will have an issue with this in the current core version we are depending on as PluginImpl.getInstance() is annotated with CheckForNull.

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

Use PluginImpl._getServer(...) as a simple shortcut to prevent that, but check server for null.

if (statusCode == HttpURLConnection.HTTP_OK) {
try {
InputStream bodyStream = execute.getEntity().getContent();
String body = IOUtils.toString(bodyStream, "UTF-8");

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

I don't think you can be sure that the server returned the content in UTF-8, use the content type from the response instead.

* Shutdown the listener.
*/
public void shutdown() {
GerritServer server = PluginImpl.getInstance().getServer(serverName);

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

CheckForNull

/**
* Map that contains timestamps.
*/
protected Map<String, EventTimeSlice> serverTimestamps = new ConcurrentHashMap<String, EventTimeSlice>();

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

From what I can see there is one GerritMissedEventsPlaybackManager per GerritServer and you are only using serverName as a key for putting and reading values. So it seems like only one value will be ever stored in this map?

*/
public static XmlFile getConfigXml() {
return new XmlFile(Jenkins.XSTREAM,
new File(Jenkins.getInstance().getRootDir(), "gerrit-trigger-server-timestamps.xml"));

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

CheckForNull

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

the file name should be a constant.

/**
* @return XmlFile corresponding to gerrit-trigger-server-timestamps.xml.
*/
public static XmlFile getConfigXml() {

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

Since this is the only method it could be moved into GerritMissedEventsPlaybackManager.
But now I get why you thought of using a map, but you are risking each server instance overwriting this file so other servers will loose their data.

A directory with one file per server would be more safe I think.

case HttpURLConnection.HTTP_NOT_FOUND:
logger.info("Gerrit Plugin " + pluginName + " is NOT installed");
return false;
case HttpURLConnection.HTTP_UNAUTHORIZED:

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

you could simplify these by parameterized messages. Messages.PluginHttpConnectionUnauthorized(pluginName)


DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet httpGet = new HttpGet(restUrl + "plugins/" + pluginName + "/");
httpclient.getCredentialsProvider().setCredentials(new AuthScope(null, -1),

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

You forgot to set the proxy.

I count now three places where a HttpClient is created in a similar manner with most likely cut & paste code.
A utility method in the config would help this a lot I think.

* method to verify if plugin is supported
*/
private void checkIfEventsLogPluginSupported() {
GerritServer server = PluginImpl.getInstance().getServer(serverName);

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

CheckForNull

if (serverTimestamps != null) {
EventTimeSlice myTimeStamp = serverTimestamps.get(serverName);
if (myTimeStamp != null) {
Date myDate = new Date(myTimeStamp.getTimeSlice());

This comment has been minimized.

Copy link
@rsandell

rsandell Mar 20, 2015

Member

getTimeSlice is claiming to use seconds, but Date is using ms right?

@scoheb scoheb force-pushed the scoheb:missed-events-playback branch from aad3a17 to 3b42f45 Apr 29, 2015

@scoheb

This comment has been minimized.

Copy link
Contributor Author

commented Apr 29, 2015

Hi @rsandell,

I think I addressed all your comments.

Please review at your convenience

Thanks,

Scott

public class EventTimeSlice {

private long timeSlice;
private Vector<GerritTriggeredEvent> events = new Vector<GerritTriggeredEvent>();

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

Use a synchronized List implementation instead, if synchronization is what you need, otherwise a normal List.

return;
}
Date timeStampDate = getDateFromTimestamp();
long diff = new Date().getTime() - timeStampDate.getTime();

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

System.currentTimeMillis() so you don't have to create a new object just for that value.

Date timeStampDate = getDateFromTimestamp();
long diff = new Date().getTime() - timeStampDate.getTime();
if (diff > 0) {
logger.debug("Non-zero date range from last-alive timestamp exists for server " + serverName

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member
if (logger.isDebugEnabled()) {
  logger.debug("Non-zero date range from last-alive timestamp exists for server {}:{}", serverName, Util.getPastTimeString(diff));
}
logger.debug("Non-zero date range from last-alive timestamp exists for server " + serverName
+ ": " + Util.getPastTimeString(diff));
} else {
logger.debug("Zero date range from last-alive timestamp for server " + serverName);

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

logger.debug("Zero date range from last-alive timestamp for server {}", serverName);

List<GerritTriggeredEvent> events = getEventsFromDateRange(timeStampDate);
logger.info("({}) missed events to process for server: {} ...", events.size(), serverName);
for (GerritTriggeredEvent evt: events) {
logger.info("({}) Processing missed event {}", serverName, evt);

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

I guess this should be debug level as well?

This comment has been minimized.

Copy link
@scoheb

scoheb May 5, 2015

Author Contributor

I would argue maybe only the one inside the loop...the main one could be seen as valuable upon server startup.

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

Yes, the outer one would only turn up once per server and is a good indication, the inner one though has a risk of polluting the log.

}

if (serverTimestamp != null && ts < serverTimestamp.getTimeSlice()) {
logger.debug("Event has same time slice {} or is earlier...NOT Updating time slice.", ts);

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

so maybe you should return here instead of persisting to disk?

* @return XmlFile corresponding to gerrit-trigger-server-timestamps.xml.
*/
public static XmlFile getConfigXml(String serverName) {
File xmlFile = new File(Jenkins.getInstance().getRootDir() + GERRIT_SERVER_EVENT_DATA_FOLDER

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

CheckForNull on Jenkins.getInstance()

Try using the new File(parent, child) construct instead of string concatenation. Or jenkins.getRootPath().child(x).child(y).

server.triggerEvent(evt);
receivedEventCache.add(evt);
logger.debug("Added event {} to received cache for server: {}", evt, serverName);
persist(evt);

This comment has been minimized.

Copy link
@rsandell

rsandell May 5, 2015

Member

Is there some way you can avoid writing to disk for every single event? If Jenkins has been down for 24h there could be thousands of events, and that would take a really long time to process if each event should also be persisted to disk before the next one can be processed.

This comment has been minimized.

Copy link
@scoheb

scoheb May 5, 2015

Author Contributor

We need some way of knowing what timestamp to use to request missed events from Gerrit when the jenkins instance comes back up. At some point, we need to persist the timestamp.

I can look into something like MapDB that maybe more performant.

Do you have any ideas? I'll gladly check them out.

This comment has been minimized.

Copy link
@rsandell

rsandell May 6, 2015

Member

Well, on line 253 there is also a call to persist(triggeredEvent); So I'm thinking that its not even needed here. IIRC server.triggerEvent(evt); will put the event on the event queue so the persist on line 253 is in a different thread and shouldn't slow us down as much.

This comment has been minimized.

Copy link
@scoheb

scoheb May 6, 2015

Author Contributor

ok, I'll re-run my load tests with line 219 commented out.

Thanks!

@scoheb scoheb force-pushed the scoheb:missed-events-playback branch from 3b42f45 to ca72abb May 8, 2015

@scoheb

This comment has been minimized.

Copy link
Contributor Author

commented May 8, 2015

I removed the persist() call and re-ran the stress tests. They ran fine.

Thanks @rsandell

@rsandell

This comment has been minimized.

Copy link
Member

commented May 11, 2015

Looks good, just fix the compilation error in the test and I think we are good to merge.

Missed Events Playback on re-connect
The missed events playback manager is now able to:

- maintain a last known alive timestamp of events that were received by the
Gerrit Server connection.
- Upon re-connect, a request is made to the Gerrit Events-Log plugin installed on the Gerrit
 Server to determine which events may have been missed while the connection was down.
- The events are then added to the Gerrit Trigger event queue to be processed.

This feature is only enabled if:

- The REST api is configured.
- The Gerrit Events-log plugin is installed on the Gerrit Server
-- Please see https://gerrit.googlesource.com/plugins/events-log/

[JENKINS-23871]

Change-Id: I1cfacad309885b88d23f8925d164a6aa427af764

@scoheb scoheb force-pushed the scoheb:missed-events-playback branch from ca72abb to 13b4cc2 May 11, 2015

@scoheb

This comment has been minimized.

Copy link
Contributor Author

commented May 11, 2015

Sorry about that @rsandell ...it has been fixed.

rsandell added a commit that referenced this pull request May 11, 2015

Merge pull request #213 from scoheb/missed-events-playback
Missed Events Playback on re-connect

@rsandell rsandell merged commit 98f0c06 into jenkinsci:master May 11, 2015

1 check passed

Jenkins This pull request looks good
Details
@rsandell

This comment has been minimized.

Copy link
Member

commented May 11, 2015

W00t!

@scoheb

This comment has been minimized.

Copy link
Contributor Author

commented May 11, 2015

Now the question is when will this be in a release? :)

@rsandell

This comment has been minimized.

Copy link
Member

commented May 12, 2015

My plan is to get #227 merged and then do a beta release, let that stew for a week and hopefully get some beta testing on it. And then release 2.14.

@scoheb

This comment has been minimized.

Copy link
Contributor Author

commented May 12, 2015

I'd also like to remove the "unreviewed patches listener" since this PR is a replacement for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.