Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race conditions in channel #303

Merged
merged 4 commits into from
Jan 17, 2017
Merged

Conversation

guperrot
Copy link
Member

No description provided.

@msftclas
Copy link

Hi @guperrot, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!


It looks like you're a Microsoft contributor (Guillaume Perrot). If you're full-time, we DON'T require a Contribution License Agreement. If you are a vendor, please DO sign the electronic Contribution License Agreement. It will take 2 minutes and there's no faxing! https://cla.microsoft.com.

TTYL, MSBOT;

@codecov-io
Copy link

codecov-io commented Jan 14, 2017

Current coverage is 100% (diff: 100%)

Merging #303 into develop will not change coverage

@@           develop   #303   diff @@
=====================================
  Files           60     60          
  Lines         2711   2725    +14   
  Methods          0      0          
  Messages         0      0          
  Branches       494    499     +5   
=====================================
+ Hits          2711   2725    +14   
  Misses           0      0          
  Partials         0      0          

Powered by Codecov. Last update f504ef8...c0c5a04

MobileCenterLog.debug(LOG_TAG, "enqueue(" + groupState.mName + ") pendingLogCount=" + groupState.mPendingLogCount);

/* Increment counters and schedule ingestion if we are not disabled. */
if (!mEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove ! and swap if and else body?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@@ -339,93 +369,103 @@ private synchronized void triggerIngestion(final @NonNull String groupName) {

/* Get a batch from Persistence. */
final List<Log> batch = new ArrayList<>(groupState.mMaxLogsPerBatch);
final int currentState = mStateChangeCounter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you rename the local variable? Getting confused about State, better to have Counter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename both to make it more explicit.

}
});
}

private synchronized void deleteLogsOnSuspended(GroupState groupState, int currentState, List<Log> logs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is a recursion. Once the Channel starts deleting logs on suspend, it should keep deleting even though the state gets changed in the middle. This is more like a PM question but we need to decide whether Channel keeps calling failure callbacks upon state change or not.

Copy link
Member Author

@guperrot guperrot Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it should not proceed, the disabling was cancelled, IO should stop at first chance we get.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to make sure that the logs should be gone at the time it is disabled.
It will send logs after re-enable that were created before it was disabled. I expect all the logs will be discarded after it is disabled. When it is enabled again, it should be in a clean states without any pending logs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Settled offline with team, it's better to accidentally send logs that were disabled than accidentally call back old logs with a failure callback after it's already re-enabled. Either way it's a weird corner case but we had to choose between the two.

if (logs.size() >= CLEAR_BATCH_SIZE && groupState.mListener != null) {
deleteLogsOnSuspended(groupState);
} else {
mPersistence.deleteLogs(groupState.mName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If state gets changed, it will stop deleting logs for previous calls. Seems not right. Just checkStateDidNotChange out of the method is enough I think.

Copy link
Member Author

@guperrot guperrot Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When state change, we should stop what we are doing at the first chance we get.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is right but I think it is not in this case.

* State checker. If this counter changes during a call to persistence, we have to ignore the result in the callback.
* Cancelling a database call would be unreliable, and if it's too fast you could still have the callback being called.
*/
private int mStateChangeCounter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable can only be changed in synchronized method as of now so it is not a problem at all. However I recommend to use AtomicReference just in case to prevent any mistakes from future implementations.
Just my opinion but what do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They say in Javadoc that its not a general replacement for int, and it has overhead and we still need synchronized for general thread safety (even before this patch the code was meant to be thread safe).

Bitrise does not seem to support the previous version anymore.
@PrepareForTest({DefaultChannel.class, IdHelper.class, DeviceInfoHelper.class, DatabasePersistenceAsync.class, MobileCenterLog.class})
public class AbstractDefaultChannelTest {

static final String TEST_GROUP = "group_test";
Copy link
Contributor

@jaeklim jaeklim Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protected for all defaults?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code analysis would complain since they are in same package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't complain (I tried). We can change that later if we have any sub-classes in other packages.

@guperrot guperrot merged commit 2b65219 into develop Jan 17, 2017
@guperrot guperrot deleted the fix_channel_race_conditions branch January 17, 2017 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants