Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nativeSetIntTicketToken JNI implementation in android_build #1218

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

TedaLIEz
Copy link
Contributor

@TedaLIEz TedaLIEz commented Oct 11, 2023

nativeSetIntTicketToken in LogManagerProvider.java is incomplete in the last PR: #1131

This PR aims:

  1. Implementation of nativeSetIntTicketToken in JNI, which turns out to be the setTicketToken in ILogManager.java
  2. Thread-safe in AuthTokensController.cpp to avoid race condition, we found some crashes because of race condition when using this sdk in Android system: Scudo ERROR: race on chunk header

Why adding locks in AuthTokensController.cpp?

We get the token after the app log in one account (MSA, AAD, etc), and our application is a multi-account application. So currently our solution is we cache the token once we get from the server, then every time client calls the logEvent method, it will set the ticket token internally. We don't have the dependencies on logger instanceswhen we in the process of getting token.

We can't guarantee all the logEvent method will be called only in main thread only, so this introduces the multi-threading issues.

@TedaLIEz
Copy link
Contributor Author

@lalitb Would you like to review this PR? We want to leverage this API to support our telemetry, thanks!

lalitb
lalitb previously approved these changes Oct 12, 2023
Copy link
Contributor

@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. But good to have approval from @anod too.

@@ -24,6 +24,7 @@ namespace MAT_NS_BEGIN {

status_t AuthTokensController::SetTicketToken(TicketType type, char const* tokenValue)
{
LOCKGUARD(m_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect SetTicketToken to be called after SDK bootstrapping, from multiple thread ? If so, we should also lock in the getter methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, SetTicketToken will be called in multiple threads after we get the logmanager instance. Do you mean we should lock in the GetAuthTokensController method?

getLogManager is already guarded with a lock:

static ILogManager* getLogManager(jlong nativeLogManager)
{
    std::lock_guard<std::mutex> lock(jniManagersMutex);
    if (nativeLogManager < 0 || nativeLogManager >= jniManagers.size())
    {
        return nullptr;
    }

    return jniManagers[nativeLogManager]->manager;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we need to lock GetUserTokent(), GetDevicetokens(), GetTickets() methods too. And these methods are called in event upload hot-path. These locks will add the latency in the event upload.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxgolov Do you see any issue in adding the mutex lock in GetUserTokent(), GetDevicetokens(), and GetTickets() methods - which are called in the hot path of event upload?

Copy link
Contributor

@maxgolov maxgolov Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may need to apply a bit more holistic approach to mutex here:

  • mutex works just for the moment while the reference is obtained. But then the collection is returned by reference -> allows for potential concurrent modification and breakage outside of the function call.
  • status_t AuthTokensController::Clear() then also needs a mutex.

Ideally the Get* methods should be copying the contents into collection passed by the caller. Then you can use a mutex to guarantee that the copy of this collection is done under the mutex. Otherwise the way as it's currently implemented in the PR - is all unreliable, unfortunately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@lalitb lalitb requested a review from anod October 12, 2023 02:40
@lalitb
Copy link
Contributor

lalitb commented Oct 12, 2023

The CI failures are not related to this PR.

@lalitb lalitb dismissed their stale review October 12, 2023 03:25

Need to discuss further on mutex lock.

@TedaLIEz TedaLIEz requested a review from a team as a code owner October 12, 2023 18:34
@@ -64,16 +65,19 @@ namespace MAT_NS_BEGIN {

std::vector<std::string>& AuthTokensController::GetTickets()
{
LOCKGUARD(m_lock);
Copy link
Contributor

@maxgolov maxgolov Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutex here is kinda pointless: since the reference is returned, outside code could perform concurrent modification of the vector contents without a lock......

@@ -749,6 +749,7 @@ namespace MAT_NS_BEGIN

IAuthTokensController* LogManagerImpl::GetAuthTokensController()
{
LOCKGUARD(m_lock);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this one is good. Please check if it's recursive_mutex, and you'd need to trace the flow where the function is called in the SDK. I honestly don't remember where (it's been 4+ years since I looked into this code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants