Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Converting GeoJsonSource features asynchronously #12580

Merged
merged 2 commits into from
Aug 21, 2018

Conversation

LukasPaczos
Copy link
Member

@LukasPaczos LukasPaczos commented Aug 8, 2018

Refs #8484.

When updating the GeoJsonSource, one of the most resources consuming work is the conversion between Android features and mbgl ones.

To make the UX better when updating big data sets, this PR introduces GeoJsonSource#setGeoJsonAsync which performs the conversion on a worker thread and the returns back to the core flow and notifies the listener when that flow is finished with OnGeoJsonSourceLoadedListener#onGeoJsonSourceLoaded at the point where the synchronous method would return.

Examples of improvements:

UI thread time Overall time Time saved
LineString geometry ~80k points 1662ms 2866ms ~42%
FeatureCollection ~50k features 204ms 1169ms ~83%

The trend continues that for simple geometries time spent on the UI thread is about 50% shorter and for more complicated collections about 80% shorter.

This is still a work-in-progress:

  • resolve issues with setting raw json string that throws a JNI exception
  • add tests for the new methods
  • polish docs

@LukasPaczos LukasPaczos added the Android Mapbox Maps SDK for Android label Aug 8, 2018
@LukasPaczos LukasPaczos requested review from ivovandongen and asheemmamoowala and removed request for ivovandongen August 8, 2018 18:08
* by {@link OnGeoJsonSourceLoadedListener#onGeoJsonSourceLoaded()} when the job is in the same state that
* a synchronous update would return at.
* <p>
* There can only be one asynchronous update running. Any overlapping ones will be rejected.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On gl-js we moved to a "coalescing" model for high frequency asynchronous setData calls (mapbox/mapbox-gl-js#5902). It might be a nice strategy to use here to make this interface a little more ergonomic -- instead of the caller having to worry about whether there's one in progress, they can rely on these guarantees:

  • Any currently running request will run to completion
  • The last request submitted will run to completion
  • Requests in the middle may get discarded without any action required on the part of the caller

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a great suggestion @ChrisLoer ! @LukasPaczos I think the coalescing model would be useful for #12509 as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea, let me try to find a way of doing that.

android::UniqueEnv _env = android::AttachEnv();

// Update the core source
source.as<mbgl::style::GeoJSONSource>()->GeoJSONSource::setGeoJSON(geoJSON);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GeoJSONSource::setGeoJSON is itself asynchronous, which makes me wonder whether we really need to define this as an asynchronous API at the java level, or if we can just update the behavior of the existing java setGeoJSON, which is already implicitly asynchronous. What logic would we expect someone to implement on top of the onSourceLoaded callback?


collectionRef.release();
callback.release();
callback = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More idiomatic way to do these two lines is just: callback.reset()

// Update the core source
source.as<mbgl::style::GeoJSONSource>()->GeoJSONSource::setGeoJSON(geoJSON);

collectionRef.release();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't actually clean up any resource: release() tells the unique_ptr that you're manually taking ownership of the resource. You want reset() instead.

https://en.cppreference.com/w/cpp/memory/unique_ptr/release

OnGeoJsonSourceLoadedListener::notifySourceLoaded(*_env, listener);
});

featureRef = jFeature.NewGlobalRef(env);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see, the reason the featureRef has to live until the callback is really just that it has to live until convertFeature is done, but we can't free the UniqueObject from a different thread than the one it was created, right?

Instead of making featureRef etc. members of GeoJSONSource and manually managing the state, one thing you might try is to make a shared_ptr to the UniqueObject, and then capture it in the callback lambda. The lambda will hold onto a copy of the shared_ptr, and then when the callback finishes it will free the resource without you having to do any further management.

So something like:

std::shared_ptr<jni::UniqueObject<type>> featureRef = jFeature.NewGlobalRef(env);
callback = ...
  [this, featureRef] {
    ...  do unrelated stuff
    featureRef.reset(); // This would actually happen implicitly at the end of the block, but since
                        // you're not otherwise using featureRef it's nice to be explicit
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we can't free the UniqueObject from a different thread than the one it was created, right?

That's a great question I don't know the answer to, but anyway, capturing the ref in the lambda looks clean, thanks!

@@ -96,6 +197,39 @@ namespace android {
source.as<mbgl::style::GeoJSONSource>()->GeoJSONSource::setGeoJSON(GeoJSON(geometry));
}

void GeoJSONSource::setGeometryAsync(jni::JNIEnv& env, jni::Object<geojson::Geometry> jGeometry,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Templating these functions should give you a great/straightforward way to avoid all this duplicated code for geometry/feature/featureCollection/etc.

How it actually looks will depend on a lot of the other changes I'm suggesting, but say you still have a public setGeometryAsync method. You could implement that as something like:

void GeoJSONSource::setGeometryAsync(jni::JNIEnv& env, jni::Object<geojson::Geometry> jGeometry) {
   setAsync(env, geometry);
}

template <class JNIType>
void GeoJSONSource::setAsync(jni::JNIEnv& env, jni::Object<JNIType> jObject) {
  std::shared_ptr<jni::UniqueObject<JNIType>> jObjectRef = ...;
   ... all the previously duplicated code
   converter->self().invoke(&FeatureConverter::convert<JNIType>, **jObjectRef, callback->self());
}

@LukasPaczos
Copy link
Member Author

Re. #12580 (comment)

GeoJSONSource::setGeoJSON is itself asynchronous, which makes me wonder whether we really need to define this as an asynchronous API at the java level, or if we can just update the behavior of the existing java setGeoJSON, which is already implicitly asynchronous. What logic would we expect someone to implement on top of the onSourceLoaded callback?

My impression was that the rendering itself is really quick and most of the time spent during the update is the features conversion and tiles generation (?) which is synchronous. With this in mind, when updating the source, we can be fairly certain that when the synchronous method returns, the changes will be visible in a couple of milliseconds time tops. That's why I opted for a separate method with a callback which can be used to start other UI updates.

However, if the general idea is that we should let the map do its asynchronous job by itself and users should not worry about the visibility of the data after the update (it will be there for sure at some point), which is as valid point, I'm happy to just update the old methods. Let me know what you think.

@ChrisLoer
Copy link
Contributor

In that case I'd lean pretty heavily towards not adding an extra asynchronous interface. From an API user's point of view, it doesn't really matter that much whether an asynchronous method takes half a millisecond or 50 milliseconds -- the point is that you can't write code that makes a change and then depends on that change immediately being there. The classic GL case where users have to think about the rendering being asynchronous is the gap between making a change and when the change shows up in queryRenderedFeatures. Even when that gap is very small, code that relies on the change being synchronous will break. Also, the gap isn't always that small -- it can take hundreds of milliseconds before symbol changes show up in query results.

If we were introducing the "you can't have multiple callbacks in flight at once" constraint, you might need to make an async version just to avoid breaking current users of the synchronous version. But I think coalescing neatly solves that problem -- although it is technically a change to the behavior of the current API, I think we can feel pretty confident it matches with the user's original intent. An alternative that stays even closer to the original behavior would be to maintain a queue of requests (this is essentially what GeometryTile/GeometryTileWorker does with its message queue)... but I think coalescing is better in this case.

Since there's not a really strong distinction between the behaviors of the "async" and the "sync" APIs, we definitely shouldn't make the user try to figure out all the subtle differences that might make them choose one over the other. Better to just not make them think and say "hooray, we made this work better for large data sets!"

@LukasPaczos
Copy link
Member Author

Thanks for the opinion @ChrisLoer.

The classic GL case where users have to think about the rendering being asynchronous is the gap between making a change and when the change shows up in queryRenderedFeatures. Even when that gap is very small, code that relies on the change being synchronous will break.

This is key and it makes me think that we shouldn't introduce another interface as well.

@LukasPaczos LukasPaczos force-pushed the lp-async-geojson-wip branch 2 times, most recently from 7b942f2 to 084c5f4 Compare August 13, 2018 12:36
@LukasPaczos
Copy link
Member Author

This one is ready for another round.

Copy link
Contributor

@ChrisLoer ChrisLoer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great set of improvements, I think it's coming together! You can still whittle down some of the template boiler plate code.

I haven't looked at the tests at all yet.

@@ -188,7 +188,8 @@ public GeoJsonSource(String id, Geometry geometry, GeoJsonOptions options) {
}

/**
* Updates the GeoJson with a single feature
* Updates the GeoJson with a single feature. The update is mainly performed asynchronously,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the qualifier "mainly" seems unnecessary and maybe slightly confusing.

// which in practice means, that any update that started processing is going to finish
// and the last scheduled update is going to finish as well. Any updates scheduled during processing can be canceled.
// Conversion from Java features to core ones is done on a worker thread and once finished,
// the control of the updated is passed to the core on the main thread.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a little trouble reading this on the first pass

"the control of the updated is passed to the core on the main thread"

Maybe that could be something like

" ownership of the converted features is returned to the calling thread"

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely sounds better, thanks!

template <class JNIType>
void GeoJSONSource::setCollectionAsync(jni::JNIEnv& env, jni::Object<JNIType> jObject) {

std::shared_ptr<jni::jobject> object = std::shared_ptr<jni::jobject>(jObject.NewGlobalRef(env).release()->Get(), GenericGlobalRefDeleter());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Did you also consider just using std::make_shared<jni::UniqueObject>? That way you wouldn't have to worry about the release/get/custom-deleter logic because jni::UniqueObject would take care of it all. I know it's a little weird to have a shared_ptr to a unique_ptr, but I can't think of a real problem with doing it that way...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capturing from a convo that we can't use std::make_shared here as we need to pass a custom destructor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think specifically what's happening is that shared_ptr has a move contructor that takes a unique_ptr as an argument (see #13 at https://en.cppreference.com/w/cpp/memory/shared_ptr/shared_ptr). Anyway, your current approach seems fine.


// no updates are being process, execute this one
update = std::move(awaitingUpdate);
awaitingUpdate.reset();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reset should be unnecessary because the move assignment will have already emptied awaitingUpdate.

return;
}

// no updates are being process, execute this one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "being processed" or "in process"

Update::~Update() = default;

CollectionUpdate::CollectionUpdate(std::shared_ptr<jni::jobject> object, std::unique_ptr<Actor<Callback>> callback) {
this->object = object;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we usually do foo = foo_ instead of this->foo = foo.

};

struct CollectionUpdate : Update {
std::shared_ptr<jni::jobject> object;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have the separate CollectionUpdate and StringUpdate classes here just to handle the difference in holding on to a different type of object? I think you can probably just get rid of this by capturing the object by value in the lambda (e.g. concentrate all the templating logic on just making different kinds of update function that all have the same signature and capture everything they need to run on their own, and then the Update class becomes nothing more than a simple holder for the updateFn and the callback). I haven't thought this through entirely, but I think you can still get to significantly less code. The templating changes so far definitely look like moving in the right direction, though!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. I was able to clean up the templates mess and I'm trying to differentiate the Updates only by the lambda that schedules the conversion.

@LukasPaczos LukasPaczos force-pushed the lp-async-geojson-wip branch 2 times, most recently from bb5c172 to 7f55204 Compare August 14, 2018 11:57
callback.invoke(&Callback::operator(), GeoJSON(geometry));
}

Update::Update(Converter _converterFn, std::unique_ptr<Actor<Callback>> _callback) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a class, you can do the initialization as

... )
, converterFn(_converterFn)
, callback(std::move(_callback)
{}

This is mostly just a matter of style although it can save a tiny bit of work (with the assignment in the body of the constructor, you have to default-initialize the values first and then assign them).

}

source.setGeoJson(Point.fromLngLat(20, 55));
uiController.loopMainThreadForAtLeast(1000);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 second timeouts in tests always worry me because eventually the test runs in some (temporarily) really slow environment and fails, and then causes a lot of confusion. Is there a way to explicitly wait for a "map done rendering" event before querying? If not, it's not too big a deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to hook into map callbacks before but that wasn't reliable either, or maybe I just wasn't looking at the right things at the time. This is, unfortunately, an approach we've taken with most of the tests that need to wait for the renderer and it's definitely worth exploring for alternatives.

Copy link
Contributor

@ChrisLoer ChrisLoer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ready to go. 🙂 🚀

Caveat: I normally like to run tests in a debugger and also exercise code in a test app. I haven't done that here basically because setting up an Android test environment is pretty slow for me and I haven't had time. It would be good to get someone else from the Android team to exercise this.

@LukasPaczos LukasPaczos added this to the android-v6.5.0 milestone Aug 17, 2018
Copy link
Member

@tobrun tobrun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Android Mapbox Maps SDK for Android
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants