Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OgreTerrainGroup thread safety regarding Terrain creation #587

Merged
merged 3 commits into from
Dec 4, 2017

Conversation

SNiLD
Copy link
Contributor

@SNiLD SNiLD commented Nov 27, 2017

TerrainGroup will use Terrain instance inside thread but allows destroying it while it's being processed in the thread.

@SNiLD
Copy link
Contributor Author

SNiLD commented Nov 27, 2017

This change also kind of makes LoadRequest::loadingTaskNum redundant but I don't want to remove it so that backwards compatibility does not break.

@paroj
Copy link
Member

paroj commented Nov 27, 2017

I am not very familiar with the Terrain component API. Is TerrainGroup supposed to be inherited from? If not you can remove LoadRequest::loadingTaskNum as it is protected.

if (it != mTerrainLoadRequests.end())
{
// We would like to abort the request here but WorkQueue does not provide proper tools for this
// (the data we need will be destroyed if we call abortRequest()).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which data are you referring to here? As far as I can see abortRequest() will only delete ResourceRequest::loadParams

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably referring to the TerrainSlot*. Would calling abortRequest() and immediately deleting it here work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're correct that the problem is TerrainSlot*. It can't be deleted however since Terrain* would also be deleted and that is used in the thread. If WorkQueue provided a method to abort request if it was not being processed right now it would be fine. That way we would have tools to abort all other requests except the one that's currently being processed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WorkQueue has function abortPendingRequestsByChannel(uint16 channel) so if there was function like abortPendingRequest(RequestID id) as well it could be done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there was function like abortPendingRequest(RequestID id) as well it could be done.

this will probably be useful for others as well. So I would prefer this solution over the currently proposed workaround.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make this change but currently I'm a bit busy so it might take a few days.

@SNiLD
Copy link
Contributor Author

SNiLD commented Nov 28, 2017

I think @sinbad meant TerrainGroup to be inherited from. Many of the important functions like define*() are virtual.

@paroj
Copy link
Member

paroj commented Nov 28, 2017

I think @sinbad meant TerrainGroup to be inherited from. Many of the important functions like define*() are virtual.

I would not assume that this is intentional. At some point he just made everything virtual (Java style).
A quick look at the samples and Ogitor showed that neither are inheriting from TerrainGroup. I assume you do neither - so feel free to consider everything protected to be actually private.

 * Added separate function to abort pending requests.
 * Renamed TerrainLoadRequests to TerrainPrepareRequests.
 * Removed obsolete LoadRequest::loadingTaskNum.
@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 4, 2017

This should work now.

Disappointingly I found out that calling Terrain::prepare() from background thread is not safe anyways since another terrain might try to access these resources (height data and quadtree) when they're neighbours. Fix to make Terrain::prepare() thread safe should be a separate issue though.

@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 4, 2017

Actually there could be at least semi-remedy to this if TerrainGroup had disconnectNeighbour() as well like connectNeighbour(). I'll try this as well.

freeTerrainSlotInstance(lreq.slot);
return;
}
freeTerrainSlotInstance(lreq.slot);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be return; after the free here. I'll try if this fixes the neighbour issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that did not fix the issue, unfortunately.

@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 4, 2017

TerrainGroup::disconnectNeighbour() does not make sense since Terrain will remove itself from neighbours when it's destroyed. The only case where it would be needed is when terrain instance prepare is aborted and is currently being processed in TerrainGroup. However once that function returns it will check for mTerrainPrepareRequests and free the slot if it was not found so it is never even connected.

Making Terrain thread safe regarding the neighbour access should be separate issue as stated before.

@paroj
Copy link
Member

paroj commented Dec 4, 2017

👍 from my side. Is this ready to merge?

@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 4, 2017

Yes this is ready.

@paroj paroj merged commit f6b63e4 into OGRECave:master Dec 4, 2017
@SNiLD SNiLD deleted the TerrainGroup-thread-safety branch December 4, 2017 16:34
@paroj
Copy link
Member

paroj commented Dec 5, 2017

after this merge the "Terrain" sample crashes. Can you reproduce?

t == NULL here:

mTerrainGroup->loadAllTerrains(true);
#endif
if (mTerrainsImported)
{
TerrainGroup::TerrainIterator ti = mTerrainGroup->getTerrainIterator();
while(ti.hasMoreElements())
{
Terrain* t = ti.getNext()->instance;

@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 5, 2017

Yes, it crashes for me too. Instances taken from mTerrainGroup->getTerrainIterator() seem to be null. Maybe I've missed something. Let me investigate.

Root::getSingleton().getWorkQueue()->addRequest(
mWorkQueueChannel, WORKQUEUE_LOAD_REQUEST,
Any(req), 0, synchronous);
mTerrainPrepareRequests.insert(TerrainPrepareRequestMap::value_type(slot, id));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the problem line. When synchronous is true it won't go into background thread and there won't be anything in mTerrainPrepareRequests so it will be deleted.

Copy link
Member

@paroj paroj Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local fix:

TerrainPrepareRequestMap::iterator iter =
                mTerrainPrepareRequests.insert(TerrainPrepareRequestMap::value_type(slot, 0)).first;
iter->second = Root::getSingleton().getWorkQueue()->addRequest(
                mWorkQueueChannel, WORKQUEUE_LOAD_REQUEST, Any(req), 0, synchronous);

Copy link
Contributor Author

@SNiLD SNiLD Dec 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't insert to that iterator at that point anymore since it will be invalid at that point. This is how I'd fix it:

            std::pair<TerrainPrepareRequestMap::iterator, bool> ret = mTerrainPrepareRequests.insert(TerrainPrepareRequestMap::value_type(slot, 0));
            assert(ret.second == true);
            WorkQueue::RequestID id =
                Root::getSingleton().getWorkQueue()->addRequest(
                    mWorkQueueChannel, WORKQUEUE_LOAD_REQUEST,
                    Any(req), 0, synchronous);
            if (!synchronous)
                ret.first->second = id;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this fix stops Terrain sample from crashing at start.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah.. yes.. I figured. seems to be properly fixed this way

@paroj
Copy link
Member

paroj commented Dec 5, 2017

inserting in mTerrainPrepareRequests after doing a synchronous request is too late. But even when fixing this, the sample crashes on shutdown. Please investigate.

@SNiLD SNiLD restored the TerrainGroup-thread-safety branch December 5, 2017 11:10
@SNiLD
Copy link
Contributor Author

SNiLD commented Dec 5, 2017

What is the proper way to merge that change as well? I pushed it to the same branch that you've already merged.

@paroj
Copy link
Member

paroj commented Dec 5, 2017

you have to create a new pull-request using the same branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants