New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreaded Job based Terrain Patch generation. #2163

Merged
merged 90 commits into from May 6, 2013

Conversation

Projects
None yet
6 participants
@fluffyfreak
Contributor

fluffyfreak commented Mar 27, 2013

Description:

There's a few things here, job, multithreading, memory usage reduction.

JobSwarm:

I've integrated a simple multithreading job library called JobSwarm and then I've parcelled up the terrain patch generation made it use as many hardware threads as you can throw at it :)

Multithreading:

The library can be used for many other tasks. I've only implemented what I need for the terrain system.

Memory Usage Reduction:

Sat on Earth with the terrain detail to max I've reduced the footprint from ~700MiB to ~280MiB. Most of this was done by storing only the height instead of the full vector3d of positions, then reconstructing it in UpdateVBOs - a 2/3rds saving.
Normals are single precision floats so a 50% saving. The colour is a Color3ub (3 bytes) instead of vector3d (24 bytes) so there's another 21 byte saved PER VERTEX!!!

I think there's more that can be done here, the normal can be packed into 2 16bit values stored in a uint32 because it doesn't need all that precision, that could then be reconstructed either in UpdateVBOs. Or it could be turned into a texture and used in the shader to reconstruct the full normal. See this article on options. The heightmap could probably be done as a 32bit float too if I subtracted the clipCentroid length at generation time instead of leaving it until UpdateVBOs.

Problems:

A lack of testing, I've done what I can, going in and out of the game, exiting and restarting etc. Going from planet to planet, system to system and it all seems robust... however.
Whenever I've worked on multithreaded tech' before it's never during development that you see the problems, only when people are using it so it needs others to take a look.

On my quad core i7 with 8 hardware threads this thing is much faster at generating the terrain. On my dual core laptop, it might actaully be a bit slower :/

Summary:

Summary? Um, it's great? please test! Kthxbye!

Andy

fluffyfreak added some commits Mar 6, 2013

Add a jobswarm project and files/folder.
Set it up within the Pi class environment.
Add the GeoPatchID files.
Integrate GeoPatchID with GeoPatch.
Essentially begin mini-migration over to seeNoEvil_00 branches development.
Work-in-Progress.
Adding the jobswarm job itself whilst figuring out to get it access to the data it needs from the terrain and geosphere instances without pointers etc.
WIP - getting there, think I've got the initial job setup correct, th…
…en the PatchJob handling logic will probably work. Receiving the completed jobs? Not so much.
Enable many multi-core in JobSwarm instead of just 1, silly oversight.
Finally actually get a hacked version working using N-cores (6 I think).
Crash due to patches being deleted when their children are waiting for data to be generated.
Pi.cpp - try and wait until all jobs complete before shutting down.
GeoSphere, remove all of the threading stuff.
So many other little changes.
Split the GeoSphere stuff into GeoPatch and GeoPatchContext files.
Lots of tweaks and changes but still massive problems with Release mode :(
Re-enable the full depth of the quad tree.
Optional wireframe definition.
Removed use to deque for SplitRequestData, just an array now.
Move some stuff around.
Still have the memory consumption bug.
This fixes the memory consumption bug by making sure that we have a v…
…alid position for the camera before allowing it to generate any terrain.

Also fixed a couple of wonky line endings in the GeoPatch.h
A lot of changes, all interlinked.
Made use of ScopedPtr<> and ScopedArray<> for a lot more pointer types to try and track down some apparent memory leaks caused by cancelling jobs.
Added cancelling of jobs. Seems to work but I'm not happy with it yet.
Removed ability to cancel terrain jobs.
Added a class static flag to abort all patch jobs.
Added base class for patch jobs, derived single and quad patch jobs.
Changed ALL terrain height and colour to be const to make calling them cleaner.
Made Pi::pJobs into a ScopedPtr so it get's cleaned up at program exit, no longer delete it when leaving a game.
Generate more heights & vertices than we need in the GenerateMesh met…
…hod to create a 1 vertex border. Sample this to get the correct normals and colours. Save the core, borderless, heights and discard the rest.

Inefficient but effective.
Removed all of the vertex copy, edge normal and colour generation from the main thread. Simplified the code a fair bit.
Added new Color3ub class, save extra 1 byte per-vertex for terrain pa…
…tches.

Moved the PatchJobs into GeoPatchJobs.h/cpp files.
Cleaned up the split/merge code limiting.
Removed more dead methods/functions.
Early out of the GenerateMesh in BasePatchJob when aborting patch creation.
A lot of whitespace noise, sorry.
Stop passing the GeoSphere pointer, return results by the SystemPath and a static method - all done in main thread from job_OnFinish method.
Removed a rogue pragma optimize off.
Hacking in UI to limit number of threads used.
Non-functional with this commit.
Merge branch 'master' of https://github.com/pioneerspacesim/pioneer i…
…nto JustLookingAgain

Conflicts:
	win32/vc2012/pioneer.vcxproj.filters
@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak Mar 31, 2013

Contributor

Going to update this to master later.

Contributor

fluffyfreak commented Mar 31, 2013

Going to update this to master later.

Merge branch 'master' of git://github.com/pioneerspacesim/pioneer int…
…o JustLookingAgain

Conflicts:
	src/GeoSphere.cpp
	src/GeoSphere.h
@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak Apr 28, 2013

Contributor

Okay, just pushed a fix for that issue.
Hyperspacing seems to work fine, terrain generation is still faster than the old single thread (much faster).
I'd like to find a way to avoid some of those mutexes, semaphores and conditions which are causing a lot of waiting... but I might tackle that this afternoon once I'm back from seeing Iron Man 3 ;)

Contributor

fluffyfreak commented Apr 28, 2013

Okay, just pushed a fix for that issue.
Hyperspacing seems to work fine, terrain generation is still faster than the old single thread (much faster).
I'd like to find a way to avoid some of those mutexes, semaphores and conditions which are causing a lot of waiting... but I might tackle that this afternoon once I'm back from seeing Iron Man 3 ;)

@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak Apr 28, 2013

Contributor

Switched this PR over to use the robn:JobQueue work.

It's up to date with master, needs testing on Linux and Mac (preferrably).

Contributor

fluffyfreak commented Apr 28, 2013

Switched this PR over to use the robn:JobQueue work.

It's up to date with master, needs testing on Linux and Mac (preferrably).

@robn

This comment has been minimized.

Show comment
Hide comment
@robn

robn Apr 28, 2013

Member

Ahh, this didn't post - I lost power to part of my house and lost internet, but didn't realise. So this is slightly behind, but here we go...

I think switching now is fine. We're shouldn't rush this sort of thing, but with the release schedule changes there's certainly no reason to try and go too fast.

I'm not necessarily saying that we should use RobQueue (stupid name, screw you!). It was just an approach that didn't seem quite as messy as JobSwarm.

I moved the initialisation down because it theoretically should go under SDL initialisation if its going to use SDL threads (I say theoretically because SDL threads don't technically require it, but it reads nicer). That's one reason. The other was that I preferred a complete thread teardown at game exit because it didn't have a "cancel all threads" mechanism. But we've determined that we need that, so using the same queue and initialising it early should be fine.

Btw, I'm going to try and do the cancellation this week. I say "try" because there's release process retooling and savefile stuff to do this week as well.

Member

robn commented Apr 28, 2013

Ahh, this didn't post - I lost power to part of my house and lost internet, but didn't realise. So this is slightly behind, but here we go...

I think switching now is fine. We're shouldn't rush this sort of thing, but with the release schedule changes there's certainly no reason to try and go too fast.

I'm not necessarily saying that we should use RobQueue (stupid name, screw you!). It was just an approach that didn't seem quite as messy as JobSwarm.

I moved the initialisation down because it theoretically should go under SDL initialisation if its going to use SDL threads (I say theoretically because SDL threads don't technically require it, but it reads nicer). That's one reason. The other was that I preferred a complete thread teardown at game exit because it didn't have a "cancel all threads" mechanism. But we've determined that we need that, so using the same queue and initialising it early should be fine.

Btw, I'm going to try and do the cancellation this week. I say "try" because there's release process retooling and savefile stuff to do this week as well.

fluffyfreak added some commits Apr 28, 2013

Create a finished queue and lock per-thread.
Set a maximum number of threads to be 64 - for now anyway.
Add a thread id to each job runner to use with the above.
@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak Apr 29, 2013

Contributor

RobQueue I should definitely have changed the filenames :D

Anyway I think I've found an optimisation to do with the finish lock.
There was only one finished lock but I had 7 threads + main all trying to access it which is one reason it was waiting so much.
Instead I've added a finished job queue and lock per-thread and the main thread only tries to take a single job from each finish queue per-update. Not sure if taking only one job at a time is a good idea or not but will try it out.

Contributor

fluffyfreak commented Apr 29, 2013

RobQueue I should definitely have changed the filenames :D

Anyway I think I've found an optimisation to do with the finish lock.
There was only one finished lock but I had 7 threads + main all trying to access it which is one reason it was waiting so much.
Instead I've added a finished job queue and lock per-thread and the main thread only tries to take a single job from each finish queue per-update. Not sure if taking only one job at a time is a good idea or not but will try it out.

@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak May 4, 2013

Contributor

Updated to master.

Tested, solid as a very solid thing.
In all the hyperspacing, loading/saving etc it's been utterly bulletproof so far.

Don't know what more needs doing, personally I think it's as ready as it can be.

Contributor

fluffyfreak commented May 4, 2013

Updated to master.

Tested, solid as a very solid thing.
In all the hyperspacing, loading/saving etc it's been utterly bulletproof so far.

Don't know what more needs doing, personally I think it's as ready as it can be.

@shadmar

This comment has been minimized.

Show comment
Hide comment
@shadmar

shadmar May 5, 2013

Contributor

Great work, New Hope takes about 24 seconds to get down full detail with everything on max (about 80-90 seconds before)
Found no problems.

Quad Core i7 something using hyperthreading.

Contributor

shadmar commented May 5, 2013

Great work, New Hope takes about 24 seconds to get down full detail with everything on max (about 80-90 seconds before)
Found no problems.

Quad Core i7 something using hyperthreading.

@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak May 6, 2013

Contributor

Could probably go even faster if we could control the thread priority. Then we could have as many threads as there are cores but make the worker threads lower priority so they don't starve the main thread.

Contributor

fluffyfreak commented May 6, 2013

Could probably go even faster if we could control the thread priority. Then we could have as many threads as there are cores but make the worker threads lower priority so they don't starve the main thread.

@robn

This comment has been minimized.

Show comment
Hide comment
@robn

robn May 6, 2013

Member

Not forgotten. I'm short on time at the moment and what I have I'm spending on the new savefile stuff.

If you're happy with the skanky cancellation as it is, then we can probably merge this.

Member

robn commented May 6, 2013

Not forgotten. I'm short on time at the moment and what I have I'm spending on the new savefile stuff.

If you're happy with the skanky cancellation as it is, then we can probably merge this.

@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak May 6, 2013

Contributor

I didn't think we could control priority with the SDL_threads stuff?
Anyway yeah I know your busy :)

It seems solid so unless someone has something else they want doing too it then yes I;d be delighted to see it merged :)

Contributor

fluffyfreak commented May 6, 2013

I didn't think we could control priority with the SDL_threads stuff?
Anyway yeah I know your busy :)

It seems solid so unless someone has something else they want doing too it then yes I;d be delighted to see it merged :)

@robn

This comment has been minimized.

Show comment
Hide comment
@robn

robn May 6, 2013

Member

SDL 1.2 has no control over thread priority. SDL 2 does. I vote for waiting until we switch to that (high on my list, though not much point until the UI is ported otherwise we'll be rewriting a whole bunch of stuff only to rip it out again. But I digress...)

Ok, lets merge it :)

Member

robn commented May 6, 2013

SDL 1.2 has no control over thread priority. SDL 2 does. I vote for waiting until we switch to that (high on my list, though not much point until the UI is ported otherwise we'll be rewriting a whole bunch of stuff only to rip it out again. But I digress...)

Ok, lets merge it :)

fluffyfreak added a commit that referenced this pull request May 6, 2013

Merge pull request #2163 from fluffyfreak/JustLookingAgain
Multithreaded Job based Terrain Patch generation.

@fluffyfreak fluffyfreak merged commit 4c668fb into pioneerspacesim:master May 6, 2013

@fluffyfreak

This comment has been minimized.

Show comment
Hide comment
@fluffyfreak

fluffyfreak May 6, 2013

Contributor

DONE!!! :)

Contributor

fluffyfreak commented May 6, 2013

DONE!!! :)

@walterar

This comment has been minimized.

Show comment
Hide comment
@walterar

walterar May 6, 2013

Contributor

Auch! :)

CXX SystemPath.o
In file included from ./../Pi.h:16:0,
from StarSystem.cpp:9:
./../JobQueue.h:33:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:33:2: error: la declaración friend no nombra una clase o función
./../JobQueue.h:34:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:34:2: error: la declaración friend no nombra una clase o función
In file included from ./../Pi.h:16:0,
from Galaxy.cpp:7:
./../JobQueue.h:33:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:33:2: error: la declaración friend no nombra una clase o función
./../JobQueue.h:34:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:34:2: error: la declaración friend no nombra una clase o función
make[2]: *** [StarSystem.o] Error 1

Contributor

walterar commented May 6, 2013

Auch! :)

CXX SystemPath.o
In file included from ./../Pi.h:16:0,
from StarSystem.cpp:9:
./../JobQueue.h:33:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:33:2: error: la declaración friend no nombra una clase o función
./../JobQueue.h:34:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:34:2: error: la declaración friend no nombra una clase o función
In file included from ./../Pi.h:16:0,
from Galaxy.cpp:7:
./../JobQueue.h:33:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:33:2: error: la declaración friend no nombra una clase o función
./../JobQueue.h:34:2: error: se debe usar una llave-clase cuando se declara un friend
./../JobQueue.h:34:2: error: la declaración friend no nombra una clase o función
make[2]: *** [StarSystem.o] Error 1

@walterar

This comment has been minimized.

Show comment
Hide comment
@walterar

walterar May 6, 2013

Contributor

ping @fluffyfreak @robn

Just to be sure you will see this. :)

Contributor

walterar commented May 6, 2013

ping @fluffyfreak @robn

Just to be sure you will see this. :)

@laarmen

This comment has been minimized.

Show comment
Hide comment
@laarmen

laarmen May 6, 2013

Contributor

Error messages in Spanish aren't that useful for a dev team
that uses English as its only working language.

Next time, please run the command prepended by LANG=C to have the output
in English.

Contributor

laarmen commented May 6, 2013

Error messages in Spanish aren't that useful for a dev team
that uses English as its only working language.

Next time, please run the command prepended by LANG=C to have the output
in English.

robn added a commit to robn/pioneer that referenced this pull request May 6, 2013

@walterar

This comment has been minimized.

Show comment
Hide comment
@walterar

walterar May 6, 2013

Contributor

Thanks @laarmen

@robn I updated with e0215f4 and now get this:

CXX StarSystem.o
In file included from ./../Pi.h:16:0,
from Galaxy.cpp:7:
./../JobQueue.h:38:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:38:2: error: friend declaration does not name a class or function
./../JobQueue.h:39:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:39:2: error: friend declaration does not name a class or function
In file included from ./../Pi.h:16:0,
from StarSystem.cpp:9:
./../JobQueue.h:38:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:38:2: error: friend declaration does not name a class or function
./../JobQueue.h:39:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:39:2: error: friend declaration does not name a class or function
make[2]: *** [Galaxy.o] Error 1

Contributor

walterar commented May 6, 2013

Thanks @laarmen

@robn I updated with e0215f4 and now get this:

CXX StarSystem.o
In file included from ./../Pi.h:16:0,
from Galaxy.cpp:7:
./../JobQueue.h:38:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:38:2: error: friend declaration does not name a class or function
./../JobQueue.h:39:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:39:2: error: friend declaration does not name a class or function
In file included from ./../Pi.h:16:0,
from StarSystem.cpp:9:
./../JobQueue.h:38:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:38:2: error: friend declaration does not name a class or function
./../JobQueue.h:39:2: error: a class-key must be used when declaring a friend
./../JobQueue.h:39:2: error: friend declaration does not name a class or function
make[2]: *** [Galaxy.o] Error 1

robn added a commit to robn/pioneer that referenced this pull request May 6, 2013

Zordey pushed a commit to Zordey/pioneer that referenced this pull request Nov 21, 2013

Merge pull request #2163 from fluffyfreak/JustLookingAgain
Multithreaded Job based Terrain Patch generation.

Zordey pushed a commit to Zordey/pioneer that referenced this pull request Nov 21, 2013

Zordey pushed a commit to Zordey/pioneer that referenced this pull request Nov 21, 2013

@fluffyfreak fluffyfreak deleted the fluffyfreak:JustLookingAgain branch Jan 25, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment