
Finalize renderer design #19

Closed
ebkalderon opened this issue Jan 29, 2016 · 7 comments

@ebkalderon
Member

Though the main priority right now is to stabilize the entity-component-system API (issue #10), we can finish designing the parallel renderer alongside the internal restructuring of the engine into modular crates (issue #13).

As described in the relevant design document on the wiki, our aims are high throughput, data-driven design, optimization for next-gen APIs, demo recording and real-time playback, and network transparency (i.e. for tool slaving).

Please take a look at the drafted renderer design for reference. Feedback is welcome!

Work will proceed on the renderer branch.

@ebkalderon ebkalderon added type: feature A request for a new feature. diff: hard Achievable, but may require efforts from multiple experienced developers. pri: important Something other teams are relying on, or a low-level, critical piece of functionality. labels Jan 29, 2016
@ebkalderon ebkalderon added this to the 1.0 milestone Jan 29, 2016
@ebkalderon
Member Author

Something not accounted for in the diagram (draft 1.5) is the overhead of IR generation. It is assumed that for each layer, the equivalent IR is generated by looping through each element and building a render list (a high-level description of how to process an object/light/uniform). Depending on how many elements there are per layer, this might be unnecessarily slow.

Perhaps for each layer, there should be a work-stealing thread pool building these render lists in parallel? The number of threads in this pool would be determined at startup based on the available hardware resources.
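
A minimal sketch of what that could look like with a work-stealing pool such as rayon; the `Element`, `RenderItem`, and `build_render_item` names here are hypothetical placeholders, not part of the actual design:

```rust
use rayon::prelude::*;

// Hypothetical source element of a layer (object/light/uniform).
struct Element;

// Hypothetical render-list entry: a high-level description of how
// to process one element.
struct RenderItem;

// Translate one element into its render-list entry.
fn build_render_item(_element: &Element) -> RenderItem {
    RenderItem
}

// Build the layer's render list in parallel. Rayon's work-stealing
// pool sizes itself to the available hardware threads by default.
fn build_render_list(elements: &[Element]) -> Vec<RenderItem> {
    elements.par_iter().map(build_render_item).collect()
}
```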

@kvark
Member

kvark commented Feb 1, 2016

That can be evaluated later on by just replacing `for` with `parallel_for` ;)

@ebkalderon
Member Author

Posted on February 19, 2016 9:51 PM in Gitter:

I've been researching a ton of stuff about GFX, replay systems, and the Rust language itself (I'm still learning, after all). I'm thinking about some more changes to the renderer design, possibly eliminating the need for the IR.

The reason why I proposed an API-agnostic IR in the renderer was to allow quick and easy transmission of frames over a network (similar to RDP or X11), so we can support remote tool slaving. However, I've come to the conclusion that this is impractical for the following reasons:

  1. If the development system is running at 60 FPS and the testbed slave system is running at 30 FPS, then we'll have a problem. This can be solved by limiting the rendering framerate so the two systems match, but this seems very hacky.
  2. Only the visual output is being duplicated. The other aspects of the game, most notably sound, aren't being transferred.
  3. You cannot walk over to the slave system, pick up your HID(s), and playtest right away. The slave is just a "dumb terminal" with no knowledge of the game state or logic.

Instead, I propose eliminating the IR and having GFX command buffers generated directly by the frontend (I see you smiling, @kvark!). Tool slaving will be handled by the engine and not by the renderer directly, like so (a sketch of the wire messages follows the list):

  1. The engine is initialized on the master system, and another copy of the engine is initialized on the slave system. Both are identical to one another.
  2. Master initiates a network connection with the slave, the slave accepts.
  3. The master's initial game state, the RNG seed, and the world tick interval are sent over the network and used to seed the slave system.
  4. Once the slave system has finished loading this data, it signals the master to start the game.
  5. The master and slave mutate the world independently. The user plays the game on the master system, and their input signals are captured and sent over the network to the slave.
  6. Assuming the engine is deterministic, this should be enough to mirror the master's state onto the slave (visuals, audio, physics, etc).
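
To make the flow concrete, here's a rough sketch of the messages such a protocol might exchange. All of the names here are hypothetical, for illustration only:

```rust
// Hypothetical wire messages for a master/slave session.
enum Message {
    // Step 3: seed the slave with the master's starting conditions.
    Init {
        initial_state: Vec<u8>, // serialized game state
        rng_seed: u64,
        tick_interval_ms: u32,
    },
    // Step 4: the slave has finished loading; the master may start.
    Ready,
    // Step 5: the master's captured input events for one world tick.
    Input { tick: u64, events: Vec<u8> },
}
```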

Demo recording/replaying (similar to Quake demos or vktrace/vkreplay) works similarly, except the data is serialized to disk instead of being transferred over the network.
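
Continuing the hypothetical sketch above, recording could append each tick's captured inputs to a demo file as length-prefixed records; replaying would read them back and feed them into the engine at the recorded tick interval:

```rust
use std::fs::File;
use std::io::{BufWriter, Result, Write};

// Append one tick's captured inputs to the demo file.
// `events` holds the already-serialized HID events for this tick.
fn write_record(out: &mut BufWriter<File>, tick: u64, events: &[u8]) -> Result<()> {
    out.write_all(&tick.to_le_bytes())?;
    out.write_all(&(events.len() as u32).to_le_bytes())?;
    out.write_all(events)
}
```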

This method isn't perfect, and some jittering and stuttering are to be expected, especially if there's no lag compensation. But it's better than what we've got! 😄

@White-Oak
Contributor

@ebkalderon this looks nice, but what's the use case of this master-slave system?
I think it adds a lot of complexity, and I'm not really aware of the use cases.
Can you give some examples and/or point to engines with a similar system?

@ebkalderon
Member Author

@White-Oak It allows people to preview and playtest their games directly on their target platforms (ideally mobile devices or consoles) without needing to deploy them by hand. You can modify your scripts or step through them in a debugger on your development machine and watch the output on your external devices. Any updates to your game's resource files will also propagate over the network to all the slave devices with little user interaction. At any point, you can drop the master-slave connection and hand over the slave devices, with the current version of the game, to the playtesters.

Two AAA engines I know for sure have this functionality (there may be more):

Recording and replaying entire game sessions from disk has numerous applications as well:

  1. The most obvious application for developers is replaying hard-to-find gameplay glitches or analyzing playtester reactions to your game. These demo files are compact and can easily be emailed around by members of the development team. The engine's determinism allows people to pause, zoom in, rewind, slow down, and otherwise inspect the level as the player progresses.
  2. On the players' side is the efficient creation and transmission of machinima (Quake or Source Filmmaker, anyone?). Demo files let you see the session replayed in real time in the engine, as opposed to through an AVI or MP4 recording. Recording overhead is also much lower than, say, Fraps, since only the players' inputs are captured per frame instead of actual audio/video.

One of the project's goals is a solid toolset that fosters rapid iteration. Making such functionality available to the public in a freely available game engine would be kick-ass!

@ebkalderon
Member Author

I would like to say something about the current renderer design. On February 9, 2016 4:48 PM, I reasoned that the backend and frontend should both be exposed to make implementing networked tool slaving easier.

However, with my recent comment a few days ago about tool slaving being an engine-wide issue, I realize that my original proposal is no longer necessary. It's now possible for the frontend and backend to assume their correct levels of abstraction. These changes should be landing soon on the renderer branch.

@Xaeroxe
Member

Xaeroxe commented Sep 15, 2017

The renderer rewrite is complete, and this issue has gone quite stale, so I'm closing it.

@Xaeroxe Xaeroxe closed this as completed Sep 15, 2017
CBenoit pushed a commit to CBenoit/amethyst that referenced this issue Apr 19, 2019