TeamSystem

Distributed world data file & team workflow systems

Let's imagine a team building an open-world game with an XLE-based engine. We should consider many different types of teams: large teams working from a shared office, distributed teams working in an open-source-like development model, and very small teams of 1 to 3 people.

The open-world environment is defined by many separate data files:

  • placements (or static objects placed in the world)
  • simulation data attached to objects (like scripts and properties)
  • pathfinding and graphical helper markup (like safe/unsafe zones, lights and environmental settings)

During the course of a normal day, many people might be authoring bits of data. That leads to some basic work-flow problems.

  • What happens if two different people want to make changes to the same file on the same day?
  • After a person makes some changes, when should that be pushed out to the rest of the team?
  • What if two people make changes to two different files, but they are logically incompatible?
      • eg, one person modifies AI pathfinding data related to a house... but someone else moves the entire house somewhere else
  • What if a small subteam wants to experiment with certain changes before the whole team has access?
      • eg, imagine the design team is working on a new experimental multi-player idea. They want to try it out amongst themselves first. Then they might push to the full dev team, then to a group of trusted testers, and only then to the wider community
      • that full process might take months, but meanwhile other small changes might be pushed out directly to the wider community more quickly

We should also consider cases in which multiple different versions of the world are required:

  • different release territories might require small changes to the world (eg, one version for a Japanese release, and another for a Chinese release)
  • multiple overlapping dev-cycles might be happening simultaneously
      • eg, small maintenance changes to the world may be required
      • while longer term development on an expansion pack takes place

To handle these cases elegantly, we need some ability to merge world data.

Ideal goals

Distributed data

Many of the problems outlined here suggest the need for some kind of "distributed" data storage method. That is, distributed in the same way that git is distributed.

So, in the case where a subteam is working with experimental data:

  • the first stage involves one author changing data locally on their single machine
  • when ready, they can push to a subteam server
  • everyone on that subteam can now access that data, and can make changes and collaborate as needed
  • that subteam server can also be updated with changes coming from other teams (via "upstream" servers)
  • when ready, those changes can be pushed into the next, more widely accessible server
  • and so on

Likewise, this style of distributed data is handy when dealing with multiple versions of the world for different release regions.

This kind of distribution model is ideal for common games development work-flows (and very flexible for different variations of the above).

Frequent pushes

One of the best ways to avoid collisions between multiple people working on the same data is to encourage frequent pushes to shared servers.

For example, an older work-flow might look like this:

  • a designer "checks-out" or "locks" a file in the morning
  • they make changes throughout the day
  • later in the day, they "check-in" or "unlock" the file to commit their changes

In this work-flow, the designer has an exclusive lock on the file for many hours. While this can prevent merge problems, it means that in a big team, one person may often have to wait for another to finish.

It also means that building an environment must follow a waterfall-style model. For example:

  1. first, artist A builds the background environment
  2. then, designer B places objects and details
  3. then, designer C places AI markup data and writes scripts
  4. then, designer D creates cutscenes in the area

Each worker wants to write to the same files, but only one person can work at a time. Consider what happens if designer C finds some fundamental problem with the background environment -- something that was not apparent until this late stage. Then the waterfall must start again.

It would be much more ideal if all 4 workers could work together, at the same time. This should result in a work-flow that is more immediate and more creative.

However, to achieve that, we need the ability for each worker to frequently push their changes to the other workers (and also frequently receive changes from the other workers). If artist A changes some part of the environment, designer B can immediately go in and change the objects around that area. And designer C can start adjusting the AI markup data at the same time. And designer D can adjust cutscene camera positions to suit the new changes.

Not only can all 4 work together, but the time from original idea to final implementation has been dramatically reduced.

But how frequent does "frequent pushes" need to be? Does it mean once per hour? Once per minute? Once per second? If every push becomes some kind of commit record, at what point does it become too many?

Intelligent merging

When working with large world data, at some point users will want to merge data from multiple sources. Merging is so useful that it's hard to pass up. So it's something we want to be able to do, if we can.

But, of course, there are a lot of problems that can come from merging! Merge conflicts and conflict resolution are big problems.

If we want to support merging, that really means we need to plan the data formats around the idea of merging. That is, we need to arrange the data in a way that will naturally reduce the frequency of merge conflicts, and present fewer problems when they do occur.

Some data formats just fall apart completely on merge errors. For example, merging Visual Studio project and solution files has often caused problems for me. Other data formats can be more robust.

We should consider this when designing the data format itself.

Intelligent upgrading

During large scale development, we sometimes need to change our data formats. When this happens, we need to upgrade the existing data to the new format.

This can cause huge problems! Changing the data is one problem. But another is dealing with rolling out the change. If an executable only supports the latest data format, then all developers need to synchronise upgrading their executable with upgrading their data.

It's just trouble! We want some intelligent way to deal with this and avoid day-to-day disruption.
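One possible approach (a rough sketch only; the names here are hypothetical, not an existing XLE API) is to record a format version in each data file, and have the loader apply a chain of registered upgrade steps -- much like database migrations -- until the data matches the version the current executable expects:

```cpp
#include <functional>
#include <map>
#include <string>

// A data file, reduced to its format version and raw payload for this sketch.
struct WorldDocument {
    unsigned formatVersion = 0;
    std::string body;
};

// Each registered step upgrades a document from version N to version N+1.
using UpgradeStep = std::function<void(WorldDocument&)>;

class UpgradePipeline {
public:
    void Register(unsigned fromVersion, UpgradeStep step)
        { _steps[fromVersion] = std::move(step); }

    // Bring "doc" up to "targetVersion" one step at a time.
    // Returns false if a required step hasn't been registered.
    bool UpgradeTo(WorldDocument& doc, unsigned targetVersion) const
    {
        while (doc.formatVersion < targetVersion) {
            auto i = _steps.find(doc.formatVersion);
            if (i == _steps.end()) return false;
            i->second(doc);
            ++doc.formatVersion;
        }
        return true;
    }

private:
    std::map<unsigned, UpgradeStep> _steps;
};
```

With something like this, a newer executable can still load older data by upgrading it on load, so rolling out a format change no longer has to be synchronised across the whole team.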

History of changes

Often, we need to know how a piece of data has changed over time. We might be tracking down a bug, or maybe just comparing a historical version of something to the latest version.

Getting a history of changes requires 2 things:

  • we need to record that history in the first place
  • we also need some way to display those changes

Raw view of data

GUI tools are great for authoring data. But sometimes we just want to be able to look at a raw version of the data (for example, just a text file or database table). It can be helpful for finding problems or for considering optimisations.

Ideas for solutions

There are 2 basic approaches to this problem.

Database backend

One option is to use a SQL-style database as a backend. This has some really good advantages:

  • "migrations" provide a great way to upgrade data for version changes
  • easy queries and searches
  • lots of tools and technologies available to solve specific problems

A database would also be great for allowing instantaneous data updates. Changes one person makes on one PC could be immediately committed to the database, and picked up by other users immediately. This would be great for many reasons:

  • developers can see changes in real time
  • testers playing the game will see the changes artists are making immediately
      • (so if an art bug is found, it could be immediately addressed and the tester can provide real-time feedback)
  • a developer can use an editor tool on one PC while the game is running on another PC (or handheld/console). Changes will immediately be reflected in the game.

But using a database may not be well suited to some other goals:

  • merging data
  • history of changes
  • distributed model

There are some technologies that support some of these features. But there doesn't seem to be any single perfect solution.

Source control backend

A slightly simpler, but also quite useful solution would be to use source control in combination with text data files. This has some great advantages:

  • simple, but well understood technologies. Easy to integrate into libraries
  • history and raw views are automatically handled
  • the distribution system we want can be configured using existing tools
  • merging can be handled with standard tools, so long as the text data format is well suited to text merging

"Commits" are also a handy way to collect changes into a clear packet. We don't want too many commits, because it would make the history overwhelming. Commits also have a message and username attached, which again is important to make a meaningful history.

There are some disadvantages to this method, however:

  • less straight-forward to upgrade data to latest version
  • more difficult to get instantaneous broadcast of changes
  • no built-in support for queries and searches

I heard that when "git" was first developed, it was just a bunch of bash scripts. Then it was expanded, eventually ported to C, then GUI tools were developed, and so on. It started very humbly, and then grew and grew. This sort of suggests to me that the same kind of approach could be used for this system within XLE. It only needs minimal functionality to start with, using some basic bindings and reused functionality. But, if we pick the best methods, over time it can grow and grow.

Considerations

Distributed source control work-flow considerations

Now, consider the source control backend idea.

When using a distributed source control system, we can either use a push-based integration method, or a pull-request-based integration method.

Consider submitting a commit to a server in the push-based method:

  • a client must first fetch and merge/rebase from the server
  • merge errors are handled locally at this point
  • when the client is perfectly up-to-date with the server, they can push new commits

In the pull-request-based method:

  • a client can invoke a pull-request on a server at any time
  • clients aren't forced to update to the same status as the server
  • the merging occurs on the server, and if successful, the new changes can be integrated into the main branch

The biggest difference is that push-based methods require the client to update all local data to the same version as the server. We can call this a "pull-before-push" restriction.

In some cases this restriction might not be ideal.
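To make the contrast concrete, here is a deliberately toy sketch (commit histories reduced to plain lists of ids; this is not how git actually stores or merges anything): in the push-based method the client must first absorb the server's commits before its own are accepted, while in the pull-request-based method the server performs the integration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

using History = std::vector<std::string>;   // commit ids, oldest first

static bool Contains(const History& h, const std::string& id)
    { return std::find(h.begin(), h.end(), id) != h.end(); }

// Push-based: the client must absorb every server commit before pushing.
void PushBasedSubmit(History& client, History& server)
{
    for (const auto& id : server)       // fetch + merge/rebase locally
        if (!Contains(client, id))
            client.push_back(id);       // (real merges can conflict here)
    for (const auto& id : client)       // only now is the push permitted
        if (!Contains(server, id))
            server.push_back(id);
}

// Pull-request-based: the client only sends its new commits; the server
// performs the merge, and the client is never forced to update first.
void PullRequestSubmit(const History& client, History& server)
{
    for (const auto& id : client)
        if (!Contains(server, id))
            server.push_back(id);       // server-side merge/CI happens here
}
```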

Another problem is that (using git, or other similar systems), we have to pull the entire repository. For a very large repository, this could end up being a lot of data. If a developer is only working on a small part of the world, they may only be interested in updating that specific area. But with git, it just isn't possible to pull only part of a repository. This may be most significant for open-source type teams that may not be working together in a shared office.

The pull-request-based method might be more ideal in these situations, because it can reduce the amount of data sent back and forth between the client and the server. But to use this method we need to have some continuous integration software running on the server to handle the pull requests. This software should check for merge problems, and will probably require user intervention when conflicts occur.

It may be ideal to use multiple repositories. Perhaps different repositories for different types of data.

We also need some way to detect changes to the server, perhaps via a subscribable service running on the server. Clients should have the option of listening for commits, and automatically pulling them as they happen.
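As a very rough sketch of what that subscription might look like (the names are hypothetical placeholders for whatever transport is actually used; this is not an existing XLE or git facility):

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Server side: broadcasts the id of each new commit to every subscriber.
class CommitFeed {
public:
    using Listener = std::function<void(const std::string& commitId)>;

    void Subscribe(Listener l)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        _listeners.push_back(std::move(l));
    }

    void NotifyCommit(const std::string& commitId)
    {
        std::lock_guard<std::mutex> lock(_mutex);
        for (auto& l : _listeners) l(commitId);
    }

private:
    std::mutex _mutex;
    std::vector<Listener> _listeners;
};

// Client side: optionally auto-pull whenever the server reports a new commit.
void AttachAutoPull(CommitFeed& feed, std::function<void()> pullLatest)
{
    feed.Subscribe([pullLatest](const std::string&) { pullLatest(); });
}
```

A client that opts in simply registers a callback that pulls the latest changes whenever the server announces a new commit; clients that prefer to update manually just never subscribe.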

Structuring data for merges

To use a source control back-end, we must be able to store all work data in text files. Text files can give us the merging, history, and raw views behaviour we need.

Normally this will probably require storing data in 2 formats: one text file format for tools, and a second binary format that can be streamed and used in-game. That requires a little extra work within the engine to support both formats (and perform conversions at the right time, manage them appropriately, etc).

To avoid merge problems, there are some rules to follow (a rough sketch of the resulting layout follows this list):

  • avoid tree-based data structures (in the way that XML tags are hierarchical)
      • if we have objects arranged in a tree, they should be represented as a linear list within the data, with parent (or child/sibling) links included
  • avoid sorting/reordering data in lists
      • we might be encouraged to have a list of objects sorted by name. However, this isn't ideal -- because if the name changes, the object must be moved in the list. That makes the history more difficult to follow, and merge errors more likely
  • use parallel lists intelligently
      • imagine we have a list of objects and associated properties
      • rather than having giant objects with many properties, consider using multiple "parallel" lists
      • each list should contain only a subset of the properties
      • we want to separate the properties in such a way that different developers are likely to modify different lists
      • for example, there may be certain properties that a scripting designer is more likely to modify, and other properties that a world building designer is more likely to modify
      • if all the properties are stored together in the same list, both designers could frequently be making changes in the same general area of the file
      • but if they are separate, changes are more likely to be in separate parts of the file (and so merge problems are less likely)
  • relational database style links between objects may be more robust under merges than other types of links
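As a rough illustration of those rules (the names and fields below are hypothetical, not an actual XLE data format), the objects might be arranged as flat, parallel lists keyed by a stable id, with hierarchy and cross-references expressed as links rather than nesting:

```cpp
#include <cstdint>
#include <string>
#include <vector>

using ObjectId = uint64_t;      // stable id; never reused, never sorted on

struct Placement {              // flat list entry, appended in creation order
    ObjectId id;
    ObjectId parent;            // tree expressed as a link, not by nesting
    float position[3];
};

struct ScriptProperties {       // parallel list: mostly edited by scripting designers
    ObjectId object;            // relational-style link back to the placement
    std::string script;
};

struct AIMarkup {               // parallel list: mostly edited by world/AI designers
    ObjectId object;
    bool walkable;
};

struct WorldCell {
    std::vector<Placement>        placements;
    std::vector<ScriptProperties> scripts;
    std::vector<AIMarkup>         aiMarkup;
};
```

Because the script properties and the AI markup live in separate lists, a scripting designer and a world-building designer editing the same object will usually be changing different parts of the text file, which keeps merges cleaner.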

Large binary data

Some data can only really be stored as large binary data. Terrain data and textures are the most important examples. But how should this fit into the system? This type of large binary data will probably have less frequent updates. If possible, we want this large data to fit with the distributed model -- but how can that be done seamlessly?

There are a few possible options:

  • git-media / git-annex / git-bigfiles
  • Mercurial largefiles extension
  • Plastic SCM (but we don't want to force commercial software requirements on users)
  • don't use source control at all, but maybe just have a stub file / link within source control that points to binary data somewhere else
  • just use SVN

Conclusion

All in all, the source control based method seems to be both simpler and more complete. If there were a database solution that had exactly the features we want, that would probably be ideal.

However, it seems the perfect solution doesn't exist. But it certainly seems like it would be easier to modify a source control & text file based solution to meet our needs, rather than attempting to patch together a solution out of database technologies.