A Future of Visual Storytelling

mlevere edited this page Jul 7, 2020 · 8 revisions

In the future, anyone with a story idea will be able to develop and visualize their story without technical limitations. They'll be able to direct a visual story by themselves. They'll be able to collaborate in real time, building virtual scenes with people next to them or in different countries. They'll be able to iterate freely and easily to make their stories better.

In the future, anyone with a set of storyboards will be able to pick up a camera and start shooting. They won't need to find or build real sets. They won't need to be in the same country as the performers. As they shoot each shot, the scene will be roughly edited together, in real time. The technology will become invisible. They will only need to worry about the story and the performance.

In the future, any person with a story idea will be able to produce their stories into full reality. They won't be restricted by where they came from or who they are now, but rather, empowered because of where they came from and who they are now. They won't worry about money or possessing technology. They will have a place to create stories, find others to work with, and find audiences that connect with their work. They will be free to play and explore creative story ideas ultimately creating visual stories with their most valuable asset: their unique point of view.

You don't need anyone else's permission or money to get started. And if you're smart and creative you don't need anyone else's permission or money to produce an entire visual story to completion.

Some Side Notes

This is not the future, it is a future.

This is a collection of my dumb ideas about a very near future of visual storytelling: the concepts, processes, and technologies that I'm thinking about. I'm not trying to push anything on anyone. I want everyone to have their own idea of the future so we can have conversations and an exchange of ideas. I'm writing this mostly to organize my thoughts, but also to get feedback from you!

"As far as the future, most of it is already here, it's just not very evenly distributed." – Abraham Lincoln.

Steal the future from everywhere you see it. I'm personally live-action focused, and I'm incredibly surprised live-action production hasn't stolen simple genius ideas from animation and theatrical productions. They haven't even attempted to. They aren't even aware of interesting processes from video game development, car manufacturing, university research, or tinkerers on YouTube.

All the concepts discussed here are already practiced by some people or in different contexts. As for technology, most of it exists already; it might just be inaccessible. I want to try to make the concepts and technology accessible to most people if possible.

Why is visual storytelling important?

Stories humanize people. We need to find the humanity in other people's viewpoints. People's stories help people understand each other.

Visual stories can create an immediate emotional impact on a viewer in a way that no other art form can.

What is a Visual Story?

Visual Stories are ideas conveyed through sequential images or even a single image. In almost every case, the story involves the imagery of people or personified objects that immediately have an emotional effect on the viewer and do not require translation. Each sequential image progresses the idea.

Examples of visual stories, animated or live action, include movies, TV shows, commercials, webisodes, TikToks, graphic novels, comics, slide shows, or even a single image. The examples vary in length, format, consumption device, subject matter, and production method, but at the highest level, they are images shown in order to convey ideas.

Are sounds/music visual stories? Absolutely. In many cases they are just as emotionally moving as the visuals.

1. Story Development

The story is the most important part of visual storytelling. Does that statement sound as basic to you as it feels for me to write it? I hope so, because it should be plain as day.

Making stories is hard. It has been the same level hard for the past 10,000 years. It will be the same level hard 10,000 years into the future. Technology may help with minor busywork annoyances, but the ability to contrive a story from nothing, infuse it with love and soul, solve its infinite potential issues, and economically have an emotional impact on the viewer, all while seeming true and not contrived, is a human miracle.

Certainly, you've seen a big budget movie where it was technically astounding, but the story just sucked. What's the point of burning tons of production cash if the story sucks? Why not wait until you have fully developed the story?

Probably because making stories is hard, and people are lazy.

Therefore, the development of the story should be the greatest interest of visual storytellers all the way through production, and even after it.

Writing Visually

If we consume visual stories as images, why would a writer use words to describe images to synthesize a story? Why not develop a story visually? Instead of writing with words, write with images.

It's easy to understand why we write with words. We use collections of words as symbols to create ideas. We can speak and listen to words to communicate ideas. We can write words down and read them. We can easily mass distribute written words. Words are cheap and effective. But they are also lossy symbols. They can never exactly convey the true meaning and feeling of the writer. They require the writer to translate the idea to words, and the reader to translate the words to the idea. Those are two translations where the original idea can be confused.

In animation, specifically at a company like Pixar, writers do, in fact, write with images. "Story Artists" draw sequences of images to convey stories. Storyboards. This is an incredibly effective way of conveying a story because the hand drawn images are very close to the desired end product. Story artists can visually explain ideas that are understood immediately and that would be very difficult to write in words. Image sequences also demonstrate timing, pacing and tone, which is often near impossible to do in words.

The downside of drawing images to synthesize stories is that you need to be able to draw. However, I have been assured by many story artists that the important component of "Story Artist" is "Story". A great storyteller who has little drawing ability can put together stick figure drawings and is able to move a viewer emotionally. Conversely, a technically great illustrator with no clear idea of story often just makes hollow pretty pictures.

Therefore, I believe great visual storytellers should be able to visually write without having the ability to draw. This is why we built Storyboarder. Initially, Storyboarder was built as a conventional storyboarding tool centered around illustration. Yet, computers have a great ability to render 3D scenes of environments, objects, and characters. So we built a tool that allows people to easily create scenes, pose characters and set up cameras so they can easily create images to tell stories.

Furthermore, you can put on a VR headset to create and manipulate the whole scene just as you would in a real set. It essentially allows you to write and direct a story on the fly.

Editing is very easy by going back into a scene and making changes. There isn't a sense of loss of throwing away a drawing – there is a sense of productivity and creativity by incrementally making things better.

https://www.youtube.com/watch?v=31CdxnXAxiQ

"Film stories are still really rooted in theater. You're frequently stuck doing things that you could do much more poetically. The story has to be written for it and I'm not quite sure who would do it because writers don't write visual things." – Stanley Kubrick

With modern creative tools, it's now possible for writers to write visual things.

So at this point, there is no reason why a visual storyteller should not be writing visually using a tool like Storyboarder or any other tool that works for them.

Importance of Play

All of the most creative things I've ever done have been by accident. I was trying something I didn't need to do because: why not? I was playing. With curiosity and wonder you take little low cost risks to see what will happen. You learn new things. Sometimes those things are fantastic. Sometimes you are the first person to discover something and often the first to discover it from your particular point of view.

Keep the costs low and the creative risks high.

When creating or doing something takes a lot of effort or has a big expense, we don't take small risks. We make plans. We make plans to make plans. We often spend more time trying to mitigate the potential disasters of the plan than actually executing. This limits happy accidents. This limits risk taking. This limits play.

It's weird and sad how people don't play as they get older. Maybe it's because our bodies work slower. Maybe it's because we think we've already discovered everything. Maybe it's because we don't even remember the joy of discovery.

Is it possible to live in the world of your story? Is it possible to act the scene out?

When Walt Disney was developing the story of Snow White, he would grab a random employee from the halls and bring them to the theater. He would act out the full story of Snow White from his head. Every time was a little different. He would try different ideas and often improvise new ideas on the spot. He did this hundreds of times, coming up with tons of new ideas.

Why not play the story out hundreds of times? Improvise new ideas because: why not?

When the cost to try new ideas is low, and the technology and tooling blend into the background, you can reach a flow state. You are hyper focused on creation. You can try new things and you get immediate feedback. New ideas give you new ideas. The instant you have to wait for something, this is broken.

When creation feels like a game, the stakes are low. The barrier is low. The risks are easy to take. The art is better.

Collaboration

Creativity is reactive. No one synthesizes a completely original idea without a prompt. Conversations are people reacting to each other's ideas. A question is asked for clarification. A person responds with a new thought they'd never thought of before. Another person brings a fact or prior art into the conversation, sparking new ideas.

Conversation and collaboration between people is an excellent tool for synthesizing new ideas.

Yet, recent story development is thought of as a solitary act. The sole writer creates a unique, idiosyncratic story with their own take on dramatic conflict.

A strong, unique point of view is absolutely necessary in a story. But why not benefit from multiple minds of similar style, interest and taste, each bringing their own strengths to the development of the story? With a strong central story director, the story can maintain the point of view, while incorporating the very best input from each writer.

Additionally, collaborators don't need to be in the same place. In fact, it seems absurd to find creative, unique, talented people (which is near impossible), expecting they will all live in the same city. We have tools to coexist in virtual worlds together, walking around, playing, and creating together. You can do this within Storyboarder.

Music Sooner Than Later

Some of the most beautiful stories are told through music. Some of the greatest memories are tied to iconic original themes.

So why not develop a story with music as a fundamental writing tool in the first place?

Music and thoughtful sound design have a huge impact on how people feel when they watch a visual story. Music can set the mood and tone through the usage of well known cliches. In fact, in the absence of a meaningful story, music is the only hope to keep that visual turd from sinking into the dark abyss of irrelevance.

Iteration

Iteration is easily the most important concept around the future of visual storytelling. I buried it down here because most people aren't even reading this far. But you are. Hey. Isn't this nice? Let's just take a moment to appreciate each other. Thanks for reading this. Anyways...

No creation is great right out of the gate. It is made better over time. You listen to feedback. You change things. You try new things or gain a new perspective. You change things. Every iteration makes the work better.

Iteration is how stories have historically developed. Myths would be created by someone and told to a group of people. People would retell the stories, keeping the best aspects of the story and losing the worst parts. Every time it was retold, it would be slightly different; each time, the story would be iterated on. The Bible is a great example of a collection of prototypical myths iterating over several thousand years. All religious mythology and folklore is the result of this.

In video game development, no game is just made and released to the public. When a game designer has a novel idea for a game, they are feeling pretty good that they are on to something. They make a prototype. They make something people can play and receive feedback on. This isn't done two or three times. This is done thousands of times. When the game is ready for release, perhaps a kernel of the vision is still intact, but the resulting game is very different from the way they originally intended. Even after release, changes are being made to improve the game.

In feature-length narrative screenwriting, there is some sort of arbitrary "A draft and two revisions" rule of thumb. It seems to be centered around worker protectionism more than the work itself. But who really gives a shit about how screenwriting is typically done? Mythology has been iterating on iconic stories for 10,000 years. Screenwriting has only existed for roughly a hundred years and has produced only a handful of rare single hit successes.

Yet in animation, particularly at Pixar, stories are heavily iterated on. Animation is much more expensive per minute to produce than live-action. Therefore, if you are going to spend a ton on production, you should invest in the story you are trying to produce. Stories at Pixar start with the kernel of an idea. The story is workshopped. Scenes are written by story artists that produce visual sequences to pitch story ideas to the team and the director. Because these hand drawn sequences can be edited into full versions of a movie, Pixar can show that version to test audiences and their trusted panel of storytellers for feedback. The feedback is taken and incorporated into revisions of sequences, or even wholly new ideas. This makes the stories better. Pixar stories are developed over the course of two years. Their story and commercial hit rate is nearly 100%.

Even though Pixar pioneered computer-animated filmmaking, their most valuable innovation has been how they iterate on story. I don't know why every storyteller or entertainment studio hasn't stolen their process. I think it is because Pixar originated as a group of creative software engineers. The value of technological iteration is burned into their culture. Whereas most other creative executives still cling to the golden touch fallacy.

But iteration is hard particularly in visual storytelling because the costs are so high. Do you have 10 story artists at your disposal to visualize your story concepts? I don't. Furthermore, changes to stories have ripple effects. If you change a fundamental value of a character in the story, you have to make sure that change is reflected in the rest of the story. This problem increases exponentially based on the length of the story. Even worse, artists are resistant to losing their work. Imagine if you just drew a beautiful sequence of 300 images. But, a great change was made to a character that will improve the story. This will nullify your sequence and it must be redone.

This is why we built Storyboarder. We wanted scenes and shots to be easily changed so that contrary to feeling loss while iterating story, you would feel a sense of improvement. A creative tool like Storyboarder will reduce the change costs so that iteration is pain free.

Also, because the storyboard sequences generated with Storyboarder are very similar to the desired output (a visual story), full sequences of the entire visual story can be presented to test audiences for feedback. This can be done hundreds or thousands of times.

2. Pre-Production

Pre-production is the stage where the story has been developed enough that it makes sense to move forward and prepare for production. However, this doesn't mean the story is done. Ideally, through the process of working with the cast, workshopping the story, and further testing, the story will further develop prior to production. Through the step of proto-production, the story will be iterated on at least one more time.

Casting

There are a lot of great actors few people have seen yet. If your story is great, you do not need a well-known actor. If the cost of your production is low, you do not need a well-known actor to sell your story to viewers.

Culturally, you want your actors on your team. You want them fully invested in the story. You want them to make your story ring true. You want them to be able to iterate on the story with you. You want them to make the story better.

This will take more work than usually required of a typical narrative story. It's important that they are willing to do this.

Edited Animatic Readings

Bring the actors together to perform alongside the storyboard animatic and record the voices. In the sessions, if the actors have concerns or ideas, talk them through and try new things if needed. Afterwards, sync the vocal performances to the storyboard animatic. The result should be very compelling. If it's not, the story or the performances have issues. Make changes. Try it again if needed.

Proto-production

A proto-production is a low cost prototype of a visual story. In software or game development terms, it would be like a beta test.

Shoot the movie before shooting the movie.

The objective is to shoot the entire visual story without built-up sets, crew, tight time constraints, or high costs. Think: a large warehouse. A couple moveable walls, platforms, tables and chairs. The actors, the writers, and you – with a camera. You'll shoot the whole story there. For months if you need to.

The result is a fully edited visual story that can be viewed by anyone. If the visual story doesn't move your viewers emotionally, it's not because there aren't built sets or fancy visual effects. It's because your story sucks and/or your performances are bad.

Wouldn't it suck to find out your story really just isn't working? No one is feeling it. Even you. That's what releasing the first prototype of a game is like. But it's not a bad thing. It's incredibly eye opening and a huge opportunity to make things better.

You've already visually written your story and iterated on it many times. You shouldn't even be shooting a proto-production if you weren't confident in your animatic. So it should be very clear what's not working with this proto-production and what needs to change.

I was talking to a well-known movie actor about the idea of a proto-production. He told me that he actually prefers acting in theater. However, he won't sign on to a theatrical production unless he has the time to devote two full months to workshopping the play. When he signed on, he liked the story of the script. He wouldn't have if he didn't. But he describes the first day of workshop as having a feeling that this really isn't going to work. Yet, through everyone's feedback, iterating with the director, the script changes, it becomes really good. And even still, on opening night, there will be a line in the play, that no one in the production ever thought twice about, that gets a huge laugh. And they will change the script to lean into that laugh a bit more. It's an incredibly rewarding experience.

That same actor will be in a movie where he isn't even allowed to read the whole script, but what he is reading seems like trash. He'll ask producers and the directors about it and they will assure him that everything is going to be great. He shows up that day for the shoot. Does his lines. Waits a year. He was right. The story was trash.

On a technical basis, we want to make tools that make much of the technical aspects of production invisible. We want to focus on the story and the performances, but in order to keep costs down, we won't have a crew.

The entire visual story was storyboarded shot for shot in Storyboarder. Each board has tons of metadata: the characters in the shot, what they are doing, where the camera is, what the lens is, etc. So there is a shot list for the whole story. You can shoot to the storyboards in this case, and avoid shooting coverage.

We built a system called "Shot Core" that imports all of your shots and metadata for the story. The shots can be reordered into a schedule. When you pick up the camera, it shows you the storyboard of the shot you should be shooting. When you record a take, Shot Core logs it and shuttles it automatically off the camera. There's a small device attached to the camera that lets you rate a take or advance to the next shot. As you are shooting a scene, Shot Core automatically assembles a rough edit of the scene in real time. The whole system is designed to keep you on track, organized, and not worry about anything except the story.

If you want to shoot anything different, you can make impromptu shots. You can even rework the whole scene with no problem.
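The Shot Core workflow described above (import boards with metadata, log and rate takes, assemble a rough edit as you go) can be sketched in a few lines. This is only an illustration under stated assumptions: the class names, fields, and the "highest rating wins" rule are hypothetical and are not Shot Core's actual data model or logic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Take:
    filename: str
    rating: int = 0  # hypothetical: set from the on-camera rating device


@dataclass
class Shot:
    # A storyboard plus a little of the metadata the text mentions.
    shot_id: str
    scene: str
    characters: List[str]
    lens_mm: float
    takes: List[Take] = field(default_factory=list)

    def log_take(self, filename: str) -> Take:
        """Record a new take for this shot and return it so it can be rated."""
        take = Take(filename)
        self.takes.append(take)
        return take

    def best_take(self) -> Optional[Take]:
        """Highest rating wins; ties go to the most recent take."""
        best = None
        for take in self.takes:
            if best is None or take.rating >= best.rating:
                best = take
        return best


def rough_edit(shots: List[Shot], story_order: List[str]) -> List[str]:
    """Return the best take of each shot in story order, skipping unshot boards."""
    by_id = {shot.shot_id: shot for shot in shots}
    edit = []
    for shot_id in story_order:
        shot = by_id.get(shot_id)
        best = shot.best_take() if shot else None
        if best is not None:
            edit.append(best.filename)
    return edit


a = Shot("1A", "kitchen", ["Ana"], 35.0)
b = Shot("1B", "kitchen", ["Ana", "Ben"], 50.0)
schedule = [b, a]  # shooting order can differ from story order
b.log_take("1B_t1.mov").rating = 2
a.log_take("1A_t1.mov").rating = 1
a.log_take("1A_t2.mov").rating = 3
print(rough_edit(schedule, ["1A", "1B"]))  # ['1A_t2.mov', '1B_t1.mov']
```

Because the edit is recomputed from the logged takes, an impromptu shot is just another entry in the story order, and re-rating a take changes the rough edit the next time it is assembled.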

It's important for the technical aspects of shooting a proto-production to be extremely flexible because you are also workshopping the story. If an actor improvises a line that sparks a new idea, it should be explored. If there is a clear problem, it's important to pause, work it out, and try potential solutions.

The result should be very compelling. If it's not, the story or the performances have issues. Make changes. Shoot a proto-production again if needed.

Testing and Iterating

Tyler Perry makes a series of movies starring his character, Madea. The movies are a series of situational comedy sketches targeted to Black audiences. Tyler Perry will randomly show up to local night theaters in Atlanta and try new sketch ideas on stage. If the sketch doesn't do well with the Black audience, he might come back the next night after revising the sketch. And if it kills, that sketch will probably be in the next Madea movie. Tyler Perry is a creative genius. He tests his stories and iterates on them in near real time.

If there is a central theme to the future of visual storytelling, it is that trying, doing, testing, changing and repeating is extremely important to story development. The process started small and low cost, but you were able to take huge creative risks. With each iteration you were able to improve the story as the costs become greater.

The result of each iterative step should be compelling. If it's not, there's a big problem. There's no sense moving on to the next step and going through the final step of costly production. It's an idiotic waste if you can't fix those issues earlier.

When you get feedback, you can translate the feedback into real ideas and make changes on the story. This is key to iterating on story and making it better.

But how do you know if the story is good or improving? In Tyler Perry's case, he is performing to his targeted audience for laughs. When he gets laughs, it's passing the test. When he gets huge laughs, he surpasses his expectation. And when he gets no laughs, something is very wrong. Certainly he thought it might work if he tried it. But he took a risk and it didn't work out.

Your taste and your point of view are the most important components to telling a visual story with an idiosyncratic, unique point of view. Luckily, because of this, you mostly know when things are rotten. But you don't know exactly how your target audience is going to react to the story risks you've taken. And that's the job: to entertain them. Therefore, if you can get their feedback and incorporate it into story changes, you'll be very successful at entertaining.

You will only be producing your visual story once. So make sure the proto-production has been well tested before moving ahead.

Pre-production work

Concurrent to proto-production, you have to prepare everything you need for production.

Because of advancements in compute speed and game development, scenes can be shot using virtual environments and composited with the real performances of the actors. This allows us to shoot in locations that are impossible or don't exist, and in many cases, it is just cheaper to shoot virtually. Therefore, we need to design those virtual environments.

3D designs of the sets need to be created, textured, and lit properly. This doesn't need to be super high quality or exact, because you can make changes to the environment in post compositing.

But the most important aspect is the lighting, because when the actors are photographed, the lights on them will be physically reflected from the virtual world. Luckily, lighting is very easy to change in realtime, in production.

But, anything the actors touch needs to be built. This means that if an actor touches a wall, you need to build that wall. If they pick something up, that needs to be a real thing. If they walk on the floor, the floor needs to be dressed.

3. Production

You've storyboarded the whole movie, you've created a vocal performance animatic, you've shot it at least once. Storywise, production should be the easiest aspect of the entire process.

The objective is: Photograph and record the actors' performance of the story from intentional camera angles. Collect all the appropriate technical information needed to composite that photography for the final image in post production.

Every person on the production team should be contributing creatively with the shared vision of the best version of the story. Culturally, it shouldn't feel like time is running out. Everyone is on the same team and no one is killing themselves to get stuff done. It's a relaxed environment.

Technology is used to help accomplish technical tasks that blend into the background. There are fewer technical people, which means a reduced cost of production, which in turn means more time to capture the best performance of the story.

Everyone on the same team

Do you need more than 6 people to produce a visual story?

Everyone in your production team should have been along for the whole ride. There should be no mercenaries or hired guns. Everyone should know the story and want the same thing, to make the best possible representation of the story.

https://a24films.com/notes/2019/12/seduce-and-destroy-with-josh-safdie-benny-safdie-and-paul-thomas-anderson

Paul Thomas Anderson recalls meeting Stanley Kubrick after Paul made Boogie Nights:

"We talked about the size of his crew. Cause he had like 6 people. And I go, “Wow you know you have a small crew,” and he’s like, “Well how many do you use?” You know it was one of those moments where you really have to assess like, “Like a fucking hundred,” and that’s too many. And he made a very good point."

Stanley Kubrick was a curious autodidact who surrounded himself with a small group of similar autodidacts. If he was interested in something, he learned it. His attitude was that nothing is impossible or too complex to learn. He wrote, directed, photographed, designed, engineered, edited, and was probably also a great break dancer. Because his crew had similar skills and everyone was intimately aware of his sense of taste and style, he was able to delegate tasks to autonomous, creative, intelligent people that would get shit done.

Additionally, they embraced technology that made their jobs easier so they could extend their productions for the purpose of capturing story.

Fluid process

If something is not working storywise, anyone should be able to call a stop to address it. If someone has an idea to make it better, it should be explored.

In the latter half of the 20th century, Toyota developed the Toyota Production System (TPS). It's a conceptual, cultural philosophy around the efficient production of automobiles. It's a precursor to iterative agile development. One component of TPS is that any single person in the factory can stop the production of the entire factory to alert everyone to a problem or to an idea that would make the process better.

Technology that fades into the background

In entertainment production, I often hear that anything that can go wrong, will go wrong. I don't feel that way when I'm using my phone. I don't feel that way when I'm using my computer. I don't feel that way when I'm playing a game. Yet, when people are talking about capturing performances on a technical basis, this seems to be the story. I think it is because the tools that are used are poorly designed and rely too much on human operation. Humans doing technical tasks have a huge propensity for error.

Technology should be built that removes technical tasks from people so they can focus on story and performances. Good user interface design simplifies the complicated technical task for the creative user so they can freely create with very little friction.

Typical entertainment production does not invest in technology. They buy taped together ad-hoc tools. They do not design tools. They do not build tools. They do not envision the future.

Technology should not be developed by a cottage industry of high margin businesses to be sold at a yearly trade show. Technology should be designed with specific purpose by the creative artists themselves and ideally shared with other creators.

Virtual Production

Virtual Production is a concept that describes producing photography where at least part of the image was generated by a computer, in real time. A simple example of this is: Shooting an actor against a green screen and 3D tracking the camera's position and rotation in real time, so you can composite the actor on top of a 3D rendered background.
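The compositing half of that simple example can be sketched as below. This is a deliberately crude sketch: the background is assumed to have already been rendered from the tracked camera's position and rotation, the matte is a hard threshold on how much green dominates a pixel, and the function names and threshold value are illustrative assumptions rather than any particular keyer's method.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # RGB, each channel in 0.0..1.0


def green_matte(pixel: Pixel, threshold: float = 0.2) -> float:
    """Return alpha: 0.0 for green-screen pixels, 1.0 for foreground pixels."""
    r, g, b = pixel
    dominance = g - max(r, b)  # how much green exceeds the other channels
    return 0.0 if dominance > threshold else 1.0


def composite(camera: List[Pixel], background: List[Pixel]) -> List[Pixel]:
    """Place the keyed camera image over the rendered background, pixel by pixel."""
    out = []
    for fg, bg in zip(camera, background):
        a = green_matte(fg)
        out.append(tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg)))
    return out


camera_row = [(0.1, 0.9, 0.1), (0.8, 0.6, 0.5)]  # green screen, then skin tone
virtual_row = [(0.2, 0.3, 0.7), (0.2, 0.3, 0.7)]  # rendered sky behind both
print(composite(camera_row, virtual_row))
# [(0.2, 0.3, 0.7), (0.8, 0.6, 0.5)]
```

A production keyer would produce soft edges and remove green spill; the point here is only that every frame is a per-pixel choice between the photographed actor and the rendered world.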

Greenscreening and matte painting have been used for the past 60 years to put actors in environments that would previously be impossible or too expensive.

The difference now is that the actors can move around in the space, in real time, reacting to the virtual environment. The director can move the camera around and see the composition of the shot as they are shooting it. The actors and director can try things in the space. They can play in the environment.

Disney's The Jungle Book was a great example of one of the first productions that used a realtime game engine to visualize virtual sets. The director of Jungle Book, Jon Favreau, has really embraced the concept of virtual production. In his subsequent TV series, The Mandalorian, he uses a full LED walled environment for many of his scenes. The LED walls give the actors a sense of their environment, but more importantly the LED lights are used to light the actors realistically, as if they are in the real scene. So if the scene was on a sunset beach, they would be lit golden from one side, and cool and dark from the other. In 3D terms, this is called Global Illumination, or GI. Lack of GI in 3D rendering is what makes 3D look so fake. And having real GI on actors is what makes the Mandalorian's virtual scenes look so real.

Additionally, because they are completely surrounded by LEDs, they can use the camera's real-time position and rotation to project the camera's frustum as a green box behind the actors. The green box is only as big as what the camera sees. Because it's small, it reduces the amount of green light bouncing off the actors. This is very smart, because even though they are using a virtual set to light the actors and visualize the shots, they can still change the details of the virtual worlds later in post production.

In fact, it would really suck if you had to design immensely detailed 3D worlds ahead of production knowing that you might not even end up shooting parts of them.

The Mandalorian LED wall enclosure has some problems: installing an enclosure at that scale is difficult, the LED wall material is incredibly expensive, and it generates a ton of heat.

Therefore, the ideal virtual production system would be something that: is cheap, relatively easy to set up, provides photographic global illumination of the actors, and is easy to segment actors out of the image to be composited on a virtual world.

What if, instead of an LED wall enclosure, the scene was surrounded by disparate remote controlled LED light panels? You could place many of them around the periphery of the scene, loosely pointing at the center of the scene. A 360 degree camera placed in the center of the scene could perform a calibration procedure to get the exact position of each light, and as a result, the virtual world could be sampled to tell each light the exact color and intensity to emit. The result would be a very approximate, cheaper form of photographic GI.
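
To make the sampling step concrete, here is a minimal sketch. It assumes the virtual world has already been rendered into a small equirectangular environment map (a list of rows of RGB tuples) and that each panel's calibrated position is known relative to the scene center. The function names, the y-up coordinate convention, and the choice of facing -z as the middle of the map are all assumptions made for illustration, not part of any existing tool.

```python
import math

def direction_to_equirect(direction, width, height):
    """Map a 3D direction (x, y, z), with y up, to (row, col) in an equirectangular map."""
    x, y, z = direction
    length = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / length, y / length, z / length
    # Longitude: looking down -z lands in the middle of the map (assumed convention).
    u = math.atan2(x, -z) / (2 * math.pi) + 0.5
    # Latitude: 0 is straight up, 1 is straight down.
    v = math.acos(max(-1.0, min(1.0, y))) / math.pi
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return row, col

def panel_color(env_map, panel_position, scene_center=(0.0, 0.0, 0.0)):
    """Return the RGB value a light panel should emit, sampled from the virtual world."""
    height = len(env_map)
    width = len(env_map[0])
    # The panel stands in for whatever the virtual world shows in its direction.
    direction = tuple(p - c for p, c in zip(panel_position, scene_center))
    row, col = direction_to_equirect(direction, width, height)
    return env_map[row][col]

# A tiny 3x4 "sunset" map: sky on top, horizon in the middle, ground below.
SKY, GROUND = (0.2, 0.3, 0.6), (0.1, 0.1, 0.1)
COOL, DUSK, GOLD, WARM = (0.1, 0.1, 0.3), (0.3, 0.2, 0.4), (1.0, 0.7, 0.2), (0.9, 0.4, 0.1)
env_map = [
    [SKY, SKY, SKY, SKY],
    [COOL, DUSK, GOLD, WARM],
    [GROUND, GROUND, GROUND, GROUND],
]
```

A panel placed at `(1, 0, -1)` would be told to emit `GOLD`, while a panel on the opposite side at `(-1, 0, 1)` would emit `COOL`. A real system would average a region of the map per panel rather than a single pixel, and would scale intensity by distance.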

As for masking out the actors from the background for compositing, what if instead of visible light, infrared light was used? Humans can't perceive IR light, so it is invisible, and it can be used in smart ways to carry information. Remote controls work with IR light. VR controllers use IR light so they can be tracked by IR cameras. What if you had 2 identical cameras with the same lens? One would allow only IR to pass through, and the other would only allow visible light. Using a mirror, you capture both images. If you flooded the actors with infrared light from the camera, you would be able to use the infrared image as the alpha channel.

Even simpler, if you were able to capture infrared light as a separate channel alongside RGB on the same camera sensor, the setup would be even more elegant.

This solution assumes real photography is necessary. In the future, it won't be.

Imagine that your scene has been all storyboarded out. Why can't an actor just put on a lightweight vr headset that allows them to walk around freely in the space? The headset would track their body movement, and their facial expressions perfectly. Other actors could be in the same virtual space, coexisting, interacting with each other. Yet, they wouldn't have to be in the same physical space. The director and the other collaborators could be in the space. They could manipulate the scene, moving objects, directing actors, moving cameras. The entire production could be produced virtually. Also, because the facial expressions and voice could be transferred to any virtual likeness, the appearance of the actors could be anyone or anything. But most interestingly, because each character could be played by the same person, an entire production could be dreamed of, developed, directed, performed and produced by a single person.

If you ignore visual fidelity, that future is already now. Within 5 years, this type of production will be indistinguishable from reality.

Robotics

While we are still stuck in physical reality, the ability to move things and articulate them in physical space is really important. If I'm a single person making a Youtube video, I have to move the camera into the position I want it. I have to pan and tilt it to make sure the framing is good. If I move too much, I might be out of the frame, ruining the shot, because there is no other person to operate the camera. In the future, I should be able to at least remotely control the position and rotation of the camera. Given some simple rules, I should be able to let the camera operate itself, panning as I move. As a game developer, I would be able to make the autonomous movements of a 3D camera like that in 20 minutes. Why can't I build that today?

Within the past 2 years, robotics have gotten much cheaper. Sensing objects and environments in 3D has gotten much better and cheaper too.

So imagine a camera, attached to a robotic arm, mounted to a base platform. The base platform can move in any direction. The arm has 6 base points of articulation so the end of the arm can be in any x, y, z position and any x, y, z rotation within its reach. At the end of the arm is the camera, which is remotely controlled as well.

Instead of moving the camera, you pick up your phone. Through your phone, you see what the camera sees. Your phone is able to track its own position in 3D space, so the position and rotation of your phone can be translated to the relative position and rotation of the robot camera. You might even have specific constraints, like only moving along a predefined line, or orbiting around a specific point in space.
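
As a rough sketch of that mapping, the snippet below handles position only: the phone's movement since the start of the take is scaled and applied to the camera's starting position, and an optional constraint snaps the result onto a predefined line (like a virtual dolly track). The function names and the `scale` parameter are hypothetical, and rotation is left out to keep the example short.

```python
def relative_camera_target(phone_start, phone_now, camera_start, scale=1.0):
    """Move the camera by the same offset the phone has moved, optionally scaled."""
    return tuple(
        c + scale * (n - s)
        for c, n, s in zip(camera_start, phone_now, phone_start)
    )

def constrain_to_line(point, line_origin, line_direction):
    """Project a target point onto a line, so the camera only travels along it."""
    length_sq = sum(d * d for d in line_direction)
    t = sum((p - o) * d for p, o, d in zip(point, line_origin, line_direction)) / length_sq
    return tuple(o + t * d for o, d in zip(line_origin, line_direction))

# The phone moved 0.5 m right and 0.25 m up; the camera moves twice as far.
target = relative_camera_target(
    phone_start=(0.0, 0.0, 0.0),
    phone_now=(0.5, 0.25, 0.0),
    camera_start=(1.0, 1.0, 1.0),
    scale=2.0,
)
# Keep only the part of that move that lies along a track running in x.
on_track = constrain_to_line(target, line_origin=(0.0, 1.0, 1.0), line_direction=(1.0, 0.0, 0.0))
```

Here `target` is `(2.0, 1.5, 1.0)` and `on_track` is `(2.0, 1.0, 1.0)`: the vertical part of the phone movement is discarded because the track only runs sideways.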

Additionally, you would be able to record very specific camera movements that could be played back perfectly, every time.
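
One simple way to think about recording and playback is as a list of timestamped camera positions that gets interpolated on replay. The sketch below assumes positions only and straight-line interpolation between keyframes; the `sample_move` name and the keyframe layout are made up for illustration, and a real rig would also store rotation and use smoother curves.

```python
def sample_move(keyframes, t):
    """Return the camera position at time t from a recorded move.

    keyframes is a list of (time_in_seconds, (x, y, z)) sorted by time.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)  # 0 at the first keyframe, 1 at the second
            return tuple(a + w * (b - a) for a, b in zip(p0, p1))

recorded_move = [
    (0.0, (0.0, 1.5, 0.0)),
    (2.0, (2.0, 1.5, 4.0)),
    (3.0, (2.0, 2.5, 4.0)),
]
```

Because the recording is just data, `sample_move(recorded_move, 1.0)` returns the same position on every take.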

But most importantly, you could tell it how to perform as a semi autonomous camera. For example, you could frame the shot that you want, but you could tell it that you wanted to keep a particular subject in a particular part of the frame, allowing for a little wiggle room, but no sudden camera movements, and apply a human handheld type camera shake. Or you might want the camera to dolly slowly, keeping 2 characters within 2/3 of the screen no matter how much they move.
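
The "wiggle room, but no sudden movements" rule is the kind of thing game cameras do with a dead zone and a speed limit. Here is a minimal sketch of one pan update per frame, assuming the subject's horizontal position in the frame is already known as a number from 0 (left edge) to 1 (right edge). The function name and every default value are assumptions for illustration, and the handheld shake is left out.

```python
def pan_step(subject_x, target_x=0.5, dead_zone=0.1, gain=0.5, max_step=0.05):
    """Return how far to pan this frame, in normalized frame units.

    Positive values pan toward the right. Inside the dead zone the camera
    holds still; outside it, the correction is proportional but capped.
    """
    error = subject_x - target_x
    if abs(error) <= dead_zone:
        return 0.0  # wiggle room: small movements are ignored
    # Only react to the part of the error that is outside the dead zone.
    excess = error - dead_zone if error > 0 else error + dead_zone
    step = gain * excess
    # Cap the step so the camera never moves suddenly.
    return max(-max_step, min(max_step, step))
```

With the defaults, a subject at 0.55 produces no movement, a subject at 0.64 produces a small pan of about 0.02, and a subject that jumps to 0.9 is followed at the capped speed of 0.05 per frame.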

This is really easy to build in a virtual model (video game), and not all that difficult to build using robots and current technology.

If you think about putting physical movement in additional things like lights, you can imagine a concept of dynamic sets.

What if you wanted to go from one predefined setup to another? You could just advance the shot, and all of the lights would move to their new positions and the camera would move to its position and the shot would be ready to shoot in a minute.

On Set Editing

Why aren't you editing your scene together as it's being shot? Our tool, Shot Core, does a rough assembly edit in real time as you shoot. But why not really edit the scene together before you move onto the next scene? The whole scene is right there, and you can reshoot or try something different if you want. Bong Joonho does this. It's some basic no duh type shit.

4. Post Production

With heavy use of virtual production, the primary set of work is around visual compositing.

Editing

To minimize the expense of visual compositing, editing has to be completed first.

Compositing

Because most of the photographed scenes in the visual story are shot with virtual production, almost every shot of the story needs to be composited. This basically means the video clips of the actors need to be masked so that the background is removed. The result then needs to be composited on top of a video of the virtual world rendered from the position of the camera.

First, we need the camera track. We have the data from the 3D tracking device attached to the top of the camera during shooting. We take that data and also perform an optical camera track using only the video. The reality is that the optical camera track will work better in most cases because it is frame perfect. The 3D tracking device is more physically accurate, however there is noise in the data. Combining the best of both yields an excellent track that can be fed into the renderer.
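
One possible way to combine the two, sketched below for a single channel (say, camera x position per frame): smooth the tracker data to keep its physically accurate overall path, then add back the frame-to-frame detail from the optical track. This is a simple complementary filter chosen for illustration, not necessarily how a production pipeline would do it, and the function names and `radius` parameter are hypothetical.

```python
def moving_average(values, radius):
    """Smooth a list of numbers with a centered window of up to 2 * radius + 1 samples."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - radius):min(len(values), i + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

def fuse_tracks(tracker, optical, radius=2):
    """Combine a noisy but physically accurate track with a frame-accurate optical track."""
    if len(tracker) != len(optical):
        raise ValueError("both tracks need one sample per frame")
    tracker_low = moving_average(tracker, radius)  # overall path, noise averaged out
    optical_low = moving_average(optical, radius)
    # Fine detail = optical track minus its own smoothed version.
    return [t + (o - ol) for t, o, ol in zip(tracker_low, optical, optical_low)]

# The tracker jitters around 10.0 while the optical track sees no movement at all.
noisy_tracker = [10.0, 10.3, 9.7, 10.3, 9.7, 10.0]
steady_optical = [0.0] * 6
fused = fuse_tracks(noisy_tracker, steady_optical)
```

In this example the fused track stays much closer to 10.0 than the raw tracker does, because the jitter is averaged away and the optical track contributes no extra motion.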

Next, we use the recording from the 3D depth sensor mounted next to the camera. This gives us the distance between the camera and the actors or objects in the shot. For rendering, we will know the specific depth at which we want to set focus. We will render everything behind the plane at that depth as one image, and everything in front of it as another.

We will use those two images to sandwich the image of the actors.
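
The sandwich can be sketched per pixel as two "over" operations: the actor goes over the behind-the-plane render, then the in-front render goes over that. The snippet also shows splitting virtual objects by depth. All names here are hypothetical, colors are RGB tuples in the 0 to 1 range, and alpha is assumed to be a plain (non-premultiplied) value from 0 to 1.

```python
def split_by_depth(objects, focus_depth):
    """Split (name, depth) pairs into those behind and those in front of the actor plane."""
    behind = [name for name, depth in objects if depth > focus_depth]
    in_front = [name for name, depth in objects if depth <= focus_depth]
    return behind, in_front

def over(top_rgb, top_alpha, bottom_rgb):
    """Standard 'over' blend of one pixel on top of another."""
    return tuple(top_alpha * t + (1.0 - top_alpha) * b for t, b in zip(top_rgb, bottom_rgb))

def sandwich_pixel(behind_rgb, actor_rgb, actor_alpha, front_rgb, front_alpha):
    """Layer order: virtual background, then the actor, then virtual foreground."""
    with_actor = over(actor_rgb, actor_alpha, behind_rgb)
    return over(front_rgb, front_alpha, with_actor)

# Objects in the virtual scene with their distance from the camera in meters.
scene = [("mountains", 500.0), ("hut", 12.0), ("tall grass", 1.5)]
behind, in_front = split_by_depth(scene, focus_depth=4.0)
```

With the actors 4 meters away, `behind` is `["mountains", "hut"]` and `in_front` is `["tall grass"]`, so the grass would be rendered into the layer that covers the actors.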

Next, we need to isolate the actors. We can use the separate IR image as the mask. It needs to be calibrated per frame to match the visible light image perfectly. The contrast of the image should be boosted so that the actors are white and the background is black. Then, any black holes inside the actors need to be filled with white. Any large errors need to be hand painted.
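
The contrast boost and hole filling can be sketched on a tiny grayscale IR image stored as a list of rows with values from 0 to 1. The extreme form of a contrast boost is a hard threshold, and holes are found as black regions that cannot be reached from the image border. The function names and the `cutoff` value are assumptions for illustration; a production matte would keep soft edges instead of a hard 0 or 1.

```python
from collections import deque

def threshold(ir_image, cutoff=0.5):
    """Push every pixel to white (1) or black (0) based on IR brightness."""
    return [[1 if value >= cutoff else 0 for value in row] for row in ir_image]

def fill_holes(mask):
    """Turn black regions that are fully enclosed by white into white."""
    height, width = len(mask), len(mask[0])
    outside = [[False] * width for _ in range(height)]
    queue = deque()
    # Start from every black pixel on the border: those are true background.
    for r in range(height):
        for c in range(width):
            on_border = r in (0, height - 1) or c in (0, width - 1)
            if on_border and mask[r][c] == 0:
                outside[r][c] = True
                queue.append((r, c))
    # Flood fill through connected black pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and mask[nr][nc] == 0 and not outside[nr][nc]:
                outside[nr][nc] = True
                queue.append((nr, nc))
    # Anything not reachable from the border is actor, including former holes.
    return [[0 if outside[r][c] else 1 for c in range(width)] for r in range(height)]

ir_frame = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.9, 0.1],
    [0.1, 0.9, 0.2, 0.9, 0.1],  # a dark spot inside the actor
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
matte = fill_holes(threshold(ir_frame))
```

In the resulting `matte`, the dark spot in the middle of the actor becomes white while the surrounding background stays black.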

Ideally, this work will be done automatically. Put in the information, get out a rendered shot. Errors in the shots could be spot checked and manually fixed.

Other artistic composition elements can be placed, like dynamic volumes (fog, steam, explosions).

AI

AI or Machine Learning has produced some really interesting visual research in the past few years. Unfortunately, most of the novel demos are currently unusable on a practical basis. But there is some work that could be used procedurally that would speed up work by a ton. One really great example is "deep fakes", or face replacement. The quality is excellent and the technique can currently be used to do face replacement much faster and cheaper than manually compositing 3D face renderings on top of video.

The areas where I think AI will be particularly interesting are:

Rotoscoping - Masking subjects out of a video. In ML, this is usually called image segmentation or background removal.

Relighting - Take an existing video and change the position of the light sources to relight the video.

Face replacement - Replace a video clip's face with an existing face model.

Performance replacement - Take an existing shot with a performance and camera movement, and replace the performance matching the existing style of the video.

Dynamic retiming - Independently retime performance from camera movement.

Speech Editing / Speech Synthesis - Change words in a sentence in a performance.

Audio Cleanup - Eliminate background noise from noisy recordings.

Audio style matching - Given an existing recording of a voice actor, match the style with a new actor's voice.

5. Distribution

Visual storytellers want their visual stories to be seen by the people most likely to connect with them. Everyone watches visuals on their phones, computers, or TVs through the internet.

With the cost of production being much lower, you can specifically target groups that were typically considered niche, better serve that audience, and make a market that previously didn't exist.

Also, great stories are universal. If your story is great, you transcend your initial audience to much wider appeal.

Where are the people?

Fish where the fish are. Where are the fish? Youtube. Are fish anywhere else? Maybe, but who gives a shit? Youtube is massive. Distribution is free. And in the future, you will be able to post on demand paid content there.

First Act on Youtube

I don't understand why the first act of every movie isn't on Youtube. I don't understand why the first episode of every TV show isn't on Youtube. Everyone is on Youtube looking to watch something that connects with them. The first act of a movie sets up the world, introduces the character(s) in that world and then, BAM, something happens that forces the character(s) on a journey. If that shit is good, the only thing I'm thinking is, "WHAT HAPPENS NEXT?!?" I'm hooked. It costs $5 to watch the rest and everyone else says it's good? Done. Even if I have to go to another platform to watch the rest of the movie, I'm sold. If I watched the first episode of a great TV show, I'd be sold. It's free, and potentially infinitely scalable, marketing.

Why wouldn't every producer of visual stories post at least the first part of their story on Youtube? Because they are cowards. They want to sell their content trash to a handful of greater fools who don't know any better. Those fools want to package that shit into their streaming platform or sell it in the form of a fancy trailer. The producers are afraid, and generally correctly, that their stories are so bad that if people were to see even the first act, they wouldn't want to continue watching, let alone pay to watch.

Make the best shit. Put your shit out there and don't be a coward.
