oaishi/3DScene_from_text

Static and Animated 3D Scene Generation from Free-form Text Descriptions

We propose a novel approach to generate 3D scenes (both static and animated) from free-form text descriptions using a Transformer-based NLP architecture and a non-differentiable renderer.

Dependencies

  1. Blender (version 2.78c)
  2. PyTorch (==1.2.0)
  3. Transformers (==3.0.2)
  4. Numpy
  5. Pickle
  6. OpenCV

A sample environment .yml file has been added for reference.
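Since specific versions are pinned (PyTorch 1.2.0, Transformers 3.0.2), a quick stdlib-only check can confirm that the required packages are importable before running the scripts. This helper is an illustrative sketch, not part of the repository; the module names are assumptions based on the list above.

```python
import importlib.util

# Import names assumed from the dependency list above;
# note that "cv2" is the import name for OpenCV.
REQUIRED_MODULES = ["torch", "transformers", "numpy", "pickle", "cv2"]

def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the subset of modules that cannot be imported."""
    return [m for m in modules if importlib.util.find_spec(m) is None]
```

Running `missing_dependencies()` before the first prediction run gives an immediate list of anything left to install.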

A Few Examples

Each example pairs a text description with its ground-truth and generated scenes (the images are omitted here; see the repository for the renders):

  1. "A rocking cyan matte sphere, a small rocking gray matte object, a small rocking shiny cylinder, a large rocking blue matte object, a spinning blue matte cube, a small moving brown sphere and a rocking blue shiny cube."
  2. "Draw a large yellow colored cylinder of matte texture, a large cyan colored cube of matte texture, a large brown colored cylinder of shiny texture, a large red colored cube of matte texture and a large brown colored cylinder of shiny texture."

All other examples are under the 'Output' folder.

Run Prediction

To run prediction with the M_static model -

cd scripts
python runner.py --type "image" --target "image" --pred_count 15

Here, pred_count specifies the number of predictions to run. For evaluation, 64 sample test files have been attached.

To run prediction with the M_animated model -

cd scripts
python runner.py --type "video" --target "video" --pred_count 15

To run prediction with the M_full model -

cd scripts
python runner.py --type "combined" --target "image" --pred_count 15

Replace target with "video" to generate videos instead. Rendering an image takes around 3-4 seconds; rendering a video takes around 2-4 minutes.

All generated images (static scenes) and videos (animated scenes) are saved in the output folder.
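The three model variants map onto runner.py flags in a regular way. As an illustration, a small helper (hypothetical, not part of the repository) can assemble the command line for a given model:

```python
# Mapping from model variant to the --type flag of runner.py.
# The flag values come from the commands documented above; the
# helper itself is illustrative and not part of the repository.
MODEL_TYPES = {
    "M_static": "image",
    "M_animated": "video",
    "M_full": "combined",
}

def prediction_command(model, target="image", pred_count=15):
    """Build the argument list for a runner.py prediction run."""
    if model not in MODEL_TYPES:
        raise ValueError(f"unknown model: {model}")
    return [
        "python", "runner.py",
        "--type", MODEL_TYPES[model],
        "--target", target,
        "--pred_count", str(pred_count),
    ]
```

For example, `prediction_command("M_full", target="video")` reproduces the M_full command above with video output.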

Run Evaluation

To run evaluation with the M_static model -

cd scripts
python runner.py --type "image" --sector "evaluate"

To run evaluation with the M_animated model -

cd scripts
python runner.py --type "video" --sector "evaluate" 

To run evaluation with the M_full model -

cd scripts
python runner.py --type "combined" --sector "evaluate"

Run Prediction on User-given Input

For the M_static model -

cd scripts
python runner.py --type "image" --sector "predict_single" --description <YOUR_DESCRIPTION> 

For the M_animated model -

cd scripts
python runner.py --type "video" --sector "predict_single" --description <YOUR_DESCRIPTION> 

For the M_full model -

cd scripts
python runner.py --type "combined" --sector "predict_single" --description <YOUR_DESCRIPTION> 

Dataset Generation

Our dataset is generated on top of the CLEVR dataset. The CLEVR dataset ships with JSON files describing all of its scenes. We take these JSON files and generate 13 kinds of scene descriptions for each one. Follow these steps to generate the dataset:
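As an illustration of the idea (not the repository's actual generation code), one simple description style can be produced from a CLEVR-style scene JSON as sketched below. The field names (`objects`, `size`, `color`, `material`, `shape`) follow the public CLEVR scene format; the phrasing is a hypothetical example.

```python
import json

def describe_scene(scene):
    """Render one simple textual description of a CLEVR-style scene."""
    phrases = [
        f"a {o['size']} {o['color']} {o['material']} {o['shape']}"
        for o in scene["objects"]
    ]
    if len(phrases) > 1:
        return ", ".join(phrases[:-1]) + " and " + phrases[-1] + "."
    return phrases[0] + "."

# A minimal hand-written scene in the CLEVR JSON layout:
scene = json.loads("""
{"objects": [
  {"size": "large", "color": "cyan", "material": "rubber", "shape": "sphere"},
  {"size": "small", "color": "blue", "material": "metal", "shape": "cube"}
]}
""")
# describe_scene(scene)
# -> "a large cyan rubber sphere and a small blue metal cube."
```

Varying the templates, attribute order, and wording (e.g. mapping "rubber"/"metal" to "matte"/"shiny" as in the examples above) yields multiple description styles per scene.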
