Tyler Habowski Insight AI Project | Session AI.SV.2020A
Generate 3D models from text input to increase the speed and efficacy of the design process.
To run the code on this repo follow these steps:
- Download all data and update the configuration file, configs.py, to point at the download locations.
a. IMPORTANT NOTE: Downloading the ShapeNet / PartNet databases requires authorization from the organizers, which usually takes ~3-4 business days.
b. Once approved, download ShapeNetCore database called 'Archive of ShapeNetCore v2 release' from here.
c. PartNet requires filling out a separate form to download. That process can be started here.
d. Note that not all of the data in these databases is used, so only parts of the archives need to be unzipped, which saves significant time and disk space. Only the .solid.binvox files are used from ShapeNetCore, and only the .json files are used from PartNet to generate the descriptions. However, the programs assume the files keep the same relative folder structure as the archives.
- Train the shape encoder model using the vae.py file. Confirm model performance using the provided visualization methods.
- Gather data and generate descriptions for the objects.
a. Run partnetmeta.py, which gathers information from the PartNet database.
b. Then run descriptor.py, which uses that output to generate the randomized object descriptions.
- Train the text encoder model using text2shape.py. Confirm model performance using the provided visualization methods.
- Create TSNE plots using tsne.py, which generates a pandas CSV file with the relevant info for use in the Streamlit app.
- Run the streamlit_app.py file with Streamlit.
a. Navigate to the folder with the file in a terminal.
b. Run the command `streamlit run streamlit_app.py`.
The following files are used by the main programs:
- utils.py: Contains many commonly useful algorithms for a variety of tasks.
- textspacy.py: Class that contains the text encoder model.
- cvae.py: Class that contains the shape autoencoder model.
- logger.py: Class for easily organizing training information like logs, model checkpoints, plots, and configuration info.
- easy_tf2_log.py: Easy TensorBoard logging file, from here, modified for use with TensorFlow 2.0.
The demo is down now, as I've moved on to other projects and haven't had the time to maintain this; the info below is left up for posterity.
This tab allows you to input a description, and the generator will make a model based on that description. The 3D plotly viewer generally works much faster in Firefox than in Chrome, so use Firefox if Chrome is being slow.
The bottom of this tab shows similar descriptions to the input description. Use these samples to see new designs and learn how the model interprets the text.
- Table (8436)
- Chair (6778)
- Lamp (2318)
- Faucet (744)
- Clock (651)
- Bottle (498)
- Vase (485)
- Laptop (460)
- Bed (233)
- Mug (214)
- Bowl (186)
This tab shows the plot of the shape embedding vectors reduced from the full model dimensionality of 128 dimensions down to 2 so they can be viewed easily. The method for dimensionality reduction was TSNE.
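The 128-to-2 reduction described above can be sketched with scikit-learn's TSNE. This is an assumption about the approach: tsne.py may use a different implementation or parameters, and the column names below are made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.manifold import TSNE

# Stand-in for the 128-D shape embedding vectors from the trained encoder.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(60, 128))
categories = rng.choice(["Table", "Chair", "Lamp"], size=60)

# Reduce the 128 dimensions down to 2 for plotting.
coords = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(embeddings)

# Save a CSV for the Streamlit app (hypothetical column names).
df = pd.DataFrame({"x": coords[:, 0], "y": coords[:, 1], "category": categories})
df.to_csv("tsne_points.csv", index=False)
```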
- Color data
- This selector box sets what determines the color of the dots (the class selections are particularly interesting!)
- Plot dot size
- This sets the dot size. Helpful when zooming in on a region.
- Model IDs
- This allows for putting in multiple model IDs to see how they're connected on the graph.
- Anno IDs to view
- From the hover text on the TSNE plot points you can find the 'Anno ID' (annotation ID); enter it into this box to see a render of the object and one of its generated descriptions.
- Multiple IDs can be entered and separated by commas.
- The renders can be viewed in the sidebar or in the main area below the TSNE graph.
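Splitting the comma-separated ID input takes only a few lines; this helper is hypothetical (the app's actual parsing may differ):

```python
def parse_id_list(text):
    """Split a comma-separated string of Anno IDs, dropping blanks
    and surrounding whitespace (illustrative helper only)."""
    return [tok.strip() for tok in text.split(",") if tok.strip()]
```

For example, `parse_id_list("123, 456 ,789")` yields `["123", "456", "789"]`.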
Additionally, using the plotly interface you can double click on a category in the legend to show only that category of dots. Or click once to toggle showing that category. You can also zoom in on specific regions to see local clustering in which case it may be useful to increase the plot dot size.
The shape embeddings are very well clustered according to the different shape classes, but also according to sub-categories inside those classes. By playing with the color data, it can be seen that the clusters are also organized strongly by specific attributes of the object, such as its overall width, length, or height.
This tab is just for fun and is intended to show how well the model can interpolate between various object models. Note that this runs the model many times and as such can be quite slow online. You may need to hit 'stop' and then 'rerun' from the menu in the upper right corner to make it behave properly.
To generate these plots, the algorithm finds the nearest K shape embedding vectors (K set by the variety parameter in the sidebar) and randomly picks one of them. Then it interpolates between the current vector and the random new vector and at every interpolated point it generates a new model from the interpolated latent space vector. Then it repeats to find new vectors.
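The wander described above can be sketched as follows. This is a simplified NumPy version under stated assumptions: the real app decodes each interpolated vector through the shape decoder (omitted here), and the function and parameter names are made up:

```python
import numpy as np

def latent_walk(embeddings, start, k=5, steps=8, rng=None):
    """Yield interpolated latent vectors: from the current vector, pick one
    of its k nearest neighbours at random (k ~ the variety parameter),
    interpolate toward it, then repeat from the new vector."""
    rng = rng or np.random.default_rng()
    current = embeddings[start]
    while True:
        # Nearest neighbours by Euclidean distance, skipping the point itself.
        dists = np.linalg.norm(embeddings - current, axis=1)
        neighbours = np.argsort(dists)[1:k + 1]
        target = embeddings[rng.choice(neighbours)]
        for t in np.linspace(0.0, 1.0, steps):
            # Each yielded vector would be decoded into a 3D model.
            yield (1 - t) * current + t * target
        current = target
```

A larger `k` lets the walk jump to more dissimilar shapes, which is how the variety parameter increases diversity.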
- Starting shape
- This sets the starting category for the algorithm, but it will likely wander off into other categories after a bit.
- Variety parameter
- This determines the diversity of the models by setting how many local vectors to choose from.
Selected results from the streamlit app:
Interpolating between various swivel chairs:
Interpolating between various random sofas:
(many more gifs available in the media/ folder of this repo)