# ShapeNetSEM Subset Creation

The ShapeNetSem dataset is large, with roughly 12k meshes. This notebook creates a smaller subset of the dataset that can be used for experimentation and for building a proof-of-concept model.

## 1. Creating a subset of the dataset

In [1]:
import pandas as pd
import random

In [2]:
metadata_df = pd.read_csv('../Data/ShapeNetSem/Files/metadata.csv')
categories_df = pd.read_csv('../Data/ShapeNetSem/Files/categories.synset.csv')

In [3]:
metadata_df.head()

Unnamed: 0,fullId,category,wnsynset,wnlemmas,up,front,unit,aligned.dims,isContainerLike,surfaceVolume,solidVolume,supportSurfaceArea,weight,staticFrictionForce,name,tags
0,wss.1004f30be305f33d28a1548e344f0e2e,WallArt,n3445436,globe,"0\,0\,1","0\,-1\,0",0.015032,"111.97104\,84.16881\,0.0",,,,,,,globe blue spruce,*
1,wss.100f39dce7690f59efb94709f30ce0d2,"Chair,Recliner",n4069540,"recliner,reclining chair,lounger","0\,0\,1","0\,-1\,0",0.012947,"111.34567\,100.547745\,96.13275",,,,,,,couch,carpet recliner
2,wss.101354f9d8dede686f7b08d9de913afe,"Speaker,_Attributes",n3696785,"loudspeaker,speaker,speaker unit,loudspeaker s...",,,0.01362,"43.43313\,60.591843\,32.17259",,,,,,,sound speaker,"audio,audio speaker,music,sound,sound speaker,..."
3,wss.1018f01d42ae7fad52249d8432f6087e,Sword,n4380981,"sword,blade,brand,steel",,,0.010424,"78.23693\,4.360932\,18.058533",,,,,,,sword,"blade,sword,weapon"
4,wss.1022fe7dd03f6a4d4d5ad9f13ac9f4e7,"Chair,OfficeChair",n3005231,chair,"0\,0\,1","0\,1\,0",0.017984,"60.366123\,98.00925\,66.79712",,,,,,,office chair,"fauteuil de bureau,office chair"


In [4]:
categories_df.head()

Unnamed: 0,category,matchLevel,synset,synset words,synset gloss
0,1Shelves,0,n4197095,shelf,a support that consists of a horizontal surfac...
1,2Shelves,0,n4197095,shelf,a support that consists of a horizontal surfac...
2,3Shelves,0,n4197095,shelf,a support that consists of a horizontal surfac...
3,4Shelves,0,n4197095,shelf,a support that consists of a horizontal surfac...
4,5Shelves,0,n4197095,shelf,a support that consists of a horizontal surfac...


In [5]:
num_cat_sample = 10
meshes_per_cat = 10

sampled_categories = categories_df.sample(n=num_cat_sample, random_state=42)
sampled_metadata_df = pd.DataFrame()
print(type(sampled_categories))
for category in sampled_categories['category']:
    print(category)

"""
final_df = pd.merge(sampled_metadata_df, categories_df[['category', 'synset words', 'synset gloss']], on='category', how='left')
final_df = final_df[['fullId', 'category', 'name', 'tags', 'synset words', 'synset gloss']]
final_df.head()"""

<class 'pandas.core.frame.DataFrame'>
AccentChair
Tank
Ladder
Rabbit
ComputerMouse
Wallet
Vanity
Motorcycle
Butterfly
Ottoman


"\nfinal_df = pd.merge(sampled_metadata_df, categories_df[['category', 'synset words', 'synset gloss']], on='category', how='left')\nfinal_df = final_df[['fullId', 'category', 'name', 'tags', 'synset words', 'synset gloss']]\nfinal_df.head()"

In [6]:
num_cat_sample = 20
meshes_per_cat = 10

sampled_categories = categories_df.sample(n=num_cat_sample, random_state=42)
sampled_metadata_df = pd.DataFrame()

for category in sampled_categories['category']:
    try:
        category_metadata = metadata_df[metadata_df['category'].str.contains(category, case=False, na=False)]
        sample_size = min(meshes_per_cat, len(category_metadata))
        sampled_category_metadata = category_metadata.sample(n=sample_size, random_state=42)
        sampled_metadata_df = pd.concat([sampled_metadata_df, sampled_category_metadata])
        if sample_size < meshes_per_cat:
            print(category, sample_size)
    except ValueError:
        print(category)
        continue

final_df = pd.merge(sampled_metadata_df, categories_df[['category', 'synset words', 'synset gloss']], on='category', how='left')
final_df = final_df[['fullId', 'category', 'name', 'tags', 'synset words', 'synset gloss']]
final_df.shape

Ladder 4
Rabbit 0
Wallet 1
Vanity 4
Motorcycle 7
Butterfly 1
Pedestal 4


(151, 6)
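
Note that `str.contains` in the cell above performs a case-insensitive substring match, so a category such as `Chair` would also select rows tagged `AccentChair` or `OfficeChair`. If whole-category matching is wanted instead, one option is to split the comma-separated `category` field and compare tokens. The sketch below is only an illustration under that assumption: `rows_in_category` is a hypothetical helper, the toy data is invented, and unlike the original cell the comparison is case-sensitive.

```python
import pandas as pd

def rows_in_category(metadata_df, category):
    """Return rows whose comma-separated 'category' field lists `category` as a whole token.

    Hypothetical helper, not part of the original notebook.
    """
    # Missing categories become empty strings so that .str.split always yields a list
    tokens = metadata_df['category'].fillna('').str.split(',')
    mask = tokens.apply(lambda parts: category in [p.strip() for p in parts])
    return metadata_df[mask]

# Toy frame (invented data) mimicking the multi-category strings in metadata.csv
toy = pd.DataFrame({
    'fullId': ['a', 'b', 'c', 'd'],
    'category': ['Chair,OfficeChair', 'Chair', 'AccentChair', None],
})

chair_rows = rows_in_category(toy, 'Chair')
substring_rows = toy[toy['category'].str.contains('Chair', case=False, na=False)]

print(chair_rows['fullId'].tolist())      # whole-token match
print(substring_rows['fullId'].tolist())  # substring match, as in the cell above
```

On the toy data the substring match also picks up the `AccentChair` row, while the token match does not.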

In [7]:
def create_subset_by_random_sampling(metadata_df, categories_df, num_cat_sample = 20, meshes_per_cat = 10):
    remaining_categories = categories_df['category'].to_list()
    sampled_categories = []
    sampled_metadata_df, metadata_with_category_df, final_df = pd.DataFrame(), pd.DataFrame(), pd.DataFrame()

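    while len(sampled_categories) < num_cat_sample and remaining_categories: # stop once every category has been tried, otherwise random.choice raises IndexError on an empty list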
        sampled_category = random.choice(remaining_categories) # select a category at random
        metadata_entries = metadata_df[metadata_df['category'].str.contains(sampled_category, case=False, na=False)] # for cases where there might be more than one category
        num_samples = len(metadata_entries)

        if num_samples >= meshes_per_cat: # make sure there are minimum meshes and only sample in that case
            sample_size = min(meshes_per_cat, len(metadata_entries))
            sampled_metadata_row = metadata_entries.sample(n=sample_size, random_state=42)
            sampled_metadata_df = pd.concat([sampled_metadata_df, sampled_metadata_row])
            sampled_categories.append(sampled_category)

        remaining_categories.remove(sampled_category)
    # print(sampled_metadata_df.shape)
    sampled_metadata_df.reset_index(inplace=True)

    # Joining category info with metadata info
    for _, row in sampled_metadata_df.iterrows():
        # Split the categories by commas (for cases like 'Laptop, PC')
        categories_list = row['category'].split(',')
        synset_words_cat, synset_gloss_cat = [], []
        category_row = pd.DataFrame()
        flag = False
        for category in categories_list:
            category_row = pd.DataFrame()
            if '_' not in category: # No attribute categories
                category = category.strip()
                category_row = categories_df[categories_df['category'] == category] # Find category info
                synset_words = category_row['synset words'].to_list()
                synset_gloss = category_row['synset gloss'].to_list()

                # To check that it has not already been added to the list, gets rid of duplicate category info problem (chair was getting duplicated as same info for multiple categories eg Chair and OfficeChair)
                if not set(synset_words).issubset(set(synset_words_cat)):
                    synset_words_cat.extend(synset_words) # Add info to lists
                if not set(synset_gloss).issubset(set(synset_gloss_cat)):
                    synset_gloss_cat.extend(synset_gloss) # Add info to lists
                flag = True

        if flag:
            # Create a new row with the current metadata and synset info
            expanded_row = row.copy()  # Copy the current row
            expanded_row['synset words'] = synset_words_cat # Use extended list of all sub categories 
            expanded_row['synset gloss'] = synset_gloss_cat # Use extended list of all sub categories
            
            # Append the expanded row to the new DataFrame
            metadata_with_category_df = pd.concat([metadata_with_category_df, expanded_row.to_frame().T], ignore_index=True)
        else:
            print(row['category'], "not available") # report the full category string of the row, not just the last token

    if not metadata_with_category_df.empty:
        final_df = metadata_with_category_df[['fullId', 'category', 'name', 'tags', 'synset words', 'synset gloss']]
        final_df = final_df.reset_index(drop=True) # reset_index returns a new DataFrame, so the result must be assigned
        
    return final_df

trial = create_subset_by_random_sampling(metadata_df, categories_df, num_cat_sample=1, meshes_per_cat=2)
trial

Unnamed: 0,fullId,category,name,tags,synset words,synset gloss
0,wss.5b009a44661c47e2a4ee05a5737b7178,"DrinkingUtensil,WineGlass",wine glass,"glass,red wine,red wine glass,white wine glass...","[container, wineglass]",[any object that can be used to hold things (e...
1,wss.9d5ae32fe7c6f825c8e7963449f8577,"DrinkingUtensil,Cup",drink,"bourbon,cup,drink,glass,ice,liquor,whiskey","[container, cup]",[any object that can be used to hold things (e...
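
The function above picks categories with `random.choice`, which draws from Python's global random state. The notebook does not seed that state, so the selected categories can differ between runs even though the rows within each category are fixed by `random_state=42`. A minimal sketch of seeded category selection follows, assuming a hypothetical helper `pick_categories` and a `seed` parameter that are not part of the original function:

```python
import random

def pick_categories(categories, k, seed=0):
    """Return up to k categories in a seed-determined random order (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    pool = list(categories)    # copy so the caller's list is not mutated
    rng.shuffle(pool)
    return pool[:k]

names = ['Chair', 'Tank', 'Ladder', 'Wallet', 'Ottoman']
first = pick_categories(names, k=3, seed=42)
second = pick_categories(names, k=3, seed=42)
print(first == second)  # the same seed gives the same selection
```

The original function additionally skips categories with too few meshes; that filtering would still have to be applied while walking the shuffled list.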


## 2. Creating a semantic descriptor (text prompt)

In [8]:
import pandas as pd
import random

We need to create a natural-language semantic descriptor from the `name`, `tags`, `synset words` and `synset gloss` columns. The descriptors should also vary in wording to improve generalisation. We use two approaches to create the descriptors that will serve as the text prompt for our multimodal model:
1. Template approach: substitute the column values into a fixed template
2. LLM approach: pass the values along with a prompt to an LLM and use the generated natural-language description

### 2.1 Template Descriptor

Example templates and their outputs:

- **Template:** "A [name] which is commonly known as [tags]. It is associated with the following characteristics: [synset words]. A general description of this item is: [synset gloss]."
- **Output:** "A Wooden Chair, which is commonly known as furniture, chair, seating. It is associated with the following characteristics: chair, seating, armchair. A general description of this item is: A piece of furniture used for sitting."

- **Template:** "The [name] is a [synset words] often used for [tags]. It is made of [material] and can be described as: [synset gloss]."
- **Output:** "The Wooden Chair is a chair, seating, armchair often used for furniture, seating. It is made of wood and can be described as: A piece of furniture used for sitting."

- **Template:** "[name] is a [synset words] designed for [tags]. It is a [material] item that serves the purpose of [synset gloss]."
- **Output:** "Wooden Chair is a chair, seating designed for furniture, seating. It is a wooden item that serves the purpose of: A piece of furniture used for sitting."


Example for a chair:

* Name: "Wooden Chair"
* Tags: "furniture, chair, seating"
* Synset Words: "chair, seating, armchair"
* Synset Gloss: "A piece of furniture used for sitting."

The generated description could be: "A Wooden Chair, which is commonly known as furniture, chair, seating. It is associated with the following characteristics: chair, seating, armchair. A general description of this item is: A piece of furniture used for sitting."

In [9]:
template1 = """A [name] which is commonly known as [tags]. It is associated with the following characteristics: [synset words].
A general description of this item is: [synset gloss]."""

template2 = """The [name] is a [synset words] often used for [tags]. It can be described as: [synset gloss]."""

template3 = """[name] is a [synset words] designed for [tags]. It serves the purpose of [synset gloss]."""

In [10]:
ssm_with_descriptor = create_subset_by_random_sampling(metadata_df, categories_df, num_cat_sample=20, meshes_per_cat=10)
ssm_with_descriptor.to_csv('testing_desc.csv', index=False)

In [11]:
def apply_template(row, template):
    tags = row['tags']
    if pd.isna(tags) or tags == "*":
        tags = "<no_tags>"
    elif isinstance(tags, list):
        tags = ", ".join(tags)
    
    synset_words = row['synset words']
    if isinstance(synset_words, list):
        synset_words = [item for item in synset_words if not pd.isna(item)]
        if not synset_words:
            synset_words = "<blank>"
        else:
            synset_words = ", ".join(synset_words)
    
    synset_gloss = row['synset gloss']
    if isinstance(synset_gloss, list):
        synset_gloss = [item for item in synset_gloss if not pd.isna(item)]
        if not synset_gloss:
            synset_gloss = "<blank>"
        else:
            synset_gloss = ", ".join(synset_gloss)
        
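    name = row['name']
    if pd.isna(name): # guard: str.replace raises TypeError if the name is NaN
        name = "<no_name>" # placeholder chosen here to mirror "<no_tags>"

    return (template.replace("[name]", name)
                    .replace("[tags]", tags)
                    .replace("[synset words]", synset_words)
                    .replace("[synset gloss]", synset_gloss))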
    

In [12]:
ssm_with_descriptor['template1_desc'] = ssm_with_descriptor.apply(lambda row: apply_template(row, template1), axis=1)
ssm_with_descriptor['template2_desc'] = ssm_with_descriptor.apply(lambda row: apply_template(row, template2), axis=1)
ssm_with_descriptor['template3_desc'] = ssm_with_descriptor.apply(lambda row: apply_template(row, template3), axis=1)

In [13]:
ssm_with_descriptor.head()

Unnamed: 0,fullId,category,name,tags,synset words,synset gloss,template1_desc,template2_desc,template3_desc
0,wss.9a9508597dee231d4e205745311c3a,USBStick,usb memory stick sketchyphysics,"computer,drive,flash,memory,port,sketchyphysic...","[memory device,storage device]",[a device that preserves information for retri...,A usb memory stick sketchyphysics which is c...,The usb memory stick sketchyphysics is a mem...,usb memory stick sketchyphysics is a memory ...
1,wss.71f6598d5426fb34c33dcf45f2780ed8,USBStick,red usb,,"[memory device,storage device]",[a device that preserves information for retri...,A red usb which is commonly known as <no_tag...,"The red usb is a memory device,storage devic...","red usb is a memory device,storage device de..."
2,wss.6de307ddb4317eae1c816f27d2a33b03,USBStick,cig usb drive,"cig,cigarette,flash drive,smoke,thumb drive,us...","[memory device,storage device]",[a device that preserves information for retri...,A cig usb drive which is commonly known as cig...,"The cig usb drive is a memory device,storage d...","cig usb drive is a memory device,storage devic..."
3,wss.ab82d56cf9cc2476d154e1b098031d39,USBStick,mp,"music,usb item","[memory device,storage device]",[a device that preserves information for retri...,"A mp which is commonly known as music,usb item...","The mp is a memory device,storage device often...","mp is a memory device,storage device designed ..."
4,wss.353fe70ec5633fc5e05878fff8971272,USBStick,usb drive,,"[memory device,storage device]",[a device that preserves information for retri...,A usb drive which is commonly known as <no_tag...,"The usb drive is a memory device,storage devic...","usb drive is a memory device,storage device de..."


In [14]:
ssm_with_descriptor.shape

(200, 9)
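
Each row now carries three alternative descriptors. One possible way to get the text variation mentioned at the start of this section is to pick one of the three template columns at random for each sample. This is only a sketch of that idea: the helper `pick_descriptor`, the `text_prompt` column and the toy strings are assumptions and do not appear elsewhere in the notebook.

```python
import random
import pandas as pd

TEMPLATE_COLUMNS = ['template1_desc', 'template2_desc', 'template3_desc']

def pick_descriptor(row, rng):
    """Return the text of one randomly chosen template column (hypothetical helper)."""
    return row[rng.choice(TEMPLATE_COLUMNS)]

# Toy frame (invented strings) with the same three descriptor columns as ssm_with_descriptor
toy = pd.DataFrame({
    'template1_desc': ['A red usb which is ...', 'A usb drive which is ...'],
    'template2_desc': ['The red usb is a ...', 'The usb drive is a ...'],
    'template3_desc': ['red usb is a ...', 'usb drive is a ...'],
})

rng = random.Random(0)  # seeded so that the choice is repeatable
toy['text_prompt'] = toy.apply(lambda row: pick_descriptor(row, rng), axis=1)
print(toy['text_prompt'].tolist())
```

Re-drawing the choice every epoch instead of once would give more variation, at the cost of a less reproducible dataset file.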

In [22]:
# Remove the 'wss.' prefix from fullId because it is not present in the file structure

ssm_with_descriptor['fullId'] = ssm_with_descriptor['fullId'].apply(lambda x: x.replace('wss.', ''))

In [23]:
ssm_with_descriptor.head()

Unnamed: 0,fullId,category,name,tags,synset words,synset gloss,template1_desc,template2_desc,template3_desc
0,9a9508597dee231d4e205745311c3a,USBStick,usb memory stick sketchyphysics,"computer,drive,flash,memory,port,sketchyphysic...","[memory device,storage device]",[a device that preserves information for retri...,A usb memory stick sketchyphysics which is c...,The usb memory stick sketchyphysics is a mem...,usb memory stick sketchyphysics is a memory ...
1,71f6598d5426fb34c33dcf45f2780ed8,USBStick,red usb,,"[memory device,storage device]",[a device that preserves information for retri...,A red usb which is commonly known as <no_tag...,"The red usb is a memory device,storage devic...","red usb is a memory device,storage device de..."
2,6de307ddb4317eae1c816f27d2a33b03,USBStick,cig usb drive,"cig,cigarette,flash drive,smoke,thumb drive,us...","[memory device,storage device]",[a device that preserves information for retri...,A cig usb drive which is commonly known as cig...,"The cig usb drive is a memory device,storage d...","cig usb drive is a memory device,storage devic..."
3,ab82d56cf9cc2476d154e1b098031d39,USBStick,mp,"music,usb item","[memory device,storage device]",[a device that preserves information for retri...,"A mp which is commonly known as music,usb item...","The mp is a memory device,storage device often...","mp is a memory device,storage device designed ..."
4,353fe70ec5633fc5e05878fff8971272,USBStick,usb drive,,"[memory device,storage device]",[a device that preserves information for retri...,A usb drive which is commonly known as <no_tag...,"The usb drive is a memory device,storage devic...","usb drive is a memory device,storage device de..."


In [24]:
ssm_with_descriptor.to_csv('subset.csv', index=False)
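
One thing to keep in mind when loading `subset.csv` later: `to_csv` writes list-valued cells such as `synset words` as their string representation, so `read_csv` returns plain strings rather than lists. A minimal sketch of parsing them back with `ast.literal_eval` is shown below. It uses an in-memory buffer and invented data instead of the real file, and it assumes the lists contain only strings (a `nan` entry would make `literal_eval` fail).

```python
import ast
import io
import pandas as pd

# Toy frame (invented data) with a list-valued column like 'synset words'
df = pd.DataFrame({
    'fullId': ['abc123'],
    'synset words': [['container', 'cup']],
})

buffer = io.StringIO()  # stands in for subset.csv
df.to_csv(buffer, index=False)
buffer.seek(0)

# Without a converter the column would come back as the string "['container', 'cup']"
loaded = pd.read_csv(buffer, converters={'synset words': ast.literal_eval})
print(loaded.loc[0, 'synset words'])
```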

### 2.2 LLM Generated Descriptor

Given below is an example prompt to the LLM:

[prompt start]

You are an expert in 3D object recognition. Given the following attributes of an object, generate a detailed and natural language description:

- Name: {name}
- Tags: {tags}
- Synset Words: {synset_words}
- Synset Gloss: {synset_gloss}

Based on these, describe the object as if explaining it to someone who has never seen it before.

[prompt end]

Such an approach could result in a more natural language descriptor that goes beyond the listed values.
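
The prompt above can be filled in per row with ordinary string formatting. The sketch below only builds the prompt text and does not call any LLM, since the notebook does not specify one. The helper names `build_prompt` and `_as_text` are assumptions, and the example row reuses the wooden chair values from section 2.1.

```python
import pandas as pd

PROMPT_TEMPLATE = (
    "You are an expert in 3D object recognition. Given the following attributes of an object, "
    "generate a detailed and natural language description:\n\n"
    "- Name: {name}\n"
    "- Tags: {tags}\n"
    "- Synset Words: {synset_words}\n"
    "- Synset Gloss: {synset_gloss}\n\n"
    "Based on these, describe the object as if explaining it to someone who has never seen it before."
)

def _as_text(value):
    """Turn a list or scalar cell into plain text; missing values become an empty string."""
    if isinstance(value, list):
        return ", ".join(str(v) for v in value if not pd.isna(v))
    return "" if pd.isna(value) else str(value)

def build_prompt(row):
    """Fill the prompt template from one row of the subset (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        name=_as_text(row['name']),
        tags=_as_text(row['tags']),
        synset_words=_as_text(row['synset words']),
        synset_gloss=_as_text(row['synset gloss']),
    )

example_row = pd.Series({
    'name': 'Wooden Chair',
    'tags': 'furniture, chair, seating',
    'synset words': ['chair', 'seating', 'armchair'],
    'synset gloss': ['A piece of furniture used for sitting.'],
})
prompt = build_prompt(example_row)
print(prompt)
```

The resulting string would then be sent to whichever LLM is chosen, and the response stored in a new column next to the template descriptors.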

## 3. Creating Snapshots for Image Modality