<td>
   <a target="_blank" href="https://labelbox.com" ><img src="https://labelbox.com/blog/content/images/2021/02/logo-v4.svg" width=256/></a>
</td>

<td>
<a href="https://colab.research.google.com/drive/1XSaiJlER0cC0yiekCg1eb9CuQw7lPOTL" target="_blank"><img
src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
</td>

<td>
<a href="https://github.com/Labelbox/labelpandas/blob/main/notebooks/metadata.ipynb" target="_blank"><img
src="https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white" alt="GitHub"></a>
</td>

# _**Creating Data Rows with Metadata with LabelPandas**_

## _**Documentation**_

### **Data Rows**
_____________________

**Requirements:**

- A `row_data` column: each value must be a URL that points to the asset to upload

- Either a `dataset_id` column or an input argument for `dataset_id`
  - If uploading to multiple datasets, provide a `dataset_id` column 
  - If uploading to one dataset, provide a `dataset_id` input argument
    - _This can still be a column if it's already in your CSV file_
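The requirements above can be sketched as a quick pre-upload check. This is a minimal illustration in plain Python, not LabelPandas code: `check_required_columns` is a hypothetical helper, and it only looks at column names, not at whether the `row_data` values are valid URLs.

```python
def check_required_columns(columns, dataset_id=None):
    """Return a list of problems found in the column names (empty if none).

    Hypothetical helper for illustration; not part of LabelPandas.
    """
    problems = []
    # A `row_data` column is always required.
    if "row_data" not in columns:
        problems.append("missing required column: row_data")
    # `dataset_id` may come from a column or from an input argument.
    if "dataset_id" not in columns and not dataset_id:
        problems.append("provide a dataset_id column or a dataset_id argument")
    return problems


# One dataset: pass the ID as an argument (placeholder value).
print(check_required_columns(["row_data", "global_key"], dataset_id="my-dataset-id"))
# Neither requirement met.
print(check_required_columns(["global_key"]))
```

With a pandas DataFrame, the column names could be passed as `list(df.columns)`.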

**Recommended:**
- A `global_key` column
  - This column contains unique identifiers for your data rows
  - If none is provided, it defaults to your `row_data` column
- An `external_id` column
  - This column contains non-unique identifiers for your data rows
  - If none is provided, it defaults to your `global_key` column
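The fallback order described above (`external_id` falls back to `global_key`, which falls back to `row_data`) can be sketched as follows. This is an illustration of the documented behavior only, not how LabelPandas implements it; `apply_identifier_defaults` and the example URL are hypothetical.

```python
def apply_identifier_defaults(row):
    """Return a copy of `row` with `global_key` and `external_id` filled in.

    Hypothetical helper for illustration; not part of LabelPandas.
    """
    filled = dict(row)
    # No global key: fall back to the asset URL.
    if not filled.get("global_key"):
        filled["global_key"] = filled["row_data"]
    # No external ID: fall back to the (possibly defaulted) global key.
    if not filled.get("external_id"):
        filled["external_id"] = filled["global_key"]
    return filled


row = {"row_data": "https://example.com/image-1.jpg"}  # placeholder URL
print(apply_identifier_defaults(row))
```

Because global keys must be unique, relying on the `row_data` fallback means the same URL cannot be uploaded twice under the default key.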

**Optional:**
- A `project_id` column or an input argument for `project_id`
  - If batching to multiple projects, provide a `project_id` column
  - If batching to one project, provide a `project_id` input argument
    - _This can still be a column if it's already in your CSV file_

### **Metadata**
_____________________

Each metadata column name must follow the pattern `metadata` + `divider` + `metadata_type` + `divider` + `metadata_field_name`
  - Example: `metadata///string///sample_metadata_field_name` (here the divider is `///`)
  - `metadata_type` must be one of the following:
    - `string`, `enum`, `datetime`, `number` 
  - If the `metadata_field_name` doesn't exist yet in Labelbox, LabelPandas will create it for you
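The naming pattern above can be sketched in plain Python. Both functions are hypothetical helpers written for this illustration, not LabelPandas functions, and they assume the `///` divider used in the example.

```python
DIVIDER = "///"  # divider taken from the example column name above
METADATA_TYPES = {"string", "enum", "datetime", "number"}


def metadata_column_name(metadata_type, field_name, divider=DIVIDER):
    """Build a metadata column name from its parts (hypothetical helper)."""
    if metadata_type not in METADATA_TYPES:
        raise ValueError(f"unsupported metadata type: {metadata_type!r}")
    return divider.join(["metadata", metadata_type, field_name])


def parse_metadata_column(column_name, divider=DIVIDER):
    """Return (metadata_type, field_name), or None for a non-metadata column."""
    parts = column_name.split(divider, 2)
    if len(parts) != 3 or parts[0] != "metadata" or parts[1] not in METADATA_TYPES:
        return None
    return parts[1], parts[2]


print(metadata_column_name("string", "sample_metadata_field_name"))
print(parse_metadata_column("metadata///number///LabelPandas-Number"))
print(parse_metadata_column("row_data"))
```

Parsing the name back into its parts is a convenient way to list which metadata fields a table would create before uploading it.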


Metadata values must match the declared metadata type, as described in the Labelbox docs:
  - [Labelbox definition of metadata](https://docs.labelbox.com/docs/datarow-metadata)
  - [Labelbox docs on creating metadata](https://docs.labelbox.com/docs/createmodify-metadata-schema)

## _**Code**_

Install LabelPandas

In [None]:
!pip install labelpandas --upgrade -q


In [None]:
import labelpandas as lp
import pandas as pd

In [None]:
csv_path = "https://raw.githubusercontent.com/Labelbox/labelpandas/main/datasets/metadata.csv" # Path to your CSV file
api_key = "" # Your Labelbox API key

Load a CSV

In [None]:
df = pd.read_csv(csv_path)
df.head(10)

| Unnamed: 0 | external_id | row_data | global_key | metadata///string///LabelPandas-String | metadata///number///LabelPandas-Number | metadata///enum///LabelPandas-Enum | metadata///datetime///LabelPandas-Datetime |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Euq7yrfb8tbDFpd-cv_cpg.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-Euq7yrfb8tbDFpd-cv_c... | Raw Text String 0 | 5256 | C | 03/05/1908 12:13 PM |
| 1 | gCbn5IeZtE92OaUbyl1ZjQ.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-gCbn5IeZtE92OaUbyl1Z... | Raw Text String 1 | 3999 | B | 01/30/1926 11:39 PM |
| 2 | 9Y6-Vl3bwsZFTNxX8gqHYw.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-9Y6-Vl3bwsZFTNxX8gqH... | Raw Text String 2 | 809 | A | 01/23/1974 05:06 AM |
| 3 | 1MnLIosQZmXH3T-iU-4mtQ.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-1MnLIosQZmXH3T-iU-4m... | Raw Text String 3 | 2673 | B | 09/11/1925 12:09 PM |
| 4 | y_9N4kVjlc_AO3C63k2L9w.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-y_9N4kVjlc_AO3C63k2L... | Raw Text String 4 | 2742 | B | 11/22/1977 01:04 PM |
| 5 | qm4W6ktKCGR22n21A3o_0A.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-qm4W6ktKCGR22n21A3o_... | Raw Text String 5 | 3409 | B | 06/28/1960 04:47 AM |
| 6 | pmkRRbZGfIYr-2YN8gwK2Q.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-pmkRRbZGfIYr-2YN8gwK... | Raw Text String 6 | 1467 | A | 07/31/2007 09:26 AM |
| 7 | 2J23mch-V41VdHYVvedGWw.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-2J23mch-V41VdHYVvedG... | Raw Text String 7 | 9185 | C | 11/12/1965 03:29 AM |
| 8 | 9GvpiX9gvFLLpzGN5CCcqA.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test-9GvpiX9gvFLLpzGN5CCc... | Raw Text String 8 | 6681 | B | 02/04/1986 02:41 PM |
| 9 | -nvTzJ-2am0mxQPqnZzZBA.jpg | https://labelbox.s3-us-west-2.amazonaws.com/da... | labelpandas-metadata-test--nvTzJ-2am0mxQPqnZzZ... | Raw Text String 9 | 8508 | C | 03/15/1965 12:38 PM |
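The datetime column in this sample holds strings such as `03/05/1908 12:13 PM`. As a sanity check before uploading, such values can be parsed with the standard library. This sketch assumes the month/day/year, 12-hour layout seen in the sample rows; `parse_sample_datetime` is a hypothetical helper, and which formats LabelPandas itself accepts is not stated here.

```python
from datetime import datetime

# Format inferred from the sample rows (e.g. "01/30/1926 11:39 PM"): an assumption.
SAMPLE_FORMAT = "%m/%d/%Y %I:%M %p"


def parse_sample_datetime(text):
    """Parse a datetime string laid out like the sample CSV values.

    Raises ValueError if the text does not match SAMPLE_FORMAT.
    """
    return datetime.strptime(text, SAMPLE_FORMAT)


print(parse_sample_datetime("03/05/1908 12:13 PM").isoformat())
```

A `ValueError` from `strptime` flags a row whose value does not match the expected layout, which is easier to debug locally than after a failed upload.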


Create a Dataset (for demonstration purposes only)

In [None]:
client = lp.Client(lb_api_key=api_key)

In [None]:
dataset_id = client.lb_client.create_dataset(name="LabelPandas-metadata").uid

Upload to Labelbox

In [None]:
results = client.create_data_rows_from_table(
    table=df,
    dataset_id=dataset_id,
    skip_duplicates=False,  # If True, skips data rows whose global key is already in use
    verbose=True,  # If True, prints information about code execution
)

Creating Labelbox metadata field with name LabelPandas-String of type string
Creating Labelbox metadata field with name LabelPandas-Number of type number
Creating Labelbox metadata field with name LabelPandas-Enum of type enum
Creating Labelbox metadata field with name LabelPandas-Datetime of type datetime
Creating upload list - 10 rows in Pandas DataFrame
Beginning data row upload for dataset ID cle95d5id0sai07sf1o9e5295: uploading 10 data rows
Batch #1: 10 data rows
Success: Upload batch number 1 successful
Upload complete - all data rows uploaded
