<td>
   <a target="_blank" href="https://labelbox.com" ><img src="https://labelbox.com/blog/content/images/2021/02/logo-v4.svg" width=256/></a>
</td>

<td>
<a href="https://colab.research.google.com/drive/1bxaWWPYGZnvGfFbHyYAX-pgn6kVMHP7q" target="_blank"><img
src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
</td>

<td>
<a href="https://github.com/Labelbox/labelpandas/blob/main/notebooks/urls.ipynb" target="_blank"><img
src="https://img.shields.io/badge/GitHub-100000?logo=github&logoColor=white" alt="GitHub"></a>
</td>

# _**Creating Data Rows from Row Data URLs with LabelPandas**_

## _**Documentation**_

**Requirements:**

- A `row_data` column - This column must contain URLs that point to the assets to be uploaded

- Either a `dataset_id` column or an input argument for `dataset_id`
  - If uploading to multiple datasets, provide a `dataset_id` column 
  - If uploading to one dataset, provide a `dataset_id` input argument
    - _This can still be a column if it's already in your CSV file_

**Recommended:**
- A `global_key` column
  - This column contains unique identifiers for your data rows
  - If none is provided, it defaults to the values in your `row_data` column
- An `external_id` column
  - This column contains non-unique identifiers for your data rows
  - If none is provided, it defaults to the values in your `global_key` column
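
The column rules above can be sketched in plain pandas. This is only an illustration of the documented requirements and defaults, not LabelPandas's own implementation; the helper name `check_and_fill_columns` and the example URLs and dataset ID are hypothetical.

```python
import pandas as pd

def check_and_fill_columns(table: pd.DataFrame, dataset_id: str = "") -> pd.DataFrame:
    """Hypothetical helper: checks required columns and fills the documented defaults."""
    table = table.copy()  # leave the caller's DataFrame untouched
    # `row_data` is always required
    if "row_data" not in table.columns:
        raise ValueError("A `row_data` column is required")
    # `dataset_id` may come from a column or from the input argument
    if "dataset_id" not in table.columns:
        if not dataset_id:
            raise ValueError("Provide a `dataset_id` column or a `dataset_id` argument")
        table["dataset_id"] = dataset_id
    # `global_key` falls back to `row_data`
    if "global_key" not in table.columns:
        table["global_key"] = table["row_data"]
    # `external_id` falls back to `global_key`
    if "external_id" not in table.columns:
        table["external_id"] = table["global_key"]
    return table

# Placeholder URLs and dataset ID, for illustration only
example = pd.DataFrame({"row_data": ["https://example.com/a.jpg", "https://example.com/b.jpg"]})
filled = check_and_fill_columns(example, dataset_id="my-dataset-id")
```

Here `filled` carries all four columns, with `global_key` and `external_id` copied from `row_data` because neither was supplied.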

## _**Code**_

Install LabelPandas and Labelbox

In [None]:
## Install LabelPandas
!pip install labelpandas --upgrade -q

In [None]:
import labelpandas as lp
import pandas as pd

Define runtime variables

In [None]:
csv_path = "https://raw.githubusercontent.com/Labelbox/labelpandas/main/datasets/urls.csv" # Path to your CSV file
api_key = "" # Labelbox API Key

Load a CSV

In [None]:
df = pd.read_csv(csv_path)
df.head(10)

Unnamed: 0,external_id,urls,global_key
0,Euq7yrfb8tbDFpd-cv_cpg.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-Euq7yrfb8tbDFpd-cv_cpg.jpg
1,gCbn5IeZtE92OaUbyl1ZjQ.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-gCbn5IeZtE92OaUbyl1ZjQ.jpg
2,9Y6-Vl3bwsZFTNxX8gqHYw.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-9Y6-Vl3bwsZFTNxX8gqHYw.jpg
3,1MnLIosQZmXH3T-iU-4mtQ.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-1MnLIosQZmXH3T-iU-4mtQ.jpg
4,y_9N4kVjlc_AO3C63k2L9w.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-y_9N4kVjlc_AO3C63k2L9w.jpg
5,qm4W6ktKCGR22n21A3o_0A.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-qm4W6ktKCGR22n21A3o_0A.jpg
6,pmkRRbZGfIYr-2YN8gwK2Q.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-pmkRRbZGfIYr-2YN8gwK2Q.jpg
7,2J23mch-V41VdHYVvedGWw.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-2J23mch-V41VdHYVvedGWw.jpg
8,9GvpiX9gvFLLpzGN5CCcqA.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-9GvpiX9gvFLLpzGN5CCcqA.jpg
9,-nvTzJ-2am0mxQPqnZzZBA.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test--nvTzJ-2am0mxQPqnZzZBA.jpg


Optional: Rename columns to the names LabelPandas expects

In [None]:
df = lp.rename_columns(
    table=df,
    rename_dict={
        "urls": "row_data"
    }
)
df.head(10)

Unnamed: 0,external_id,row_data,global_key
0,Euq7yrfb8tbDFpd-cv_cpg.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-Euq7yrfb8tbDFpd-cv_cpg.jpg
1,gCbn5IeZtE92OaUbyl1ZjQ.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-gCbn5IeZtE92OaUbyl1ZjQ.jpg
2,9Y6-Vl3bwsZFTNxX8gqHYw.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-9Y6-Vl3bwsZFTNxX8gqHYw.jpg
3,1MnLIosQZmXH3T-iU-4mtQ.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-1MnLIosQZmXH3T-iU-4mtQ.jpg
4,y_9N4kVjlc_AO3C63k2L9w.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-y_9N4kVjlc_AO3C63k2L9w.jpg
5,qm4W6ktKCGR22n21A3o_0A.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-qm4W6ktKCGR22n21A3o_0A.jpg
6,pmkRRbZGfIYr-2YN8gwK2Q.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-pmkRRbZGfIYr-2YN8gwK2Q.jpg
7,2J23mch-V41VdHYVvedGWw.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-2J23mch-V41VdHYVvedGWw.jpg
8,9GvpiX9gvFLLpzGN5CCcqA.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test-9GvpiX9gvFLLpzGN5CCcqA.jpg
9,-nvTzJ-2am0mxQPqnZzZBA.jpg,https://labelbox.s3-us-west-2.amazonaws.com/da...,labelpandas-test--nvTzJ-2am0mxQPqnZzZBA.jpg
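
Because global keys are meant to be unique identifiers, it may be worth checking the table for repeated `global_key` values before uploading. The following is a minimal pandas sketch; the helper name `find_duplicate_global_keys` and the sample values are hypothetical, and it only inspects the DataFrame itself, not keys already in use in Labelbox.

```python
import pandas as pd

def find_duplicate_global_keys(table: pd.DataFrame, column: str = "global_key") -> list:
    """Hypothetical helper: returns the values that appear more than once in `column`."""
    # keep=False marks every occurrence of a repeated value, not just the later ones
    repeated = table[column][table[column].duplicated(keep=False)]
    return sorted(repeated.unique().tolist())

# Small made-up table with one repeated key
sample = pd.DataFrame({"global_key": ["a.jpg", "b.jpg", "a.jpg"]})
duplicates = find_duplicate_global_keys(sample)
print(duplicates)  # prints ['a.jpg']
```

An empty list means every key in the table is unique.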


Create a Dataset (for demonstration purposes only)

In [None]:
client = lp.Client(lb_api_key=api_key)

In [None]:
dataset_id = client.lb_client.create_dataset(name="LabelPandas-urls").uid

Upload to Labelbox

In [None]:
results = client.create_data_rows_from_table(
    table = df,
    dataset_id = dataset_id,
    skip_duplicates = False, # If True, will skip data rows where a global key is already in use
    verbose = True, # If True, prints information about code execution
)

Creating upload list - 10 rows in Pandas DataFrame
Beginning data row upload for dataset ID cle3ftwd71ipz070c8fqy8zmq: uploading 10 data rows
Batch #1: 10 data rows
Success: Upload batch number 1 successful
Upload complete - all data rows uploaded
