Merged
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## Version 2.4.1 (2020-07-22)
### Fixed
* `Dataset.create_data_row` and `Dataset.create_data_rows` will now upload with content type to ensure the Labelbox editor can show videos.

## Version 2.4 (2020-01-30)

### Added
56 changes: 56 additions & 0 deletions CONTRIB.md
@@ -0,0 +1,56 @@
# Labelbox Python SDK Contribution Guide

## Repository Organization

The SDK source (excluding tests and support tools) is organized into the
following packages/modules:
* `orm/` package contains code that supports the general mapping of Labelbox
data to Python objects. This includes base classes, attribute (field and
relationship) classes, generic GraphQL queries etc.
* `schema/` package contains definitions of classes that represent data types
(e.g. Project, Label, etc.). It relies on `orm/` classes for easy and succinct
object definitions. It also contains custom functionalities and custom GraphQL
templates where necessary.
* `client.py` contains the `Client` class that's the client-side stub for
communicating with Labelbox servers.
* `exceptions.py` contains declarations for all Labelbox errors.
* `pagination.py` contains support for paginated relationship and collection
fetching.
* `utils.py` contains utility functions.

## Branches

* All development happens in per-feature branches prefixed by the contributor's
initials, for example `fs/feature_name`.
* Approved PRs are merged to the `develop` branch.
* The `develop` branch is merged to `master` on each release.

## Testing

Currently the SDK functionality is tested using integration tests. These tests
communicate with a Labelbox server (by default the staging server) and are in
that sense not self-contained. Otherwise they are organized like unit tests
and are based on the `pytest` library.

To execute tests you will need to provide an API key for the server you're using
for testing (staging by default) in the `LABELBOX_TEST_API_KEY` environment
variable. For more info see [Labelbox API key
docs](https://labelbox.helpdocs.io/docs/api/getting-started).
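
As a minimal sketch of how a test setup could read that variable, the helper
below fails early with a readable message when the key is missing. The name
`get_test_api_key` is hypothetical and not part of the SDK; only the
environment variable name comes from this guide.

```python
import os


def get_test_api_key(environ=os.environ):
    """Return the API key used by the integration tests.

    Raises RuntimeError when the variable is unset or empty, so a missing
    key is reported before any request is sent to the server.
    """
    key = environ.get("LABELBOX_TEST_API_KEY")
    if not key:
        raise RuntimeError(
            "Set LABELBOX_TEST_API_KEY before running the tests")
    return key
```

Passing the mapping as a parameter keeps the helper easy to check without
touching the real environment.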

## Release Steps

Each release should follow these steps:

1. Update the Python SDK package version in `REPO_ROOT/setup.py`.
2. Make sure the `CHANGELOG.md` contains appropriate info.
3. Commit these changes and tag the commit in Git as `vX.Y`.
4. Merge `develop` to `master` (fast-forward only).
5. Generate a GitHub release.
6. Build the library in the [standard
way](https://packaging.python.org/tutorials/packaging-projects/#generating-distribution-archives)
7. Upload the distribution archives in the [standard
way](https://packaging.python.org/tutorials/packaging-projects/#uploading-the-distribution-archives).
You will need credentials for the `labelbox` PyPI user.
8. Run the `REPO_ROOT/tools/api_reference_generator.py` script to update
[HelpDocs documentation](https://labelbox.helpdocs.io/docs/). You will need
to provide a HelpDocs API key.
5 changes: 4 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
# Labelbox Python API
# Labelbox Python SDK

Labelbox is the enterprise-grade training data solution with fast AI enabled labeling tools, labeling automation, human workforce, data management, a powerful API for integration & SDK for extensibility. Visit http://labelbox.com/ for more information.

@@ -29,3 +29,6 @@ client = Client()
## Documentation

[Visit our docs](https://labelbox.com/docs/python-api) to learn how to [create a project](https://labelbox.com/docs/python-api/create-first-project), read through some helpful user guides, and view our [API reference](https://labelbox.com/docs/python-api/api-reference).

## Repo Organization and Contribution
Please consult `CONTRIB.md`.
37 changes: 28 additions & 9 deletions labelbox/client.py
@@ -1,6 +1,7 @@
from datetime import datetime, timezone
import json
import logging
import mimetypes
import os

import requests
@@ -75,7 +76,7 @@ def execute(self, query, params=None, timeout=10.0):
labelbox.exceptions.InvalidQueryError: If `query` is not
syntactically or semantically valid (checked server-side).
labelbox.exceptions.ApiLimitError: If the server API limit was
exceeded. See "How to import data" in the online documentation
exceeded. See "How to import data" in the online documentation
to see API limits.
labelbox.exceptions.TimeoutError: If response was not received
in `timeout` seconds.
@@ -112,14 +113,14 @@ def convert_value(value):
raise labelbox.exceptions.NetworkError(e)

except Exception as e:
logger.error("Unknown error: %s", str(e))
raise labelbox.exceptions.LabelboxError(str(e))
raise labelbox.exceptions.LabelboxError(
"Unknown error during Client.query(): " + str(e), e)

try:
response = response.json()
except:
raise labelbox.exceptions.LabelboxError(
"Failed to parse response as JSON: %s", response.text)
"Failed to parse response as JSON: %s" % response.text)

errors = response.get("errors", [])
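
The JSON-parsing change in this hunk replaces a comma with the `%` operator. A minimal sketch of why that matters, using a hypothetical stand-in class `ExampleError` (an assumption for illustration, not the SDK's `LabelboxError`) that stores its first argument as the message:

```python
class ExampleError(Exception):
    """Hypothetical stand-in: the first argument is kept as the message."""

    def __init__(self, message, *args):
        super().__init__(message, *args)
        self.message = message


body = "<html>502 Bad Gateway</html>"

# Comma: the template and the value are two separate arguments,
# so the %s placeholder is never filled in.
unformatted = ExampleError("Failed to parse response as JSON: %s", body)

# Percent operator: the string is formatted before the exception is built.
formatted = ExampleError("Failed to parse response as JSON: %s" % body)
```

With the comma, the response text is only reachable through the extra arguments; with `%`, it is part of the message itself.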

@@ -171,9 +172,27 @@ def check_errors(keywords, *path):

return response["data"]

def upload_file(self, path):
"""Uploads the local file at the given path.

Also includes best guess at the content type of the file.

Args:
path (str): path to local file to be uploaded.
Returns:
str, the URL of uploaded data.
Raises:
labelbox.exceptions.LabelboxError: If upload failed.

"""
content_type, _ = mimetypes.guess_type(path)
basename = os.path.basename(path)
with open(path, "rb") as f:
return self.upload_data(data=(basename, f.read(), content_type))

def upload_data(self, data):
""" Uploads the given data (bytes) to Labelbox.

Args:
data (bytes): The data to upload.
Returns:
@@ -183,8 +202,8 @@ def upload_data(self, data):
"""
request_data = {
"operations": json.dumps({
"variables": {"file": None, "contentLength": len(data), "sign": False},
"query": """mutation UploadFile($file: Upload!, $contentLength: Int!,
"variables": {"file": None, "contentLength": len(data), "sign": False},
"query": """mutation UploadFile($file: Upload!, $contentLength: Int!,
$sign: Boolean) {
uploadFile(file: $file, contentLength: $contentLength,
sign: $sign) {url filename} } """,}),
@@ -199,9 +218,9 @@ def upload_data(self, data):

try:
file_data = response.json().get("data", None)
except ValueError: # response is not valid JSON
except ValueError as e: # response is not valid JSON
raise labelbox.exceptions.LabelboxError(
"Failed to upload, unknown cause")
"Failed to upload, unknown cause", e)

if not file_data or not file_data.get("uploadFile", None):
raise labelbox.exceptions.LabelboxError(
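
The new `upload_file` relies on `mimetypes.guess_type`, which infers a content type from the file name alone and returns `None` as the type when it cannot guess. A small sketch of that lookup, separate from any upload; the helper name `describe_upload` is hypothetical and not part of the SDK:

```python
import mimetypes
import os


def describe_upload(path):
    """Return the (basename, content_type) pair derived from a path.

    Only the file name is inspected; the file is not opened, so the
    guess depends entirely on the extension.
    """
    content_type, _encoding = mimetypes.guess_type(path)
    return os.path.basename(path), content_type


with_extension = describe_upload("/tmp/frames/frame_001.png")
without_extension = describe_upload("/tmp/frames/notes")
```

Because the guess can be `None`, callers of such a helper should be prepared for a missing content type.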
25 changes: 16 additions & 9 deletions labelbox/exceptions.py
@@ -1,8 +1,18 @@
class LabelboxError(Exception):
"""Base class for exceptions."""
def __init__(self, message, *args):
super().__init__(*args)
def __init__(self, message, cause=None):
"""
Args:
message (str): Informative message about the exception.
cause (Exception): The cause of the exception (an Exception
raised by Python or another library). Optional.
"""
super().__init__(message, cause)
self.message = message
self.cause = cause

def __str__(self):
return self.message + str(self.args)


class AuthenticationError(LabelboxError):
@@ -31,9 +41,8 @@ def __init__(self, db_object_type, params):


class ValidationFailedError(LabelboxError):
"""Exception raised for when a GraphQL query fails validation (query cost, etc.)

E.g. a query that is too expensive, or depth is too deep.
"""Exception raised for when a GraphQL query fails validation (query cost,
etc.) E.g. a query that is too expensive, or depth is too deep.
"""
pass

@@ -47,10 +56,8 @@ class InvalidQueryError(LabelboxError):

class NetworkError(LabelboxError):
"""Raised when an HTTPError occurs."""
def __init__(self, cause, message=None):
if message is None:
message = str(cause)
super().__init__(message)
def __init__(self, cause):
super().__init__(str(cause), cause)
self.cause = cause
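
A minimal sketch of how the message-plus-cause pattern in this file behaves when an underlying exception is wrapped. `SketchError` is a hypothetical copy of the constructor and `__str__` shown in this diff, written here so the example runs without the SDK installed:

```python
class SketchError(Exception):
    """Hypothetical copy of the message/cause pattern from this diff."""

    def __init__(self, message, cause=None):
        super().__init__(message, cause)
        self.message = message
        self.cause = cause

    def __str__(self):
        # args is (message, cause), so the message text appears again
        # inside the tuple representation.
        return self.message + str(self.args)


try:
    try:
        int("not a number")
    except ValueError as e:
        # Keep the original exception so callers can inspect it.
        raise SketchError("Failed to parse value", e)
except SketchError as err:
    caught = err
```

Python's built-in `raise ... from e` is a different, standard way to record a cause (it sets `__cause__`); the sketch above only mirrors the explicit `cause` attribute used in this diff.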


9 changes: 3 additions & 6 deletions labelbox/schema/dataset.py
@@ -48,16 +48,15 @@ def create_data_row(self, **kwargs):
# If row data is a local file path, upload it to server.
row_data = kwargs[DataRow.row_data.name]
if os.path.exists(row_data):
with open(row_data, "rb") as f:
kwargs[DataRow.row_data.name] = self.client.upload_data(f.read())
kwargs[DataRow.row_data.name] = self.client.upload_file(row_data)

kwargs[DataRow.dataset.name] = self

return self.client._create(DataRow, kwargs)

def create_data_rows(self, items):
""" Creates multiple DataRow objects based on the given `items`.

Each element in `items` can be either a `str` or a `dict`. If
it is a `str`, then it is interpreted as a local file path. The file
is uploaded to Labelbox and a DataRow referencing it is created.
@@ -91,9 +90,7 @@ def create_data_rows(self, items):

def upload_if_necessary(item):
if isinstance(item, str):
with open(item, "rb") as f:
item_data = f.read()
item_url = self.client.upload_data(item_data)
item_url = self.client.upload_file(item)
# Convert item from str into a dict so it gets processed
# like all other dicts.
item = {DataRow.row_data: item_url,
5 changes: 4 additions & 1 deletion labelbox/schema/project.py
@@ -112,6 +112,8 @@ def export_labels(self, timeout_seconds=60):
""" Calls the server-side Label exporting that generates a JSON
payload, and returns the URL to that payload.

Will only generate a new URL at a max frequency of 30 min.

Args:
timeout_seconds (float): Max waiting time, in seconds.
Returns:
@@ -199,14 +201,15 @@ def setup(self, labeling_frontend, labeling_frontend_options):
if not isinstance(labeling_frontend_options, str):
labeling_frontend_options = json.dumps(labeling_frontend_options)

self.labeling_frontend.connect(labeling_frontend)

LFO = Entity.LabelingFrontendOptions
labeling_frontend_options = self.client._create(
LFO, {LFO.project: self, LFO.labeling_frontend: labeling_frontend,
LFO.customization_options: labeling_frontend_options,
LFO.organization: organization
})

self.labeling_frontend.connect(labeling_frontend)
timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
self.update(setup_complete=timestamp)
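
The `setup_complete` value above is a UTC timestamp formatted with a literal trailing `Z`. A short sketch of that formatting on a fixed datetime; the helper name `format_setup_timestamp` is hypothetical and only wraps the `strftime` call from the diff:

```python
from datetime import datetime, timezone


def format_setup_timestamp(moment):
    """Format a datetime with the pattern used in `setup`.

    The trailing "Z" is a literal character in the pattern, so the
    datetime passed in must already be in UTC for the result to be true.
    """
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


fixed = datetime(2020, 7, 22, 12, 30, 0, tzinfo=timezone.utc)
stamp = format_setup_timestamp(fixed)
```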

2 changes: 1 addition & 1 deletion setup.py
@@ -7,7 +7,7 @@

setuptools.setup(
name="labelbox",
version="2.4",
version="2.4.1",
author="Labelbox",
author_email="engineering@labelbox.com",
description="Labelbox Python API",