47 changes: 47 additions & 0 deletions .github/workflows/docs.yaml
@@ -0,0 +1,47 @@
name: docs
Review comment (Member). Suggested change:

    name: docs
    # Builds the documentation website and publishes it to https://groundlight.github.io

As we rely more on GHA, we should get in the habit of writing docstrings at the top for what they do.

on:
push:
paths:
# Only run this workflow if there are changes in any of these files.
- .github/workflows/**
- docs/**
pull_request:
paths:
# Only run this workflow if there are changes in any of these files.
- .github/workflows/**
- docs/**
defaults:
run:
working-directory: docs

jobs:
deploy:
runs-on: ubuntu-latest
steps:
- name: Get code
uses: actions/checkout@v3
- name: Setup npm
uses: actions/setup-node@v3
with:
node-version: 18
cache: npm
- name: Install dependencies
run: npm install
- name: Build website
run: npm run build
- name: Deploy website
# Docs: https://github.com/peaceiris/actions-gh-pages#%EF%B8%8F-docusaurus
uses: peaceiris/actions-gh-pages@v3
# Only run this on pushes to the `main` branch (try-docusaurus-base is temporary for testing)
if: github.ref == 'refs/heads/main' || github.ref == 'refs/heads/try-docusaurus-base'
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
# Build output to publish to the `gh-pages` branch:
publish_dir: ./docs/build
# The following lines assign commit authorship to the official
# GH-Actions bot for deploys to `gh-pages` branch:
# https://github.com/actions/checkout/issues/13#issuecomment-724415212
# The GH actions bot is used by default if you don't specify these two fields.
# You can swap them out with your own user credentials.
user_name: github-actions[bot]
user_email: 41898282+github-actions[bot]@users.noreply.github.com
1 change: 1 addition & 0 deletions .gitignore
@@ -165,3 +165,4 @@ poetry.lock
node_modules/

*.swp
**/.python-version
178 changes: 11 additions & 167 deletions README.md
@@ -1,182 +1,26 @@
# Groundlight Python SDK

Groundlight makes it simple to understand images. You can easily create computer vision detectors just by describing what you want to know using natural language. Groundlight uses a combination of advanced AI and real-time human monitors to automatically turn your images and queries into a customized machine learning (ML) model for your application.
Check out our [documentation here](https://groundlight.github.io/python-sdk/docs/getting-started)!

## Computer vision made simple

How to build a working computer vision system in just 5 lines of python code:

```Python
from groundlight import Groundlight
gl = Groundlight()
d = gl.get_or_create_detector(name="door", query="Is the door open?") # define with natural language
image_query = gl.submit_image_query(detector=d, image=jpeg_img) # send in an image
print(f"The answer is {image_query.result}") # get the result
```

**How does it work?** Your images are first analyzed by machine learning (ML) models which are automatically trained on your data. If those models have high enough confidence, that's your answer. But if the models are unsure, then the images are progressively escalated to more resource-intensive analysis methods up to real-time human review. So what you get is a computer vision system that starts working right away without even needing to first gather and label a dataset. At first it will operate with high latency, because people need to review the image queries. But over time, the ML systems will learn and improve so queries come back faster with higher confidence.

*Note: The SDK is currently in "beta" phase. Interfaces are subject to change in future versions.*


## Managing confidence levels and latency

Groundlight gives you a simple way to control the trade-off of latency against accuracy. The longer you can wait for an answer to your image query, the better accuracy you can get. In particular, if the ML models are unsure of the best response, they will escalate the image query to more intensive analysis with more complex models and real-time human monitors as needed. Your code can easily wait for this delayed response. Either way, these new results are automatically trained into your models so your next queries will get better results faster.

The desired confidence level is set as the escalation threshold on your detector. This determines the minimum confidence score the ML system must provide; if its answer falls below that score, the image query is escalated.

For example, say you want to set your desired confidence level to 0.95, but that you're willing to wait up to 60 seconds to get a confident response.

```Python
d = gl.get_or_create_detector(name="trash", query="Is the trash can full?", confidence=0.95)
image_query = gl.submit_image_query(detector=d, image=jpeg_img, wait=60)
# This will wait until either 60 seconds have passed or the confidence reaches 0.95
print(f"The answer is {image_query.result}")
```

Or if you want to run as fast as possible, set `wait=0`. This way you will only get the ML results, without waiting for escalation. Image queries which are below the desired confidence level will still be escalated for further analysis, and the results are incorporated as training data to improve your ML model, but your code will not wait for that to happen.

```Python
image_query = gl.submit_image_query(detector=d, image=jpeg_img, wait=0)
```

If the returned result was generated from an ML model, you can see the confidence score returned for the image query:

```Python
print(f"The confidence is {image_query.result.confidence}")
```

## Getting Started

1. Install the `groundlight` SDK. Requires python version 3.7 or higher. See [prerequisites](#Prerequisites).

```Bash
$ pip3 install groundlight
```

1. To access the API, you need an API token. You can create one on the
[groundlight web app](https://app.groundlight.ai/reef/my-account/api-tokens).

The API token should be stored securely. You can use it directly in your code to initialize the SDK like:

```python
gl = Groundlight(api_token="<YOUR_API_TOKEN>")
```

which is an easy way to get started, but is NOT a best practice. Please do not commit your API Token to version control! Instead, we recommend setting the `GROUNDLIGHT_API_TOKEN` environment variable outside your code so that the SDK can find it automatically.

```bash
$ export GROUNDLIGHT_API_TOKEN=api_2GdXMflhJi6L_example
$ python3 glapp.py
```
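
If you prefer to be explicit in code, here is a minimal sketch of reading the token from the environment yourself (optional, since the SDK also picks up `GROUNDLIGHT_API_TOKEN` automatically):

```Python
import os
from groundlight import Groundlight

# Read the token from the environment rather than hard-coding it.
token = os.environ["GROUNDLIGHT_API_TOKEN"]
gl = Groundlight(api_token=token)
```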



## Prerequisites

### Using Groundlight SDK on Ubuntu 18.04

Ubuntu 18.04 still uses python 3.6 by default, which is end-of-life. We recommend setting up python 3.8 as follows:

```
# Prepare Ubuntu to install things
sudo apt-get update
# Install the basics
sudo apt-get install -y python3.8 python3.8-distutils curl
# Configure `python3` to run python3.8 by default
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 10
# Download and install pip3.8
curl https://bootstrap.pypa.io/get-pip.py > /tmp/get-pip.py
sudo python3.8 /tmp/get-pip.py
# Configure `pip3` to run pip3.8
sudo update-alternatives --install /usr/bin/pip3 pip3 $(which pip3.8) 10
# Now we can install Groundlight!
pip3 install groundlight
```
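
You can sanity-check the configuration before installing anything else (a quick verification sketch; the exact version output will vary):

```
python3 --version   # should report Python 3.8.x
pip3 --version      # should report a pip associated with python 3.8
```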

## Using Groundlight on the edge

Starting your model evaluations at the edge reduces latency, cost, network bandwidth, and energy. Once you have downloaded and installed your Groundlight edge models, you can configure the Groundlight SDK to use your edge environment by pointing the `endpoint` at your local environment, as follows:
Build a working computer vision system in just 5 lines of python:

```python
from groundlight import Groundlight
gl = Groundlight(endpoint="http://localhost:6717")
```

(Edge model download is not yet generally available.)

## Advanced

### Explicitly create a new detector

Typically you'll use the ```get_or_create_detector(name: str, query: str)``` method to find an existing detector you've already created with the same name, or create a new one if it doesn't exist. But if you'd like to force creating a new detector, you can also use the ```create_detector(name: str, query: str)``` method:

```Python
detector = gl.create_detector(name="your_detector_name", query="is this what we want to see?")
```

### Retrieve an existing detector

```Python
detector = gl.get_detector(id="YOUR_DETECTOR_ID")
```

### List your detectors

```Python
# Defaults to 10 results per page
detectors = gl.list_detectors()

# Pagination: 3rd page of 25 results per page
detectors = gl.list_detectors(page=3, page_size=25)
```

### Retrieve an image query

In practice, you may want to check for a new result on your query, for example after a cloud reviewer labels it. You can use the `image_query.id` from the above `submit_image_query()` call.

```Python
image_query = gl.get_image_query(id="YOUR_IMAGE_QUERY_ID")
```
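
If you want to block until a reviewed answer arrives, a minimal polling sketch follows (the 0.95 threshold and 10-second interval are illustrative choices, not SDK defaults, and the handling of a `None` confidence is our assumption):

```Python
import time

image_query = gl.get_image_query(id="YOUR_IMAGE_QUERY_ID")
# Re-fetch the query until the result is confident enough.
# Assumption: a human-provided answer may carry no confidence score,
# so we treat a None confidence as final.
while image_query.result.confidence is not None and image_query.result.confidence < 0.95:
    time.sleep(10)  # illustrative polling interval
    image_query = gl.get_image_query(id=image_query.id)
print(f"The answer is {image_query.result}")
```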

### List your previous image queries

```Python
# Defaults to 10 results per page
image_queries = gl.list_image_queries()

# Pagination: 3rd page of 25 results per page
image_queries = gl.list_image_queries(page=3, page_size=25)
```

### Adding labels to existing image queries

Groundlight lets you start using models by making queries against your very first image. But there are a few situations where you might have an existing dataset, or want to handle the escalation response programmatically in your own code while still including the label to get better responses in the future. With your ```image_query``` from either ```submit_image_query()``` or ```get_image_query()``` you can add the label directly. Note that if the query is already in the escalation queue due to low ML confidence or audit thresholds, it may also receive labels from another source.

```Python
add_label(image_query, 'YES')  # or 'NO'
gl = Groundlight()
d = gl.get_or_create_detector(name="door", query="Is the door open?")
image_query = gl.submit_image_query(detector=d, image=jpeg_img)
print(f"The answer is {image_query.result}")
```

The only valid labels at this time are ```'YES'``` and ```'NO'```.
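
A minimal sketch putting this together, with `add_label` spelled exactly as in the call above (adjust if your SDK version exposes it as a client method):

```Python
image_query = gl.get_image_query(id="YOUR_IMAGE_QUERY_ID")
add_label(image_query, 'YES')  # the only valid labels are 'YES' and 'NO'
```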


### Handling HTTP errors
### How does it work?

If there is an HTTP error during an API call, it will raise an `ApiException`. You can access different metadata from that exception:

```Python
from groundlight import ApiException, Groundlight

gl = Groundlight()
try:
detectors = gl.list_detectors()
except ApiException as e:
# Many fields available to describe the error
print(e)
print(e.args)
print(e.body)
print(e.headers)
print(e.reason)
print(e.status)
```
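
For transient failures you might wrap calls in a simple retry loop. A sketch, assuming `e.status` carries the HTTP status code as shown above:

```Python
import time
from groundlight import ApiException, Groundlight

gl = Groundlight()
for attempt in range(3):
    try:
        detectors = gl.list_detectors()
        break
    except ApiException as e:
        # Retry only server-side (5xx) errors; re-raise everything else,
        # and give up after the final attempt.
        if attempt == 2 or not (e.status and e.status >= 500):
            raise
        time.sleep(2 ** attempt)  # simple exponential backoff
```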
Your images are first analyzed by machine learning (ML) models which are automatically trained on your data. If those models have high enough confidence, that's your answer. But if the models are unsure, then the images are progressively escalated to more resource-intensive analysis methods up to real-time human review. So what you get is a computer vision system that starts working right away without even needing to first gather and label a dataset. At first it will operate with high latency, because people need to review the image queries. But over time, the ML systems will learn and improve so queries come back faster with higher confidence.

_Note: The SDK is currently in "beta" phase. Interfaces are subject to change in future versions._
20 changes: 20 additions & 0 deletions docs/.gitignore
@@ -0,0 +1,20 @@
# Dependencies
/node_modules

# Production
/build

# Generated files
.docusaurus
.cache-loader

# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*
41 changes: 41 additions & 0 deletions docs/README.md
@@ -0,0 +1,41 @@
# Website

This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.

Review comment (Member) on lines +3 to +4: Of course we'll want to update all this sample stuff.

### Installation

```
$ yarn
```

### Local Development

```
$ yarn start
```

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

### Build

```
$ yarn build
```

This command generates static content into the `build` directory and can be served using any static content hosting service.
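
To preview that production build locally, the standard Docusaurus scaffold also includes a serve script (assuming the default `package.json` scripts):

```
$ yarn serve
```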

### Deployment

Using SSH:

```
$ USE_SSH=true yarn deploy
```

Not using SSH:

```
$ GIT_USER=<Your GitHub username> yarn deploy
```

If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.
3 changes: 3 additions & 0 deletions docs/babel.config.js
@@ -0,0 +1,3 @@
module.exports = {
presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
};
20 changes: 20 additions & 0 deletions docs/blog/2023-04-01-mdx-blog-post.mdx
Review comment (Member): I don't want to have a "blog" right now. We should build a place to put blog-like articles, but I don't want them to be presented that way.

@@ -0,0 +1,20 @@
---
slug: mdx-blog-post
title: MDX Blog Post
authors: [michael]
tags: [docusaurus, markdown, mdx]
---

Blog posts support [Docusaurus Markdown features](https://docusaurus.io/docs/markdown-features), such as [MDX](https://mdxjs.com/).

:::tip

Use the power of React to create interactive blog posts.

```js
<button onClick={() => alert("button clicked!")}>Click me!</button>
```

<button onClick={() => alert("button clicked!")}>Click me!</button>

:::
25 changes: 25 additions & 0 deletions docs/blog/2023-04-02-welcome/index.md
@@ -0,0 +1,25 @@
---
slug: welcome
title: Welcome
authors: [michael]
tags: [hello, docusaurus]
---

[Docusaurus blogging features](https://docusaurus.io/docs/blog) are powered by the [blog plugin](https://docusaurus.io/docs/api/plugins/@docusaurus/plugin-content-blog).

Simply add Markdown files (or folders) to the `blog` directory.

Regular blog authors can be added to `authors.yml`.

The blog post date can be extracted from filenames, such as:

- `2019-05-30-welcome.md`
- `2019-05-30-welcome/index.md`

A blog post folder can be convenient to co-locate blog post images:

![Docusaurus Plushie](./docusaurus-plushie-banner.jpeg)

The blog supports tags as well!

**And if you don't want a blog**: just delete this directory, and use `blog: false` in your Docusaurus config.
4 changes: 4 additions & 0 deletions docs/blog/authors.yml
@@ -0,0 +1,4 @@
michael:
name: Michael Vogelsong
url: https://github.com/mjvogelsong
image_url: https://github.com/mjvogelsong.png
8 changes: 8 additions & 0 deletions docs/docs/building-applications/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Building Applications",
"position": 2,
"link": {
"type": "generated-index",
"description": "Let's build some apps!"
}
}
10 changes: 10 additions & 0 deletions docs/docs/building-applications/edge.md
@@ -0,0 +1,10 @@
# Using Groundlight on the edge

Starting your model evaluations at the edge reduces latency, cost, network bandwidth, and energy. Once you have downloaded and installed your Groundlight edge models, you can configure the Groundlight SDK to use your edge environment by pointing the `endpoint` at your local environment, as follows:

```python
from groundlight import Groundlight
gl = Groundlight(endpoint="http://localhost:6717")
```

(Edge model download is not yet generally available.)