Release 0.8.0 (#467)
* Add old build of ts

* Add TS templates. Add correct validation for TS. Add sample csv for TS.

* Prettify templates ts

* Add drag & drop to TS

* Make another way to check ts in uploader

* Correct exception treatment with app_before_request_callback

* S3 region support, resolve URI based on current storage setup, add wip lsf builds

* Update error message

* Update LSF build from feature/ts

* Refactor templates

* Add tutorial. Templates reformats

* Fix valuetype lowercase. Update LSF build

* Split ts tutorial to templates/time_series.md and release notes

* Update LSF build

* Use LSF build with TS fixes and persistent state

* Update templates. Add data, completion, predictions support to config preview. Update LSF build

* Add more templates

* Update LSF

* Handle drag-n-drop import with cloud storage connected

* Change default labeling config

* Back default config

* Add dynamic sample task for TS

* Update LSF

* Add support for headless sample task

* Update LSF build

* Support inputFormat & separator in sample task generator

* Fix data examples

* Try to fix independ channels template

* Fix templates

* Update LSF

* Rename vars on backend, inplace generator/predefined sample task,change templates

* Update LSF build

* Generate sample task in render-label-studio

* Add units for ts-text linking template

* Fix templates

* Update LSF

* Fix data example without valueType

* Update LSF build

* Add tag docs. Work on templates

* Update LSF build

* Bugfix in time-series.csv endpoint

* Finish templates. Fix bugs in TS sample generation

* Pretty fix

* Add removable notes on setup page

* Folders drag&drop

* Add marked js

* Fix links to relation docs

* Fix one-point text in template

* Add wait in e2e tests

* Update LSF build

* Add timeless template

* Add env in tests

* Hide readme markdown if errors on render

* Update docs

* Rename templates

* Update docs

* Move readme to the bottom

* Show readme hidden by default

* Rework time_series.md

* Fix docs. Fix path to time-series.csv

* Rename endpoint static_time_series

* Fix generate example

* Fix upload arg in playground

* blog post text and images

* Update time_series.md (#469)

* Update time_series.md

* Fix upload arg in playground

Co-authored-by: makseq-ubnt <makseq@gmail.com>

* Fix time series md and other preview stuff

* modifying text, links, and images

* few changes in docs

* typos

* typo

* fixing case

* Change TS classification template

* Fix docs/blog errors

* Update LSF

* Change v0.8.0

Co-authored-by: nik <nik@heartex.net>
Co-authored-by: hlomzik <hlomzik@gmail.com>
Co-authored-by: Nikita Skryabin <nr@fenelon.ru>
Co-authored-by: Mikhail Maluyk <mikhail.maluyk@gmail.com>
Co-authored-by: Rhythm of vision <rov@Rhythms-Air.hsd1.ca.comcast.net>
6 people committed Oct 27, 2020
1 parent c216398 commit a747855
Showing 82 changed files with 3,679 additions and 164 deletions.
1 change: 1 addition & 0 deletions .github/workflows/e2e-tests.yml
@@ -27,6 +27,7 @@ jobs:
uses: actions/cache@v1
env:
cache-name: cache-node-modules
collect_analytics: 0
with:
path: ~/.npm
key: npm-${{ runner.os }}-${{ hashFiles('e2e/package-lock.json') }}
2 changes: 2 additions & 0 deletions .github/workflows/python-package.yml
@@ -35,6 +35,8 @@ jobs:
run: |
label-studio init my_project
- name: Test with pytest
env:
collect_analytics: 0
run: |
python -m pytest -vrP
coverage run -m --source=label_studio pytest
29 changes: 21 additions & 8 deletions docs/source/blog/index.html
@@ -136,20 +136,20 @@
<!-- </a> -->
<!-- </div> -->

<!-- Release 0.7.0 -->
<!-- Release 0.8.0 -->
<div class="column">
<a href="/blog/release-070-cloud-storage-enablement.html">
<a href="/blog/release-080-time-series-labeling.html">
<div class="card">
<div class="image-wrap">
<div style="background-image: url(/images/release-070/s3-mascot-04.png); background-size:cover" class="image"></div>
<div style="background-image: url(/images/release-080/time-series-labeling-with-multiple-channels.png); background-size:cover" class="image"></div>
</div>
<div class="category">release notes</div>
<div class="desc">29 May 2020, 5 min read</div>
<div class="title">Label Studio 0.7.0 Release - Cloud Storage Enablement</div>
<div class="desc">26 Oct 2020, 7 min read</div>
<div class="title">Label Studio 0.8.0 Release - Time series is Everywhere!</div>
</div>
</a>
</div>

<!-- News letters -->
<div class="column">
<div class="card">
@@ -160,8 +160,21 @@
<iframe src="https://labelstudio.substack.com/embed" frameborder="0" scrolling="no" style="width:90%"></iframe>
</center>
</div>
</div>

</div>
</div>

<!-- Release 0.7.0 -->
<div class="column">
<a href="/blog/release-070-cloud-storage-enablement.html">
<div class="card">
<div class="image-wrap">
<div style="background-image: url(/images/release-070/s3-mascot-04.png); background-size:cover" class="image"></div>
</div>
<div class="category">release notes</div>
<div class="desc">29 May 2020, 5 min read</div>
<div class="title">Label Studio 0.7.0 Release - Cloud Storage Enablement</div>
</div>
</a>
</div>

<!-- Release 0.6.0 -->
133 changes: 133 additions & 0 deletions docs/source/blog/release-080-time-series-labeling.md
@@ -0,0 +1,133 @@

# Time Series Data Labeling

Time series is everywhere! Devices, sensors, and events all produce time series. For example, your heartbeat can be represented as a series of events measured every second, and your favorite step tracker records the number of steps you take per minute.

All these signals can be used for ML model development, and we're excited to present one of the first time series data labeling solutions that works across a variety of use cases and can help you develop ML applications based on time series data!

<br/>
<img src="/images/release-080/main.gif" class="gif-border" />

> Labeled time series data is crucial if you want to develop supervised ML models for pattern recognition. It can also serve as ground truth data for validating a method's performance. Read on for some of the scenarios and implementation details.

## Labeling UI Performance

Most time series datasets contain a lot of points, so the tool has to scale well when you have more than 100K points. Initially we tried several existing frontend libraries that provide time series implementations, but none of them were up to the task: even with just 10,000 points you'd start to experience lag when zooming or panning. It was clear that we needed a more robust implementation. We based the rendering on d3, and after numerous optimization attempts we got the desired result:

### **1,000,000 data points and 10 channels**

<img src="/images/release-080/ui.gif" class="gif-border" />

Some of the techniques we used include tiling: when there is a large number of data points, we split them into chunks and render those chunks first, which helps us achieve great performance even when the series is very large. When you zoom out, the algorithm samples specific points to give you an overview of your time series data.
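The rendering code itself isn't shown in this post, so here is only a rough Python sketch of the two ideas, with hypothetical helper names (`split_into_tiles`, `sample_overview`): cutting a long series into fixed-size chunks, and keeping evenly spaced points for a zoomed-out overview. The real implementation runs on d3 in the browser and very likely differs in its details.

```python
from typing import List, Sequence


def split_into_tiles(values: Sequence[float], tile_size: int) -> List[Sequence[float]]:
    """Split a series into consecutive chunks of at most `tile_size` points."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    return [values[i:i + tile_size] for i in range(0, len(values), tile_size)]


def sample_overview(values: Sequence[float], max_points: int) -> List[float]:
    """Keep at most `max_points` evenly spaced points for a zoomed-out view."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    step = max(1, -(-len(values) // max_points))  # ceiling division
    return list(values[::step])


# Illustrative data only: one channel with 1,000,000 points.
series = list(range(1_000_000))
tiles = split_into_tiles(series, 10_000)
overview = sample_overview(series, 1_000)
```

Plain stride sampling is the simplest possible choice here; it can miss narrow spikes, so a min/max-per-bucket scheme may be a better fit when peaks matter.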

## Working with a variety of input types out of the box

For the examples below we will use the following configuration:

```html
<View>
<TimeSeriesLabels name="label" toName="ts">
<Label value="Walk" />
<Label value="Run" />
</TimeSeriesLabels>

<TimeSeries name="ts" valueType="url" value="$csv" sep="," overviewChannels="sen1,sen2">
<Channel column="sen1" />
<Channel column="sen2" />
</TimeSeries>
</View>
```

> If you're new to Label Studio, [learn](https://labelstud.io/tags/) how you can use tags to set up different labeling interfaces for your data.

Depending on where your time series data comes from, it can be formatted very differently. Label Studio lets you configure how time series parsing is done, so you don't have to transform the original file. Let's start with a simple CSV like this:

```csv
time,sen1,sen2
100,1,23
101,2,34
102,3,45
```

Here is a CSV with an unusually formatted datetime, for example one captured from a sensor that doesn't follow the standard:

```csv
time,sen1,sen2
2020-Feb-01 9:30,34.23,272
2020-Feb-01 9:31,251.23,352
2020-Feb-01 9:32,337.124,327
```

In that case, the `timeFormat` attribute can handle the parsing for you. It uses [strftime](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) format codes.
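To get a feel for the format codes, here is how the timestamps from the CSV above parse with Python's own `datetime.strptime`. The format string `%Y-%b-%d %H:%M` is our reading of that sample rather than something taken from the tag documentation, and whether `timeFormat` accepts every Python code is an assumption you should verify against the TimeSeries tag docs.

```python
from datetime import datetime

# Assumed format for values such as "2020-Feb-01 9:30".
# %b matches the abbreviated month name and depends on the locale
# (English month names are expected here).
TIME_FORMAT = "%Y-%b-%d %H:%M"

parsed = datetime.strptime("2020-Feb-01 9:30", TIME_FORMAT)
print(parsed.isoformat())
```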

The `valueType` attribute controls whether the input is provided as-is or via a URL. For example, the input file may be a list of URLs, and in that case `valueType="url"` loads the contents of each URL and expects time series data inside.

```csv
csvURL
http://example.com/path/to/timeseries1.csv
http://example.com/path/to/timeseries2.csv
```

For a headless CSV, you can use a column index to point to the right column. For example, using `2` in the Channel's `column` attribute selects the third column (indexing starts from zero) of the headless CSV.
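The zero-based indexing is the same convention Python uses, so a small sketch can make it concrete. The `read_column` helper and the sample data are hypothetical and only mirror what the `column` attribute is described as doing; they are not part of Label Studio.

```python
import csv
import io

# A headless CSV: no header row, so columns can only be addressed by position.
HEADLESS_CSV = "100,1,23\n101,2,34\n102,3,45\n"


def read_column(text: str, index: int, sep: str = ",") -> list:
    """Return one column of a headless CSV, addressed by zero-based index."""
    rows = csv.reader(io.StringIO(text), delimiter=sep)
    return [float(row[index]) for row in rows if row]


# Index 2 is the third column.
third_column = read_column(HEADLESS_CSV, 2)
```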

`timeColumn` is the name of the column with temporal data. You can skip it altogether, in which case the time values are generated for you.

You can also use `timeDisplayFormat` to configure how the temporal column is displayed. It can be a number or a date: if the temporal column is a date, use strftime to format it; if it's a number, use [d3 number](https://github.com/d3/d3-format#locale_format) formatting.

## Zoom and Pan

Press the `ctrl` key and use your mouse wheel to zoom and pan. If you have a huge time series, changing the window position and size inside the overview may not let you zoom in as much as you'd like, because the window width has a certain limit. In that case you can continue zooming with the mouse wheel.

<br/>
<img src="/images/release-080/zoom.gif" class="gif-border" />

## Multivariate and Univariate

There are plenty of ways to set up the plots. Every defined channel is synchronized with every other channel defined inside the same time series object, giving you a multivariate time series labeling experience. You can also define multiple time series objects and get distinct objects.

<br/>
<img src="/images/release-080/multi-uni.png" />

Use the `Channel` tag to represent each additional time-series channel. By providing multiple channels you get a multivariate labeling interface and can label one channel by looking at the behavior of other items at the same timestamp on another channel.

> The `showTracker` attribute on the TimeSeries object controls whether you see the tracker, and holding the `shift` key syncs it between the channels.

## Instance labeling and snapping to the point

Double-click to place a bar that labels one particular data point instead of an entire region. When you create a region, it always snaps to the closest point.

<br/>
<img src="/images/release-080/instance.png" />

## Configuring overview

By default, the overview is created from the first channel, but you have control over that. Use `overviewChannels` to define which columns are included. It uses the same format as the `column` parameter, and you can show multiple channels inside the overview by separating them with commas.

<br/>
<img src="/images/release-080/overview.png" />

## Synchronizing across data types [experimental]

It's not always the case that you can label time series just by looking at the plots. Different events may have different representations, and in such cases, visual support is required. The TimeSeries tag can synchronize with audio or video.

<br/>
<img src="/images/release-080/videosync.png" />

This is an experimental feature right now, and we're working on finalizing the implementation, but if you have use-cases, ping us in [Slack](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw), we will help you to set it up.

## Next

Ready to try? [Install Label Studio](/guide/#Running-with-pip) following our guide and check the [template]() on time series configuration. Also, join the Slack channel if you need any help, have feedback, or feature requests.

Cheers!

## Resources

- Label Studio
- [Templates](/templates/time_series.html) - Label Studio pre configured templates for Time Series
- [TimeSeries](/tags/timeseries.html) - Time Series tag specification
- [Channel](/tags/timeseries.html#Channel) - Channel tag specification
- Machine Learning
- https://github.com/awslabs/gluon-ts - Probabilistic time series modeling in Python
- https://github.com/alan-turing-institute/sktime - sktime is a Python machine learning toolbox for time series with a unified interface for multiple learning tasks.
- https://github.com/blue-yonder/tsfresh - Time Series feature extraction package
37 changes: 37 additions & 0 deletions docs/source/blog/release-080-time-series.md
@@ -0,0 +1,37 @@
---
title: Label Studio Release Notes 0.8.0 - Time Series Support
type: blog
order: 99
---

## What problems does Label Studio solve with Time Series Labeling?

Time Series analysis is widely used in medical and robotics areas.

<GIF-with-labeling-demo>

## Quickstart

1. You need to install and run Label Studio (LS) first. There are several ways to do this: using [pip](https://labelstud.io/guide/#Running-with-pip)
`pip install label-studio && label-studio start my_project --init`
or using [Docker](https://labelstud.io/guide/#Running-with-Docker), [GitHub sources](https://labelstud.io/guide/#Running-from-source), or the [one-click-deploy](https://github.com/heartexlabs/label-studio#one-click-deploy) button.

2. Open LS in the browser (for local usage it will be [http://localhost:8080](http://localhost:8080) usually).

3. Go to the Setup page ([http://localhost:8080/setup](http://localhost:8080/setup)). On this page you need to configure a labeling scheme for your project using LS tags. Read more about LS tags [in the documentation](/tags/timeseries.html). The fastest way to do it is to use the templates available on the Setup page:
<img src="/images/release-080/ts-templates.png" class="gif-border" />

4. Import your CSV/TSV/JSON via Import page ([http://localhost:8080/import](http://localhost:8080/import)).

5. Start Labeling ([http://localhost:8080/](http://localhost:8080/))


## Special cases

### Multiple time series in one labeling config

If you want to use multiple time series tags in one labeling config, you need to host your CSV files manually and create a JSON file with tasks for import that contains links to those CSV files. Alternatively, you can store the time series data directly in the tasks.
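A minimal sketch of building such a tasks file in Python is shown below. The keys `csv_1` and `csv_2` are hypothetical and would have to match the variables used in your config (for example `value="$csv_1"` and `value="$csv_2"` on the two TimeSeries tags); the URLs are placeholders, and the `{"data": ...}` wrapper follows the task examples elsewhere in the docs.

```python
import json


def build_tasks(url_pairs):
    """Build one task per pair of CSV URLs, keyed as csv_1 / csv_2 (hypothetical names)."""
    return [{"data": {"csv_1": first, "csv_2": second}} for first, second in url_pairs]


tasks = build_tasks([
    (
        "http://example.com/path/to/timeseries1.csv",  # placeholder URL
        "http://example.com/path/to/timeseries2.csv",  # placeholder URL
    ),
])

# JSON text that could be saved to a file and imported as a list of tasks.
tasks_json = json.dumps(tasks, indent=2)
```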

### Video & audio sync with time series

It's possible to synchronize TimeSeries with video and audio in Label Studio. Right now you can do it by combining the HyperText tag, containing the HTML objects `<audio src="path">`/`<video src="path">`, with TimeSeries. We have some solutions for this in the testing stage, and we can share them with you [by request in Slack](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw).
9 changes: 9 additions & 0 deletions docs/source/guide/index.md
@@ -171,4 +171,13 @@ Than you need to use `--cert` and `--key` option on start:

```
label-studio start test --cert certificate.pem --key key.pem
```


### Health check

LS has a special endpoint for health checks:

```
/api/health
```
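
As a sketch, the endpoint could be polled from Python like this. The helper names are hypothetical, and because the response body isn't documented here, the check only looks at the HTTP status code.

```python
from urllib.error import URLError
from urllib.request import urlopen


def health_url(base_url: str) -> str:
    """Build the health check URL for a given Label Studio base URL."""
    return base_url.rstrip("/") + "/api/health"


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (the body is not inspected)."""
    try:
        with urlopen(health_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or a non-2xx response.
        return False
```

For a local instance this would be called as `is_healthy("http://localhost:8080")`.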
3 changes: 3 additions & 0 deletions docs/source/guide/tasks.md
@@ -16,6 +16,7 @@ Label Studio expects the JSON-formatted list of _tasks_ as input. Each _task_ is
- `<Audio value="$key">`: `value` is taken as a valid URL to audio file
- `<AudioPlus value="$key">`: `value` is taken as a valid URL to an audio file with CORS policy enabled on the server side
- `<Image value="$key">`: `value` is a valid URL to an image file
- `<TimeSeries value="$key">`: `value` is a valid URL to a CSV/TSV file if `valueType="url"`; otherwise, if `valueType="json"`, it should be a JSON dict of column arrays: `"value": {"first_column": [...], ...}`
* (optional) **id** - integer task ID
* (optional) **completions** - list of output annotation results, where each result is saved using [Label Studio's completion format](/guide/export.html#completions). You can import annotation results in order to use them in consequent labeling task.
* (optional) **predictions** - list of model prediction results, where each result is saved using [Label Studio's prediction format](/guide/export.html#predictions). Importing predictions is useful for automatic task prelabeling & active learning & exploration.
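
For the `valueType="json"` case, the column-array dict can be produced from a CSV with a few lines of Python. This is only a sketch: the sample data is made up, and the key `ts` is a hypothetical name that would need to match `value="$ts"` in your config.

```python
import csv
import io

# Made-up sample data with a header row.
CSV_TEXT = "time,sen1,sen2\n100,1,23\n101,2,34\n102,3,45\n"


def csv_to_columns(text: str) -> dict:
    """Turn a CSV with a header row into a dict of column name -> list of numbers."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


# One task that stores the time series inline instead of linking to a file.
task = {"data": {"ts": csv_to_columns(CSV_TEXT)}}
```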
@@ -123,6 +124,8 @@ this is a second task,456

> Note: Currently CSV / TSV files could be imported only in UI.
> Note: If your config has one TimeSeries instance, CSV/TSV files are interpreted as time series data on import. The CSV/TSV is hosted as a resource file, and LS automatically creates a task with a proper link to the uploaded CSV/TSV.

### Plain text

```bash
2 changes: 1 addition & 1 deletion docs/source/playground/index.html
@@ -1745,7 +1745,7 @@ <h3>Output preview</h3>

// load sample task
$.post({
url: host + '/business/projects/upload-example/',
url: host + '/business/projects/upload-example/?playground=1',
data: {label_config: val}
})
.fail(function(o) {
7 changes: 4 additions & 3 deletions docs/source/tags/keypoint.md
@@ -13,13 +13,14 @@ KeyPoint is used to add a keypoint to an image without label selection. It's use
| name | <code>string</code> | | name of the element |
| toName | <code>string</code> | | name of the image to label |
| [opacity] | <code>float</code> | <code>0.9</code> | opacity of keypoint |
| [fillColor] | <code>string</code> | <code>&quot;#8bad00&quot;</code> | keypoint color |
| [strokeWidth] | <code>number</code> | <code>1</code> | size of keypoint |
| [fillColor] | <code>string</code> | <code>&quot;#8bad00&quot;</code> | keypoint fill color |
| [strokeWidth] | <code>number</code> | <code>1</code> | width of the stroke |
| [stokeColor] | <code>string</code> | <code>&quot;#8bad00&quot;</code> | keypoint stroke color |

### Example
```html
<View>
<KeyPoint name="kp-1" toName="img-1" strokeWidth="4" fillColor="red" />
<KeyPoint name="kp-1" toName="img-1" />
<Image name="img-1" value="$img" />
</View>
```
10 changes: 6 additions & 4 deletions docs/source/tags/keypointlabels.md
@@ -14,14 +14,16 @@ KeyPointLabels tag creates labeled keypoints
| name | <code>string</code> | | name of the element |
| toName | <code>string</code> | | name of the image to label |
| [opacity] | <code>float</code> | <code>0.9</code> | opacity of keypoint |
| [strokeWidth] | <code>number</code> | <code>1</code> | size of keypoint |
| [fillColor] | <code>string</code> | | keypoint fill color, default is transparent |
| [strokeWidth] | <code>number</code> | <code>1</code> | width of the stroke |
| [stokeColor] | <code>string</code> | <code>&quot;#8bad00&quot;</code> | keypoint stroke color |

### Example
```html
<View>
<KeyPointLabels name="kp-1" toName="img-1" strokeWidth="4">
<Label value="Face" background="red" />
<Label value="Nose" background="blue" />
<KeyPointLabels name="kp-1" toName="img-1">
<Label value="Face" />
<Label value="Nose" />
</KeyPointLabels>
<Image name="img-1" value="$img" />
</View>
16 changes: 4 additions & 12 deletions docs/source/tags/ranker.md
@@ -1,21 +1,17 @@
---
title: Ranker
type: tags
order: 416
order: 415
---

Ranker tag, used to ranking models

> Ranker has a complex mechanics and uses only the "prediction" field from the input task,
Ranker has a complex mechanics and uses only the "prediction" field from the input task,
please explore input task example carefully.

It renders given list of strings and allows to drag and reorder them.
To see this tag in action you have to use **Input task example** json below as task
on "Import" page:
1) setup given config,
2) go to Import,
3) copy-paste json to the input field and submit,
4) start the labeling.
To see this tag in action you have to use json below as task on "Import" page:
setup given config, go to Import, then copy-paste json to the input field and submit.

### Parameters

@@ -30,13 +26,9 @@ on "Import" page:
<View>
<Text name="txt-1" value="$text"></Text>
<Ranker name="ranker-1" toName="txt-1" ranked="true" sortedHighlightColor="red"></Ranker>
<Ranker name="ranker" value="$items"></Ranker>
</View>
```
### Example

Input task example

```json
[{
"data": {
4 changes: 1 addition & 3 deletions docs/source/tags/table.md
@@ -14,7 +14,5 @@ Table tag, show object keys and values in a table

### Example
```html
<View>
<Table name="text-1" value="$text"></Table>
</View>
<Table name="text-1" value="$text"></Table>
```
