* Add old build of ts
* Add TS templates. Add correct validation for TS. Add sample csv for TS.
* Prettify templates ts
* Add drag & drop to TS
* Make another way to check ts in uploader
* Correct exception treatment with app_before_request_callback
* S3 region support, resolve URI based on current storage setup, add wip lsf builds
* Update error message
* Update LSF build from feature/ts
* Refactor templates
* Add tutorial. Templates reformats
* Fix valuetype lowercase. Update LSF build
* Split ts tutorial to templates/time_series.md and release notes
* Update LSF build
* Use LSF build with TS fixes and persistent state
* Update templates. Add data, completion, predictions support to config preview. Update LSF build
* Add more templates
* Update LSF
* Handle drag-n-drop import with cloud storage connected
* Change default labeling config
* Back default config
* Add dynamic sample task for TS
* Update LSF
* Add support for headless sample task
* Update LSF build
* Support inputFormat & separator in sample task generator
* Fix data examples
* Try to fix independ channels template
* Fix templates
* Update LSF
* Rename vars on backend, inplace generator/predefined sample task,change templates
* Update LSF build
* Generate sample task in render-label-studio
* Add units for ts-text linking template
* Fix templates
* Update LSF
* Fix data example without valueType
* Update LSF build
* Add tag docs. Work on templates
* Update LSF build
* Bugfix in time-series.csv endpoint
* Finish templates. Fix bugs in TS sample generation
* Pretty fix
* Add removable notes on setup page
* Folders drag&drop
* Add marked js
* Fix links to relation docs
* Fix one-point text in template
* Add wait in e2e tests
* Update LSF build
* Add timeless template
* Add env in tests
* Hide readme markdown if errors on render
* Update docs
* Rename templates
* Update docs
* Move readme to the bottom
* Show readme hidden by default
* Rework time_series.md
* Fix docs. Fix path to time-series.csv
* Rename endpoint static_time_series
* Fix generate example
* Fix upload arg in playground
* blog post text and images
* Update time_series.md (#469)
* Update time_series.md
* Fix upload arg in playground

Co-authored-by: makseq-ubnt <makseq@gmail.com>

* Fix time series md and other preview stuff
* modifying text, links, and images
* few changes in docs
* typos
* typo
* fixing case
* Change TS classification template
* Fix docs/blog errors
* Update LSF
* Change v0.8.0

Co-authored-by: nik <nik@heartex.net>
Co-authored-by: hlomzik <hlomzik@gmail.com>
Co-authored-by: Nikita Skryabin <nr@fenelon.ru>
Co-authored-by: Mikhail Maluyk <mikhail.maluyk@gmail.com>
Co-authored-by: Rhythm of vision <rov@Rhythms-Air.hsd1.ca.comcast.net>
1 parent c216398, commit a747855
Showing 82 changed files with 3,679 additions and 164 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
|
||
# Time Series Data Labeling | ||
|
||
Time series are everywhere! Devices, sensors, and events all produce time series. For example, your heartbeat can be represented as a series of events measured every second, and your favorite step tracker records the number of steps you take per minute. | ||
|
||
All these signals can be used for ML model development, and we're excited to present you with one of the first time series data labeling solutions that work across a variety of use-cases and can help you develop ML applications based on time series data! | ||
|
||
<br/> | ||
<img src="/images/release-080/main.gif" class="gif-border" /> | ||
|
||
> Labeled time series data is crucial if you want to develop supervised ML models for pattern recognition. It can also serve as ground-truth data for validating a method's performance. Read on for some of the scenarios and implementation details. | ||
|
||
## Labeling UI Performance | ||
|
||
Most time series datasets contain a lot of points, so the tool has to scale well to datasets with more than 100K points. Initially we tried some existing frontend libraries that provide time series implementations, but none of them were up to the task: even with just 10,000 points you'd start to experience lag when zooming or panning. It was clear that we needed a more robust implementation. We based the rendering on d3, and after numerous optimization attempts we got the desired result: | ||
|
||
### **1,000,000 data points and 10 channels** | ||
|
||
<img src="/images/release-080/ui.gif" class="gif-border" /> | ||
|
||
One of the techniques we used is tiling: when there is a large number of data points, we split them into chunks and render those chunks first, which helps us achieve great performance when the number of data points is very large. When you zoom out, the algorithm samples specific points to give you an overview of your time series data. | ||
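
The two ideas above, tiling and sampling, can be sketched in a few lines of Python. This is only an illustration of the general technique, not the actual d3-based implementation in Label Studio; the function names `split_into_tiles` and `sample_for_overview`, the tile size, and the even-stride sampling strategy are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (time, value)


def split_into_tiles(points: Sequence[Point], tile_size: int) -> List[Sequence[Point]]:
    """Split the series into consecutive chunks of at most `tile_size` points."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    return [points[i:i + tile_size] for i in range(0, len(points), tile_size)]


def sample_for_overview(points: Sequence[Point], max_points: int) -> List[Point]:
    """Keep at most `max_points` evenly spaced points for a zoomed-out view."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if len(points) <= max_points:
        return list(points)
    step = -(-len(points) // max_points)  # ceiling division
    return list(points[::step])


# Hypothetical series: 100,000 points of a simple ramp signal.
series = [(float(t), float(t % 50)) for t in range(100_000)]
tiles = split_into_tiles(series, tile_size=10_000)
overview = sample_for_overview(series, max_points=1_000)
print(len(tiles), len(overview))
```

A fixed stride is the simplest possible sampler; a production renderer would more likely keep local minima and maxima per bucket so that spikes stay visible when zoomed out.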
|
||
## Working with a variety of input types out of the box | ||
|
||
For the examples below, we will use the following configuration: | ||
|
||
```html | ||
<View> | ||
<TimeSeriesLabels name="label" toName="ts"> | ||
<Label value="Walk" /> | ||
<Label value="Run" /> | ||
</TimeSeriesLabels> | ||
|
||
<TimeSeries name="ts" valueType="url" value="$csv" sep="," overviewChannels="sen1,sen2"> | ||
<Channel column="sen1" /> | ||
<Channel column="sen2" /> | ||
</TimeSeries> | ||
</View> | ||
``` | ||
|
||
> If you're new to Label Studio, [learn](https://labelstud.io/tags/) how you can use tags to set up different labeling interfaces for your data. | ||
|
||
Depending on where your time series data comes from, it can be formatted very differently. Label Studio lets you configure how time series are parsed, so you don't have to transform the original file. Let's start with a simple CSV like this: | ||
|
||
```csv | ||
time,sen1,sen2 | ||
100,1,23 | ||
101,2,34 | ||
102,3,45 | ||
``` | ||
|
||
Now consider a CSV with an unusually formatted datetime, for example because you captured it from a sensor that doesn't follow the standard: | ||
|
||
```csv | ||
time,sen1,sen2 | ||
2020-Feb-01 9:30,34.23,272 | ||
2020-Feb-01 9:31,251.23,352 | ||
2020-Feb-01 9:32,337.124,327 | ||
``` | ||
|
||
In that case, the `timeFormat` attribute can handle the parsing for you. It uses [strftime](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) format codes. | ||
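
To see which format codes match the sample timestamps above, here is a small Python check using the same strftime-style codes. The format string `%Y-%b-%d %H:%M` is an assumption worked out for this sample data, and `parse_timestamp` is a hypothetical helper; the snippet only illustrates the codes and says nothing about how Label Studio parses dates internally.

```python
from datetime import datetime

# Assumed format for values such as "2020-Feb-01 9:30":
# %Y = four-digit year, %b = abbreviated month name, %d = day, %H:%M = hour and minute.
TIME_FORMAT = "%Y-%b-%d %H:%M"


def parse_timestamp(raw: str, fmt: str = TIME_FORMAT) -> datetime:
    """Parse one value from the time column using strftime-style codes."""
    return datetime.strptime(raw, fmt)


parsed = parse_timestamp("2020-Feb-01 9:30")
print(parsed.isoformat())  # 2020-02-01T09:30:00
```

Note that `%b` depends on the locale; the month abbreviations in the sample are English.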
|
||
The `valueType` attribute controls whether the input is provided as-is or via a URL. For example, the input file may be a list of URLs; in that case `valueType="url"` loads the contents of each URL and expects time series data inside. | ||
|
||
```csv | ||
csvURL | ||
http://example.com/path/to/timeseries1.csv | ||
http://example.com/path/to/timeseries2.csv | ||
``` | ||
|
||
For a headless CSV (one without a header row), you can use a column index to point to the right column. For example, using `2` in the Channel's `column` attribute selects the third column (indexing starts from zero) of the headless CSV. | ||
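
The zero-based indexing can be illustrated with a short Python sketch that reads a headless version of the earlier sample. The helper `read_column` and the inline data are hypothetical and exist only to show which values a column index of `2` refers to.

```python
import csv
import io

# Headless variant of the simple sample: time, sen1, sen2 without a header row.
HEADLESS_CSV = "100,1,23\n101,2,34\n102,3,45\n"


def read_column(text: str, column: int, sep: str = ",") -> list:
    """Return the values of one column, addressed by zero-based index."""
    reader = csv.reader(io.StringIO(text), delimiter=sep)
    return [float(row[column]) for row in reader if row]


print(read_column(HEADLESS_CSV, 2))  # [23.0, 34.0, 45.0]
```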
|
||
`timeColumn` is the name of the column with temporal data. You can omit it altogether, in which case the time values are generated for you. | ||
|
||
You can also use `timeDisplayFormat` to configure how the temporal column is displayed. The column can hold numbers or dates: if it holds dates, use strftime codes to format it; if it holds numbers, use [d3 number](https://github.com/d3/d3-format#locale_format) formatting. | ||
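
The two families of format strings can be compared side by side in Python, because d3-format is modeled on Python's format specification mini-language. The helper `display_time` and the example format strings below are assumptions for illustration; Label Studio itself applies the formats in JavaScript.

```python
from datetime import datetime


def display_time(value, display_format: str) -> str:
    """Format a time value: strftime codes for dates, a number spec otherwise."""
    if isinstance(value, datetime):
        return value.strftime(display_format)
    return format(value, display_format)


# Date column: strftime-style codes.
print(display_time(datetime(2020, 2, 1, 9, 30), "%Y-%m-%d %H:%M"))  # 2020-02-01 09:30
# Numeric column: a specifier such as ",.2f" (thousands separator, two decimals).
print(display_time(1234.5678, ",.2f"))  # 1,234.57
```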
|
||
## Zoom and Pan | ||
|
||
Hold the `ctrl` key and use your mouse wheel to zoom and pan. With a huge time series, changing the window position and size inside the overview may not let you zoom in as much as you like, because the window has a certain limit on its width. In that case, you can continue zooming with the mouse wheel. | ||
|
||
<br/> | ||
<img src="/images/release-080/zoom.gif" class="gif-border" /> | ||
|
||
## Multivariate and Univariate | ||
|
||
There are plenty of ways to set up the plots. Every channel is synchronized with the other channels defined inside the same time series object, giving you a multivariate time series labeling experience. You can also define multiple time series objects to get distinct objects. | ||
|
||
<br/> | ||
<img src="/images/release-080/multi-uni.png" /> | ||
|
||
Use the `Channel` tag to represent each additional time series channel. By providing multiple channels you get a multivariate labeling interface, so you can label one channel while looking at the behavior of the other channels at the same timestamp. | ||
|
||
> The `showTracker` attribute on the TimeSeries object controls whether you see the tracker, and holding the `shift` key syncs it between the channels. | ||
|
||
## Instance labeling and snapping to the point | ||
|
||
Double-click to place a bar that labels one particular data point instead of an entire region. When you create a region, it always snaps to the closest point. | ||
|
||
<br/> | ||
<img src="/images/release-080/instance.png" /> | ||
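
Snapping a region boundary to the closest data point can be sketched with a binary search over sorted timestamps. This is a generic illustration rather than the code Label Studio uses; `snap_to_nearest` and `snap_region` are hypothetical helpers, and the sketch assumes the timestamps are sorted in ascending order.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def snap_to_nearest(times: Sequence[float], x: float) -> float:
    """Return the timestamp in `times` (sorted ascending) closest to `x`."""
    if not times:
        raise ValueError("times must not be empty")
    i = bisect_left(times, x)
    if i == 0:
        return times[0]
    if i == len(times):
        return times[-1]
    before, after = times[i - 1], times[i]
    # Ties go to the earlier point.
    return before if x - before <= after - x else after


def snap_region(times: Sequence[float], start: float, end: float) -> Tuple[float, float]:
    """Snap both edges of a dragged region to existing data points."""
    return snap_to_nearest(times, start), snap_to_nearest(times, end)


times = [100, 101, 102]
print(snap_region(times, 99.7, 101.6))  # (100, 102)
```

When both edges snap to the same timestamp, the region collapses to a single point, which corresponds to the instance label created by a double-click.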
|
||
## Configuring overview | ||
|
||
By default, the overview is created from the first channel, but you have control over that. Use `overviewChannels` to define which columns are included. It uses the same format as the `column` parameter, and you can include multiple channels in the overview by separating them with commas. | ||
|
||
<br/> | ||
<img src="/images/release-080/overview.png" /> | ||
|
||
## Synchronizing across data types [experimental] | ||
|
||
You can't always label time series just by looking at the plots. Different events may have different representations, and in such cases additional visual support is required. The TimeSeries tag can synchronize with audio or video. | ||
|
||
<br/> | ||
<img src="/images/release-080/videosync.png" /> | ||
|
||
This is an experimental feature right now, and we're working on finalizing the implementation. If you have a use case, ping us in [Slack](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw) and we will help you set it up. | ||
|
||
## Next | ||
|
||
Ready to try? [Install Label Studio](/guide/#Running-with-pip) following our guide and check the [templates](/templates/time_series.html) for time series configuration. Also, join the Slack channel if you need any help or have feedback or feature requests. | ||
|
||
Cheers! | ||
|
||
## Resources | ||
|
||
- Label Studio | ||
  - [Templates](/templates/time_series.html) - Label Studio preconfigured templates for Time Series | ||
  - [TimeSeries](/tags/timeseries.html) - Time Series tag specification | ||
  - [Channel](/tags/timeseries.html#Channel) - Channel tag specification | ||
- Machine Learning | ||
  - https://github.com/awslabs/gluon-ts - Probabilistic time series modeling in Python | ||
  - https://github.com/alan-turing-institute/sktime - sktime is a Python machine learning toolbox for time series with a unified interface for multiple learning tasks. | ||
  - https://github.com/blue-yonder/tsfresh - Time Series feature extraction package |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
--- | ||
title: Label Studio Release Notes 0.8.0 - Time Series Support | ||
type: blog | ||
order: 99 | ||
--- | ||
|
||
## What problems does Label Studio solve with Time Series Labeling? | ||
|
||
Time series analysis is widely used in fields such as medicine and robotics. | ||
|
||
<GIF-with-labeling-demo> | ||
|
||
## Quickstart | ||
|
||
1. First, install and run Label Studio (LS). There are several ways to do this. You can use [pip](https://labelstud.io/guide/#Running-with-pip): | ||
`pip install label-studio && label-studio start my_project --init` | ||
or use [Docker](https://labelstud.io/guide/#Running-with-Docker), the [GitHub sources](https://labelstud.io/guide/#Running-from-source), or the [one-click-deploy](https://github.com/heartexlabs/label-studio#one-click-deploy) button. | ||
|
||
2. Open LS in the browser (for local usage, the address is usually [http://localhost:8080](http://localhost:8080)). | ||
|
||
3. Go to the Setup page ([http://localhost:8080/setup](http://localhost:8080/setup)). On this page you need to configure a labeling scheme for your project using LS tags. Read more about LS tags [in the documentation](/tags/timeseries.html). The fastest way to do this is to use the templates available on the Setup page: | ||
<img src="/images/release-080/ts-templates.png" class="gif-border" /> | ||
|
||
4. Import your CSV/TSV/JSON via the Import page ([http://localhost:8080/import](http://localhost:8080/import)). | ||
|
||
5. Start Labeling ([http://localhost:8080/](http://localhost:8080/)) | ||
|
||
|
||
## Special cases | ||
|
||
### Multiple time series in one labeling config | ||
|
||
If you want to use multiple time series tags in one labeling config, you need to host your CSV files manually and create a JSON file with tasks for import that contains links to the CSV files. Alternatively, you can store the time series data directly in the tasks. | ||
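
A tasks file of this kind could be generated with a few lines of Python. This is a minimal sketch under stated assumptions: the data keys `csv_a` and `csv_b` are hypothetical and would have to match the `value="$csv_a"` and `value="$csv_b"` attributes of two TimeSeries tags in your own config, and the URLs are placeholders for wherever you host the files.

```python
import json
from typing import Dict, List


def build_tasks(rows: List[Dict[str, str]]) -> List[dict]:
    """Wrap each mapping of data key -> CSV URL into a task with a `data` field."""
    return [{"data": dict(row)} for row in rows]


# Hypothetical keys and placeholder URLs; one task references two hosted CSV files.
tasks = build_tasks([
    {
        "csv_a": "http://example.com/path/to/timeseries1.csv",
        "csv_b": "http://example.com/path/to/timeseries2.csv",
    },
])

# Save this output as a .json file and upload it on the Import page.
print(json.dumps(tasks, indent=2))
```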
|
||
### Video & audio sync with time series | ||
|
||
It's possible to synchronize TimeSeries with video and audio in Label Studio. Right now you can do this by using the HyperText tag with the HTML elements `<audio src="path">`/`<video src="path">` together with TimeSeries. We have some solutions for this in the testing stage, and we can share them with you [on request in Slack](https://join.slack.com/t/label-studio/shared_invite/zt-cr8b7ygm-6L45z7biEBw4HXa5A2b5pw).