
recipe for using the Lightly Docker as API worker #630

Merged: 11 commits into master on Jan 25, 2022

Conversation

@MalteEbner (Contributor) commented Dec 14, 2021

Description

  • Links to the docs for setting up the S3 bucket

  • Links to the docs for installing the docker

  • Links to the docs for the first steps with the docker

  • Changes the S3 bucket setup tutorial to distinguish between Images and Videos

Generated report

Using the docker with an S3 bucket as remote datasource. — lightly 1.2.3 documentation.pdf

@codecov bot commented Dec 14, 2021

Codecov Report

Merging #630 (9a51601) into master (deef6be) will not change coverage.
The diff coverage is n/a.


@@           Coverage Diff           @@
##           master     #630   +/-   ##
=======================================
  Coverage   87.70%   87.70%           
=======================================
  Files          89       89           
  Lines        3433     3433           
=======================================
  Hits         3011     3011           
  Misses        422      422           

Continue to review the full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update deef6be...9a51601.

@MalteEbner force-pushed the malte-lig-338-tutorial-for-user-story-m01 branch from 1b3887d to 23776f9 on January 17, 2022 08:56
@MalteEbner marked this pull request as ready for review on January 17, 2022 09:13
@japrescott (Contributor) left a comment:

Some comments. I am not sure the title of the tutorial is correct, as this tutorial should be more about directly using a datasource instead of needing the data locally.

Inline review comments (since resolved) on:
  • docs/source/docker/getting_started/first_steps.rst
  • docs/source/docker/integration/docker_api_worker.rst (3 comments)
@philippmwirth (Contributor) commented:
Just had a quick look and the "setting up" is too minimal and is missing a few points:

  • there should be a case distinction between video datasets and image datasets
    -> if you want to process images, choose Image when creating the dataset
    -> if you want to process videos, choose Video when creating the dataset

@MalteEbner (Contributor, Author) commented Jan 17, 2022:

> Just had a quick look and the "setting up" is too minimal and is missing a few points:
>
>   • there should be a case distinction between video datasets and image datasets
>     -> if you want to process images, choose Image when creating the dataset
>     -> if you want to process videos, choose Video when creating the dataset

Setting up the S3 datasource is covered in the dataset creation with AWS recipe: https://docs.lightly.ai/getting_started/dataset_creation/dataset_creation_aws_bucket.html
It surely belongs there and not in the docker tutorial.

@philippmwirth (Contributor) commented:
> > Just had a quick look and the "setting up" is too minimal and is missing a few points:
> >
> >   • there should be a case distinction between video datasets and image datasets
> >     -> if you want to process images, choose Image when creating the dataset
> >     -> if you want to process videos, choose Video when creating the dataset
>
> Setting up the S3 datasource is covered in the dataset creation with AWS recipe: https://docs.lightly.ai/getting_started/dataset_creation/dataset_creation_aws_bucket.html It surely belongs there and not in the docker tutorial.

All I'm saying is that if I follow the tutorial now it will not work as expected.

@MalteEbner (Contributor, Author) commented:
> All I'm saying is that if I follow the tutorial now it will not work as expected.

I adapted the tutorial of setting up the S3 bucket accordingly.

@IgorSusmelj (Contributor) left a comment:

I left a few comments. It worked well and is CRAZY SUPER COOL!!! :D

No, it's really awesome to see that this is working. Sooooo convenient.

Could you also add a line here on top? https://docs.lightly.ai/docker/overview.html
Maybe mark the active learning part not as new anymore.

This recipe requires that you already have a dataset in the Lightly Platform
configured to use the data in your AWS S3 bucket.

Follow the steps on how to `create a Lightly dataset connected to your S3 bucket <https://docs.lightly.ai/getting_started/dataset_creation/dataset_creation_aws_bucket.html>`_.
A Contributor commented:

I would assume at this point that there is an S3 bucket. The user should go to the other link to set up S3 properly, or have a look there if there are any questions.

However, for the sake of this tutorial, I would suggest adding all the elements starting from the dataset creation in the UI here as well, so we guide the user from creating a dataset in Lightly and connecting it with the S3 bucket to running the docker.

A Contributor commented:

Please also tell the user here that they need to pick images or videos depending on the dataset type they want to use.

@MalteEbner (Author) replied:

I would not add all the dataset creation elements here, for two reasons:

  • It would mean duplicating lots of content.
  • It would not work well once we also support GC storage and Azure storage.

Use your subsampled dataset
---------------------------

Once the docker run has finished, you can use your subsampled dataset as you like:
A Contributor commented:

Would add a screenshot here of the UI home screen for the dataset (e.g. how it should look).

subsample it further, or export it for labeling.
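
For context, exporting for labeling is typically done with the lightly pip package's CLI; the following is a minimal sketch with placeholder token, dataset id, and tag name, not a command taken from this PR:

.. code-block:: console

    # Placeholder values: substitute your own token, dataset id, and tag.
    lightly-download token=MY_TOKEN dataset_id=MY_DATASET_ID tag_name='sampled-tag'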

.. _ref-docker-with-datasource-datapool:

Process new data in your S3 bucket using a datapool
---------------------------------------------------
A Contributor commented:

This is sort of the killer feature, but it appears at the end and gets lost because it's just lots of text. I would try to use bullet points or images/screenshots to help. E.g.:

Running the docker for the first time:
[screenshot]

Running the docker again with a new video (we see a new cluster because the new video is very different):
[screenshot]

Only visualizing the newly added data:
[screenshot]

@MalteEbner (Author) replied:

I added much more content and screenshots.

@IgorSusmelj (Contributor) commented:

Ah, and one other thing:

  • we should mention that this only works with S3, and when to expect other sources (Azure, GCP) to work as well

@MalteEbner changed the title from "tutorial for using the Lightly Docker as API worker" to "recipe for using the Lightly Docker as API worker" on Jan 19, 2022
@MalteEbner (Contributor, Author) commented:

> we should mention that this only works with S3, and when to expect other sources (Azure, GCP) to work as well

I added which features don't work yet but will be implemented soon.

@MalteEbner force-pushed the malte-lig-338-tutorial-for-user-story-m01 branch from 44bf89d to bf6c66a on January 19, 2022 18:40

.. code-block:: console

    docker run --gpus all --rm -it \
A Contributor commented:

I would try to give a more complete docker run command. Maybe add the stopping condition?
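
For illustration, a fuller command might look like the sketch below; the image name, volume mount, and parameter values are assumptions modelled on typical Lightly Docker usage at the time, not the exact command from this PR:

.. code-block:: console

    # All values are placeholders; adjust to your environment.
    docker run --gpus all --rm -it \
        -v {OUTPUT_DIR}:/home/output_dir \
        lightly/sampling:latest \
        token=MY_TOKEN \
        dataset_id=MY_DATASET_ID \
        stopping_condition.n_samples=0.3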

@IgorSusmelj (Contributor) left a comment:

Thanks for the changes. It already looks much better. I would still add more detailed instructions on the S3 setup. Just mention the steps:

  1. create the dataset (choose videos or images depending on the data)
  2. edit the dataset and click on S3
  3. fill out the form (here I would add a screenshot; we can even recycle the one we already have :))

The reason I would add this is to give the user a good end-to-end workflow. Otherwise, they have to jump around in our docs and piece the various parts together.

@MalteEbner (Contributor, Author) commented Jan 20, 2022:

> Thanks for the changes. It already looks much better. I would still add more detailed instructions on the S3 setup. Just mention the steps:
>
>   1. create the dataset (choose videos or images depending on the data)
>   2. edit the dataset and click on S3
>   3. fill out the form (here I would add a screenshot; we can even recycle the one we already have :))
>
> The reason I would add this is to give the user a good end-to-end workflow. Otherwise, they have to jump around in our docs and piece the various parts together.

I have some questions regarding your proposal:

  • The 3 steps you proposed are basically the 6 steps in the section "Create and configure a dataset" from the docs: https://docs.lightly.ai/getting_started/dataset_creation/dataset_creation_aws_bucket.html#uploading-your-data
    Do I understand correctly that you would leave out the complete first section on setting up Amazon S3? Would you link to it? Or how would you tell the user what the access key and secret access key are?
  • How would you extend the documentation when we add GC bucket and Azure storage support? Create new tutorials for them, or extend this one?
  • Would you also try to get rid of the other references to the docs (how to download the docker, the first steps with the docker)?

However, I get that the S3 tutorial is not ideal for this, so I created a new issue: https://linear.app/lightly/issue/LIG-548/datasource-setup-recipes-use-order-fitting-milestone-0

@MalteEbner force-pushed the malte-lig-338-tutorial-for-user-story-m01 branch from 6260a46 to 9a51601 on January 25, 2022 13:02
@MalteEbner merged commit 61863ac into master on Jan 25, 2022
@MalteEbner deleted the malte-lig-338-tutorial-for-user-story-m01 branch on January 25, 2022 13:22