Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vscode plugin to enable interactive debugging #1922

Merged
merged 21 commits into from Nov 8, 2023

Conversation

troychiu
Copy link
Member

@troychiu troychiu commented Oct 28, 2023

TL;DR

Add vscode plugin to enable interactive debugging

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

The Flytekit VSCode plugin offers an easy solution for users to run tasks within an interactive VSCode server, compatible with any image and any python task types (e.g. tfjob, pytorchjob, rayjob, etc). This plugin provides a @vscode decorator, which users can put within @task and user function.

In summary, @vscode decorator modifies a container to run a VSCode server:
1. Overrides the user function with a VSCode setup function.
2. Download vscode server from remote to local.
3. Launches and monitors the VSCode server.
4. Terminates after user specified duration.

End to end test

  1. Build the image with Dockerfile.dev docker build --push . -f Dockerfile.dev -t localhost:30000/flytekit:dev --build-arg PYTHON_VERSION=3.8
  2. Write a task with @vscode decorator and run the workflow on remote pyflyte run --remote --image localhost:30000/flytekit:dev vscode_task.py wf --a 30 --b 10

vscode_task.py:

from flytekit import task, workflow
from flytekitplugins.vscode import vscode

@task
@vscode
def t(a: int, b: int) -> int:
    return a + b

@workflow
def wf(a: int, b: int) -> int:
    out = t(a=a, b=b)
    return out
  1. forward a local port to the pod kubectl port-forward -n flytesnacks-development fd314dee47a304bffa75-n0-0 8080:8080
  2. access the server by opening a web browser and navigating to localhost:8080

Screenshot:
e2e

Tracking Issue

flyteorg/flyte#4284

Follow-up issue

  1. shut down the server based on the idle time instead of a fix duration
  2. support vscode plugin in the code server
  3. automatically generate launch.json to support one click to run feature
  4. detect whether a task run in local mode
  5. Have a second entry point defined in the launch.json which not only runs the task but also uploads the results and terminates the pod so that the workflow can then continue with the subsequent tasks.

Dockerfile.dev Outdated Show resolved Hide resolved
plugins/flytekit-vscode/flytekitplugins/vscode/vscode.py Outdated Show resolved Hide resolved
plugins/flytekit-vscode/flytekitplugins/vscode/vscode.py Outdated Show resolved Hide resolved
plugins/flytekit-vscode/flytekitplugins/vscode/vscode.py Outdated Show resolved Hide resolved
plugins/flytekit-vscode/Dockerfile Outdated Show resolved Hide resolved
byhsu and others added 8 commits October 30, 2023 00:10
Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
@troychiu troychiu changed the title [WIP] Vscode plugin init version Vscode plugin init version Oct 30, 2023
@troychiu troychiu marked this pull request as ready for review October 30, 2023 07:35
@troychiu
Copy link
Member Author

I have added a test but I am not sure if there is any better way to test the plugin. Welcome any suggestion!

@codecov
Copy link

codecov bot commented Oct 30, 2023

Codecov Report

Attention: 37 lines in your changes are missing coverage. Please review.

Comparison is base (9b08a89) 54.79% compared to head (f5d4fb3) 62.77%.
Report is 15 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1922      +/-   ##
==========================================
+ Coverage   54.79%   62.77%   +7.98%     
==========================================
  Files         306      312       +6     
  Lines       22777    23111     +334     
  Branches     3453     3493      +40     
==========================================
+ Hits        12481    14509    +2028     
+ Misses      10124     8180    -1944     
- Partials      172      422     +250     
Files Coverage Δ
...flytekit-vscode/flytekitplugins/vscode/__init__.py 100.00% <100.00%> (ø)
...lytekit-vscode/flytekitplugins/vscode/constants.py 100.00% <100.00%> (ø)
...lugins/flytekit-vscode/tests/test_vscode_plugin.py 95.00% <95.00%> (ø)
...lytekit-vscode/flytekitplugins/vscode/decorator.py 48.57% <48.57%> (ø)

... and 73 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
@ByronHsu ByronHsu changed the title Vscode plugin init version Add vscode plugin to enable interactive debugging Nov 1, 2023
plugins/flytekit-vscode/README.md Outdated Show resolved Hide resolved
plugins/flytekit-vscode/README.md Outdated Show resolved Hide resolved
plugins/flytekit-vscode/README.md Outdated Show resolved Hide resolved
code_server_bin_dir = os.path.join(code_server_dir_path, "bin")

# Add the directory of code-server binary to $PATH
os.environ["PATH"] = code_server_bin_dir + os.pathsep + os.environ["PATH"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is os.join() better than manually using +?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean os.path.join()? In this case, I think we cannot use os.path.join() because we are going to concatenate $PATH with new path by ":" (os.pathsep) instead of combining two path to one path. For example, if $PATH is /usr/bin:/local/bin and code_server_bin_dir is /tmp/bin, then the result will be /tmp/bin:/usr/bin:/local/bin.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point!

@ByronHsu
Copy link
Collaborator

ByronHsu commented Nov 1, 2023

Also, add a testing section in the PR description, and attach an e2e test in the flytekit sandbox

@fg91
Copy link
Member

fg91 commented Nov 3, 2023

I haven't tried it yet (will do so in the next days) but the code is clear.

I think the proposal to generate a launch.json which already has the correct command to start the specific task in the workflow is very nice, would you be willing to include this?


When the user runs the task in the interactive VSCode server, how is it handled that then the vscode decorator is ignored and the task function is executed instead?


From what I can see, the user can currently terminate the workflow or just wait in order to stop the interactive vscode task.

As a user, I would want to have a second entry point defined in the launch.json which not only runs the task but also uploads the results and terminates the pod so that the workflow can then continue with the subsequent tasks.

What are your thoughts on this idea?

@fg91
Copy link
Member

fg91 commented Nov 3, 2023

compatible with any image and any python task types (e.g. tfjob, pytorchjob, rayjob, etc).

How would this look like e.g. with a pytorch job? Wouldn't the current implementation start the vscode server in every pod in the worker group? And in a torch distributed process group with sync barriers, wouldn't one immediately cause a deadlock if "different workers different things"?

@ByronHsu
Copy link
Collaborator

ByronHsu commented Nov 4, 2023

How would this look like e.g. with a pytorch job? Wouldn't the current implementation start the vscode server in every pod in the worker group? And in a torch distributed process group with sync barriers, wouldn't one immediately cause a deadlock if "different workers different things"?

Great point. For ray job, the vscode runs on the head, and because ray submits ray tasks as protos to the worker, there is no sync issue.

However, for pytorch, tfjob, and mpijob, every worker looks at the code on its own container, if users only change on the head (master), the change does not reflect the worker, which will be an issue. To resolve this issue:

Option 1: write a command in pyflyte to sync the code to workers.
Option 2: users switch back to one single pod with multi GPUs to debug.

I feel option 1 might be overkill because, for us, users can usually pinpoint the issue by running with a single pod many GPUs. Also, the solution for this could be included in the upcoming PR. We can add this plugin as an experimental feature first.

@troychiu
Copy link
Member Author

troychiu commented Nov 4, 2023

I haven't tried it yet (will do so in the next days) but the code is clear.

I think the proposal to generate a launch.json which already has the correct command to start the specific task in the workflow is very nice, would you be willing to include this?

When the user runs the task in the interactive VSCode server, how is it handled that then the vscode decorator is ignored and the task function is executed instead?

From what I can see, the user can currently terminate the workflow or just wait in order to stop the interactive vscode task.

As a user, I would want to have a second entry point defined in the launch.json which not only runs the task but also uploads the results and terminates the pod so that the workflow can then continue with the subsequent tasks.

What are your thoughts on this idea?

Hi @fg91, thank you for all the awesome idea!

  1. Generating a launch.json which allows one click feature is awesome. I am working on it and will create a separate PR to include this feature!
  2. In runtime, we can get the execution state/mode from the flytekit context manager (
    class Mode(Enum):
    ). When we get local execution, we can simply return the function instead of vscode function. A separate PR will be created.
  3. Do you mean when users have done the debugging, they can trigger another launch.json and we will actually run the task and consequently trigger the subsequent workflow as if there is no vscode plugin? I think it is an awesome idea that we should include :)

Thanks again for these awesome idea!

Signed-off-by: troychiu <y.troychiu@gmail.com>
@fg91
Copy link
Member

fg91 commented Nov 4, 2023

However, for pytorch, tfjob, and mpijob, every worker looks at the code on its own container, if users only change on the head (master), the change does not reflect the worker, which will be an issue. To resolve this issue:
Option 1: write a command in pyflyte to sync the code to workers. Option 2: users switch back to one single pod with multi GPUs to debug.

I agree that option 2 is more than enough. Let's just remove pytorch, tfjob, ... from the description/readme ..

@fg91
Copy link
Member

fg91 commented Nov 4, 2023

  1. In runtime, we can get the execution state/mode from the flytekit context manager (
    class Mode(Enum):

    ). When we get local execution, we can simply return the function instead of vscode function. A separate PR will be created.

Makes sense 👍

  1. Do you mean when users have done the debugging, they can trigger another launch.json and we will actually run the task and consequently trigger the subsequent workflow as if there is no vscode plugin? I think it is an awesome idea that we should include :)

Yes, this is exactly what I meant! (Multiple entrypoints defined in a single launch.json instead of multiple jsons which I think doesn't work)

ByronHsu
ByronHsu previously approved these changes Nov 4, 2023
@ByronHsu
Copy link
Collaborator

ByronHsu commented Nov 4, 2023

LGTM! Let's move on to the follow up works

Signed-off-by: troychiu <y.troychiu@gmail.com>
ByronHsu
ByronHsu previously approved these changes Nov 4, 2023
Signed-off-by: troychiu <y.troychiu@gmail.com>
ByronHsu
ByronHsu previously approved these changes Nov 5, 2023
Copy link
Member

@pingsutw pingsutw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add vscode plugin to this GA workflow.

plugins/flytekit-vscode/setup.py Show resolved Hide resolved
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
@ByronHsu ByronHsu merged commit 3073f2c into flyteorg:master Nov 8, 2023
73 checks passed
from flytekitplugins.vscode import vscode

@task
@vscode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you experiment at all with @task + @vscode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the end to end test of @task + @vscode. If yes, then I have done it. You can check the End to end test section of the PR description.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no... i mean the syntax... that was byron's first idea way back in the day.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean? Could you give an example

@wild-endeavor
Copy link
Contributor

Thanks for the contribution @troychiu and @ByronHsu. This is awesome.

  • Echo the sentiment of being able to see the original command and be able to kick it off and then quit. But I feel like users should be able to do this two ways.
    • I think it would be nice to be able to run pyflyte itself.
    • Somehow have another py file or ipynb file that's auto generated that triggers just the task. Maybe it can be an ipynb file where the steps of downloading the fast register tar file, untarring, downloading the inputs, loading the module anew and running the task, uploading outputs, are all different cells. or just have all this in one py file that the user can set breakpoints on. Kinda like an interactive entrypoint.py.

Actually question - if this is run with fast register, is the updated code already copied before vscode launches.

Just to document here - when you first started this byron i briefly mentioned it might be possible to think about similarities between vscode and fast register. they kinda share a similar pattern. like in both cases, you have a task, but before the task runs, you need to do something. in the fast register case you need to download and untar a tar file. in this case you need to download and start a vscode server. Basically I'm wondering if there's an elegant way to think about the pre and post-processing of tasks. so that tasks effectively look like

@fast
@vscode
@task
def t(): ...

or maybe the order is inverted. does that make any sense? nothing to do right now ofc. just rambling.

ringohoffman pushed a commit to ringohoffman/flytekit that referenced this pull request Nov 24, 2023
* init

Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>

* basic vscode plugin

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix suggestion

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add test

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add readme; fix test

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* remove redundant

Signed-off-by: troychiu <y.troychiu@gmail.com>

* resolve suggestions

Signed-off-by: troychiu <y.troychiu@gmail.com>

* revise readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix docstring style and put constants to a file

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add to workflow and add python 3.11 to setup.py

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add requirements.in and requirements.txt

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

---------

Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
RRap0so pushed a commit to RRap0so/flytekit that referenced this pull request Dec 15, 2023
* init

Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>

* basic vscode plugin

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* WIP

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix suggestion

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add test

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add readme; fix test

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* remove redundant

Signed-off-by: troychiu <y.troychiu@gmail.com>

* resolve suggestions

Signed-off-by: troychiu <y.troychiu@gmail.com>

* revise readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix docstring style and put constants to a file

Signed-off-by: troychiu <y.troychiu@gmail.com>

* fix readme

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add to workflow and add python 3.11 to setup.py

Signed-off-by: troychiu <y.troychiu@gmail.com>

* add requirements.in and requirements.txt

Signed-off-by: troychiu <y.troychiu@gmail.com>

* lint

Signed-off-by: troychiu <y.troychiu@gmail.com>

---------

Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
Signed-off-by: Rafael Raposo <rafaelraposo@spotify.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants