Add vscode plugin to enable interactive debugging #1922
Signed-off-by: troychiu <y.troychiu@gmail.com>
I have added a test, but I am not sure whether there is a better way to test the plugin. Any suggestions are welcome!
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #1922 +/- ##
==========================================
+ Coverage 54.79% 62.77% +7.98%
==========================================
Files 306 312 +6
Lines 22777 23111 +334
Branches 3453 3493 +40
==========================================
+ Hits 12481 14509 +2028
+ Misses 10124 8180 -1944
- Partials 172 422 +250
code_server_bin_dir = os.path.join(code_server_dir_path, "bin")

# Add the directory of code-server binary to $PATH
os.environ["PATH"] = code_server_bin_dir + os.pathsep + os.environ["PATH"]
Is os.join() better than manually using +?
Do you mean os.path.join()? In this case, I don't think we can use os.path.join(), because we are concatenating $PATH with the new path using ":" (os.pathsep) rather than combining two paths into one. For example, if $PATH is /usr/bin:/local/bin and code_server_bin_dir is /tmp/bin, then the result will be /tmp/bin:/usr/bin:/local/bin.
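To make the distinction concrete, here is a minimal sketch using only the standard library. The helper name prepend_to_path and the sample directories are assumptions for illustration and are not taken from the plugin code.

```python
import os


def prepend_to_path(new_dir: str, current_path: str) -> str:
    """Return a PATH-style string in which new_dir is searched first.

    Hypothetical helper for illustration; the PR does this inline.
    """
    # os.pathsep separates entries of a search path (":" on POSIX).
    return new_dir + os.pathsep + current_path


# Build a sample search path with two entries.
current = os.pathsep.join(["/usr/bin", "/local/bin"])
updated = prepend_to_path("/tmp/bin", current)

# os.path.join, by contrast, builds ONE path from several components.
joined = os.path.join("/usr/bin", "bin")

print(updated)
print(joined)
```

On POSIX, where os.pathsep is ":", the first print shows the same /tmp/bin:/usr/bin:/local/bin result described in the reply, while os.path.join yields a single directory path.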
good point!
Also, please add a testing section to the PR description, and attach an e2e test run in the flytekit sandbox.
I haven't tried it yet (will do so in the next few days), but the code is clear. I think the proposal to generate a

When the user runs the task in the interactive VSCode server, how is it handled that the vscode decorator is then ignored and the task function is executed instead?

From what I can see, the user can currently terminate the workflow or just wait in order to stop the interactive vscode task. As a user, I would want to have a second entry point defined in the

What are your thoughts on this idea?
What would this look like, e.g. with a pytorch job? Wouldn't the current implementation start the vscode server in every pod in the worker group? And in a torch distributed process group with sync barriers, wouldn't one immediately cause a deadlock if different workers do different things?
Great point. For a ray job, vscode runs on the head, and because ray submits ray tasks as protos to the workers, there is no sync issue. However, for pytorch, tfjob, and mpijob, every worker looks at the code in its own container, so if users only change the code on the head (master), the change is not reflected on the workers, which will be an issue. To resolve this issue: Option 1: write a command in pyflyte to sync the code to workers. I feel option 1 might be overkill because, for us, users can usually pinpoint the issue by running with a single pod with many GPUs. Also, the solution for this could be included in the upcoming PR. We can add this plugin as an experimental feature first.
Hi @fg91, thank you for all the awesome ideas!
Thanks again for these awesome ideas!
I agree that option 2 is more than enough. Let's just remove pytorch, tfjob, ... from the description/readme.
Makes sense 👍
Yes, this is exactly what I meant! (Multiple entrypoints defined in a single
LGTM! Let's move on to the follow-up work.
Need to add the vscode plugin to this GA workflow.
plugin-names:
from flytekitplugins.vscode import vscode

@task
@vscode
did you experiment at all with @task + @vscode?
Do you mean the end-to-end test of @task + @vscode? If so, I have done it. You can check the End to end test section of the PR description.
no... i mean the syntax... that was byron's first idea way back in the day.
What does that mean? Could you give an example?
Thanks for the contribution @troychiu and @ByronHsu. This is awesome.
Actually, a question: if this is run with fast register, is the updated code already copied before vscode launches?
Just to document here - when you first started this, byron, i briefly mentioned it might be possible to think about similarities between vscode and fast register. they kinda share a similar pattern. like in both cases, you have a task, but before the task runs, you need to do something. in the fast register case you need to download and untar a tar file. in this case you need to download and start a vscode server. Basically I'm wondering if there's an elegant way to think about the pre and post-processing of tasks. so that tasks effectively look like
or maybe the order is inverted. does that make any sense? nothing to do right now ofc. just rambling.
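The snippet that originally followed "look like" did not survive, and the sketch below is not a reconstruction of it. It is only one hypothetical way to picture the pre/post-processing idea in plain Python; with_hooks, the hook descriptions, and the "cleanup" step are invented for illustration and are not flytekit APIs.

```python
from typing import Any, Callable, List, Sequence

Hook = Callable[[], None]


def with_hooks(pre: Sequence[Hook] = (), post: Sequence[Hook] = ()):
    """Run `pre` hooks, then the task body, then `post` hooks.

    Hypothetical helper, not part of flytekit.
    """

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for hook in pre:
                hook()
            try:
                return fn(*args, **kwargs)
            finally:
                # Post hooks run even if the task body raises.
                for hook in post:
                    hook()

        return wrapper

    return decorator


events: List[str] = []


@with_hooks(
    pre=[
        lambda: events.append("download and untar code"),  # fast-register-like step
        lambda: events.append("start vscode server"),      # vscode-like step
    ],
    post=[lambda: events.append("cleanup")],               # assumed post step
)
def my_task(a: int, b: int) -> int:
    events.append("run task body")
    return a + b


total = my_task(30, 10)
print(total, events)
```

Swapping the two pre hooks corresponds to the "maybe the order is inverted" variant mentioned in the comment.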
* init
* basic vscode plugin
* WIP
* WIP
* WIP
* WIP
* fix suggestion
* lint
* add test
* add readme; fix test
* fix readme
* fix readme
* remove redundant
* resolve suggestions
* revise readme
* fix docstring style and put constants to a file
* fix readme
* lint
* add to workflow and add python 3.11 to setup.py
* add requirements.in and requirements.txt
* lint
---------
Signed-off-by: byhsu <byhsu@linkedin.com>
Signed-off-by: troychiu <y.troychiu@gmail.com>
Co-authored-by: byhsu <byhsu@linkedin.com>
TL;DR
Add vscode plugin to enable interactive debugging
Type
Are all requirements met?
Complete description
The Flytekit VSCode plugin offers an easy solution for users to run tasks within an interactive VSCode server, compatible with any image and any python task types (e.g. tfjob, pytorchjob, rayjob, etc). This plugin provides a @vscode decorator, which users can put between @task and the user function.

In summary, the @vscode decorator modifies a container to run a VSCode server:
1. Overrides the user function with a VSCode setup function.
2. Downloads the VSCode server from remote to local.
3. Launches and monitors the VSCode server.
4. Terminates after a user-specified duration.
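The general shape of these steps can be sketched with the standard library alone. This is not the plugin's implementation: interactive_server, its parameters, and the sleeping Python process that stands in for the VSCode server are all assumptions made for illustration, and the download step is left out.

```python
import functools
import subprocess
import sys
import time


def interactive_server(server_cmd, max_seconds, poll_interval=0.1):
    """Hypothetical stand-in for a decorator like @vscode (not the plugin's code)."""

    def decorator(user_fn):
        @functools.wraps(user_fn)
        def setup_fn(*args, **kwargs):
            # Step 1: the user function is overridden; user_fn is never called.
            # Step 2 (downloading the server binary) is omitted in this sketch.
            # Step 3: launch the server process and monitor it.
            proc = subprocess.Popen(server_cmd)
            deadline = time.monotonic() + max_seconds
            try:
                while proc.poll() is None and time.monotonic() < deadline:
                    time.sleep(poll_interval)
            finally:
                # Step 4: stop the server once the allowed duration has elapsed.
                if proc.poll() is None:
                    proc.terminate()
                    proc.wait()
            return None

        return setup_fn

    return decorator


# A sleeping Python process plays the role of the server (assumption).
FAKE_SERVER = [sys.executable, "-c", "import time; time.sleep(60)"]

calls = []


@interactive_server(FAKE_SERVER, max_seconds=0.5)
def add(a: int, b: int) -> int:
    calls.append((a, b))
    return a + b


result = add(30, 10)
print(result, calls)
```

Because the wrapper replaces the function body, calling add(30, 10) here starts and stops the stand-in server and never computes the sum, which mirrors step 1 of the description.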
End to end test
docker build --push . -f Dockerfile.dev -t localhost:30000/flytekit:dev --build-arg PYTHON_VERSION=3.8
@vscode decorator and run the workflow on remote
pyflyte run --remote --image localhost:30000/flytekit:dev vscode_task.py wf --a 30 --b 10
vscode_task.py:
kubectl port-forward -n flytesnacks-development fd314dee47a304bffa75-n0-0 8080:8080
localhost:8080
Screenshot:
Tracking Issue
flyteorg/flyte#4284
Follow-up issue
launch.json which not only runs the task but also uploads the results and terminates the pod so that the workflow can then continue with the subsequent tasks.