Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal to enhance tdworkflow with schedule_from parameter #30

Closed
ehaupt opened this issue Apr 19, 2024 · 8 comments
Closed

Proposal to enhance tdworkflow with schedule_from parameter #30

ehaupt opened this issue Apr 19, 2024 · 8 comments

Comments

@ehaupt
Copy link
Contributor

ehaupt commented Apr 19, 2024

I noticed that the PUT /api/projects endpoint in digdag allows passing a schedule_from parameter to specify a future time for starting the scheduling of new workflows. This feature could be really useful for planning in advance.

Would you consider adding support for this parameter in the client.create_project() method? I believe it could enhance the functionality of your library, and many users including myself would find it extremely beneficial.

Thank you for considering this addition.

@chezou
Copy link
Owner

chezou commented Apr 20, 2024

Added on #31. Can you try it before release?

@ehaupt
Copy link
Contributor Author

ehaupt commented Apr 20, 2024

Awesome, thank you! I'll report back.

@ehaupt
Copy link
Contributor Author

ehaupt commented Apr 20, 2024

Test scenario

  • Start a digdag server
digdag server --memory
  • Created two projects with schedules running 1h in the future from now:
    • normal: without schedule_from
    • schedulefrom: with schedule_from2h in the future from now
import os
from datetime import datetime, timedelta

import requests
import tdworkflow
import yaml

session = requests.Session()
client = tdworkflow.client.Client(
    endpoint="localhost:65432", apikey="", _session=session, scheme="http"
)

# current time
current_time = datetime.now()

# sample workflow
workflow = {
    "timezone": "Europe/Zurich",
    "schedule": {
        "daily>": f"{current_time.hour + 1}:{current_time.minute}:{current_time.second}"
    },
    "+Hello": {"echo>": "Hello"},
}


# mkdir work
def mkproject(project_name, workflow):
    project_dir = f"work/{project_name}"
    os.makedirs(project_dir, exist_ok=True)
    with open(f"{project_dir}/{project_name}.dig", "w") as f:
        f.write(yaml.dump(workflow))
    return project_dir


# normal
project_name = "normal"
project_update = client.create_project(
    project_name=project_name, target_dir=mkproject(project_name, workflow)
)

# schedule_from
time_2_hours_from_now = current_time + timedelta(hours=2)
project_name = "schedulefrom"
project_update = client.create_project(
    project_name=project_name,
    target_dir=mkproject(project_name, workflow),
    schedule_from=time_2_hours_from_now,
)

Expected behavior

  • normal should run in 1h
  • schedulefrom should run 1d and 1h from now

Result

Normal

image

Schedulefrom

image

Conclusion

The schedule_from parameter implementation works as expected.

I noticed that you've also implemented two undocumented (but existing) parameters:

        :param clear_schedules: Clear schedules for the given workflow names
        :param clear_schedule_all: Clear all schedules

Having those available will come in handy. Thank you very very much.

@ehaupt
Copy link
Contributor Author

ehaupt commented Apr 20, 2024

Do you happen to know what setting clear_schedule_all actually does? I've created a project with a schedule, then created the same project again with a different schedule but this time set clear_schedule_all to True while watching:

digdag schedules

All I've observed was that the next scheduled to run at field changed to the correct new time I've set.

@chezou
Copy link
Owner

chezou commented Apr 20, 2024

I don't know much about it, but it looks like it affects handling around the last_session_time. My notation in the docstring may be wrong.
https://github.com/treasure-data/digdag/pull/1800/files#diff-2f645f181008bcd7e23844c3fd9b4a0f8d07f6430ba63aefdcb7c50f12329cfeR505

@chezou
Copy link
Owner

chezou commented Apr 20, 2024

I've updated the docstring accordingly. 7e0ba33

Looking at here
https://github.com/treasure-data/digdag/pull/1800/files#diff-afc0d65ad308597007aeb580d7e11432a12ff7080cc795d2e7306f8f931b03dbR133-R141
and here
https://github.com/treasure-data/digdag/pull/1800/files#diff-2f645f181008bcd7e23844c3fd9b4a0f8d07f6430ba63aefdcb7c50f12329cfeR547-R570
the clear parameters enforce to forget last_session_time from schedules.

I think it's good time to close this issue. Will release a new version soon. Thanks for your evaluation @ehaupt!

@chezou chezou closed this as completed Apr 20, 2024
@chezou
Copy link
Owner

chezou commented Apr 20, 2024

@ehaupt
Copy link
Contributor Author

ehaupt commented Apr 21, 2024

Thank you for the explanation and the new release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants