feat: prefer remote taskfiles over cached ones #1345

pd93 · 2023-09-22T00:18:57Z

As per discussion in #1152 (comment), this PR changes the Remote Taskfiles experiment to prefer remote files over locally cached ones.

Previously, if a user used the --download flag to cache a file, this file would be automatically preferred over the remote copy. The only way to get a newer version of that file was to delete the cached version entirely.

The new behaviour always prefers the remote copy and will always attempt to fetch and use it unless the user specifies the --offline flag (in which case Task will search for a cached version).

In addition to this, if the network times out when trying to fetch a remote copy, Task will now search for a cached version and use that instead.
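The resolution order described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `resolve`, `source`, and `errTimeout` are hypothetical names, and the remote and cache lookups are stubbed out as plain functions.

```go
package main

import (
	"errors"
	"fmt"
)

// errTimeout stands in for a network timeout; the name is hypothetical.
var errTimeout = errors.New("network timeout")

// source is a hypothetical abstraction over "somewhere a Taskfile can be read from".
type source func() (string, error)

// resolve prefers the remote copy, reads only the cache when offline is set,
// and falls back to the cache when the remote fetch times out.
func resolve(offline bool, remote, cache source) (string, error) {
	if offline {
		return cache()
	}
	data, err := remote()
	if errors.Is(err, errTimeout) {
		return cache()
	}
	return data, err
}

func main() {
	remote := func() (string, error) { return "", errTimeout }
	cache := func() (string, error) { return "cached copy", nil }
	data, err := resolve(false, remote, cache)
	fmt.Println(data, err)
}
```

Note that in this sketch only a timeout triggers the cache fallback; any other remote error is returned to the caller unchanged.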

My question now is... Do we need --download at all? I would argue that we can now make --download the default behaviour. i.e. we always cache the file when we download it. If we don't do this, then there is a risk that the cached copy will slowly diverge (if a user doesn't remember to use --download periodically when the remote copy changes). I think it's safer to assume that the user always wants the latest version of the file cached.

I'm also considering a --timeout flag. I have set the default to 10 seconds (which seems reasonable to me), but I can see CI/script users wanting to adjust this for various reasons.

caphrim007 · 2023-09-22T03:19:06Z

The way I try to frame CLI-argument questions like this is to ask whether the behaviour would pass or fail the principle of least astonishment (POLA) for me and the engineers I work with.

In this example, if I were to look at the following syntax in a taskfile,

includes:
  my-remote-namespace: https://raw.githubusercontent.com/my-org/my-repo/main/Taskfile.yml

my gut-reaction before any thinking occurs is that the system is going to download that Taskfile.yml. In other words, my personal POLA is that URLs are not much different than filesystems. I point to a thing and it just gets it.

Caching, given the above, is an implementation detail to me as the user; it's invisible. If it works, yay. If it doesn't work, something seems a little slow, but the system still got my file...I wonder why slow...who cares, yay.

The above is what I also seem to have been able to grok from my colleagues.

So in this regard, my 2c is that the --download arg is redundant, and the --offline arg exposes an internal behavior of the system that might happen anyway under normal operating conditions. But, to your points, I might want to specify it deliberately if I'm using Task in situations where robots are involved and I have full control over my environment.

Happy to hear others' opinions.

pd93 · 2023-09-23T15:58:08Z

Thanks @caphrim007. Really appreciate your thoughts :) I've had a bit more time to think about this.

I think the main reasons for the --offline flag are:

1. If a user wants to view/edit a remote Taskfile before execution, or maybe for debugging.
2. If a user knows that they don't have an internet connection and they don't want to wait for the timeout.

To expand on point 1: if we kept the --download flag, but made it so that it never executes a task, then it would further facilitate the ability to "view/edit a remote Taskfile before execution".

So maybe all we need to do is:

1. Change the default behaviour to always download/cache files (as previously stated)
2. Change the --download flag so that it never executes tasks
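To make the proposed flag semantics concrete, here is a small sketch of how the two flags might map onto behaviour. The `plan` type and `planFor` function are hypothetical, and rejecting the combination of both flags is an assumption of this sketch rather than something decided in the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// plan is a hypothetical summary of what a single invocation should do.
type plan struct {
	Fetch    bool // contact the remote source
	Cache    bool // write the fetched copy to the local cache
	RunTasks bool // execute the requested tasks afterwards
}

// planFor maps the --download and --offline flags to a plan.
func planFor(download, offline bool) (plan, error) {
	switch {
	case download && offline:
		// Assumption: the two flags are treated as mutually exclusive.
		return plan{}, errors.New("--download and --offline cannot be combined")
	case offline:
		// Use whatever is cached and never touch the network.
		return plan{RunTasks: true}, nil
	case download:
		// Refresh the cache, but never execute tasks.
		return plan{Fetch: true, Cache: true}, nil
	default:
		// Default: always download, always cache, then run.
		return plan{Fetch: true, Cache: true, RunTasks: true}, nil
	}
}

func main() {
	p, err := planFor(true, false)
	fmt.Printf("%+v %v\n", p, err)
}
```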

blackjid · 2023-10-03T16:16:05Z

I think I would want this to behave a bit like dependencies...

On the first run, download and cache (possibly logging to stdout that a download is happening).
- You can always run a command to upgrade the dependencies, e.g. task includes upgrade
  - There might be an option to set an automatic upgrade every x hours or minutes.
On subsequent runs, just use the cache until either:
- The reference changes, for example the URL changes to point to a different branch/tag
- The auto-upgrade TTL is reached and a new download happens

I think this is mostly the same as "always download and cache" with a configurable timeout, just presented in a different way. At least, I'm more used to thinking in terms of dependency management than in terms of cache/timeout.

c-ameron · 2023-11-02T17:13:34Z

I like @blackjid's idea!

For context, I want to use these features to have a standard set of Taskfile includes across my org.

For me, I would like it to download first, then by default always use the cache. I run Task a lot (multiple times a minute if I'm running a debugging command like task test -- feature/a), so the extra network overhead wouldn't be welcome.

I like the idea of having the user force a new download to overwrite the cached files. The auto-update-ttl idea is also great. It would give users the low latency of not fetching on every task run, while still allowing automatic updates in a soft manner.

Another suggestion would be to have these options available as keys inside the .yml file as well. To me, it would be clunky to have to pass them all as flags when running my tasks.
As an example:

includes:
  my-remote-namespace: https://raw.githubusercontent.com/my-org/my-repo/main/Taskfile.yml
  offline: true
  auto-update-ttl: weekly

Thanks!

pd93 · 2023-11-02T18:19:33Z

Hey all. Thanks for the comments and sorry for the lack of progress on the experiment lately. It's been a busy month or so for me!

I've pushed the changes discussed in previous comments and I believe this is ready for a review when @andreynering has some time.

@c-ameron I like the idea of a cache TTL, but I'm going to leave this as out-of-scope for now. I don't think this addition will affect the API and could be easily added later. The same goes for adding the flags as keys in the file. The schema is a bit harder to amend for an experiment if we were to change our minds on anything, so I'd like to concentrate on the fundamentals for now (so that we can deliver this experiment quicker) and then revisit these bits later. That said, please feel free to open issues for these features so that they aren't forgotten.

andreynering

👏 👏 👏

andreynering · 2023-11-16T01:22:19Z

docs/docs/experiments/remote_taskfiles.md

-any calls to remote sources.
+Whenever you run a remote Taskfile, the latest copy will be downloaded from the
+internet and cached locally. If for whatever reason, you lose access to the
+internet, you will still be able to run your tasks by specifying the `--offline`


As we talked on Discord, it'd be interesting to have an offline: true setting and a TASK_OFFLINE=1 env to allow users to set this once and have it always enabled.

Can be on another PR if you prefer, no problem.
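As a rough sketch of the env-variable half of that suggestion, the offline setting could be resolved from the flag first and then from TASK_OFFLINE. The function name `offlineEnabled` and the precedence order are assumptions made for illustration; only the TASK_OFFLINE name comes from the comment above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// offlineEnabled reports whether offline mode is on, given the value of the
// --offline flag and a lookup function with the same shape as os.LookupEnv.
func offlineEnabled(flagValue bool, lookup func(string) (string, bool)) bool {
	if flagValue {
		return true // assumption: an explicit flag always wins
	}
	raw, ok := lookup("TASK_OFFLINE")
	if !ok {
		return false
	}
	// strconv.ParseBool accepts values such as "1", "true" and "0".
	enabled, err := strconv.ParseBool(raw)
	return err == nil && enabled
}

func main() {
	fmt.Println(offlineEnabled(false, os.LookupEnv))
}
```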

c-ameron · 2023-11-17

Thanks for this! I also created it as an issue as requested :)
#1403

pd93 · 2023-11-17

I think the functionality here is ready, so let's get this merged. I'll work on the schema/env options in another PR as suggested.

@c-ameron Thanks for creating the issues. I've added them to the TODO list in the experiment issue so they're not forgotten.

andreynering · 2023-11-16T01:24:42Z

docs/docs/experiments/remote_taskfiles.md

+of trying to download it. You are able to use the `--download` flag to update
+the cached version of the remote files without running any tasks.


You are able to use the --download flag to update the cached version of the remote files without running any tasks.

Yes, that's the idea 👍

pd93 mentioned this pull request Sep 22, 2023

Remote Taskfiles experiment #1317

Open

15 tasks

pd93 force-pushed the prefer-remote-files branch from 16647a8 to 8d19e59 Compare November 2, 2023 18:00

andreynering approved these changes Nov 16, 2023

This was referenced Nov 17, 2023

Feature Request: Have an auto update ttl for downloaded taskfiles #1402

Open

Feature request: Allow taskfile schema to use options related to remote taskfiles #1403

Open

pd93 added 6 commits November 17, 2023 20:19

feat: prefer remote taskfiles over cached ones

7518c77

feat: implemented cache on network timeout

24473ff

feat: --download always downloads, but never executes tasks

d2a633f

feat: --timeout flag

e1639d1

fix: bug with timeout error handling

b5e41b6

chore: changelog

dd0ec73

pd93 force-pushed the prefer-remote-files branch from 3fb260c to dd0ec73 Compare November 17, 2023 20:22

pd93 merged commit 546a4d7 into main Nov 17, 2023
11 checks passed

pd93 deleted the prefer-remote-files branch November 17, 2023 20:51
