Add Follow Option (Data Dump Download) #103

Merged
ninoseki merged 17 commits into main from follow on Jan 26, 2026

@ninoseki
Collaborator

Add a --follow option to the data dump's download command (#101)

This PR also includes the following changes:

  • Structure the data dump's list response
  • Add a --directory-prefix/-P option (same as wget)

Note: following without a date param is not implemented in this PR (since this PR is already big)

@ninoseki ninoseki requested a review from fw42 January 16, 2026 09:18
}

func AddDirectoryPrefixFlag(cmd *cobra.Command) {
	cmd.Flags().StringP("directory-prefix", "P", ".", "Set directory prefix where file will be saved")
Contributor

I think if I didn't already know, I would not be sure what this means. Maybe something like output-directory would be clearer?

Collaborator Author

Doesn't the description explain it well?
(I'm biased, since I sometimes use wget with the -P option.)

}

// update the database after successful download
err = db.SetDataDump(path, filepath.Join(directoryPrefix, output))
Contributor

We're doing this even if --follow wasn't used, right? Is that what you intended?

Collaborator Author

Yes, it's intended. I think it's smart not to re-download an already downloaded file when the follow option is used.

pro datadump download hours/api/20260115/20260115-00.gz
# skips 20260115-00.gz and downloads the rest
pro datadump download hours/api/20260115/ --follow
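The behaviour described in this reply can be sketched roughly as follows. This is an illustration only, under stated assumptions: downloadStore is a hypothetical in-memory map standing in for the tool's database, pending is a hypothetical helper, and 20260115-01.gz is a made-up second file in the directory. None of these names come from the PR.

```go
package main

import "fmt"

// downloadStore is a hypothetical in-memory stand-in for the database.
// It maps a remote dump path to the local path the file was saved to.
type downloadStore map[string]string

// pending returns the remote paths that have no record in the store yet,
// i.e. the files a --follow run would still need to fetch.
func pending(store downloadStore, remote []string) []string {
	var out []string
	for _, p := range remote {
		if _, recorded := store[p]; !recorded {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	store := downloadStore{}

	// A plain download (without --follow) still records the file.
	store["hours/api/20260115/20260115-00.gz"] = "20260115-00.gz"

	// Assumed listing of the directory; the second entry is hypothetical.
	remote := []string{
		"hours/api/20260115/20260115-00.gz",
		"hours/api/20260115/20260115-01.gz",
	}

	// Only the file without a record is left to download.
	fmt.Println(pending(store, remote)) // prints "[hours/api/20260115/20260115-01.gz]"
}
```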

if err != nil {
	return nil, fmt.Errorf("failed to check download status for %s: %w", file.Path, err)
}
// if force is set, re-download all files
Contributor

Is there any legitimate reason why someone might want to use --follow --force? It will redownload all files and overwrite them if they already exist, right?

Collaborator Author

To be honest, I don't know. But I think force re-downloading the files is the right move when both --follow and --force are given.

}

if !fileExists(localPath) {
	// if file is deleted, remove it from the database
Contributor

Why do we want that? I imagine what many people will do is have a script that uses --follow, then somehow ingest the downloaded data into their own system, then delete the downloaded files, and then later run again with --follow. The logic here would make it redownload the files that were already downloaded before. I'm not sure that is what people will expect. I would have assumed files that have been downloaded already will be skipped (even if they don't exist anymore). 🤔

Collaborator Author

I intend to cover the following cases:

  • A user deletes a file accidentally
  • A user deletes a file intentionally and wants to fetch it again afterwards

ninoseki and others added 8 commits January 20, 2026 09:35
@ninoseki ninoseki requested a review from fw42 January 20, 2026 05:55
ninoseki and others added 2 commits January 26, 2026 10:12
Co-authored-by: Florian Weingarten <fwgarten@gmail.com>
@ninoseki ninoseki merged commit 5e3a7b7 into main Jan 26, 2026
@ninoseki ninoseki deleted the follow branch January 30, 2026 02:52
2 participants