New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP reporting for backup+restore, CLI reporting gor restore #1905

Closed
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
5 participants
@CapacitorSet
Copy link

CapacitorSet commented Jul 28, 2018

What is the purpose of this change? What does it change?

During backups and restores, Restic will send progress data (how many files processed, how many bytes, how many folders unchanged, etc.) to an external HTTP server every few seconds. It also implements CLI reporting for restores, in a fashion similar to backups.

Was the change discussed in an issue or in the forum before?

Yes: https://forum.restic.net/t/new-feature-http-reporting-for-backup-restore-cli-reporting-for-restore/824

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review

The development of this feature was sponsored by InAsset.

}

// NewBackup returns a new backup progress reporter.
func NewRestore(term *termstatus.Terminal, verbosity uint, statusUrl string, statusInterval int, statusToken string) *Restore {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

func parameter statusUrl should be statusURL

}
}

// NewBackup returns a new backup progress reporter.

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

comment on exported function NewRestore should be of the form "NewRestore ..."

// Restore reports progress for the `restore` command.
type Restore struct {
*Message
Http *HttpRestore

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

struct field Http should be HTTP

h.send(msg)
}

func NewHttpRestore(r *Restore, url string, interval int, token string) *HttpRestore {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported function NewHttpRestore should have comment or be unexported
func NewHttpRestore should be NewHTTPRestore

h.send(msg)
}

func (h *HttpRestore) SendDone() {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported method HttpRestore.SendDone should have comment or be unexported

HTTP_SCANNING_DATA
HTTP_DOING_BACKUP
HTTP_DOING_RESTORE
HTTP_DONE

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

don't use ALL_CAPS in Go names; use CamelCase

HTTP_READING_INDEX
HTTP_SCANNING_DATA
HTTP_DOING_BACKUP
HTTP_DOING_RESTORE

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

don't use ALL_CAPS in Go names; use CamelCase

HTTP_NONE = iota
HTTP_READING_INDEX
HTTP_SCANNING_DATA
HTTP_DOING_BACKUP

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

don't use ALL_CAPS in Go names; use CamelCase

const (
HTTP_NONE = iota
HTTP_READING_INDEX
HTTP_SCANNING_DATA

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

don't use ALL_CAPS in Go names; use CamelCase


const (
HTTP_NONE = iota
HTTP_READING_INDEX

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

don't use ALL_CAPS in Go names; use CamelCase

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from 1c81dcf to e124c40 Jul 28, 2018

h.send(msg)
}

func NewHTTPRestore(r *Restore, url string, interval int, token string) *HTTPRestore {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported function NewHTTPRestore should have comment or be unexported

h.send(msg)
}

func (h *HTTPRestore) SendDone() {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported method HTTPRestore.SendDone should have comment or be unexported

h.send(msg)
}

func (h HTTPRestore) SendUpdate() {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported method HTTPRestore.SendUpdate should have comment or be unexported

"github.com/restic/restic/internal/archiver"
)

type HTTPRestore struct {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported type HTTPRestore should have comment or be unexported

h.send(msg)
}

func NewHTTPBackup(b *Backup, url string, interval int, token string) *HTTPBackup {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported function NewHTTPBackup should have comment or be unexported

const (
HttpNone = iota
HttpReadingIndex
HttpScanningData

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

const HttpScanningData should be HTTPScanningData


const (
HttpNone = iota
HttpReadingIndex

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

const HttpReadingIndex should be HTTPReadingIndex

import "time"

const (
HttpNone = iota

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

const HttpNone should be HTTPNone
exported const HttpNone should have comment (or a comment on this block) or be unexported

@@ -54,8 +59,8 @@ type Backup struct {
}

// NewBackup returns a new backup progress reporter.
func NewBackup(term *termstatus.Terminal, verbosity uint) *Backup {
return &Backup{
func NewBackup(term *termstatus.Terminal, verbosity uint, statusUrl string, statusInterval int, statusToken string) *Backup {

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

func parameter statusUrl should be statusURL

@@ -51,6 +51,9 @@ type GlobalOptions struct {
CACerts []string
TLSClientCert string
CleanupCache bool
StatusUrl string

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

struct field StatusUrl should be StatusURL

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from e124c40 to d0da440 Jul 28, 2018

h.send(msg)
}

// SendUpdate reports completion, sending a message with the final statistics.

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

comment on exported method HTTPRestore.SendDone should be of the form "SendDone ..."

h.send(msg)
}

// NewHTTPRestore creates an HTTP reporter for restoring

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

comment on exported function NewHTTPBackup should be of the form "NewHTTPBackup ..."

h.send(msg)
}

// SendUpdate reports completion, sending a message with the final statistics.

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

comment on exported method HTTPBackup.SendDone should be of the form "SendDone ..."

import "time"

const (
HTTPNone = iota

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported const HTTPNone should have comment (or a comment on this block) or be unexported

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from d0da440 to 4779996 Jul 28, 2018

const (
// HTTPNone:
HTTPNone = iota
HTTPReadingIndex

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

exported const HTTPReadingIndex should have comment (or a comment on this block) or be unexported

import "time"

const (
// HTTPNone:

This comment has been minimized.

@houndci-bot

houndci-bot Jul 28, 2018

comment on exported const HTTPNone should be of the form "HTTPNone ..."

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from 4779996 to 0833c50 Jul 28, 2018

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from 0833c50 to 4ed2ff3 Jul 28, 2018

@CapacitorSet CapacitorSet force-pushed the CapacitorSet:feature-http branch from 4ed2ff3 to b4f575a Jul 28, 2018

@mholt

This comment has been minimized.

Copy link
Contributor

mholt commented Jul 28, 2018

Interesting -- what about just streaming JSON or something to stdout with the progress data? i.e. why does this change require an HTTP server to read the updates?

@CapacitorSet

This comment has been minimized.

Copy link
Author

CapacitorSet commented Jul 28, 2018

That was my first thought when I was assigned this project - "why not just pipe the output to netcat?" - but it turns out that the client needed a cross-platform solution. That's a reasonable constraint.

@mholt

This comment has been minimized.

Copy link
Contributor

mholt commented Jul 28, 2018

I asked, because I'll probably be proposing a PR in the near-future which streams the status out in JSON (or similar) format, sans HTTP. (My change will include the current file(s) being backed up.)

Is there a way to decouple this PR from HTTP?

@CapacitorSet

This comment has been minimized.

Copy link
Author

CapacitorSet commented Jul 29, 2018

Yes, it'd be sufficient to replace http_backup.go:34:

-_, err = http.Post(h.url, "text/json", bytes.NewReader(j))
-if err != nil {
-	panic(err)
-}
+os.Stdout.Write(j)

From there, it's a matter of removing HTTP-specific configuration (URL, token) and renaming functions and variables.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jul 29, 2018

Thanks for your contribution! I'll have to think about this for a bit, but at the moment I'm not convinced that HTTP reporting is a good idea. What @mholt proposed (streaming JSON output) is a much better approach. In addition, the progress reporting for the restorer is a separate feature, it must be submitted in a separate PR. Since we'll rework the restorer completely in #1719, I'd like to hold on for now until that is merged, then we can add reporting properly.

@mpetrazzo

This comment has been minimized.

Copy link

mpetrazzo commented Jul 31, 2018

Hi,
I'm @CapacitorSet colleague.
We are developing a restic wrapper that envelop the restic exe, call it at specific time (with a scheduler) and send the output informations to a centralized server.
We use python with pyinstaller for develop and create standalone exe to distribute them to the clients (win, lin, mac).
After a lot of tests we see that on windows and a program that write to console logging (stdout) aren't so much friends: the caller (our program) can correctly read outpout from stdout only in special cases, and not always. This doesn't happen, for example, on linux.
So we thought, and develop, an alternative method for receive and parse informations from restic, that is platform independent and can be use for collect them in localhost or everywhere where there is a http server (for example also in restic rest-server)

Please consider to accept this patch that doesn't interfere with restic, but add a new possibility to parse restic (fantastic) work.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jul 31, 2018

Thank you for the praise, I love that restic helps you!

It would have been great to talk about this design before somebody actually spent time implementing it, so maybe we would have found a better solution. While it covers your use case, I fear that it's a very narrow solution for a very specific problem that (at least feels to me) could have been resolved in a way that integrates better with restic. After all, we (as a project) have to live with and maintain this code indefinitely.

Here's a great article about the options dilemma, which explains it very well:
http://neugierig.org/software/blog/2018/07/options.html

We already have one way of getting machine-readable feedback (JSON printed to stdout) in a way that does not require any network (which is always a good thing). So I don't think it's a good idea to add a second way. Where do we stop? Add support for writing progress/status information to a MySQL database? To MongoDB and CouchDB? What about monitoring such as Nagios? It's a slippery slope adding more and more features that we as a spare-time for-fun project (at least for me) have to maintain for a long time.

Please don't get me wrong: I love that you spent time and resources to improve restic, and that you use it! But let's talk about the design first.

I'd like to start with your original problem, please open a new issue (ideally as a bug report) so we can discuss it. Do you have any hints or background information what's going on when restic is started by a Python process? Why can you just read stdout of the restic process from Python, e.g. linewise? For me, it's the first time that I hear of such problems (I'm not a Windows user myself). Is there anything we can improve from restic's side?

Maybe you could just redirect the JSON output from restic to a file, and then have the Python code submit this JSON file to your web server. This would not need any support for reporting via HTTP in restic (and is very flexible)...

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jul 31, 2018

I'm closing this PR for now as it cannot be merged. The way forward is to do two different things:

  • For the metric/reporting issue: Open a new issue, describe the use case and the problems with Python reading restic's status output, then we can find out what's going on
  • For the restore CLI feedback: Please hold until #1719 is merged, then maybe submit the PR again, rebased to the merged code.

Thank you very much for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment