Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added abstract helper interface for all storage backends #135

Merged
merged 32 commits into from
Aug 18, 2022
Merged

Added abstract helper interface for all storage backends #135

merged 32 commits into from
Aug 18, 2022

Conversation

MaxJa4
Copy link
Contributor

@MaxJa4 MaxJa4 commented Jul 21, 2022

Implements #113

I created an abstract helper interface and implemented it for all storage backends (Minio/S3, WebDav, SSH, local).
The obvious candidates for similar operations were copyArchive and pruneBackups as well as the initialization of the different clients.
Also doPrune has been moved to the abstract helper.

As I'm no Go expert, feel free to adjust these changes to your liking or suggest optimizations.
Maybe the helpers could be moved in their own module - idk how well that would work with Go though while avoiding import cycles since the helpers need "script" and the script.go needs the helpers...

@m90
Copy link
Member

m90 commented Jul 21, 2022

Thanks for working on this. As it's a pretty big changeset and also a really important step forward, review might take a while and I just wanted to let you know that it's much appreciated and I'm looking forward to get this merged once it's ready. Thanks 🎩

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 21, 2022

No worries! It's non-functional so there is no hurry.
Since it touches almost the entire application, it also helped getting to know the code for other contributions :)

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 21, 2022

Maybe having the helpers in a wrapper would be beneficial. One would only have to call helpers.prune() (as an example).
Having a list of helpers could work too, just calling helpers[i].prune() (as an example) in a loop.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 22, 2022

Just saw your comment in #94, didn't know you already implemented something like that on a different branch. I'll have a look, maybe it can be combined or extended :)

@m90
Copy link
Member

m90 commented Jul 22, 2022

didn't know you already implemented something like that on a different branch

I wasn't super happy with that (otherwise I would have merged it), so feel free to stray away from that path when it makes sense.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 22, 2022

To do (@me):

  • Add collection of storage backends and unify them in script.go (full transparency if possible)
  • Polish/Cleanup
  • Summarize changes here

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 22, 2022

Changes:

  • Moved files with only structs (config/stats) to sub-module "types" so it can be accessed by all other modules
  • Moved util to sub-module "utilites" so it can be accessed by all other modules
  • Created abstract interface / struct for storages (Storage/GenericStorage)
  • Applied pattern to all implemented storage backends (especially regarding the methods init, copy and prune)
  • Unified storage backends into a single StoragePool
  • Script only has to call StoragePool once per action to work as before -> transparent, easy to use and extendable

All changes are just architectural and structural. I tried to not change any functional flows.
IMO it is a lot nicer now, easy to extend with more backends and intuitive to understand.

Feel free to have a look as you find time @m90 and share your thoughts - no hurry here.
After this is done, I could tackle #103 and/or #101 since it would be quite a bit easier to implement after this PR.

Copy link
Member

@m90 m90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this, it will be very nice to have this cleaned up.

I left a couple of comments from a first read inline, and I'd also like to discuss one thing in detail here.

I think it would be great if we could change the package layout here a little bit to make it more flexible, a bit less arbitrary and also more idiomatic Go. The package layout could also guide us towards a minimal interface definition for a storage backend.

How I would envision this to look like right now would be:

internal/
└── storage
    ├── local
    │   └── local.go // exports a func NewLocalStorage(...params) storage.Backend
    ├── s3
    │   └── s3.go // exports a func NewS3Storage(...params) storage.Backend
    ├── ssh
    │   └── ssh.go  // exports a func NewSSHStorage(...params) storage.Backend
    ├── storage.go // exports a Backend interface type
    └── webdav
        └── webdav.go // exports a func NewWebDAVStorage(...params) storage.Backend
cmd
└── backup
    ├── archive.go
    └── .... and so on

Like this we'd achieve two things:

  1. The script logic and control flow stays in its place without creating arbitrary boundaries and creating hard-to-answer questions about who is responsible for doing what.

  2. Logic for talking to a single storage backend is encapsulated in its own package and hidden from the script.


I'll also check out your branch now and poke around in the code a little, trying to run examples and such. Let me know if there's anything you'd like to be clarified or discussed further.

cmd/backup/hooks.go Outdated Show resolved Hide resolved
cmd/backup/script.go Outdated Show resolved Hide resolved
cmd/backup/script.go Show resolved Hide resolved
cmd/backup/storages/local.go Outdated Show resolved Hide resolved
cmd/backup/storages/providers.go Outdated Show resolved Hide resolved
cmd/backup/storages/local.go Outdated Show resolved Hide resolved
cmd/backup/storages/local.go Outdated Show resolved Hide resolved
@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 23, 2022

Thank you for taking the time to review the proposed changes and describing your thoughts on them!
Like I said, I'm not too deep in Go and am a little biased from other approaches (from C#/Python), so I appreciate the input.

I totally agree, that the storages should be in their internal directory/package. You placed each backend, which is a single file, in their own subdirectory. Would you also make it a separate (internal) package? It looks odd to me to have directories for single files, but on the other hand, it cleanly separates it from storage.go and make it clear, what is a backend and what isn't.

I will wait with any changes in regards to your comments so we don't have conflicting changes, since you are having a look right now.

@m90
Copy link
Member

m90 commented Jul 23, 2022

It looks odd to me to have directories for single files

The idea is that package storage just defines what a Backend is, and then other packages (e.g. s3 or webdav) implement it. It would also work if you flattened it out, but when separated it's easier not to accidentally conflate the two responsibilites.

@m90
Copy link
Member

m90 commented Jul 23, 2022

I will wait with any changes in regards to your comments so we don't have conflicting changes, since you are having a look right now.

I won't be making any changes on your branch in any case so feel free to continue whenever it works for you. Don't forget to have a weekend too. Thank you!

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 24, 2022

Did some work regarding the structure like discussed:

  • Internal packages
  • Slice of StorageBackends in script
  • Moved conditionals for backend init back to script
  • Stats after pruning are updated inside the respective backend itself

Utilities/util.go could not be moved back to main, as e.g. the Join function is used by all packages and therefore needs to stay universally accessible without causing cyclic dependencies.

Copy link
Member

@m90 m90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice progress. How do you feel about the changed package layout now? I left a couple of comments inline.

In addition to that I think the next important step would be to get to the following point: Ideally, package storage just exports a single interface type and all of the constructor functions for a storage backend export a single function that returns this very interface type. In addition, the constructor functions could take non-custom types or types defined in package storage only, removing the need for the types package.

Let me know if have any questions. Thanks!

.gitignore Outdated Show resolved Hide resolved
internal/storage/local/local.go Outdated Show resolved Hide resolved
internal/storage/local/local.go Outdated Show resolved Hide resolved
internal/storage/local/local.go Outdated Show resolved Hide resolved
internal/storage/local/local.go Outdated Show resolved Hide resolved
internal/storage/storage.go Outdated Show resolved Hide resolved
internal/storage/s3/s3.go Outdated Show resolved Hide resolved
internal/storage/s3/s3.go Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
cmd/backup/script.go Show resolved Hide resolved
@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 24, 2022

How do you feel about the changed package layout now?

Looks good to me!

In addition, the constructor functions could take non-custom types or types defined in package storage only, removing the need for the types package.

Since the storage backends don't use the Config type anymore, it can surely be moved back to main now. For the stats we would need to move the stats recording back to script as well, which is already the case for copy (not for prune though). So Prune would need to return its stats, so lenCandidates and matches/len(matches)... can you think of a better solution than that? It feels kind of messy to me. Just returning it a Stats object would need the package it lives in again.

cmd/backup/hooks.go Outdated Show resolved Hide resolved
@m90
Copy link
Member

m90 commented Jul 24, 2022

For the stats we would need to move the stats recording back to script as well, which is already the case for copy (not for prune though).

If we already do it for copy, I think it's ok to do the same for prune, no? It's not super ugly, and if you feel it could need a bit more structure, you could also have some sort of intermediate wrapper type defined in package storage. The stats type defined in package main could then have some logic that maps the types returned from package storage without creating a circular dependency.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 25, 2022

Did you mean with this and this comment what I implemented in c42b188 and 164a32b (last two commits)?
It guarantees a standard format and adds the formation of the context (storage backend name).

Examples:

time="2022-07-25T19:31:05+02:00" level=info msg="[S3] Uploaded a copy of backup /tmp/backup-2022-07-25T19-31-01.tar.gz to bucket Testbucket-VPS."

time="2022-07-25T19:32:36+02:00" level=error msg="Fatal error running backup: [S3] Copy: Error uploading backup to remote storage! The specified bucket does not exist: Testbucket-VPS-invalid"

@m90
Copy link
Member

m90 commented Jul 25, 2022

Did you mean with #135 (comment) and #135 (comment) comment what I implemented in c42b188 and 164a32b (last two commits)?

Yes, that's pretty much what I meant, thank you.

@m90
Copy link
Member

m90 commented Jul 28, 2022

@MaxJa4 Just to avoid any misunderstandings, can you ping me when you think this is ready to have another look at? There's no need to hurry or anything, I just wanted to make sure you're not waiting on feedback when I'm thinking you're all set. Thanks.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 28, 2022

Sure, I will ping you!
Can I convert an issue back to a draft without loosing anything? I have never done it backwards like that, but it would help to see, that's it's still WIP (though not much left to do).

@m90
Copy link
Member

m90 commented Jul 28, 2022

Can I convert an issue back to a draft without loosing anything?

I doubt there is anything you can lose here, but you can also keep it a PR proper, I won't merge it by accident :) Thanks!

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Jul 29, 2022

@m90 I think I'm done with everything I wanted to do. If there is anything left, let me know.

internal/storage/storage.go Outdated Show resolved Hide resolved
internal/storage/storage.go Outdated Show resolved Hide resolved
cmd/backup/script.go Outdated Show resolved Hide resolved
@m90
Copy link
Member

m90 commented Aug 11, 2022

@MaxJa4 Are you still planning on doing refinements on this or are you waiting for a review of mine? Both would be fine, I just don't want this to go stale. Thanks.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Aug 11, 2022

@m90 I have nothing left from my side. If we wanna optimize some small things here and there in the future, we still can with separate PRs.

Copy link
Member

@m90 m90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a last round of comments, all of them very nitpicky, but probably worth it so things are kept consistent. Everything else looks pretty good now.

There's a bigger topic left in there still, but I'd like to think about this more and then possibly implement it myself: Even if it's better than passing down the top level logger, I'm not super fond of passing down that logger function, it's currently the only sideeffecfful thing being passed down and I'm not sure if we need to do this at all. Maybe we can somehow have a channel on the storage implementations that emit log messages while things are happening and then script can decide what to do about these. In any case, don't worry about this, I just wanted to ḿention that I'm still thinking about how to solve this.

internal/storage/storage.go Outdated Show resolved Hide resolved
internal/storage/storage.go Outdated Show resolved Hide resolved
internal/storage/storage.go Outdated Show resolved Hide resolved
internal/storage/webdav/webdav.go Outdated Show resolved Hide resolved
.dockerignore Outdated Show resolved Hide resolved
cmd/backup/hooks.go Outdated Show resolved Hide resolved
@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Aug 17, 2022

Rebased with GitHub UI, let's see if it worked correctly. History looked good though. No conflicts anymore.

Copy link
Member

@m90 m90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out really well, thanks for all the effort put into this. I still have to make up my mind about how to get this released (so I won't merge before that), but in case you want to add other features on top of that already, let me know and we can have a develop branch or something similar.

Thank you very much.

@m90 m90 changed the base branch from main to development August 18, 2022 06:51
@m90 m90 merged commit 9afe586 into offen:development Aug 18, 2022
@m90
Copy link
Member

m90 commented Aug 18, 2022

There's now a development branch on the repository that holds this PR plus some changes of mine that I missed in review (most notably fixing the spelling of WebDAV).

@MaxJa4 just so that I can plan how to move ahead, are you planning to add some things on top (e.g. #138 or #101) soon? Both is fine, just so that I know.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Aug 18, 2022

@m90 #138 was on my list, yes, maybe #101 too. But I've been sick the last two weeks and probably will be able to start working on this by the end of next week - just so you know.

@m90
Copy link
Member

m90 commented Aug 18, 2022

Ok, great, no need to hurry in any way. I'll let you know when this hits the main branch so you know where to start.

@MaxJa4 MaxJa4 deleted the enhancement/abstract-interface branch August 18, 2022 08:49
m90 pushed a commit that referenced this pull request Aug 18, 2022
* Added abstract helper interface and implemented it for all storage backends

* Moved storage client initializations also to helper classes

* Fixed ssh init issue

* Moved script parameter to helper struct to simplify script init.

* Created sub modules. Enhanced abstract implementation.

* Fixed config issue

* Fixed declaration issues. Added config to interface.

* Added StorageProviders to unify all backends.

* Cleanup, optimizations, comments.

* Applied discussed changes. See description.

Moved modules to internal packages.
Replaced StoragePool with slice.
Moved conditional for init of storage backends back to script.

* Fix docker build issue

* Fixed accidentally removed local copy condition.

* Delete .gitignore

* Renaming/changes according to review

Renamed Init functions and interface.
Replaced config object with specific config values.
Init func returns interface instead of struct.
Removed custom import names where possible.

* Fixed auto-complete error.

* Combined copy instructions into one layer.

* Added logging func for storages.

* Introduced logging func for errors too.

* Missed an error message

* Moved config back to main. Optimized prune stats handling.

* Move stats back to main package

* Code doc stuff

* Apply changes from #136

* Replace name field with function.

* Changed receiver names from stg to b.

* Renamed LogFuncDef to Log

* Removed redundant package name.

* Renamed storagePool to storages.

* Simplified creation of new storage backend.

* Added initialization for storage stats map.

* Invert .dockerignore patterns.

* Fix package typo
m90 pushed a commit that referenced this pull request Aug 18, 2022
* Added abstract helper interface and implemented it for all storage backends

* Moved storage client initializations also to helper classes

* Fixed ssh init issue

* Moved script parameter to helper struct to simplify script init.

* Created sub modules. Enhanced abstract implementation.

* Fixed config issue

* Fixed declaration issues. Added config to interface.

* Added StorageProviders to unify all backends.

* Cleanup, optimizations, comments.

* Applied discussed changes. See description.

Moved modules to internal packages.
Replaced StoragePool with slice.
Moved conditional for init of storage backends back to script.

* Fix docker build issue

* Fixed accidentally removed local copy condition.

* Delete .gitignore

* Renaming/changes according to review

Renamed Init functions and interface.
Replaced config object with specific config values.
Init func returns interface instead of struct.
Removed custom import names where possible.

* Fixed auto-complete error.

* Combined copy instructions into one layer.

* Added logging func for storages.

* Introduced logging func for errors too.

* Missed an error message

* Moved config back to main. Optimized prune stats handling.

* Move stats back to main package

* Code doc stuff

* Apply changes from #136

* Replace name field with function.

* Changed receiver names from stg to b.

* Renamed LogFuncDef to Log

* Removed redundant package name.

* Renamed storagePool to storages.

* Simplified creation of new storage backend.

* Added initialization for storage stats map.

* Invert .dockerignore patterns.

* Fix package typo
@m90
Copy link
Member

m90 commented Aug 18, 2022

This is on main now, I really like it. Thanks again.

@MaxJa4
Copy link
Contributor Author

MaxJa4 commented Aug 18, 2022

No worries. I like the improvements you made afterwards too, looks cleaner that way.

@m90
Copy link
Member

m90 commented Oct 12, 2022

This is now released in v2.22.0. Thanks for your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants