Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to azblob sdk #4028

Merged
merged 1 commit into from Dec 7, 2022
Merged

Conversation

ekarlso
Copy link
Contributor

@ekarlso ekarlso commented Nov 20, 2022

This is the start to fix #3698 since that requires to take a cred
struct into many of the client New methods

What does this PR change? What problem does it solve?

Was the change previously discussed in an issue or on the forum?

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes.
  • I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

Copy link
Member

@MichaelEischer MichaelEischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Switching to the newer azure sdk is a good idea. Most comments are smaller issues, the only larger one is the Save method.

I'm a bit undecided how the Save method should look like. We basically have three options:

  • change the case with <256MB files to stage and commit a block similar to saveLarge. The main difference would be that the expected md5 hash of the block is retrieved from rd.Hash()
  • Always use saveLarge
  • Always use UploadStream

The simplest way would be to use UploadStream, however, then we'd loose the end-to-end integrity checks for uploaded blocks. My current preference would be to use the first option, even though it's slightly more code.

internal/backend/azure/azure.go Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
internal/backend/azure/azure.go Outdated Show resolved Hide resolved
@ekarlso ekarlso force-pushed the use-az-blob-sdk branch 4 times, most recently from 84195f0 to 59312bd Compare November 23, 2022 08:42
@MichaelEischer
Copy link
Member

I've fixed the remaining review comments. Apparently the azure SDK requires golang 1.18, which would be major bump in the required golang version...

@ekarlso
Copy link
Contributor Author

ekarlso commented Nov 26, 2022

@MichaelEischer idk what's up here but I get

debug log file test.log
debug enabled
Fatal: create key in repository at azure:test:/ failed: StageBlock: PUT https://xxxblob.core.windows.net/test/keys/5a09ec71a89cf36911982a9de23ce53639436830a4114662fe2c457937917c9c
--------------------------------------------------------------------------------
RESPONSE 400: 400 The MD5 value specified in the request did not match with the MD5 value calculated by the server.
ERROR CODE: Md5Mismatch
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>Md5Mismatch</Code><Message>The MD5 value specified in the request did not match with the MD5 value calculated by the server.
RequestId:afd69da6-501e-0038-77cf-01c51e000000
Time:2022-11-26T19:43:01.9531956Z</Message><UserSpecifiedMd5>MBrCsbYtOgytzT9iexEK4w==</UserSpecifiedMd5><ServerCalculatedMd5>H1A55QvWaykMVmhNhVDGwg==</ServerCalculatedMd5></Error>

Seems we're calculating or giving a mismach of the MD5?

@ekarlso
Copy link
Contributor Author

ekarlso commented Nov 27, 2022

So now it's working that far but now it hangs on:

snapshot 709fadfc saved

I guess it's due to one of the locks not releasing?

@MichaelEischer
Copy link
Member

MichaelEischer commented Nov 28, 2022

So now it's working that far but now it hangs on:

snapshot 709fadfc saved

I guess it's due to one of the locks not releasing?

Without more context that's usually pretty hard to tell. In case of a hanging golang program it's often helpful to send a SIGQUIT to the program (ctrl+). Then the runtime prints a stacktrace of all goroutines and exits. Here it was probably caused by a leaked token in the Save method. I've rebased the PR and pushed a fix for that.

[Edit]Did you test whether go test ./internal/backend/azure works?[/Edit]

@ekarlso
Copy link
Contributor Author

ekarlso commented Nov 29, 2022

No but I well def use it before pushing more code! Thanks for the tip

@ekarlso
Copy link
Contributor Author

ekarlso commented Nov 29, 2022

I found a lock contention in the "List" method https://github.com/restic/restic/pull/4028/files#diff-1b24b164885fb0062347e123c6485f3444f696c8d178798bcfa7e6ac6376fd44R416

go test --race --timeout=10s ./internal/backend/azure                               17:32:52
debug log file test.log
debug enabled
tests.go:807: unexpected error: unexpected EOF

panic: test timed out after 10s

goroutine 16 [running]:
testing.(*M).startAlarm.func1()
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:2036 +0xbb
created by time.goFunc
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/time/sleep.go:176 +0x48

goroutine 1 [chan receive]:
testing.(*T).Run(0xc000184ea0, {0xb43a22, 0x10}, 0xb78940)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1494 +0x789
testing.runTests.func1(0x0?)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1846 +0x9a
testing.tRunner(0xc000184ea0, 0xc0001dfba0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1446 +0x217
testing.runTests(0xc0001bee60?, {0xf0ccc0, 0x3, 0x3}, {0x40?, 0x7f4fedb12e40?, 0xf14820?})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1844 +0x7ed
testing.(*M).Run(0xc0001bee60)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1726 +0xa85
main.main()
        _testmain.go:55 +0x2ea

goroutine 35 [chan receive]:
github.com/restic/restic/internal/restic.init.0.func1.1()
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/restic/lock.go:273 +0xc6
created by github.com/restic/restic/internal/restic.init.0.func1
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/restic/lock.go:270 +0x2a

goroutine 3 [syscall]:
os/signal.signal_recv()
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/runtime/sigqueue.go:152 +0x2f
os/signal.loop()
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/os/signal/signal_unix.go:23 +0x25
created by os/signal.Notify.func1.1
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/os/signal/signal.go:151 +0x51

goroutine 37 [chan receive]:
testing.(*T).Run(0xc000185380, {0xa75502, 0xe}, 0xc000394100)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1494 +0x789
github.com/restic/restic/internal/backend/test.(*Suite).RunTests(0xc0001bc410, 0xc000185380)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/test/suite.go:55 +0x2f9
github.com/restic/restic/internal/backend/azure_test.TestBackendAzure(0xc000185380)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/azure/azure_test.go:109 +0x1b2
testing.tRunner(0xc000185380, 0xb78940)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1446 +0x217
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1493 +0x75e

goroutine 21 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fed36f028, 0x72)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/runtime/netpoll.go:305 +0x89
internal/poll.(*pollDesc).wait(0xc000042118, 0xc0000e6000?, 0x0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/internal/poll/fd_poll_runtime.go:84 +0xbd
internal/poll.(*pollDesc).waitRead(...)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000042100, {0xc0000e6000, 0x2000, 0x2000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/internal/poll/fd_unix.go:167 +0x415
net.(*netFD).Read(0xc000042100, {0xc0000e6000, 0x2000, 0x2000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/fd_posix.go:55 +0x51
net.(*conn).Read(0xc000014030, {0xc0000e6000, 0x2000, 0x2000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/net.go:183 +0xb1
crypto/tls.(*atLeastReader).Read(0xc000246000, {0xc0000e6000, 0x2000, 0x2000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/crypto/tls/conn.go:787 +0x86
bytes.(*Buffer).ReadFrom(0xc0000d8278, {0xc351a0, 0xc000246000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/bytes/buffer.go:202 +0x113
crypto/tls.(*Conn).readFromUntil(0xc0000d8000, {0xc35780?, 0xc000014030}, 0x5)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/crypto/tls/conn.go:809 +0x1f3
crypto/tls.(*Conn).readRecordOrCCS(0xc0000d8000, 0x0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/crypto/tls/conn.go:616 +0x417
crypto/tls.(*Conn).readRecord(...)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/crypto/tls/conn.go:582
crypto/tls.(*Conn).Read(0xc0000d8000, {0xc00022f000, 0x1000, 0xc0001029c0?})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/crypto/tls/conn.go:1287 +0x29d
net/http.(*persistConn).Read(0xc0001e67e0, {0xc00022f000, 0x1000, 0x1000})
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/http/transport.go:1929 +0x105
bufio.(*Reader).fill(0xc0002d02a0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/bufio/bufio.go:106 +0x2ab
bufio.(*Reader).Peek(0xc0002d02a0, 0x1)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/bufio/bufio.go:144 +0xd2
net/http.(*persistConn).readLoop(0xc0001e67e0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/http/transport.go:2093 +0x2c8
created by net/http.(*Transport).dialConn
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/http/transport.go:1751 +0x2586

goroutine 22 [select]:
net/http.(*persistConn).writeLoop(0xc0001e67e0)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/http/transport.go:2392 +0x1a9
created by net/http.(*Transport).dialConn
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/net/http/transport.go:1752 +0x261a

goroutine 58 [chan receive]:
testing.(*T).Run(0xc000102d00, {0xb406be, 0x9}, 0xc000246048)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1494 +0x789
github.com/restic/restic/internal/backend/test.(*Suite).TestListCancel(0xc0001bc410, 0xc000102d00)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/test/tests.go:354 +0x545
testing.tRunner(0xc000102d00, 0xc000394100)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1446 +0x217
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1493 +0x75e

goroutine 43 [chan send]:
github.com/restic/restic/internal/backend/sema.Semaphore.GetToken(...)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/sema/semaphore.go:27
github.com/restic/restic/internal/backend/azure.(*Backend).List(0xc00011d290, {0xc39138, 0xc00019e140}, 0x1, 0xc0004f4080)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/azure/azure.go:426 +0x35e
github.com/restic/restic/internal/backend/test.(*Suite).TestListCancel.func1(0xc000480340)
        /home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/test/tests.go:359 +0x10b
testing.tRunner(0xc000480340, 0xc000246048)
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1446 +0x217
created by testing.(*T).Run
        /home/linuxbrew/.linuxbrew/Cellar/go/1.19.3/libexec/src/testing/testing.go:1493 +0x75e
FAIL    github.com/restic/restic/internal/backend/azure 10.037s
FAIL

@MichaelEischer
Copy link
Member

goroutine 43 [chan send]:
github.com/restic/restic/internal/backend/sema.Semaphore.GetToken(...)
/home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/sema/semaphore.go:27
github.com/restic/restic/internal/backend/azure.(*Backend).List(0xc00011d290, {0xc39138, 0xc00019e140}, 0x1, 0xc0004f4080)
/home/ekarlso/projects/evry/dds/infrastructure/docker-images/restic/internal/backend/azure/azure.go:426 +0x35e

I have no idea which version of the code that is. But it looks like it does not include the last three bugfix commits which I've added to this PR (otherwise the line number doesn't make sense).

@ekarlso ekarlso force-pushed the use-az-blob-sdk branch 2 times, most recently from c10dfda to af66852 Compare December 4, 2022 18:14
@ekarlso
Copy link
Contributor Author

ekarlso commented Dec 4, 2022

@MichaelEischer all tests are green and the branch has been updated. Thanks for your help!

Copy link
Member

@MichaelEischer MichaelEischer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I've squashed the commits (the bugfix commits probably aren't useful anyways) and removed the changelog.

@MichaelEischer MichaelEischer merged commit eae7366 into restic:master Dec 7, 2022
@ekarlso ekarlso deleted the use-az-blob-sdk branch December 8, 2022 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for backup on Microsoft Azure Blob Storage using Workload Identity
2 participants