Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass size of source while calling put for storage bucket from kando #809

Merged
merged 12 commits into from
Nov 11, 2020

Conversation

viveksinghggits
Copy link
Contributor

@viveksinghggits viveksinghggits commented Nov 5, 2020

Change Overview

We were passing the object size 0 when we called .put on
the bucket of object storage and because of that whenever stow
tried to check if source size is more than the allowed size we didn't
get expected result.
This PR tries to fix that by passing the correct object size while calling put.

Depends on: #809

Pull request type

Please check the type of change your PR introduces:

  • 🐛 Bugfix

Issues

  • #XXX

Test Plan

» cd pkg/location
» go test                                        
OK: 4 passed, 2 skipped
PASS
ok  	github.com/kanisterio/kanister/pkg/location	12.187s

I tried to upload the file of size 1000M, if the size is greater that 256M multipart upload for stow would be called.

» ls -lh
-rw-rw-r-- 1 vivek vivek 1000M Nov  5 18:29 1000M.db.gz

» ./kando location push --profile '{"Location":{"type":"azure","bucket":"vivek-test","endpoint":"","prefix":"k10/480b12f5-9107-41cb-9f18-ad0b57165d08/migration","region":"Central US"},"Credential":{"Type":"keyPair","KeyPair":{"ID":"test","Secret":"test"},"Secret":null},"SkipSSLVerify":false}'  ./1000M.db.gz --path testsize/1/1000
  • 💪 Manual
  • ⚡ Unit test
  • 💚 E2E

We were passing the object size 0 when we called `.put` on
the bucket of object storage and because of that `stow` was
not able to figure out the source size and eventually multi
part upload was not being triggered.
@viveksinghggits
Copy link
Contributor Author

Please ignore the go.mod, go.sum changes as of now.

@@ -97,7 +98,12 @@ func writeData(ctx context.Context, pType objectstore.ProviderType, profile para
if err != nil {
return err
}
if err := bucket.Put(ctx, path, in, 0, nil); err != nil {
b, err := ioutil.ReadAll(in)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't we end up reading the data twice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we discussed this in yesterday's call. We unknowingly introduced another issue here. After reading the in is going to be nil. I am working on this.

@viveksinghggits
Copy link
Contributor Author

This requires the version of kastenhq/stow that has multipart support (graymeta/stow#162). The tag that we were using in kanister doesn't have that code.

Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
@PrasadG193
Copy link
Contributor

Unit tests run after updating stow dep to v0.2.6-kasten

$ go test -v . -check.vv -check.suitep=4 -timeout=50m
=== RUN   Test
START: location_test.go:60: LocationSuite.SetUpSuite
SKIP: location_test.go:60: LocationSuite.SetUpSuite (Test  requires the environemnt variable 'GOOGLE_APPLICATION_CREDENTIALS')

START: location_test.go:138: LocationSuite.TestAzMultipartUpload
SKIP: location_test.go:138: LocationSuite.TestAzMultipartUpload

START: location_test.go:175: LocationSuite.TestReaderSize
SKIP: location_test.go:175: LocationSuite.TestReaderSize

START: location_test.go:127: LocationSuite.TestWriteAndReadData
SKIP: location_test.go:127: LocationSuite.TestWriteAndReadData

START: location_test.go:60: LocationSuite.SetUpSuite
PASS: location_test.go:60: LocationSuite.SetUpSuite     7.730s

START: location_test.go:138: LocationSuite.TestAzMultipartUpload
START: location_test.go:108: LocationSuite.TearDownTest
PASS: location_test.go:108: LocationSuite.TearDownTest  4.769s

SKIP: location_test.go:138: LocationSuite.TestAzMultipartUpload (Not applicable for location type S3)

START: location_test.go:175: LocationSuite.TestReaderSize
START: location_test.go:108: LocationSuite.TearDownTest
PASS: location_test.go:108: LocationSuite.TearDownTest  0.628s

PASS: location_test.go:175: LocationSuite.TestReaderSize        0.000s

START: location_test.go:127: LocationSuite.TestWriteAndReadData
START: location_test.go:108: LocationSuite.TearDownTest
PASS: location_test.go:108: LocationSuite.TearDownTest  0.593s

PASS: location_test.go:127: LocationSuite.TestWriteAndReadData  16.202s

START: location_test.go:60: LocationSuite.SetUpSuite
PASS: location_test.go:60: LocationSuite.SetUpSuite     0.229s

START: location_test.go:138: LocationSuite.TestAzMultipartUpload
START: location_test.go:108: LocationSuite.TearDownTest
Cannot cleanup test directory: 2020-11-11T05:37:23.474016906Z/testlocation.txt
PASS: location_test.go:108: LocationSuite.TearDownTest  0.093s

PASS: location_test.go:138: LocationSuite.TestAzMultipartUpload 704.459s

START: location_test.go:175: LocationSuite.TestReaderSize
START: location_test.go:108: LocationSuite.TearDownTest
Cannot cleanup test directory: 2020-11-11T05:37:23.474016906Z/testlocation.txt
Cannot cleanup test directory: 2020-11-11T05:37:23.474016906Z/testchunk.txt
PASS: location_test.go:108: LocationSuite.TearDownTest  0.082s

PASS: location_test.go:175: LocationSuite.TestReaderSize        0.000s

START: location_test.go:127: LocationSuite.TestWriteAndReadData
START: location_test.go:108: LocationSuite.TearDownTest
Cannot cleanup test directory: 2020-11-11T05:37:23.474016906Z/testchunk.txt
PASS: location_test.go:108: LocationSuite.TearDownTest  0.056s

PASS: location_test.go:127: LocationSuite.TestWriteAndReadData  3.959s

OK: 5 passed, 4 skipped
--- PASS: Test (708.88s)
PASS
ok      github.com/kanisterio/kanister/pkg/location     708.973s

@PrasadG193
Copy link
Contributor

  1. Test AZ upload with empty files
$ ll -l dump 
-rw-rw-r-- 1 prasad prasad 0 Nov 11 10:25 dump

# ./dist/kando_linux_amd64/kando location push --profile '{"Location":{"type":"azure","bucket":"xxxxxxxxx","endpoint":"","prefix":"k10/test-migration","region":"westindia"},"Credential":{"Type":"keyPair","KeyPair":{"ID":"xxxxx","Secret":"xxxxxxxxxxxxxxxx"},"Secret":null},"SkipSSLVerify":false}'   dump --path testsize/0

# az storage blob list --container-name prasad-test --account-name xxxxxxxxxxx --account-key xxxxxxxxxxxxx 

  {                                                                                                                                                                                                                
    "content": null,                                                                                                                                                                                               
    "deleted": false,                                                                                                                                                                                              
    "metadata": null,                                                                                    
    "name": "k10/test-migration/testsize/0",                                                             
    "properties": {                                                                                                                                                                                                
      "appendBlobCommittedBlockCount": null,                                                             
      "blobTier": "Hot",                      
      "blobTierChangeTime": null,                                                                        
      "blobTierInferred": true,                                                                          
      "blobType": "BlockBlob",                                                                                                                                                                                     
      "contentLength": 0,                                                                                                                                                                                          
      "contentRange": null,      
      "contentSettings": {                                                                                                                                                                                         
        "cacheControl": null,
        "contentDisposition": null,
        "contentEncoding": null,                                                                         
        "contentLanguage": null,
        "contentMd5": "1B2M2Y8AsgTpgAmY7PhCfg==",
        "contentType": "application/octet-stream"
      },
.
.
.

  1. Test Azure upload with 2G file
$ ll -l dump 
-rw-rw-r-- 1 prasad prasad 2147483648 Nov 10 18:29 dump

$ ./dist/kando_linux_amd64/kando location push --profile '{"Location":{"type":"azure","bucket":"xxxxxxxxx","endpoint":"","prefix":"k10/480b12f5-9107-41cb-9f18-ad0b57165d08/migration","region":"westindia"},"Credential":{"Type":"keyPair","KeyPair":{"ID":"xxxxx","Secret":"xxxxxxxxxxxxxxxx"},"Secret":null},"SkipSSLVerify":false}'   dump --path estsize/1

$ az storage blob list --container-name prasad-test --account-name xxxxxxxxxxx --account-key xxxxxxxxxxxxx 

  {                                                                                                                                                   
    "content": null,                                                                                                                                  
    "deleted": false,                                                                                                                                 
    "metadata": null,                                                      
    "name": "k10/480b12f5-9107-41cb-9f18-ad0b57165d08/migration/testsize/1",                  
    "properties": {                                                        
      "appendBlobCommittedBlockCount": null,                                                                                                          
      "blobTier": "Hot",                                                                                                                              
      "blobTierChangeTime": null,
      "blobTierInferred": true,                                                                                                                       
      "blobType": "BlockBlob",                                             
      "contentLength": 2147483648,                                         
      "contentRange": null,                                                
      "contentSettings": {
        "cacheControl": null,
        "contentDisposition": null,
        "contentEncoding": null,
        "contentLanguage": null,
        "contentMd5": null,
        "contentType": "application/octet-stream"
      },
  1. Test S3 upload for 0 filesize
$ ls -l dump 
-rw-rw-r-- 1 prasad prasad 0 Nov 11 09:55 dump


./dist/kando_linux_amd64/kando location push --profile '{"Location":{"type":"s3Compliant","bucket":"xxxxx","endpoint":"","prefix":"k10/test-migration","region":"ap-south-1"},"Credential":{"Type":"keyPair","KeyPair":{"ID":"xxxxxx","Secret":"xxxxxxxxx+9z0vyf"},"Secret":null},"SkipSSLVerify":false}' dump --path testsize/0

$ aws s3 ls s3://xxxxxx/k10/test-migration/ --recursive
2020-11-11 10:01:52          0 k10/test-migration/testsize/0
  1. Test S3 upload for 10M file
$ ls -l dump 
-rw-rw-r-- 1 prasad prasad 10485760 Nov 11 10:04 dump

$ ./dist/kando_linux_amd64/kando location push --profile '{"Location":{"type":"s3Compliant","bucket":"xxxxxx","endpoint":"","prefix":"k10/test-migration","region":"ap-south-1"},"Credential":{"Type":"keyPair","KeyPair":{"ID":"xxxx","Secret":"xxxxxxxx"},"Secret":null},"SkipSSLVerify":false}' dump --path testsize/100


$ aws s3 ls s3://xxxxxxxx/k10/test-migration/ --recursive
2020-11-11 10:01:52          0 k10/test-migration/testsize/0
2020-11-11 10:04:32   10485760 k10/test-migration/testsize/100

pkg/location/location_test.go Show resolved Hide resolved
// Create dump file
err := os.Truncate(s.testMultipartPath, fileSize)
c.Assert(err, IsNil)
testcontent, err := ioutil.ReadFile(s.testMultipartPath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.Open() will give you a reader - you don't need to get the content in-memory then create a bytes.Buffer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
Signed-off-by: PrasadG193 <prasad.ghangal@gmail.com>
@mergify mergify bot merged commit 629d229 into master Nov 11, 2020
@mergify mergify bot deleted the kando-source-size-fix branch November 11, 2020 14:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants