Commit: doc updates
giventocode committed Mar 5, 2018
1 parent 400281c commit 1877f3b
Showing 4 changed files with 71 additions and 16 deletions.
8 changes: 6 additions & 2 deletions args.go
@@ -522,7 +522,7 @@ func (p *paramParserValidator) pvSourceInfoForS3IsReq() error {
 	burl, err := url.Parse(p.params.sourceURIs[0])
 
 	if err != nil {
-		return fmt.Errorf("Invalid S3 endpoint URL. Parsing error: %v.\nThe format is s3://[END_POINT]/[BUCKET]/[OBJECT]", err)
+		return fmt.Errorf("Invalid S3 endpoint URL. Parsing error: %v.\nThe format is s3://[END_POINT]/[BUCKET]/[PREFIX]", err)
 	}
 
 	p.params.s3Source.endpoint = burl.Hostname()
@@ -533,10 +533,14 @@ func (p *paramParserValidator) pvSourceInfoForS3IsReq() error {

 	segments := strings.Split(burl.Path, "/")
 
+	if len(segments) < 2 {
+		return fmt.Errorf("Invalid S3 endpoint URL. Bucket not specified. The format is s3://[END_POINT]/[BUCKET]/[PREFIX]")
+	}
+
 	p.params.s3Source.bucket = segments[1]
 
 	if p.params.s3Source.bucket == "" {
-		return fmt.Errorf("Invalid source S3 URI. Bucket name could be parsed")
+		return fmt.Errorf("Invalid source S3 URI. Bucket name could not be parsed")
 	}
 
 	prefix := ""
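Taken together, the two hunks above tighten the validation of the source s3:// URI. As a standalone illustration, here is a minimal Go sketch of the same parsing flow, assuming the s3://[END_POINT]/[BUCKET]/[PREFIX] format named in the error messages; parseS3URI and the example values are illustrative, not part of BlobPorter's API:

```
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseS3URI splits an s3://[END_POINT]/[BUCKET]/[PREFIX] URI into its parts,
// mirroring the checks in the hunks above: a parse error, a missing bucket
// segment, and an empty bucket name are all rejected.
func parseS3URI(raw string) (endpoint, bucket, prefix string, err error) {
	burl, err := url.Parse(raw)
	if err != nil {
		return "", "", "", fmt.Errorf("invalid S3 endpoint URL. Parsing error: %v.\nThe format is s3://[END_POINT]/[BUCKET]/[PREFIX]", err)
	}

	endpoint = burl.Hostname()

	// burl.Path starts with "/", so segments[0] is always empty and the
	// bucket, if present, is segments[1].
	segments := strings.Split(burl.Path, "/")
	if len(segments) < 2 {
		return "", "", "", fmt.Errorf("invalid S3 endpoint URL. Bucket not specified. The format is s3://[END_POINT]/[BUCKET]/[PREFIX]")
	}

	bucket = segments[1]
	if bucket == "" {
		return "", "", "", fmt.Errorf("invalid source S3 URI. Bucket name could not be parsed")
	}

	prefix = strings.Join(segments[2:], "/")
	return endpoint, bucket, prefix, nil
}

func main() {
	endpoint, bucket, prefix, err := parseS3URI("s3://s3.amazonaws.com/mybucket/data/")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(endpoint, bucket, prefix) // s3.amazonaws.com mybucket data/
}
```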
19 changes: 5 additions & 14 deletions docs/perfmode.rst
@@ -1,14 +1,7 @@
 Performance Mode
 ======================================
 
-If you want to maximize performance, and your source and target are public HTTP based end-points (Blob, S3, and HTTP), running the transfer in a high-bandwidth environment, such as a VM on the cloud, is strongly recommended. This recommendation comes from the fact that blob to blob, S3 to blob, or HTTP to blob transfers are bidirectional: BlobPorter downloads the data (without writing to disk) and uploads it as it is received.
-
-When running in the cloud, consider the region where the transfer VM (where BlobPorter will be running) will be deployed. When possible, deploy the transfer VM in the same region as the target of the transfer. Running in the same region as the target minimizes the transfer costs (egress from the VM to the target storage account) and the network performance impact (lower latency) for the upload operation.
-
-For downloads or uploads of multiple or large files, disk i/o could be the constraining resource that slows down the transfer. Identifying whether this is the case is often cumbersome, but doing so can lead to informed decisions about the environment where BlobPorter runs.
-
-To help with this identification process, BlobPorter has a performance mode that uploads random data generated in memory and measures the performance of the operation without the impact of disk i/o.
-The performance mode for uploads could help you identify the potential upper limit of throughput that the network and the target storage account can provide.
+BlobPorter has a performance mode that uploads random data generated in memory and measures the performance of the operation without the impact of disk i/o.
+The performance mode for uploads could help you identify the potential upper limit of throughput that the network and the target storage account can provide.
 
 For example, the following command will upload 10 x 10GB files to a storage account.
 
@@ -24,19 +17,17 @@ blobporter -f "1GB:10" -c perft -t perf-blockblob -g 20

 Similarly, for downloads, you can simulate downloading data from a storage account without writing to disk. This mode could also help you fine-tune the number of readers (-r option) and get an idea of the maximum download throughput.
 
-The following command will download the data we previously uploaded.
+The following command downloads the data previously uploaded.
 
 ```
 export SRC_ACCOUNT_KEY=$ACCOUNT_KEY
 blobporter -f "https://myaccount.blob.core.windows.net/perft" -t blob-perf
 ```
 
-Then you can try downloading to disk.
+Then you can download to disk.
 
 ```
 blobporter -c perft -t blob-file
 ```
 
-If the performance difference is significant, you can conclude that disk i/o is the bottleneck, at which point you can consider an SSD-backed VM.
-
-
+The performance difference will give you a measurement of the impact of disk i/o.
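
The -r option mentioned above controls the number of concurrent readers. To see its effect, you can re-run the download simulation with a different value; the reader count below is only illustrative:

```
blobporter -f "https://myaccount.blob.core.windows.net/perft" -t blob-perf -r 32
```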
59 changes: 59 additions & 0 deletions docs/resumable_transfers.rst
@@ -0,0 +1,59 @@
Resumable Transfers
======================================
BlobPorter supports resumable transfers. To enable this feature, set the -l option to the path of a transfer status file.

```
blobporter -f "manyfiles/*" -c "many" -l mylog
```
The transfer status file contains entries for when a file is queued and when it is successfully transferred.

The log entries are created with the following tab-delimited format:

```
[Timestamp] [Filename] [Status (1:Started, 2:Completed, 3:Ignored)] [Size] [Transfer ID]
```
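
To illustrate the layout, here is a small standalone Go sketch (not part of BlobPorter) that reads a status file named mylog and reports files that were queued but never completed; it assumes the tab-delimited fields described above:

```
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("mylog")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer f.Close()

	started := map[string]bool{}
	completed := map[string]bool{}

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		// Expected fields: timestamp, filename, status, size, transfer ID.
		fields := strings.Split(scanner.Text(), "\t")
		if len(fields) < 5 {
			continue // skip blank lines and the end-of-file summary section
		}
		switch fields[2] {
		case "1":
			started[fields[1]] = true
		case "2":
			completed[fields[1]] = true
		}
	}

	// Files with a "started" entry but no "completed" entry would be
	// retried on the next run.
	for name := range started {
		if !completed[name] {
			fmt.Println("not completed:", name)
		}
	}
}
```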

The following output from a transfer status file shows that three files were included in the transfer (file10, file11 and file15).
However, only two were successfully transferred: file10 and file11.

```
2018-03-05T03:31:13.034245807Z file10 1 104857600 938520246_mylog
2018-03-05T03:31:13.034390509Z file11 1 104857600 938520246_mylog
2018-03-05T03:31:13.034437109Z file15 1 104857600 938520246_mylog
2018-03-05T03:31:25.232572306Z file10 2 104857600 938520246_mylog
2018-03-05T03:31:25.591239355Z file11 2 104857600 938520246_mylog
```

If a transfer fails, you can reference the same status file when you run it again, and BlobPorter will skip the files that were already transferred.
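
For example, re-running the original command with the same status file resumes the transfer:

```
blobporter -f "manyfiles/*" -c "many" -l mylog
```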

Consider the previous scenario. After executing the transfer again, the status file gains new entries only for the missing file (file15).

```
2018-03-05T03:31:13.034245807Z file10 1 104857600 938520246_mylog
2018-03-05T03:31:13.034390509Z file11 1 104857600 938520246_mylog
2018-03-05T03:31:13.034437109Z file15 1 104857600 938520246_mylog
2018-03-05T03:31:25.232572306Z file10 2 104857600 938520246_mylog
2018-03-05T03:31:25.591239355Z file11 2 104857600 938520246_mylog
2018-03-05T03:54:33.660161772Z file15 1 104857600 495675852_mylog
2018-03-05T03:54:34.579295059Z file15 2 104857600 495675852_mylog
```

When the transfer is successful, a summary is appended at the end of the transfer status file.

```
----------------------------------------------------------
Transfer Completed----------------------------------------
Start Summary---------------------------------------------
Last Transfer ID:495675852_mylog
Date:Mon Mar 5 03:54:34 UTC 2018
File:file15 Size:104857600 TID:495675852_mylog
File:file10 Size:104857600 TID:938520246_mylog
File:file11 Size:104857600 TID:938520246_mylog
Transferred Files:3 Total Size:314572800
End Summary-----------------------------------------------
```




1 change: 1 addition & 0 deletions sources/s3info.go
@@ -33,6 +33,7 @@ type s3InfoProvider struct {
 func newS3InfoProvider(params *S3Params) (*s3InfoProvider, error) {
 	s3client, err := minio.New(params.Endpoint, params.AccessKey, params.SecretKey, true)
 
+
 	if err != nil {
 		log.Fatalln(err)
 	}
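For context, a brief sketch of how the minio-go constructor used in newS3InfoProvider is typically called; the endpoint, credentials, and the connectivity check below are placeholders for illustration, not BlobPorter code:

```
package main

import (
	"log"

	minio "github.com/minio/minio-go"
)

func main() {
	// Same constructor shape as in newS3InfoProvider: endpoint, access key,
	// secret key, and secure=true for HTTPS.
	client, err := minio.New("s3.amazonaws.com", "ACCESS_KEY", "SECRET_KEY", true)
	if err != nil {
		log.Fatalln(err)
	}

	// A quick connectivity check: list the buckets the credentials can see.
	buckets, err := client.ListBuckets()
	if err != nil {
		log.Fatalln(err)
	}
	for _, b := range buckets {
		log.Println(b.Name)
	}
}
```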
