#### James Gray | jamesgray@ischool.berkeley.edu
#### UC Berkeley School of Information - W251
#### Week 7 Assignment

* https://github.com/MIDS-scaling-up/coursework/tree/master/week7/hw

## Using the SoftLayer Object Storage Python Client to manage Swift

https://github.com/softlayer/softlayer-object-storage-python

In [11]:
import object_storage
import os
import time

DATAFILE = os.path.expanduser("~/onedrive/github/Berkeley_W251/googlebooks.csv")
googlefilesize = 1596  # size of googlebooks.csv in MB

# connect to storage account
sl_storage = object_storage.get_client('SLOS856755-2:SL856755',
                                       '', 
                                       datacenter='dal05')


In [7]:
# view storage containers
sl_storage.containers()

[Container(more_files), Container(myfiles), Container(week7)]

In [None]:
# create storage container for this assignment
try:
    sl_storage['week7'].create()
except object_storage.errors.ResponseError as e:
    # print the error returned by the service
    print e

In [12]:
# upload a file larger than 1 GB to the container

try:
    # create the empty object before loading data into it
    sl_storage['week7']['googlebooks2.csv'].create()
except object_storage.errors.ResponseError as e:
    print e

# start the timer outside the try block so it is always defined
uploadstart = time.time()
try:
    sl_storage['week7']['googlebooks2.csv'].load_from_filename(DATAFILE)
except object_storage.errors.ResponseError as e:
    print e
uploadcomplete = time.time()

uploadtime = uploadcomplete - uploadstart
print "Total upload time " + str(uploadtime) + " seconds"
uploadspeed = googlefilesize/uploadtime
print "Upload speed = " + str(uploadspeed) + " MB/sec"

Total upload time 143.199480057 seconds
Upload speed = 11.145291864 MB/sec
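
The `googlefilesize` value above is hardcoded. As a minimal sketch, the speed could instead be derived from a byte count, assuming 1 MB means 1024 * 1024 bytes (which matches the 1596 figure). The helper name `throughput_mb_per_sec` is hypothetical and is not part of the `object_storage` client; the byte count is the size reported for googlebooks.csv in the container listing.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def throughput_mb_per_sec(size_bytes, elapsed_seconds):
    """Return the transfer speed in MB/sec for size_bytes moved in elapsed_seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (float(size_bytes) / BYTES_PER_MB) / elapsed_seconds


# 1674466136 bytes is the size listed for googlebooks.csv in the week7 container.
# In the notebook, os.path.getsize(DATAFILE) would supply the local byte count.
speed = throughput_mb_per_sec(1674466136, 143.199480057)
print("Upload speed = %.2f MB/sec" % speed)
```

Computing the size from the file avoids a stale constant if a different data file is used.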


In [13]:
# download file from the container 

downloadstart = time.time()

try:
    # read() returns the object's contents; they are discarded because only the timing matters
    sl_storage['week7']['googlebooks2.csv'].read()
except object_storage.errors.ResponseError as e:
    print e

downloadcomplete = time.time()

downloadtime = downloadcomplete - downloadstart
print "Total download time = " + str(downloadtime)
downloadspeed = googlefilesize/downloadtime
print "Download speed = " + str(downloadspeed) + " MB/sec"

Total download time = 199.914308071
Download speed = 7.98342057354 MB/sec


In [10]:
# list the data in the week7 container
sl_storage['week7'].objects()


[StorageObject(week7, tweets2.json, 38731776B),
 StorageObject(week7, googlebooks.csv, 1674466136B),
 StorageObject(week7, random.hdf5, 4000002144B)]

In [9]:
# delete file object in week7 container

sl_storage['week7']['googlebooks2.csv'].delete()

True

In [None]:
# run two threads in parallel
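
This cell was left empty; the parallel test reported in Part 4 was run from two separate command windows instead. As a rough sketch of how the same test might look with threads, the helper below times several callables concurrently. The name `run_in_parallel` is hypothetical, the tasks are stand-ins that only sleep, and whether one `object_storage` client can safely be shared across threads is an assumption that was not verified here.

```python
import threading
import time


def run_in_parallel(tasks):
    """Run each (name, callable) pair in its own thread.

    Returns a dict mapping each name to its elapsed time in seconds.
    """
    durations = {}

    def worker(name, func):
        start = time.time()
        func()
        durations[name] = time.time() - start

    threads = [threading.Thread(target=worker, args=(name, func))
               for name, func in tasks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait for every transfer to finish
    return durations


# Stand-in tasks so the sketch runs without a storage account. In the notebook,
# each callable would instead wrap an upload such as
# sl_storage['week7']['googlebooks3.csv'].load_from_filename(DATAFILE).
results = run_in_parallel([
    ("googlebooks3.csv", lambda: time.sleep(0.1)),
    ("googlebooks4.csv", lambda: time.sleep(0.1)),
])
for name in sorted(results):
    print("%s finished in %.2f seconds" % (name, results[name]))
```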


## Using the REST API to manage Swift

* https://sldn.softlayer.com/blog/waelriac/Managing-SoftLayer-Object-Storage-Through-REST-APIs
* http://sldn.softlayer.com/blog/bpotter/more-softlayer-rest-api-examples
* http://sldn.softlayer.com/article/rest
* http://www.cloudsoftcorp.com/blog/2014/04/crib-sheet-softlayer-object-storage-keystone/

In [None]:
# get authentication tokens for user and storage account
! curl -i -H "X-Auth-User: SLOS856755-2:SL856755" \
-H "X-Auth-Key: " \
https://dal05.objectstorage.softlayer.net/auth/v1.0
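
In Swift's v1.0 auth scheme, a successful response normally carries the token in an `X-Auth-Token` header and the account endpoint in an `X-Storage-Url` header; those two values feed the requests that follow. The sketch below pulls them out of `curl -i` output. The function name `parse_auth_headers` is hypothetical and the sample response is made up for illustration, not captured from this account.

```python
def parse_auth_headers(raw_response):
    """Extract X-Auth-Token and X-Storage-Url from `curl -i` output."""
    headers = {}
    for line in raw_response.splitlines()[1:]:  # skip the HTTP status line
        if not line.strip():
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers.get("x-auth-token"), headers.get("x-storage-url")


# Made-up sample response; the token and account values are placeholders.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "X-Auth-Token: AUTH_tkEXAMPLE\r\n"
    "X-Storage-Url: https://dal05.objectstorage.softlayer.net/v1/AUTH_example\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
token, storage_url = parse_auth_headers(sample)
print(token)
print(storage_url)
```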

In [1]:
# list containers
! curl -i -H "X-Auth-Token: " \
https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f

HTTP/1.1 200 OK
Content-Length: 25
X-Account-Meta-Nas-Id: 8571659
X-Account-Object-Count: 8
X-Account-Storage-Policy-Standard-Container-Count: 3
X-Timestamp: 1452997391.90664
X-Account-Meta-Cdn-Id: 75651
X-Account-Storage-Policy-Standard-Object-Count: 8
X-Account-Bytes-Used: 10736598492
X-Account-Container-Count: 3
Content-Type: text/plain; charset=utf-8
Accept-Ranges: bytes
X-Account-Storage-Policy-Standard-Bytes-Used: 10736598492
X-Trans-Id: txe62b489c53bb4956bb9a7-0056cbd237
Date: Tue, 23 Feb 2016 03:29:59 GMT

more_files
myfiles
week7


In [2]:
# list files in week7 container
! curl -i -H "X-Auth-Token: " \
https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7

HTTP/1.1 200 OK
Content-Length: 92
X-Container-Object-Count: 6
Accept-Ranges: bytes
X-Storage-Policy: standard
X-Container-Bytes-Used: 10736598464
X-Timestamp: 1456008266.39793
Content-Type: text/plain; charset=utf-8
X-Trans-Id: tx0a56dbeed93144e89b3b0-0056cbd23d
Date: Tue, 23 Feb 2016 03:30:05 GMT

googlebooks.csv
googlebooks2.csv
googlebooks3.csv
googlebooks4.csv
random.hdf5
tweets2.json


In [None]:
# upload random.hdf5 to week7 container

uploadstart = time.time()
filesize = 3815  # file size in MB
    
! curl -i -XPUT -H "X-Auth-Token: " \
-T random.hdf5 https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7/random.hdf5

uploadcomplete = time.time()

uploadtime = uploadcomplete - uploadstart
print "Total upload time " + str(uploadtime) + " seconds"
uploadspeed = filesize/uploadtime
print "Upload speed = " + str(uploadspeed) + " MB/sec"

In [None]:
# upload googlebooks.csv to week7 container

uploadstart = time.time()
filesize = 1597  # file size in MB
    
! curl -i -XPUT -H "X-Auth-Token: " \
-T googlebooks.csv https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7/googlebooks.csv

uploadcomplete = time.time()

uploadtime = uploadcomplete - uploadstart
print "Total upload time " + str(uploadtime) + " seconds"
uploadspeed = filesize/uploadtime
print "Upload speed = " + str(uploadspeed) + " MB/sec"

In [None]:
# download file from week7 container

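filesize = 1597  # size of googlebooks.csv in MB; set here so the cell does not depend on an earlier one
downloadstart = time.time()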

! curl -i -H "X-Auth-Token: " \
https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7/googlebooks.csv
    
downloadcomplete = time.time()
downloadtime = downloadcomplete - downloadstart 
print "Total download time " + str(downloadtime) + " seconds"

downloadspeed = filesize/downloadtime
print "Download speed = " + str(downloadspeed) + " MB/sec"
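
The cell above times the transfer from Python and lets the downloaded body print into the notebook. An alternative sketch, assuming curl's `-w` write-out variables `%{time_total}` and `%{speed_download}` (reported in seconds and bytes/sec), is to let curl report its own timing and save the body to a file. The function name `build_timed_curl`, the token value, the output file name, and the account in the URL are all placeholders.

```python
def build_timed_curl(token, object_url, output_path):
    """Build a curl argument list that downloads object_url and prints curl's own timing."""
    return [
        "curl", "--silent",
        "-H", "X-Auth-Token: " + token,
        "-o", output_path,  # write the body to a file instead of the notebook output
        "-w", "%{time_total} seconds, %{speed_download} bytes/sec\\n",
        object_url,
    ]


# Placeholder token and account; substitute the values returned by the auth call.
cmd = build_timed_curl(
    "TOKEN",
    "https://dal05.objectstorage.softlayer.net/v1/AUTH_example/week7/googlebooks.csv",
    "googlebooks_download.csv",
)
print(" ".join(cmd))
# To run it: import subprocess; subprocess.call(cmd)
```

Because curl measures only the transfer itself, the figure excludes the time the notebook spends starting the subprocess.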


In [None]:
# delete an object

! curl -X DELETE -H "X-Auth-Token: " \
https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7/random.hdf5


## Part 4 - Answers to Questions

Here are the object files:

* https://dal05.objectstorage.softlayer.net/v1/AUTH_a26db204-4ebf-4d13-b9f6-2e1e9ec4209f/week7/googlebooks.csv

![Screen Shot](ScreenShot2.png)

* **What is the average READ speed in Mb/sec?**

* The googlebooks.csv download averaged approximately 8 MB/sec (about 200 seconds).


* **What is the average WRITE speed in Mb/sec?**

* The 4 GB file averaged about 11 MB/sec and took approximately 340 seconds in total.

* I also uploaded a googlebooks.csv file (1.56 GB), which averaged about 11 MB/sec (143 seconds).


* **Can you account for the discrepancies? Consider all of the possible reasons and explain.**

I am not 100% sure, but uploads may be optimized at the Swift layer for writing to the infrastructure.


* **What happens to these speeds if you run two threads in parallel?**

The Python scripts google3up.py and google4up.py were executed in parallel from two separate command windows. The first file finished in 268 seconds (5.93 MB/sec) and the second in 284 seconds (5.61 MB/sec). It looks like bandwidth was the constraint: run in parallel, each upload took nearly twice as long as the 143-second single upload.

![Screen Shot](ScreenShot.png)
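
Adding the two parallel rates together gives a quick check on the bandwidth explanation. The arithmetic below uses only the figures reported above, with the single-upload rate rounded to 11.15 MB/sec.

```python
single_upload = 11.15            # MB/sec, single googlebooks upload (143 seconds)
parallel_uploads = [5.93, 5.61]  # MB/sec, the two simultaneous uploads

aggregate = sum(parallel_uploads)
print("Aggregate parallel throughput = %.2f MB/sec" % aggregate)
print("Ratio to single upload = %.2f" % (aggregate / single_upload))
```

The combined rate is close to the single-upload rate, which is consistent with the two transfers sharing one fixed-capacity link rather than each getting its own bandwidth.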
