Skip to content

Conversation

@mlilius
Copy link
Contributor

@mlilius mlilius commented Oct 3, 2014

When doing a sync the second file and all the others after that will have the prefix doubled up. I changed it to use a separate variable while in the loop.

@sivel
Copy link
Contributor

sivel commented Oct 3, 2014

@mlilius While I haven't look in depth into this pull request and what it is solving. There are a few things that need to happen with this PR.

  1. Please squash all commits
  2. Ensure that unit tests are not broken. Currently unit tests are failing due to UnboundLocalError: local variable '_prefix' referenced before assignment

Additionally, please provide a working example that allows us to easily replicate the issue.

Thanks!

@mlilius mlilius force-pushed the fix-sync-prefix-bug branch from 9e20587 to 66d5e3d Compare October 3, 2014 18:06
@coveralls
Copy link

Coverage Status

Coverage increased (+0.0%) when pulling 9e20587 on mlilius:fix-sync-prefix-bug into 4ff2d72 on rackspace:working.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.0%) when pulling 66d5e3d on mlilius:fix-sync-prefix-bug into 91120d9 on rackspace:working.

…second+ files to have duplicated prefixes added to the objects fullname_with_prefix.

The merged prefix was put together backwards. The merged prefix needs to be object_prefix+prefix.
@mlilius mlilius force-pushed the fix-sync-prefix-bug branch from 66d5e3d to cd79794 Compare October 3, 2014 18:35
@mlilius
Copy link
Contributor Author

mlilius commented Oct 3, 2014

Here are the steps to test the before and after. Just replace USERNAME, API_KEY, and THECONTAINER

cat << EOF > rackspace_creds
[rackspace_cloud]
username = USERNAME
api_key = API_KEY
EOF


cat << EOF > cf_sync.py
#!/usr/bin/python
import pyrax
import argparse


# Parse arguments
parser = argparse.ArgumentParser()
parser.add_argument('container', type=str)
parser.add_argument('path', type=str)
parser.add_argument('folder', type=str)
parser.add_argument('delete', type=str)
parser.add_argument('region', type=str)
parser.add_argument('credentials', type=str)
args = parser.parse_args()

# Settings
container = args.container
path = args.path
folder = args.folder
delete = args.delete
region = args.region
credentials = args.credentials

# Authentication
pyrax.encoding = "utf-8"
pyrax.set_setting("identity_type", "rackspace")
pyrax.set_setting("region", region)
pyrax.set_credential_file(credentials)

# Initiate the Cloud Files connection
cf_ord = pyrax.connect_to_cloudfiles(public=True)

# Upload the entire folder to the cloud files container.
print("Syncing all objects to %s/%s container from folder %s." % (container, path, folder))
cf_ord.sync_folder_to_container(folder, container, delete=delete, include_hidden=False, ignore_timestamps=True, object_prefix=path, verbose=True)
print "Sync complete."
EOF

chmod +x cf_sync.py

mkdir -p data_to_upload/levelone/leveltwo/levelthree/levelfour/
touch data_to_upload/levelone/onefile
touch data_to_upload/levelone/twofile
touch data_to_upload/levelone/threefile
touch data_to_upload/levelone/leveltwo/fileone
touch data_to_upload/levelone/leveltwo/filetwo
touch data_to_upload/levelone/leveltwo/filethree
touch data_to_upload/levelone/leveltwo/levelthree/levelfour/deepfile

./cf_sync.py THECONTAINER object/prefix/path data_to_upload/ True ORD rackspace_creds

Results before the patch

$ ./cf_sync.py THECONTAINER object/prefix/path data_to_upload/ True ORD rackspace_creds
Syncing all objects to thecontainer/object/prefix/path container from folder data_to_upload/.
Loading remote object list (prefix=object/prefix/path)
levelone/object/prefix/path/threefile UPLOADED
levelone/object/prefix/path/object/prefix/path/onefile UPLOADED
levelone/object/prefix/path/object/prefix/path/leveltwo/levelthree/levelfour/object/prefix/path/deepfile UPLOADED
levelone/object/prefix/path/object/prefix/path/leveltwo/object/prefix/path/fileone UPLOADED
levelone/object/prefix/path/object/prefix/path/leveltwo/object/prefix/path/object/prefix/path/filetwo UPLOADED
levelone/object/prefix/path/object/prefix/path/leveltwo/object/prefix/path/object/prefix/path/object/prefix/path/filethree UPLOADED
levelone/object/prefix/path/object/prefix/path/object/prefix/path/twofile UPLOADED
Folder sync completed at Fri Oct  3 19:10:16 2014
  Total files processed: 0
  Number Uploaded: 7
  Number Ignored: 0
  Number Skipped (older): 0
  Number Skipped (dupe): 0
  Number Deleted: 6
  Number Failed: 0
Sync complete.

Results after the patch

$ ./cf_sync.py THECONTAINER object/prefix/path data_to_upload/ True ORD rackspace_creds
Syncing all objects to thecontainer/object/prefix/path container from folder data_to_upload/.
Loading remote object list (prefix=object/prefix/path)
object/prefix/path/levelone/threefile UPLOADED
object/prefix/path/levelone/onefile UPLOADED
object/prefix/path/levelone/leveltwo/levelthree/levelfour/deepfile UPLOADED
object/prefix/path/levelone/leveltwo/fileone UPLOADED
object/prefix/path/levelone/leveltwo/filetwo UPLOADED
object/prefix/path/levelone/leveltwo/filethree UPLOADED
object/prefix/path/levelone/twofile UPLOADED
Folder sync completed at Fri Oct  3 19:10:50 2014
  Total files processed: 0
  Number Uploaded: 6
  Number Ignored: 0
  Number Skipped (older): 0
  Number Skipped (dupe): 1
  Number Deleted: 0
  Number Failed: 0
Sync complete.

@sivel
Copy link
Contributor

sivel commented Oct 4, 2014

@mlilius I've done a little testing, and with the change in this PR, the if object_prefix: statement is hit a lot more than needed.

I've tested a fix which prevents continually resetting the prefix that is used with fullname_with_prefix, and only hits the code inside of that if statement 1 time with your test case. I've uploaded the full contents of pyrax via this method, and compared the fullname_with_prefix to a list of files in pyrax manually manipulated to include a standard prefix and have validated that there are no differences.

Would you mind testing the following code?

diff --git a/pyrax/object_storage.py b/pyrax/object_storage.py
index 5565600..9e6a55d 100644
--- a/pyrax/object_storage.py
+++ b/pyrax/object_storage.py
@@ -3087,7 +3085,7 @@ class StorageClient(BaseClient):
             if os.path.isdir(pth):
                 subprefix = fname
                 if prefix:
-                    subprefix = "%s/%s" % (prefix, subprefix)
+                    subprefix = os.path.join(prefix, subprefix)
                 self._sync_folder_to_container(pth, container, prefix=subprefix,
                         delete=delete, include_hidden=include_hidden,
                         ignore=ignore, ignore_timestamps=ignore_timestamps,
@@ -3097,7 +3095,8 @@ class StorageClient(BaseClient):
                     fname))
             local_etag = utils.get_checksum(pth)
             if object_prefix:
-                prefix = os.path.join(prefix, object_prefix)
+                prefix = os.path.join(object_prefix, prefix)
+                object_prefix = ""
             fullname_with_prefix = os.path.join(prefix, fname)
             try:
                 obj = self._remote_files[fullname_with_prefix]

@sivel
Copy link
Contributor

sivel commented Nov 14, 2014

Closing in favor of #498

@sivel sivel closed this Nov 14, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants