updated and verified code to change result message
There is now a new type of result: duplicate files.
These are files that already exist in the destination
folder; they are skipped unless importing duplicates is enabled.

I have thoroughly verified that my changes do not cause
any other part of the code to break or behave
unexpectedly. I have also removed some unreachable code:
elodie/localstorage.py: checksum() could never return None;
the trailing return was unreachable because of the earlier return.
Because of this:
elodie/filesystem.py: process_checksum() did not need
to check whether the checksum was None after calculating it.
This is what allows my new code to use None
as a flag meaning that the file being imported was skipped
because it is a duplicate.
D3Zyre committed Oct 13, 2023
1 parent a620e70 commit 651c373
Showing 3 changed files with 19 additions and 5 deletions.
3 changes: 0 additions & 3 deletions elodie/filesystem.py
@@ -487,9 +487,6 @@ def parse_mask_for_location(self, mask, location_parts, place_name):
     def process_checksum(self, _file, allow_duplicate):
         db = Db()
         checksum = db.checksum(_file)
-        if(checksum is None):

D3Zyre (Author, Owner) commented on Oct 13, 2023:
checksum, obtained from Db.checksum(), could never be None, so there is no need to check for it.

-            log.info('Could not get checksum for %s.' % _file)
-            return None

         # If duplicates are not allowed then we check if we've seen this file
         # before via checksum. We also check that the file exists at the
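The commit message describes None as the flag for "duplicate, skipped". A minimal sketch of that contract follows, under stated assumptions: `file_checksum`, the `seen` dict, and this standalone `process_checksum` signature are hypothetical stand-ins, not the code in elodie/filesystem.py, and SHA-256 is assumed because the diff does not show which hash is used. The check that the earlier copy still exists is an interpretation of the truncated comment above.

```python
import hashlib
import os


def file_checksum(path):
    """Hash the whole file at once; fine for a small illustration."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def process_checksum(path, seen, allow_duplicate):
    """Return the file's checksum, or None if it is a duplicate to skip.

    `seen` is a hypothetical dict mapping checksum -> path of a previously
    imported file; it stands in for whatever lookup the real code uses.
    """
    checksum = file_checksum(path)
    known_path = seen.get(checksum)
    # Assumption: a match only counts as a duplicate if the earlier copy
    # still exists on disk.
    if (not allow_duplicate
            and known_path is not None
            and os.path.isfile(known_path)):
        return None
    return checksum
```

Because the checksum itself is always a non-empty string in this sketch, a None return can only mean "duplicate", which is the property the commit relies on.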
1 change: 0 additions & 1 deletion elodie/localstorage.py
@@ -127,7 +127,6 @@ def checksum(self, file_path, blocksize=65536):
                 hasher.update(buf)
                 buf = f.read(blocksize)
             return hasher.hexdigest()
-        return None

D3Zyre (Author, Owner) commented on Oct 13, 2023:
This line was unreachable, hence the deletion for clarity.


     def get_hash(self, key):
         """Get the hash value for a given key.
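To illustrate why the removed line could not run, here is a hedged, standalone sketch of a block-wise file checksum in the same shape as the hunk above. It is a free function rather than the method in elodie/localstorage.py, and the use of SHA-256 is an assumption, since the diff does not show how `hasher` is created.

```python
import hashlib


def checksum(file_path, blocksize=65536):
    """Return the hex digest of a file, read in fixed-size blocks."""
    hasher = hashlib.sha256()  # assumption: the hash type is not in the diff
    with open(file_path, 'rb') as f:
        buf = f.read(blocksize)
        while len(buf) > 0:
            hasher.update(buf)
            buf = f.read(blocksize)
        # Every normal path through the with-block ends at this return,
        # and a file object does not suppress exceptions on exit, so a
        # `return None` placed after the block would never execute.
        return hasher.hexdigest()
```

The function either returns a digest string or raises (for example if the file cannot be opened), so callers never see None from it.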
20 changes: 19 additions & 1 deletion elodie/result.py
@@ -8,17 +8,24 @@ def __init__(self):
         self.success = 0
         self.error = 0
         self.error_items = []
+        self.duplicate = 0
+        self.duplicate_items = []

     def append(self, row):
         id, status = row

-        if status:
+        # status can only be True, False, or None
+        if status is True:
             self.success += 1
+        elif status is None:  # status is only ever None if file checksum matched an existing file checksum and is therefore a duplicate file

D3Zyre (Author, Owner) commented on Oct 13, 2023:
status is only ever None if the file being imported was skipped because it already exists in the destination folder.

+            self.duplicate += 1
+            self.duplicate_items.append(id)
         else:
             self.error += 1
             self.error_items.append(id)

     def write(self):
+        print("\n")

D3Zyre (Author, Owner) commented on Oct 13, 2023:
Just to make the output look a bit nicer.

         if self.error > 0:
             error_headers = ["File"]
             error_result = []
@@ -29,10 +36,21 @@ def write(self):
             print(tabulate(error_result, headers=error_headers))
             print("\n")

+        if self.duplicate > 0:

D3Zyre (Author, Owner) commented on Oct 13, 2023:
This section is essentially a copy-paste of the error section above, with "error" changed to "duplicate".

+            duplicate_headers = ["File"]
+            duplicate_result = []
+            for id in self.duplicate_items:
+                duplicate_result.append([id])
+
+            print("****** DUPLICATE (NOT IMPORTED) DETAILS ******")
+            print(tabulate(duplicate_result, headers=duplicate_headers))
+            print("\n")
+
         headers = ["Metric", "Count"]
         result = [
             ["Success", self.success],
             ["Error", self.error],
+            ["Duplicate, not imported", self.duplicate]

D3Zyre (Author, Owner) commented on Oct 13, 2023:
Although this label doesn't fully match the formatting of the other two rows, it clarifies to the user that files counted as duplicates here are not being imported (if the user had set the option to import duplicates anyway, the file would count as a success instead).

         ]

         print("****** SUMMARY ******")
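The three-way tally in this diff can be exercised in isolation. The sketch below is a simplified stand-in for the class in elodie/result.py, not a copy of it: the `summary()` method is hypothetical (the real `write()` prints via tabulate), and `file_id` replaces `id` only to avoid shadowing the builtin.

```python
class Result(object):
    """Tally import outcomes as success, error, or duplicate."""

    def __init__(self):
        self.success = 0
        self.error = 0
        self.error_items = []
        self.duplicate = 0
        self.duplicate_items = []

    def append(self, row):
        file_id, status = row
        # Identity checks keep the three states apart: a plain
        # `if status:` would send None down the error branch.
        if status is True:
            self.success += 1
        elif status is None:
            self.duplicate += 1
            self.duplicate_items.append(file_id)
        else:
            self.error += 1
            self.error_items.append(file_id)

    def summary(self):
        """Hypothetical helper returning the rows of the summary table."""
        return [
            ["Success", self.success],
            ["Error", self.error],
            ["Duplicate, not imported", self.duplicate],
        ]


if __name__ == "__main__":
    result = Result()
    for row in [("a.jpg", True), ("b.jpg", False), ("c.jpg", None)]:
        result.append(row)
    print(result.summary())
```

One consequence of `is True` worth keeping in mind: it matches only the boolean True, so a truthy non-boolean status (for example a path string) would be counted as an error. That is consistent with the diff's own comment that status is only ever True, False, or None, but it depends on every caller honouring that.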
