Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checksums: use blocks to read the files #836

Merged
merged 12 commits into from Feb 8, 2014

Conversation

wpoely86
Copy link
Member

@wpoely86 wpoely86 commented Feb 5, 2014

We need to read a file in blocks to calculate the checksum because else, the memory blows up if you try to install icc.

This PR fixes it for sha1 and md5 but not yet for addler32 or crc32. I will try to fix them when I find the time.

When calculating the checksums of a large file, we need to read it into
blocks (currently 16MB) to keep memory usage acceptable.
maxHitCount,
stdoutErr[-500:]
))
maxHitCount,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be nicer if it was just on the next line...(starting at the (")

@@ -1083,7 +1100,7 @@ def mkdir(directory, parents=False):
_log.debug("Directory %s already exitst" % directory)
else:
_log.error("Failed to create directory %s: %s" % (directory, err))
else:#not parrents
else: # not parrents
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"parents", while you are cleaning up stuff.

@itkovian
Copy link
Contributor

itkovian commented Feb 5, 2014

Looks OK to me.

@@ -119,12 +120,12 @@
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

split off the hash functions from the dicts/lambda's. too much duplicated code

try:
    import hashlib
    md5_func=hashlib.md5
    sha1_func=hashlib.sha1
except ImportError:
    import md5, sha
    md5_func=md5.md5
    sha1_func=sha.sha
CHECKSUM_FUNCTIONS['md5'] = lambda p: calc_block_checksum(p, md5_func())
CHECKSUM_FUNCTIONS['sha1'] = lambda p: calc_block_checksum(p, sha1_func())

CHECKSUM_FUNCTIONS['sha1'] = lambda p: calc_block_checksum(p, sha1_func())


class zlib_checksum:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class names are camelcase

md5_func = md5.md5
sha1_func = sha.sha

CHECKSUM_FUNCTIONS['md5'] = lambda p: calc_block_checksum(p, md5_func())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the try/except a bit higher, and all this in the dict above. this looks odd now.

@stdweird
Copy link
Contributor

stdweird commented Feb 6, 2014

@wpoely86 nice!
@boegel ready

'adler32': lambda p: '0x%s' % zlib.adler32(open(p, 'r').read()),
'crc32': lambda p: '0x%s' % binascii.crc32(open(p, 'r').read()),
'md5': lambda p: calc_block_checksum(p, md5_func()),
'sha1': lambda p: calc_block_checksum(p, sha1_func()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure you need to call md5_func and sha1_func here? It seems like you just want to pass them along.
(did this pass the untit tests?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call them to have a md5 or sha1 object to work with. It has passed the unit tests (the last one failed because the disk was full, @boegel was going to fix it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so better to call the md5_class and sha1_class then? (and i wasn't going to make any more remarks...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, the would be more correct.

@boegel
Copy link
Member

boegel commented Feb 8, 2014

Thanks @wpoely86, thanks for the reviewing @itkovian, @stdweird, @JensTimmerman!

boegel added a commit that referenced this pull request Feb 8, 2014
Checksums: use blocks to read the files
@boegel boegel merged commit c76599c into easybuilders:develop Feb 8, 2014
@wpoely86 wpoely86 deleted the checksums branch February 9, 2014 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants