Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase checksum noting and matching for imports. #890

Merged
merged 6 commits into from Mar 25, 2013

Conversation

mtbc
Copy link
Member

@mtbc mtbc commented Mar 15, 2013

To test make sure that different import/upload paths still work, including archiving during import. For real files in the server's local filesystem, if they have a corresponding entry in the originalfile table in the DB then make sure that the SHA1 there is set correctly.

Makes progress on http://trac.openmicroscopy.org.uk/ome/ticket/2581

@@ -632,6 +632,7 @@ def update_table(self, columns):
column_report[column.name] = len(column.values)
log.debug("Column report: %r" % column_report)
self.table.addData(columns)
self.table.close()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

causes SHA1 to be written for the table file

@joshmoore
Copy link
Member

Adding exclude label:

/home/hudson/hudson/workspace/OMERO-merge-stable/src/components/blitz/src/ome/formats/OMEROMetadataStoreClient.java:1854: cannot find symbol
symbol  : method copyOfRange(byte[],int,int)
location: class java.util.Arrays
                    hasher.putBytes(rlen == buf.length ? buf : Arrays.copyOfRange(buf, 0, rlen));

@mtbc
Copy link
Member Author

mtbc commented Mar 18, 2013

That's interesting. I'm guessing that Eclipse's 1.5 compliance checking goes as far as the core language but doesn't do anything clever with APIs, then. I'll have to watch out for that. Of course, I'll provide a fix.

@bpindelski
Copy link

@mtbc #897 could be helpful in cases like this.

@joshmoore
Copy link
Member

Label didn't get removed for today. @mtbc, should it for tomorrow? Would you like to wait on changes from @bpindelski?

@bpindelski
Copy link

@joshmoore I think all the changes @mtbc needs are in #897. That will allow for removing the dependancy on ArrayUtils and Arrays. I'm not planning on any other major changes wrt. checksums on dev_4_4 besides #897.

@mtbc
Copy link
Member Author

mtbc commented Mar 19, 2013

Yeah, remove it today and I'll push another commit later that is based on #897 so that the revised version gets built tomorrow morning.

@mtbc
Copy link
Member Author

mtbc commented Mar 19, 2013

Should there be a new exception class for checksum mismatches that has the client and server checksums and file path as properties? (Extending which class?) Or is what's here good enough for now?

@bpindelski
Copy link

@mtbc On develop we currently do https://github.com/openmicroscopy/openmicroscopy/blob/develop/components/blitz/src/ome/services/blitz/repo/ManagedImportProcessI.java#L261. Would be interesting to see how that can be handled on dev_4_4.

@mtbc
Copy link
Member Author

mtbc commented Mar 19, 2013

Further commit will be tomorrow, not today. (I'm not even going to get finished with permissions testing today.)

@mtbc
Copy link
Member Author

mtbc commented Mar 20, 2013

Okay, should be ready for testing again.

@joshmoore
Copy link
Member

Unlabeled.

@joshmoore
Copy link
Member

One further location that could use your inspection:

https://github.com/openmicroscopy/openmicroscopy/blob/develop/components/blitz/src/omero/client.java#L1029

(in fact, any location that calls write() on a RawFileStore probably falls into that category, but that may be beyond the scope at this point)

@mtbc
Copy link
Member Author

mtbc commented Mar 21, 2013

I think the finally still executes despite the return but, yes, I might even try logging the saves and closes to spot where the leaks might be.

@joshmoore
Copy link
Member

Basic archived import's certainly still work fine:

hudson@howe:~/OMERO-CURRENT$ echo "hi" >> pr890.fake
hudson@howe:~/OMERO-CURRENT$ bin/omero -s user-1@localhost import -a pr890.fake
omero=# select id, name, sha1 from originalfile order by id desc limit 11;
  id   |                               name                                |                   sha1
-------+-------------------------------------------------------------------+------------------------------------------
 41451 | pr890.fake                                                        | 55ca6286e3e4f4fba5d0448333fa99fc5a404a73
(1 row)

omero=# \q
hudson@howe:~/OMERO-CURRENT$ shasum pr890.fake
55ca6286e3e4f4fba5d0448333fa99fc5a404a73  pr890.fake

Though I'm still pondering who to make the checksum check fail.

@mtbc
Copy link
Member Author

mtbc commented Mar 21, 2013

If it doesn't unduly invalidate the review, when the server or client hasher is constructed, put an extra byte into it so it'll screw up. (-:

@joshmoore
Copy link
Member

OmeroPy/src/omero/util/populate_metadata.py also does not call close() on the TablePrx.

@joshmoore
Copy link
Member

@mtbc: this PR all looks healthy and happy even if there are various things outside of the PR which are less so. Let me know if you would like to handle those in a separate PR or here.

@mtbc
Copy link
Member Author

mtbc commented Mar 21, 2013

I'd like to fix populate_metadata.py here and RawFileStore leakage separately.

@joshmoore
Copy link
Member

  • Re: an extra byte -- :). Maybe my own hasher subclass that just always return "".
  • populate_metadata.py v. RawFileStore: check, but not I didn't see a rawFileStore.close(). If I'm just overlooking that, then perhaps there's no problem.

@joshmoore
Copy link
Member

Faked a failure. 👍

hudson@howe:~/OMERO-CURRENT/pr890$ export CLASSPATH=`find ../lib/client/ -name "*.jar"  | perl -pe 's/\s+/:/g'`
hudson@howe:~/OMERO-CURRENT/pr890$ JAVA_OPTS="-cp .:$CLASSPATH" ../bin/omero -s user-1@localhost import -a pr890.fake
...
2013-03-21 11:01:28,426 3966       [      main] INFO         ome.formats.importer.ImportLibrary  - Reader is not of HCS domain, use metafile: true
FAKE CHECKSUM
2013-03-21 11:01:28,752 4292       [      main] ERROR     ome.formats.importer.cli.ErrorHandler  - FILE_EXCEPTION: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake
java.io.IOException: file checksum mismatch on upload: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake (client has fake, server has 55ca6286e3e4f4fba5d0448333fa99fc5a404a73)
    at ome.formats.OMEROMetadataStoreClient.writeFilesToFileStore(OMEROMetadataStoreClient.java:1859)
    at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:694)
    at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:298)
    at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:143)
    at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:462)
2013-03-21 11:01:28,824 4364       [      main] ERROR     ome.formats.importer.cli.ErrorHandler  - INTERNAL_EXCEPTION: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake
java.lang.RuntimeException: java.io.IOException: file checksum mismatch on upload: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake (client has fake, server has 55ca6286e3e4f4fba5d0448333fa99fc5a404a73)
    at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:301)
    at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:143)
    at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:462)
Caused by: java.io.IOException: file checksum mismatch on upload: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake (client has fake, server has 55ca6286e3e4f4fba5d0448333fa99fc5a404a73)
    at ome.formats.OMEROMetadataStoreClient.writeFilesToFileStore(OMEROMetadataStoreClient.java:1859)
    at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:694)
    at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:298)
    ... 2 more

java.lang.RuntimeException: java.io.IOException: file checksum mismatch on upload: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake (client has fake, server has 55ca6286e3e4f4fba5d0448333fa99fc5a404a73)
    at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:301)
    at ome.formats.importer.cli.CommandLineImporter.start(CommandLineImporter.java:143)
    at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:462)
Caused by: java.io.IOException: file checksum mismatch on upload: /home/hudson/hudson/workspace/OMERO-merge-stable/src/dist/pr890/pr890.fake (client has fake, server has 55ca6286e3e4f4fba5d0448333fa99fc5a404a73)
    at ome.formats.OMEROMetadataStoreClient.writeFilesToFileStore(OMEROMetadataStoreClient.java:1859)
    at ome.formats.importer.ImportLibrary.importImage(ImportLibrary.java:694)
    at ome.formats.importer.ImportLibrary.importCandidates(ImportLibrary.java:298)
    ... 2 more
2013-03-21 11:01:28,824 4364       [      main] INFO         ome.formats.importer.ImportLibrary  - Exiting on error
hudson@howe:~/OMERO-CURRENT/pr890$ cat ome/util/checksum/SHA1ChecksumProviderImpl.
cat: ome/util/checksum/SHA1ChecksumProviderImpl.: No such file or directory
hudson@howe:~/OMERO-CURRENT/pr890$ cat ome/util/checksum/SHA1ChecksumProviderImpl.java
package ome.util.checksum;
import com.google.common.hash.Hashing;

public final class SHA1ChecksumProviderImpl extends AbstractChecksumProvider {

    public SHA1ChecksumProviderImpl() {
        super(Hashing.sha1());
        System.out.println("FAKE CHECKSUM");
    }

    @Override
    public String checksumAsString() {
        return "fake";
    }

}

@joshmoore
Copy link
Member

We'll likely need to decide on the usability of this at some point, since this leaves garbage on the server that the user may want to delete immediately. /cc @jburel @gusferguson

@mtbc
Copy link
Member Author

mtbc commented Mar 21, 2013

True, though they may also want it left around for a while first so the sysadmin can investigate the corruption.

@gusferguson
Copy link

Will require some form of notification - so an option in that to delete or retain?

@jburel
Copy link
Member

jburel commented Mar 21, 2013

if we have a checksum mismatch, I was planning to add a delete control. I used to have code that will automatically delete the image if an error occurred.
For multi-images file, I discuss with @bpindelski:

  • Green tick if everything okay
  • Green tick with red cross if one file or more did not work
  • Red cross if failure
    For point 2,3 delete option will be available.

@mtbc
Copy link
Member Author

mtbc commented Mar 21, 2013

This seems to work for me, too. The rest of the resource leakage cleanup I'll leave for https://trac.openmicroscopy.org.uk/ome/ticket/10565

@joshmoore
Copy link
Member

@jburel: do you have that ticketed elsewhere?

@jburel
Copy link
Member

jburel commented Mar 22, 2013

No I did not have the time to do it. I will do that today

@jburel
Copy link
Member

jburel commented Mar 24, 2013

@joshmoore
Copy link
Member

Thanks all.

joshmoore added a commit that referenced this pull request Mar 25, 2013
Increase checksum noting and matching for imports.
@joshmoore joshmoore merged commit 44087d0 into ome:dev_4_4 Mar 25, 2013
@mtbc mtbc deleted the trac-2581 branch March 25, 2013 08:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants