-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speedup #30
Speedup #30
Conversation
hopefully every other OS.
available by a switch in MessagePackSerialization.
of BioJava Structure creation.
Passed MessagePackTest.
parser adopted from Jmol are identical.
Thanks. I'm trying to review it, give me some time. As an advice for next time, please don't do code reformatting at the same time that you do important changes. The diffs as they are shown are not very helpful. Having said that, there's a really nice trick to ignore whitespaces in the diff that will make them more readable: add a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like the file pdb_entry_type.txt
shouldn't be here, it's quite a large file. Is it needed?
return new GenericDecoder(ReaderUtils.getDataFromFile(inFile)); | ||
} catch (IOException e) { | ||
e.printStackTrace(); | ||
throw new RuntimeException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not your change, but it'd be nice to fix: don't catch the exception, just declare a "throws IOException" and error handling will be much clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Text files are in gz now.
mmtf-api/pom.xml
Outdated
<parent> | ||
<groupId>org.rcsb</groupId> | ||
<artifactId>mmtf</artifactId> | ||
<version>1.0.5-SNAPSHOT</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to change the version here, SNAPSHOT versions are for development. Versions in pom should only change when a release happens. Same applies to other pom files.
lines.add(line); | ||
} | ||
return lines; | ||
} catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to declare the exception as thrown in the method signature. Error handling will be easier like that
} | ||
return codes; | ||
} catch (Exception ex) { | ||
throw new RuntimeException(ex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just declare the exception as throws
in signature. It will have the same effect and will be clearer
MessagePackSerialization.setJackson(true); | ||
StructureDataInterface sdiJackson = parse(bytes); | ||
|
||
ReflectionAssert.assertReflectionEquals(sdiJackson, sdiJmol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to use this assert method instead of the standard assert from the junit library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assert performs deep comparison, while standard assert just checks boolean.
* creation gives identical data. | ||
* | ||
*/ | ||
public class MessagePackTest extends TestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use junit4. It doesn't require extending TestCase, just adding the @test annotation to the method you want to make into a test
<groupId>org.unitils</groupId> | ||
<artifactId>unitils-core</artifactId> | ||
<version>3.4.2</version> | ||
<scope>test</scope> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really needed? Can't we just use junit4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, for the deep comparison of objects.
mmtfBean = objectMapper.readValue(inputStream, MmtfStructure.class); | ||
} catch (IOException e) { | ||
e.printStackTrace(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception should be declared in signature and not caught
return s; | ||
} catch (Exception e) { | ||
throw new RuntimeException(e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Declare IOException in signature
try { | ||
objectMapper.writeValue(outputStream, mmtfStructure); | ||
} catch (IOException e) { | ||
e.printStackTrace(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't catch the exception, declare it in signature
One concern is the 17.9% drop in test coverage, I guess because of the new message pack decoding code from jmol that has no tests. Do you have some metrics on the performance difference of the new decoder versus the msgpack library? It's a pity to lose a well-tested library in favor of an ad-hoc decoder. Anyway thanks a lot for the patch! looks good in general. If you could address the comments that'd be great. |
RuntimeExceptions by IOException and try with resources.
} | ||
} | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need gzipping rather than zipping here.
This is how you do it with gzipping:
public static void gzipFile(File inFile, File outFile) throws IOException {
GZIPOutputStream zos = new GZIPOutputStream(new FileOutputStream(outFile));
FileInputStream is = new FileInputStream(inFile);
int b;
while ( (b=is.read())!=-1) {
zos.write(b);
}
zos.close();
is.close();
}
And ungzip is:
public static void gunzipFile(File inFile, File outFile) throws IOException {
GZIPInputStream zis = new GZIPInputStream(new FileInputStream(inFile));
FileOutputStream os = new FileOutputStream(outFile);
int b;
while ( (b=zis.read())!=-1) {
os.write(b);
}
zis.close();
os.close();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say that my code is probably faster (buffer). And more importantly, it is easier to prepare the resource, because I do not need to do it with Java. I just do it by file manager or command line, which is far more convenient. Also I would leave it because it does not matter much I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is zip vs gzip, not so much about efficiency. Zip is an archive format (to put a bunch of files together), while gzip is purely a compression format.
You can also do gzip easily from command line with tools like gzip and gunzip, so that's also not a concern.
previous was failing on Windows again. Going to use URL elsewhere too in the next commit.
whenever possible (simpler and more reliable), File left only when necessary for API tests.
Concerning test coverage (I do not see way how to comment directly): |
Thanks for all the changes! the only thing missing is putting back the version to 1.0.4-SNAPSHOT. After that I think we should merge. By the way there's a merge conflict now, could you take care of that too? (just merge into this branch the latest master). The conflict should be minimal, the only file that diverged since you branched is |
* | ||
*/ | ||
|
||
public class SB { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just found this one: could we get rid of it? This looks like a replacement for StringBuilder needed in jmol so that the java to js transpiler is happy. We don't need it here, let's replace it by StringBuilder
* | ||
* @author Antonin Pavelka | ||
*/ | ||
public class ObjectTree { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we have some documentation for this class? or at least change the method names to something more usable, one letter method names without javadocs are going to be difficult to understand....
} | ||
} | ||
|
||
public Hashtable l(String s) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't seem to be used. Do we need it?
* | ||
* @author Antonin Pavelka | ||
*/ | ||
public class MmtfStructureFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some javadocs to explain what this class does? The individual methods are more self-explanatory
…after 12/01/2016). Increased -Xmx for testing from 1500 to 6000 MB because sometimes GC timeout occured in MessagePackTest. Improved comments on the MessagePack parser.
was corrected to reasonable number 10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good now, thank you so much for all the changes. Just one final detail: I'd propose method names for ObjectTree
like getString
, getInteger
, getByteArray
etc. As I said, single letter names make things difficult to understand. Let's merge after you refactor those, it should only take a couple of minutes.
Several changes in decoder, resulting in about 4x speedup of BioJava MMTF parsing. Largest change (but not most significant in terms of time) is replacing Jackson with Jmol version of MessagePack parsing (sources were extracted, no dependency). A test for that was created, comparing objects created by both methods.