Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve WacArcInputFormat.java test coverage #80

Closed
ruebot opened this issue Oct 2, 2017 · 18 comments
Closed

Improve WacArcInputFormat.java test coverage #80

ruebot opened this issue Oct 2, 2017 · 18 comments
Assignees
Labels

Comments

@ruebot
Copy link
Member

ruebot commented Oct 2, 2017

WacArcInputFormat.java test coverage can be improved.

@ruebot ruebot added the tests label Oct 2, 2017
@ruebot ruebot added this to To Do in Test coverage Oct 2, 2017
greebie added a commit that referenced this issue Oct 10, 2017
@ruebot ruebot moved this from To Do to In Review in Test coverage Oct 12, 2017
ruebot pushed a commit that referenced this issue Oct 13, 2017
* Improve coverage for Issue #80

* Minor additions to improve coverage slightly.  Still unable to test for empty .arc.gz.
@ruebot
Copy link
Member Author

ruebot commented Oct 13, 2017

Partially resolved with: 23d0cb4

@ruebot ruebot moved this from In Review to In Progress in Test coverage Oct 13, 2017
@greebie
Copy link
Contributor

greebie commented Dec 5, 2017

It appears to me that this + #78 can be resolved by removing the files, in favor of the Generic class. I defer to @ruebot on this though. I do not see where it is currently being used after 5e99d63

@ruebot
Copy link
Member Author

ruebot commented Dec 6, 2017

[nruest@gorila:aut] (git)-[issue-80]-$ grep -R "WacArcInputFormat" .
./src/main/java/io/archivesunleashed/mapreduce/WacArcInputFormat.java:public class WacArcInputFormat extends FileInputFormat<LongWritable,
./src/test/java/io/archivesunleashed/io/ArcRecordWritableTest.java:import io.archivesunleashed.mapreduce.WacArcInputFormat;
./src/test/java/io/archivesunleashed/io/ArcRecordWritableTest.java:        WacArcInputFormat.class, conf);
./src/test/java/io/archivesunleashed/mapreduce/WacArcInputFormatTest.java:public class WacArcInputFormatTest {
./src/test/java/io/archivesunleashed/mapreduce/WacArcInputFormatTest.java:        ReflectionUtils.newInstance(WacArcInputFormat.class, conf);
Binary file ./.git/index matches
[nruest@gorila:aut] (git)-[issue-80]-$ grep -R "WacWarcInputFormat" .
./src/main/java/io/archivesunleashed/mapreduce/WacWarcInputFormat.java:public class WacWarcInputFormat extends FileInputFormat<LongWritable,
./src/test/java/io/archivesunleashed/io/WarcRecordWritableTest.java:import io.archivesunleashed.mapreduce.WacWarcInputFormat;
./src/test/java/io/archivesunleashed/io/WarcRecordWritableTest.java:        WacWarcInputFormat.class, conf);
./src/test/java/io/archivesunleashed/mapreduce/WacWarcInputFormatTest.java:public class WacWarcInputFormatTest {
./src/test/java/io/archivesunleashed/mapreduce/WacWarcInputFormatTest.java:        ReflectionUtils.newInstance(WacWarcInputFormat.class, conf);
Binary file ./.git/index matches

That's what I got. If you want to proceed with the refactoring, go for it.

@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

Looks like they can go. I'll check these today and see what blows up. I'll also change the "generic" formats to just plain archives.

@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

The only risk here is that, if all works out, we would remove .arc and .warc specific loading functions from the code without a deprecation warning. This is already technically the case with the changes to RecordLoader in 0.11 anyway, but still. Just a head's up @lintool @ianmilligan1 !

@ruebot
Copy link
Member Author

ruebot commented Dec 6, 2017

We don't have a 1.0.0 release, so we don't have to do it. But, I think a deprecation cycle would be helpful. Is this something you want to take on?

@greebie greebie self-assigned this Dec 6, 2017
@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

Sure thing! I meant to assign this to myself earlier.

Draft plan as follows:

  1. Create a branch and test changes removing references to LoadArcRecord and LoadWarcRecord, and associated interfaces and classes.
  2. Change "Generic{x}" to {x}
  3. Establish whether deprecation cycle is necessary -- (is anyone using these now that we have LoadArchive -- I don't believe they are even in the documentation anymore)?
  4. Create PR when ready.

@ianmilligan1
Copy link
Member

I go back and forth on the deprecation cycle. My sense is that if a user if engaged enough to see a deprecation cycle – i.e. they're checking out a new release – they've noticed the change in all our docs, etc. that we've made since February 2016 when this new functionality came in.

@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

I think I agree here. It would be good to establish deprecation procedures generally in future, but we may be able to get away with no deprecation in this particular case. For example, do you want to deprecate WriteGDF in future?

@ruebot
Copy link
Member Author

ruebot commented Dec 6, 2017

In Islandora and Fedora, we would do one release cycle. So, we'd identify something that need to be deprecated, and wrap it in a deprecation warning. Then release, and it would be in the release notes. Then after that we'd remove it.

@ruebot
Copy link
Member Author

ruebot commented Dec 6, 2017

@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

Okay - so two alternatives.

  1. Remove from 1.0, no deprecation warnings, but include notice in release notes.
  2. Keep in branch for now, add deprecation warnings to 1.0 including warning of removal in next release, create PR and merge for 1.1.

There would also be a number of "Generic_._" items deprecated, none of them for front-end usage. If we kept deep documentation on the library I would see the deprecation warnings as more important, but I'm good with either option.

If we go with option #2, I will take responsibility for adding the deprecation comments to the code for 1.0. The only two user-relevant deprecations (imho) are RecordLoader.loadArcRecord() and RecordLoader.loadWarcRecord()

@ianmilligan1
Copy link
Member

OK thanks all – my own vote is to go with option 2 then, it'd be good to get in the habit of doing this.

And yes, my gut says maybe we should deprecate WriteGDF too. Although you're the networks scholar.. can you think of any cases where somebody might want a GDF instead of a GEXF?

@ruebot
Copy link
Member Author

ruebot commented Dec 6, 2017

What's 1.0 and 1.1? Our current release is 0.11.0, and we use semantic versioning.

@greebie
Copy link
Contributor

greebie commented Dec 6, 2017

It appears that 5e99d63 already removed the associated commands. I also take note the comment by @lintool here as related to java files: #102 (comment) .

I pushed a new branch "LoaderRefactor." It has been tested and works for a minimum of 4 scripts in the documentation.

@ruebot
Copy link
Member Author

ruebot commented Dec 7, 2017

Partially resolved with: 47b5d26

@greebie
Copy link
Contributor

greebie commented Jan 11, 2018

Code deprecated and will be removed in a later release. Closing.

@greebie greebie closed this as completed Jan 11, 2018
@ruebot ruebot moved this from In Progress to Done in Test coverage Jan 11, 2018
@ruebot
Copy link
Member Author

ruebot commented Jan 11, 2018

Fully resolved with: 8fb766e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Development

No branches or pull requests

3 participants