Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor/graph writers #375

Merged
merged 2 commits into from
Jul 26, 2019
Merged

Conversation

angriman
Copy link
Member

@angriman angriman commented Jul 9, 2019

All classes that write graphs inherit from GraphWriter. Exposed NetworkitBinaryReader/Writer to Python, together with the GraphWriter hierarchy.

Copy link
Contributor

@manpen manpen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR really simplifies code. Awesome. Most of my comments are "while-you're-at-it" style (that's what you get for touching old code ;) ), so feel free to disregard. But I think const write-methods should be considered.

networkit/_NetworKit.pyx Outdated Show resolved Hide resolved
include/networkit/io/DotGraphWriter.hpp Outdated Show resolved Hide resolved
* @param[in] graph The graph object
* @param[in] path The file path to be written to
*/
virtual void write(Graph& graph, std::string path) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame we are loosing the constness here. I just (quickly) went through all writers and it seems to me that only NetworKitBinaryWriter (which needlessly write to its nodes attribute) and GraphToolBinaryWriter (which needlessly has non-const methods which can be marked const) have use non-const writers.

If you've time, please make the GraphWriter::write const; otherwise we can do it in another PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, as you wrote those classes could easily be adapted to implement a const write method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree here. Why write() is essentially the GraphWriters run method. If you want to write in parallel, construct two writers. Making the virtual method const heavily constraints what is efficiently possible in a GraphWriter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To elaborate: The base class for GraphWriters should absolutely remain non-const. I don't see what advantage const offers for writers.

More broadly, I do not think that we should support parallel usage of any algorithm (or I/O interface) in NetworKit unless there is a good reason to do so. const is an unnecessary pessimization here: it forces implementers to maintain local copies of algorithm state instead of storing the state in the algorithm's object. Often, this is much less efficient, especially if an algorithm's object is reused. Reuse of algorithm objects is arguably much more common than parallel use of the same algorithm's object.

Also note that const does affect the API: turning a const method to non-const is a breaking change. We should only apply const to functions that will not write to the algorithm's state, forever, regardless of implementation strategy. I am not sure if any of the GraphWriters should have const write() methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I see your point and are kind of convinced that the base class should remain non-const, I still have two points:

  • Currently there ARE writers that are const and the breaking of API you mentioned would occur in this PR
  • AFAICS we currently have no examples where the state of the write should change (I do not think that nodes should be a member of NetworKitBinaryWriter but rather a local variable; it's only used in write. In GraphToolBinaryWriter no state is changed)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that those count variables should be local. However, there might be writers that have a legitimate reason to construct large data structures (e.g., std::vectors or std::unordered_maps to remap node IDs?) in memory. Those structures should probably be stored in the object to avoid reallocating them whenever write() is called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree that for future writers it might be an issue, and -- given that -- I agree that the base class should remain non-const. I'm just saying that if we do this, we should make a conscious decision to break the interface (also as you already pointed out with probably no serious implications).

Copy link
Member Author

@angriman angriman Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the base class write is non-const. Are there any other issues or we can merge this?

networkit/_NetworKit.pyx Show resolved Hide resolved
include/networkit/io/DotGraphWriter.hpp Outdated Show resolved Hide resolved
include/networkit/io/SNAPGraphWriter.hpp Outdated Show resolved Hide resolved
include/networkit/io/EdgeListWriter.hpp Outdated Show resolved Hide resolved
networkit/_NetworKit.pyx Show resolved Hide resolved
@angriman angriman force-pushed the refactor/graph-writers branch 3 times, most recently from 5c46d71 to c729c10 Compare July 13, 2019 12:34
Copy link
Contributor

@manpen manpen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes and sorry for the fuzz about the const-ness.
I only spotted two minor issues. Everything else looks fine to me :)


virtual void write(const Graph& G, bool weighted, std::string path);
virtual void write(const Graph &G, bool weighted, const std::string &path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not have to be virtual -- the writer is declared final

#include <fstream>
#include <iostream>
#include <string>

#include <networkit/io/GraphReader.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for bringing this up again: I believe #include <networkit/io/GraphReader.hpp> is not required.
Also while fstream was not required in the header, the implemenation needs it. So please include it in the .cpp file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed fstream because it is already included in GraphWriter.hpp. However, GraphWriter.hpp does not need fstream, only its derived classes use it. Should I move fstream to the derived classes and remove it from GraphWriter.hpp?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed that. Sorry. Never mind then. While I try to not rely on indirect includes, it's quite common in networkit, so I guess it's fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there is no valid reason for including fstream in GraphWriter.hpp, and I agree that we should not rely on indirect includes. I'll move fstream to the derived classes.

manpen
manpen previously approved these changes Jul 17, 2019
@manpen
Copy link
Contributor

manpen commented Jul 17, 2019

Thank you :)

@manpen
Copy link
Contributor

manpen commented Jul 22, 2019

I just wanted to leave a big thank you to @cndolo, @angriman, and @avdgrinten. I used this PR (based on #325) to cache my rather large ASCII-based inputs with NKB. In one case, reading the file dropped from ~5min to less than 10s; that's really cool!

@avdgrinten
Copy link
Contributor

That nice to hear @manpen! Indeed, fast caching is one of the main motivations of the NKBG format.

Unfortunately, #380 bumps the version number of the format once more, without adding compatibility fixes for the current version (as I didn't expect that there were any users already). Is that a problem for your use case? If yes, we should add those compatibility fixes (those fixes should only consist of a few lines anyway - basically we would need to add code to ignore the last two fields of the file header).

My general plan is provide backwards-compatibility once the format appears in a released version of NetworKit.

@manpen
Copy link
Contributor

manpen commented Jul 22, 2019

For me personally that's not an issue and I would even vote to not include backward compatibility just yet in favour of an easier code -- after all, it the Dev branch and thinks are bound to break ;)

@avdgrinten
Copy link
Contributor

Great, then our compatibility guarantee for the NKBG format is: once it appears in a release, we provide backwards compatibility indefinitely.

@avdgrinten
Copy link
Contributor

@angriman This PR has conflicts against Dev.

Signed-off-by: Eugenio Angriman <eugenio.angriman@informatik.hu-berlin.de>
…m GraphWriter

Signed-off-by: Eugenio Angriman <eugenio.angriman@informatik.hu-berlin.de>
@angriman
Copy link
Member Author

@avdgrinten conflicts have been fixed.

@avdgrinten avdgrinten merged commit 228f2f0 into networkit:Dev Jul 26, 2019
@angriman angriman deleted the refactor/graph-writers branch February 1, 2020 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants