Coappearances graph as exhaustive example. #21
The goal of this example is to show how to convert a nested data format (here: JSON) into a flat representation in flatdata. Further, it should introduce and use all available data structures in flatdata.
examples/coappearances.cpp
Outdated
{
// Since flatdata's mutators are not holding any data, we are creating a vector with a single
// element for holding the data.
flatdata::Vector< co::Meta > data( 1 );
Is there a better way to do it right now? If not, we should think about a better approach. I find this very counterintuitive.
One approach could be to introduce a `start_meta` method on the builder, which would hide the single-element vector from the user. A `close` method would then do what `set_meta` is doing now.
We always wanted to add a `flatdata::Struct< T >` or `flatdata::Object< T >`, but there was no time (aka good reason) to. Could do that, if you like (:
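For illustration, here is a minimal sketch of what such an owning wrapper could look like. Everything in it is hypothetical: `MetaMutator`, `OwnedStruct`, the two-byte layout, and the member names are invented for this comment and are not flatdata's actual API. The only point is that the wrapper owns the bytes and hands out a non-owning mutator on demand, so the single-element vector is no longer needed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning mutator: it only points at bytes stored elsewhere.
class MetaMutator
{
public:
    static constexpr std::size_t SIZE_IN_BYTES = 2;

    explicit MetaMutator( std::uint8_t* data )
        : m_data( data )
    {
        assert( data != nullptr );
    }

    void set_version_major( std::uint8_t value ) { m_data[ 0 ] = value; }
    void set_version_minor( std::uint8_t value ) { m_data[ 1 ] = value; }

private:
    std::uint8_t* m_data;
};

// Hypothetical owner of the memory behind exactly one structure.
template < typename Mutator >
class OwnedStruct
{
public:
    OwnedStruct( )
        : m_storage{ }  // zero-initialize all bytes
    {
    }

    // Creates a fresh mutator pointing into the owned storage.
    Mutator operator*( ) { return Mutator( m_storage.data( ) ); }

    const std::uint8_t* data( ) const { return m_storage.data( ); }
    std::size_t size( ) const { return m_storage.size( ); }

private:
    std::array< std::uint8_t, Mutator::SIZE_IN_BYTES > m_storage;
};
```

Usage would then read `OwnedStruct< MetaMutator > meta; ( *meta ).set_version_major( 1 );`, with the owner passed on to whatever stores the structure.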
examples/coappearances.cpp
Outdated
auto character = vertices.grow( );
character.name_ref = strings.size( );
strings += data.at( "name" ).get< std::string >( ) + '\0';
This is pretty straightforward, but still too error-prone. It is very easy to forget to add `+ '\0'`, and then the reader would read complete garbage (happened to me several times). What do you think about adding a simple helper class? The interface could be similar to `flatdata::Vector`: `grow` and `close`.
+1. Something like a "StringList" or "StringListBuilder". Raw `raw_data` would then be left for other uses.
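A rough sketch of such a helper, to make the idea concrete. The class name, `add`, and `at` are assumptions made up for this comment (the suggestion above was an interface closer to `grow` and `close`); nothing here exists in flatdata. The helper appends the terminating `'\0'` itself and returns the offset to store in a field like `name_ref`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of flatdata.
class StringListBuilder
{
public:
    // Appends `value` followed by '\0' and returns the offset at which the
    // string starts inside the raw block.
    std::size_t add( const std::string& value )
    {
        const std::size_t offset = m_data.size( );
        m_data += value;
        m_data += '\0';  // the terminator can no longer be forgotten
        return offset;
    }

    // Reads back the zero-terminated string starting at `offset`.
    const char* at( std::size_t offset ) const
    {
        assert( offset < m_data.size( ) );
        return m_data.data( ) + offset;
    }

    // The raw block, ready to be written as a raw data resource.
    const std::string& raw_data( ) const { return m_data; }

private:
    std::string m_data;
};
```

With this, the two lines in the diff would collapse to something like `character.name_ref = strings.add( name );`.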
examples/coappearances.cpp
Outdated
}
}

vertices_data.next_item( );
This hit me several times already. With a vector, I first need to call `grow` and then start adding data. With the multivector it is the other way around: first add the data (to the bucket), then call `next_item`. What do you think about unifying the behavior?
We could even go further and unify the interface by replacing `next_item` with `grow`.
Solved in #25.
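To illustrate the unified ordering discussed above (first `grow`, then add data), here is a toy container. It is only a sketch under assumptions: `ToyMultiVector`, `add`, and `item_size` are invented names, and the actual interface introduced in #25 may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy multivector: each item is a variable-length bucket of values.
template < typename T >
class ToyMultiVector
{
public:
    // Starts a new, initially empty item. Same order as with a vector:
    // grow first, then fill.
    void grow( ) { m_offsets.push_back( m_data.size( ) ); }

    // Appends a value to the item started by the most recent grow().
    void add( const T& value )
    {
        assert( !m_offsets.empty( ) );
        m_data.push_back( value );
    }

    // Number of items.
    std::size_t size( ) const { return m_offsets.size( ); }

    // Number of values stored in the item at `index`.
    std::size_t item_size( std::size_t index ) const
    {
        assert( index < m_offsets.size( ) );
        const std::size_t begin = m_offsets[ index ];
        const std::size_t end
            = index + 1 < m_offsets.size( ) ? m_offsets[ index + 1 ] : m_data.size( );
        return end - begin;
    }

private:
    std::vector< T > m_data;
    std::vector< std::size_t > m_offsets;
};
```

An item without data is then just a `grow( )` with no following `add`, which keeps the item index aligned with the vertex index.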
Coappearances
-------------

This examples converts a graph of coappearances from json to flatdata. A graph of coappearances is
I think this is a wonderful example, finally ٩(◕‿◕。)۶
examples/coappearances.flatdata
Outdated
minor: u8 : 7;
}

// @bound_implicitly( characters: vertices, vertices_data )
We never thought of that... "Bound Implicitly" meant binding two vectors together, as a columnar way to store a single structure.
Was done in #18.
examples/coappearances.flatdata
Outdated
meta : Meta;

@explicit_reference( Character.name_ref, strings )
vertices : vector< Character >;
Why not `characters`? :)
I am describing a graph consisting of vertices and edges. This is also a nice way to say vertices are Characters by using types.
examples/coappearances.cpp
Outdated
int
main( int argc, char const* argv[] )
{
if ( argc < 3 )
It might make sense to use `gflags` or a similar library to make this leaner. As an example, this can afford an extra dependency, IMO.
examples/coappearances.cpp
Outdated
return 1;
}
}
catch ( std::runtime_error err )
When catching an exception, catch by `const&`. https://stackoverflow.com/questions/2145147/why-catch-an-exception-as-reference-to-const
Good catch.
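A small standalone illustration of why this matters. `ParseError` is a made-up exception type for this sketch, not something from the example. Catching by value copies only the `std::runtime_error` part of the thrown object (slicing), so an overridden `what( )` is lost; catching by `const&` keeps the dynamic type.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical derived exception with its own what().
struct ParseError : std::runtime_error
{
    using std::runtime_error::runtime_error;
    const char* what( ) const noexcept override { return "parse error"; }
};

// Catching by value: the handler receives a sliced std::runtime_error copy,
// so ParseError::what() is never called. Compilers may warn about this.
std::string caught_by_value( )
{
    try
    {
        throw ParseError( "bad input" );
    }
    catch ( std::runtime_error err )
    {
        return err.what( );
    }
}

// Catching by const reference: no copy, no slicing, virtual dispatch works.
std::string caught_by_const_ref( )
{
    try
    {
        throw ParseError( "bad input" );
    }
    catch ( const std::runtime_error& err )
    {
        return err.what( );
    }
}

// Small self-check showing the difference between the two handlers.
void check_catch_semantics( )
{
    assert( caught_by_const_ref( ) == "parse error" );
    assert( caught_by_value( ) != "parse error" );
}
```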
examples/coappearances.cpp
Outdated
return ( static_cast< uint8_t >( id[ 0 ] ) << 8 ) + static_cast< uint8_t >( id[ 1 ] );
}

using CharactersIndex = std::map< uint16_t /* id */, uint16_t /* ref */ >;
This might need a better comment, as `id` on one side and `ref` on the other is a little bit confusing.
Yes, makes sense. I also realized that I do not actually need the `convert_id` function, since I am not storing ids anyway.
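One way the index could be made self-explanatory once `convert_id` is gone is to use named aliases instead of inline comments. This is a sketch under assumptions: `CharacterId`, `VertexRef`, and `build_index` are hypothetical names, and the ids are assumed to be unique strings taken as is from the JSON input.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Id of a character exactly as it appears in the JSON input.
using CharacterId = std::string;
// Position of that character in the `vertices` vector.
using VertexRef = std::uint16_t;
// Lookup from input id to the position used for references between structures.
using CharactersIndex = std::map< CharacterId, VertexRef >;

// Builds the index from ids listed in the order the vertices are written.
// Assumes ids are unique.
CharactersIndex build_index( const std::vector< CharacterId >& ids_in_vertex_order )
{
    CharactersIndex index;
    for ( const auto& id : ids_in_vertex_order )
    {
        // The next vertex position equals the number of ids indexed so far.
        const auto ref = static_cast< VertexRef >( index.size( ) );
        const bool inserted = index.emplace( id, ref ).second;
        assert( inserted );
        (void)inserted;  // silence unused-variable warnings in release builds
    }
    return index;
}
```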
examples/coappearances.flatdata
Outdated
* ]
* }
*/
namespace coappearances {
The schema needs a few comments, mostly about the way things refer to each other. Otherwise, having both "Relations" and "edges" makes it hard to understand.
Yes, makes sense. I also added extended docs to the whole schema.
Also: * Do not map character ids to `uint16_t`, but use them as is in the mapping.
int
main( int argc, char const* argv[] )
{
auto args = docopt::docopt( USAGE, {argv + 1, argv + argc} );
Not sure it makes sense to use something like that for such simple argument parsing.
We should wait with merging until #25 is merged. The latter solves an issue with
* Use `flatdata::Struct` to hold memory for populating Meta. * Use the new `flatdata::MultiVector` API.
All comments are fixed. Please review.
Also * Output the number of vertices and edges.
auto to_refs = relation.at( "to" ).get< picojson::array >( );
assert( to_refs.size( ) == 2 );
rel.to_a_ref = characters_index.at( to_refs[ 0 ].get< std::string >( ) );
rel.to_a_ref = characters_index.at( to_refs[ 1 ].get< std::string >( ) );
The same field (`to_a_ref`) is assigned twice.
Good catch. Fixed.
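For readers following along, the corrected assignment presumably looks like the sketch below. The field name `to_b_ref` and the `BinaryRelation` struct are assumptions made for this illustration; the actual fix in the PR is not shown on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the relation structure from the schema.
struct BinaryRelation
{
    std::uint16_t to_a_ref;
    std::uint16_t to_b_ref;  // assumed name of the second field
};

// Resolves the two character ids of a relation to vertex references.
// Throws std::out_of_range if an id is not in the index.
BinaryRelation make_relation( const std::map< std::string, std::uint16_t >& characters_index,
                              const std::vector< std::string >& to_ids )
{
    assert( to_ids.size( ) == 2 );
    BinaryRelation rel{ };
    rel.to_a_ref = characters_index.at( to_ids[ 0 ] );
    rel.to_b_ref = characters_index.at( to_ids[ 1 ] );  // second element goes to the second field
    return rel;
}
```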
The goal of this example is to show how to convert a nested data format into a flat representation in flatdata (here: json, a sample file is included). Further, it should introduce and use all available data structures in flatdata, hence exhaustive.
When #2 is implemented, this example could be extended to use enums.
I personally think that this example could also serve as an introductory tutorial for flatdata. In particular, techniques such as sentinels, ranges, storing strings in raw memory blocks, and multivectors can be explained easily with it.
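As a taste of what such a tutorial could cover, here is one common way to combine ranges with a sentinel. It is a sketch, not taken from the example: `Vertex`, `first_edge_ref`, `build_vertices`, and `edge_range` are hypothetical names, and whether the schema in this PR is laid out exactly like this is an assumption. Each vertex stores only where its edges start; the end of its range is the start of the next vertex, and one extra sentinel vertex closes the last range.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical flat vertex: stores only the index of its first edge.
struct Vertex
{
    std::size_t first_edge_ref;
};

// Builds the vertex list from per-vertex edge counts and appends a sentinel
// whose first_edge_ref equals the total number of edges.
std::vector< Vertex > build_vertices( const std::vector< std::size_t >& edge_counts )
{
    std::vector< Vertex > vertices;
    std::size_t next_edge = 0;
    for ( const std::size_t count : edge_counts )
    {
        vertices.push_back( Vertex{ next_edge } );
        next_edge += count;
    }
    vertices.push_back( Vertex{ next_edge } );  // sentinel, not a real vertex
    return vertices;
}

// Half-open range [begin, end) of edge indices belonging to vertex `i`.
std::pair< std::size_t, std::size_t > edge_range( const std::vector< Vertex >& vertices,
                                                  std::size_t i )
{
    assert( i + 1 < vertices.size( ) );  // the sentinel itself has no range
    return { vertices[ i ].first_edge_ref, vertices[ i + 1 ].first_edge_ref };
}
```

The sentinel means no per-vertex edge count has to be stored, and a vertex without edges simply has an empty range.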