Gephi Export/Import Format #190
Comments
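@kalanicraig @jdanish Thanks for sending the files. I have some questions about the file/format (keeping it here in GitHub so we have a record).
I'm trying to reconcile our existing NetCreate data format with the Gephi format and trying to decide what, if anything, we should change, with an eye towards future expansion and flexibility.
Elsewhere you noted:
Gephi node tables require:
ID: numeric only
Label: Any
Gephi’s edge table requires:
Source: numeric ID from node table
Target: numeric ID from node table
Gephi prefers a “Type” column in edge import that is “Directed” or “Undirected” but there’s a batch setting in the import process itself that supports users in choosing directed/undirected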
Documented here: #179 (comment)
This kind of suggests that there is a base file format that can be augmented by any number of fields. The other example csv files you sent seem to confirm this.
I think when we had started NetCreate, we were working off of a format exported from that old Google network diagram app (I can't remember the name). We ended up baking some assumptions about the data format into the tool itself based on that data format. But in reviewing the Gephi files and given your "requirements" listed above, I think we might want to revisit those assumptions.
I know in the past we had talked about adding the ability to add arbitrary data fields. And while our basic data format somewhat supports this (we use an attributes designator to group arbitrary fields), it is not currently supported by the UI.
The need to rework the data structures for exporting, importing, and template editing suggests that this might be the ideal time to at least lay the groundwork for supporting arbitrary fields.
Proposed Data Field Types
There are three types of fields:
Required -- All nodes and edges must have these fields.
Built-in Support -- Optional fields that require integration with the NetCreate application.
Arbitrary -- Optional fields with generic NetCreate support
1. Required
All nodes must have:
ID number
Label string
All edges must have:
ID number
Source number
Target number
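2. Built-in Support
These fields are optional, but their implementation requires API support.
NodeType string -- supports enumeration of types, supports color definition
EdgeType string -- supports enumeration of types, supports color definition (future?)
Degree number -- aka "Weight", requires NetCreate to calculate and store values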
3. Arbitrary
These fields are optional and can make use of simple type validation.
String
Number
Date
Boolean
Even if we don't fully implement this, implementing import, export, and template editing requires at least some of these modifications.
Any thoughts on this? Did I interpret the data format wrong? Does this feel like an overreach?
I don’t think it’s overreach at all! The required and built-in fields outlined here all track with what I would start with for a new, very basic network, and the arbitrary fields suit what I would need as a humanist in order to tie additional non-network data to the node/edge data.
Google Fusion Tables handled network data a little differently, with a greater focus on labels and alphanumeric than on numeric-ID governed relationships between tables.
Only one issue that I can see cropping up based on these notes: some of the arbitrary data types might take several forms. Not all numbers will be integers. Some will be decimals. For instance, a user might want latitude and longitude attributes in separate fields rather than in a user-entered text string (which is how I’ve done lat/long up to now). That said, I’m not sure the average user will know the difference (or want to differentiate) between integer and number, so we should figure out how to deal with display and validation with that in mind. I also imagine the user focus (or lack thereof) on the differences between integer/decimal and date/datetime would be similar and lead to similar validation issues for us. (I do feel strongly about ISO datetime formats for display and storage tho; YYYY-MM-DD for the win).
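To make the integer/decimal and date concerns concrete, here is a minimal sketch of keeping one user-facing "number" type while applying a stricter rule underneath. The `NumberRule` shape and the `isValidNumber` and `isIsoDate` helpers are hypothetical names for illustration only, not existing NetCreate code, and the latitude range in the usage note is just an example.

```typescript
// Hypothetical validation rules layered on top of a single "number" data type.
type NumberRule =
  | { kind: "integer" }
  | { kind: "decimal"; min?: number; max?: number };

function isValidNumber(value: number, rule: NumberRule): boolean {
  if (!isFinite(value)) return false; // rejects NaN and +/-Infinity
  if (rule.kind === "integer") return Math.floor(value) === value;
  const aboveMin = rule.min === undefined || value >= rule.min;
  const belowMax = rule.max === undefined || value <= rule.max;
  return aboveMin && belowMax;
}

// Strict YYYY-MM-DD check that also rejects impossible dates such as 2022-02-30.
function isIsoDate(text: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
  if (!match) return false;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(0);
  date.setUTCFullYear(year, month - 1, day);
  // If the calendar rolled over (e.g. Feb 30 -> Mar 2), the parts no longer match.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```

With this split, a latitude field could be declared as `{ kind: "decimal", min: -90, max: 90 }` while the user still just sees "number" in the template editor.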
On Jan 19, 2022, at 7:37 PM, benloh wrote:
@kalanicraig @jdanish Thanks for sending the files. I have some questions about the file/format (keeping it here in GitHub so we have a record).
I'm trying to reconcile our existing NetCreate data format with the Gephi format and trying to decide what, if anything, we should change, with an eye towards future expansion and flexibility.
Elsewhere you noted:
Gephi node tables require:
ID: numeric only
Label: Any
Gephi’s edge table requires:
Source: numeric ID from node table
Target: numeric ID from node table
Gephi prefers a “Type” column in edge import that is “Directed” or “Undirected” but there’s a batch setting in the import process itself that supports users in choosing directed/undirected
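As a rough illustration, the Gephi requirements listed above could be checked at import time along these lines. This is only a sketch: `GephiNodeRow`, `GephiEdgeRow`, and `checkGephiTables` are hypothetical names, not part of NetCreate or Gephi, and it assumes the CSV has already been parsed into string-valued rows using the column names quoted in this thread.

```typescript
// Hypothetical shapes for Gephi CSV rows that have already been parsed into strings.
type GephiNodeRow = { ID: string; Label: string };
type GephiEdgeRow = { Source: string; Target: string; Type?: string };

// "numeric only" is read here as digits only, no sign or decimal point (an assumption).
function isNumericId(value: string): boolean {
  return /^\d+$/.test(value.trim());
}

// Returns a list of human-readable problems; an empty list means the tables pass.
function checkGephiTables(nodes: GephiNodeRow[], edges: GephiEdgeRow[]): string[] {
  const problems: string[] = [];
  const knownIds: Record<string, boolean> = {};

  nodes.forEach((node, row) => {
    if (isNumericId(node.ID)) knownIds[node.ID.trim()] = true;
    else problems.push(`node row ${row}: ID "${node.ID}" is not numeric`);
  });

  edges.forEach((edge, row) => {
    (["Source", "Target"] as const).forEach((column) => {
      const ref = edge[column].trim();
      // Source and Target must both point at an ID from the node table.
      if (knownIds[ref] !== true) {
        problems.push(`edge row ${row}: ${column} "${ref}" is not a known node ID`);
      }
    });
    // Type is optional because the import dialog can set directedness in batch.
    if (edge.Type !== undefined && edge.Type !== "Directed" && edge.Type !== "Undirected") {
      problems.push(`edge row ${row}: Type "${edge.Type}" is not Directed or Undirected`);
    }
  });

  return problems;
}
```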
Documented here: #179 (comment)
This kind of suggests that there is a base file format that can be augmented by any number of fields. The other example csv files you sent seem to confirm this.
I think when we had started NetCreate, we were working off of a format exported from that old Google network diagram app (I can't remember the name). We ended up baking some assumptions about the data format into the tool itself based on that data format. But in reviewing the Gephi files and given your "requirements" listed above, I think we might want to revisit those assumptions.
I know in the past we had talked about adding the ability to add arbitrary data fields. And while our basic data format somewhat supports this (we use an attributes designator to group arbitrary fields), it is not currently supported by the UI.
The need to rework the data structures for exporting, importing, and template editing suggests that this might be the ideal time to at least lay the groundwork for supporting arbitrary fields.
Proposed Data Field Types
There are three types of fields:
Required -- All nodes and edges must have these fields.
Built-in Support -- Optional fields that require integration with NetCreate application.
Arbitrary -- Optional fields with generic NetCreate support
1. Required
All nodes must have:
ID number
Label string
All edges must have:
ID number
Source number
Target number
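One possible way to express the required fields as a small lookup table plus a check is sketched below. `REQUIRED_FIELDS` and `missingRequiredFields` are hypothetical names for illustration; the field names and types are taken from the lists above, and nothing here reflects how NetCreate actually stores its data.

```typescript
// Required fields and their expected JavaScript types, copied from the proposal above.
const REQUIRED_FIELDS = {
  node: { ID: "number", Label: "string" },
  edge: { ID: "number", Source: "number", Target: "number" },
} as const;

// Returns the names of required fields that are missing or have the wrong type.
function missingRequiredFields(
  kind: "node" | "edge",
  record: Record<string, unknown>
): string[] {
  const spec: Record<string, string> = REQUIRED_FIELDS[kind];
  return Object.keys(spec).filter((field) => {
    const value = record[field];
    if (typeof value !== spec[field]) return true;
    // NaN and Infinity have typeof "number" but are not usable IDs.
    return typeof value === "number" && !isFinite(value);
  });
}
```

Any extra keys on the record are ignored by this check, which matches the idea of a base format that can be augmented by any number of fields.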
2. Built-in Support
These fields are optional, but their implementation requires API support.
NodeType string -- supports enumeration of types, supports color definition
EdgeType string -- supports enumeration of types, supports color definition (future?)
Degree number -- aka "Weight", requires NetCreate to calculate and store values
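For Degree, the calculation NetCreate would have to own might look roughly like the sketch below. It assumes Degree means the number of edge endpoints touching a node (so a self-loop counts twice); that reading, along with the `EdgeRecord` shape and the `computeDegrees` name, is an assumption for illustration.

```typescript
// Hypothetical edge shape using only the required edge fields.
type EdgeRecord = { ID: number; Source: number; Target: number };

// Counts edge endpoints per node. Edges that reference unknown nodes are skipped.
function computeDegrees(nodeIds: number[], edges: EdgeRecord[]): Record<number, number> {
  const degree: Record<number, number> = {};
  nodeIds.forEach((id) => {
    degree[id] = 0;
  });

  const bump = (id: number): void => {
    if (id in degree) degree[id] = (degree[id] || 0) + 1;
  };

  edges.forEach((edge) => {
    bump(edge.Source);
    bump(edge.Target); // a self-loop therefore adds 2 to the same node
  });

  return degree;
}
```

Because the value is derived from the edge table, it would need to be recomputed (or invalidated) whenever edges are added, removed, or imported.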
3. Arbitrary
These fields are optional and can make use of simple type validation.
String
Number
Date
Boolean
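A minimal sketch of what "simple type validation" for the four arbitrary types could mean when reading CSV cells is shown below. `FieldType` and `parseField` are hypothetical names; storing dates as their original text is an assumption, and `Date.parse` accepts more formats than ISO, so a stricter date check may be wanted in practice.

```typescript
// The four arbitrary field types from the proposal above.
type FieldType = "string" | "number" | "date" | "boolean";
type FieldValue = string | number | boolean; // dates are kept as text (an assumption)

// Converts one raw CSV cell to the declared type, or returns undefined if it does not fit.
function parseField(raw: string, type: FieldType): FieldValue | undefined {
  const text = raw.trim();
  switch (type) {
    case "string":
      return raw;
    case "number": {
      const parsed = Number(text);
      return text !== "" && isFinite(parsed) ? parsed : undefined;
    }
    case "boolean":
      if (/^true$/i.test(text)) return true;
      if (/^false$/i.test(text)) return false;
      return undefined;
    case "date":
      // Lenient check: anything the JavaScript engine can parse as a date passes.
      return isNaN(Date.parse(text)) ? undefined : text;
  }
}
```

Returning `undefined` instead of throwing lets an importer collect every bad cell in a file and report them together.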
Even if we don't fully implement this, implementing import, export, and template editing requires at least some of these modifications.
Any thoughts on this? Did I interpret the data format wrong? Does this feel like an overreach?
We can and probably will differentiate between the data type (e.g. 'number') and validation (e.g. integer vs float vs date). JavaScript lets us treat all those as a …
Kalani provided this sample Gephi node + edge export file.
NodeEdgeGephiImport.xlsx