Gephi Export/Import Format #190

Open
benloh opened this issue Jan 19, 2022 · 3 comments

benloh commented Jan 19, 2022

Kalani provided this sample Gephi node + edge export file.

NodeEdgeGephiImport.xlsx

benloh commented Jan 20, 2022

@kalanicraig @jdanish Thanks for sending the files. I have some questions about the file/format (keeping it here in GitHub so we have a record).

I'm trying to reconcile our existing NetCreate data format with the Gephi format and decide what, if anything, we should change, with an eye towards future expansion and flexibility.

Elsewhere you noted:

Gephi node tables require:

  • ID: numeric only
  • Label: Any

Gephi’s edge table requires:

  • Source: numeric ID from node table
  • Target: numeric ID from node table
  • Gephi prefers a “Type” column in edge import that is “Directed” or “Undirected” but there’s a batch setting in the import process itself that supports users in choosing directed/undirected

Documented here: #179 (comment)
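To make those requirements concrete, here is a rough sketch of a check over already-parsed node and edge rows. The function name is hypothetical, the column names follow the list above, and treating "numeric only" as "digits only" is my assumption rather than something Gephi documents here.

```javascript
// Sketch only: checks the Gephi requirements quoted above against parsed rows.
// `checkGephiTables` is a hypothetical name; columns follow the quoted list.
function checkGephiTables(nodes, edges) {
  const errors = [];
  // Assumption: "numeric only" means a string of digits.
  const isNumericId = (v) => /^\d+$/.test(String(v).trim());
  const nodeIds = new Set();

  nodes.forEach((node, i) => {
    if (!isNumericId(node.ID)) errors.push(`node row ${i}: ID must be numeric`);
    else nodeIds.add(String(node.ID).trim());
  });

  edges.forEach((edge, i) => {
    for (const col of ['Source', 'Target']) {
      const value = String(edge[col]).trim();
      if (!nodeIds.has(value)) {
        errors.push(`edge row ${i}: ${col} "${value}" is not a node ID`);
      }
    }
    // Type is optional because directedness can be chosen during import.
    if (edge.Type !== undefined && !['Directed', 'Undirected'].includes(edge.Type)) {
      errors.push(`edge row ${i}: Type must be "Directed" or "Undirected"`);
    }
  });

  return errors;
}
```

An empty array means the tables satisfy the listed requirements; any extra columns are simply ignored.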

This suggests that there is a base file format that can be augmented by any number of fields. The other example CSV files you sent seem to confirm this.

I think when we started NetCreate, we were working off of a format exported from that old Google network diagram app (I can't remember the name). We ended up baking some assumptions about the data format into the tool itself based on that data format. But in reviewing the Gephi files and given your "requirements" listed above, I think we might want to revisit those assumptions.

I know in the past we had talked about adding the ability to add arbitrary data fields. And while our basic data format somewhat supports this (we use an attributes designator to group arbitrary fields), it is not currently supported by the UI.

The need to rework the data structures for exporting, importing, and template editing suggests that this might be the ideal time to at least lay the groundwork for supporting arbitrary fields.

Proposed Data Field Types

There are three types of fields:

  1. Required -- All nodes and edges must have these fields.

  2. Built-in Support -- Optional fields that require integration with the NetCreate application.

  3. Arbitrary -- Optional fields with generic NetCreate support.

1. Required

All nodes must have:

  • ID number
  • Label string

All edges must have:

  • ID number
  • Source number
  • Target number
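As a sketch of how an importer might enforce this, the helper below pulls the required fields out of a flat row and groups everything else under `attributes`. The helper name, the lowercase field names, and the case-insensitive column matching are all assumptions for illustration, not the current NetCreate code. It only checks that required fields are present, not that they are numbers.

```javascript
// Hypothetical helper: required field names come from the lists above.
const REQUIRED = {
  node: ['id', 'label'],
  edge: ['id', 'source', 'target'],
};

function toRecord(kind, row) {
  const record = { attributes: {} };
  for (const [key, value] of Object.entries(row)) {
    const name = key.toLowerCase(); // assumption: match columns case-insensitively
    if (REQUIRED[kind].includes(name)) record[name] = value;
    else record.attributes[key] = value; // everything else is an arbitrary field
  }
  const missing = REQUIRED[kind].filter(
    (name) => record[name] === undefined || record[name] === ''
  );
  if (missing.length > 0) {
    throw new Error(`${kind} is missing required field(s): ${missing.join(', ')}`);
  }
  return record;
}
```

For example, `toRecord('node', { ID: 1, Label: 'Ada', Notes: 'x' })` keeps `id` and `label` at the top level and moves `Notes` into `attributes`.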

2. Built-in Support

These fields are optional, but their implementation requires API support.

  • NodeType string -- supports enumeration of types, supports color definition
  • EdgeType string -- supports enumeration of types, supports color definition (future?)
  • Degree number -- aka "Weight", requires NetCreate to calculate and store values
  • Updated number -- last update date, requires NetCreate to store value
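Degree is the one field here that has to be derived from the rest of the data. A minimal sketch, assuming edges carry `source` and `target` node IDs as in the required fields and using the usual graph definition (each edge endpoint counts once, so a self-loop counts twice). How NetCreate actually calculates and stores it may differ, and `computeDegrees` is a made-up name.

```javascript
// Sketch: degree = number of edge endpoints touching each node.
function computeDegrees(nodes, edges) {
  // Start every known node at zero so isolated nodes are included.
  const degrees = new Map(nodes.map((node) => [node.id, 0]));
  for (const edge of edges) {
    for (const id of [edge.source, edge.target]) {
      // Endpoints that reference unknown nodes are skipped, not created.
      if (degrees.has(id)) degrees.set(id, degrees.get(id) + 1);
    }
  }
  return degrees;
}
```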

3. Arbitrary

These fields are optional and can make use of simple type validation.

  • String
  • Number
  • Date
  • Boolean
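"Simple type validation" could be as small as a lookup of one predicate per type. This is only a sketch: the `fieldTypes` map, the function name, and the choice to represent dates as `Date` objects are assumptions, and how dates are stored is left open here.

```javascript
// Sketch: one predicate per arbitrary field type listed above.
const VALIDATORS = {
  string: (v) => typeof v === 'string',
  number: (v) => typeof v === 'number' && Number.isFinite(v),
  boolean: (v) => typeof v === 'boolean',
  date: (v) => v instanceof Date && !Number.isNaN(v.getTime()),
};

// `fieldTypes` is a hypothetical map of field name to type, e.g. { notes: 'string' }.
function validateAttributes(fieldTypes, attributes) {
  const errors = [];
  for (const [name, type] of Object.entries(fieldTypes)) {
    const value = attributes[name];
    if (value === undefined || value === null) continue; // arbitrary fields are optional
    const check = VALIDATORS[type];
    if (!check) errors.push(`${name}: unknown type "${type}"`);
    else if (!check(value)) errors.push(`${name}: expected ${type}`);
  }
  return errors;
}
```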

Even if we don't fully implement this, implementing import, export, and template editing requires at least some of these modifications.

Any thoughts on this? Did I interpret the data format wrong? Does this feel like an overreach?

kalanicraig commented Jan 20, 2022 via email

benloh commented Jan 20, 2022

We can and probably will differentiate between the data type (e.g. 'number') and validation (e.g. integer vs float vs date). JavaScript lets us treat all of those as a number type, and then display and validation can be handled separately in the Template specification.
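Roughly what I have in mind, as a sketch: the `type` and `validation` property names are placeholders rather than the actual Template specification, and storing dates as millisecond timestamps is an assumption.

```javascript
// Hypothetical rules keyed by a template's `validation` value.
const NUMBER_RULES = {
  integer: (n) => Number.isInteger(n),
  float: (n) => Number.isFinite(n),
  // Assumption: dates are stored as milliseconds since the Unix epoch.
  date: (n) => Number.isFinite(n) && !Number.isNaN(new Date(n).getTime()),
};

function checkNumberField(fieldSpec, value) {
  if (typeof value !== 'number') return false; // data type check
  const rule = NUMBER_RULES[fieldSpec.validation];
  return rule ? rule(value) : true; // validation check, if the template names one
}

function displayNumberField(fieldSpec, value) {
  // Display is decided by the template, not by the stored type.
  return fieldSpec.validation === 'date' ? new Date(value).toISOString() : String(value);
}
```

With a spec like `{ type: 'number', validation: 'integer' }`, the value `3.5` passes the type check but fails validation, while the same value is fine under `validation: 'float'`.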

@benloh benloh added this to To Do in Version 1.4 via automation Jan 21, 2022
@benloh benloh moved this from To Do to In Progress in Version 1.4 Jan 26, 2022
@benloh benloh moved this from In Progress to In Review in Version 1.4 Jan 28, 2022
@benloh benloh added the discussion feature discussion label Mar 7, 2022