NullPointerException on importing CSV data in Data Laboratory #48
(by a-bruns) Gephi chokes on the import after some time (never at the same place, it seems), and throws the error; after that, it's possible in the Data Laboratory to find out where it finished (i.e. what the last imported edge was - the one with the highest edge ID). It's also possible to continue the import if you remove the already imported edges from the CSV, and import the new truncated file. With the attached CSV, I had to repeat that process more than half a dozen times to import all the edges... Hope that helps... Axel
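The truncate-and-retry workaround described above could be scripted roughly as follows. This is only a sketch: the function name `remaining_edges` is hypothetical, and it assumes the number of rows already imported can be read off from the last edge visible in the Data Laboratory, which the thread does not confirm in general.

```python
import csv
import io


def remaining_edges(csv_text, rows_already_imported):
    """Return a CSV with the header plus only the rows not yet imported.

    Assumption: rows are imported in file order, so the first
    `rows_already_imported` data rows can simply be skipped.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data[rows_already_imported:])
    return out.getvalue()


if __name__ == "__main__":
    sample = "Source,Target\na,b\nb,c\nc,d\n"
    print(remaining_edges(sample, 2))
```

The output can be saved as a new file and imported again; the header row is kept so the import wizard still sees the column names.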
(by eduramiba)
(by eduramiba)
(by a-bruns) On the parallel edges - yes, but they're probably occurring during different timeframes, right? Will Gephi permit that? Axel
(by mathieu.bastian) If you import your data using the Time Frame import (as described here: http://wiki.gephi.org/index.php/Import_Dynamic_Data), it will manage the parallel edges. Otherwise no; at this level I think we cannot guess they belong to different timeframes anyway, right?
(by a-bruns) Just following up again on the parallel edges question: what I'm wondering is whether there's a way to get Gephi to accept edges which appear only in specific time slices when importing CSVs of edges. For example, say I want an edge to exist only during 2001-2003 and 2005-2007 - which could be expressed like this: source,target,Time Interval At the moment, importing this CSV into Gephi ignores the second line (it's regarded as a parallel edge). Does Gephi do slices at all at the moment, and if so, how do you express them in the internal time interval notation? Axel
(by a-bruns) just following up on this again: I've experimented some more, and I'm now at a point where I'm not sure whether I'm doing something wrong, or whether there's still an error in Gephi somewhere. I've created a simple test network in Gephi: two nodes (labelled 'c' and 'g'), which are visible between times 0 and 30, and which are linked with an edge between times 1 and 10, and again between 20 and Infinity. If I export the nodes and edges tables as CSVs from Gephi's Data Laboratory, this is what I get: Nodes.csv: Id,Label,Time Interval Edges.csv: Source,Target,Type,Id,Label,Weight,Time Interval Importing these two CSVs back into the Gephi Data Laboratory (in a clean Workspace) works fine, but even if I make sure that the field type for 'Time Interval' is set to TimeInterval, I'm still not able to do any dynamic visualisation - the dynamic filter function simply doesn't work. Even only importing the nodes CSV file by itself (without the edges CSV) doesn't work - the lower and upper limits of the timeline slider are (correctly) set to 0 and 30, but on the 'filters' tab, no filter is available in the 'dynamic' subsection, and clicking on the 'Filter' button does nothing... Is there something I'm doing wrong here? Axel
(by mathieu.bastian) In the tutorial you also have the method of importing slices from a GEXF file. The fact that you're properly importing a TIME_INTERVAL column but it is not activated as the dynamic column could be improved: we could activate it by default if no other TIME_INTERVAL column exists. It is currently not done, as it is not the default use case.
(by a-bruns) many thanks for this again. I think what I'm trying to do is more complicated than that, though: First, as far as I can tell, the first approach (transforming numerical columns into a time interval) doesn't work for nodes or edges which appear, disappear, and then re-appear - e.g. <[1.0, 10.0); [20.0, Infinity]> in my example above, which is visible from 1 to 10, and then again from 20 to the end. I'm not sure whether Gephi allows me to select four columns (e.g. start1, end1, start2, end2) to convert into one single time interval - but even that would get very messy in the case of a node which appears and disappears very frequently over time... And I've been reluctant to try the GEXF route, mainly because of the nature of my data: if I understand GEXF slices right (and I may not), they work only for fairly well-defined and consistent time intervals (day by day, hour by hour, etc.). However, I'm dealing with continuous Twitter data which has no clear time intervals - tweets may appear at random, any second of the timeline, and I don't want to lose too much of that resolution by defining hour-by-hour slices, for example. Also, converting my Twitter data (which is in CSV form by default) to GEXF for import into Gephi would be very work-intensive - being able to import the CSVs of nodes and edges which I already have would be a lot easier... ;-) So if there's a way to add functionality in Gephi to activate the Time Interval column which I can already import, that would be the path of least resistance for my purposes... I know it's a very obscure request, but if there's any way to do it, that would be fantastic! Axel
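For data that is already in CSV form, the multi-slice notation quoted above, `<[1.0, 10.0); [20.0, Infinity]>`, could be generated with a small helper like the one below. This is a sketch under assumptions: the function names are hypothetical, and the bracket rule (every slice half-open except the last, which is closed) simply mirrors the single example in this thread rather than any documented Gephi rule.

```python
import math


def format_bound(value):
    """Render a bound as a float, using 'Infinity' for unbounded ends."""
    if math.isinf(value):
        return "Infinity" if value > 0 else "-Infinity"
    return repr(float(value))


def format_time_interval(slices):
    """Format (start, end) pairs, given in ascending order, as one interval string.

    Assumption: all slices but the last close with ')' and the last
    closes with ']', as in the example quoted in the thread.
    """
    parts = []
    for i, (start, end) in enumerate(slices):
        closer = "]" if i == len(slices) - 1 else ")"
        parts.append(f"[{format_bound(start)}, {format_bound(end)}{closer}")
    return "<" + "; ".join(parts) + ">"


if __name__ == "__main__":
    print(format_time_interval([(1, 10), (20, math.inf)]))
```

Because each element carries its own list of slices, this avoids the start1/end1/start2/end2 column layout, which grows with the number of reappearances.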
(by mathieu.bastian) I created bug 659017 to enable dynamics when importing a time interval column from the CSV wizard. That would make direct import of time intervals possible if they are correctly formatted.
(by a-bruns) Thanks for setting up the new bug as well! Axel
(by eduramiba) About the several slices: I think you will need to use the GEXF format to achieve that, since this CSV importer was intended to be a simple importer for table-formatted data.
(by mathieu.bastian) I deployed the patch on AutoUpdate. Axel, you can update your Gephi; it should work now.
(by a-bruns) brilliant - many thanks for this very fast work. I've just tried this with the very simple example I posted in comment #7 above, and everything works perfectly - scrolling through the timeline, the edge disappears between 10.0 and 20.0 as it should, but is visible at all other times between 1.0 and 30.0... Thanks again - this will make importing my Twitter data a lot easier. Axel
(by a-bruns) sorry to follow up on this again. There's one last thing missing that would be extremely helpful for me. Edge weights in Gephi can be time-dependent as well - you can see this if you have a network with time intervals and export an edge list as CSV from the Data Laboratory, for example. The format for time-dependent weights looks like this: <[100.0, Infinity, 1.0]> Or presumably, if there are different weights at different times, <[100.0, 200.0, 1.0); [200.0, 300.0, 2.0]> etc. Currently, Gephi doesn't accept time-dependent weights when importing edge list CSVs into the Data Laboratory, though. Here's a simple edge list which I've exported from the Data Laboratory: Source,Target,Type,Id,Label,Weight,Time Interval When re-importing this into Gephi, the weights are simply set to 1. Is there any chance of fixing this? Hopefully it would be just a matter of being able to choose the data type for the weight field when importing CSVs? Axel
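To make the time-dependent weight notation above concrete, here is a minimal parser for strings such as `<[100.0, 200.0, 1.0); [200.0, 300.0, 2.0]>`. It is a sketch based only on the two examples in this comment: the function name is hypothetical, the regular expression is not Gephi's own parser, and the distinction between `)` and `]` is deliberately ignored.

```python
import re

# One slice: opening bracket, three comma-separated values, closing bracket.
_SLICE = re.compile(
    r"[\[(]\s*([^,\s]+)\s*,\s*([^,\s]+)\s*,\s*([^,\s\])]+)\s*[\])]"
)


def parse_dynamic_weight(text):
    """Return a list of (start, end, weight) float triples.

    Python's float() accepts the string 'Infinity', so unbounded
    ends come back as math.inf.
    """
    return [
        (float(start), float(end), float(weight))
        for start, end, weight in _SLICE.findall(text)
    ]


if __name__ == "__main__":
    print(parse_dynamic_weight("<[100.0, 200.0, 1.0); [200.0, 300.0, 2.0]>"))
    print(parse_dynamic_weight("<[100.0, Infinity, 1.0]>"))
```

A script that converts exported CSVs to another format could use the resulting triples directly, one per weight slice.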
(by eduramiba)
(by a-bruns) Thanks for this. I've now bitten the bullet and scripted something that creates a working GEXF file from my data - the method is explained at http://www.mappingonlinepublics.net/2010/10/20/dynamic-networks-in-gephi-from-twapperkeeper-to-gexf/. As far as I can tell, it is possible for edge weights to be dynamic, though - in GEXF, for example, this would be expressed in the following fashion: This works in the GEXF files I've created now... Axel
Graph API throws NPE because source or target is null
Stack trace:
java.lang.NullPointerException
at org.gephi.graph.dhns.core.GraphFactoryImpl.newEdge(GraphFactoryImpl.java:128)
at org.gephi.graph.dhns.core.GraphFactoryImpl.newEdge(GraphFactoryImpl.java:123)
at org.gephi.graph.dhns.core.GraphFactoryImpl.newEdge(GraphFactoryImpl.java:47)
at org.gephi.datalab.impl.GraphElementsControllerImpl.buildEdge(GraphElementsControllerImpl.java:431)
at org.gephi.datalab.impl.GraphElementsControllerImpl.createEdge(GraphElementsControllerImpl.java:89)
at org.gephi.datalab.impl.AttributeColumnsControllerImpl.importCSVToEdgesTable(AttributeColumnsControllerImpl.java:597)
at org.gephi.datalab.plugin.manipulators.general.ui.ImportCSVUIWizardAction.performAction(ImportCSVUIWizardAction.java:85)
at org.gephi.datalab.plugin.manipulators.general.ImportCSV.execute(ImportCSV.java:40)
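Since the trace points at a null source or target when the edge is built, one cheap check before importing is to scan the edge CSV for rows where either cell is empty. The sketch below is only a pre-import sanity check, not a confirmed diagnosis: the original report says the failure never occurred at the same place, so blank cells may not be the cause. The function name is hypothetical, and the `Source` and `Target` column names are taken from the export header quoted earlier in the thread.

```python
import csv
import io


def find_incomplete_edges(csv_text, source_col="Source", target_col="Target"):
    """Return the file line numbers of rows with an empty source or target."""
    reader = csv.DictReader(io.StringIO(csv_text))
    bad_lines = []
    for row in reader:
        # Short rows yield None for missing cells, hence the `or ""`.
        source = (row.get(source_col) or "").strip()
        target = (row.get(target_col) or "").strip()
        if not source or not target:
            bad_lines.append(reader.line_num)
    return bad_lines


if __name__ == "__main__":
    sample = "Source,Target\na,b\nc,\n,d\n"
    print(find_incomplete_edges(sample))
```

Line numbers count the header as line 1, so they can be looked up directly in a text editor.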
Imported from Launchpad using lp2gh.