Add table with character data to API #31
Comments
The above PR implements this. See also https://dracor.org/documentation/api/#/public/get-cast.
This is a prerequisite for dracor-org/dracor-api#31
The values differ significantly from those provided in the Shiny app. At least in the case of closeness centrality, there seems to be either a different concept or a mistake on Shiny's side. Take, for instance, https://dracor.org/ger/lessing-philotas: if closeness centrality is the "reciprocal of the sum of the length of the shortest paths between the node and all other nodes in the graph" (Wikipedia), Philotas' closeness should be 1/3 = 0.3333333333333333. Shiny gives 1.0000, though. JSNetworkX's values also differ from those of the Shiny app for betweenness centrality, while the degrees match and there is no eigenvector centrality in Shiny (see https://dracor.org/api/corpora/ger/play/lessing-philotas/cast). @lehkost, could you have a look at those values and clarify which ones are correct?
Also, I'm not sure how the weighted degree is calculated. It does not seem to be in Shiny yet, and I have not found a function in JSNetworkX that would seem to provide it.
And finally, when calculating eigenvector centrality, JSNetworkX throws an exception for four of our plays (gogol-tjazhba, lermontov-strannyj-chelovek, brandes-ariadne-auf-naxos, panizza-nero), which is why in the cast lists of those plays the …
Great work already, so let's try to resolve the remaining issues.
Closeness centrality: Judging by what Gephi returns for Philotas, 1.0 is actually the correct value for the example provided. Philotas has 3 direct connections to the other nodes (so in each case the distance is 1), and the sum of these distances is 3. The number of other nodes (also 3) divided by that sum is 1.0. For Parmenio, this would be 3 divided by 5 = 0.6. The definition on Wikipedia is a bit hard to grasp, and its formula is different (it omits the normalization by the number of other nodes, hence 1/3). This paper (PDF) puts it much more concisely: CC = "Number of nodes divided by the sum of the topological distances". It also cites the benchmark book by Wasserman/Faust.
Weighted degree: This is basically the same value as degree, but instead of adding 1 to the degree for every new relation of a node, we add the weight for this pair of nodes (corresponding to the Weight table in our CSV files), i.e., if two characters co-appear in 4 scenes, their weight would be 4.
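To make the two definitions above concrete, here is a minimal pure-Python sketch. It is not the API's implementation. The toy graph is an assumption built only from what the comment states (Philotas is adjacent to all three other characters, Parmenio only to Philotas); the node names `X` and `Y`, the `X`-`Y` edge, and all edge weights are hypothetical placeholders.

```python
from collections import deque

# Hypothetical co-occurrence weights (number of shared scenes) for a toy graph.
# Only the shape around Philotas and Parmenio follows the discussion above.
WEIGHTS = {
    ("Philotas", "Parmenio"): 4,
    ("Philotas", "X"): 2,
    ("Philotas", "Y"): 1,
    ("X", "Y"): 1,
}

# Undirected adjacency lists derived from the edge list.
adj = {}
for a, b in WEIGHTS:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)


def shortest_path_lengths(adj, source):
    """Breadth-first search: unweighted distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness(adj, node, normalized=True):
    """Closeness of node; normalized = (reachable other nodes) / (sum of distances)."""
    dist = shortest_path_lengths(adj, node)
    total = sum(d for other, d in dist.items() if other != node)
    if total == 0:
        return 0.0
    return (len(dist) - 1) / total if normalized else 1 / total


def weighted_degree(weights, node):
    """Sum of the weights of all edges incident to node."""
    return sum(w for edge, w in weights.items() if node in edge)


print(closeness(adj, "Philotas"))                    # normalized variant (Gephi/Shiny reading)
print(closeness(adj, "Philotas", normalized=False))  # reciprocal-of-sum variant (Wikipedia reading)
print(closeness(adj, "Parmenio"))
print(weighted_degree(WEIGHTS, "Philotas"))
```

The `normalized` flag is the whole difference between the two readings discussed here: with it, Philotas gets 3/3; without it, 1/3.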
@lehkost I adjusted the closeness centrality and added the weighted degree. Now there is still a mismatch between Gephi's and NetworkX's eigenvector calculation (I checked: Python's NetworkX and JSNetworkX yield the same values for Philotas).
Some more info on the eigenvector centrality mismatch, which seems to happen between igraph and NetworkX. We're not the first to notice that (cf. "Eigenvector Centrality Oddity with iGraph, Gephi, and NetworkX"). While that article finds diverging values for all three (igraph, Gephi and NetworkX), we find that igraph and Gephi return the same results, while NetworkX begs to differ. To add another example, here's what our R script returns (using igraph) for "Emilia Galotti": The documentation for igraph and NetworkX both suggest that they rely on the same algorithm. Could you maybe check whether edge weights go into your calculation (which we don't do)? This could explain the different values…
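One way to test the edge-weight hypothesis without either library is a small power iteration run once with and once without weights. The sketch below is an illustration only and makes no claim about how igraph, Gephi or NetworkX work internally. The toy graph, the node names `X` and `Y`, and the weights are hypothetical. The `scale_to_max` helper is included because, as far as I know, igraph scales scores so that the maximum is 1 by default while NetworkX returns a vector of unit Euclidean length; that would be a second thing worth checking, not an established cause.

```python
import math

# Hypothetical co-occurrence weights on a toy four-node graph.
WEIGHTS = {
    ("Philotas", "Parmenio"): 4,
    ("Philotas", "X"): 2,
    ("Philotas", "Y"): 1,
    ("X", "Y"): 1,
}


def eigenvector_centrality(weights, use_weights=False, max_iter=1000, tol=1e-10):
    """Power iteration on (A + I); returns scores with unit Euclidean norm."""
    nodes = sorted({n for edge in weights for n in edge})
    x = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        new = dict(x)  # starting from x adds the identity shift, which aids convergence
        for (a, b), w in weights.items():
            k = w if use_weights else 1
            new[a] += k * x[b]
            new[b] += k * x[a]
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if sum(abs(new[n] - x[n]) for n in nodes) < tol:
            return new
        x = new
    raise RuntimeError("power iteration did not converge")


def scale_to_max(scores):
    """Rescale so that the highest score equals 1."""
    top = max(scores.values())
    return {n: v / top for n, v in scores.items()}


unweighted = eigenvector_centrality(WEIGHTS, use_weights=False)
weighted = eigenvector_centrality(WEIGHTS, use_weights=True)

for name in sorted(unweighted):
    print(name, round(unweighted[name], 4), round(weighted[name], 4),
          round(scale_to_max(unweighted)[name], 4))
```

In this toy graph `X` and `Y` are interchangeable when weights are ignored, but not once weights are used, so comparing such symmetric pairs in a real play is a quick check for whether weights entered the calculation.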
Proposed name:
/corpora/{corpusname}/play/{playname}/characters/csv
Proposed values:
ID and label:
Three quantitative measures:
<sp>
per character
Five network-based measures (per character)
As far as I can see, we do not yet calculate network values per character for API purposes. Our Shiny app already implements this in the Vertices tab and may serve as a point of reference for these values.