1.3.1 legend related scales in viz + title in meta #18

paulgirard · 2022-08-31T16:04:40Z

This 1.3.1 proposal aims to add data to GEXF that documents a graph drawing with a legend and a title.
GEXF viewer tools (such as https://gitlab.com/ouestware/retina/) currently cannot explain the logic behind the visual aspects of a graph without reverse-engineering the viz parameters and node attribute sets with some heuristic magic.
A good graph drawing often also starts with a good title, in addition to a description.

Therefore this proposal has two main parts:

add a title element in the GEXF meta
extend the viz module with ways to describe how the viz parameters were calculated from node/edge attributes. It adds ways to store the ranking/partition parameters and layout settings used in Gephi or in other GEXF producers. Its primary objective is documentation, but the current spec looks complete enough to let drawing tools not only draw a legend but also recompute the viz parameters from attributes.

To get a more precise idea of the proposal see:

the new 1.3.1 primer section about scales: gexf-131-primer-legend.pdf
the extended viz rnc file: https://github.com/gephi/gexf/blob/legend/specs/1.3.1/_viz.rnc#L82-L158

ping @duncdrum and @gvegayon for comments.

paulgirard · 2022-09-01T08:24:36Z

In the layout documentation, the current spec uses a layoutalgorithm attribute, which is a string.
This supposes that GEXF-related tools agree on a naming convention for layout algorithms, and not only for the algorithm names but also for their parameters.
Since the primary objective is only documentation, that's probably fine. Recomputing the layout would require recognizing layout algorithm and parameter names...

jacomyma

Nice, I just have 2 observations.

Scales

Two systems coexist for the scale factors (splines):

10 data points
the name (ex: "square-root")

When it comes to generating the scale from a device, I assume that neither the name nor the 10 data points are an issue. If the goal were just to document, we could reasonably stop there. But one of the motives for the proposal is to use these data as inputs for some tools (GEXF viewers). Let us then consider the reading side of the question.

A tool might just read and write the 10 data points. It would work well, aside from the "name" attribute being essentially useless. But some tools like current Gephi (0.9) work with native curves. Inferring the curve from the data points alone seems unreasonable to me, but it could work with the name. The settings, however, are not included. Here is the important remark: in most cases, a single data point suffices to infer the settings. For instance, for a power-law with a variable exponent, we can retrieve the exact equation from the name "power-law" and the single scalepoint at 0.5. For that nonobvious reason, I think adding a field for the settings, in addition to the name of the spline, is not necessary.
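To make the single-data-point argument concrete, here is a minimal Python sketch. It assumes that the "power-law" curve has the form factor = forratio^k on the normalized [0, 1] interval, which the proposal does not spell out, and `infer_exponent` is a hypothetical helper name, not part of GEXF or Gephi.

```python
import math


def infer_exponent(forratio: float, factor: float) -> float:
    """Recover k in factor = forratio ** k from a single scalepoint.

    Assumption: the curve is a pure power-law on the normalized [0, 1]
    interval. The point must lie strictly inside (0, 1) so that both
    logarithms are defined and non-zero.
    """
    if not (0.0 < forratio < 1.0 and 0.0 < factor < 1.0):
        raise ValueError("scalepoint must lie strictly inside (0, 1)")
    return math.log(factor) / math.log(forratio)


# The scalepoint at 0.5 of a square-root curve yields an exponent close to 0.5.
exponent = infer_exponent(0.5, 0.707106781)
```

The same idea would apply to any one-parameter family of curves, as long as the name identifies the family and the parameter can be solved for from one point.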

Layout

I understand where the logic comes from, and it is an OK compromise, but let me highlight that the final node coordinates are not necessarily the fruit of just one layout, and that the last layout applied is not necessarily the most relevant. Ex: Force Atlas 2 + label adjust as a finishing touch. Ultimately, it is each software's job to be clever about what it retains. Retaining everything does not make sense either (ex: a random layout "erases" whatever happened before). The current proposal covers the case of a single layout algorithm well, which is the most important, so I still support it. Edge cases will arise but I do not have a better idea.

paulgirard · 2022-09-02T12:19:40Z

Thank you @jacomyma

Scales

I added the scalelabel for documentation (I mean for human readers). I agree the scalepoints are not optimal for recreating the curve, but they would work for any curve, even a complex one that is not based on a function. An alternative would be to agree on a function expression language or on a finite list of frequently used methods. In my opinion, the former could be an option; the latter feels too limited.

layout

Good point. I would propose extending the layout element so that it can host a list of layouts rather than just one. The order is important, but I guess we can rely on the order of the XML children.

<viz:positions>
  <viz:layout algorithm="forceatlas2" referenceURL="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0098679">
    <viz:param name="scale" type="integer" value="10"/>
    <viz:param name="stronger gravity" type="boolean" value="true"/>
  </viz:layout>
  <viz:layout algorithm="nooverlap">
    <viz:param name="speed" type="integer" value="3"/>
    <viz:param name="ratio" type="float" value="1.2"/>
    <viz:param name="margin" type="float" value="5.0"/>
  </viz:layout>
</viz:positions>

What do you think?
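As an illustration of how a consumer might read such a list, here is a minimal Python sketch that walks the layouts in document order and casts each param according to its type attribute. The namespace URI, the `CASTS` table and the `read_layouts` helper are assumptions made for this example, not names taken from the spec.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder URI for the viz namespace, declared by the snippet itself.
VIZ_NS = "http://gexf.net/1.3/viz"

SNIPPET = f"""
<viz:positions xmlns:viz="{VIZ_NS}">
  <viz:layout algorithm="forceatlas2">
    <viz:param name="scale" type="integer" value="10"/>
    <viz:param name="stronger gravity" type="boolean" value="true"/>
  </viz:layout>
  <viz:layout algorithm="nooverlap">
    <viz:param name="ratio" type="float" value="1.2"/>
  </viz:layout>
</viz:positions>
"""

# Hypothetical mapping from the `type` attribute to a Python cast.
CASTS = {"integer": int, "float": float, "boolean": lambda v: v == "true"}


def read_layouts(xml_text):
    """Return [(algorithm, {param name: typed value}), ...] in document order."""
    root = ET.fromstring(xml_text)
    layouts = []
    for layout in root.findall(f"{{{VIZ_NS}}}layout"):
        params = {}
        for param in layout.findall(f"{{{VIZ_NS}}}param"):
            cast = CASTS.get(param.get("type"), str)  # unknown types stay strings
            params[param.get("name")] = cast(param.get("value"))
        layouts.append((layout.get("algorithm"), params))
    return layouts


layouts = read_layouts(SNIPPET)
```

`findall` returns children in document order, so the order of the XML children is preserved without any extra bookkeeping.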

duncdrum · 2022-09-02T12:33:20Z

@paulgirard while we could rely on sequence position, the actual order of steps is kind of important, so why not allow an optional @step attribute that takes xs:positiveInteger values? E.g.:

<viz:positions>
  <viz:layout algorithm="forceatlas2" referenceURL="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0098679" step="1">
    <viz:param name="scale" type="integer" value="10"/>
    <viz:param name="stronger gravity" type="boolean" value="true"/>
  </viz:layout>
  <viz:layout algorithm="nooverlap" step="2">
    <viz:param name="speed" type="integer" value="3"/>
    <viz:param name="ratio" type="float" value="1.2"/>
    <viz:param name="margin" type="float" value="5.0"/>
  </viz:layout>
</viz:positions>
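A reader could honor an explicit step along these lines. This is only a sketch: the namespace URI is a placeholder, `ordered_algorithms` is a hypothetical helper, and the fallback policy (use document order unless every layout carries a step) is an assumption, not something the proposal specifies.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder URI for the viz namespace.
VIZ_NS = "http://gexf.net/1.3/viz"

# Deliberately written out of order to show that `step` wins over position.
SNIPPET = f"""
<viz:positions xmlns:viz="{VIZ_NS}">
  <viz:layout algorithm="nooverlap" step="2"/>
  <viz:layout algorithm="forceatlas2" step="1"/>
</viz:positions>
"""


def ordered_algorithms(xml_text):
    """Return layout algorithm names in the order they were applied."""
    root = ET.fromstring(xml_text)
    layouts = root.findall(f"{{{VIZ_NS}}}layout")
    # Assumed policy: sort by step only when every layout declares one,
    # otherwise keep the order of the XML children.
    if layouts and all(layout.get("step") is not None for layout in layouts):
        layouts = sorted(layouts, key=lambda layout: int(layout.get("step")))
    return [layout.get("algorithm") for layout in layouts]


applied = ordered_algorithms(SNIPPET)
```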

paulgirard · 2022-09-02T12:51:42Z

Indeed, explicit is always better than implicit. I am adding this too.

gvegayon · 2022-09-02T15:54:23Z

@paulgirard, thanks for this; it is looking very useful. I like @duncdrum's idea about step, especially for reproducible research. Now, the challenge will be on the Gephi side, namely deciding which sequence of steps to store. Thinking out loud here: before saving GEXF files, Gephi could show the user the last n layout changes and let them select which ones to store; but that's a problem for later, I guess.

I also like @duncdrum's idea about using a common math language for the scalelabel attribute. Since all is going web, I would suggest something like JavaScript's Math. In such a case, it could be beneficial to perhaps define attributes as functions, for example, instead of:

<viz:sizes scale="quantitative" scalelabel="square-root">
  <viz:scalepoint forratio="0" factor="0" />
  <viz:scalepoint forratio="0.1" factor="0.316227766"/>
  <viz:scalepoint forratio="0.2" factor="0.447213595"/>
  <viz:scalepoint forratio="0.3" factor="0.547722558"/>
  <viz:scalepoint forratio="0.4" factor="0.632455532"/>
  <viz:scalepoint forratio="0.5" factor="0.707106781"/>
  <viz:scalepoint forratio="0.6" factor="0.774596669"/>
  <viz:scalepoint forratio="0.7" factor="0.836660027"/>
  <viz:scalepoint forratio="0.8" factor="0.894427191"/>
  <viz:scalepoint forratio="0.9" factor="0.948683298"/>
  <viz:scalepoint forratio="1.0" factor="1"/>
  <viz:range min="1" max="10" default="1" />
</viz:sizes>

Do

<viz:sizes scale="function" scalelabel="square-root">
    function(x) {
      return Math.sqrt((x-1)/(10-1));
    }
</viz:sizes>

I am no expert on XML, but having something like this would be super. Is this something worth implementing?
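For comparison, the discretized form can be produced and read back with very little code. The Python sketch below samples a function into scalepoints and looks values up by linear interpolation; the interpolation rule and both helper names are assumptions, since the proposal does not say how a reader should fill the gaps between points.

```python
import math


def make_scalepoints(transform, steps=10):
    """Sample `transform` at evenly spaced ratios in [0, 1].

    Returns a list of (forratio, factor) pairs, steps + 1 entries long.
    """
    return [(i / steps, transform(i / steps)) for i in range(steps + 1)]


def interpolate(scalepoints, ratio):
    """Piecewise-linear lookup of the factor for a ratio in [0, 1].

    Assumption: readers interpolate linearly between neighbouring points.
    """
    for (x0, y0), (x1, y1) in zip(scalepoints, scalepoints[1:]):
        if x0 <= ratio <= x1:
            return y0 + (y1 - y0) * (ratio - x0) / (x1 - x0)
    raise ValueError("ratio must be within [0, 1]")


points = make_scalepoints(math.sqrt)
approx = interpolate(points, 0.05)  # falls between the first two scalepoints
exact = math.sqrt(0.05)             # the interpolated value underestimates this
```

Near 0, where the square-root curve is steepest, the interpolated value is noticeably below the exact one, which is one practical cost of the discretized representation.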

paulgirard · 2022-09-05T14:38:22Z

Thank you @gvegayon
I don't think accepting plain JavaScript is a good idea, as it opens up code-injection security risks and it means choosing and promoting one programming language within a neutral data format.

Ideally such an expression should be mathematics only.
To take your example, it should reduce to:

sqrt((x-1)/(10-1))

I can't find a standard for mathematical expression syntax targeting evaluation and not rendering (MathML is for rendering).
(note: the shunting yard algo is for parsing and ordering tokens and not a math language standard https://en.wikipedia.org/wiki/Shunting_yard_algorithm)

In my opinion such a mathematical expression should be easy to evaluate in: Java, Python and JavaScript worlds.
It looks like every math expression evaluation library is using its own syntax without pointing to a standard:

But the main math functions seem to have the same names in those examples. This means we should check and document which math functions are supported in this expression, after having checked that they are common to the most frequently used implementations...
Doable but not exactly exciting. Or, to put it differently, it looks like a more complicated solution that does not bring much more than a set of common mathematical functions added to the GEXF format.

Finally, we should keep in mind that this would require GEXF producers/consumers such as Gephi to implement the production and evaluation of math expressions. So we should evaluate how easy our chosen representation is to use in this regard.
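To give an idea of what a math-only evaluation could involve on the consumer side, here is a minimal Python sketch that parses the expression with the standard `ast` module and only accepts numbers, the variable x, basic arithmetic and a small whitelist of functions. The whitelist and the `evaluate` helper are assumptions for illustration; no such expression syntax is defined by GEXF.

```python
import ast
import math
import operator

# Hypothetical whitelist; a real spec would have to enumerate these.
ALLOWED_FUNCS = {"sqrt": math.sqrt, "log": math.log, "log10": math.log10,
                 "exp": math.exp, "pow": math.pow}
ALLOWED_BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub,
                  ast.Mult: operator.mul, ast.Div: operator.truediv,
                  ast.Pow: operator.pow}
ALLOWED_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression, x):
    """Evaluate a math-only expression in the single variable x."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.Name) and node.id == "x":
            return x
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_BINOPS:
            return ALLOWED_BINOPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in ALLOWED_UNARY:
            return ALLOWED_UNARY[type(node.op)](visit(node.operand))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in ALLOWED_FUNCS and not node.keywords):
            return ALLOWED_FUNCS[node.func.id](*[visit(arg) for arg in node.args])
        # Anything else (attribute access, other names, etc.) is rejected.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return visit(ast.parse(expression, mode="eval"))


factor = evaluate("sqrt((x-1)/(10-1))", x=10)
```

Even this small evaluator needs an agreed list of function names and operators, which is the documentation effort described above; Java and JavaScript consumers would each need an equivalent.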

To sum up, here are the possible ways encountered so far to describe a non-linear function for a quantitative scale in GEXF:

a finite list of common mathematical functions (log, sqrt, pow...) to add to the GEXF format
the Gephi spline solution: two points in the 0-1 × 0-1 space defining a Bézier curve from 0,0 to 1,1
a discrete version of the curve (what is in the current proposal): a finite list of normalization curve points
a mathematical expression as discussed in this comment

At this point my personal feeling is to choose either a finite list of common math functions (D3 does that: https://github.com/d3/d3-scale#continuous-scales) or splines (already implemented in Gephi and flexible).
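For the spline option, a reader would need to turn the two control points into a ratio-to-factor mapping. The Python sketch below assumes the curve is a cubic Bézier with endpoints fixed at (0,0) and (1,1), and that the control points have x coordinates in [0, 1] so that x(t) is monotonic; whether this matches Gephi's implementation exactly is not confirmed here, and `bezier_transform` is a hypothetical name.

```python
def bezier_transform(ratio, origin_cp, destination_cp, iterations=50):
    """Map a ratio in [0, 1] to a factor along a cubic Bézier from (0,0) to (1,1).

    origin_cp and destination_cp are (x, y) control points. Assumes their x
    coordinates lie in [0, 1], so x(t) is monotonic and bisection is valid.
    """
    (x1, y1), (x2, y2) = origin_cp, destination_cp

    def coordinate(t, c1, c2):
        # Cubic Bézier along one axis, with endpoints fixed at 0 and 1.
        return 3 * (1 - t) ** 2 * t * c1 + 3 * (1 - t) * t ** 2 * c2 + t ** 3

    # Find the curve parameter t whose x coordinate equals the ratio.
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if coordinate(mid, x1, x2) < ratio:
            low = mid
        else:
            high = mid
    return coordinate((low + high) / 2, y1, y2)


# Control points on the diagonal give the identity mapping (a linear scale).
linear = bezier_transform(0.3, (1 / 3, 1 / 3), (2 / 3, 2 / 3))
```

Compared with a named function, the spline needs a numerical solve on the reading side, but it can express curves that none of the common functions cover.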

duncdrum · 2022-09-05T14:50:12Z

@paulgirard since we are talking about GEXF as a data format, is there anything missing from the XPath math functions? https://www.w3.org/2005/xpath-functions/math/#fo-math-summary I'd say these would be a more natural fit than Java or a custom syntax. Any XPath processor would be able to handle these already.

Just to note: not having math expressions is not a showstopper for me.

Yomguithereal · 2022-09-05T15:01:21Z

I tend to agree with @paulgirard personally and would be happy with only well-known, parametrizable scale options following d3 etc., such as pow, log, lin and sqrt. I would go as far as using splines, for Gephi compatibility and for cases that need more complexity, but I draw the line at custom math expressions, as they would introduce too much complexity and potential hurdles. I am not very fond of curve discretization with points (but it could be helpful with colors and their strange spaces).

gvegayon · 2022-09-06T16:28:17Z

Good points, @paulgirard! The thing about personalized math functions, @Yomguithereal, is mostly about flexibility. In general, I like building tools/standards that provide some wiggle room for things I have not thought of. Nevertheless, I also appreciate having a well-encapsulated file format! On a related note, the NeXML file format (for phylogenetics) includes a meta tag that allows adding arbitrary annotations.

That said, I agree with your last comment, @paulgirard,

At this point my personal feeling is to choose either a finite list of common math functions (D3 does that: https://github.com/d3/d3-scale#continuous-scales) or splines (already implemented in Gephi and flexible).

mbastian · 2022-09-07T06:52:19Z

+1 on supporting a finite set of common functions, in addition to the splines for compatibility. If we have this, do we really need to support the discrete version?

paulgirard · 2022-09-14T14:52:30Z

Thank you all.
As we converged on a solution, I updated the proposal:

added pow, sqrt, log10, log, exp, exp10 transform functions
added spline
removed discretized solution

      <attribute id="degree" title="Degree" type="integer">
        <default>0</default>
        <viz:sizes scale="quantitative" scalelabel="square-root">
          <viz:transform>
            <viz:sqrt />
          </viz:transform>
          <viz:range min="1" max="10" default="1" />
        </viz:sizes>
      </attribute>
      <attribute id="size" title="Size" type="integer">
        <default>0</default>
        <viz:sizes scale="quantitative" scalelabel="square">
          <viz:transform>
            <viz:pow exponent="2"/>
          </viz:transform>
          <viz:range min="1" max="25" default="1" />
        </viz:sizes>
      </attribute>
      <attribute id="pagerank" title="Page Rank" type="integer">
        <default>0</default>
        <viz:sizes scale="quantitative" scalelabel="spline">
          <viz:transform>
            <viz:spline>
              <viz:origin-control-point x="0.6" y="0.01"/>
              <viz:destination-control-point x="0.8" y="0.9" />
            </viz:spline>
          </viz:transform>
          <viz:range min="1" max="5" default="1" />
        </viz:sizes>
      </attribute>

What do you think?
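To show how a drawing tool might recompute a size from these declarations, here is a minimal Python sketch. It assumes that the attribute value is first normalized to a [0, 1] ratio using the attribute's minimum and maximum in the data, that the transform is applied to that ratio, and that the result is mapped linearly onto the viz:range. This order of operations, the `TRANSFORMS` table and `visual_size` are assumptions, not something the proposal states.

```python
import math

# Hypothetical mapping from transform element names to Python callables.
# log and exp are left out because their behaviour on a [0, 1] ratio
# (log(0) in particular) would need to be defined by the spec first.
TRANSFORMS = {
    "sqrt": lambda ratio: math.sqrt(ratio),
    "pow": lambda ratio, exponent=1.0: ratio ** exponent,
}


def visual_size(value, data_min, data_max, range_min, range_max,
                transform="sqrt", **params):
    """Compute a node size from an attribute value.

    Assumed pipeline: normalize -> transform -> map onto [range_min, range_max].
    """
    if data_max == data_min:
        return range_min  # degenerate data: every node gets the minimum size
    ratio = (value - data_min) / (data_max - data_min)
    factor = TRANSFORMS[transform](ratio, **params)
    return range_min + factor * (range_max - range_min)


# Degree 25 in a graph whose degrees span 0..100, with <viz:sqrt/> and range 1..10.
size = visual_size(25, 0, 100, 1, 10, transform="sqrt")
```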

I am waiting for some approvals before updating the primer.

paulgirard · 2022-09-14T14:54:06Z

PS: I couldn't find a way to reuse the XPath math function definitions, as XML specs are very new to me. If anyone thinks there is a better way to specify the math transform functions, please let me know 🙏

gvegayon · 2022-09-19T17:48:25Z

Thank you, @paulgirard! Question: How do <default>0</default> and <viz:range min="1" max="5" default="1" /> coexist (honest question)?

duncdrum · 2022-09-19T18:24:44Z

@paulgirard just saw this, I'll try to have a fork of your PR ready with XPath math before the weekend.

mbastian · 2022-09-23T17:28:57Z

@paulgirard One thought about degree columns. Normally, a GEXF file wouldn't include degree, in-degree, out-degree or edge kind columns, as those depend directly on the graph and so don't really need to be stored as attributes. We wouldn't plan to export those columns to GEXF via Gephi, for instance. But if a legend is based on the degree column we should still include it somehow, right? What do you suggest?
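To illustrate why such columns are usually considered derivable, the Python sketch below recomputes degrees from the edge list alone. The namespace URI is a placeholder declared by the snippet itself, and `degrees` is a hypothetical helper; it does not answer how a legend should reference a column that is not exported.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Assumption: placeholder URI for the GEXF namespace.
GEXF_NS = "http://gexf.net/1.3"

SNIPPET = f"""
<gexf xmlns="{GEXF_NS}">
  <graph defaultedgetype="undirected">
    <nodes>
      <node id="a"/><node id="b"/><node id="c"/>
    </nodes>
    <edges>
      <edge source="a" target="b"/>
      <edge source="a" target="c"/>
    </edges>
  </graph>
</gexf>
"""


def degrees(xml_text):
    """Return {node id: degree}, counting each edge once per endpoint."""
    root = ET.fromstring(xml_text)
    # Start every node at 0 so isolated nodes are still reported.
    counts = Counter({node.get("id"): 0 for node in root.iter(f"{{{GEXF_NS}}}node")})
    for edge in root.iter(f"{{{GEXF_NS}}}edge"):
        counts[edge.get("source")] += 1
        counts[edge.get("target")] += 1
    return dict(counts)


degree_by_node = degrees(SNIPPET)
```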

paulgirard added 7 commits August 30, 2022 12:23

GEXF legend specification first draft

6decb51

Ideas to be discussed before writing proper relaxng

[DOC] add jing requirements

abbedc0

don't build hidden (_ prefixes) rnc files

2025a9e

a workaround a multiple include issue with common in viz and gexf

[1.3.1] first complete rnc draft

7ea95cb

- use a common rnc to share type declarations in gexf and viz - legend/scale features in viz - one example to test validation

[primer] 1.3.1 primer first draft

3dfdde9

[1.3.1] Add a title in GEXF

fe54bcb

[1.3.1] README changelog

b5d0d9c

mbastian linked an issue Sep 1, 2022 that may be closed by this pull request

GEXF Legend #17

Open

mbastian requested review from duncdrum, eduramiba and jacomyma September 2, 2022 06:09

jacomyma approved these changes Sep 2, 2022

paulgirard added 3 commits September 2, 2022 15:00

[1.3.1] restrict ratio/factor atts to [0-1] range

2a32808

[1.3.1] positions as an ordered list of layouts

e2158e8

[1.3.1] primer with the new layouts specs

11a9c80

paulgirard mentioned this pull request Sep 7, 2022

Design guidelines gephi/gephi-lite#13

Closed

[1.3.1] viz:transform element

3a4ce8f

- removed scalepoint - added transform as a finite list of math functions or spline definition - primer and readme not updated yet

paulgirard mentioned this pull request Oct 25, 2022

Caption gephi/gephi-lite#21

Closed

7 tasks
