# Co-publication graph

## Lab authors

Lab authors are the main ingredient for analysing a single lab (i.e. a group of researchers). You can create one from just a name and then ask GISMAP to retrieve the database endpoints for this author automatically.

In [1]:
from gismap.lab import LabAuthor

maria = LabAuthor("Maria Potop")
maria.auto_sources()



We see a warning here. Let's look at the sources:

In [2]:
maria.sources

[HALAuthor(name='Maria Potop', key='858256', key_type='pid'),
 HALAuthor(name='Maria Potop', key='841868', key_type='pid'),
 DBLPAuthor(name='Maria Potop', key='p/MariaPotopButucaru')]

This is actually normal: Maria has multiple identities in HAL. The warning signals that there may be homonyms, but that is not the case here. Note that an author can have several names.

In [3]:
maria.aliases

['Maria Gradinariu', 'Maria Gradinariu Potop-Butucaru', 'Maria Potop-Butucaru']

When using `auto_sources`, you can specify which DBs should be used (only the online DBLP and HAL databases are available right now).

In [4]:
from gismap.sources.hal import HAL

celine = LabAuthor("Céline Comte")
celine.auto_sources(dbs=[HAL])
celine.sources

[HALAuthor(name='Céline Comte', key='celine-comte')]

Once the sources of an author are set, you can retrieve her publications.

In [5]:
celine.get_publications()

{'2118156': SourcedPublication(title="0 = 0, c'est le truc du noyau ! Application aux files d'attente", authors=[HALAuthor(name='Anne Bouillard', key='anne-bouillard'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Élie de Panafieu', key='Élie de Panafieu', key_type='fullname'), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='ALGOTEL 2019 - 21èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications', type='conference', year=2019, key='2118156', url='https://hal.science/hal-02118156v1'),
 '1889101': SourcedPublication(title='Of Kernels and Queues: when network calculus meets analytic combinatorics', authors=[HALAuthor(name='Anne Bouillard', key='anne-bouillard'), LabAuthor(name='Céline Comte', metadata=AuthorMetadata()), HALAuthor(name='Élie de Panafieu', key='Élie de Panafieu', key_type='fullname'), HALAuthor(name='Fabien Mathieu', key='fabien-mathieu')], venue='NetCal 2018', type='conference', year=2018, key='18891

Lab authors can carry metadata for display and further analysis, but we will not cover that in this tutorial.

## Your first lab

In GISMAP, a Lab is a class whose instances have two methods:

- `update_authors` automatically refreshes the members of the lab. It is useful at creation or when a lab evolves.
- `update_publis` makes a full refresh of the publications of a lab. All publications from lab members are considered (temporal filtering may be enabled later).

The simplest usable subclass of Lab is `ListLab`, which uses a list of names. For example, consider the executive committee of the LINCS lab plus Fabien Mathieu (GISMAP author).

In [6]:
from gismap.lab import ListLab

lab = ListLab(
    author_list=[
        "Tixeuil Sébastien",
        "Mathieu Fabien",
        "Kofman Daniel",
        "Baccelli François",
        "Noirie Ludovic",
        "Bassi Francesca",
    ],
    name="toy_example",
)
lab.update_authors()
lab.authors

{'tixeuil': LabAuthor(name='Tixeuil Sébastien', metadata=AuthorMetadata()),
 'fabien-mathieu': LabAuthor(name='Mathieu Fabien', metadata=AuthorMetadata()),
 'daniel-kofman': LabAuthor(name='Kofman Daniel', metadata=AuthorMetadata()),
 'francois-baccelli': LabAuthor(name='Baccelli François', metadata=AuthorMetadata()),
 'ludovic-noirie': LabAuthor(name='Noirie Ludovic', metadata=AuthorMetadata()),
 'francesca-bassi': LabAuthor(name='Bassi Francesca', metadata=AuthorMetadata())}

In [7]:
lab.update_publis()
len(lab.publications)

939

Labs can be saved so you don't have to re-update them all the time.

In [8]:
lab.dump(lab.name)

File toy_example.pkl.zst already exists! Use overwrite option to overwrite.


When you have a populated lab, you can use `lab2graph` to create the collaboration graph. That graph is a standalone HTML document that can be displayed in a notebook or saved for inclusion in a web page (an `iframe` is recommended in that case).

In [9]:
from gismap.lab import lab2graph
from IPython.display import display, HTML

display(HTML(lab2graph(lab)))
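
To include the graph in a web page, the HTML string can be written to a file and referenced from an `iframe`. The sketch below uses only the standard library. The helper names `save_graph` and `iframe_tag`, the file name, and the iframe dimensions are illustrative assumptions and not part of GISMAP; a placeholder string stands in for the output of `lab2graph(lab)`.

```python
import tempfile
from html import escape
from pathlib import Path


def save_graph(html: str, path: Path) -> Path:
    """Write a standalone HTML string to disk and return the path."""
    path.write_text(html, encoding="utf-8")
    return path


def iframe_tag(src: str, width: str = "100%", height: str = "600") -> str:
    """Build an iframe element pointing at the saved graph (hypothetical sizes)."""
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


# Placeholder: in the notebook this string would be lab2graph(lab).
graph_html = "<html><body><p>graph goes here</p></body></html>"

# A temporary directory is used here; pick any location served by your site.
saved = save_graph(graph_html, Path(tempfile.gettempdir()) / "toy_example.html")
print(iframe_tag(saved.name))
```

The printed tag can then be pasted into the host page, next to the saved file.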

A few things about the generated graph:

- Singletons (authors with no co-publications) are discarded by default.
- Authors are represented with their initials unless some picture url is provided.
- You can hover over an author to get her name. Clicking an author opens a modal with the list of her publications.
- The width and length of an edge depend on the number of co-publications. Clicking an edge opens a modal with the list of co-publications.
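
To make the notions of co-publication count and singleton concrete, here is a minimal sketch on toy data. It is not GISMAP's implementation: the author names, the `publications` structure, and the helpers `copublication_counts` and `connected_authors` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy data: each publication is reduced to the set of lab members who signed it.
publications = [
    {"alice", "bob"},
    {"alice", "bob", "carol"},
    {"carol"},
    {"dave"},
]


def copublication_counts(pubs):
    """Count, for each unordered pair of authors, the publications they share."""
    counts = Counter()
    for authors in pubs:
        # Sorting gives every pair a canonical order, e.g. ("alice", "bob").
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


def connected_authors(counts):
    """Authors appearing in at least one pair; all others are singletons."""
    return {author for pair in counts for author in pair}


edges = copublication_counts(publications)
nodes = connected_authors(edges)
print(edges)
print(nodes)
```

Here `dave` has no co-publication with another member, so he would be a singleton, while the pair `("alice", "bob")` gets the largest count and would correspond to the heaviest edge.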

## Make your own lab

The easiest way to manage a lab is to define an internal method `_author_iterator` that returns lab authors.

To use GISMAP on your own lab, you just need to define that method. Most of the time, this is done by scraping some Web page(s). See the references for examples.
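
As an illustration of the scraping step only, the following sketch extracts member names from a small HTML snippet with the standard library. The markup (`<li class="member">`), the names, and the helpers `MemberParser` and `iter_member_names` are hypothetical; a real team page will need its own parsing logic.

```python
from html.parser import HTMLParser


class MemberParser(HTMLParser):
    """Collect the text of every <li class="member"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_member = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "li" and ("class", "member") in attrs:
            self._in_member = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_member = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_member and text:
            self.names.append(text)


def iter_member_names(page: str):
    """Yield the member names found in an HTML page."""
    parser = MemberParser()
    parser.feed(page)
    parser.close()
    yield from parser.names


# Made-up page content standing in for a downloaded team page.
page = """
<ul>
  <li class="member">Alice Martin</li>
  <li class="member">Bob Durand</li>
  <li class="alumni">Carol Petit</li>
</ul>
"""
names = list(iter_member_names(page))
print(names)
```

In a custom lab, `_author_iterator` would then wrap each scraped name in a `LabAuthor`, which accepts a name as shown at the start of this tutorial.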


## Example

The `LaasLab` class automatically builds a lab representation from `https://www.laas.fr/fr/equipes/*team_name*/`

In [10]:
from gismap.lab import LaasLab

display(HTML(lab2graph(LaasLab.load("sara"))))