## Tutorial

This tutorial demonstrates how to use the `grobidmonkey` package to read GROBID-processed papers. We use the published paper 'Influential Node Detection on Graph on Event Sequence' as an example; you can find the PDF [here](https://github.com/com3dian/Grobidmonkey/tree/master/Document/resources/example.pdf) and the TEI-XML [here](https://github.com/com3dian/Grobidmonkey/tree/master/Document/resources/example.pdf.tei.xml).

In [1]:
# import grobid reader
from grobidmonkey import reader

# select one of the three methods: monkey, lxml, x2d
monkeyReader = reader.MonkeyReader('monkey') # or 'lxml' or 'x2d'

# read paper outline
outline = monkeyReader.readOutline('resources/example.pdf.tei.xml', True)

Article
├── 1 Introduction
├── 2 Proposed Method
│   ├── 2.1 Graph on Event Sequence
│   ├── 2.2 Hawkes Process for Influence Measurement
│   └── 2.3 Soft K-Shell Algorithm
├── 3 Experiments and Results
│   ├── 3.1 SIR Simulation Results
│   ├── 3.2 Computational Complexity Results
│   └── 3.3 Soft Shell Decomposition
└── 4 Conclusion


The second argument controls whether the outline is printed while reading. You can also leave it out and print the outline yourself:

In [2]:
outline = monkeyReader.readOutline('resources/example.pdf.tei.xml')

# outline is an anytree RenderTree object; to print it, iterate over its rows
for pre, fill, node in outline:
    print("%s%s" % (pre, node.name))

Article
├── 1 Introduction
├── 2 Proposed Method
│   ├── 2.1 Graph on Event Sequence
│   ├── 2.2 Hawkes Process for Influence Measurement
│   └── 2.3 Soft K-Shell Algorithm
├── 3 Experiments and Results
│   ├── 3.1 SIR Simulation Results
│   ├── 3.2 Computational Complexity Results
│   └── 3.3 Soft Shell Decomposition
└── 4 Conclusion
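
To give a feel for where such an outline comes from, here is a minimal sketch that rebuilds an indented outline from TEI-XML using only the Python standard library. It is not how `grobidmonkey` is implemented. It assumes section headings appear as `<head n="...">` elements inside `<div>` elements in the TEI namespace, and the `SAMPLE_TEI` string and the `outline_lines` helper are hypothetical names introduced for illustration (the sample is hand-written, not taken from `example.pdf.tei.xml`).

```python
import xml.etree.ElementTree as ET

# TEI namespace used by TEI-XML documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hand-written stand-in for a TEI body (assumption: headings carry their
# section number in the "n" attribute).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div><head n="1">Introduction</head><p>First paragraph.</p></div>
    <div><head n="2">Proposed Method</head><p>Overview.</p></div>
    <div><head n="2.1">Graph on Event Sequence</head><p>Details.</p></div>
    <div><head n="3">Conclusion</head><p>Summary.</p></div>
  </body></text>
</TEI>"""


def outline_lines(tei_xml):
    """Return one indented line per section heading, in document order."""
    root = ET.fromstring(tei_xml)
    lines = []
    for head in root.iterfind(".//tei:body/tei:div/tei:head", TEI_NS):
        # Tolerate a trailing dot in the section number, e.g. "2.1."
        number = head.get("n", "").rstrip(".")
        # Nesting depth is inferred from the dots in the section number.
        depth = number.count(".") if number else 0
        title = "".join(head.itertext()).strip()
        lines.append("    " * depth + f"{number} {title}".strip())
    return lines


print("\n".join(outline_lines(SAMPLE_TEI)))
```

Inferring depth from the section number keeps the sketch short; headings without a number are simply treated as top-level entries.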


The `grobidmonkey` reader can also read the entire essay as a dictionary: each key is a section title, and each value is the list of that section's paragraphs.
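
Because the result is a plain dictionary of lists, ordinary Python is enough to post-process it. The sketch below uses a hand-written stand-in with the same shape (section title mapped to a list of paragraph strings). The names `sample_essay`, `section_stats`, and `full_text` are hypothetical and the text is invented, so treat this as an illustration of the data structure rather than part of the package.

```python
# Hypothetical stand-in with the shape described above:
# section title -> list of paragraph strings. The text is invented.
sample_essay = {
    "Abstract": ["A short summary of the paper."],
    "Introduction": [
        "Real-world networks are large.",
        "However, many of them change over time.",
    ],
}


def section_stats(essay_dict):
    """Return (title, paragraph count, word count) for each section, in order."""
    stats = []
    for title, paragraphs in essay_dict.items():
        words = sum(len(p.split()) for p in paragraphs)
        stats.append((title, len(paragraphs), words))
    return stats


def full_text(essay_dict):
    """Join all sections into one string, with each title on its own line."""
    parts = []
    for title, paragraphs in essay_dict.items():
        parts.append(title)
        parts.extend(paragraphs)
    return "\n".join(parts)


for title, n_paragraphs, n_words in section_stats(sample_essay):
    print(f"{title}: {n_paragraphs} paragraph(s), {n_words} words")
```

The same helpers should work on any dictionary with this shape, including one returned by the reader, as long as the values are lists of strings.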

In [3]:
essay = monkeyReader.readEssay('resources/example.pdf.tei.xml')

for key, value in essay.items():
    print(key)
    for paragraph in value:
        print(' * ' + paragraph[:20] + '...')
    print('-----')

Abstract
 * Numerous research ef...
-----
Introduction
 * Real-world networks ...
 * However, many real-w...
 * Although the already...
 * To fix the mentioned...
-----
Proposed Method
 * This section present...
-----
Graph on Event Sequence
 * This research propos...
 * 1: The graph is a di...
 * Why 'Graph on Event ...
 * models only consisti...
-----
Hawkes Process for Influence Measurement
 * The Hawkes process [...
 * Definition 1. A grap...
 * where u, v are the n...
-----
Soft K-Shell Algorithm
 * Accordingly, we prop...
-----
Experiments and Results
 * This study compares ...
-----
SIR Simulation Results
 * In this study, we co...
 * The SIR model is a w...
 * The transmission rat...
 * (a) The SIR results ...
 * Fig. 3. Following th...
 * The results of Soft ...
 * In particular, four ...
 * data set and the NCJ...
 * Table 1 presents the...
 * -Shell O(|V | 2 ) MD...
-----
Computational Complexity Results
 * In this paper, we al...
-----
Soft Shell Decomposition
 * Besides th