# Verrific Quick Demo

This notebook shows how to:
0. Get a TEI XML file from a PDF using GROBID.
1. Load references from an existing GROBID TEI XML file.
2. (Optionally) Enrich them using a running **biblio-glutton** service.
3. View a concise summary table of extracted vs. matched references.


## Setup
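Run GROBID on the PDF files in a directory to get TEI XML files. This step needs a reachable GROBID server (here `http://devserver:8070`, as shown in the log output).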

In [1]:
# Imports & path setup
from verrific.core import Verrific
import pandas as pd
from pathlib import Path

# Run GROBID over the PDF directory (requires a running GROBID server)
Verrific.process_pdf_dir(Path('/Users/jakub/dev/verrific/pdf'))

tei_path = Path('/Users/jakub/dev/verrific/tei/2012-an-open-large-scale-collaborative-effort-to-estimate-the-reproducibility-of-psychological-science.grobid.tei.xml')
assert tei_path.exists(), f'TEI file not found: {tei_path}'
tei_path

INFO - Loading configuration file from /Users/jakub/dev/verrific/.config.json
INFO - Configuration file loaded successfully
2025-09-09 10:22:48,697 - grobid_client.grobid_client - INFO - Logging configured - Level: INFO, Console: True, File: disabled
2025-09-09 10:22:48,716 - grobid_client.grobid_client - INFO - GROBID server http://devserver:8070 is up and running
2025-09-09 10:22:48,717 - grobid_client.grobid_client - INFO - Found 1 file(s) to process
2025-09-09 10:22:52,107 - grobid_client.grobid_client - INFO - Processing completed: 1 out of 1 files processed


PosixPath('/Users/jakub/dev/verrific/tei/2012-an-open-large-scale-collaborative-effort-to-estimate-the-reproducibility-of-psychological-science.grobid.tei.xml')
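
The path above is specific to one machine. As a rough sketch, TEI outputs could instead be discovered by file suffix. The helper name `find_tei_files` is hypothetical and not part of Verrific, and the `.grobid.tei.xml` suffix is an assumption taken from the file name shown in the output above.

```python
from pathlib import Path


def find_tei_files(tei_dir: Path, suffix: str = ".grobid.tei.xml") -> list[Path]:
    """Return the files in tei_dir whose names end with suffix, sorted by name."""
    return sorted(
        p for p in tei_dir.iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )


# Hypothetical usage (the directory is an assumption; adjust to your layout):
# tei_files = find_tei_files(Path("tei"))
# tei_path = tei_files[0]
```

Picking the first match is only sensible when the directory holds a single document, as in this demo.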

## 1. Parse TEI into structured references

In [6]:
v = Verrific.from_grobid_tei(tei_path)
print(f'Extracted {len(v.references)} references.')
# Peek at the first 3 references (titles truncated to 80 characters)
print('First 3 extracted references (raw):\n')
for r in v.references[:3]:
    title = (r.title[:80] + '...') if r.title and len(r.title) > 80 else r.title
    print('- DOI:', r.doi, '| First Author:', r.first_author_surname, '\n  Title:', title)

Extracted 21 references.
First 3 extracted references (raw):

- DOI: None | First Author: Bacon 
  Title: Fr. Rogeri Bacon Opera quaedam hactenus inedita
- DOI: 10.1038/483531a | First Author: Begley 
  Title: Raise standards for preclinical cancer research
- DOI: 10.1177/1745691611429353 | First Author: Bertamini 
  Title: Bite-size science and its undesired side effects
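
To sanity-check the extracted count independently of Verrific, the TEI can be inspected with the standard library. This is a minimal sketch, not how `Verrific.from_grobid_tei` works internally: it assumes the usual GROBID layout in which each reference is a `biblStruct` inside a `listBibl`, and the function name `list_tei_references` and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# TEI namespace used by GROBID output
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def list_tei_references(tei_xml: str) -> list[dict]:
    """Return title and DOI (or None) for each biblStruct under a listBibl."""
    root = ET.fromstring(tei_xml)
    refs = []
    for bibl in root.iterfind(".//tei:listBibl/tei:biblStruct", TEI_NS):
        title = bibl.find(".//tei:title", TEI_NS)
        doi = bibl.find(".//tei:idno[@type='DOI']", TEI_NS)
        refs.append({
            "title": title.text if title is not None else None,
            "doi": doi.text if doi is not None else None,
        })
    return refs


# Tiny hand-written sample (an assumption, not real GROBID output)
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><back><div type="references"><listBibl>
    <biblStruct>
      <analytic>
        <title level="a" type="main">Raise standards for preclinical cancer research</title>
        <idno type="DOI">10.1038/483531a</idno>
      </analytic>
    </biblStruct>
    <biblStruct>
      <monogr><title level="m">Changing order</title></monogr>
    </biblStruct>
  </listBibl></div></back></text>
</TEI>"""

refs = list_tei_references(SAMPLE_TEI)
print(len(refs), "references found")
```

For a real file, `ET.parse(tei_path).getroot()` can replace `ET.fromstring` so that the XML encoding declaration is handled by the parser.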


## 2. (Optional) Enrich via biblio-glutton

In [3]:
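# You can skip this cell if you don't have biblio-glutton running.
# Top-level `await` works in Jupyter/IPython; in a plain script, wrap the call in asyncio.run(...).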
BASE_URL = 'http://devserver:8080'  # Change if needed (e.g., 'http://localhost:8080')
try:
    await v.enrich_with_biblio_glutton(base_url=BASE_URL)
    print('Enrichment complete.')
except Exception as e:
    print('Enrichment skipped / failed:', e)

Enrichment complete.
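
For debugging a single reference by hand, it can help to build the lookup URL yourself and open it in a browser or with `curl`. The sketch below assumes biblio-glutton exposes a `/service/lookup` endpoint that accepts a `doi` or a raw `biblio` string as query parameters; check this against the documentation of your deployed version. The helper `glutton_lookup_url` is hypothetical and is not claimed to be what `enrich_with_biblio_glutton` does internally.

```python
from typing import Optional
from urllib.parse import urlencode


def glutton_lookup_url(
    base_url: str,
    doi: Optional[str] = None,
    raw_citation: Optional[str] = None,
) -> str:
    """Build a lookup URL from a DOI and/or a raw citation string."""
    params = {}
    if doi:
        params["doi"] = doi
    if raw_citation:
        params["biblio"] = raw_citation  # assumed parameter name
    if not params:
        raise ValueError("Provide a DOI or a raw citation string")
    # urlencode percent-escapes reserved characters such as '/'
    return f"{base_url.rstrip('/')}/service/lookup?{urlencode(params)}"


url = glutton_lookup_url("http://devserver:8080", doi="10.1038/483531a")
print(url)
```

No request is sent here; the function only constructs the URL, so it is safe to run without a biblio-glutton instance.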


## 3. Summary table
A ✅ in the `Matched` column indicates a successful enrichment (an HTTP 200 response with a JSON body) and no error marker; other rows, such as row 8 below, are flagged with ⚠️.

In [4]:
summary_df = v.summary()
summary_df

| | DOI | Title | First Author Surname | Raw | Matched |
|---|---|---|---|---|---|
| 0 | | Fr. Rogeri Bacon Opera quaedam hactenus inedita | Bacon | Bacon, R. (1859). Fr. Rogeri Bacon Opera quaed... | ✅ |
| 1 | 10.1038/483531a | Raise standards for preclinical cancer research | Begley | Begley, C. G., & Ellis, L. M. (2012). Raise st... | ✅ |
| 2 | 10.1177/1745691611429353 | Bite-size science and its undesired side effects | Bertamini | Bertamini, M., & Munafò, M. R. (2012). Bite-si... | ✅ |
| 3 | | ESP and psychokinesis. A philosophical examina... | Braude | Braude, S. E. (1979). ESP and psychokinesis. A... | ✅ |
| 4 | | Changing order | Collins | Collins, H. M. (1985). Changing order. London,... | ✅ |
| 5 | 10.1037/h0076157 | Consequences of prejudice against the null hyp... | Greenwald | Greenwald, A. G. (1975). Consequences of preju... | ✅ |
| 6 | 10.3389/fncom.2012.00008 | Tracking replicability as a method of post-pub... | Hartshorne | Hartshorne, J. K., & Schachner, A. (2012). Tra... | ✅ |
| 7 | 10.1371/journal.pmed.0020124 | Why most published research findings are false | Ioannidis | Ioannidis, J. P. A. (2005). Why most published... | ✅ |
| 8 | 10.1126/sci-ence.334.6060.1225 | Again, and again, and again | Jasny | Jasny, B. R., Chin, G., Chong, L., & Vignieri,... | ⚠️ |
| 9 | 10.1177/0956797611430953 | Measuring the prevalence of questionable resea... | John | John, L., Loewenstein, G., & Prelec, D. (2012)... | ✅ |
