# 🎓 Scientific Article Metadata Extractor — Demo

This notebook demonstrates the full workflow of our Gemini-based scientific metadata extractor:
- Parsing academic PDFs
- Building prompts dynamically
- Extracting structured metadata using the Gemini API
- Compiling results into a DataFrame
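
As a rough illustration of the "building prompts dynamically" step, the sketch below assembles a prompt from a list of requested fields and the article text. It is not the code in `src.metadata_extractor`; `build_prompt`, `DEFAULT_FIELDS`, the field names, and the `max_chars` budget are all hypothetical names chosen for this example.

```python
import json
from typing import Sequence

# Hypothetical field list; the real extractor may request different fields.
DEFAULT_FIELDS = ("title", "authors", "year", "doi")


def build_prompt(
    article_text: str,
    fields: Sequence[str] = DEFAULT_FIELDS,
    max_chars: int = 8000,
) -> str:
    """Assemble an extraction prompt from the requested fields and the article text."""
    # Show the model the exact JSON shape expected in the reply.
    schema = json.dumps({name: None for name in fields}, indent=2)
    # Truncate long articles so the prompt stays within a rough size budget.
    excerpt = article_text[:max_chars]
    return (
        "Extract the following metadata from the scientific article below.\n"
        "Respond with a single JSON object using exactly these keys, "
        "and use null for anything you cannot find:\n"
        f"{schema}\n\n"
        "Article text:\n"
        f"{excerpt}"
    )


prompt = build_prompt(
    "Deep Learning for Protein Folding. A. Author, 2021 ...",
    fields=["title", "year"],
)
print(prompt)
```

Because the schema is generated from `fields`, adding or removing a field changes the prompt without touching the template text.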

In [None]:
import os
from pathlib import Path
from pprint import pprint
import pandas as pd

# Load environment variables (such as API credentials) from a local .env file
from dotenv import load_dotenv
load_dotenv()

# Project imports
from src.metadata_extractor import extract_metadata
from src.batch_runner import extract_all

In [None]:
pdf_path = Path("data/ID_120_SANCHEZ_GORDON.pdf")
metadata = extract_metadata(pdf_path)
pprint(metadata)
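
The dictionary printed above comes from a model reply that has to be turned into JSON first. The sketch below shows one way such a reply could be parsed and checked, assuming the reply is a JSON object that may be wrapped in a Markdown code fence. `parse_metadata_reply` and `REQUIRED_KEYS` are hypothetical and are not taken from `src.metadata_extractor`.

```python
import json
import re

# Hypothetical set of keys every reply is expected to contain.
REQUIRED_KEYS = ("title", "authors", "year")

# Built at runtime so this example never contains a literal code fence.
FENCE = "`" * 3
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)\s*" + FENCE, flags=re.DOTALL)


def parse_metadata_reply(raw: str) -> dict:
    """Parse a model reply into a dict and check that the required keys exist."""
    # Use the fenced payload if there is one, otherwise the whole reply.
    match = FENCED_JSON.search(raw)
    payload = match.group(1) if match else raw.strip()
    data = json.loads(payload)  # raises a ValueError subclass on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


sample_reply = (
    FENCE + "json\n"
    + '{"title": "A Study", "authors": ["Doe, J."], "year": 2020}\n'
    + FENCE
)
parsed = parse_metadata_reply(sample_reply)
print(parsed)
```

Raising `ValueError` for both malformed JSON and missing keys lets a caller handle every bad reply with a single `except` clause.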

In [None]:
data_dir = Path("data")
df = extract_all(data_dir)
df.head()
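
For readers curious how a batch step like this can be structured, here is a minimal sketch that runs an extractor over every PDF in a folder and collects one row per file. It is an assumption about the general shape, not the actual `extract_all` implementation; `compile_metadata`, `fake_extractor`, and the `source_file` and `error` columns are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable

import pandas as pd


def compile_metadata(pdf_dir: Path, extractor: Callable[[Path], dict]) -> pd.DataFrame:
    """Run `extractor` on each PDF in `pdf_dir` and return one row per file."""
    rows = []
    for pdf_path in sorted(pdf_dir.glob("*.pdf")):
        try:
            row = dict(extractor(pdf_path))
            row["error"] = None
        except Exception as exc:
            # Record the failure and continue so one bad file does not abort the batch.
            row = {"error": str(exc)}
        row["source_file"] = pdf_path.name
        rows.append(row)
    return pd.DataFrame(rows)


def fake_extractor(path: Path) -> dict:
    """Stand-in for a real extractor so the example runs without API access."""
    if path.stem == "broken":
        raise ValueError("could not parse PDF")
    return {"title": path.stem.replace("_", " ")}


with tempfile.TemporaryDirectory() as tmp:
    for name in ("good_paper.pdf", "broken.pdf"):
        (Path(tmp) / name).touch()
    demo_df = compile_metadata(Path(tmp), fake_extractor)

print(demo_df)
```

Passing the extractor in as a callable keeps the loop testable with a stub, and the `error` column makes failed files visible in the resulting table instead of silently dropping them.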

In [None]:
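output_path = Path("output/demo_output.csv")
output_path.parent.mkdir(parents=True, exist_ok=True)  # to_csv does not create missing folders
df.to_csv(output_path, index=False)
print(f"Saved to {output_path}")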

## ✅ Summary

- Extracted metadata from every PDF in `data/` with `extract_all`.
- Gemini returned structured JSON for each article.
- Consolidated the results into a DataFrame and exported them to `output/demo_output.csv`.

This notebook can be submitted as part of the demo deliverables.