# ArXiv Service Examples

This notebook demonstrates how to use the `ArxivService` to search for and download papers from arXiv.

## Setup

First, let's import the necessary modules and initialize the service.

In [None]:
import logging
from pathlib import Path
import sys

# Add parent directory to path
sys.path.append(str(Path.cwd().parent))

from src.services.arxiv_service import ArxivService

# Setup logging
logging.basicConfig(level=logging.INFO)

# Initialize the service
service = ArxivService(rate_limit_delay=3.0)
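
The `rate_limit_delay=3.0` argument suggests the service pauses between requests; arXiv's API terms ask clients to make no more than one request every three seconds. How `ArxivService` enforces the delay is not shown in this notebook, so the following is only a minimal sketch of one common approach. `RateLimiter` and its injectable `clock` and `sleep` parameters are hypothetical names chosen for illustration.

```python
import time


class RateLimiter:
    """Hypothetical helper: keep at least `delay` seconds between calls."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Sleep just long enough to respect the delay; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.delay - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

Calling `wait()` before each request keeps the spacing without sleeping longer than needed when the caller has already been idle.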

## Example 1: Search by Topic

Search for papers on a specific topic. You can use keywords, categories (e.g., `cat:cs.AI`), or any arXiv query.
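
What `exact=True` does internally is not shown in this notebook. In arXiv's query syntax, wrapping words in double quotes asks for a phrase match, so one plausible reading is that the flag quotes the topic. The sketch below illustrates that idea only; `build_query` is a hypothetical helper, not part of `ArxivService`.

```python
def build_query(topic: str, exact: bool = False) -> str:
    """Hypothetical helper: turn a topic into an arXiv query string."""
    topic = topic.strip()
    # Crude check: treat anything containing ":" (e.g. "cat:cs.AI") as an
    # already-formed query and leave it untouched.
    if ":" in topic:
        return topic
    if exact:
        # Double quotes ask arXiv to match the words as one phrase.
        return f'all:"{topic}"'
    return f"all:{topic}"


print(build_query("attention is all you need", exact=True))
print(build_query("cat:cs.AI"))
```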

In [None]:
# Search for papers matching "attention is all you need"
papers = service.search_by_topic(
    topic="attention is all you need",
    max_results=5,
    exact=True
)

print(f"Found {len(papers)} papers\n")

for i, paper in enumerate(papers, 1):
    print(f"{i}. {paper.title}")
    print(f"   ID: {paper.arxiv_id}")
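    # Show up to three authors; add an ellipsis only when more exist
    authors = ", ".join(a.name for a in paper.authors[:3])
    suffix = ", ..." if len(paper.authors) > 3 else ""
    print(f"   Authors: {authors}{suffix}")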
    print(f"   Published: {paper.published.strftime('%Y-%m-%d')}")
    print()

## Example 2: Access Paper Metadata

Each `Paper` object exposes metadata such as its identifiers, categories, abstract, DOI, journal reference, and download status.
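
Fields such as `short_id` and `year` are probably derived from other fields rather than stored separately. The sketch below shows one way this could work, assuming `short_id` means the identifier without its version suffix and `year` comes from the publication date. `PaperSketch` is a hypothetical stand-in, not the real `Paper` class.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PaperSketch:
    """Hypothetical stand-in for Paper with two stored fields."""

    arxiv_id: str        # e.g. "1706.03762v5"
    published: datetime

    @property
    def short_id(self) -> str:
        # Assumption: drop a trailing version marker such as "v5".
        return re.sub(r"v\d+$", "", self.arxiv_id)

    @property
    def year(self) -> int:
        # Assumption: the year is taken from the publication date.
        return self.published.year


sketch = PaperSketch("1706.03762v5", datetime(2017, 6, 12))
print(sketch.short_id, sketch.year)
```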

In [None]:
if papers:
    paper = papers[0]
    
    print(f"Paper ID: {paper.arxiv_id}")
    print(f"Short ID: {paper.short_id}")
    print(f"Year: {paper.year}")
    print(f"Primary Category: {paper.primary_category}")
    print(f"All Categories: {', '.join(paper.categories)}")
    print(f"\nAbstract:\n{paper.abstract[:300]}...")
    print(f"\nDOI: {paper.doi}")
    print(f"Journal Ref: {paper.journal_ref}")
    print(f"Status: {paper.status}")

## Example 3: Download a Paper

Download the PDF of a paper to your local machine.
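
How `download_pdf` names and stores files is not shown here. As a rough sketch of the path handling such a method might need, the helpers below build the public PDF URL for an identifier and a safe local filename. Both function names are hypothetical; only the `https://arxiv.org/pdf/<id>` pattern comes from arXiv itself.

```python
from pathlib import Path


def pdf_url(arxiv_id: str) -> str:
    """Build the arXiv PDF URL for an identifier."""
    return f"https://arxiv.org/pdf/{arxiv_id}"


def pdf_target(destination: Path, arxiv_id: str) -> Path:
    """Hypothetical helper: choose a local file path for the PDF."""
    # Old-style IDs such as "hep-th/9901001" contain a slash, which would
    # otherwise be read as a subdirectory.
    safe_name = arxiv_id.replace("/", "_")
    return destination / f"{safe_name}.pdf"


print(pdf_url("1706.03762"))
print(pdf_target(Path("downloads"), "hep-th/9901001"))
```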

In [None]:
if papers:
    paper_to_download = papers[0]
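    download_dir = Path("downloads")
    # Create the target folder up front in case the service does not
    download_dir.mkdir(parents=True, exist_ok=True)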
    
    print(f"Downloading: {paper_to_download.title}")
    print(f"To: {download_dir}")
    
    try:
        pdf_path = service.download_pdf(
            paper=paper_to_download,
            destination=download_dir
        )
        print(f"\nSuccessfully downloaded to: {pdf_path}")
        print(f"Paper status: {paper_to_download.status}")
        print(f"Downloaded at: {paper_to_download.downloaded_at}")
    except Exception as e:
        print(f"Error downloading: {e}")

## Advanced: Search by Category

You can search for papers in specific arXiv categories using the `cat:` prefix.
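
arXiv's query syntax also lets a category filter be combined with other fields through the boolean operators `AND`, `OR`, and `ANDNOT`. Whether `search_by_topic` passes such strings through unchanged is an assumption; the hypothetical helper below only builds the query text.

```python
def category_query(category: str, keyword: str = "") -> str:
    """Hypothetical helper: restrict a search to one arXiv category."""
    query = f"cat:{category}"
    if keyword:
        # Quote the keyword so multi-word phrases stay together.
        query += f' AND all:"{keyword}"'
    return query


print(category_query("cs.AI"))
print(category_query("cs.AI", "reinforcement learning"))
```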

In [None]:
# Search for recent papers in Computer Science - Artificial Intelligence
papers = service.search_by_topic(
    topic="cat:cs.AI",
    max_results=5,
    exact=False
)

print(f"Found {len(papers)} papers in cs.AI\n")

for i, paper in enumerate(papers, 1):
    print(f"{i}. {paper.title}")
    print(f"   ID: {paper.arxiv_id}")
    print(f"   First Author: {paper.first_author}")
    print(f"   Published: {paper.published.strftime('%Y-%m-%d')}")
    print()