# Programmatic ingestion via the REST API

The frontend normally drives the ingestion workflow by calling three FastAPI endpoints in sequence:

1. [`/upload`](../backend/score.py) registers a `Document` node and stores the original file under `backend/merged_files`.
2. [`/extract`](../backend/score.py) chunks the text, generates embeddings, writes graph triples, and updates the document status.
3. [`/post_processing`](../backend/score.py) materialises optional indices (vector + hybrid search, similarity graph, etc.).
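The order of the three calls can be sketched as follows. This is a minimal illustration, not the frontend's code: `run_ingestion_sequence`, the `post` callable, and the payload dictionaries are hypothetical names, and the form fields each endpoint actually expects are defined in `backend/score.py` and are not reproduced here.

```python
from typing import Any, Callable, List, Mapping

# Endpoint order taken from the list above.
ENDPOINTS = ("/upload", "/extract", "/post_processing")


def run_ingestion_sequence(
    base_url: str,
    payloads: Mapping[str, Mapping[str, Any]],
    post: Callable[[str, Mapping[str, Any]], Any],
) -> List[Any]:
    """Call the three endpoints in order and collect the responses.

    `post` is a caller-supplied function (hypothetical) that sends one
    request, for example a thin wrapper around an HTTP client.
    """
    responses = []
    for endpoint in ENDPOINTS:
        url = base_url.rstrip("/") + endpoint
        responses.append(post(url, payloads.get(endpoint, {})))
    return responses


# Dry run with a fake `post` that only records the URLs it would hit.
calls = []


def fake_post(url, data):
    calls.append(url)
    return {"url": url, "fields": sorted(data)}


# "example_field" is a placeholder, not a real field of the API.
run_ingestion_sequence(
    "http://localhost:8000/",
    {"/upload": {"example_field": "x"}},
    fake_post,
)
```

Passing the transport in as a function keeps the sketch independent of any particular HTTP client and makes the sequence easy to dry-run.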

If your project has already been preprocessed into Markdown files, you can bypass the UI entirely. Run the backend as usual (for example with `uvicorn score:app --reload` from the `backend` directory, or through Docker), then call the helper in `backend/scripts/bulk_ingest_via_api.py`.

### Use directly from Python (ideal for notebooks)

The helper exposes an `ingest_project_via_api` function so you can trigger ingestion from Python without touching the command line. This works well at the end of your preprocessing notebook:

```python
from backend.scripts.bulk_ingest_via_api import ingest_project_via_api

ingest_project_via_api(
    project_name="PROJECT_NAME",
    base_url="http://localhost:8000",
    uri="neo4j://neo4j:7687",
    user="neo4j",
    password="letmein123",
    database="neo4j",
)
```

All parameters fall back to the same environment variables that the UI relies on (`LLM_GRAPH_BUILDER_BASE_URL`, `NEO4J_URI`, etc.), so you can omit explicit arguments when they are already configured.
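The fallback behaviour described above might look roughly like this. It is a sketch under assumptions, not the helper's actual code: `resolve_setting` is a hypothetical name, and only the two environment variables named above are used.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    explicit: Optional[str],
    env_name: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the explicit argument if given, else the environment value."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get(env_name)


# A stand-in environment keeps the example independent of the current shell.
example_env = {"LLM_GRAPH_BUILDER_BASE_URL": "http://localhost:8000"}

# No explicit value: the environment variable is used.
base_url = resolve_setting(None, "LLM_GRAPH_BUILDER_BASE_URL", example_env)

# Explicit value: it wins, whether or not NEO4J_URI is set.
uri = resolve_setting("neo4j://neo4j:7687", "NEO4J_URI", example_env)
```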

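If the notebook's working directory is not the repository root, put the root on `sys.path` first so that the `backend` package can be imported:

```python
import sys
from pathlib import Path

# Walk up from the current directory until a folder containing `backend` is found.
repo_root = Path.cwd().resolve()
while repo_root != repo_root.parent and not (repo_root / "backend").exists():
    repo_root = repo_root.parent

# Make the `backend` package importable from the notebook.
sys.path.insert(0, str(repo_root))
```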

The call can also set the model and chunking parameters explicitly:

```python
from backend.scripts.bulk_ingest_via_api import ingest_project_via_api

PROJECT_NAME = "moadchat"

ingest_project_via_api(
    project_name=PROJECT_NAME,
    base_url="http://localhost:8000",
    uri="neo4j://neo4j:7687",
    user="neo4j",
    password="letmein123",
    database="neo4j",
    model="openai_gpt_5_mini",
    token_chunk_size=200,
    chunk_overlap=50,
    chunks_to_combine=1,
)
```

Example output from this run (the final `1` is the value the notebook displayed for the call):

```
INFO - Uploading Componentization- Decomposing Monolithic LLM Responses into Manipulable Semantic Units V1.md
INFO - Extracting Componentization- Decomposing Monolithic LLM Responses into Manipulable Semantic Units V1.md
INFO - Running post processing tasks: enable_hybrid_search_and_fulltext_search_in_bloom, materialize_text_chunk_similarities
INFO - Completed ingestion for moadchat (1 files)

1
```
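To build intuition for `token_chunk_size` and `chunk_overlap`, here is a generic sliding-window sketch. It reflects what such parameters commonly mean and is not the backend's chunker: `window_spans` is a hypothetical helper, and the backend may count tokens or handle boundaries differently.

```python
from typing import List, Tuple


def window_spans(n_tokens: int, chunk_size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return (start, end) token offsets for overlapping fixed-size windows."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each new window advances
    spans = []
    start = 0
    while start < n_tokens:
        end = min(start + chunk_size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break  # the last window reached the end of the text
        start += step
    return spans


# With the values used above, each window advances by 150 tokens.
spans = window_spans(500, chunk_size=200, overlap=50)
# spans == [(0, 200), (150, 350), (300, 500)]
```

Under this reading, a larger `chunk_overlap` repeats more context between neighbouring chunks at the cost of producing more chunks per document.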