In [None]:
# %pip install retab

`Projects` provide a systematic way to test and validate your extraction schemas against known ground truth data.

**They combine the best of all worlds: schema generation, reasoning, consensus, and deployment.**

Once optimized through iterations, a project lets you define a precise schema, get high-accuracy outputs, and control spend by balancing accuracy against the processing cost of the models for your use case.
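The accuracy/cost trade-off can be sketched as a simple selection rule: among the iterations that meet a target accuracy, keep the cheapest one. This is only an illustration. `IterationResult`, `pick_iteration`, and every number below are hypothetical and are not part of the retab SDK or taken from a real evaluation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class IterationResult:
    """Hypothetical summary of one iteration's evaluation."""
    name: str
    accuracy: float      # fraction of fields matching the ground truth, 0..1
    cost_per_doc: float  # processing cost per document, arbitrary units


def pick_iteration(
    results: Sequence[IterationResult], min_accuracy: float
) -> Optional[IterationResult]:
    """Return the cheapest iteration meeting the accuracy target, or None."""
    eligible = [r for r in results if r.accuracy >= min_accuracy]
    if not eligible:
        return None
    # Cheapest first; on equal cost, prefer the more accurate iteration.
    return min(eligible, key=lambda r: (r.cost_per_doc, -r.accuracy))


# Made-up numbers, for illustration only.
results = [
    IterationResult("iter_a", accuracy=0.91, cost_per_doc=1.0),
    IterationResult("iter_b", accuracy=0.97, cost_per_doc=4.0),
    IterationResult("iter_c", accuracy=0.96, cost_per_doc=2.5),
]

best = pick_iteration(results, min_accuracy=0.95)
print(best.name if best else "no iteration meets the target")
```

Raising or lowering `min_accuracy` shifts the choice toward more expensive or cheaper iterations, which is the balance described above.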

We wanted to provide an **easy deployment** solution, defined mainly by two parameters: `project_id` and `iteration_id`. You can integrate your project into your codebase, or use low-code platforms such as [Dify](https://cloud.dify.ai/apps) or [n8n](https://n8n.io/).

**More information on Projects and Deployments in the [documentation](https://docs.retab.com/core-concepts/Projects).**

### **TYPICAL WORKFLOW**

1. Go to the [retab platform](https://www.retab.com/dashboard) > upload sample documents > add a description to specify the information to extract (optional)
2. Validate the Schema > upload the test documents that make up the Dataset > annotate & define your "ground truth"
3. Go to the *Evaluation* section > create a new Iteration by changing some parameters to get better accuracy

Click on Deploy and get your `project_id` and `iteration_id`.
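Since a deployment is identified by these two values, one option is to keep them out of the source code and read them from the environment, next to the API key that `load_dotenv()` already loads. The sketch below assumes two hypothetical variable names, `RETAB_PROJECT_ID` and `RETAB_ITERATION_ID`. They are a convention chosen for this example, not something the retab SDK reads on its own.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical variable names chosen for this example.
REQUIRED_KEYS = ("RETAB_PROJECT_ID", "RETAB_ITERATION_ID")


def load_deployment_ids(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (project_id, iteration_id), failing early if either is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["RETAB_PROJECT_ID"], env["RETAB_ITERATION_ID"]


# Example with placeholder values; in practice, call load_deployment_ids()
# with no argument after load_dotenv() has populated os.environ.
project_id, iteration_id = load_deployment_ids(
    {"RETAB_PROJECT_ID": "<your-project-id>", "RETAB_ITERATION_ID": "<your-iteration-id>"}
)
print(project_id, iteration_id)
```

The returned pair can then be passed as the `project_id` and `iteration_id` arguments of the extraction call.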

In [5]:
from dotenv import load_dotenv
from retab import Retab

load_dotenv()

client = Retab()

completion = client.deployments.extract(
    project_id="eval_qMRNjHxg67yxCXDH4l797",
    iteration_id="eval_iter_nB0PsLkVUBSjQH1knVXXu",  # Here we use the best iteration we've built
    document="../assets/code/NVIDIA-PR-Q1-2026.pdf"
)

print(completion)

RetabParsedChatCompletion(id='chatcmpl-Bw9pWBgjOYoy4y68tBJ55RkXKhWEc', choices=[RetabParsedChoice(finish_reason='stop', index=0, logprobs=None, message=ParsedChatCompletionMessage(content='{"fund_name": "NVIDIA Corporation", "fund_code": "NVDA", "as_of_date": "2025-04-27", "risk_classification": "Medium", "key_facts": {"inception_date": "2024-06-07", "distributions": "Quarterly", "nav": 44.062, "aggregate_assets": 125.254, "mer": 0, "trailing_yield": 0, "also_available_through": "", "benchmark": "", "fund_status": "", "morningstar_category": ""}, "performance": {"start_date": "2024-04-28", "end_date": "2025-04-27", "start_value": 26.044, "end_value": 44.062, "annualized_return": 69, "total_distributions": 0, "calendar_returns": [], "standard_period_returns": [{"period": "1 yr", "value": 69}]}, "total_holdings": 0, "top_holdings_aggregate": 0, "footnotes": ["All per share amounts presented herein have been retroactively adjusted to reflect NVIDIA\\u2019s ten-for-one stock split, which w

In [7]:
import json

parsed = json.loads(completion.choices[0].message.content)  # Convert the JSON string to a Python dictionary
print(json.dumps(parsed, indent=2, ensure_ascii=False))  # Pretty-print with indentation, keeping non-ASCII characters

{
  "fund_name": "NVIDIA Corporation",
  "fund_code": "NVDA",
  "as_of_date": "2025-04-27",
  "risk_classification": "Medium",
  "key_facts": {
    "inception_date": "2024-06-07",
    "distributions": "Quarterly",
    "nav": 44.062,
    "aggregate_assets": 125.254,
    "mer": 0,
    "trailing_yield": 0,
    "also_available_through": "",
    "benchmark": "",
    "fund_status": "",
    "morningstar_category": ""
  },
  "performance": {
    "start_date": "2024-04-28",
    "end_date": "2025-04-27",
    "start_value": 26.044,
    "end_value": 44.062,
    "annualized_return": 69,
    "total_distributions": 0,
    "calendar_returns": [],
    "standard_period_returns": [
      {
        "period": "1 yr",
        "value": 69
      }
    ]
  },
  "total_holdings": 0,
  "top_holdings_aggregate": 0,
  "footnotes": [
    "All per share amounts presented herein have been retroactively adjusted to reflect NVIDIA’s ten-for-one stock split, which was effective June 7, 2024.",
    "Certain statements in