# **SCRAPING USING LINKUP** & **RETAB**

This notebook shows how to scrape and structure **the latest quarterly press release (PR) issued by NVIDIA** (Investor Relations page [here](https://investor.nvidia.com/financial-info/financial-reports/)) using:

- **[Linkup `Search` Endpoint](https://docs.linkup.so/pages/documentation/api-reference/endpoint/post-search)** to get clean Markdown of NVIDIA's latest PR.

- **[Retab](https://www.retab.com/)** to define a `schema` and a `prompt`, then generate precise structured output from Linkup's clean Markdown while limiting LLM hallucinations.

*Retab's platform lets you automatically generate, iterate on, and deploy schemas and prompts to production. See the [Documentation here](https://docs.retab.com/overview/introduction).*

Built with 🩷 by retab.

### **INITIALIZATION**

Create your **API keys** on **[Linkup](https://app.linkup.so/api-keys)** and **[Retab](https://www.retab.com/)** and save them in a `.env` file.

You should have:
```
LINKUP_API_KEY=***
RETAB_API_KEY=sk_retab_***
```

### **RUN**

In [1]:
# %pip install retab
# %pip install linkup-sdk
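
Before calling either API, it can help to confirm that both keys from the `.env` file are actually visible to the process. The helper below is a minimal sketch: `missing_keys` is a hypothetical name, not part of either SDK, and it only checks the two variable names listed above.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("LINKUP_API_KEY", "RETAB_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are absent or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("Both API keys are set.")
```

Run it after `load_dotenv()` so that the values from `.env` have already been copied into the environment.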

In [2]:
# GET THE LATEST PRESS RELEASE MARKDOWN WITH LINKUP
from dotenv import load_dotenv
from linkup import LinkupClient

# Load LINKUP_API_KEY and RETAB_API_KEY from the .env file
load_dotenv()

# Named linkup_client so it is not confused with the Retab client created later
linkup_client = LinkupClient()

response = linkup_client.search(
    query="Extract all the information from this PR: https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-first-quarter-fiscal-2026",
    depth="standard",
    output_type="searchResults",
    include_images=True
)

print(response)
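
Because the request sets `include_images=True`, the result list may mix text and image entries, and an image entry would carry no Markdown body. The sketch below assumes each result exposes an optional `content` attribute; that is an assumption about the SDK's result objects, so check it against the printed `response`. `first_text_content` is a hypothetical helper, and the demo objects are stand-ins rather than real search results.

```python
from types import SimpleNamespace


def first_text_content(results):
    """Return the `content` of the first result that has a non-empty one."""
    for result in results:
        content = getattr(result, "content", None)
        if content:
            return content
    raise ValueError("no result with text content found")


# Stand-in objects for illustration only; real results come from the search call.
demo_results = [
    SimpleNamespace(type="image", name="chart", url="https://example.com/chart.png"),
    SimpleNamespace(type="text", name="press release", url="https://example.com/pr",
                    content="# Example press release"),
]
print(first_text_content(demo_results))
```

With the real response, this would be called as `first_text_content(response.results)`.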



You can use the [Retab platform](https://www.retab.com/dashboard) to quickly generate a `schema` and a `prompt` that extract the information with high accuracy.

Your configuration is identified by a unique `project_id`, which is referenced below.

You can check the [Documentation here](https://docs.retab.com/core-concepts/Projects).
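
For orientation, a schema for this extraction could look like the JSON Schema sketched below as a Python dict. It is inferred from the output printed later in this notebook, not copied from the actual project, so the field descriptions and the `has_expected_shape` helper are assumptions for illustration.

```python
# Hypothetical JSON Schema, inferred from the extraction output in this notebook.
SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "summary_type": {"type": "string", "description": "Accounting basis, e.g. GAAP"},
        "last_period": {"type": "string", "description": "Most recent fiscal period"},
        "caption": {"type": "string", "description": "Caption of the summary table"},
        "rows": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "Metrics": {"type": "string"},
                    "values": {"type": "array", "items": {"type": "string"}},
                },
                "required": ["Metrics", "values"],
            },
        },
    },
    "required": ["summary_type", "last_period", "caption", "rows"],
}


def has_expected_shape(data):
    """Lightweight structural check mirroring SUMMARY_SCHEMA (not a full validator)."""
    if not isinstance(data, dict):
        return False
    if any(key not in data for key in SUMMARY_SCHEMA["required"]):
        return False
    if not all(isinstance(data[k], str) for k in ("summary_type", "last_period", "caption")):
        return False
    rows = data["rows"]
    if not isinstance(rows, list):
        return False
    return all(
        isinstance(row, dict)
        and isinstance(row.get("Metrics"), str)
        and isinstance(row.get("values"), list)
        and all(isinstance(v, str) for v in row["values"])
        for row in rows
    )
```

Keeping every value as a string, as the output does, preserves the original formatting (`$`, `%`, thousands separators) and leaves numeric parsing to a later step.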

In [3]:
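# STRUCTURE THE INFORMATION WITH RETAB
from retab import Retab

# Named retab_client so it does not shadow the Linkup client from the previous cell
retab_client = Retab()

# Save the Markdown of the first search result so it can be passed as a document
with open("nvidia_pr_markdown.md", "w", encoding="utf-8") as f:
    f.write(response.results[0].content)

# Run the extraction with the schema and prompt configured for this project
completion = retab_client.deployments.extract(
    project_id="proj_4M3KWJsuk8ivAn0GP-cMH",
    iteration_id="base-configuration",
    document="nvidia_pr_markdown.md"
)

print(completion)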

RetabParsedChatCompletion(id='chatcmpl-Bw9IDOWNDiVdsBkmusloWZYq1UTkz', choices=[RetabParsedChoice(finish_reason='stop', index=0, logprobs=None, message=ParsedChatCompletionMessage(content='{"summary_type": "GAAP", "last_period": "Q1 FY26", "caption": "Q1 Fiscal 2026 Summary - GAAP ($ in millions, except earnings per share)", "rows": [{"Metrics": "Revenue", "values": ["$44,062"]}, {"Metrics": "Gross margin", "values": ["60.5%"]}, {"Metrics": "Operating income", "values": ["$21,638"]}, {"Metrics": "Operating expenses", "values": ["$5,030"]}, {"Metrics": "Net income", "values": ["$18,775"]}, {"Metrics": "Diluted earnings per share", "values": ["$0.76"]}]}', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None, parsed={'summary_type': 'GAAP', 'last_period': 'Q1 FY26', 'caption': 'Q1 Fiscal 2026 Summary - GAAP ($ in millions, except earnings per share)', 'rows': [{'Metrics': 'Revenue', 'values': ['$44,062']}, {'Metrics': 'Gross margin', 'values':

In [4]:
import json, textwrap

parsed_data = json.loads(completion.choices[0].message.content)
formatted_json = json.dumps(parsed_data, indent=2, ensure_ascii=False)
print(textwrap.indent(formatted_json, "  "))

  {
    "summary_type": "GAAP",
    "last_period": "Q1 FY26",
    "caption": "Q1 Fiscal 2026 Summary - GAAP ($ in millions, except earnings per share)",
    "rows": [
      {
        "Metrics": "Revenue",
        "values": [
          "$44,062"
        ]
      },
      {
        "Metrics": "Gross margin",
        "values": [
          "60.5%"
        ]
      },
      {
        "Metrics": "Operating income",
        "values": [
          "$21,638"
        ]
      },
      {
        "Metrics": "Operating expenses",
        "values": [
          "$5,030"
        ]
      },
      {
        "Metrics": "Net income",
        "values": [
          "$18,775"
        ]
      },
      {
        "Metrics": "Diluted earnings per share",
        "values": [
          "$0.76"
        ]
      }
    ]
  }
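
The extracted values are strings such as `"$44,062"` or `"60.5%"`. If numbers are needed downstream, a small parser can strip the currency and percent symbols. This is a minimal sketch: `parse_value` and `rows_to_numbers` are hypothetical helpers, and they assume every value follows one of the formats shown in the output above.

```python
def parse_value(text):
    """Convert strings like "$44,062", "60.5%" or "$0.76" to a float."""
    cleaned = text.strip().replace("$", "").replace(",", "").replace("%", "")
    return float(cleaned)


def rows_to_numbers(rows):
    """Map each metric name to its list of parsed numeric values."""
    return {row["Metrics"]: [parse_value(v) for v in row["values"]] for row in rows}


# Two rows copied from the output above.
rows = [
    {"Metrics": "Revenue", "values": ["$44,062"]},
    {"Metrics": "Gross margin", "values": ["60.5%"]},
]
print(rows_to_numbers(rows))
```

With the notebook's data, `rows_to_numbers(parsed_data["rows"])` would apply the same conversion to all six metrics. The unit (millions of dollars, percent, dollars per share) is dropped in the process, so the caption is still needed to interpret the numbers.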
