Testing connection and API keys

In [1]:
from dotenv import load_dotenv
load_dotenv()

import os

import sys
sys.path.append("..")

from openai import AsyncOpenAI
from anthropic import AsyncAnthropic
from exa_py import AsyncExa
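
Before calling any of the services, it can help to confirm that the expected keys were actually loaded by `load_dotenv()`. The sketch below is one minimal way to do that. It assumes the OpenAI and Anthropic clients read `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` from the environment (their default when no key is passed), and that `EXA_API_KEY`, `JINA_API_KEY`, and `EMAIL_TO` are the names read in later cells. `missing_env_vars` is a hypothetical helper, not part of any library.

```python
import os

# Names this notebook relies on. The first two are the defaults read by the
# OpenAI and Anthropic SDKs; the others are read explicitly in later cells.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "EXA_API_KEY",
    "JINA_API_KEY",
    "EMAIL_TO",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All expected environment variables are set.")
```

Passing a plain dict as `env` makes the helper easy to check without touching the real environment.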

Test OpenAI

In [7]:
client = AsyncOpenAI()

response = await client.responses.create(
  model="gpt-4.1",
  input="Tell me a three sentence bedtime story about a unicorn."
)

print(response.output_text)

In a shimmering forest, a gentle unicorn named Luna sprinkled stardust wherever she pranced, turning every flower silver. One night, she discovered a lost firefly and guided it home with the glowing light of her magical horn. As they reached the firefly’s family, Luna was gifted a twinkle that made her dreams sparkle brighter than ever before.


In [8]:
response.usage

ResponseUsage(input_tokens=18, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=72, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=90)

Test Anthropic

In [9]:
client = AsyncAnthropic()

message = await client.messages.create(
    max_tokens=1024,
    messages=[{
        "content": "Tell me a three sentence bedtime story about a unicorn.",
        "role": "user",
    }],
    model="claude-sonnet-4-5-20250929",
)

print(message.content[0].text)

Once upon a time, a gentle unicorn with a shimmering silver mane wandered through a moonlit forest, touching flowers with her glowing horn to help them bloom in the darkness. As she reached a peaceful meadow, she lay down in the soft grass and hummed a magical lullaby that made the stars twinkle brighter. Soon all the creatures of the forest drifted off to sleep, wrapped in sweet dreams carried on the unicorn's song.


In [10]:
message.usage

Usage(cache_creation=CacheCreation(ephemeral_1h_input_tokens=0, ephemeral_5m_input_tokens=0), cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=20, output_tokens=102, server_tool_use=None, service_tier='standard')
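
The two usage objects printed above have different shapes: the OpenAI one carries a `total_tokens` field, while the Anthropic one shows only `input_tokens` and `output_tokens`. A small sketch for comparing them follows. `total_tokens` here is a hypothetical helper, and the stand-in objects are built from the printed values, so no API call is made.

```python
from types import SimpleNamespace


def total_tokens(usage):
    """Return input + output tokens for either usage object."""
    # Prefer the provider-reported total when present (OpenAI); otherwise
    # add the two counts (the Anthropic output above shows no total field).
    reported = getattr(usage, "total_tokens", None)
    if reported is not None:
        return reported
    return usage.input_tokens + usage.output_tokens


# Stand-ins built from the values printed above.
openai_usage = SimpleNamespace(input_tokens=18, output_tokens=72, total_tokens=90)
anthropic_usage = SimpleNamespace(input_tokens=20, output_tokens=102)

print(total_tokens(openai_usage))     # 90
print(total_tokens(anthropic_usage))  # 122
```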

Test EXA

In [2]:
exa = AsyncExa(api_key=os.environ.get("EXA_API_KEY"))

In [None]:
result = await exa.search(
  "hottest AI startups",
  num_results=2
)

In [39]:
result.results[0].title

'Hottest AI Startups in the USA - LinkedIn'

In [40]:
result.results[0].url

'https://www.linkedin.com/pulse/hottest-ai-startups-usa-aitechsupports-ouwfc'

In [37]:
result.results[0].published_date

'2025-12-04T12:04:50.000Z'
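
The date comes back as a string. If it needs to be sorted or filtered, it can be parsed into a timezone-aware datetime. This is a sketch that assumes every value follows the format of the single example above; other results may carry a different format or no date at all, and `parse_published_date` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_published_date(value):
    """Parse a timestamp like '2025-12-04T12:04:50.000Z' into an aware UTC datetime."""
    # The trailing 'Z' is matched literally, so the timezone is attached afterwards.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


published = parse_published_date("2025-12-04T12:04:50.000Z")
print(published.isoformat())
```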

In [43]:
print(result.results[0].text[:500])

Hottest AI Startups in the USA
Agree & Join LinkedIn
By clicking Continue to join or sign in, you agree to LinkedIn’s[User Agreement](https://www.linkedin.com/legal/user-agreement?trk=linkedin-tc_auth-button_user-agreement),[Privacy Policy](https://www.linkedin.com/legal/privacy-policy?trk=linkedin-tc_auth-button_privacy-policy), and[Cookie Policy](https://www.linkedin.com/legal/cookie-policy?trk=linkedin-tc_auth-button_cookie-policy).
``````````````
![]()## Sign in to view more content
Create y


In [48]:
answer_with_text = await exa.answer(
  query = "What are the hottest AI startups?",
  model = "exa-pro",
  text = True
)

In [51]:
answer_with_text.answer

'As of December 22, 2025, some of the hottest AI startups include Thinking Machines Lab, valued at $12 billion, and Glean, valued at $7.2 billion ([Reuters](https://www.reuters.com/technology/mira-muratis-ai-startup-thinking-machines-raises-2-billion-a16z-led-round-2025-07-15), [Reuters](https://www.reuters.com/technology/ai-company-glean-hits-72-billion-valuation-latest-funding-round-2025-06-10)). Other notable companies mentioned as "hottest" or "most promising" in 2025 include AI Squared, Morphos AI, and Writer ([CRN](https://www.crn.com/news/ai/2025/the-10-hottest-ai-startup-companies-of-2025-so-far)).'
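
The answer string embeds its sources as inline markdown links. One way to pull them out for logging or de-duplication is a regular expression, sketched below on a shortened excerpt of the answer above. `extract_links` is a hypothetical helper, and the pattern assumes URLs contain no closing parenthesis.

```python
import re

# Matches inline markdown links of the form [label](url).
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def extract_links(answer):
    """Return (label, url) pairs for each markdown link in the answer text."""
    return LINK_PATTERN.findall(answer)


# Shortened excerpt of the answer printed above.
sample = (
    "Glean, valued at $7.2 billion "
    "([Reuters](https://www.reuters.com/technology/"
    "ai-company-glean-hits-72-billion-valuation-latest-funding-round-2025-06-10))."
)
for label, url in extract_links(sample):
    print(label, url)
```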

EXA search_and_contents is deprecated; use search with the contents argument instead

In [19]:
query = "agentic memory for financial, private equity, due diligence use cases"

result = await exa.search(
    query=query,
    type="deep",
    moderation=False,
    num_results=100,
    contents={
        "text": {"maxCharacters": 100_000},
        "summary": True,
        #"highlights": True
    },
)

In [20]:
result.__dict__.keys()

dict_keys(['results', 'resolved_search_type', 'auto_date', 'context', 'statuses', 'cost_dollars'])

In [21]:
result.results[0].__dict__.keys()

dict_keys(['url', 'id', 'title', 'score', 'published_date', 'author', 'image', 'favicon', 'subpages', 'extras', 'text', 'summary', 'highlights', 'highlight_scores'])

In [17]:
result.cost_dollars

CostDollars(total=0.085, search={'neural': 0.025}, contents={'text': 0.03, 'summary': 0.03})
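
In this output the total appears to be the sum of the per-feature charges: 0.025 + 0.03 + 0.03 = 0.085. The sketch below checks that arithmetic on values copied from the output; the dict layout is a stand-in for the `CostDollars` object, and `component_sum` is a hypothetical helper.

```python
import math

# Values copied from the CostDollars output above.
cost = {
    "total": 0.085,
    "search": {"neural": 0.025},
    "contents": {"text": 0.03, "summary": 0.03},
}


def component_sum(cost):
    """Add up every per-feature charge under 'search' and 'contents'."""
    return sum(cost["search"].values()) + sum(cost["contents"].values())


components = component_sum(cost)
print(round(components, 6))
# Compare with a tolerance because the values are floats.
print(math.isclose(components, cost["total"]))
```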

In [18]:
result



Test the EXA search helper that saves content to the data path

In [5]:
from components.agents.research_agents import search_with_exa
from data.content_manager import ContentManager
# Named content_saver because the cells below pass and query it under that name.
content_saver = ContentManager(base_path="../data/content")

INFO:data.content_saver:Created folder: ../data/content/2025-12-25


In [6]:
results = await search_with_exa(
    focus_key="agentic memory",
    focus_description="for financial, private equity, due diligence use cases",
    num_results=5,
    content_saver=content_saver
)

INFO:components.agents.research_agents:Running Exa research agent for focus area: agentic memory
INFO:httpx:HTTP Request: POST https://api.exa.ai/search "HTTP/1.1 200 OK"
INFO:data.content_saver:Saved markdown: ../data/content/2025-12-25/197413759250c467/content.md
INFO:data.content_saver:Saved metadata: ../data/content/2025-12-25/197413759250c467/metadata.json
INFO:components.agents.research_agents:Saved content for https://pathway.com/blog/ai-agents-finance-due-diligence
INFO:data.content_saver:Saved markdown: ../data/content/2025-12-25/0958fa00e1d8f3a0/content.md
INFO:data.content_saver:Saved metadata: ../data/content/2025-12-25/0958fa00e1d8f3a0/metadata.json
INFO:components.agents.research_agents:Saved content for https://everworker.ai/blog/agentic-ai-use-cases-financial-services-companies
INFO:data.content_saver:Saved markdown: ../data/content/2025-12-25/a69103629229dc80/content.md
INFO:data.content_saver:Saved metadata: ../data/content/2025-12-25/a69103629229dc80/metadata.json
IN

In [8]:
content_saver.lookup_url("https://beam.ai/use-cases/private-equity")

PosixPath('../data/content/2025-12-25/5942a2e9e46b7b64')
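
The log lines above show that each saved page gets a folder containing `content.md` and `metadata.json`. Given the folder returned by `lookup_url`, the files could be read back as sketched below. `load_saved_content` is a hypothetical helper, and the `url` key in the demo metadata is an assumption, since the real metadata fields are not shown.

```python
import json
import tempfile
from pathlib import Path


def load_saved_content(folder):
    """Read content.md and metadata.json from a saved-content folder."""
    folder = Path(folder)
    markdown = (folder / "content.md").read_text(encoding="utf-8")
    metadata = json.loads((folder / "metadata.json").read_text(encoding="utf-8"))
    return markdown, metadata


# Demo against a throwaway folder that mimics the layout in the log lines.
with tempfile.TemporaryDirectory() as tmp:
    demo_folder = Path(tmp)
    (demo_folder / "content.md").write_text("# Example\n", encoding="utf-8")
    (demo_folder / "metadata.json").write_text(
        json.dumps({"url": "https://example.com"}), encoding="utf-8"
    )
    markdown, metadata = load_saved_content(demo_folder)

print(markdown.strip(), metadata["url"])
```

In the notebook, the folder argument would be the path returned by `content_saver.lookup_url(...)`.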

EXA WebSets

https://docs.exa.ai/websets/api/get-started

In [60]:
from exa_py.websets.types import CreateWebsetParameters, CreateEnrichmentParameters
from exa_py import Exa

exa = Exa(os.getenv('EXA_API_KEY'))

In [77]:
preview = exa.websets.preview(params={
    'query': 'Latest AI research papers that are applicable to private equity'
})

In [78]:
preview.search.entity

WebsetResearchPaperEntity(type='research_paper')

In [79]:
for i, criterion in enumerate(preview.search.criteria, start=1):
    print(f"{i}: {criterion.description}")
    print("===" * 30)

1: Research paper focused on artificial intelligence (AI) applications or relevance to private equity
2: Published during 2024 or 2025


arXiv

In [None]:
from components.tools.arxiv import search_arxiv, print_results, get_arxiv_categories

In [4]:
arxiv_categories = get_arxiv_categories()
print(arxiv_categories)

['cs.AI', 'cs.CE', 'cs.CL', 'cs.DB', 'cs.DC', 'cs.GL', 'cs.GT', 'cs.HC', 'cs.IR', 'cs.LG', 'cs.MA', 'cs.NE', 'econ.GN', 'math.ST', 'q-fin.CP', 'q-fin.GN', 'q-fin.MF', 'q-fin.PM', 'q-fin.RM', 'q-fin.ST', 'q-fin.TR', 'stat.ML']


In [5]:
# =============================================================================
# Example: Search across multiple categories (AI, NLP, Finance)
# =============================================================================
feed = search_arxiv(
    categories=arxiv_categories,
    max_results=5,
    sort_by="submittedDate",
    sort_order="descending"
)
print_results(feed)

Query URL: http://export.arxiv.org/api/query?search_query=%28cat%3Acs.AI+OR+cat%3Acs.CE+OR+cat%3Acs.CL+OR+cat%3Acs.DB+OR+cat%3Acs.DC+OR+cat%3Acs.GL+OR+cat%3Acs.GT+OR+cat%3Acs.HC+OR+cat%3Acs.IR+OR+cat%3Acs.LG+OR+cat%3Acs.MA+OR+cat%3Acs.NE+OR+cat%3Aecon.GN+OR+cat%3Amath.ST+OR+cat%3Aq-fin.CP+OR+cat%3Aq-fin.GN+OR+cat%3Aq-fin.MF+OR+cat%3Aq-fin.PM+OR+cat%3Aq-fin.RM+OR+cat%3Aq-fin.ST+OR+cat%3Aq-fin.TR+OR+cat%3Astat.ML%29&start=0&max_results=5&sortBy=submittedDate&sortOrder=descending

Total results: 516269
Showing: 5 entries


[1] LongVideoAgent: Multi-Agent Reasoning with Long Videos
    ID: http://arxiv.org/abs/2512.20618v1
    Published: 2025-12-23T18:59:49Z
    Authors: Runtao Liu, Ziyi Liu, Jiaqi Tang + 4 more
    Categories: cs.AI, cs.CV, cs.LG, cs.MA
    Summary: Recent advances in multimodal LLMs and systems that use tools for long-video QA point to the promise of reasoning over hour-long episodes. However, many methods still compress content into lossy summa...
----------------------
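
The query URL printed above can be reproduced with the standard library, which is useful for checking what a category search actually sends. This is a sketch reconstructed from the printed URL, not the implementation of `search_arxiv`; `build_category_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_category_query_url(categories, max_results=5,
                             sort_by="submittedDate", sort_order="descending"):
    """Build an arXiv API URL that ORs together the given categories."""
    search_query = "(" + " OR ".join(f"cat:{c}" for c in categories) + ")"
    params = {
        "search_query": search_query,
        "start": 0,
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": sort_order,
    }
    # urlencode percent-escapes the parentheses and colons and turns spaces into '+'.
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_category_query_url(["cs.AI", "q-fin.PM"]))
```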

In [8]:
# =============================================================================
# Example: Keyword search
# =============================================================================
feed = search_arxiv(
    search_query='"private equity" AND "agent"',  # alternative: 'LLM OR "large language model"'
    # categories=arxiv_categories,
    max_results=5,
    sort_by="submittedDate",
    sort_order="descending"
)
print_results(feed)

Query URL: http://export.arxiv.org/api/query?search_query=all%3A%22private+equity%22+AND+%22agent%22&start=0&max_results=5&sortBy=submittedDate&sortOrder=descending

Total results: 2
Showing: 2 entries


[1] Finance Agent Benchmark: Benchmarking LLMs on Real-world Financial Research Tasks
    ID: http://arxiv.org/abs/2508.00828v1
    Published: 2025-05-20T18:22:10Z
    Authors: Antoine Bigeard, Langston Nashold, Rayan Krishnan + 1 more
    Categories: cs.CE
    Summary: Artificial Intelligence (AI) technology has emerged as a transformative force in financial analysis and the finance industry, though significant questions remain about the full capabilities of Large L...
--------------------------------------------------------------------------------

[2] Solving the Equity Risk Premium Puzzle and Inching Towards a Theory of Everything
    ID: http://arxiv.org/abs/1604.04872v7
    Published: 2016-04-17T14:07:15Z
    Authors: Ravi Kashyap
    Categories: q-fin.GN, econ.GN
    Summary: Th

Test sending an email

In [1]:
from dotenv import load_dotenv
load_dotenv()

import os

import sys
sys.path.append("..")

In [2]:
from components.email.gmail_sender import GmailSender
sender = GmailSender(
    credentials_path="../gmail_client_secret.json",
    token_path="../gmail_token.json"
)

✓ Gmail API authenticated successfully


In [None]:
to = os.getenv("EMAIL_TO")
subject = "Test email from API"
markdown_content = """
# Research Digest

## This Week's Highlights

**Important findings:**

- First item with *emphasis*
- Second item with **strong emphasis**
- Third item

### Key Takeaways

1. Numbered list item one
2. Numbered list item two

[Click here for more details](https://example.com)

---

*Sent via Research Digest*
"""
sender.send_email(to, subject, markdown_content, markdown_mode=True)
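
The internals of `GmailSender` are not shown here. As a rough illustration of what a send call has to produce, the Gmail API's `users.messages.send` endpoint takes a body with a `raw` field holding the base64url-encoded RFC 2822 message. The sketch below builds that payload with the standard library only. `build_gmail_payload` is a hypothetical helper, the markdown-to-HTML step is left out (it would need a separate renderer), and the HTML string is supplied by hand.

```python
import base64
from email import message_from_bytes
from email.message import EmailMessage


def build_gmail_payload(to, subject, text_body, html_body=None):
    """Build the {'raw': ...} body expected by the Gmail API's messages.send."""
    message = EmailMessage()
    message["To"] = to
    message["Subject"] = subject
    message.set_content(text_body)
    if html_body is not None:
        # Adds an HTML part next to the plain-text one (multipart/alternative).
        message.add_alternative(html_body, subtype="html")
    raw = base64.urlsafe_b64encode(message.as_bytes()).decode("ascii")
    return {"raw": raw}


payload = build_gmail_payload(
    "someone@example.com",
    "Test email from API",
    "# Research Digest",
    "<h1>Research Digest</h1>",
)

# Round-trip the payload to confirm it decodes to the message that was built.
decoded = message_from_bytes(base64.urlsafe_b64decode(payload["raw"]))
print(decoded["Subject"], decoded.get_content_type())
```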

Test JINA reader

In [13]:
import httpx

url = "https://www.imdb.com/title/tt28083456/"

async with httpx.AsyncClient(timeout=60) as client:
    response = await client.post(
        "https://r.jina.ai/",
        headers={
            "Authorization": f"Bearer {os.getenv('JINA_API_KEY')}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        json={"url": url},
    )

response.raise_for_status()
data = response.json()

In [16]:
data['meta']

{'usage': {'tokens': 17381}}
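
The token count sits two levels down in the response. When logging usage across many pages, a defensive accessor avoids a `KeyError` if a response lacks the field. This is a sketch based only on the `meta` structure printed above; `tokens_used` is a hypothetical helper.

```python
def tokens_used(payload):
    """Return the token count under meta.usage.tokens, or None if absent."""
    node = payload
    for key in ("meta", "usage", "tokens"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Same structure as the output above.
sample = {"meta": {"usage": {"tokens": 17381}}}
print(tokens_used(sample))  # 17381
```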