# Retrieval API DEMO

## Code Initialisation
Load the dependencies and initialise the environment. Make sure a `.env` file with your credentials exists in the same directory as this notebook; use the `env` file as a reference.

In [1]:
import os
import json
import requests as r
from IPython.display import Markdown
import utils as u
from dotenv import load_dotenv

load_dotenv()

True

## Constants

In [2]:
API_HOST = 'api.dowjones.com'
AUTH_HOST = 'accounts.dowjones.com'
CLIENT_ID = os.getenv('FACTIVA_CLIENTID')
USERNAME = os.getenv('FACTIVA_USERNAME')
PASSWORD = os.getenv('FACTIVA_PASSWORD')
AUTH_URL = f"https://{AUTH_HOST}/oauth2/v1/token"
GCP_PROJECT = os.getenv('GCP_PROJECT')
GCP_LOCATION = os.getenv('GCP_LOCATION')
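
Because `os.getenv` returns `None` for unset variables, a missing entry in `.env` only surfaces later as an authentication or Vertex AI error. The following is a minimal sketch of an early check, assuming the five variable names used in the constants cell; `REQUIRED_VARS` and `find_missing_vars` are hypothetical names introduced here and are not part of `utils.py`.

```python
import os

# Variable names taken from the constants cell above.
REQUIRED_VARS = (
    "FACTIVA_CLIENTID",
    "FACTIVA_USERNAME",
    "FACTIVA_PASSWORD",
    "GCP_PROJECT",
    "GCP_LOCATION",
)


def find_missing_vars(environ=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = find_missing_vars()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv()` points at the offending variable before any request is sent.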

## Authentication - Generate Bearer

For details about getting the `bearer_token`, please see the `utils.py` file.

In [3]:
bearer_token = u.get_bearer_token(CLIENT_ID, USERNAME, PASSWORD, AUTH_URL)
if bearer_token:
    display(Markdown(f"**Authentication Successful**: Bearer token OK for user {USERNAME.split('@')[0]}"))
else:
    display(Markdown('**Authentication Failed**: Cannot obtain the Bearer token'))

**Authentication Successful**: Bearer token OK for user 9ZZZ159100-svcaccount

In [4]:
req_headers = {
    "Authorization": f"Bearer {bearer_token}",
    "Content-Type": "application/json"
}

## Factiva Retrieval API Query

### Prompt
Set the prompt to be sent to the Retrieval API.

In [5]:
# "What are NASA's planned missions to the Moon in 2025, and what are their primary objectives?"
# "美国宇航局计划在 2025 年进行哪些月球任务？其主要目标是什么?"
# "Summarise the latest earnings report from Microsoft Corp"
# "What are the perspectives of bitcoin in the next year?"
frapi_prompt = "Summarise the latest earnings report from Microsoft Corp"

Assemble the Retrieval API payload

In [6]:
frapi_query = {
  "data": {
    "attributes": {
      "response_limit": 3,
      "query": {
        "search_filters": [
          {
            "scope": "Language",
            "value": "en"
          }
        ],

        "value": frapi_prompt
      }
    },
    "id": "GenAIRetrievalExample",
    "type": "genai-content"
  }
}
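
When trying several prompts, it can help to wrap the payload in a small function. This sketch reproduces the structure of `frapi_query` shown above; `build_retrieval_payload` is a hypothetical helper, and whether the API accepts values other than those used in this notebook (for example other language codes or limits) is not confirmed here.

```python
def build_retrieval_payload(prompt, language="en", response_limit=3,
                            request_id="GenAIRetrievalExample"):
    """Build a request body with the same shape as frapi_query above."""
    return {
        "data": {
            "attributes": {
                "response_limit": response_limit,
                "query": {
                    "search_filters": [
                        {"scope": "Language", "value": language}
                    ],
                    "value": prompt,
                },
            },
            "id": request_id,
            "type": "genai-content",
        }
    }


payload = build_retrieval_payload(
    "Summarise the latest earnings report from Microsoft Corp"
)
```

With the default arguments, `payload` has the same contents as the `frapi_query` dictionary built in the cell above.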


## Send Query and Receive Chunks from the Retrieval API

In [7]:
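# A timeout prevents the cell from hanging indefinitely if the service does not respond
chunks_resp = r.post(f"https://{API_HOST}/content/gen-ai/retrieve", json=frapi_query, headers=req_headers, timeout=60)

if chunks_resp.status_code == 200:
    print('Successfully retrieved chunks')
else:
    # Print the raw body: an error response is not guaranteed to be valid JSON
    print(f"Request Failed ({chunks_resp.status_code}): {chunks_resp.text}")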

Successfully retrieved chunks
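
Error responses are not always JSON (a gateway error may return HTML, for instance), so calling `.json()` on a failed response can itself raise an exception. Below is a minimal sketch of a safer way to describe a failure. `error_detail` is a hypothetical helper that works with any object exposing `status_code`, `text` and `json()`, such as `requests.Response`.

```python
def error_detail(response):
    """Return a printable description of a failed HTTP response."""
    try:
        body = response.json()
    except ValueError:
        # The JSON decoding errors raised by requests subclass ValueError,
        # so fall back to the raw text when the body is not JSON.
        body = response.text
    return f"HTTP {response.status_code}: {body}"
```

In the cell above, this could be used as `print(f"Request Failed: {error_detail(chunks_resp)}")`.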


## Print Chunks

In [8]:
chunks_list = chunks_resp.json()['data']

# u.print_full_chunks(chunks_list)
u.print_partial_chunks(chunks_list)

### Microsoft Corporation - Earnings Release FY25 Q2

**Public Companies News and Documents via PUBT** - 2025-01-29 - LCDVP00020250129el1t00rh3 - Lang: en

Access the original document here · supply or quality problems; · government enforcement under competition laws and new market regulation may limit ho...

---

### EXTRA: Soft Azure showing casts a cloud over strong Microsoft quarter

**Alliance News Global 500 Corporate** - 2025-01-29 - ALNEG00020250130el1t00001 - Lang: en

(Alliance News) - Microsoft Corp on Wednesday reported second quarter results which beat expectations but investors were left underwhelmed by a weaker...

---

### Microsoft earns 10% more but growth of its cloud business slows down

**CE NoticiasFinancieras** - 2025-01-29 - NFINCE0020250129el1t00dwo - Lang: en

Microsoft published on Wednesday results for the second quarter of its fiscal year with lights and shadows. The technology giant slightly exceeded mos...

---
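
The actual rendering lives in `utils.py`, which is not shown here. As a rough illustration of how output like the above can be produced, the sketch below formats one chunk using the field paths that appear later in this notebook. `format_partial_chunk` and `sample_chunk` are hypothetical, the 150-character cut-off is inferred from the output above, the choice of the `content` field for the body is an assumption, and the language field is omitted because its location in the response is not shown.

```python
def format_partial_chunk(chunk, max_chars=150):
    """Render one chunk as Markdown with a truncated body."""
    attrs = chunk["attributes"]
    meta = chunk["meta"]
    headline = str(attrs["headline"]["main"]["text"]).strip()
    source = str(meta["source"]["name"]).strip()
    doc_id = str(meta["original_doc_id"]).strip()
    body = str(attrs["content"][0]["text"]).strip()
    if len(body) > max_chars:
        body = body[:max_chars] + "..."
    return (
        f"### {headline}\n\n"
        f"**{source}** - {attrs['publication_date']} - {doc_id}\n\n"
        f"{body}\n\n---"
    )


# Hypothetical chunk used only to exercise the function.
sample_chunk = {
    "meta": {"original_doc_id": "DOC123", "source": {"name": "Example Source"}},
    "attributes": {
        "headline": {"main": {"text": "  Example headline"}},
        "publication_date": "2025-01-29",
        "content": [{"text": "Hypothetical body text used only for illustration."}],
    },
}

print(format_partial_chunk(sample_chunk, max_chars=20))
```

Stripping the headline before rendering avoids the leading whitespace that some source headlines carry.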

## Retrieval API Conclusion

Up to this point, all the functionality shown belongs to the Retrieval API.

However, because this service is an intermediate component in a full-stack solution, the Test LLM steps below and the [Read Article](2_read_article.ipynb) notebook are two complementary ways to complete a working solution.

## Test LLM

This downstream step only illustrates how the response generation stage can be implemented. In the example below, only a few articles are used to answer the prompt using an LLM hosted in **Google Cloud Vertex AI**. The request is built from the original prompt, enhanced with the retrieved articles as grounding context.

The LLM tested is Google Gemini 2.0 Flash. Response generation took a few seconds for a context of between 3K and 4K LLM tokens.

### Prerequisite

To complete the following steps, there must be an active GCloud authentication in the environment where this notebook is executed. This can be set up by running the following command:

```bash
$ gcloud auth application-default login
```

For more information see [Set up ADC for a local development environment](https://cloud.google.com/docs/authentication/set-up-adc-local-dev-environment).

### Gemini Structured Prompt

In [9]:
instructions_text = """
    You are an experienced business analyst who responds in a professional manner.
    Answer the query using only the information provided in the list of articles.
    If you use information from an article, cite it using square brackets containing the index number.
    At the end of the answer, show a list of the cited articles ordered by index and under the title Cited Articles.
    Each cited article must be displayed in the following Markdown format:
    - [index] [headline - source_name - publication_date](url)
    Use Markdown for the output.
"""
article_list = []

for chunk in chunks_list:
    article = {
        'index': len(article_list) + 1,
        'url': f"https://dj.factiva.com/article?id=drn:archive.newsarticle.{str(chunk['meta']['original_doc_id']).strip()}",
        'source_name': str(chunk['meta']['source']['name']).strip(),
        'headline': str(chunk['attributes']['headline']['main']['text']).strip(),
        'publication_date': chunk['attributes']['publication_date'],
        'content': f"{str(chunk['attributes']['snippet']['content'][0]['text']).strip()} {str(chunk['attributes']['content'][0]['text']).strip()}"
    }
    article_list.append(article)

llm_prompt = {
    'query': frapi_prompt,
    'articles': article_list,
    "instructions": instructions_text.strip()
}

In [14]:
# u.print_full_llm_prompt(llm_prompt)
u.print_partial_llm_prompt(llm_prompt)

{
    "query": "Summarise the latest earnings report from Microsoft Corp",
    "articles": [
        {
            "index": 1,
            "url": "https://dj.factiva.com/article?id=drn:archive.newsarticle.LCDVP00020250129el1t00rh3",
            "source_name": "Public Companies News and Documents via PUBT",
            "headline": "Microsoft Corporation - Earnings Release FY25 Q2",
            "publication_date": "2025-01-29",
            "content": "Access the original document here \u00b7 supply or quality problems; \u00b7 government enforcement under competition laws and new market regulation may limit ho..."
        },
        {
            "index": 2,
            "url": "https://dj.factiva.com/article?id=drn:archive.newsarticle.ALNEG00020250130el1t00001",
            "source_name": "Alliance News Global 500 Corporate",
            "headline": "EXTRA: Soft Azure showing casts a cloud over strong Microsoft quarter",
            "publication_date": "2025-01-29",
            "content":

### Gemini Request

In [16]:
response = u.gemini_generate(llm_prompt, GCP_PROJECT, GCP_LOCATION)
display(Markdown(response))

Microsoft Corp's second quarter results, for the period ending December 31, 2024, showed a 10% increase in net income, reaching $24.10 billion, compared to $21.87 billion a year prior [1, 2, 3]. Diluted earnings per share climbed 10% to $3.23, beating the FactSet consensus of $3.11 [2, 3]. Revenue jumped 12% to $69.63 billion, also beating the FactSet consensus of $68.87 billion [2, 3].

Key highlights from the earnings report include:

*   **Cloud Performance:** Microsoft Cloud revenue was $40.9 billion, up 21% year-over-year [2]. Intelligent Cloud revenue reached $25.5 billion, a 19% increase [2, 3]. However, Azure's growth slowed to 31%, below the 32% forecast by analysts [2, 3]. Azure's growth included 13 points from AI Services, which grew 157% year-over-year [2].
*   **Productivity and Business Processes:** Revenue increased 14% to $29.4 billion, including a 15% growth in Microsoft 365 Commercial products and cloud services revenue. LinkedIn revenue increased 9% [2, 3].
*   **More Personal Computing:** Revenue remained relatively unchanged at $14.7 billion [2, 3].
*   **AI Growth:** Microsoft's AI business has surpassed an annual revenue run rate of $13 billion, up 175% year-over-year [2, 3].
*   **Future Outlook:** For the third quarter, Microsoft forecasts Azure revenue to grow between 31% and 32% in constant-currency [2]. They expect revenue in productivity and business processes to grow between 11% and 12% in constant-currency to between $29.4 billion and $29.7 billion. For Intelligent Cloud, they expect revenue to grow between 19% and 20% in constant-currency to between $25.9 billion and $26.2 billion. In More Personal Computing, they expect revenue to be $12.4 billion to $12.8 billion [2]. Microsoft continues to expect double-digit revenue and operating income growth for the full year [2].

Despite exceeding overall expectations, the slower-than-anticipated growth in Azure led to investor concerns [2, 3].

Cited Articles:

*   [1] [Microsoft Corporation - Earnings Release FY25 Q2 - Public Companies News and Documents via PUBT - 2025-01-29](https://dj.factiva.com/article?id=drn:archive.newsarticle.LCDVP00020250129el1t00rh3)
*   [2] [EXTRA: Soft Azure showing casts a cloud over strong Microsoft quarter - Alliance News Global 500 Corporate - 2025-01-29](https://dj.factiva.com/article?id=drn:archive.newsarticle.ALNEG00020250130el1t00001)
*   [3] [Microsoft earns 10% more but growth of its cloud business slows down - CE NoticiasFinancieras - 2025-01-29](https://dj.factiva.com/article?id=drn:archive.newsarticle.NFINCE0020250129el1t00dwo)
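
Since the instructions ask the model to cite articles by index, a simple sanity check is to confirm that every cited index corresponds to an article that was actually supplied. The sketch below assumes citations appear as `[1]` or `[1, 2, 3]`, as in the response above; `cited_indices` and `unknown_citations` are hypothetical helpers, and the check says nothing about whether a cited article really supports the claim.

```python
import re


def cited_indices(answer_text):
    """Collect the article indices cited as [1] or [1, 2, 3] in the answer."""
    indices = set()
    # Match brackets that contain only comma-separated integers.
    for group in re.findall(r"\[(\d+(?:\s*,\s*\d+)*)\]", answer_text):
        indices.update(int(n) for n in group.split(","))
    return indices


def unknown_citations(answer_text, article_list):
    """Return cited indices that do not match any supplied article."""
    known = {article["index"] for article in article_list}
    return sorted(cited_indices(answer_text) - known)
```

Calling `unknown_citations(response, article_list)` after the Gemini request returns an empty list when all citations point to supplied articles.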
