# How to get MRN News Analytics via WebSocket API with Python - RTO Connection

**Last Update**: Nov 2025

## Prerequisite

This article/notebook supports consuming all [Machine Readable News Analytics (TRNA)](https://www.lseg.com/en/data-analytics/financial-news-service/machine-readable-news) data from Real-Time with the WebSocket API. However, the data model description focuses on News Analytics (TRNA) data processing only.

I highly recommend you check the [WebSocket API Tutorials](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api/tutorials) page if you are not familiar with the WebSocket API. The Tutorials page provides a step-by-step guide (connect, log in, request data, parse data, etc.) for developers who are interested in developing a WebSocket application to consume real-time data from Real-Time.

If you are focusing on the Real-Time News, please check the following GitHub repositories:
- [LSEG-API-Samples/Example.WebSocketAPI.Python.MRN](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.Python.MRN)
- [LSEG-API-Samples/Example.WebSocketAPI.Python.MRN.RTO](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.Python.MRN.RTO)

The notebook file provides a step-by-step guide (connect, log in, request data, parse data, etc.) for developers who are interested in developing a WebSocket application to consume real-time data from Real-Time - Optimized (RTO) using the Version 1 Authentication (Machine-ID). To migrate a Version 1 Authentication application to the Version 2 Authentication (Service ID), see the [Real-Time WebSocket API: The Real-Time Optimized Version 2 Authentication Migration Guide](https://developers.lseg.com/en/article-catalog/article/webSocket-api-rto-v2-authentication-migration-guide) and [Migrating the WebSocket Machine Readable News Application to Version 2 Authentication](https://developers.lseg.com/en/article-catalog/article/migrating-the-websocket-machine-readable-news-to-rto-v2) articles.

If you are using the deployed Real-Time Distribution System (RTDS), please check the [mrn_trna_notebook_app.ipynb](./mrn_trna_notebook_app.ipynb) notebook file.

Please contact your LSEG representative to help you access the RTO account and services. You can find more detail on setting up the RTO access credentials in the *Getting Started for Machine ID* section of the [Getting Start with Data Platform](https://developers.lseg.com/en/article-catalog/article/getting-start-with-refinitiv-data-platform) article.

## News Analytics and Machine Readable News Overview

[Machine Readable News Analytics (TRNA)](https://www.lseg.com/en/data-analytics/financial-news-service/machine-readable-news) provides real-time numerical insight into the events on multiple news sources, in a format that can be directly consumed by algorithmic trading systems. TRNA enables algorithms to exploit the power of news to seize opportunities, capitalize on market inefficiencies, and manage event risk.

TRNA is published via the Real-Time Platform as part of the Machine Readable News (MRN) data model. MRN is an advanced service for automating the consumption and systematic analysis of news. It delivers deep historical news archives, ultra-low latency structured news, and news analytics directly to your applications.

### MRN Data behavior

MRN is published over the Real-Time platform using an Open Message Model (OMM) envelope in News Text Analytics domain messages. The Real-time News content set is made available over the MRN_STORY RIC. The content data is contained in a FRAGMENT field that has been compressed and potentially fragmented across multiple messages to reduce bandwidth and message size.

A FRAGMENT field has a different data type depending on the connection type:

- RSSL connection (RTSDK [C++](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/rt-sdk-cc)/[C#](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/rt-sdk-csharp)/[Java](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/rt-sdk-java)): BUFFER type
- WebSocket connection: Base64 ASCII string

The data goes through the following series of transformations:

1. The core content data is a UTF-8 JSON string
2. This JSON string is compressed using gzip
3. The compressed JSON is split into several fragments (BUFFER or Base64 ASCII string) which each fit into a single update message
4. The data fragments are added to an update message as the FRAGMENT field value in a FieldList envelope


<img src="images/trna_process.png"/>

Therefore, to parse the core content data, the application needs to reverse this process. The WebSocket application also needs to convert the Base64 string received in a FRAGMENT field to bytes before processing the field further. This application uses the Python [base64](https://docs.python.org/3/library/base64.html) and [zlib](https://docs.python.org/3/library/zlib.html) modules to decode the Base64 string and decompress the JSON string.
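
As a rough illustration of that reversal, the sketch below simulates the four transformation steps on a made-up payload and then undoes them. The helper names `make_fragments` and `rebuild_payload`, the sample payload, and the chunk size are assumptions for this example only; they are not part of the MRN feed or of the notebook code.

```python
import base64
import gzip
import json
import zlib


def make_fragments(payload, chunk_size):
    """Simulate the publisher side: JSON -> gzip -> split -> Base64 strings."""
    raw = json.dumps(payload).encode("utf-8")
    compressed = gzip.compress(raw)
    chunks = [compressed[i:i + chunk_size]
              for i in range(0, len(compressed), chunk_size)]
    return [base64.b64encode(chunk).decode("ascii") for chunk in chunks]


def rebuild_payload(fragments):
    """Reverse the process: Base64 -> bytes -> concatenate -> gunzip -> JSON."""
    compressed = b"".join(base64.b64decode(fragment) for fragment in fragments)
    # MAX_WBITS | 32 lets zlib auto-detect the gzip header
    return json.loads(zlib.decompress(compressed, zlib.MAX_WBITS | 32))


# Hypothetical payload, small chunk size so that several fragments are produced
sample = {"headline": "Example headline", "body": "x" * 200}
fragments = make_fragments(sample, chunk_size=20)
restored = rebuild_payload(fragments)
print(len(fragments), restored == sample)
```

Note that the fragments are joined as bytes first and decompressed once, which mirrors the order used later in `processMRNUpdate`.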

### MRN Data model

Five fields, as well as the RIC itself, are necessary to determine whether the entire item has been received in its various fragments and how to concatenate the fragments to reconstruct the item:
* MRN_SRC: identifier of the scoring/processing system that published the FRAGMENT
* GUID: a globally unique identifier for the data item. All messages for this data item will have the same GUID values.
* FRAGMENT: the compressed data item fragment itself
* TOT_SIZE: total size in bytes of the fragmented data
* FRAG_NUM: sequence number of fragments within a data item. This is set to 1 for the first fragment of each item published and is incremented for each subsequent fragment for the same item.

A single MRN data item publication is uniquely identified by the combination of RIC, MRN_SRC, and GUID.

#### Fragmentation
For a given RIC-MRN_SRC-GUID combination, when a data item requires only a single message, then TOT_SIZE will equal the number of bytes in the FRAGMENT, and FRAG_NUM will be 1.

When multiple messages are required, the data item can be deemed fully received once the sum of the number of bytes of each FRAGMENT equals TOT_SIZE. The consumer will also observe that the FRAG_NUM values range from 1 to the number of fragments, with no intermediate integers skipped. In other words, a data item transmitted over three messages will contain FRAG_NUM values of 1, 2, and 3.
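
The two completeness conditions above can be expressed as a small check. This is a minimal sketch: the function name `is_item_complete` and the `(frag_num, fragment_bytes)` tuple layout are assumptions made for the example, not something the feed or the notebook defines.

```python
def is_item_complete(fragments, tot_size):
    """Check one data item.

    fragments: list of (frag_num, fragment_bytes) tuples received so far
    tot_size:  TOT_SIZE value taken from the first fragment
    """
    frag_nums = sorted(num for num, _ in fragments)
    # FRAG_NUM must run 1, 2, ..., n with nothing skipped
    contiguous = frag_nums == list(range(1, len(frag_nums) + 1))
    # The byte counts of all fragments must add up to TOT_SIZE
    received = sum(len(data) for _, data in fragments)
    return contiguous and received == tot_size


print(is_item_complete([(1, b"abc"), (2, b"de")], 5))   # all bytes, no gap
print(is_item_complete([(1, b"abc")], 5))               # still waiting
print(is_item_complete([(1, b"abc"), (3, b"de")], 5))   # FRAG_NUM 2 is missing
```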

#### Compression
The FRAGMENT field is compressed with gzip compression, thus requiring the consumer to decompress to reveal the JSON plain-text data in that FID.

When an MRN data item is sent in multiple messages, all the messages must be received and their FRAGMENTs concatenated before being decompressed. In other words, the FRAGMENTs should not be decompressed independently of each other.

The decompressed output is encoded in UTF-8 and formatted as JSON.
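
To see why fragments must not be decompressed independently, the sketch below compresses a made-up JSON string, cuts the gzip stream in half, and tries to decompress the first half alone. The payload and the split point are assumptions for illustration only.

```python
import gzip
import zlib

original = ('{"text": "' + "news " * 100 + '"}').encode("utf-8")
compressed = gzip.compress(original)
first_half = compressed[:len(compressed) // 2]

# A lone fragment is a truncated gzip stream, so zlib rejects it
try:
    zlib.decompress(first_half, zlib.MAX_WBITS | 32)
    partial_failed = False
except zlib.error as error:
    partial_failed = True
    print("partial decompress failed:", error)

# The concatenated stream decompresses back to the original bytes
full = zlib.decompress(first_half + compressed[len(compressed) // 2:],
                       zlib.MAX_WBITS | 32)
print(full == original)
```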

If you are not familiar with the MRN concept, please visit the following resources, which give a full explanation of the MRN data model and implementation logic:

* [Webinar Recording: Introduction to Machine Readable News](https://developers.lseg.com/news#news-accordion-nid-12045)
* [Introduction to Machine Readable News (MRN) with Enterprise Message API (EMA)](https://developers.lseg.com/en/article-catalog/article/introduction-machine-readable-news-mrn-elektron-message-api-ema).
* [MRN Data Models and Refinitiv Real-Time SDK Implementation Guide](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/rt-sdk-java/documentation#mrn-data-models-and-elektron-implementation-guide).
* [Introduction to Machine Readable News with WebSocket API](https://developers.lseg.com/en/article-catalog/article/introduction-machine-readable-news-elektron-websocket-api-refinitiv).
* [How to get MRN News Analytics Data via WebSocket API](https://developers.lseg.com/en/article-catalog/article/how-to-get-mrn-news-analytics-data-via-elektron-websocket-api).

In [1]:
# Uncomment if you do not have requests, websocket-client (version 1.2.1), and python-dotenv installed
# Install requests, websocket-client, and python-dotenv packages in the current Jupyter kernel

import sys

# !{sys.executable} -m pip install requests
# !{sys.executable} -m pip install websocket-client
# !{sys.executable} -m pip install python-dotenv

In [2]:
import time
import getopt
import socket
import json
import websocket
import threading
from threading import Thread, Event
import base64
import binascii  # needed for the binascii.Error handler in processMRNUpdate
import zlib
import requests
import os
from dotenv import load_dotenv
import datetime

%load_ext dotenv

# Use find_dotenv to locate the file
%dotenv

Save the following configurations in a text file named `.env`, or set them as OS environment variables:

```
# RTO Credentials
RTO_USERNAME=Machine-ID
RTO_PASSWORD=RTO-Password
RTO_CLIENTID=App-Key

# RDP-RTO Core Configurations
RDP_BASE_URL=https://api.refinitiv.com
RDP_AUTH_URL=/auth/oauth2/v1/token
RDP_DISCOVERY_URL=/streaming/pricing/v1/
```
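
Before running the rest of the notebook, it can help to confirm that none of these settings is missing. The following is a minimal sketch under the assumption that the variable names match the listing above; the helper name `missing_settings` is hypothetical.

```python
import os

# Names taken from the .env listing above
REQUIRED_VARS = [
    "RTO_USERNAME",
    "RTO_PASSWORD",
    "RTO_CLIENTID",
    "RDP_BASE_URL",
    "RDP_AUTH_URL",
    "RDP_DISCOVERY_URL",
]


def missing_settings(env=os.environ):
    """Return the names of required settings that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing configuration:", ", ".join(missing))
else:
    print("All required settings are present")
```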

### <a id="whatis_rdp"></a>What are the Delivery Platform (RDP) APIs?

The [Delivery Platform APIs](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-platform-apis) (aka Data Platform or RDP) provide various LSEG data and content for developers via an easy-to-use Web-based API.

RDP APIs give developers seamless and holistic access to all LSEG content such as Historical Pricing, Environmental Social and Governance (ESG), News, Research, etc. Developers can commingle this content with their own, enriching, integrating, and distributing the data through a single interface, delivered wherever they need it.

RTO uses the RDP APIs' authentication and service discovery services. For more detail regarding the Delivery Platform, please see the following API resources:
- [Quick Start](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-platform-apis/quick-start) page.
- [Tutorials](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-platform-apis/tutorials) page.
- [RDP APIs: Introduction to the Request-Response API](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-platform-apis/tutorials#introduction-to-the-request-response-api) page.
- [RDP APIs: Authorization - All about tokens](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-platform-apis/tutorials#authorization-all-about-tokens) page.

### <a id="whatis_rto"></a>What is the Real-Time - Optimized?

As part of the Delivery Platform, [the Real-Time - Optimized (RTO)](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api/quick-start#connecting-to-refinitiv-real-time-optimized) (formerly known as ERT in Cloud) gives you access to best in class Real-Time market data delivered in the cloud.  Real-Time - Optimized is a new delivery mechanism for RDP, using the AWS (Amazon Web Services) cloud. Once a connection to RDP is established using Real-Time - Optimized, data can be retrieved using [Websocket API for Pricing Streaming and Real-Time Services](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api) aka WebSocket API.

For more detail regarding Refinitiv Real-Time - Optimized, please see the following APIs resources: 

- [WebSocket API RTO Quick Start (Authentication V1)](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api/quick-start#connecting-to-refinitiv-real-time-optimized) page.
- [WebSocket API RTO Quick Start (Authentication V2)](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api/quick-start#connecting-to-real-time-optimized-v2) page.
- [WebSocket API Tutorials](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api/tutorials#connect-to-refinitiv-real-time-optimized) page.
- [How to Setup Amazon EC2 Instance for the Real-Time Optimized (RTO)](https://developers.lseg.com/en/article-catalog/article/how-to-setup-amazon-ec2-instance-for-rto) article.

#### RDP and RTO Endpoints

The RTO application needs to get authentication information from the RDP Auth Service. The application can get the Refinitiv Real-Time service endpoints dynamically from the RDP Service Discovery service.

In [3]:
base_url = os.getenv('RDP_BASE_URL')
auth_url = base_url + os.getenv('RDP_AUTH_URL')
discovery_url = base_url + os.getenv('RDP_DISCOVERY_URL')

#### RTO Credentials

The Refinitiv Real-Time - Optimized (RTO) needs the following access credentials:
- Machine-ID
- RTO Password
- ClientID (aka AppKey)


In [4]:
user= os.getenv('RTO_USERNAME') 
clientid= os.getenv('RTO_CLIENTID') 
password= os.getenv('RTO_PASSWORD') 

# RTO variables
sts_token = ''
refresh_token = ''
original_expire_time = '0'
expire_time = '0'
client_secret = ''
scope = 'trapi.streaming.pricing.read'
region = 'us-east-1'
service = 'ELEKTRON_DD'

In [5]:
# Refinitiv Real-Time Advanced Distribution Server connection variables
hostList = []
port = '15000'
app_id = '256'
position = socket.gethostbyname(socket.gethostname())
login_id = 1

In [6]:
# WebSocket connections Variables

web_socket_app = None
web_socket_open = False
_news_envelopes = []

# Keeps the decompressed news JSON messages
_trna_messages = []

### RDP Token retrieval

To connect to RTO, you first need to retrieve a token from the Token API. This part of the code takes care of the token retrieval for you.

In [7]:
def get_sts_token(current_refresh_token, url=None):
    """
        Retrieves an authentication token.
        :param current_refresh_token: Refresh token retrieved from a previous authentication, used to retrieve a
        subsequent access token. If not provided (i.e. on the initial authentication), the password is used.
    """

    if url is None:
        url = auth_url

    if not current_refresh_token:  # First time through, send password
        data = {'username': user, 'password': password, 'client_id': clientid, 'grant_type': 'password', 'takeExclusiveSignOnControl': True,
                'scope': scope}
        print('Sending authentication request with password to {} ...'.format(url))
        #print(data)
    else:  # Use the given refresh token
        data = {'username': user, 'client_id': clientid, 'refresh_token': current_refresh_token, 'grant_type': 'refresh_token'}
        print("Sending authentication request with refresh token to {} ... ".format(url))
    if client_secret != '':
        data['client_secret'] = client_secret
        
    try:
        # Request with auth for https protocol    
        r = requests.post(url,
                          headers={'Accept': 'application/json'},
                          data=data,
                          auth=(clientid, client_secret),
                          verify=True,
                          allow_redirects=False)

    except requests.exceptions.RequestException as e:
        print('Refinitiv Data Platform authentication exception failure:', e)
        return None, None, None

    if r.status_code == 200:
        auth_json = r.json()
        print('Refinitiv Data Platform Authentication succeeded. RECEIVED:')
        print(json.dumps(auth_json, sort_keys=True, indent=2, separators=(',', ':')))

        return auth_json['access_token'], auth_json['refresh_token'], auth_json['expires_in']
    elif r.status_code == 301 or r.status_code == 302 or r.status_code == 307 or r.status_code == 308:
        # Perform URL redirect
        print('Refinitiv Data Platform authentication HTTP code:', r.status_code, r.reason)
        new_host = r.headers['Location']
        if new_host is not None:
            print('Perform URL redirect to ', new_host)
            return get_sts_token(current_refresh_token, new_host)
        return None, None, None
    elif r.status_code == 400 or r.status_code == 401:
        # Retry with username and password
        print('Refinitiv Data Platform authentication HTTP code:', r.status_code, r.reason)
        if current_refresh_token:
            # Refresh token may have expired. Try using our password.
            print('Retry with username and password')
            return get_sts_token(None)
        return None, None, None
    elif r.status_code == 403 or r.status_code == 451:
        # Stop retrying with the request
        print('Refinitiv Data Platform authentication HTTP code:', r.status_code, r.reason)
        print('Stop retrying with the request')
        return None, None, None
    else:
        # Retry the request to Refinitiv Data Platform 
        print('Refinitiv Data Platform authentication HTTP code:', r.status_code, r.reason)
        print('Retry the request to Refinitiv Data Platform')
        return get_sts_token(current_refresh_token)

#### RDP Service Discovery

Once authentication is successful, you can request the list of RTO WebSocket endpoints from the RDP Service Discovery.

In [8]:
def query_service_discovery(url=None):

    if url is None:
        url = discovery_url

    print("Sending Refinitiv Data Platform service discovery request to " + url)

    try:
        r = requests.get(url, headers={"Authorization": "Bearer " + sts_token}, params={"transport": "websocket"}, allow_redirects=False)

    except requests.exceptions.RequestException as e:
        print('Refinitiv Data Platform service discovery exception failure:', e)
        return False

    if r.status_code == 200:
        # Service discovery request was successful. Deserialize the response.
        response_json = r.json()
        print("Refinitiv Data Platform Service discovery succeeded. RECEIVED:")
        print(json.dumps(response_json, sort_keys=True, indent=2, separators=(',', ':')))

        for index in range(len(response_json['services'])):
            if not response_json['services'][index]['location'][0].startswith(region):
                continue

            if len(response_json['services'][index]['location']) == 1:
                hostList.append(response_json['services'][index]['endpoint'] + ":" +str(response_json['services'][index]['port']))


        if len(hostList) == 0:
            print("The region:", region, "is not present in list of endpoints")
            sys.exit(1)

        return True

    elif r.status_code == 301 or r.status_code == 302 or r.status_code == 303 or r.status_code == 307 or r.status_code == 308:
        # Perform URL redirect
        print('Refinitiv Data Platform service discovery HTTP code:', r.status_code, r.reason)
        new_host = r.headers['Location']
        if new_host is not None:
            print('Perform URL redirect to ', new_host)
            return query_service_discovery(new_host)
        return False
    elif r.status_code == 403 or r.status_code == 451:
        # Stop trying with the request
        print('Refinitiv Data Platform service discovery HTTP code:', r.status_code, r.reason)
        print('Stop trying with the request')
        return False
    else:
        # Retry the service discovery request
        print('Refinitiv Data Platform service discovery HTTP code:', r.status_code, r.reason)
        print('Retry the service discovery request')
        return query_service_discovery()
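
The region filter inside `query_service_discovery` can also be exercised offline. The sketch below is a standalone variant of that filter: the function name `select_endpoints`, the host names, and the sample response are invented for illustration, and only the `services`, `location`, `endpoint`, and `port` keys are taken from the code above.

```python
def select_endpoints(response_json, region):
    """Return 'endpoint:port' strings for single-location services in a region."""
    hosts = []
    for svc in response_json.get("services", []):
        locations = svc.get("location", [])
        # Keep only services that list exactly one location in the wanted region
        if len(locations) == 1 and locations[0].startswith(region):
            hosts.append("{}:{}".format(svc["endpoint"], svc["port"]))
    return hosts


# Hypothetical discovery response (host names are placeholders)
sample_response = {
    "services": [
        {"endpoint": "host-a.example.invalid", "port": 443,
         "location": ["us-east-1a"]},
        {"endpoint": "host-b.example.invalid", "port": 443,
         "location": ["us-east-1a", "us-east-1b"]},
        {"endpoint": "host-c.example.invalid", "port": 443,
         "location": ["eu-west-1a"]},
    ]
}

selected = select_endpoints(sample_response, "us-east-1")
print(selected)
```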

### MRN Process Code

MRN data can be subscribed to with the *NewsTextAnalytics* domain and an MRN-specific RIC name as follows:
- *MRN_TRNA*: News Analytics: Company and C&E assets
- *MRN_TRNA_DOC*: News Analytics: Macroeconomic News & events
- *MRN_STORY*: Real-time News
- *MRN_TRSI*: News Sentiment Indices

In [9]:
# MRN variables

mrn_domain = 'NewsTextAnalytics'
mrn_item = 'MRN_TRNA'

def send_mrn_request(ws):
    """ Create and send MRN request """
    mrn_req_json = {
        'ID': 2,
        "Domain": mrn_domain,
        'Key': {
            'Name': mrn_item,
            'Service': service
        }
    }

    ws.send(json.dumps(mrn_req_json))
    print("SENT:")
    print(json.dumps(mrn_req_json, sort_keys=True, indent=2, separators=(',', ':')))

### Initial Refresh Message
The Initial Refresh response does not contain any NTA data; all the fields related to the news item and fragment are empty or 0. It contains only the relevant feed-related or other static fields.

The application can just print out each incoming field data in a console for informational purpose or just ignore it.

In [10]:
# Process FieldList, Refresh and Status messages.

def decodeFieldList(fieldList_dict):
    for key, value in fieldList_dict.items():
        print("Name = %s: Value = %s" % (key, value))

def processRefresh(ws, message_json):

    print("RECEIVED: Refresh Message")
    decodeFieldList(message_json["Fields"])

def processStatus(ws, message_json):  # process incoming status message
    print("RECEIVED: Status Message")
    print(json.dumps(message_json, sort_keys=True, indent=2, separators=(',', ':')))

### MRN News Update messages Process Code

The updates contain only fields related to the item and the fragment. They do not contain any of the static or per-feed fields. The updates are not cached or conflated.

#### First Update
The first update contains all the fields related to the item and the first fragment; subsequent updates contain only the fields relating to the fragment they carry. The FRAG_NUM FID is set to 1 for the first Update of each item and is incremented in each subsequent Update for that item. This allows you to detect a missing fragment (and ensure the correct order of the fragments for re-assembly).


#### Subsequent Update and Multi Fragment Items
A subsequent update contains the fields necessary to identify the MRN data item, the order of this fragment among all the fragments for this item, and the fragment itself. Note that, for a multi-fragment item, Update messages with FRAG_NUM > 1 have fewer FIDs because the metadata is included in the first Update message (FRAG_NUM = 1) for that item.
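
The first-update and subsequent-update behavior can be sketched as a small buffer keyed by the RIC, MRN_SRC, and GUID combination. This is an illustrative alternative, not the notebook's implementation: the `FragmentBuffer` class, the sample RIC, the `SRC_A` and `guid-1` values, and the payload are all assumptions, and a missing or out-of-order fragment simply drops the item.

```python
import base64
import gzip
import json
import zlib


class FragmentBuffer:
    """Collects FRAGMENT bytes per item until TOT_SIZE bytes have arrived."""

    def __init__(self):
        self._pending = {}  # (ric, mrn_src, guid) -> partial item state

    def add(self, ric, fields):
        """Return the decoded JSON item when complete, otherwise None."""
        key = (ric, fields["MRN_SRC"], fields["GUID"])
        chunk = base64.b64decode(fields["FRAGMENT"])
        frag_num = int(fields["FRAG_NUM"])
        if frag_num == 1:
            # Only the first update carries TOT_SIZE
            entry = {"data": chunk, "tot_size": int(fields["TOT_SIZE"]),
                     "next_frag": 2}
        else:
            entry = self._pending.get(key)
            if entry is None or frag_num != entry["next_frag"]:
                self._pending.pop(key, None)  # gap detected: drop the item
                return None
            entry["data"] += chunk
            entry["next_frag"] += 1
        if len(entry["data"]) < entry["tot_size"]:
            self._pending[key] = entry  # still waiting for more fragments
            return None
        self._pending.pop(key, None)
        return json.loads(zlib.decompress(entry["data"], zlib.MAX_WBITS | 32))


# Simulated two-fragment item (all values are made up)
item = {"id": "demo", "text": "y" * 300}
blob = gzip.compress(json.dumps(item).encode("utf-8"))
half = len(blob) // 2
buffer = FragmentBuffer()
first = buffer.add("MRN_TRNA", {
    "MRN_SRC": "SRC_A", "GUID": "guid-1", "FRAG_NUM": 1,
    "TOT_SIZE": len(blob),
    "FRAGMENT": base64.b64encode(blob[:half]).decode("ascii")})
second = buffer.add("MRN_TRNA", {
    "MRN_SRC": "SRC_A", "GUID": "guid-1", "FRAG_NUM": 2,
    "FRAGMENT": base64.b64encode(blob[half:]).decode("ascii")})
print(first is None, second == item)
```

Keying on the full RIC, MRN_SRC, and GUID tuple avoids mixing fragments if the same GUID ever appears on two different RICs or sources.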

#### News Fragments simple handle logic


<img src="images/mrn_flow_reconstruct.png"/>


In [11]:
def processMRNUpdate(ws, message_json):  # process incoming News Update messages

    fields_data = message_json["Fields"]
    # Dump the FieldList first (for informational purposes)
    # decodeFieldList(message_json["Fields"])

    # declare variables
    tot_size = 0
    guid = None

    try:
        # Get data for all required fields
        fragment = base64.b64decode(fields_data["FRAGMENT"])
        frag_num = int(fields_data["FRAG_NUM"])
        guid = fields_data["GUID"]
        mrn_src = fields_data["MRN_SRC"]

        #print("GUID  = %s" % guid)
        #print("FRAG_NUM = %d" % frag_num)
        #print("MRN_SRC = %s" % mrn_src)

        if frag_num > 1:  # We are now processing more than one part of an envelope - retrieve the current details
            guid_index = next((index for (index, d) in enumerate(
                _news_envelopes) if d["guid"] == guid), None)
            # guid_index is None when no stored envelope matches this GUID
            envelop = _news_envelopes[guid_index] if guid_index is not None else None
            if envelop and envelop["data"]["mrn_src"] == mrn_src and frag_num == envelop["data"]["frag_num"] + 1:
                print("process multiple fragments for guid %s" %
                      envelop["guid"])

                #print("fragment before merge = %d" % len(envelop["data"]["fragment"]))

                # Merge incoming data to existing news envelop and getting FRAGMENT and TOT_SIZE data to local variables
                fragment = envelop["data"]["fragment"] = envelop["data"]["fragment"] + fragment
                envelop["data"]["frag_num"] = frag_num
                tot_size = envelop["data"]["tot_size"]
                print("TOT_SIZE = %d" % tot_size)
                print("Current FRAGMENT length = %d" % len(fragment))

                # The multiple fragments news are not completed, waiting.
                if tot_size != len(fragment):
                    return None
                # The multiple fragments news are completed, delete the associated GUID envelop
                elif tot_size == len(fragment):
                    del _news_envelopes[guid_index]
            else:
                print("Error: Cannot find fragment for GUID %s with matching FRAG_NUM or MRN_SRC %s" % (
                    guid, mrn_src))
                return None
        else:  # FRAG_NUM = 1 The first fragment
            tot_size = int(fields_data["TOT_SIZE"])
            print("FRAGMENT length = %d" % len(fragment))
            # The fragment news is not completed, waiting and add this news data to envelop object.
            if tot_size != len(fragment):
                print("Add new fragments to news envelop for guid %s" % guid)
                _news_envelopes.append({  # the envelop object is a Python dictionary with GUID as a key and other fields are data
                    "guid": guid,
                    "data": {
                        "fragment": fragment,
                        "mrn_src": mrn_src,
                        "frag_num": frag_num,
                        "tot_size": tot_size
                    }
                })
                return None

        # News Fragment(s) completed, decompress and print data as JSON to console
        if tot_size == len(fragment):
            print("decompress News FRAGMENT(s) for GUID  %s" % guid)
            decompressed_data = zlib.decompress(fragment, zlib.MAX_WBITS | 32)
            
            json_news = json.loads(decompressed_data)
            _trna_messages.append(json_news)
            print("News = %s" % json_news)

    except KeyError as keyerror:
        print('KeyError exception: ', keyerror)
    except IndexError as indexerror:
        print('IndexError exception: ', indexerror)
    except binascii.Error as b64error:
        print('base64 decoding exception:', b64error)
    except zlib.error as error:
        print('zlib decompressing exception: ', error)
    # Some console environments like Windows may encounter this unicode display as a limitation of OS
    except UnicodeEncodeError as encodeerror:
        print("UnicodeEncodeError exception. Cannot encode unicode character for %s in this environment: " %
              guid, encodeerror)
    except Exception as e:
        print('exception: ', type(e).__name__, e)

### JSON-OMM Process functions

In [12]:
def process_message(ws, message_json):
    """ Parse at high level and output JSON of message """
    message_type = message_json['Type']

    if message_type == "Refresh":
        if "Domain" in message_json:
            message_domain = message_json["Domain"]
            if message_domain == "Login":
                process_login_response(ws, message_json)
            elif message_domain:
                processRefresh(ws, message_json)
    elif message_type == "Update":
        if "Domain" in message_json and message_json["Domain"] == mrn_domain:
            processMRNUpdate(ws, message_json)
    elif message_type == "Status":
        processStatus(ws, message_json)
    elif message_type == "Ping":
        pong_json = {'Type': 'Pong'}
        ws.send(json.dumps(pong_json))
        print("SENT:")
        print(json.dumps(pong_json, sort_keys=True,
                         indent=2, separators=(',', ':')))


def process_login_response(ws, message_json):
    """ Send item request """
    send_mrn_request(ws)


def send_login_request(ws ,auth_token, is_refresh_token):
    """ Generate a login request from command line data (or defaults) and send """
    login_json = {
        'ID': 1,
        "Domain": 'Login',
        'Key': {
            'NameType': 'AuthnToken',
            'Elements': {
                'ApplicationId': '',
                'Position': '',
                'AuthenticationToken': ''
            }
        }
    }

    login_json['Key']['Name'] = user
    login_json['Key']['Elements']['ApplicationId'] = app_id
    login_json['Key']['Elements']['Position'] = position
    login_json['Key']['Elements']['AuthenticationToken'] = auth_token
    
    if is_refresh_token:
        login_json['Refresh'] = False

    ws.send(json.dumps(login_json))
    print("SENT:")
    print(json.dumps(login_json, sort_keys=True, indent=2, separators=(',', ':')))

def send_refresh_token(ws):
    print('Refreshing the access token')
    send_login_request(ws, sts_token, True)

def ws_disconnect(ws):
    print('Closing the WebSocket connection')
    if web_socket_open:
        ws.close()

### WebSocket Process functions

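The code sets a 10-minute time limit (`t_end`) before it stops the connection. Because the limit is checked only after each token-refresh sleep, the actual run time can be longer. You can make it run forever by changing the ```while time.time() < t_end:``` statement to ```while True:```.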
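
If you want the loop to stop closer to the configured limit, one option is to cap each sleep by the time remaining. The helper below is only a sketch of that idea: `next_sleep_seconds` is a hypothetical name, and the 90% rule is taken from the main loop in this notebook.

```python
def next_sleep_seconds(expire_time, t_end, now):
    """Seconds to wait before the next token refresh, never past the time limit."""
    refresh_in = int(float(expire_time) * 0.90)  # refresh at 90% of token lifetime
    remaining = max(0, int(t_end - now))         # time left before the limit
    return min(refresh_in, remaining)


# With a 600-second token and 100 seconds left, sleep only 100 seconds
print(next_sleep_seconds('600', t_end=1000, now=900))
```

In the main loop, this value could replace the argument passed to `time.sleep`, with `time.time()` supplied as `now`.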

In [13]:
def on_message(ws, message):
    """ Called when message received, parse message into JSON for processing """
    print("RECEIVED: ")
    message_json = json.loads(message)
    # Uncomment to print RAW JSON message from the server
    #print(json.dumps(message_json, sort_keys=True, indent=2, separators=(',', ':')))

    for singleMsg in message_json:
        process_message(ws, singleMsg)
        
def on_error(ws, error):
    """ Called when websocket error has occurred """
    print(error)
    
def on_close(ws, close_status_code, close_msg):
    """ Called when websocket is closed """
    global web_socket_open
    print("WebSocket Closed")
    web_socket_open = False
    
def on_open(ws):
    """ Called when handshake is complete and websocket is open, send login """

    print("WebSocket successfully connected!")
    global web_socket_open
    web_socket_open = True
    send_login_request(ws, sts_token, False)  # initial login, not a token refresh
    

## Main Function 

if __name__ == "__main__":
    # RTO - RDP Login
    #print(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
    sts_token, refresh_token, expire_time = get_sts_token(None)
    if not sts_token:
        sys.exit(1)

    original_expire_time = expire_time

    # Query VIPs from Refinitiv Data Platform service discovery
    if not query_service_discovery():
        print('Failed to retrieve endpoints from Refinitiv Data Platform Service Discovery. Exiting...')
        sys.exit(1)
    #print(hostList[0])
    # Start websocket handshake
    ws_address = "wss://{}/WebSocket".format(hostList[0])
    print("Connecting to WebSocket " + ws_address + " ...")
    web_socket_app = websocket.WebSocketApp(ws_address, header=['User-Agent: Python'],
                                            on_message=on_message,
                                            on_error=on_error,
                                            on_close=on_close,
                                            subprotocols=['tr_json2'])
    web_socket_app.on_open = on_open
    
    #web_socket_app.keep_running = False

    # Event loop
    #wst = threading.Thread(target=web_socket_app.run_forever)
    wst = threading.Thread(target=web_socket_app.run_forever, kwargs={'sslopt': {'check_hostname': False}})
    wst.start()

    #time.sleep(90)
    #web_socket_app.close()
    t_end = time.time() + (60 * 10)  # Time limit: 10 minutes, checked after each token refresh
    try:
        #while True: # Change to this line for running forever
        while time.time() < t_end:  # Stop once the 10-minute limit has passed
            #  Continue using current token until 90% of initial time before it expires.
            time.sleep(int(float(expire_time) * 0.90))

            sts_token, refresh_token, expire_time = get_sts_token(refresh_token)
            if not sts_token:
                sys.exit(1)

            if int(expire_time) != int(original_expire_time):
               print('expire time changed from {} sec to {} sec; retry with password'.format(str(original_expire_time), str(expire_time)))
               sts_token, refresh_token, expire_time = get_sts_token(None)
               if not sts_token:
                   sys.exit(1)
               original_expire_time = expire_time

            # Update token.
            send_refresh_token(web_socket_app)
        else: # End time
            print('Close connection')
            ws_disconnect(web_socket_app)
            #print(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

    except KeyboardInterrupt:
        pass

Sending authentication request with password to https://api.refinitiv.com/auth/oauth2/v1/token ...
Refinitiv Data Platform Authentication succeeded. RECEIVED:
{
  "access_token":"eyJhbGciOiJSUzI1NiIsImtpZCI6ImJlcGpHV0dkOW44WU9VQ1NwX3M3SXlRMmlKMFkzeWRFaHo1VDJJVlNqWTgiLCJ0eXAiOiJhdCtqd3QifQ.eyJhdWQiOiJiNDg0MmYzOTA0ZmI0YTFmYTE4MjM0Nzk2MzY4Nzk5MDg2YzYzNTQxIiwiZGF0YSI6IntcImNpcGhlcnRleHRcIjpcImlSX0RnU1EyTG9waG1EWWJxdFYyWXRQM1BSai0yc1g1WTVwZk9WTjhlMzdoLXNsTUpjWU91WGFnS0dtZjRVOS1ubnF5aEZaM3dYaVlQTGo2Y0hOeWdUU3M1clpyeml3WDZWeTFYZUJSaE5MeHdkQzViQ1I1a3lnbWt6ckhOa0Q1S19mMlA2Q3p2elVlYnh0aHZFNk56ZzZCZ29rS3YtYU94YWl2ZFQ5YTdNUlN6NDVtYjc2bnEydzJBYnJyR0FHQkpLUnF1ckFrZElqSGxEOVRfWkpmNU9LM3BNZGNPUEZPSjUtaXBPVlRaVUVvXzhGc2NNR3RoRUlMUzk5S25iblFYZGlMR3lKXzJ3UjlMUlExRU82XzVRWW9jMzl3VXlQd3FWc0dCVVd3R1lERXFOR3pIbXR5Vm0zS3ZQTndVaUV0OTRhQmJiOG9EeV9OWXFpLXJBQWtJXzc1R1FHU0xGS2psdTg0VFFxaXNBdE94alRaNEVSTmRHSEFGNWJ5Q0cyMTR6X2FWeWZ3V0ZqMkVLV1lcIixcImVuY3J5cHRlZF9rZXlcIjpcIkFRSUJBSGlTZUVwdWFLRlhsdzlNSWQ1cEQ3TVc3b3h0T1

Check News Messages

In [16]:
_trna_messages[0:3]

[{'analytics': {'analyticsScores': [{'assetClass': 'CMPNY',
     'assetCodes': ['P:4296689152', 'R:MRCU.BO', 'R:MRCU.INx'],
     'assetId': '4296689152',
     'assetName': 'Mercury Laboratories Ltd',
     'brokerAction': 'UNDEFINED',
     'firstMentionSentence': 1,
     'linkedIds': [],
     'noveltyCounts': [{'itemCount': 0, 'window': '12H'},
      {'itemCount': 0, 'window': '24H'},
      {'itemCount': 0, 'window': '3D'},
      {'itemCount': 0, 'window': '5D'},
      {'itemCount': 0, 'window': '7D'}],
     'priceTargetIndicator': 'UNDEFINED',
     'relevance': 1.0,
     'sentimentClass': 1,
     'sentimentNegative': 0.00782282,
     'sentimentNeutral': 0.447781,
     'sentimentPositive': 0.544396,
     'sentimentWordCount': 82,
     'volumeCounts': [{'itemCount': 0, 'window': '12H'},
      {'itemCount': 1, 'window': '24H'},
      {'itemCount': 1, 'window': '3D'},
      {'itemCount': 1, 'window': '5D'},
      {'itemCount': 1, 'window': '7D'}]}],
   'newsItem': {'bodySize': 357,
    'co

This Notebook application connects to the Real-Time Advanced Distribution Server over a WebSocket connection, then consumes TRNA data in the MRN data domain. When the Notebook receives TRNA data from the server, it assembles and decodes each MRN textual News Analytics message and keeps it in the ```_trna_messages``` [list](https://docs.python.org/3.7/tutorial/datastructures.html#more-on-lists) variable. You can get each TRNA JSON message and its associated analytics fields from this variable.
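The assemble-and-decode step described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: it assumes each fragment arrives as a base64 string and that the joined bytes form a gzip stream, which is how the public MRN WebSocket examples treat the FRAGMENT field. The function name `decode_mrn_payload` and the sample payload are hypothetical.

```python
import base64
import gzip
import json
import zlib


def decode_mrn_payload(fragments_b64):
    """Join base64-encoded fragments, decompress them, and parse the JSON."""
    compressed = b"".join(base64.b64decode(f) for f in fragments_b64)
    # MAX_WBITS | 32 tells zlib to auto-detect a gzip (or zlib) header.
    raw = zlib.decompress(compressed, zlib.MAX_WBITS | 32)
    return json.loads(raw.decode("utf-8"))


# Simulate a message split into two fragments (illustrative data only).
sample = {"id": "tr:example", "analytics": {}, "newsItem": {}}
blob = gzip.compress(json.dumps(sample).encode("utf-8"))
half = len(blob) // 2
fragments = [
    base64.b64encode(blob[:half]).decode("ascii"),
    base64.b64encode(blob[half:]).decode("ascii"),
]

decoded = decode_mrn_payload(fragments)
print(decoded["id"])
```

A real application would also need to collect fragments until the received size matches the total size announced with the first fragment before decompressing; that bookkeeping is omitted here.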

## News Analytics Data Model Overview

The structure of the data within each data feed is defined in the following sections. After assembly and decompression, the data appears as JSON in UTF-8.

You can find the full details of the News Analytics Data Models and Fields in the *User Guide* section of [My LSEG's News Analytics page](https://myaccount.lseg.com/en/product/machine-readable-news-analytics).

The News Analytics feed has three top-level items:
- *id*: The value of this field is in ```[feedFamilyCode]:[sourceId]``` format.
- *analytics*: Analytics Groups sub-group containing the analytics scores
- *newsItem*: This group contains metadata sourced directly from the STORY item, in contrast to the newsItem group also inside the analytics group that contains data derived from the TRNA scoring.

In [17]:
analytics_id  = _trna_messages[0]["id"]
print("Analytics id: ", analytics_id)

Analytics id:  tr:BSE4YTRrJ_2511052T1z0HN8/9d8ErWcEnrjyhBnQWl0h5JAkypZ2K
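Because the id follows the ```[feedFamilyCode]:[sourceId]``` format, it can be split into its two parts. The sketch below is one possible way to do this; the helper name `split_analytics_id` is hypothetical, and it assumes the feed family code itself never contains a colon.

```python
def split_analytics_id(analytics_id):
    """Split '[feedFamilyCode]:[sourceId]' at the first colon."""
    feed_family_code, _, source_id = analytics_id.partition(":")
    return feed_family_code, source_id


feed_family_code, source_id = split_analytics_id(
    "tr:BSE4YTRrJ_2511052T1z0HN8/9d8ErWcEnrjyhBnQWl0h5JAkypZ2K"
)
print("feedFamilyCode:", feed_family_code)
print("sourceId:", source_id)
```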


### News Item Group (Top-Level Group)

The News Analytics feed contains two news item groups. This top-level group contains values which are contained within the news item being processed; the other group, within the analytics group (see the *News Item Group (Analytics Sub-group)* section), contains values derived from the news item by the analytics system.

Because the fields below are sourced from the incoming news item data, their mappings can vary by the feedFamilyCode value. Where a mapping differs, it is noted under the relevant field in the list below.

Example Fields:
- ```dataType```: The broad type of data the news item belongs to. One of "News", "Social"
- ```feedFamilyCode```: A code that identifies the family of feeds the news item came from. Thomson Reuters feeds = "tr"
- ```headline```: The headline text of the news item.
- ```sourceTimestamp```: UTC timestamp of this news item. Millisecond precision. The source of this data varies by the feedFamilyCode value.
- ```provider```: Identifier for the organization which provided the news item. The source of this data varies by the feedFamilyCode value.
    * "tr": from provider field
    * "mrvr": from sourceName or publisher field
- ```urgency```: Differentiates story types. 1: alert, 3: article
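Since ```sourceTimestamp``` is a UTC string with millisecond precision, it can be converted into a timezone-aware `datetime` for comparisons and arithmetic. The following is a small sketch, assuming the value always carries fractional seconds and a trailing `Z`, as in the sample received by this notebook; the helper name `parse_source_timestamp` is hypothetical.

```python
from datetime import datetime, timezone


def parse_source_timestamp(value):
    """Parse e.g. '2025-11-05T04:51:03.803Z' into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


ts = parse_source_timestamp("2025-11-05T04:51:03.803Z")
print(ts.isoformat())
```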
    

In [18]:
news_item  = _trna_messages[0]["newsItem"]

In [19]:
print("dataType: ", news_item["dataType"])
print("Headline: ", news_item["headline"])
print("sourceTimestamp: ", news_item["sourceTimestamp"])
print("feedFamilyCode: ", news_item["feedFamilyCode"])
print("provider: ", news_item["provider"][3:]) # Strip the "NS:" prefix, e.g. "NS:BSE" -> "BSE"
print("urgency: ", news_item["urgency"], " : ", 
      "alert" if news_item["urgency"] == 1 else "article")

dataType:  News
Headline:  Mercury Laboratories Ltd - 538964 - Board Meeting Intimation for Approval Of The Unaudited Financial Results For The Quarter And Half Year Ended On September 30, 2025 Scheduled To Be Held On November 11, 2025
sourceTimestamp:  2025-11-05T04:51:03.803Z
feedFamilyCode:  tr
provider:  BSE
urgency:  3  :  article


In [20]:
analytic_scores_group = _trna_messages[0]["analytics"]["analyticsScores"]

### Analytics Score Group

Each analytics score group contains all the analytics information derived from the news item for a specific asset as a simple group of named values.

Example Fields:
- ```assetClass```: The broad class that the asset belongs to. Also describes the type of TRTS sentiment engine used in the scoring.
    * Either "CMPNY" for a company or "COM" for a commodity.
    * Set to “CMPNY” for document-level scores because of use of the same scoring engine as used for company-level scores.
- ```assetCodes```: List of prefixed codes which, in conjunction with the assetId field below, identify the asset within various symbologies. By assetClass value:
    * “CMPNY”: “P:” prefix for PermID and “R:” prefix for RIC. Can contain multiple RICs for a single company, including the primary one and those tagged to the news item.
    * “COM”: “N2:” prefix for topic code.
- ```assetId```: Primary identifier for the asset. PermID for company and topic code for commodity.
- ```assetName```: A human readable name for the asset, used as an identifier for unknown entity scoring.
- ```brokerAction```: Denotes whether the news item is reporting the action of a broker recommendation for a security issued by the company.
    * One of "UPGRADE", "DOWNGRADE", "MAINTAIN", "BROKER", "INITIATE", "UNDEFINED"
- ```firstMentionSentence```: The first sentence, starting with the headline, in which the scored asset is mentioned. Thus, a value of 1 denotes the headline, 2 the first sentence of the story body, 3 the second sentence, etc.
- ```priceTargetIndicator```: Denotes whether the news item is a price target indicator for the asset.
    * One of "INCREASE", "DECREASE", "MAINTAIN", "BROKER", "INITIATE", "UNDEFINED"
    * Set to “UNDEFINED” for all Japanese-language and document-level scores.
- ```relevance```: A decimal number indicating the relevance of the news item to the asset. It ranges from 0 to 1.
- ```sentimentClass```: This field indicates the predominant sentiment class for this news item with respect to this asset. The indicated class is the one with the highest probability.
    * 1: Positive
    * 0: Neutral 
    * -1: Negative
- ```sentimentNegative```: The probability that the sentiment of the news item was negative for the asset.
- ```sentimentNeutral```: The probability that the sentiment of the news item was neutral for the asset.
- ```sentimentPositive```: The probability that the sentiment of the news item was positive for the asset.
- ```sentimentWordCount```: The number of lexical tokens (words and punctuation) in the sections of the item text that are deemed relevant to the asset.
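The relationship between ```sentimentClass``` and the three probability fields can be checked with a short sketch. It assumes only what the field descriptions above state, namely that the class is the one with the highest probability; the helper name `predominant_sentiment` is hypothetical, and how the feed breaks ties is not covered here. The sample values are those of the first message received by this notebook.

```python
SENTIMENT_LABELS = {1: "Positive", 0: "Neutral", -1: "Negative"}


def predominant_sentiment(score):
    """Return the class (1, 0 or -1) with the highest probability."""
    probabilities = {
        1: score["sentimentPositive"],
        0: score["sentimentNeutral"],
        -1: score["sentimentNegative"],
    }
    return max(probabilities, key=probabilities.get)


sample_score = {
    "sentimentClass": 1,
    "sentimentPositive": 0.544396,
    "sentimentNeutral": 0.447781,
    "sentimentNegative": 0.00782282,
}

derived = predominant_sentiment(sample_score)
print(derived, SENTIMENT_LABELS[derived])
```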

#### TRNA Analytics Group Processing functions

In [21]:
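def get_permid(asset_codes):
    """Return the PermID from the "P:"-prefixed code, or None if there is none."""
    for code in asset_codes:
        if code.startswith("P:"):
            return code[2:]
    return None

def get_company(asset_codes):
    """Return the RICs from the "R:"-prefixed codes, separated by spaces."""
    company = [code[2:] for code in asset_codes if code.startswith("R:")]
    return " ".join(company)

def get_topic_code(asset_codes):
    """Return the topic codes from the "N2:"-prefixed codes, separated by spaces."""
    topic = [code[3:] for code in asset_codes if code.startswith("N2:")]
    return " ".join(topic)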

In [22]:
# Analytics Group Fields

asset_class = None
asset_codes = None
sentiment_class = {-1: 'Negative', 0: 'Neutral', 1: 'Positive'}

for analytic_score in analytic_scores_group:
    if analytic_score["assetClass"]:
        asset_class = analytic_score["assetClass"]
        asset_codes = analytic_score["assetCodes"]
        print("assetClass: ", asset_class)
        print("assetCodes: ", asset_codes)
        if asset_class == "CMPNY":
            print("PermID: ", get_permid(asset_codes))
            print("Co: ", get_company(asset_codes))
        elif asset_class == "COM":
            print("Topic Codes: ", get_topic_code(asset_codes))
        print("assetId: ", analytic_score["assetId"])
        print("assetName: ", analytic_score["assetName"])
        print("brokerAction: ", analytic_score["brokerAction"])
        print("relevance: ", analytic_score["relevance"])
        print("sentimentClass: ", analytic_score["sentimentClass"], 
              ":", sentiment_class[analytic_score["sentimentClass"]] )
        print("sentimentPositive: ", analytic_score["sentimentPositive"])
        print("sentimentNeutral: ", analytic_score["sentimentNeutral"])
        print("sentimentNegative: ", analytic_score["sentimentNegative"])
        print("priceTargetIndicator: ", analytic_score["priceTargetIndicator"])
        print("firstMentionSentence: ", analytic_score["firstMentionSentence"])
        print("sentimentWordCount: ", analytic_score["sentimentWordCount"])
        print("--------------------------------------------------------")

assetClass:  CMPNY
assetCodes:  ['P:4296689152', 'R:MRCU.BO', 'R:MRCU.INx']
PermID:  4296689152
Co:  MRCU.BO MRCU.INx
assetId:  4296689152
assetName:  Mercury Laboratories Ltd
brokerAction:  UNDEFINED
relevance:  1.0
sentimentClass:  1 : Positive
sentimentPositive:  0.544396
sentimentNeutral:  0.447781
sentimentNegative:  0.00782282
priceTargetIndicator:  UNDEFINED
firstMentionSentence:  1
sentimentWordCount:  82
--------------------------------------------------------


### Windowed Count Group

The windowed count group is used to associate a count with the window of time it relates to. It is used for the noveltyCounts and volumeCounts.

#### Novelty Counts 

The novelty of the content within a news item on a particular asset is calculated by comparing it with the asset-specific text over a cache of previous news items that contain the asset.

The comparison between items is done using a linguistic fingerprint. If the news items are similar, they are termed as being “linked”. As a result, a content item can “link” only to an item of the same language.

There are five historical periods that are used in the comparison. The default periods are 12 hours, 24 hours, 3 days, 5 days and 7 days prior to the news item’s timestamp.

#### Volume Counts

The volume of news for each asset is calculated. A cache of previous news items is maintained and the number of news items that mention the asset within each of five historical periods is calculated. The cache is language-specific, e.g., a volumeCount on an English-language item measures the number of other English-language items in that historical period.

By default, the historical periods are 12 hours, 24 hours, 3 days, 5 days and 7 days prior to the news item’s timestamp and are the same used in the novelty calculations. Thus, direct comparisons between similar and total items within the historical periods can be achieved.

Example Fields:
- ```itemCount```: Number of items
- ```window```: Length of time the count covers nH (for hours) or nD (for days). Default values are “12H”, “24H”, “3D”, “5D”, and “7D”.
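Because novelty and volume counts share the same windows, they can be lined up per window to compare similar items against total items. The sketch below is one way to do that, assuming the window strings always follow the nH / nD pattern described above; the helper names `parse_window` and `counts_by_window` are hypothetical. The sample counts are those of the first message received by this notebook.

```python
from datetime import timedelta


def parse_window(window):
    """Convert a window such as '12H' or '3D' into a timedelta."""
    amount, unit = int(window[:-1]), window[-1]
    if unit == "H":
        return timedelta(hours=amount)
    if unit == "D":
        return timedelta(days=amount)
    raise ValueError(f"Unsupported window: {window!r}")


def counts_by_window(group):
    """Map each window label to its itemCount."""
    return {item["window"]: item["itemCount"] for item in group}


novelty_counts = [
    {"itemCount": 0, "window": "12H"}, {"itemCount": 0, "window": "24H"},
    {"itemCount": 0, "window": "3D"}, {"itemCount": 0, "window": "5D"},
    {"itemCount": 0, "window": "7D"},
]
volume_counts = [
    {"itemCount": 0, "window": "12H"}, {"itemCount": 1, "window": "24H"},
    {"itemCount": 1, "window": "3D"}, {"itemCount": 1, "window": "5D"},
    {"itemCount": 1, "window": "7D"},
]

novelty = counts_by_window(novelty_counts)
volume = counts_by_window(volume_counts)

# Walk the windows from shortest to longest.
for window in sorted(volume, key=parse_window):
    print(window, "similar:", novelty.get(window, 0), "total:", volume[window])
```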

#### TRNA Windowed Count Processing functions

In [23]:
def windowed_count_group(group):
    for item in group:
        print("itemCount: ", item["itemCount"])
        print("window: ", item["window"])

In [24]:
# Windowed Count - Analytics Group Fields

for analytic_score in analytic_scores_group:
    print("Novelty Counts:\n")
    windowed_count_group(analytic_score["noveltyCounts"])
    print("--------------------------------------------------------")
    print("Volume Counts:\n")
    windowed_count_group(analytic_score["volumeCounts"])
    print("--------------------------------------------------------")

Novelty Counts:

itemCount:  0
window:  12H
itemCount:  0
window:  24H
itemCount:  0
window:  3D
itemCount:  0
window:  5D
itemCount:  0
window:  7D
--------------------------------------------------------
Volume Counts:

itemCount:  0
window:  12H
itemCount:  1
window:  24H
itemCount:  1
window:  3D
itemCount:  1
window:  5D
itemCount:  1
window:  7D
--------------------------------------------------------


### Linked Id Group

The linked id group is used to associate an id with its position in a longer list of ids. It is used for the linkedIds.
This group is not populated for document-level scores, since novelty is not calculated.

Example Fields:
- ```idPosition```: Position of the linkedId in the complete list of linked Ids. 0 is the first/oldest, and the largest/most recent is the 7-day itemCount minus 1.
- ```linkedId```: id of the item at this position
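Since ```idPosition``` gives each linked id its place in the complete list (0 being the oldest), the group can be ordered by that field. The sketch below is only an illustration: the helper name `linked_ids_oldest_first` is hypothetical, and the sample entries are made up because the message received by this notebook has an empty ```linkedIds``` list.

```python
def linked_ids_oldest_first(linked_id_group):
    """Return the linkedId values ordered by idPosition (oldest first)."""
    ordered = sorted(linked_id_group, key=lambda entry: entry["idPosition"])
    return [entry["linkedId"] for entry in ordered]


# Hypothetical entries, not real feed data.
hypothetical_linked_ids = [
    {"idPosition": 2, "linkedId": "tr:EXAMPLE_C"},
    {"idPosition": 0, "linkedId": "tr:EXAMPLE_A"},
]

print(linked_ids_oldest_first(hypothetical_linked_ids))
print(linked_ids_oldest_first([]))  # An empty group yields an empty list
```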

In [25]:
linked_id_group = None

for analytic_score in analytic_scores_group:
    print("Linked Id Group: ")
    linked_id_group = analytic_score["linkedIds"]
    if linked_id_group:
        for linked_id in linked_id_group:
            print("idPosition: ", linked_id["idPosition"])
            print("linkedId: ", linked_id["linkedId"])

Linked Id Group: 


### News Item Group (Analytics Sub-group)

The TRNA feed contains two news item groups. This group, within the analytics group, contains values derived from the news item by the analytics system.

Example Fields:
- ```companyCount```: The number of companies explicitly listed in the news item in the subjects field
- ```exchangeAction```: One of "IMBALANCE", "HALT", "RESUME", "BLOCK TRADE", "INDICATION", "UNDEFINED".
    * Set to “UNDEFINED” for all Japanese-language scores.
- ```marketCommentary```: Indicator that the item is discussing general market conditions, such as “After the Bell” summaries.
- ```sentenceCount```: The total number of sentences in the news item.
- ```wordCount```: The total number of lexical tokens (words and punctuation) in the news item.

In [26]:
news_item_groups = _trna_messages[0]["analytics"]["newsItem"]

In [27]:
print("companyCount: ", news_item_groups["companyCount"])
print("exchangeAction: ", news_item_groups["exchangeAction"])
print("marketCommentary: ", news_item_groups["marketCommentary"])
print("sentenceCount: ", news_item_groups["sentenceCount"])
print("wordCount: ", news_item_groups["wordCount"])

companyCount:  1
exchangeAction:  UNDEFINED
marketCommentary:  False
sentenceCount:  3
wordCount:  83


## Next Steps

Once the application can retrieve each News Analytics field from the Real-Time platform, it needs business logic to collect and analyze that data based on the analytics it is interested in. Some examples of how each analytic can be used:
- *Sentiment*: Positive sentiment is often associated with a rise in the asset price, negative sentiment with a decline
- *Relevance*: Filter out News Analytics records with low relevance
- *Novelty*: Filter out News Analytics records that are similar to more than 0 or 1 recent news items
- *Volume*: A sudden spike in overall news volume often leads to increased trading volume and volatility
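The relevance and novelty filters listed above can be sketched as a single predicate. This is a rough starting point rather than a recommended strategy: the threshold values, the choice of the 24H window, and the helper name `is_actionable` are all assumptions made for illustration. The sample record reuses values from the first message received by this notebook.

```python
MIN_RELEVANCE = 0.8      # Hypothetical cut-off for "low relevance"
MAX_SIMILAR_ITEMS = 1    # Keep items similar to at most this many recent items
NOVELTY_WINDOW = "24H"   # Hypothetical choice of comparison window


def is_actionable(score, min_relevance=MIN_RELEVANCE,
                  max_similar=MAX_SIMILAR_ITEMS, window=NOVELTY_WINDOW):
    """Keep a score only if it is relevant enough and sufficiently novel."""
    if score["relevance"] < min_relevance:
        return False
    similar = next(
        (item["itemCount"] for item in score["noveltyCounts"]
         if item["window"] == window),
        0,  # Treat a missing window as "no similar items"
    )
    return similar <= max_similar


sample_score = {
    "relevance": 1.0,
    "sentimentClass": 1,
    "noveltyCounts": [
        {"itemCount": 0, "window": "12H"},
        {"itemCount": 0, "window": "24H"},
    ],
}

print(is_actionable(sample_score))
```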

For more detail on each analytic and how to use it, please check the [News Analytics Product page](https://myaccount.lseg.com/en/product/machine-readable-news-analytics).

## References

For further details, please check out the following resources:

- [LSEG Real-Time products family page](https://developers.lseg.com/en/use-cases-catalog/real-time) on the [LSEG Developers Community](https://developers.lseg.com/) website.
- [WebSocket API page](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/websocket-api).
- [Developer Webinar Recording: Introduction to Electron WebSocket API](https://www.youtube.com/watch?v=CDKWMsIQfaw).
- [News Analytics Product page](https://myaccount.lseg.com/en/product/machine-readable-news-analytics).
- [Introduction to Machine Readable News with WebSocket API](https://developers.lseg.com/en/article-catalog/article/introduction-machine-readable-news-elektron-websocket-api-refinitiv).
- [Introduction to Machine Readable News (MRN) with Enterprise Message API (EMA)](https://developers.lseg.com/en/article-catalog/article/introduction-machine-readable-news-mrn-elektron-message-api-ema).
- [MRN Data Models and Real-Time SDK Implementation Guide](https://developers.lseg.com/en/api-catalog/real-time-opnsrc/rt-sdk-java/documentation#mrn-data-models-and-elektron-implementation-guide).
- [MRN (Real-Time News) WebSocket Python example on GitHub](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.Python.MRN).
- [MRN (Real-Time News) WebSocket Python Console example on GitHub](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.Python.MRN.RTO).
- [MRN WebSocket JavaScript example on GitHub](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.Javascript.NewsMonitor).
- [MRN WebSocket C# NewsViewer example on GitHub](https://github.com/LSEG-API-Samples/Example.WebSocketAPI.CSharp.MRNWebSocketViewer).
- [Real-Time WebSocket API: The Real-Time Optimized Version 2 Authentication Migration Guide](https://developers.lseg.com/en/article-catalog/article/webSocket-api-rto-v2-authentication-migration-guide).
- [Migrating the WebSocket Machine Readable News Application to Version 2 Authentication](https://developers.lseg.com/en/article-catalog/article/migrating-the-websocket-machine-readable-news-to-rto-v2).

For any questions related to this example or the WebSocket API, please use the Developer Community [Q&A Forum](https://community.developers.refinitiv.com).