# New issues in the news

This tutorial demonstrates the Eikon API, focusing on news retrieval in a Jupyter Notebook environment. For that purpose, we will look at new issue news from International Financial Review (IFR), a global capital markets intelligence provider that is part of Refinitiv.

We will capture the `PRICED` or `DEAL` notifications, which contain structured text that we will extract.

Before we start, let's make sure that:

+ Refinitiv Eikon desktop application is up and running;
+ Eikon Data API library is installed;
+ You have created an application ID for this script.

If you have not yet done this, have a look at the quick start section for this API. 

*A general note on the Jupyter Notebook usage*: to execute the code in a cell, press <kbd>Shift</kbd>+<kbd>Enter</kbd>. While the notebook is busy running your code, the cell will look like this: `In [*]`. When it has finished, you will see it change to the sequence number of the task, followed by the output, if any. For example,

`In [8]: df['Asset Type'].value_counts()`

`Out[8]: Investment Grade    47
High Yield          24
Islamic             10
Covered              2
Name: Asset Type, dtype: int64`

For more information on the Jupyter Notebook, check out the Project Jupyter site http://jupyter.org or the 'How to set up a Python development environment for Thomson Reuters Eikon' tutorial on this portal.

Let's start by importing the Eikon API library and pandas:

In [1]:
import eikon as ek
import pandas as pd

Paste your application ID into this line:

In [2]:
ek.set_app_key('your_app_id')
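
If you would rather not keep the application ID in the notebook itself, it could be read from an environment variable instead. This is only a sketch: `load_app_key` and the variable name `EIKON_APP_KEY` are hypothetical and not part of the Eikon library.

```python
import os

# Hypothetical variable name; pick whatever suits your environment.
APP_KEY_ENV_VAR = "EIKON_APP_KEY"


def load_app_key(env_var=APP_KEY_ENV_VAR):
    """Return the application ID stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your application ID")
    return key


# In the notebook, this would replace the hard-coded string:
# ek.set_app_key(load_app_key())
```

Failing early with a clear message is usually easier to diagnose than an error raised later by the first data request.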

We are going to request emerging market new issue (**ISU**) eurobond (**EUB**) news from International Financial Review Emerging EMEA service (**IFREM**), focusing on the notifications of the already priced issues. You can replicate this request in the **News Monitor** app with the following query:

+ `Product:IFREM AND Topic:ISU AND Topic:EUB AND ("PRICED" OR "DEAL")`

In [3]:
from datetime import date

start_date, end_date = date(2016, 1, 1), date.today()
q = "Product:IFREM AND Topic:ISU AND Topic:EUB AND (\"PRICED\" OR \"DEAL\")"
headlines = ek.get_news_headlines(query=q, date_from=start_date, date_to=end_date, count=100)
headlines.head()

| | versionCreated | text | storyId | sourceCode |
|---|---|---|---|---|
| 2017-04-13 07:11:15 | 2017-04-13 07:11:49.650 | PRICED: 4finance USD325m 5NC2 at 10.75%; Leads | urn:newsml:refinitiv.com:20170413:nIFR184tb5:1 | NS:IFR |
| 2017-04-12 19:58:46 | 2017-04-12 20:50:14.731 | PRICED: Saudi Arabia US$9bn 2-tranche deal | urn:newsml:refinitiv.com:20170412:nIFR36WjXM:1 | NS:IFR |
| 2017-04-12 19:58:09 | 2017-04-12 20:49:38.608 | PRICED: Saudi Arabia US$9bn 2-tranche deal | urn:newsml:refinitiv.com:20170412:nIFR1BKY60:1 | NS:IFR |
| 2017-04-11 15:03:51 | 2017-04-11 15:04:33.786 | PRICED: X5 RUB20bn 3yr at 9.25%; Leads | urn:newsml:refinitiv.com:20170411:nIFR3Wb3bN:1 | NS:IFR |
| 2017-04-10 20:43:18 | 2017-04-10 20:48:04.320 | PRICED: Romania E1.75bn 2-tranche deal | urn:newsml:refinitiv.com:20170410:nIFRS9KcK:1 | NS:IFR |
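
The query string used for this request can also be assembled from its parts, which makes it easier to vary the product, topics or keywords later. A minimal sketch, where `build_news_query` is a hypothetical helper and not part of the Eikon library:

```python
def build_news_query(product, topics, keywords):
    """Compose a News Monitor style query from a product code,
    a list of topic codes and a list of headline keywords."""
    parts = [f"Product:{product}"]
    parts += [f"Topic:{topic}" for topic in topics]
    if keywords:
        # Keywords are quoted and OR-ed together inside parentheses
        quoted = " OR ".join(f'"{word}"' for word in keywords)
        parts.append(f"({quoted})")
    return " AND ".join(parts)


q = build_news_query("IFREM", ["ISU", "EUB"], ["PRICED", "DEAL"])
print(q)
```

The resulting string is the same query as the one pasted into News Monitor, so it can be passed to `ek.get_news_headlines` unchanged.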


In the context of news, each story has its own unique identifier, created according to the RFC 3085 standard. Here's what a story looks like. Note that I am using the standard `HTML()` function from the notebook to display it:

In [4]:
from IPython.display import HTML
html = ek.get_news_story('urn:newsml:reuters.com:20170405:nIFR5LpzRX:1')
HTML(html)
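
The story identifiers are NewsML URNs, so they can be split into their components when you need, for example, the date of a story without requesting it. The sketch below assumes the six colon-separated fields described in RFC 3085 (provider, date, item identifier, revision); `StoryUrn` and `parse_story_urn` are hypothetical names introduced for illustration.

```python
from datetime import date, datetime
from typing import NamedTuple


class StoryUrn(NamedTuple):
    provider: str
    date_id: date
    item_id: str
    revision: str


def parse_story_urn(story_id):
    """Split 'urn:newsml:<provider>:<YYYYMMDD>:<item>:<revision>' into its fields."""
    parts = story_id.split(":")
    if len(parts) != 6 or parts[0].lower() != "urn" or parts[1].lower() != "newsml":
        raise ValueError(f"Not a NewsML URN: {story_id!r}")
    provider, date_id, item_id, revision = parts[2:]
    return StoryUrn(provider, datetime.strptime(date_id, "%Y%m%d").date(), item_id, revision)


urn = parse_story_urn("urn:newsml:refinitiv.com:20170413:nIFR184tb5:1")
print(urn)
```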

Now we can parse the data using a regular expression, but first we need to convert the HTML into text. Let's create a function that returns a dictionary from this type of article. I will use the `lxml` library to convert the HTML and `re` to parse its output.

In [5]:
from lxml import html
import re

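def termsheet_to_dict(storyId):
    """Return the '[Field]: value' pairs of a term sheet story as a dictionary."""
    story_html = ek.get_news_story(storyId)
    # Strip the HTML markup, keeping only the text of the story
    story = html.document_fromstring(story_html).text_content()
    # Capture '[key]: value' pairs; a value may span several lines
    matches = dict(re.findall(pattern=r"\[(.*?)\]:\s?([A-Za-z0-9,\-()+/\n\r.%&> ]+)", string=story))
    clean_matches = {key.strip(): item.strip() for key, item in matches.items()}
    return clean_matches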

Let's test it and see if it works:

In [6]:
termsheet_to_dict('urn:newsml:reuters.com:20170323:nIFR9z7ZFL:1')['NOTES']

'EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\nAlfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\nbks closed >750m.'
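
To see how the regular expression behaves without calling the API, it can be run on a short piece of text. The sample below is hypothetical: it is assembled for illustration in the `[Field]: value` layout, with a `NOTES` value abridged from the output above. It uses the character class from `termsheet_to_dict`.

```python
import re

# Hypothetical term sheet text in the "[Field]: value" layout
sample_story = (
    "[Status]: PRICED\r\n"
    "[Size]: EUR 400M\r\n"
    "[NOTES]: EUR400m (from 300m+) 3yr LPN. RegS.\r\nAlfa/ING/UBS(B&D).\r\n"
)

FIELD_PATTERN = r"\[(.*?)\]:\s?([A-Za-z0-9,\-()+/\n\r.%&> ]+)"

# Each match is a (key, value) tuple; strip the trailing line breaks
fields = {key.strip(): value.strip() for key, value in re.findall(FIELD_PATTERN, sample_story)}

for key, value in fields.items():
    print(f"{key!r}: {value!r}")
```

Because the value class includes `\r` and `\n` but not `[`, a value runs across line breaks until the next field starts. This is why multi-line values such as `NOTES` keep their embedded `\r\n`.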

Let's extract all data for all headlines:

In [7]:
from time import sleep

result = []

# Parse every story, keeping only those that contain term sheet fields
for story_id in headlines['storyId']:
    terms = termsheet_to_dict(story_id)
    if terms:
        result.append(terms)
    # Pause for half a second between requests
    sleep(0.5)

df = pd.DataFrame(result)
df.head()

| | 1st Pay | Asset Type | Bookrunners | Business | CUSIP/ISIN | Call | Country | Coupon | DBRS | Denoms | ... | Sector | Settledate | Size | Spread | Stabilis | Status | Tenor/Mty | Total | UOP | Yield |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | High Yield | Stifel | | XS1597294781 / | 01-May-19 | LATVIA | 10.750 Fixed | | 200k/1k | ... | Financials-Diversified | 28-Apr-17 | USD 325M | T+894 | | PRICED | 5yr 01-May-22 | | | 10.75 |
| 1 | | Islamic | | | | | SAUDI ARABIA | 3.628 Fixed | | 200k/1k | ... | | 20-Apr-17 | USD 4.5BN | MS+140 | | PRICED | 10yr 20-Apr-27 | | | 3.628\r\nSukuk |
| 2 | | Islamic | | | | | SAUDI ARABIA | 2.894 Fixed | | 200k/1k | ... | | 20-Apr-17 | USD 4.5BN | MS+100 | | PRICED | 5yr 20-Apr-22 | | | 2.894\r\nSukuk |
| 3 | | High Yield | GS/UBS/VTB | | XS1598697412 Trade House PEREKRIOS... | | RUSSIA | 9.250 Fixed | | 10m/100k | ... | Cons Staples-Retailing Limited Liability... | 18-Apr-17 | RUB 20BN | | | PRICED | 3yr 18-Apr-20 | | | 9.25 |
| 4 | | Investment Grade | Barc/Citi/Erste/ING/SG | | XS1313004928 / | | ROMANIA | 3.875 | | 1k+1k | ... | | 19-Apr-17 | EUR 750M | | | PRICED | 18.5yr 29-Oct-35 | 2BN | | 3.55\r\nSr Unsec Notes |
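
All extracted values are strings, so numeric analysis needs a conversion step first. As an illustration, the `Size` column could be split into a currency and an amount. This is a sketch under the assumption that only the `M` and `BN` suffixes seen in the output above occur; `parse_size` is a hypothetical helper.

```python
import re

# Assumption: only the "M" and "BN" suffixes seen in the sample output occur
MULTIPLIERS = {"M": 1_000_000, "BN": 1_000_000_000}
SIZE_PATTERN = re.compile(r"^([A-Z]{3})\s+(\d+(?:\.\d+)?)\s*(M|BN)$")


def parse_size(size):
    """Split a Size value such as 'USD 4.5BN' into (currency, amount).

    Returns None when the text does not follow the expected layout."""
    match = SIZE_PATTERN.match(size.strip())
    if match is None:
        return None
    currency, number, suffix = match.groups()
    return currency, float(number) * MULTIPLIERS[suffix]


for raw in ["USD 325M", "USD 4.5BN", "RUB 20BN", "EUR 750M"]:
    print(raw, "->", parse_size(raw))

# In the notebook, missing values would have to be dropped first, e.g.:
# sizes = df['Size'].dropna().map(parse_size)
```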


Now that we have the dataframe in place, we can compute simple statistics on our data. For instance, how many of the reported issues were Investment Grade versus High Yield?

In [8]:
df['Asset Type'].value_counts()

Investment Grade    47
High Yield          24
Islamic             10
Covered              2
Name: Asset Type, dtype: int64

What about a specific country?

In [9]:
df[df['Country']=='RUSSIA']['Asset Type'].value_counts()

Investment Grade    6
High Yield          6
Name: Asset Type, dtype: int64
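
Instead of filtering one country at a time, the same counts can be laid out for all countries at once with `pd.crosstab`. The rows below are hypothetical stand-ins for the parsed term sheets, so the sketch runs without an Eikon connection:

```python
import pandas as pd

# Hypothetical rows standing in for the parsed term sheets
sample = pd.DataFrame({
    "Country": ["RUSSIA", "RUSSIA", "RUSSIA", "ROMANIA", "SAUDI ARABIA"],
    "Asset Type": ["High Yield", "Investment Grade", "High Yield",
                   "Investment Grade", "Islamic"],
})

# One row per country, one column per asset type, cells hold the issue counts
by_country = pd.crosstab(sample["Country"], sample["Asset Type"])
print(by_country)
```

With the real data, `pd.crosstab(df['Country'], df['Asset Type'])` would give the equivalent table.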

You can experiment further by changing the original headline search query, for example, by including a RIC in your request.
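
One way to add a RIC is to prepend it to the existing query. The sketch below assumes the `R:` prefix that Eikon news queries use to filter by RIC. `restrict_to_ric` is a hypothetical helper, and `SBER.MM` is only an illustrative instrument code.

```python
BASE_QUERY = 'Product:IFREM AND Topic:ISU AND Topic:EUB AND ("PRICED" OR "DEAL")'


def restrict_to_ric(query, ric):
    """Prepend a RIC filter to an existing news query."""
    return f"R:{ric} AND {query}"


# "SBER.MM" is only an illustrative RIC; substitute the instrument you follow
q_ric = restrict_to_ric(BASE_QUERY, "SBER.MM")
print(q_ric)

# In the notebook, the new query would be passed on as before:
# headlines = ek.get_news_headlines(query=q_ric, count=100)
```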