## Index sectoral analysis
This notebook demonstrates an analysis of the current and historical sectoral distribution of an index.

#### Learn more

To learn more about the Data Library for Python please join the LSEG Developer Community. By [registering](https://developers.lseg.com/iam/register) and [logging in](https://developers.lseg.com/content/devportal/en_us/initCookie.html) to the LSEG Developer Community portal you will have free access to a number of learning materials such as 
 [Quick Start guides](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-library-for-python/quick-start), 
 [Tutorials](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-library-for-python/tutorials), 
 [Documentation](https://developers.lseg.com/en/api-catalog/refinitiv-data-platform/refinitiv-data-library-for-python/documentation)
 and much more.

#### Getting Help and Support

If you have any questions regarding using the API, please post them on 
this [Q&A Forum](https://community.developers.refinitiv.com/spaces/321/index.html). 
The LSEG Developer Community will be happy to help. 

## Imports

In [1]:
import pandas as pd
import refinitiv.data as rd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
rd.open_session()

<refinitiv.data.session.Definition object at 0x168f44a00 {name='workspace'}>

Below we define the index for the analysis

In [2]:
index_ric = '.FTSE'

## Sectoral analysis: Current

#### Request Index constituents with sectors and performance metrics

In [3]:
constituent_sectors = rd.get_data(
        universe = f'0#{index_ric}', 
        fields= [
            "TR.TRBCEconomicSector", "TR.IndexConstituentWeightPercent", 
            "TR.CompanyMarketCapitalization", "TR.TotalReturn"
            ]
)
constituent_sectors

Unnamed: 0,Instrument,TRBC Economic Sector Name,Weight percent,Company Market Capitalization,Total Return
0,STAN.L,Financials,0.822004,20147863273.0508,-0.462844
1,CRDA.L,Basic Materials,0.308591,6444310223.58307,-1.080147
2,ANTO.L,Basic Materials,0.386524,23125178226.009201,2.894034
3,EZJ.L,Industrials,0.143209,3531444805.64024,1.398907
4,BNZL.L,Industrials,0.486676,10233102219.8265,-1.257445
...,...,...,...,...,...
95,ULVR.L,Consumer Non-Cyclicals,5.105034,107833482074.332001,-0.931099
96,OCDO.L,Consumer Cyclicals,0.120173,3428115930.00121,9.791332
97,LSEG.L,Financials,2.146384,49491667037.5243,-1.673102
98,TSCO.L,Consumer Non-Cyclicals,1.046369,22182095144.534801,-1.238095


#### Group the dataset to prepare for plotting

In [4]:
constituent_sectors_grouped  = constituent_sectors.groupby(by = "TRBC Economic Sector Name", ).agg(
    {'Instrument': 'count', 
     'Total Return':'sum', 
     'Weight percent': 'sum',
     'Company Market Capitalization':'sum'}).reset_index()
constituent_sectors_grouped["MarketCap in Total"] = constituent_sectors_grouped['Company Market Capitalization']/constituent_sectors_grouped['Company Market Capitalization'].sum() * 100
constituent_sectors_grouped

Unnamed: 0,TRBC Economic Sector Name,Instrument,Total Return,Weight percent,Company Market Capitalization,MarketCap in Total
0,Basic Materials,9,3.262335,9.106354,248062590404.4625,11.056099
1,Consumer Cyclicals,19,-10.568192,7.651483,167926038413.7562,7.48443
2,Consumer Non-Cyclicals,12,-10.892109,15.377215,345993002203.1225,15.420838
3,Energy,2,1.134795,12.661515,264226230635.03088,11.77651
4,Financials,20,-2.617351,19.909606,440785429432.4352,19.645717
5,Healthcare,6,-6.190343,13.828471,311302192900.07385,13.874676
6,Industrials,15,-5.148191,10.561327,224051347049.83496,9.985923
7,Real Estate,3,0.672324,0.965875,21329464230.02579,0.95065
8,Technology,9,-3.653552,6.011182,136893854169.7222,6.101331
9,Utilities,5,-6.526675,3.489884,83101686703.7238,3.703825
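
The grouping step above can be tried without a data session. The following is a minimal sketch on a small synthetic dataframe; the instrument codes and values are made up for illustration and are not index data. It applies the same `groupby`/`agg` pattern and then derives each sector's share of the total market capitalization.

```python
import pandas as pd

# Synthetic constituents; values are illustrative only, not real index data.
sample = pd.DataFrame({
    "Instrument": ["AAA.L", "BBB.L", "CCC.L", "DDD.L"],
    "TRBC Economic Sector Name": ["Energy", "Energy", "Financials", "Financials"],
    "Weight percent": [10.0, 5.0, 20.0, 15.0],
    "Company Market Capitalization": [100.0, 50.0, 200.0, 150.0],
})

# Count instruments and sum weights and market caps within each sector.
grouped = sample.groupby("TRBC Economic Sector Name").agg(
    {"Instrument": "count",
     "Weight percent": "sum",
     "Company Market Capitalization": "sum"}
).reset_index()

# Share of each sector in the total market capitalization, in percent.
total_cap = grouped["Company Market Capitalization"].sum()
grouped["MarketCap in Total"] = grouped["Company Market Capitalization"] / total_cap * 100
print(grouped)
```

By construction the `MarketCap in Total` column sums to 100, which is a quick sanity check on the real output as well.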


#### Plot number of instruments and the sum of market capitalization per sector

In [5]:
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'pie'}, {'type':'pie'}]])

fig.add_trace(
    go.Pie(labels=constituent_sectors_grouped['TRBC Economic Sector Name'], 
           values=constituent_sectors_grouped['Instrument'],
           name='Number of Instruments'),
    row=1, col=1
)

fig.add_trace(
    go.Pie(labels=constituent_sectors_grouped['TRBC Economic Sector Name'], 
           values=constituent_sectors_grouped['Company Market Capitalization'], 
           name='Market Capitalization'),
    row=1, col=2
)

fig.update_layout(
    width=1000,
    height=500,
    annotations=[
        dict(text='Number of Instruments', x=0.1, xref='paper', y=1.15, yref='paper', showarrow=False, font=dict(size=16)),
        dict(text='Market Capitalization', x=0.9, xref='paper', y=1.15, yref='paper', showarrow=False, font=dict(size=16))
    ]
)

fig.show()


#### Plot the total return by sector

In [6]:
categories = constituent_sectors_grouped['TRBC Economic Sector Name']
values = constituent_sectors_grouped['Total Return']
colors = ['green' if x > 0 else 'red' for x in values]

fig = go.Figure(data=[go.Bar(
    x=categories,
    y=values,
    marker_color=colors,
)])

fig.update_layout(title_text='Total return sum by economic sector',
                  xaxis_title="TRBC Economic Sector Name",
                  yaxis_title="Total Return Sum",
                  plot_bgcolor="white",
                  width=1000, height=500)

fig.show()

#### Plot a treemap with tile size of Market Cap and color of total return

In [7]:
max_abs_return = max(constituent_sectors_grouped['Total Return'].abs())
constituent_sectors_grouped['Color'] = constituent_sectors_grouped['Total Return'] / max_abs_return

colorscale = [
    [0, "red"],   
    [0.5, "white"],
    [1, "green"]
]

fig = go.Figure(go.Treemap(
    labels=constituent_sectors_grouped['TRBC Economic Sector Name'],
    parents=[""]*len(constituent_sectors_grouped),
    values=constituent_sectors_grouped['Company Market Capitalization'],
    textinfo="label+value+percent entry",
    hoverinfo="label+value+percent entry+text",
    hovertext=constituent_sectors_grouped['Total Return'],
    marker=dict(
        colors=constituent_sectors_grouped['Color'],
        colorscale=colorscale,
        cmid=0 
    )
))

fig.update_layout(margin=dict(t=50, l=25, r=25, b=25), 
                  title = 'Treemap for sector performance (tile size - Market Cap, color - Return)',
                  width=1000, height=500)
fig.show()
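
The `Color` column used by the treemap is the total return divided by the largest absolute return, which maps every sector to the range [-1, 1] while keeping its sign. A minimal sketch of that scaling on made-up returns (the sector labels and numbers below are assumptions, not index data):

```python
import pandas as pd

# Made-up sector returns, for illustration only.
returns = pd.Series([3.0, -10.0, 1.0, -5.0], index=["A", "B", "C", "D"])

# Scale by the largest absolute value so the result lies in [-1, 1].
max_abs = returns.abs().max()
scaled = returns / max_abs
print(scaled)
```

With `cmid=0` in the marker settings, a scaled value of 0 sits at the middle of the colorscale, so negative returns shade towards red and positive ones towards green.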

## Sectoral analysis: Historical

#### Get historical series of index constituents using a custom function

Below we use a helper module which builds the historical constituents of a given equity index. The article describing the Python object used below can be found [here](https://developers.lseg.com/en/article-catalog/article/building-historical-index-constituents).

In [18]:
from helper_index_constutents import IndexConstituents
ic = IndexConstituents()

In [9]:
index_historical  = ic.get_historical_constituents(index_ric,  start = '2000-01-01', end='2024-05-29')
index_historical

Unnamed: 0,Date,RIC
0,2000-01-01,III.L
1,2000-01-01,ABF.L
2,2000-01-01,SAB.L^J16
3,2000-01-01,ALLL.L^J08
4,2000-01-01,ALLD.L^G05
...,...,...
21954,2024-03-18,MKS.L
21955,2024-03-18,HWDN.L
21956,2024-03-18,ICGIN.L
21957,2024-03-18,PSN.L


#### Request sectors and performance metrics for historical constituents

In [10]:
sectors = rd.get_data(
        universe = list(index_historical['RIC'].unique()), 
        fields = [
            "TR.TRBCEconomicSector", 
            "TR.InstrumentMarketCapitalization", 
            "TR.TotalReturn"]
).rename(columns={'Instrument': 'RIC'})
sectors

Unnamed: 0,RIC,TRBC Economic Sector Name,Instrument Market Capitalization,Total Return
0,III.L,Financials,28812591297.946301,-0.949153
1,ABF.L,Consumer Non-Cyclicals,20157702275.950901,-2.975753
2,SAB.L^J16,Consumer Non-Cyclicals,73013234389.789993,0.0
3,ALLL.L^J08,Financials,989475626.88,-7.692308
4,ALLD.L^G05,Consumer Non-Cyclicals,,0.0
...,...,...,...,...
285,UTG.L,Real Estate,4151572921.79506,0.921409
286,HLN.L,Healthcare,30107261092.9104,0.278638
287,BEZG.L,Financials,4421668856.78936,-1.635688
288,DWL.L,Consumer Cyclicals,1016843326.56739,5.032823


#### Merge index historical constituents with the ingested dataframe and group by sector

In [11]:
sector_df = index_historical.merge(sectors, on = 'RIC')
sector_df_grouped = sector_df.groupby(by = ["Date", "TRBC Economic Sector Name"]).count()
sector_df_grouped = sector_df_grouped.reset_index()
sector_df_grouped

Unnamed: 0,Date,TRBC Economic Sector Name,RIC,Instrument Market Capitalization,Total Return
0,2000-01-01,Basic Materials,8,4,8
1,2000-01-01,Consumer Cyclicals,14,11,14
2,2000-01-01,Consumer Non-Cyclicals,12,11,12
3,2000-01-01,Energy,3,3,3
4,2000-01-01,Financials,25,16,25
...,...,...,...,...,...
2165,2024-03-18,Healthcare,6,6,6
2166,2024-03-18,Industrials,15,15,15
2167,2024-03-18,Real Estate,3,3,3
2168,2024-03-18,Technology,9,9,9
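
In the grouped output above, the `RIC` column is the number of constituents per date and sector, while the other columns can be smaller because `count()` skips missing values. A minimal sketch on synthetic data (the RICs, dates and values are made up for illustration) shows the same merge and count pattern:

```python
import numpy as np
import pandas as pd

# Synthetic constituent history: one row per (date, RIC).
history = pd.DataFrame({
    "Date": ["2000-01-01", "2000-01-01", "2000-01-01", "2001-01-01", "2001-01-01"],
    "RIC": ["AAA.L", "BBB.L", "CCC.L", "AAA.L", "CCC.L"],
})

# Synthetic attributes: one row per RIC, with one missing market cap.
attrs = pd.DataFrame({
    "RIC": ["AAA.L", "BBB.L", "CCC.L"],
    "TRBC Economic Sector Name": ["Energy", "Energy", "Financials"],
    "Instrument Market Capitalization": [100.0, np.nan, 200.0],
})

# Attach the sector to every historical row, then count rows per date and sector.
merged = history.merge(attrs, on="RIC")
counts = merged.groupby(["Date", "TRBC Economic Sector Name"]).count().reset_index()
print(counts)
```

For 2000-01-01 the Energy sector has a `RIC` count of 2 but a market capitalization count of 1, because one of the two values is missing.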


#### Plot the change in number of constituents per sector historically

In [12]:
_sectors = ['Basic Materials','Consumer Cyclicals','Consumer Non-Cyclicals','Technology',
            'Energy', 'Financials','Healthcare','Industrials','Real Estate','Utilities']
fig = px.area(sector_df_grouped[sector_df_grouped['TRBC Economic Sector Name'].isin(_sectors)], 
              x="Date", y="RIC", color="TRBC Economic Sector Name", width=1400, height=600)

fig.update_layout(
    xaxis=dict(
        type='category',
        categoryorder='array',
    ),
    title = 'Number of constituents per sector historically',
    width=1000, height=500
)
fig.show()

#### Request sectors and performance metrics as of the change date 

Below we request the economic sector, company market capitalization and 52-week total return for the historical constituents as of each index constituent change date. We wrap the call in a try/except statement to handle possible API exceptions. We also request only part of the history (the last 28 change dates), which is enough to showcase the results and produce the subsequent plots.

In [None]:
max_steps = 3  # maximum number of attempts per date
const_df = pd.DataFrame()
subset = 28 # to request the data for only some part of the history
for date in index_historical['Date'].unique()[-subset:]:
    rics  = index_historical[index_historical['Date'] == date]['RIC'].to_list()
    step = 0  # reset the attempt counter for each date
    while step < max_steps:
        print(date)
        try: 
            df = rd.get_data(
                    universe = rics, 
                    fields = ["TR.TRBCEconomicSector", "TR.CompanyMarketCap", "TR.TotalReturn52Wk"], 
                    parameters={'SDate': date})
            df['Date'] = date
            const_df = pd.concat([const_df, df])
            break
        except rd.errors.RDError as e: 
            print("RDError:", e) 
            step += 1
const_df.dropna(inplace=True)
const_df['52 Week Total Return'] = pd.to_numeric(const_df['52 Week Total Return'], errors='coerce')
const_df
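
The retry pattern used in the loop above can be isolated into a small helper. The following is a sketch only: `fetch_with_retries` and `flaky_fetch` are hypothetical names, and `RuntimeError` stands in for `rd.errors.RDError` so that the example runs without a data session.

```python
import pandas as pd

def fetch_with_retries(fetch, max_steps=3):
    """Call fetch() up to max_steps times; return its result, or None if all attempts fail."""
    for attempt in range(1, max_steps + 1):
        try:
            return fetch()
        except RuntimeError as e:  # stand-in for rd.errors.RDError
            print(f"attempt {attempt} failed: {e}")
    return None

# Hypothetical data source that fails twice before succeeding.
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return pd.DataFrame({"Instrument": ["AAA.L"], "Value": [1.0]})

# Collect successful results in a list and concatenate once at the end.
frames = []
result = fetch_with_retries(flaky_fetch)
if result is not None:
    frames.append(result)
const_sample = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
print(const_sample)
```

Because the attempt counter lives inside the helper, each call starts with a fresh budget of attempts, so a failure on one date does not reduce the attempts available for the next.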

#### Group the output by sector and plot the market capitalization and the 52Wk total return changes historically

In [54]:
const_df_grouped = const_df.groupby(by = ["Date", "TRBC Economic Sector Name"]).sum(numeric_only=True)
const_df_grouped = const_df_grouped.reset_index()
const_df_grouped

Unnamed: 0,Date,TRBC Economic Sector Name,Company Market Cap,52 Week Total Return
0,2019-12-23,Basic Materials,316195955616.388306,312.595567
1,2019-12-23,Consumer Cyclicals,194473080450.392242,725.354251
2,2019-12-23,Consumer Non-Cyclicals,407264725293.099976,174.093838
3,2019-12-23,Energy,455668237470.640259,15.046671
4,2019-12-23,Financials,414629256632.435547,571.203222
...,...,...,...,...
345,2024-03-18,Healthcare,276264460454.218994,59.618271
346,2024-03-18,Industrials,209091730599.307007,487.489147
347,2024-03-18,Real Estate,20164431269.571999,32.490265
348,2024-03-18,Technology,131460546720.458405,82.252278


In [57]:
fig = px.area(const_df_grouped[const_df_grouped['TRBC Economic Sector Name'].isin(_sectors)],
               x="Date", y="Company Market Cap", color="TRBC Economic Sector Name", width=1400, height=600)

fig.update_layout(
    xaxis=dict(
        type='category',
        categoryorder='array',
    ),
    title = 'Total market capitalization per sector historically',
    width=1000, height=500
)
fig.show()

In [58]:
fig = px.bar(const_df_grouped[const_df_grouped['TRBC Economic Sector Name'].isin(_sectors)],
              x="Date", y="52 Week Total Return", color="TRBC Economic Sector Name", width=1400, height=600)

fig.update_layout(
    xaxis=dict(
        type='category',
        categoryorder='array',
    ),
    title = '52Wk Total return per sector historically',
    width=1000, height=500
)
fig.show()

In [None]:
rd.close_session()