## Limits
RDP Search limits the size of the result set returned for a request, which matters when you are requesting large data sets.  The following examples demonstrate some useful techniques for dealing with results that reach the upper limits imposed by the backend.

#### NuGet Packages

In [1]:
#r "nuget:Refinitiv.DataPlatform.Content, 1.0.0-alpha3"
#r "nuget:Microsoft.Data.Analysis"

Installed package Microsoft.Data.Analysis version 0.4.0

Installed package Refinitiv.DataPlatform.Content version 1.0.0-alpha3

In [2]:
using Newtonsoft.Json.Linq;
using Refinitiv.DataPlatform.Content.SearchService;
using Refinitiv.DataPlatform.Core;
using Microsoft.Data.Analysis;
using Microsoft.AspNetCore.Html;
using System.Linq;
using System;

#### Table Output
Helper routine to output data in a table format.

In [3]:
Formatter.Register<IList<JObject>>((hits, writer) =>
{
    if (hits.Count > 0)
    {
        var fields = new List<String>();
        var rows = new List<ICollection<IHtmlContent>>();
        var data = new Dictionary<string, IHtmlContent>();

        foreach(var hit in hits)
        {
            var cells = new List<IHtmlContent>();
            foreach (var val in hit.Properties())
            {
                if ( !fields.Any(item => item.Equals(val.Name)) )
                    fields.Add(val.Name);
                data[val.Name] = td(val.Value.ToString());
            }
            rows.Add(new List<IHtmlContent>(data.Values));
            data.Keys.ToList().ForEach(x => data[x] = td(""));
        }
        
        var headers = new List<IHtmlContent>();
        headers.AddRange(fields.Select(c => (IHtmlContent)th(c)));

        var t = table(thead(headers), tbody(rows.Select(r => tr(r))));
        writer.Write(t);
    }
}, "text/html");

In [4]:
// Create a session into the desktop
var session = CoreFactory.CreateSession(new DesktopSession.Params()
                            .AppKey("Your API Key here")
                            .OnState((s, state, msg) => Console.WriteLine($"{DateTime.Now}:{msg}. (State: {state})"))
                            .OnEvent((s, eventCode, msg) => Console.WriteLine($"{DateTime.Now}:{msg}. (Event: {eventCode})")));
session.Open();

17/12/2020 12:21:44 PM:Session is Pending. (State: Pending)
17/12/2020 12:21:44 PM:{
  "Contents": "Desktop Session Successfully Authenticated"
}. (Event: SessionAuthenticationSuccess)
17/12/2020 12:21:44 PM:Session is Opened. (State: Opened)


#### Grouping
There may be instances where the result set contains the same property values repeated many times, depending on your request.  For example, if I'm interested in retrieving all exchanges within the USA, I can execute this request:

In [5]:
var response = Search.Definition(Search.View.EquityQuotes).Filter("RCSExchangeCountryLeaf eq 'United States'")
                                                          .Top(10000)
                                                          .Select("ExchangeCode, RIC")
                                                          .GetData();
response.Data.Total

In [6]:
response.Data.Hits

ExchangeCode,RIC
IOM,EScv1
IOM,NQcv1
IOM,ESc1
IOM,SPc1
IOM,NQc1
IOM,NKc1
CBT,YMc1
IOM,SPv1
CBF,VXc1
IOM,SPcv1


In the above example, you can see the total available documents is over 4,000,000.  However, because many instruments share the same exchange, the exchange codes are repeated, so the response reaches the upper limit of documents while repeating the same codes many times.  **Note**: At the time of this writing, the upper limit is 10,000 documents per result set.
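To make the repetition concrete, the ten hits shown above can be collapsed on the client. The following is a minimal sketch that hard-codes those ten rows as tuples (in the notebook they would be read from `response.Data.Hits`) and uses LINQ to extract the distinct exchange codes. The variable names are illustrative only.

```csharp
using System;
using System.Linq;

// The ten hits shown above, hard-coded as (ExchangeCode, RIC) tuples for illustration.
// In the notebook these values would be read from response.Data.Hits.
var hits = new (string ExchangeCode, string RIC)[]
{
    ("IOM", "EScv1"), ("IOM", "NQcv1"), ("IOM", "ESc1"), ("IOM", "SPc1"), ("IOM", "NQc1"),
    ("IOM", "NKc1"), ("CBT", "YMc1"), ("IOM", "SPv1"), ("CBF", "VXc1"), ("IOM", "SPcv1")
};

// Keep one entry per exchange code, sorted so the output is deterministic.
var uniqueCodes = hits.Select(h => h.ExchangeCode)
                      .Distinct()
                      .OrderBy(c => c, StringComparer.Ordinal)
                      .ToList();

Console.WriteLine($"{hits.Length} hits contain {uniqueCodes.Count} unique exchange codes: {string.Join(", ", uniqueCodes)}");
```

Ten documents yield only three distinct codes here, which is why de-duplicating on the client across many requests is a poor way to build the list of exchanges.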

Instead of performing multiple calls and pulling out the unique codes within each result set, I can apply the grouping features offered by Search to significantly reduce the result set returned.  For example:

In [7]:
response = Search.Definition(Search.View.EquityQuotes).Filter("RCSExchangeCountryLeaf eq 'United States'")
                                                      .Top(10000)
                                                      .Select("ExchangeCode")
                                                      .GroupBy("ExchangeCode") // Exchange codes can be grouped
                                                      .GroupCount(1)           // Then limited to 1 for each to create uniqueness
                                                      .GetData();
response.Data.Hits

ExchangeCode
IOM
CBT
CBF
NSQ
NYQ
NAQ
NMQ
ASQ
IMM
PNK


As you can see, grouping significantly reduces the result set, which now allows the unique exchange codes to be retrieved using a single API call.  Using the 'grouping' technique to pull out the unique exchange codes is very useful if you wish to return other properties as part of your results.  However, if you are strictly after the list of exchange codes, the preferred approach is to use Navigators.

#### Navigators
If the goal of your search is simply to capture the list of exchange codes, then the preferred approach is to use Navigators.  A navigator categorizes and summarizes a property across the result set.  For example, I can define a simple navigator that buckets all the exchange codes found within the result set.  You can do this using the following request:

In [8]:
response = Search.Definition(Search.View.EquityQuotes).Filter("RCSExchangeCountryLeaf eq 'United States'")
                                                      .Top(0)
                                                      .Navigators("ExchangeCode(buckets:1000)")
                                                      .GetData();

In [9]:
var code = response.Data.Navigators["ExchangeCode"]["Buckets"];
Console.WriteLine($"Total exchange codes found: {code.Count()}");

Total exchange codes found: 146


In [10]:
Console.WriteLine(code);

[
  {
    "Label": "ONE",
    "Count": 1441126
  },
  {
    "Label": "OPQ",
    "Count": 1375006
  },
  {
    "Label": "IOM",
    "Count": 821584
  },
  {
    "Label": "PNK",
    "Count": 70297
  },
  {
    "Label": "CBT",
    "Count": 57899
  },
  {
    "Label": "OBB",
    "Count": 32934
  },
  {
    "Label": "OTC",
    "Count": 22446
  },
  {
    "Label": "BOS",
    "Count": 18424
  },
  {
    "Label": "THM",
    "Count": 17958
  },
  {
    "Label": "XPH",
    "Count": 15781
  },
  {
    "Label": "MID",
    "Count": 14705
  },
  {
    "Label": "PSE",
    "Count": 14635
  },
  {
    "Label": "NYS",
    "Count": 14237
  },
  {
    "Label": "NYQ",
    "Count": 12718
  },
  {
    "Label": "CIN",
    "Count": 12685
  },
  {
    "Label": "NMS",
    "Count": 10991
  },
  {
    "Label": "NAS",
    "Count": 10176
  },
  {
    "Label": "NTV",
    "Count": 10173
  },
  {
    "Label": "BZX",
    "Count": 10066
  },
  {

#### Segmenting the search
When we started with the above search to retrieve the list of exchange codes within the United States, we discovered that the request matched the entire universe of instruments.  If our goal is to capture the entire instrument list, we cannot group and bucket the result set as we did above.  The number of hits is over 4 million, so we are forced to go through the tedious process of segmenting the requests.

One way to do this is to choose some kind of indicator that will allow you to group your individual requests to successfully segment the result set.  However, you need to first ask yourself - do I need the entire data universe?  You may only be interested in a specific asset category thus reducing the universe of results significantly.

One possible way to approach this is to first capture the list of asset categories using a navigator on the property 'RCSAssetCategoryLeaf'.  For example:

In [11]:
response = Search.Definition(Search.View.EquityQuotes).Filter("RCSExchangeCountryLeaf eq 'United States'")
                                                      .Top(0)
                                                      .Navigators("RCSAssetCategoryLeaf")
                                                      .GetData();
Console.WriteLine(response.Data.Navigators["RCSAssetCategoryLeaf"]["Buckets"]);

[
  {
    "Label": "Equity Future",
    "Count": 1459469
  },
  {
    "Label": "Equity Cash Option",
    "Count": 1428468
  },
  {
    "Label": "Stock Index Future Option",
    "Count": 705359
  },
  {
    "Label": "Ordinary Share",
    "Count": 375401
  },
  {
    "Label": "Stock Index Cash Option",
    "Count": 72799
  },
  {
    "Label": "American Depository Receipt",
    "Count": 27733
  },
  {
    "Label": "Unit",
    "Count": 23021
  },
  {
    "Label": "Equity Future Option",
    "Count": 21076
  },
  {
    "Label": "Preferred Share",
    "Count": 18503
  },
  {
    "Label": "Equity Future Spread",
    "Count": 17671
  },
  {
    "Label": "Stock Index Future",
    "Count": 16917
  },
  {
    "Label": "Preference Share",
    "Count": 12770
  },
  {
    "Label": "Company Warrant",
    "Count": 8201
  },
  {
    "Label": "Depository Receipt",
    "Count": 8018
  },
  {
    "Label": "Depository Share",
    "Count": 5974
  }

The result not only provides the complete list of categories from which you can select the ones you want, but also shows the number of results for each.  These totals allow you to further tune your requests.
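As a rough illustration of tuning requests from these totals, the sketch below compares each category count listed in the output above against the limit. It assumes the 10,000-document limit noted earlier still applies, and it hard-codes the labels and counts rather than reading them from `response.Data.Navigators`.

```csharp
using System;
using System.Linq;

// Assumption: 10,000 is the per-request document limit noted earlier in this section.
const int Limit = 10000;

// Bucket labels and counts copied from the navigator output above.
var categories = new (string Label, int Count)[]
{
    ("Equity Future", 1459469),
    ("Equity Cash Option", 1428468),
    ("Stock Index Future Option", 705359),
    ("Ordinary Share", 375401),
    ("Stock Index Cash Option", 72799),
    ("American Depository Receipt", 27733),
    ("Unit", 23021),
    ("Equity Future Option", 21076),
    ("Preferred Share", 18503),
    ("Equity Future Spread", 17671),
    ("Stock Index Future", 16917),
    ("Preference Share", 12770),
    ("Company Warrant", 8201),
    ("Depository Receipt", 8018),
    ("Depository Share", 5974)
};

// Categories at or under the limit can be pulled with one request;
// the rest need an additional criterion to split them up.
var fits = categories.Where(c => c.Count <= Limit).ToList();
var needsSplit = categories.Where(c => c.Count > Limit).ToList();

Console.WriteLine($"Retrievable in a single request ({fits.Count}):");
foreach (var c in fits) Console.WriteLine($"  {c.Label}: {c.Count}");

Console.WriteLine($"Need further segmentation ({needsSplit.Count}):");
foreach (var c in needsSplit) Console.WriteLine($"  {c.Label}: {c.Count}");
```

Of the fifteen categories listed, only three fall under the limit, so most categories need a second criterion to break them into smaller pieces.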

However, the above summary shows many categories that easily exceed the limits of the server.  If you need to segment further, one option is to use the ***market cap*** to split a specific asset category.

For example, let's choose an asset category where we can get a breakdown of the market cap:

In [12]:
// The following navigator will prepare the buckets of evenly distributed market cap ranges such that they fulfill 
// the limit requirements.  Below, I chose 12 as this will produce reasonable buckets we can work with.
var filter = "RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf xeq 'Ordinary Share'";

response = Search.Definition(Search.View.EquityQuotes).Filter(filter)
                                                      .Top(0)
                                                      .Navigators("MktCapTotal(type:range, buckets:12)")
                                                      .GetData();
Console.WriteLine(response.Data.Navigators["MktCapTotal"]["Buckets"]);

[
  {
    "Label": "Below 2688575.25",
    "Filter": "MktCapTotal lt 2688575.25",
    "Count": 9161
  },
  {
    "Label": "Between 2688575.25 And 17051354.97",
    "Filter": "(MktCapTotal ge 2688575.25 and MktCapTotal lt 17051354.97)",
    "Count": 9154
  },
  {
    "Label": "Between 17051354.97 And 51069177.22",
    "Filter": "(MktCapTotal ge 17051354.97 and MktCapTotal lt 51069177.22)",
    "Count": 9175
  },
  {
    "Label": "Between 51069177.22 And 120324187.24",
    "Filter": "(MktCapTotal ge 51069177.22 and MktCapTotal lt 120324187.24)",
    "Count": 9157
  },
  {
    "Label": "Between 120324187.24 And 242977119.83",
    "Filter": "(MktCapTotal ge 120324187.24 and MktCapTotal lt 242977119.83)",
    "Count": 9160
  },
  {
    "Label": "Between 242977119.83 And 439735874.33",
    "Filter": "(MktCapTotal ge 242977119.83 and MktCapTotal lt 439735874.33)",
    "Count": 9177
  },
  {
    "Label": "Between 439735874.33 And 802788332.12",
    "Filter": "(

The first thing to note is that the 'Count' values for each bucket are within the valid limit of the server.  Based on this output, we can use the convenient Filter expressions provided to drive our segmented search requests.
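One way to turn those bucket expressions into segmented requests is to append each one to the base filter. The sketch below is a minimal illustration using the first three `Filter` values from the output above, hard-coded as strings. In the notebook they would be read from `response.Data.Navigators["MktCapTotal"]["Buckets"]`.

```csharp
using System;
using System.Linq;

// Base filter used in the request above.
var baseFilter = "RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf xeq 'Ordinary Share'";

// The first three bucket 'Filter' expressions, copied from the navigator output above.
var bucketFilters = new[]
{
    "MktCapTotal lt 2688575.25",
    "(MktCapTotal ge 2688575.25 and MktCapTotal lt 17051354.97)",
    "(MktCapTotal ge 17051354.97 and MktCapTotal lt 51069177.22)"
};

// Combine the base filter with each range to get one filter string per segment.
var segmentFilters = bucketFilters.Select(range => $"{baseFilter} and {range}").ToList();

foreach (var f in segmentFilters)
    Console.WriteLine(f);
```

Each resulting string can be passed to `.Filter(...)` as-is, giving one request per market cap range.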

For demonstration purposes, I will select one to retrieve the list of RICs for the specific asset category with the specified market cap range.

In [13]:
// Define our filter
var range1 = response.Data.Navigators["MktCapTotal"]["Buckets"][1]["Filter"];
filter = $"RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf xeq 'Ordinary Share' and {range1}";
filter

RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf xeq 'Ordinary Share' and (MktCapTotal ge 2688575.25 and MktCapTotal lt 17051354.97)

In [14]:
response = Search.Definition(Search.View.EquityQuotes).Filter(filter)
                                                      .Top(0)
                                                      .GetData();
Console.WriteLine($"Request resulted in a segment of {response.Data.Total} documents.");

Request resulted in a segment of 9154 documents.


Based on the buckets I defined, I can now safely use a filter to pull out a segment of instruments.  Although a combination of navigators and filters makes it convenient to decide how to break up the segments and stay within these limits, the work involved is still relatively complicated.
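The overall shape of that work can be sketched as a loop over the segment filters. The example below is only an outline under stated assumptions: `FetchSegment` is a hypothetical stand-in that returns canned placeholder identifiers instead of issuing a Search request, the two ranges are placeholders rather than real navigator buckets, and the 10,000-document limit is the one noted earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int Limit = 10000; // per-request document limit noted earlier (assumed unchanged)

// Hypothetical stand-in for one Search request. It returns canned placeholder
// identifiers so the loop can run without a session; a real version would issue
// the request with the given filter and read the RICs from the hits.
IReadOnlyList<string> FetchSegment(string filter) =>
    filter.Contains("lt 10") ? new[] { "PLACEHOLDER.A", "PLACEHOLDER.B" }
                             : new[] { "PLACEHOLDER.C" };

var baseFilter = "RCSExchangeCountryLeaf eq 'United States' and RCSAssetCategoryLeaf xeq 'Ordinary Share'";

// Placeholder ranges; real ones would come from the navigator's bucket 'Filter' values.
var ranges = new[] { "MktCapTotal lt 10", "MktCapTotal ge 10" };

var all = new List<string>();
foreach (var range in ranges)
{
    var segment = FetchSegment($"{baseFilter} and {range}");

    // A segment that reaches the limit may have been cut short and should be split further.
    if (segment.Count >= Limit)
        throw new InvalidOperationException($"Segment '{range}' reached the limit; use narrower ranges.");

    all.AddRange(segment);
}

Console.WriteLine($"Collected {all.Count} instruments from {ranges.Length} segments.");
```

Checking each segment against the limit guards against silently losing documents when a bucket turns out to be larger than expected.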

While it may be possible to pull out excessive amounts of data, you should ask yourself if you need to do this.  In most cases, you can reduce the result set when you set up your search instead of pulling in everything and then massaging the results once you have them in hand.  Search was designed specifically to allow users to filter out unwanted content before the results are returned.  If you approach your searches this way, you will avoid most situations where you would otherwise need complicated algorithms to pull excessive amounts of data. Whether you narrow the request to the categories you are interested in or to data for a specific region, you will find that you can significantly simplify your logic and avoid issues with limits.