# Tick History Time And Sales
This sample demonstrates how to get the list of Tick History Time And Sales fields, request data for the selected fields, and display the data.

## Prerequisites
- Python 3.6 or higher
- Jupyter Notebook
- A DSS user account that can access Tick History Time And Sales. Tick History is hosted on DSS (the DataScope Select platform). Please contact your Refinitiv account team to obtain an account.

## Implementation
### Step 1. Request authentication token

 - Import required modules

In [None]:
import getpass as gp
import requests
import json
import pandas as pd
import copy
import time
import gzip

 - Input DSS username and password

In [None]:
username=input('Enter DSS username:')
password=gp.getpass('Enter DSS Password:')

 - Create authentication token request containing DSS username and password

In [None]:
requestUrl = "https://hosted.datascopeapi.reuters.com/RestApi/v1/Authentication/RequestToken"
requestHeaders={
    "Prefer":"respond-async",
    "Content-Type":"application/json"
    }
requestBody={
    "Credentials": {
    "Username": username,
    "Password": password
  }
}

- Send the request and wait for the response

In [None]:
authenticationResp = requests.post(requestUrl, json=requestBody,headers=requestHeaders)
print("Received the response for authentication request")

- Check if the status code of the response is 200. If yes, the request has succeeded, so extract and print the authentication token. Otherwise, print the error.

In [None]:
if authenticationResp.status_code == 200 :
    print("Received status code 200, get the authentication token from the response")
    jsonResponse = json.loads(authenticationResp.text.encode('ascii', 'ignore'))
    token = jsonResponse["value"]
    print ('Authentication token (valid 24 hours):')
    print (token)
else:
    print("Error with status code:",authenticationResp.status_code,"\n Text:",json.dumps(json.loads(authenticationResp.text),indent=4))
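
The token is sent in the `Authorization` header of each later request in this notebook, in the form `token <value>`. Below is a minimal sketch of a helper that builds those headers in one place. The function name `build_auth_headers` is hypothetical and not part of the sample or of the DSS API, and the token used in the example is a placeholder.

```python
def build_auth_headers(token, content_type=None):
    """Return request headers carrying the DSS authentication token.

    Hypothetical helper: the header names and the "token " prefix are the
    ones used in the cells of this notebook.
    """
    headers = {
        "Prefer": "respond-async",
        "Authorization": "token " + token,
    }
    # Content-Type is only needed for requests that send a body
    if content_type is not None:
        headers["Content-Type"] = content_type
    return headers


# Example with a placeholder value (not a real token)
example_headers = build_auth_headers("_placeholder_", "application/json")
print(example_headers)
```

Building the headers in one function keeps the token format consistent if the notebook is later turned into a script.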

### Step 2. Retrieve the list of available fields.


- Create a Time and Sales content field types request

In [None]:
requestUrl='https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/GetValidContentFieldTypes(ReportTemplateType=ThomsonReuters.Dss.Api.Extractions.ReportTemplates.ReportTemplateTypes\'TickHistoryTimeAndSales\')'
requestHeaders={
    "Prefer":"respond-async",
    "Authorization": "token " + token
}

- Send the request and wait for the response

In [None]:
contentFieldTypesResp = requests.get(requestUrl, headers=requestHeaders)
print("Received the response for content field types request")

- Check if the status code of the response is 200. If yes, the request has succeeded so keep the field types. Otherwise, print the error.

In [None]:
contentFieldTypesJson = None
if contentFieldTypesResp.status_code == 200 :
    print("Received status code 200, requested for content field types successfully")
    contentFieldTypesJson = json.loads(contentFieldTypesResp.text.encode('ascii', 'ignore'))
    wholeContentFieldTypes = contentFieldTypesJson["value"]
else:
    print("Error with status code:",contentFieldTypesResp.status_code,"\n Text:",json.dumps(json.loads(contentFieldTypesResp.text),indent=4))

- Print the field types

In [None]:
print("The Content Fields Types Table")
df = pd.DataFrame(wholeContentFieldTypes, columns=["Name", "Description","FormatType"])
pd.set_option('display.max_rows', df.shape[0]+1)
# None disables column truncation (pandas 1.0 or later; older versions used -1)
pd.set_option('display.max_colwidth', None)
df.index += 1
dfStyler=df.style.set_properties(**{'text-align': 'left'})
df
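
The field table can be long, so it may help to search it by keyword before choosing field indexes. The sketch below assumes the field list is a list of dictionaries with a `Name` key, as `wholeContentFieldTypes` is used above. The function name `find_fields` is hypothetical, and the sample entries are placeholders for illustration; the real names come from the API response.

```python
def find_fields(field_types, keyword):
    """Return (index, name) pairs whose name contains keyword, ignoring case.

    The index is 1-based so that it matches the table printed above.
    """
    keyword = keyword.lower()
    return [
        (position, field["Name"])
        for position, field in enumerate(field_types, start=1)
        if keyword in field["Name"].lower()
    ]


# Placeholder entries for illustration only
sample_field_types = [
    {"Name": "Trade - Price"},
    {"Name": "Quote - Bid Price"},
    {"Name": "Trade - Volume"},
]
for position, name in find_fields(sample_field_types, "trade"):
    print(position, name)
```

In the notebook, the call would be `find_fields(wholeContentFieldTypes, "trade")`.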

### Step 3. Send Time And Sales On Demand Extraction Request


- Input the Identifiers and their Identifier types. For the valid Identifier types, please refer to 
[REST API Reference Tree](https://hosted.datascopeapi.reuters.com/RestApi.Help/Home/RestApiProgrammingSdk)

In [None]:
identifierDict={}
InstrumentIdentifiersList = []
anIdentifier=input("Enter an identifier with its type e.g. IBM.N,Ric (press enter to exit):")
while len(anIdentifier) > 0:
    # Split "IBM.N,Ric" into the identifier and its type, ignoring stray spaces
    anIdentifierType=[part.strip() for part in anIdentifier.split(",")]
    if len(anIdentifierType) >= 2:
        identifierDict["Identifier"]=anIdentifierType[0]
        identifierDict["IdentifierType"]=anIdentifierType[1]
        InstrumentIdentifiersList.append(identifierDict.copy())
    else:
        print("Invalid input, expected <identifier>,<type>; skipping it.")
    anIdentifier=input("Enter an identifier with its type e.g. IBM.N,Ric (press enter to exit):")
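
The parsing of one input line can also be separated from the prompt loop, which makes it easy to try without typing anything. This is a minimal sketch; the function name `parse_identifier` is hypothetical, and the dictionary keys are the ones used in the cell above.

```python
def parse_identifier(text):
    """Parse "IBM.N,Ric" into the dictionary shape used in the request body.

    Returns None when the text does not contain both an identifier and a type.
    """
    parts = [part.strip() for part in text.split(",")]
    if len(parts) < 2 or not parts[0] or not parts[1]:
        return None
    return {"Identifier": parts[0], "IdentifierType": parts[1]}


print(parse_identifier("IBM.N, Ric"))
print(parse_identifier("IBM.N"))
```

The first call returns a dictionary and the second returns `None` because the identifier type is missing.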

- Select content fields

In [None]:
totalFields = len(wholeContentFieldTypes)
selectedFields = []
print("Please see the Content Fields Types table above")
requestFields=input("Enter the fields index(1-" + str(totalFields) + ") separated by ','  :")
requestFieldsList=requestFields.split(",")
for aFidNum in requestFieldsList:
    try:
        # int() tolerates surrounding spaces but raises ValueError for non-numeric text
        fieldIndex = int(aFidNum)
    except ValueError:
        fieldIndex = 0
    if fieldIndex < 1 or fieldIndex > totalFields:
        print("Invalid fields index " + str(aFidNum) + ", skip this.")
    else:
        selectedFields.append(wholeContentFieldTypes[fieldIndex-1]["Name"])
print()
print("The selected fields are:")
for aField in selectedFields:
    print(aField)

- Create an on demand extraction for Time And Sales. For the parameters of each request, please refer to 
[REST API Reference Tree](https://hosted.datascopeapi.reuters.com/RestApi.Help/Home/RestApiProgrammingSdk)

    - Create request url and headers

In [None]:
requestUrl='https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/ExtractRaw'
requestHeaders={
    "Prefer":"respond-async",
    "Content-Type":"application/json",
    "Authorization": "token " + token
}

    - Create request body containing input identifiers and content field types

In [None]:
requestBody={
  "ExtractionRequest": {
    "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
    "ContentFieldNames": selectedFields,
    "IdentifierList": {
      "@odata.type": "#ThomsonReuters.Dss.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
      "InstrumentIdentifiers": InstrumentIdentifiersList,
      "UseUserPreferencesForValidationOptions": "false"
    },
    "Condition": {
      "MessageTimeStampIn": "GmtUtc",
      "ApplyCorrectionsAndCancellations": "false",
      "ReportDateRangeType": "Range",
      "QueryStartDate": "2019-11-06T00:00:00.000Z",
      "QueryEndDate": "2019-11-06T23:59:59.999Z",
      "DisplaySourceRIC": "false"
    }
  }
}
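
The query dates in the request body are hard-coded. The sketch below shows one way to produce the same timestamp format for an arbitrary UTC day using only the standard library. The function name `day_range_utc` is hypothetical.

```python
from datetime import date


def day_range_utc(day):
    """Return (start, end) timestamp strings covering one full UTC day.

    The format mirrors the hard-coded QueryStartDate and QueryEndDate values.
    """
    prefix = day.isoformat()
    return prefix + "T00:00:00.000Z", prefix + "T23:59:59.999Z"


start, end = day_range_utc(date(2019, 11, 6))
print(start)
print(end)
```

The two strings can then be assigned to `requestBody["ExtractionRequest"]["Condition"]["QueryStartDate"]` and `["QueryEndDate"]` before the request is sent.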

- Send the request and wait for the response

In [None]:
extractionResp = requests.post(requestUrl, json=requestBody,headers=requestHeaders)
print("Received the response for on demand extraction request")

### Step 4. Check the request status until the request has been processed completely.

- If the HTTP status code of the response is 202, the extraction request was accepted but processing has not completed yet. In that case, the application reads the location URL from the header of the 202 response received in the previous step.

In [None]:
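requestStatus = extractionResp.status_code
print("Received status code " + str(requestStatus))
requestUrl=None
if requestStatus == 202 :
    # Accepted but still processing: poll the URL given in the location header
    requestUrl = extractionResp.headers["location"]
    print ('Extraction is not complete, poll the location URL:')
    print (str(requestUrl))
elif requestStatus == 200 :
    # Already complete, so no polling is needed; the result is handled below
    print ('Extraction is already complete')
else:
    print("Error with status code:",requestStatus,"\n Text:",json.dumps(json.loads(extractionResp.text),indent=4))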

- While the status of the request is 202, poll the request status every 30 seconds using the location URL obtained in the previous step.

In [None]:
while (requestStatus == 202):
    print ('Received status code 202, waiting 30 seconds before polling again')
    time.sleep(30)
    extractionResp = requests.get(requestUrl,headers=requestHeaders)
    requestStatus= extractionResp.status_code
print ('Received status code which is not 202')
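
The loop above keeps polling for as long as the server answers 202. Below is a minimal sketch of the same pattern with an upper bound on the total wait. The names `poll_until_done`, `fetch`, and `max_wait` are hypothetical; the fetch callable and the sleep function are passed in so the logic can be tried without a network call, and the example uses simulated responses.

```python
import time
from types import SimpleNamespace


def poll_until_done(fetch, interval=30, max_wait=1800, sleep=time.sleep):
    """Call fetch() until its result has a status_code other than 202.

    fetch is any callable returning a response-like object, for example
    lambda: requests.get(requestUrl, headers=requestHeaders).
    Raises TimeoutError if 202 is still returned after max_wait seconds.
    """
    waited = 0
    response = fetch()
    while response.status_code == 202:
        if waited >= max_wait:
            raise TimeoutError("Extraction still pending after %d seconds" % waited)
        sleep(interval)
        waited += interval
        response = fetch()
    return response


# Simulated server: two pending responses, then success. No real waiting.
fake_responses = iter(SimpleNamespace(status_code=code) for code in (202, 202, 200))
final_response = poll_until_done(lambda: next(fake_responses), sleep=lambda seconds: None)
print(final_response.status_code)
```

The default of 1800 seconds is an arbitrary choice for illustration; a suitable limit depends on the size of the extraction.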

- When the request is completed (the HTTP status code is no longer 202), check the status code. If it is 200 (OK), the application gets and prints the results, which are the JobId and the extraction notes. The JobId is used to retrieve the data, while the extraction notes can be used to analyze the data or troubleshoot problems. Any status code other than 200 is an error, and the application prints it.

In [None]:
if requestStatus == 200 :
    print("Received status code 200, get the JobId and Extraction notes")
    extractionRespJson = json.loads(extractionResp.text.encode('ascii', 'ignore'))
    jobId = extractionRespJson["JobId"]
    print ('\njobId: ' + jobId + '\n')
    notes = extractionRespJson["Notes"]
    print ('Extraction notes:\n' + notes[0])
else:
    print("Error with status code:",extractionResp.status_code,"\n Text:",json.dumps(json.loads(extractionResp.text),indent=4))

### Step 5. Retrieve data from TRTH or AWS

- Send an HTTP GET request with the JobId from the 200 OK response to retrieve data from TRTH or AWS. TRTH allows some extraction data to be downloaded directly from Amazon Web Services (AWS), where the data files are hosted. The tick history data types that support this feature are:
    * Time and Sales
    * Market Depth
    * Intraday Summaries
    * Raw
    
  This sample requests Time and Sales, which supports AWS download. Therefore, it downloads the data from AWS, which provides faster download speed than downloading from TRTH directly.

In [None]:
DownloadFromAWS=True
requestUrl="https://hosted.datascopeapi.reuters.com/RestApi/v1/Extractions/RawExtractionResults" + "('" + jobId + "')" + "/$value"
requestHeaders={
        "Prefer":"respond-async",
        "Content-Type":"text/plain",
        "Accept-Encoding":"gzip",
        "Authorization": "token " + token
}
if DownloadFromAWS:
    requestHeaders.update({"X-Direct-Download":"true"})
dataRetrieveResp=requests.get(requestUrl,headers=requestHeaders,stream=True)
print("Received the response for retrieving data using the jobId")

- If the status is 200 (OK), the application has retrieved the data from TRTH or AWS successfully. Otherwise, print the error and exit.

In [None]:
if dataRetrieveResp.status_code == 200 :
    print("Received status code 200, retrieved data from the server successfully")
else:
    # Report the download response itself; its body may not be JSON, so print the raw text
    print("Error with status code:",dataRetrieveResp.status_code,"\n Text:",dataRetrieveResp.text)
    exit()

- Save the downloaded data before decompressing it instead of decompressing it on the fly. This avoids data loss issues, especially with large data sets.

In [None]:
import os
import shutil
# Keep the payload compressed: do not decode the gzip stream while reading it
dataRetrieveResp.raw.decode_content = False
# os.path.join builds a valid path on any operating system
fileName = os.path.join(os.getcwd(), "compressData.csv.gz")
print ('Saving compressed data to file:' + fileName + ' ... please be patient')

chunk_size = 1024
rr = dataRetrieveResp.raw
# The with statement closes the file automatically
with open(fileName, 'wb') as fd:
    shutil.copyfileobj(rr, fd, chunk_size)

print ('Finished saving compressed data to file:' + fileName + '\n')
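
Before decompressing, a quick check can confirm that the saved file really contains gzip data, for example in case an error body was written to disk instead. A gzip stream starts with the two bytes `0x1f 0x8b`. The sketch below is an optional addition; the function name `looks_like_gzip` is hypothetical, and it is demonstrated on a small temporary file rather than on the real download.

```python
import gzip
import os
import tempfile


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"\x1f\x8b"


# Demonstration on a small temporary file with placeholder content
demo_path = os.path.join(tempfile.mkdtemp(), "demo.csv.gz")
with gzip.open(demo_path, "wb") as handle:
    handle.write(b"placeholder,data\n")
print(looks_like_gzip(demo_path))
```

In the notebook, the call would be `looks_like_gzip(fileName)`.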

- As a best practice, handle the data line by line instead of storing all of it in one variable. This avoids issues with large data sets. The code below reads and decompresses the data file that was just created line by line and displays at most 15 lines.

In [None]:
maxLine=15
print ('Reading data from file, and decompress at most ' + str(maxLine) + ' lines of it:')
count = 0
# The with statement closes the file automatically
with gzip.open(fileName, 'rb') as fd:
    for line in fd:
        dataLine = line.decode("utf-8")
        print (dataLine)
        count += 1
        # Stop once maxLine lines have been printed
        if count >= maxLine:
            break
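
Because the extracted file is CSV, the same line-by-line approach extends to splitting each line into fields. The sketch below uses the standard `csv` module and `itertools.islice` to parse only the first rows. The function name `read_first_rows` is hypothetical, and the sample file and its columns are placeholders; the real columns depend on the fields selected in Step 3.

```python
import csv
import gzip
import itertools
import os
import tempfile


def read_first_rows(path, max_rows):
    """Return at most max_rows parsed CSV rows from a gzip-compressed file."""
    # Text mode ("rt") decompresses and decodes lazily, one line at a time
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(itertools.islice(csv.reader(handle), max_rows))


# Placeholder data for illustration only
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(sample_path, "wt", encoding="utf-8", newline="") as handle:
    handle.write("#RIC,Price\nAAA,1.0\nBBB,2.0\n")

for row in read_first_rows(sample_path, 2):
    print(row)
```

In the notebook, the call would be `read_first_rows(fileName, maxLine)`.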