# **Trading Signal Analysis**

## Table of contents<a id='toc0_'></a>      
- [Introduction](#toc1_1_)    
- [Cleaning the Data](#toc1_2_)    
- [Working with the Data](#toc1_3_)    
- [Visualization](#toc1_4_)    
- [Insights](#toc1_5_)    
- [Conclusion](#toc1_6_)    


## <a id='toc1_1_'></a>[Introduction](#toc0_)

In this project, I am helping a trading broker analyze the performance of one of its trading signal providers. The broker wants to know whether the signals generated by this provider have any effect on the volume traded by the broker's clients.

I am going to analyze the XAUUSD product only, because it is the product most traded by this broker's clients.

The purpose of this project is to find out whether the trading signal given by the application is useful. In other words, can the signal influence clients to start trading? To answer this, I calculate the total volume traded on the day the signal comes out and compare it with the surrounding days.
<u>However, in this project I am not going to consider the effect of high-impact economic news</u>. The only variable that I assume to have a direct relation with the size of the trading volume is the presence of a trading signal.

Aside from that, the trading signal dataset provides information about the result of the signal, whether it is a profit or loss. So I am also going to analyze the outcome of the signal.

In [24]:
#importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [25]:
#loading the dataframes
ct = pd.read_csv('Closed Trades - Randomized.csv')
ot = pd.read_csv('Open Trades - Randomized.csv')
signal=pd.read_csv('Signal Data.csv')

In [26]:
#checking the unique values of Symbol in the Closed Trades and Open Trades datasets for any duplicates
print(ct['Symbol'].unique())
print(ot['Symbol'].unique())

['gbpjpy' 'xauusd' 'clr' 'usdjpy' 'eurusd' 'usdcad' 'nik' 'audnzd'
 'audusd' 'dj' 'nq' 'gbpusd' 'gbpchf' 'usdchf' 'eurjpy' 'nzdjpy' 'chfjpy'
 'audjpy' 'euraud' 'nzdusd' 'xagusd' 'eurgbp' 'eurcad' 'gbpaud' 'sp' 'has']
['audjpy' 'audnzd' 'audusd' 'chfjpy' 'clr' 'dj' 'euraud' 'eurcad' 'eurgbp'
 'eurjpy' 'eurusd' 'gbpaud' 'gbpchf' 'gbpjpy' 'gbpusd' 'has' 'nik' 'nq'
 'nzdjpy' 'nzdusd' 'sp' 'usdcad' 'usdchf' 'usdjpy' 'xagusd' 'xauusd']


## <a id='toc1_2_'></a>[Cleaning the Data](#toc0_)

First, I am going to write a function that changes the date format of the Open Trades and Closed Trades datasets and converts the date columns to the datetime datatype.

This is necessary because the dates in the Closed Trades and Open Trades datasets include the time of the transaction. Since I am going to group by date only, I need to drop the time component using the **.strftime** method.

In [27]:
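def convert_datetime(dates):
    #format as 'YYYY-MM-DD' strings, which drops the time of day
    dates = pd.to_datetime(dates).dt.strftime('%Y-%m-%d')
    #convert the strings back to datetime so they can be compared with the signal dates
    dates = pd.to_datetime(dates)
    return dates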

In [28]:
#changing the date format
ct["Open Time"] =  convert_datetime(ct["Open Time"])
ct["Close Time"] =  convert_datetime(ct["Close Time"])
ot["Open Time"] =  convert_datetime(ot["Open Time"])
signal['Signal Date'] = pd.to_datetime(
                          signal['Signal Date'],
                          format='%d %b %Y')
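
As a quick check of the date handling above, the sketch below uses made-up timestamps (not taken from the datasets) to show that the `strftime` round trip drops the time of day. It also shows that pandas' `Series.dt.normalize()` gives the same dates in one step, which is an alternative and not what the notebook uses. The sample string `'01 Mar 2023'` is an assumed example of the `'%d %b %Y'` format.

```python
import pandas as pd

# Made-up timestamps for illustration only
raw = pd.Series(["2023-03-01 09:15:00", "2023-03-01 17:40:12", "2023-03-02 00:05:59"])

# Round trip used in convert_datetime: datetime -> 'YYYY-MM-DD' string -> datetime
round_trip = pd.to_datetime(pd.to_datetime(raw).dt.strftime("%Y-%m-%d"))

# Alternative: set the time to midnight without going through strings
normalized = pd.to_datetime(raw).dt.normalize()

# True when both approaches give the same dates
print((round_trip == normalized).all())

# Parsing a date written like '01 Mar 2023' (assumed sample value)
signal_date = pd.to_datetime(pd.Series(["01 Mar 2023"]), format="%d %b %Y")
print(signal_date.iloc[0])
```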

## <a id='toc1_3_'></a>[Working with the Data](#toc0_)

I need to know the total volume traded each day so that I can compare days with a signal against normal days (no signal). That is why I filtered the Closed Trades and Open Trades dataframes by the signal's symbol, grouped them by Open Time, and summed the volume for each day.

In [29]:
ct_grouped = ct[(ct['Symbol'] == signal['Symbol'][1].lower())]\
    .groupby(['Open Time']).agg({'Volume':'sum'})
ot_grouped = ot[(ot['Symbol'] == signal['Symbol'][1].lower())]\
    .groupby(['Open Time']).agg({'Volume':'sum'})

Then, I merged the grouped Closed Trades and Open Trades on the Open Time column into a new dataframe called `margin_table`. I also added a column called Total Volume that holds the sum of the Closed Trades and Open Trades volume for each day.

In [30]:
margin_table = pd.merge(ot_grouped, ct_grouped, on='Open Time', how='outer').rename(columns={"Volume_x":"OT","Volume_y":"CT"})
margin_table = margin_table.fillna(0)
margin_table['Total Volume'] = margin_table['OT'] + margin_table['CT']
margin_table = margin_table.sort_values(by='Open Time', ascending=True).reset_index()
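
To make the merge step easier to follow, here is a minimal sketch on invented numbers. It assumes `Open Time` is an ordinary column in both frames (in the notebook it is the group index), and the names `ot_demo`, `ct_demo` and `demo` are hypothetical. It shows how the outer merge keeps dates that appear on only one side, and where the `Volume_x` / `Volume_y` column names come from.

```python
import pandas as pd

# Invented daily volumes; only 2023-03-02 appears in both frames
ot_demo = pd.DataFrame({
    "Open Time": pd.to_datetime(["2023-03-01", "2023-03-02"]),
    "Volume": [1.0, 2.5],
})
ct_demo = pd.DataFrame({
    "Open Time": pd.to_datetime(["2023-03-02", "2023-03-03"]),
    "Volume": [4.0, 3.0],
})

# The outer merge keeps every date; the clashing 'Volume' columns get the
# default suffixes _x (left frame) and _y (right frame)
demo = (
    pd.merge(ot_demo, ct_demo, on="Open Time", how="outer")
    .rename(columns={"Volume_x": "OT", "Volume_y": "CT"})
    .fillna(0)  # a date missing on one side counts as zero volume there
    .sort_values("Open Time")
    .reset_index(drop=True)
)
demo["Total Volume"] = demo["OT"] + demo["CT"]
print(demo)
```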

I made a for loop that iterates through the dates in the Signal Data dataframe and looks up each one in the margin table to get the index of the matching row. As I wanted to compare the volume on the signal day with the days around it, I took a window from 2 rows before to 2 rows after the signal date. Each row in the margin table is a day with recorded volume, so each window covers 5 trading days.

Then I sliced the margin table to that window for each signal date and appended the result to a list.

In [31]:
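#note: the name list shadows Python's built-in list for the rest of the notebook
list=[]
for date in signal['Signal Date']:
    #row index of the signal date; .item() raises an error unless exactly one row matches
    index=margin_table[(margin_table['Open Time'] == date)].index.item()
    #window of 5 rows: 2 before the signal, the signal day, 2 after
    before=index-2
    after=index+3
    filtered_df=margin_table.iloc[before:after]
    list.append(filtered_df)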
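
The loop above relies on every signal date appearing exactly once in the margin table, with at least two rows on either side. A more defensive version could look like the sketch below. The helper `signal_window` and the toy table are hypothetical, and returning `None` for an unusable window is just one possible design choice.

```python
import pandas as pd

def signal_window(table, date, rows_before=2, rows_after=2, date_col="Open Time"):
    """Return the rows around `date`, or None if the window cannot be built."""
    matches = table.index[table[date_col] == date]
    if len(matches) != 1:
        return None  # date missing from (or duplicated in) the table
    pos = table.index.get_loc(matches[0])
    start, stop = pos - rows_before, pos + rows_after + 1
    if start < 0 or stop > len(table):
        return None  # not enough rows on one side of the signal
    return table.iloc[start:stop]

# Toy table with invented volumes
demo_table = pd.DataFrame({
    "Open Time": pd.date_range("2023-03-01", periods=6, freq="D"),
    "Total Volume": [10, 12, 30, 11, 9, 8],
})

window = signal_window(demo_table, pd.Timestamp("2023-03-03"))
too_early = signal_window(demo_table, pd.Timestamp("2023-03-01"))
missing = signal_window(demo_table, pd.Timestamp("2024-01-01"))
print(window)
print(too_early, missing)
```

Skipped signals would then need to be dropped from the title list as well, so that titles and windows stay aligned.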

## <a id='toc1_4_'></a>[Visualization](#toc0_)

I wanted to compare the total trading volume on days with a signal against days without one. If there is a significant difference in volume, then I can conclude that there might be some correlation between the occurrence of a signal and the volume of trades.

With plotly, I made one subplot for each trading signal and combined them into a single figure.
I also made a list of titles so that each subplot has its own title, which contains the date of the signal.

In [32]:
title_list2=[]
#number the signals from 1 and put the formatted date on a second, smaller line
for i, date in enumerate(signal['Signal Date'], start=1):
    date = date.strftime('%d %B %Y')
    title = f'XAUUSD - Signal {i} <br><sup> {date}</sup>'
    title_list2.append(title)

I also created a function to draw each subplot. Plotly lets me specify the color of each bar individually, so I access index 2 in each subplot (the third of the five bars, which represents the day the signal came out) and use an if condition to determine its color.

The color depends on the outcome of the signal for the day. There are 3 possible outcomes:
1. Profit (Green): the outcome is profitable
2. Loss (Red): the outcome is not profitable
3. No trade (Yellow): the expected buy/sell price is never reached throughout the day.

Now that I can differentiate the subplots by outcome, I wanted to look at the graphs where the signal turned out to be profitable. I wanted to see whether a profitable outcome influences people to trade over the next two days, by comparing the volume.

In [33]:
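def subplot_plotly(a,col):
    #draws the bars for signal number a in column col of the global fig;
    #the row comes from the loop variable i of the calling loop
    #determining the colors: grey for all 5 bars, then recolor the signal day (index 2)
    colors = ['lightslategray',] * 5
    if signal['Profit/Loss'][a] == 'Profit':
        colors[2] = 'green'
    elif signal['Profit/Loss'][a] == 'Loss':
        colors[2] = 'crimson'
    else:
        colors[2] = 'yellow'
    fig.add_trace(go.Bar(x=list[a]['Open Time'],y=list[a]['Total Volume'],
                marker_color=colors, 
                text = [int(i) for i in list[a]['Total Volume']]),
                row=i, col=col)
    #move on to the next signal
    a+=1
    return a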
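
The if/elif chain that picks the color can also be written as a dictionary lookup. The sketch below is an alternative and not the notebook's code: the names `OUTCOME_COLORS` and `bar_colors` are hypothetical, and any outcome other than `'Profit'` or `'Loss'` is assumed to mean No trade.

```python
# Color of the signal-day bar for each outcome in the 'Profit/Loss' column
OUTCOME_COLORS = {"Profit": "green", "Loss": "crimson"}

def bar_colors(outcome, n_bars=5, signal_pos=2,
               base="lightslategray", fallback="yellow"):
    """Return one color per bar, highlighting the bar at signal_pos."""
    colors = [base] * n_bars
    # Any outcome not in the mapping (assumed to be No trade) gets the fallback
    colors[signal_pos] = OUTCOME_COLORS.get(outcome, fallback)
    return colors

print(bar_colors("Profit"))
print(bar_colors("No trade"))
```

The returned list can be passed to `marker_color` in `go.Bar`, in the same way as the `colors` list in the function above.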

In [34]:
fig = make_subplots(rows=5, cols=3,
        subplot_titles=title_list2)

a=0

#fill the 5x3 grid row by row, one signal per cell
for i in range(1,6):
    a=subplot_plotly(a,1)
    a=subplot_plotly(a,2)
    a=subplot_plotly(a,3)

fig.update_layout(height=1400, width=1300, showlegend = False,
        title=dict(
        text="Trading Signal",
        font=dict(size=24),
        x=0.5))
fig.show()

## <a id='toc1_5_'></a>[Insights](#toc0_)

1. Can the trading signal influence clients to open a position/start trading? \
Based on the graphs, there are some signals where the volume is highest on the signal day, but there are more where it is not. So I can conclude that **there is no significant increase in volume on the day the signal comes out**. 

2. Is the trading signal profitable for the clients? \
If we count the Profit and Loss outcomes of the signals, there are 4 Profit outcomes and 5 Loss outcomes. So **there are more Loss outcomes than Profit outcomes**. <u>However, we still need to consider that the sample is fairly small, with a total of only 15 signals (5 being No trade), so it might not produce the most accurate result</u>.

3. Does the result of the trading signal affect the volume of trades for the next 2 days? \
I am going to analyze the Profit and Loss results only, and ignore the signals where the result is No trade. Based on the graphs, **a Profit outcome tends to be followed by a higher volume of trades over the next 2 days**. In contrast, **a Loss outcome tends to be followed by a lower volume over the next 2 days**, compared to the day the signal came out.
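
The insights above are read off the graphs by eye. One way to back them with numbers is sketched below. The function `summarize_windows` is hypothetical and the demo volumes are invented, so the printed values say nothing about the real data. For each window it records whether the signal day has the highest volume (insight 1) and the mean volume before and after the signal (insight 3).

```python
import pandas as pd

def summarize_windows(windows, outcomes, signal_pos=2):
    """One row per signal: signal-day volume versus the surrounding days."""
    rows = []
    for window, outcome in zip(windows, outcomes):
        volume = window["Total Volume"].reset_index(drop=True)
        signal_volume = volume.iloc[signal_pos]
        rows.append({
            "outcome": outcome,
            "signal_volume": signal_volume,
            "signal_is_peak": bool(signal_volume == volume.max()),
            "mean_before": volume.iloc[:signal_pos].mean(),
            "mean_after": volume.iloc[signal_pos + 1:].mean(),
        })
    return pd.DataFrame(rows)

# Invented 5-day windows, with the signal day in the middle
demo_windows = [
    pd.DataFrame({"Total Volume": [10, 12, 30, 11, 9]}),
    pd.DataFrame({"Total Volume": [20, 25, 15, 30, 35]}),
]
demo_outcomes = ["Loss", "Profit"]

summary = summarize_windows(demo_windows, demo_outcomes)
print(summary)

# Average signal-day and following-day volume per outcome
print(summary.groupby("outcome")[["signal_volume", "mean_after"]].mean())
```

With the notebook's variables, the call would be `summarize_windows(list, signal['Profit/Loss'])`, and `signal['Profit/Loss'].value_counts()` gives the outcome counts used in insight 2.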