Incorrect Forex Price Spikes #668

Open
jbonilla-tao opened this issue May 10, 2024 · 4 comments
Labels: bug, data

Comments

jbonilla-tao commented May 10, 2024

Forex websocket prices ("CAS.*") and agg data have occasional price spikes. These spikes don't align with the TradingView chart or Forex.com.

I found the reason why:
Polygon.io is using the exchange bid price instead of the trade price to populate websocket and agg data.

Here I am printing prices of CADJPY. The price 112.735 is an outlier that never traded. That price came from a stink bid on the exchange.

Printouts of aggs and quotes for the surrounding timestamps:

agg Agg(open=113.7, high=113.7, low=113.7, close=113.7, volume=1, vwap=113.7, timestamp=1715288494000, transactions=1, otc=None) dt 8999 ms
agg Agg(open=112.735, high=112.735, low=112.735, close=112.735, volume=1, vwap=112.735, timestamp=1715288495000, transactions=1, otc=None) dt 7999 ms
agg Agg(open=112.735, high=112.735, low=112.735, close=112.735, volume=1, vwap=112.735, timestamp=1715288497000, transactions=1, otc=None) dt 5999 ms
agg Agg(open=113.53, high=113.53, low=113.53, close=113.53, volume=1, vwap=113.53, timestamp=1715288500000, transactions=1, otc=None) dt 2999 ms
agg Agg(open=113.682, high=113.682, low=113.682, close=113.682, volume=1, vwap=113.682, timestamp=1715288503000, transactions=1, otc=None) dt -1 ms
agg Agg(open=112.735, high=112.735, low=112.735, close=112.735, volume=1, vwap=112.735, timestamp=1715288504000, transactions=1, otc=None) dt -1001 ms
agg Agg(open=113.62, high=113.62, low=113.62, close=113.62, volume=1, vwap=113.62, timestamp=1715288505000, transactions=1, otc=None) dt -2001 ms
agg Agg(open=113.7, high=113.7, low=113.7, close=113.7, volume=1, vwap=113.7, timestamp=1715288509000, transactions=1, otc=None) dt -6001 ms
agg Agg(open=113.63, high=113.63, low=113.63, close=113.63, volume=1, vwap=113.63, timestamp=1715288510000, transactions=1, otc=None) dt -7001 ms

quote Quote(ask_exchange=48, ask_price=113.66, ask_size=None, bid_exchange=48, bid_price=113.62, bid_size=None, conditions=None, indicators=None, participant_timestamp=1715288505000000000, sequence_number=None, sip_timestamp=None, tape=None, trf_timestamp=None)
quote Quote(ask_exchange=48, ask_price=114.358, ask_size=None, bid_exchange=48, bid_price=112.735, bid_size=None, conditions=None, indicators=None, participant_timestamp=1715288504000000000, sequence_number=None, sip_timestamp=None, tape=None, trf_timestamp=None)
quote Quote(ask_exchange=48, ask_price=113.805, ask_size=None, bid_exchange=48, bid_price=113.682, bid_size=None, conditions=None, indicators=None, participant_timestamp=1715288503000000000, sequence_number=None, sip_timestamp=None, tape=None, trf_timestamp=None)
quote Quote(ask_exchange=48, ask_price=113.57, ask_size=None, bid_exchange=48, bid_price=113.53, bid_size=None, conditions=None, indicators=None, participant_timestamp=1715288500000000000, sequence_number=None, sip_timestamp=None, tape=None, trf_timestamp=None)


Attaching images of these crazy spikes when monitoring websocket data:

[three images attached]

Possible solutions:

  1. Get actual trade data from data providers and use it to populate websocket and agg data instead of faking it.

  2. If trade data is not available, I recommend averaging the bid and ask prices instead of using only the bid price. If the delta between bid and ask is too high, discard the price point. This seems to produce reasonable results.

Without a solution, there seems to be no point in having the "CAS.*" and "Agg" data when it merely uses the bid price from the existing Quote data. I have already implemented solution #2 in my branch if you want to reference it. Solution #1 is the proper and preferred fix, though.
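A minimal sketch of solution #2, for illustration only. It is not the code from the branch mentioned above. The function name `mid_price_or_none` and the `max_spread_pct` parameter are hypothetical, and measuring the spread as a percentage of the midpoint is an assumption. The default of 0.5 mirrors the threshold reported later in this thread.

```python
from typing import Optional


def mid_price_or_none(bid: float, ask: float,
                      max_spread_pct: float = 0.5) -> Optional[float]:
    """Return the bid/ask midpoint, or None when the quote looks unusable.

    max_spread_pct is a hypothetical tuning knob: the spread is measured
    as a percentage of the midpoint (an assumed definition).
    """
    if bid <= 0 or ask <= 0 or ask < bid:
        return None  # malformed or crossed quote
    mid = (bid + ask) / 2.0
    spread_pct = (ask - bid) / mid * 100.0
    if spread_pct > max_spread_pct:
        return None  # spread too wide, e.g. a stink bid on one side
    return mid


# Bid/ask pairs copied from the CADJPY quote printout above.
print(mid_price_or_none(113.62, 113.66))    # normal quote: midpoint near 113.64
print(mid_price_or_none(112.735, 114.358))  # spike quote: None
```

Returning `None` rather than a fallback price leaves it to the caller to decide whether to skip the tick or carry the previous value forward.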

jbonilla-tao added the bug label May 10, 2024
justinpolygon (Contributor) commented May 14, 2024

Hey @jbonilla-tao, sorry for the delay here. I've pinged the backend team to check this out and I'll let you know. Thanks for the detailed write-up and images too. This really helps.

justinpolygon self-assigned this May 14, 2024
jbonilla-tao (Author) commented

Hi @justinpolygon, thanks for responding. I found that discarding quotes whose bid and ask are more than 0.5% apart filters out these weird spikes.
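As a rough check of that threshold against the CADJPY quotes printed earlier in this issue, the sketch below computes each quote's spread. It assumes "0.5% apart" means the spread relative to the midpoint, which the comment does not specify. The dict layout and the names `spread_pct` and `MAX_SPREAD_PCT` are illustrative only.

```python
# Bid/ask pairs copied from the CADJPY quote printout earlier in this issue.
# "t" is participant_timestamp converted from nanoseconds to seconds.
quotes = [
    {"t": 1715288505, "bid": 113.62, "ask": 113.66},
    {"t": 1715288504, "bid": 112.735, "ask": 114.358},
    {"t": 1715288503, "bid": 113.682, "ask": 113.805},
    {"t": 1715288500, "bid": 113.53, "ask": 113.57},
]

MAX_SPREAD_PCT = 0.5  # threshold quoted in the comment above


def spread_pct(q: dict) -> float:
    """Spread as a percentage of the midpoint (an assumed definition)."""
    mid = (q["bid"] + q["ask"]) / 2.0
    return (q["ask"] - q["bid"]) / mid * 100.0


kept = [q for q in quotes if spread_pct(q) <= MAX_SPREAD_PCT]
dropped = [q for q in quotes if spread_pct(q) > MAX_SPREAD_PCT]

for q in quotes:
    status = "dropped" if q in dropped else "kept"
    print(q["t"], f"{spread_pct(q):.3f}%", status)
```

Under this definition, only the quote carrying the 112.735 bid exceeds the threshold (its spread is roughly 1.4%), while the other three sit well below 0.5%.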

justinpolygon (Contributor) commented May 15, 2024

Thanks @jbonilla-tao. I was able to confirm with the engineering and data teams that we are indeed building aggs for forex off of BBO (Quote) data. The aggs are generated using the bid side of these quotes. I'll get the documentation updated to make sure this is clear. There is also active research/engineering work happening on our end to look at this. Again, sorry this wasn't clearer, and thank you for such a detailed report. It really helped to track things down.

jbonilla-tao (Author) commented May 15, 2024

Thanks @justinpolygon
EDIT: Sorry, disregard. All quotes were used.

I see the aggs are built from the quotes with some kind of filtering applied.
For example, this agg:

{"ticker":"C:USDCAD","queryCount":1,"resultsCount":1,"adjusted":true,"results":[{"v":3,"vw":1.3692,"o":1.36922,"c":1.3692,"h":1.36922,"l":1.36918,"t":1715092830000,"n":3}],"status":"OK","request_id":"96d9cb37dc237a63c1dd601d65ade068","count":1}

is built from these quotes:
{"results":[{"ask_exchange":48,"ask_price":1.3697,"bid_exchange":48,"bid_price":1.3692,"participant_timestamp":1715092830000000000},{"ask_exchange":48,"ask_price":1.36931,"bid_exchange":48,"bid_price":1.36918,"participant_timestamp":1715092830000000000},{"ask_exchange":48,"ask_price":1.36926,"bid_exchange":48,"bid_price":1.36922,"participant_timestamp":1715092830000000000}],"status":"OK","request_id":"7544ca2a6d35f6f6ab277082e5db7090"}

What is the algorithm to filter out quotes? Some kind of statistical analysis? One of the three quotes is not reflected in the agg.
Thanks!
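Consistent with the EDIT note above, the sketch below rebuilds the agg from the bid side of all three quotes. It rests on two assumptions that nobody in this thread confirms: the quotes response is ordered newest-first (all three share one timestamp, so their order cannot be verified), and `vw` is a plain average of the bids with one unit of volume per quote. The function name `agg_from_bids` is hypothetical.

```python
def agg_from_bids(bids_newest_first):
    """Build an OHLC-style agg from quote bids (illustrative only).

    Assumes the input is ordered newest-first, so it is reversed before
    picking the open (oldest bid) and the close (newest bid).
    """
    bids = list(reversed(bids_newest_first))
    return {
        "o": bids[0],
        "h": max(bids),
        "l": min(bids),
        "c": bids[-1],
        "v": len(bids),                       # one unit of volume per quote (assumed)
        "vw": round(sum(bids) / len(bids), 5),  # simple mean of bids (assumed)
        "n": len(bids),
    }


# bid_price values from the USDCAD quotes response above, in returned order.
usdcad_bids = [1.3692, 1.36918, 1.36922]
agg = agg_from_bids(usdcad_bids)
print(agg)
```

With these assumptions, the open, high, low, close, and count match the `o`, `h`, `l`, `c`, `v`, and `n` fields of the agg shown above, so every quote contributes to at least one field and no filtering is needed to explain the result.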
