This is a slightly revised version of the thesis submitted in December 2022 in partial fulfillment of the requirements for the degree of Master of Science in Business Administration. All errors and omissions are my own responsibility.
This thesis provides an overview of recent advances in reinforcement learning for pricing and hedging financial instruments, with a primary focus on a detailed explanation of the Q-Learning Black-Scholes approach introduced by Halperin (2017). This reinforcement learning approach bridges the traditional Black and Scholes (1973) model with novel artificial intelligence algorithms, enabling option pricing and hedging in a completely model-free and data-driven way. The thesis also explores the algorithm’s performance under different state variables and scenarios for a European put option. The results reveal that the model is an accurate estimator under different levels of volatility and hedging frequency. Moreover, the method exhibits robust performance across various levels of option moneyness. Lastly, the algorithm incorporates proportional transaction costs, and the results indicate that their impact on profit and loss varies with the statistical properties of the state variables.
Full thesis: Applying Reinforcement Learning to Option Pricing and Hedging.pdf
This thesis is also available at https://ssrn.com/abstract=4546371 and https://arxiv.org/abs/2310.04336