This repository was archived by the owner on Dec 22, 2023. It is now read-only.

Scrape algo trading articles from quant finance websites #384

Merged
merged 16 commits into from
Oct 5, 2020
20 changes: 20 additions & 0 deletions Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/README.md
@@ -0,0 +1,20 @@
## Algorithmic Trading & Mathematical Finance Articles Scraper
### What does this script do?
Scrapes the top articles on algorithmic trading and mathematical finance from Quantopian, QuantStart and Quantocracy.<br>

**Quantopian** - Quantopian is a Boston-based company that aims to create a crowd-sourced hedge fund by letting freelance quantitative analysts develop, test, and use trading algorithms to buy and sell securities.<br>
**QuantStart** - QuantStart is an online portal for mathematical finance articles and tutorials on derivatives pricing, primarily to help prospective quants gain a role in quantitative finance.<br>
**Quantocracy** - Quantocracy is a curated mashup of trading blogs that deal in the quantitative and the empirical.

### How to use this script
- Install the libraries this script depends on:<br>
`pip install -r requirements.txt`

- Run the scraper, which writes the article titles and links to `output.txt`:<br>
`python scraper.py`

### Script in Action
![script in action](output/script_in_action.PNG)

### Script Output Text File
![output file](output/output.PNG)
@@ -0,0 +1,67 @@
Here's the top 10 articles from Quantopian:
Welles Wilder's ADX - Average Directional Index (technical indicator implementation)
https://www.quantopian.com/posts/welles-wilders-adx-average-directional-index-technical-indicator-implementation
Learning SDEs in Python
https://www.quantopian.com/posts/learning-sdes-in-python
Futures Data Now Available in Research
https://www.quantopian.com/posts/futures-data-now-available-in-research
New Strategy - Presenting the “Quality Companies in an Uptrend” Model
https://www.quantopian.com/posts/new-strategy-presenting-the-quality-companies-in-an-uptrend-model-1
Global Equity Pricing and Fundamental Data
https://www.quantopian.com/posts/global-equity-pricing-and-fundamental-data
New book on Quantopian/Zipline backtesting and modeling
https://www.quantopian.com/posts/new-book-on-quantopian-slash-zipline-backtesting-and-modeling
Pairs Trading with Machine Learning
https://www.quantopian.com/posts/pairs-trading-with-machine-learning
The Next Quantopian-Based Paper on Uncovering Momentum
https://www.quantopian.com/posts/the-next-quantopian-based-paper-on-uncovering-momentum
Analyzing the relationship between investor attention and the predictability of arbitrage strategies for the US market
https://www.quantopian.com/posts/analyzing-the-relationship-between-investor-attention-and-the-predictability-of-arbitrage-strategies-for-the-us-market
The 101 Alphas Project
https://www.quantopian.com/posts/the-101-alphas-project


Here's the top 10 articles from Quantocracy:
Exploring the PMFG Portfolios for Covid-19 Robustness [Hudson and Thames]
https://hudsonthames.org/exploring-the-pmfg-portfolios-for-covid-19-robustness/
The Next 5 Weeks All Are Among The Weakest – And Strongest – Of The Year [Quantifiable Edges]
https://quantifiableedges.com/the-next-5-weeks-all-are-among-the-weakest-and-strongest-of-the-year/
Lottery Preferences and Their Relationship with Factor Investing [Alpha Architect]
https://alphaarchitect.com/2020/10/01/lottery-preferences-and-anomalies/
Using strength to exit a mean reversion trade [Alvarez Quant Trading]
https://alvarezquanttrading.com/blog/using-strength-to-exit-a-mean-reversion-trade/
Safe Withdrawal Rates for Tactical Asset Allocation vs Buy & Hold [Allocate Smartly]
https://allocatesmartly.com/safe-withdrawal-rates-for-tactical-asset-allocation-vs-buy-hold/?aff=634
Does Financial Leverage Make Stocks Riskier? [Factor Research]
https://www.factorresearch.com/research-does-financial-leverage-make-stocks-riskier
September update, paper trading with IB [Regressionist]
http://www.regressionist.com/2020/09/25/september-update-paper-trading-with-ib/
Writing conundrums [OSM]
https://osm.netlify.app/post/writing-conundrums/
API Algo Trading Landscape [Alpaca]
https://alpaca.markets/learn/algo-trading-landscape/
Petra on Programming: The Gann Hi-Lo Activator [Financial Hacker]
https://financial-hacker.com/petra-on-programming-the-gann-hi-lo-activator/


Here's the top 10 articles from Quantstart Systematic Trading:
Connecting to the Interactive Brokers Native Python API
https://www.quantstart.com/articles/connecting-to-the-interactive-brokers-native-python-api/
Generating Synthetic Histories for Backtesting Tactical Asset Allocation Strategies
https://www.quantstart.com/articles/generating-synthetic-histories-for-backtesting-tactical-asset-allocation-strategies/
The 60/40 Benchmark Portfolio
https://www.quantstart.com/articles/the-6040-benchmark-portfolio/
Systematic Tactical Asset Allocation: An Introduction
https://www.quantstart.com/articles/systematic-tactical-asset-allocation-an-introduction/
Capital Raising for Early Stage Quant Fund Managers - Part I
https://www.quantstart.com/articles/capital-raising-for-early-stage-quant-fund-managers-part-i/
High Frequency Trading III: Optimal Execution
https://www.quantstart.com/articles/high-frequency-trading-iii-optimal-execution/
High Frequency Trading II: Limit Order Book
https://www.quantstart.com/articles/high-frequency-trading-ii-limit-order-book/
High Frequency Trading I: Introduction to Market Microstructure
https://www.quantstart.com/articles/high-frequency-trading-i-introduction-to-market-microstructure/
Backtesting Systematic Trading Strategies in Python: Considerations and Open Source Frameworks
https://www.quantstart.com/articles/backtesting-systematic-trading-strategies-in-python-considerations-and-open-source-frameworks/
What are the Different Types of Quant Funds?
https://www.quantstart.com/articles/what-are-the-different-types-of-quant-funds/
@@ -0,0 +1,2 @@
beautifulsoup4==4.9.3
requests==2.24.0
87 changes: 87 additions & 0 deletions Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/scraper.py
@@ -0,0 +1,87 @@
import requests
from bs4 import BeautifulSoup

# function to get top 10 articles from quantopian
def get_quantopian_articles():
    res = requests.get("https://www.quantopian.com/posts")
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("#search-results")[0]
    top10 = posts.findAll('div', {'class': 'post-title'})[:10]

    lines = []
    for i in top10:
        lines.append(i.text.strip())

    hrefs = []
    for i in top10:
        for a in i.find_all('a', href = True):
            hrefs.append(a['href'].strip())

    links = []
    for href in hrefs:
        link = 'https://www.quantopian.com' + href
        links.append(link)

    top10 = "Here's the top 10 articles from Quantopian:\n"
    for i in range(0, 10):
        top10 += lines[i] + "\n" + links[i] + "\n"
    return top10
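
Note on the `for i in range(0, 10)` loop: it indexes `lines` and `links` directly, so it would raise `IndexError` if a page ever returned fewer than ten posts. A minimal sketch of one way to avoid that, assuming a shared helper is acceptable; `format_section` and its `limit` parameter are hypothetical names, not part of this PR:

```python
def format_section(source, titles, links, limit=10):
    """Build the text block for one site, pairing each title with its link.

    Hypothetical helper: produces the same layout as the scraper's output
    (header line, then title and link on alternating lines).
    """
    lines = [f"Here's the top {limit} articles from {source}:"]
    # zip() stops at the shorter list, so a short page cannot raise IndexError.
    for title, link in list(zip(titles, links))[:limit]:
        lines.append(title)
        lines.append(link)
    return "\n".join(lines) + "\n"
```

Each of the three `get_*_articles` functions could then end with something like `return format_section("Quantopian", lines, links)`.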

# function to get top 10 articles from quantocracy
def get_quantocracy_articles():
    # using user agent to bypass the site block for bots
    headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'}
    res = requests.get("https://quantocracy.com/", headers = headers)
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("#qo-mashup")[0]
    top10 = posts.findAll('a', {'class': 'qo-title'})[:10]

    lines = []
    for i in top10:
        lines.append(i.text.strip())

    links = []
    for i in top10:
        links.append(i['href'])

    top10 = "Here's the top 10 articles from Quantocracy:\n"
    for i in range(0, 10):
        top10 += lines[i] + "\n" + links[i] + "\n"
    return top10
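
Note on the `requests.get` calls: none of them sets a timeout or checks the HTTP status, so a blocked or failing request would surface later as an `IndexError` on the `soup.select(...)[0]` line. The sketch below shows one possible wrapper. `fetch_html`, `DEFAULT_HEADERS` and the injectable `get` parameter are assumptions made for illustration, and the shortened User-Agent string is a placeholder that may not be enough to get past Quantocracy's bot check.

```python
# Placeholder header; the script itself sends a full browser User-Agent string.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url, get=None, headers=None, timeout=10):
    """Return the page body, failing early on network or HTTP errors.

    `get` is a hypothetical injection point so the helper can be exercised
    without network access; by default it falls back to requests.get.
    """
    if get is None:
        import requests  # imported lazily so the helper can be tested offline
        get = requests.get
    res = get(url, headers=headers or DEFAULT_HEADERS, timeout=timeout)
    # Raises an exception for 4xx/5xx responses instead of parsing an error page.
    res.raise_for_status()
    return res.text
```

With this in place, a call such as `soup = BeautifulSoup(fetch_html("https://quantocracy.com/"), "html.parser")` would stop at the request itself when the site refuses the connection.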

# function to get top 10 articles from quantstart systematic trading
def get_quantstart_articles():
    res = requests.get("https://www.quantstart.com/articles/topic/systematic-trading/")
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("body > div > section.mb-2 > div")[0]

    lines = []
    for post in posts.findAll('p')[:10]:
        lines.append(post.text)

    hrefs = []
    for href in posts.findAll('a')[:10]:
        hrefs.append(href['href'])

    links = []
    for href in hrefs:
        link = 'https://www.quantstart.com' + href
        links.append(link)

    top10 = "Here's the top 10 articles from Quantstart Systematic Trading:\n"
    for i in range(0, 10):
        top10 += lines[i] + "\n" + links[i] + "\n"
    return top10
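
Note on link building: the Quantopian and QuantStart functions prepend the site root to each `href` by string concatenation, which assumes every link is site-relative. A small sketch using the standard library's `urljoin`, which also leaves already-absolute URLs untouched; `absolute_links` is a hypothetical helper name:

```python
from urllib.parse import urljoin


def absolute_links(base_url, hrefs):
    """Resolve each href against base_url; absolute hrefs pass through unchanged."""
    return [urljoin(base_url, href.strip()) for href in hrefs]
```

For example, `absolute_links("https://www.quantstart.com", hrefs)` could replace the `links` loop above.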


def main():
    quantopian = get_quantopian_articles()
    quantocracy = get_quantocracy_articles()
    quantstart = get_quantstart_articles()
    file = open('output.txt', 'w')
Member

Everything else is cool, just use the 'with' statement to open the file

Contributor Author

okay, doin' it

    file.write(f"{quantopian}\n\n{quantocracy}\n\n{quantstart}")
    file.close()
    print("Article Links saved in the output file")

if __name__ == "__main__":
    main()
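
The review comment above asks for the output file to be opened with a `with` statement, so that it is closed even if `write` raises. A minimal sketch of what that change might look like; `save_report` is a hypothetical helper, and the explicit `encoding="utf-8"` is an added assumption (the scraped titles contain non-ASCII characters such as curly quotes), not something the reviewer requested:

```python
def save_report(sections, path="output.txt"):
    """Write the per-site text blocks to path, separated by blank lines."""
    # The context manager closes the file on exit, including on exceptions.
    with open(path, "w", encoding="utf-8") as file:
        file.write("\n\n".join(sections))
```

In `main()`, the `open` / `write` / `close` lines could then become `save_report([quantopian, quantocracy, quantstart])`.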