This repository was archived by the owner on Dec 22, 2023. It is now read-only.
Scrape algo trading articles from quant finance websites #384
Merged
16 commits
91b29f1 Create README.md (shivanshsinghal107)
df6b5dc Create requirements.txt (shivanshsinghal107)
06f4472 Uploaded script to scrape top algo trading articles (shivanshsinghal107)
a5f1a19 Uploaded screenshot of the scraping script (shivanshsinghal107)
867e426 Added instructions to run the script (shivanshsinghal107)
9ce12de Added comments for some difficult parts of code (shivanshsinghal107)
830a88d Added details about script and some useful information (shivanshsinghal107)
b1fb27b Added functionality to store article links into a output.txt file (shivanshsinghal107)
edebe1f Delete script_in_action.PNG (shivanshsinghal107)
cf5a561 Uploaded screenshots and sample output files (shivanshsinghal107)
79261bf Update README.md (shivanshsinghal107)
817861d Delete script_in_action.PNG (shivanshsinghal107)
cffe074 Moved output.txt to output folder (shivanshsinghal107)
12ccef4 Delete output.PNG (shivanshsinghal107)
c442383 Uploaded screenshots of script in action and output (shivanshsinghal107)
f50d5bc Added screenshots of the script and output (shivanshsinghal107)
20 changes: 20 additions & 0 deletions
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/README.md
## Algorithmic Trading & Mathematical Finance Articles Scraper
### What does this script do?
Scrapes the top 10 articles on algorithmic trading and mathematical finance from each of Quantopian, QuantStart and Quantocracy, and saves their titles and links to `output.txt`.<br>

**Quantopian** - Quantopian is a Boston-based company that aims to create a crowd-sourced hedge fund by letting freelance quantitative analysts develop, test, and use trading algorithms to buy and sell securities.<br>
**Quantocracy** - Quantocracy is a curated mashup of trading blogs that deal in the quantitative and the empirical.<br>
**QuantStart** - QuantStart is an online portal for mathematical finance articles and tutorials on derivatives pricing, primarily to help prospective quants gain a role in quantitative finance.

### How to use this script?
- Run the command below to make sure the libraries used in this script are installed:<br>
`pip install -r requirements.txt`

- Run the script:<br>
`python scraper.py`

### Script in Action


### Script Output Text File

Binary file added
BIN
+118 KB
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/output/output.PNG
67 changes: 67 additions & 0 deletions
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/output/sample.txt
Here's the top 10 articles from Quantopian:
Welles Wilder's ADX - Average Directional Index (technical indicator implementation)
https://www.quantopian.com/posts/welles-wilders-adx-average-directional-index-technical-indicator-implementation
Learning SDEs in Python
https://www.quantopian.com/posts/learning-sdes-in-python
Futures Data Now Available in Research
https://www.quantopian.com/posts/futures-data-now-available-in-research
New Strategy - Presenting the “Quality Companies in an Uptrend” Model
https://www.quantopian.com/posts/new-strategy-presenting-the-quality-companies-in-an-uptrend-model-1
Global Equity Pricing and Fundamental Data
https://www.quantopian.com/posts/global-equity-pricing-and-fundamental-data
New book on Quantopian/Zipline backtesting and modeling
https://www.quantopian.com/posts/new-book-on-quantopian-slash-zipline-backtesting-and-modeling
Pairs Trading with Machine Learning
https://www.quantopian.com/posts/pairs-trading-with-machine-learning
The Next Quantopian-Based Paper on Uncovering Momentum
https://www.quantopian.com/posts/the-next-quantopian-based-paper-on-uncovering-momentum
Analyzing the relationship between investor attention and the predictability of arbitrage strategies for the US market
https://www.quantopian.com/posts/analyzing-the-relationship-between-investor-attention-and-the-predictability-of-arbitrage-strategies-for-the-us-market
The 101 Alphas Project
https://www.quantopian.com/posts/the-101-alphas-project


Here's the top 10 articles from Quantocracy:
Exploring the PMFG Portfolios for Covid-19 Robustness [Hudson and Thames]
https://hudsonthames.org/exploring-the-pmfg-portfolios-for-covid-19-robustness/
The Next 5 Weeks All Are Among The Weakest – And Strongest – Of The Year [Quantifiable Edges]
https://quantifiableedges.com/the-next-5-weeks-all-are-among-the-weakest-and-strongest-of-the-year/
Lottery Preferences and Their Relationship with Factor Investing [Alpha Architect]
https://alphaarchitect.com/2020/10/01/lottery-preferences-and-anomalies/
Using strength to exit a mean reversion trade [Alvarez Quant Trading]
https://alvarezquanttrading.com/blog/using-strength-to-exit-a-mean-reversion-trade/
Safe Withdrawal Rates for Tactical Asset Allocation vs Buy & Hold [Allocate Smartly]
https://allocatesmartly.com/safe-withdrawal-rates-for-tactical-asset-allocation-vs-buy-hold/?aff=634
Does Financial Leverage Make Stocks Riskier? [Factor Research]
https://www.factorresearch.com/research-does-financial-leverage-make-stocks-riskier
September update, paper trading with IB [Regressionist]
http://www.regressionist.com/2020/09/25/september-update-paper-trading-with-ib/
Writing conundrums [OSM]
https://osm.netlify.app/post/writing-conundrums/
API Algo Trading Landscape [Alpaca]
https://alpaca.markets/learn/algo-trading-landscape/
Petra on Programming: The Gann Hi-Lo Activator [Financial Hacker]
https://financial-hacker.com/petra-on-programming-the-gann-hi-lo-activator/


Here's the top 10 articles from Quantstart Systematic Trading:
Connecting to the Interactive Brokers Native Python API
https://www.quantstart.com/articles/connecting-to-the-interactive-brokers-native-python-api/
Generating Synthetic Histories for Backtesting Tactical Asset Allocation Strategies
https://www.quantstart.com/articles/generating-synthetic-histories-for-backtesting-tactical-asset-allocation-strategies/
The 60/40 Benchmark Portfolio
https://www.quantstart.com/articles/the-6040-benchmark-portfolio/
Systematic Tactical Asset Allocation: An Introduction
https://www.quantstart.com/articles/systematic-tactical-asset-allocation-an-introduction/
Capital Raising for Early Stage Quant Fund Managers - Part I
https://www.quantstart.com/articles/capital-raising-for-early-stage-quant-fund-managers-part-i/
High Frequency Trading III: Optimal Execution
https://www.quantstart.com/articles/high-frequency-trading-iii-optimal-execution/
High Frequency Trading II: Limit Order Book
https://www.quantstart.com/articles/high-frequency-trading-ii-limit-order-book/
High Frequency Trading I: Introduction to Market Microstructure
https://www.quantstart.com/articles/high-frequency-trading-i-introduction-to-market-microstructure/
Backtesting Systematic Trading Strategies in Python: Considerations and Open Source Frameworks
https://www.quantstart.com/articles/backtesting-systematic-trading-strategies-in-python-considerations-and-open-source-frameworks/
What are the Different Types of Quant Funds?
https://www.quantstart.com/articles/what-are-the-different-types-of-quant-funds/
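
The sample output has a simple shape: one header line per site, then alternating title and URL lines, with blank lines between sites. A minimal sketch of reading such a file back into structured data could look like the following. `parse_output` is a hypothetical helper that is not part of this PR, and it assumes every title line is immediately followed by its URL.

```python
def parse_output(text):
    """Parse the scraper's text output into {section header: [(title, url), ...]}.

    Hypothetical helper, not part of the PR. Assumes each section starts with
    a line beginning "Here's the top 10 articles from" and that titles and
    URLs strictly alternate after it.
    """
    sections = {}
    current = None
    pending_title = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if line.startswith("Here's the top 10 articles from"):
            current = line.rstrip(":")
            sections[current] = []
            pending_title = None
        elif current is not None:
            if pending_title is None:
                pending_title = line  # first line of a pair is the title
            else:
                sections[current].append((pending_title, line))
                pending_title = None
    return sections


sample = (
    "Here's the top 10 articles from Quantopian:\n"
    "Learning SDEs in Python\n"
    "https://www.quantopian.com/posts/learning-sdes-in-python\n"
    "\n\n"
    "Here's the top 10 articles from Quantocracy:\n"
    "API Algo Trading Landscape [Alpaca]\n"
    "https://alpaca.markets/learn/algo-trading-landscape/\n"
)
parsed = parse_output(sample)
for site, articles in parsed.items():
    print(site, len(articles))
```

Keeping the header text as the dictionary key avoids guessing at site names; a caller that wants shorter keys could strip the common prefix.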
Binary file added
BIN
+20.4 KB
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/output/script_in_action.PNG
2 changes: 2 additions & 0 deletions
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/requirements.txt
beautifulsoup4==4.9.3
requests==2.24.0
87 changes: 87 additions & 0 deletions
Scripts/Web_Scrappers/Algo_Trading_Articles_Scrapper/scraper.py
import requests
from bs4 import BeautifulSoup

# seconds to wait for each site before giving up
TIMEOUT = 10

# function to get top 10 articles from quantopian
def get_quantopian_articles():
    res = requests.get("https://www.quantopian.com/posts", timeout=TIMEOUT)
    # fail early on an HTTP error instead of parsing an error page
    res.raise_for_status()
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("#search-results")[0]
    top10 = posts.find_all('div', {'class': 'post-title'})[:10]

    # post titles
    lines = [post.text.strip() for post in top10]

    # hrefs on the page are relative, so prepend the site root
    links = []
    for post in top10:
        for a in post.find_all('a', href=True):
            links.append('https://www.quantopian.com' + a['href'].strip())

    result = "Here's the top 10 articles from Quantopian:\n"
    # zip stops at the shorter list, so fewer than 10 results cannot raise IndexError
    for line, link in zip(lines, links):
        result += line + "\n" + link + "\n"
    return result

# function to get top 10 articles from quantocracy
def get_quantocracy_articles():
    # using a browser user agent to bypass the site block for bots
    headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'}
    res = requests.get("https://quantocracy.com/", headers=headers, timeout=TIMEOUT)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("#qo-mashup")[0]
    top10 = posts.find_all('a', {'class': 'qo-title'})[:10]

    # each anchor carries both the title text and an absolute link
    lines = [post.text.strip() for post in top10]
    links = [post['href'] for post in top10]

    result = "Here's the top 10 articles from Quantocracy:\n"
    for line, link in zip(lines, links):
        result += line + "\n" + link + "\n"
    return result

# function to get top 10 articles from quantstart systematic trading
def get_quantstart_articles():
    res = requests.get("https://www.quantstart.com/articles/topic/systematic-trading/", timeout=TIMEOUT)
    res.raise_for_status()
    soup = BeautifulSoup(res.text, "html.parser")
    posts = soup.select("body > div > section.mb-2 > div")[0]

    # titles come from the <p> tags, links from the <a> tags of the same container
    lines = [post.text for post in posts.find_all('p')[:10]]
    # hrefs on the page are relative, so prepend the site root
    links = ['https://www.quantstart.com' + a['href'] for a in posts.find_all('a')[:10]]

    result = "Here's the top 10 articles from Quantstart Systematic Trading:\n"
    for line, link in zip(lines, links):
        result += line + "\n" + link + "\n"
    return result


def main():
    quantopian = get_quantopian_articles()
    quantocracy = get_quantocracy_articles()
    quantstart = get_quantstart_articles()
    # the with statement closes the file even if the write fails;
    # utf-8 keeps non-ASCII characters in titles from breaking the write
    with open('output.txt', 'w', encoding='utf-8') as file:
        file.write(f"{quantopian}\n\n{quantocracy}\n\n{quantstart}")
    print("Article Links saved in the output file")

if __name__ == "__main__":
    main()
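
All three functions finish by joining titles and links into the same text layout, and two of them first turn relative hrefs into absolute URLs. As a rough sketch of how that shared step might be factored out, the hypothetical helper `format_articles` below (not part of this PR) pairs titles with links and resolves relative hrefs with the standard library's `urljoin`. The `base_url` and `limit` parameters are assumptions made for illustration.

```python
from urllib.parse import urljoin


def format_articles(source, titles, hrefs, base_url=None, limit=10):
    """Build the text block for one site.

    Hypothetical helper, not part of the PR. `hrefs` may be relative; when
    `base_url` is given they are resolved against it with urljoin.
    """
    block = f"Here's the top {limit} articles from {source}:\n"
    # zip stops at the shorter list, so a short page cannot raise IndexError
    for title, href in list(zip(titles, hrefs))[:limit]:
        link = urljoin(base_url, href) if base_url else href
        block += f"{title.strip()}\n{link}\n"
    return block


example = format_articles(
    "Quantstart Systematic Trading",
    ["The 60/40 Benchmark Portfolio"],
    ["/articles/the-6040-benchmark-portfolio/"],
    base_url="https://www.quantstart.com",
)
print(example)
```

Because the helper takes plain lists, it can be exercised without any network access, which makes the formatting easy to check separately from the scraping.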
Everything else is cool, just use the 'with' statement to open the file
okay, doin' it
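
For reference, the change requested in this thread could look roughly like the sketch below. `save_articles` and the temporary output path are hypothetical names used only for illustration; the point is that a `with` block closes the file on exit, even when the write raises.

```python
import os
import tempfile


def save_articles(path, sections):
    """Write the sections to `path`, joined by two newlines."""
    # The file is closed when the with block exits, even if write() raises
    with open(path, "w", encoding="utf-8") as file:
        file.write("\n\n".join(sections))


# Write to a temporary directory so the example leaves the working tree alone
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
save_articles(out_path, ["first section\n", "second section\n"])
with open(out_path, encoding="utf-8") as file:
    saved = file.read()
print(saved)
```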