Scrapy (Stocks Scraper)

This crawler uses Puppeteer to crawl tradingview.com and collects information on stocks in various sectors, including healthcare, technology, energy, and more. For each stock it collects fields such as the rating, name, price, and change in price, and stores them locally in an xlsx file on an automated schedule (just after the markets close for the day).
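As a rough illustration of the extraction step, here is a minimal sketch, not the repository's actual code. It assumes a Puppeteer `page` object is passed in, and every selector in `SELECTORS` is hypothetical, since tradingview.com's real markup is not described in this README. The function names `extractRows` and `scrapeSector` are also made up for the example.

```javascript
// Hypothetical CSS selectors: placeholders, NOT tradingview.com's real markup.
const SELECTORS = {
  row: 'tr.stock-row',
  name: '.stock-name',
  price: '.stock-price',
  change: '.stock-change',
  rating: '.stock-rating',
};

// Turns a list of row elements into plain objects.
// Puppeteer serializes this function and runs it inside the browser page,
// so it must not reference variables from the Node.js scope.
function extractRows(rows, sel) {
  const text = (row, css) => {
    const el = row.querySelector(css);
    return el ? el.textContent.trim() : '';
  };
  return rows.map((row) => ({
    name: text(row, sel.name),
    price: text(row, sel.price),
    change: text(row, sel.change),
    rating: text(row, sel.rating),
  }));
}

// Loads one sector page and returns its stock rows.
// `page` is assumed to be a Puppeteer Page created by the caller.
async function scrapeSector(page, url) {
  await page.goto(url, { waitUntil: 'networkidle2' });
  await page.waitForSelector(SELECTORS.row); // wait for dynamically loaded rows
  return page.$$eval(SELECTORS.row, extractRows, SELECTORS);
}
```

With Puppeteer installed, `page` would typically come from `puppeteer.launch()` followed by `browser.newPage()`. Passing the page in keeps the extraction logic easy to try against a stub.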

Motivation

The reason I created this repository was primarily to gain experience in automated web scraping: retrieving data from the web and storing it in an xlsx file. A secondary goal was to learn a bit about investing and the stock market along the way, in order to plan for my financial future.

Future Steps

In the future, I may attempt sentiment analysis of news articles and see whether I can use it to observe trends in stock prices. I also want to create a data pipeline that analyzes my crawler's logs to ensure the automation keeps running smoothly. For now, this has simply been a good introduction to web scraping, and I am currently looking for a project I am more interested in to apply these concepts.

What I've Learned

How to call APIs and use CSS selectors to extract information from specific web pages
How to automate running programs using Node.js
File System (fs) and Path (path) modules: how to create directories and use them as needed
Web scraping basics, such as:

Limiting page requests to avoid bot detection or being blocked (and to be nice to the target website)
Websites publish robots.txt files that provide valuable information on their web-crawling policies
The difficulties of crawling dynamically loaded content
Websites set traps to detect bots, such as hidden tags that only bots would see
How to schedule a program with node-cron so that the task runs automatically
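Two of the points above, creating directories with fs and path and limiting page requests, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in makeDir.js or tradingView.js: the helper names (`ensureSectorDir`, `sleep`, `crawlPolitely`), the per-sector folder layout, and the 3-second default delay are all hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Creates (if needed) and returns an output directory for one sector,
// e.g. <baseDir>/Technology. The per-sector layout is an assumption.
function ensureSectorDir(baseDir, sector) {
  const dir = path.join(baseDir, sector);
  fs.mkdirSync(dir, { recursive: true }); // no error if it already exists
  return dir;
}

// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Visits the URLs one at a time, pausing between requests so the
// target site is not flooded. `visit` is any async function that
// takes a URL and returns that page's data.
async function crawlPolitely(urls, visit, delayMs = 3000) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    if (i > 0) await sleep(delayMs); // no pause before the first request
    results.push(await visit(urls[i]));
  }
  return results;
}
```

Running the requests sequentially with a fixed pause is the simplest approach. A randomized delay would look less mechanical, but that is a design choice beyond what this README describes.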

Run it on your machine

Requirements

node

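1. Clone the project and install the dependencies.

npm modules: cheerio, puppeteer, node-cron (fs and path are built into Node.js and need no installation). To install them, type or copy the following into cmd or a terminal: npm init, then npm install cheerio puppeteer node-cron

2. Then, in the same directory, run node tradingView.js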
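The scraper is meant to run on a schedule just after the markets close. A minimal sketch of how that scheduling could look with node-cron is below. It rests on several assumptions that are not stated in this README: that "market close" means the US close at 16:00 Eastern, that a 30-minute buffer is wanted, and that the scrape is exposed as an async function (called `runScrape` here). The node-cron module is passed in as a parameter so the sketch itself has no hard dependency.

```javascript
// Cron fields: minute 30, hour 16, any day of month, any month, Monday to Friday.
// ASSUMPTION: US markets (16:00 Eastern close) plus a 30-minute buffer.
const AFTER_CLOSE = '30 16 * * 1-5';

// `cron` is the node-cron module; `runScrape` is a hypothetical async
// function that performs one full scrape.
function scheduleDailyScrape(cron, runScrape, timezone = 'America/New_York') {
  if (!cron.validate(AFTER_CLOSE)) {
    throw new Error(`Invalid cron expression: ${AFTER_CLOSE}`);
  }
  return cron.schedule(
    AFTER_CLOSE,
    async () => {
      try {
        await runScrape();
      } catch (err) {
        // Log and keep the schedule alive so the next day's run still happens
        console.error('Scrape failed:', err);
      }
    },
    { timezone }
  );
}

// Usage (requires `npm install node-cron`):
// scheduleDailyScrape(require('node-cron'), runScrape);
```

The process has to stay running for the schedule to fire, so in practice the script would be left open in a terminal or managed by a process supervisor.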

