web-crawler

What this web-crawler does

Recursively crawls https://stackoverflow.com/questions using a Node.js based crawler, harvests the questions it finds on Stack Overflow, and stores them in a MongoDB database.
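
A minimal sketch of the crawl step, assuming axios for HTTP and cheerio for HTML parsing (the repository's actual dependencies, selectors, and traversal order may differ):

    const axios = require('axios');
    const cheerio = require('cheerio');

    const seen = new Set();

    // Illustrative crawl step: fetch a listing page, pull out question
    // links, and recurse into unvisited ones. Not the repository's
    // actual code.
    async function crawl(url) {
      if (seen.has(url)) return;
      seen.add(url);

      const { data: html } = await axios.get(url);
      const $ = cheerio.load(html);

      const links = [];
      $('a[href^="/questions/"]').each((_, a) => {
        links.push(new URL($(a).attr('href'), 'https://stackoverflow.com').href);
      });

      for (const link of links) {
        // ...record the link / bump its reference count here, then:
        await crawl(link);
      }
    }

    crawl('https://stackoverflow.com/questions').catch(console.error);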

What exactly will be stored

  1. Every unique URL (Stack Overflow question).
  2. The total reference count for every URL (how many times the URL was encountered).
  3. The total number of upvotes and the total number of answers for every question.
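
With Mongoose, a stored record for one question might look like the sketch below; the field names here are illustrative assumptions, not necessarily the repository's actual schema:

    const mongoose = require('mongoose');

    // Hypothetical shape of a stored question; the real schema in the
    // repository may use different field names.
    const questionSchema = new mongoose.Schema({
      url: { type: String, unique: true },          // unique question URL
      referenceCount: { type: Number, default: 1 }, // times the URL was encountered
      upvotes: Number,                              // total upvotes on the question
      answers: Number,                              // total answers on the question
    });

    module.exports = mongoose.model('Question', questionSchema);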

Finally, when the user kills the script, it dumps the collected data to a CSV file.
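
One common way to implement this is a SIGINT handler that writes the collection to disk before exiting. A sketch under that assumption, reusing the hypothetical Question model from above (the project may use a dedicated CSV library instead of manual string building):

    const fs = require('fs');

    // Illustrative shutdown hook: on Ctrl+C, dump every stored question
    // to questions.csv, then exit. Assumes the Question model sketched
    // earlier is in scope.
    process.on('SIGINT', async () => {
      const rows = await Question.find().lean();
      const header = 'url,referenceCount,upvotes,answers\n';
      const body = rows
        .map(r => [r.url, r.referenceCount, r.upvotes, r.answers].join(','))
        .join('\n');
      fs.writeFileSync('questions.csv', header + body);
      process.exit(0);
    });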

Project Setup

  1. Install the npm packages required by the project using the command

    npm install 
    
  2. Create a config.env file in the root folder of the project and add the line below, using the connection URI of your MongoDB database (the sketch after these steps shows how the script presumably loads this file)

    DATABASE=YOUR_MONGODB_DATABASE_CONNECTION_URI
    
  3. To start the script

    npm start
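
For reference, the script presumably loads config.env with dotenv and then connects through mongoose; a minimal sketch of that wiring (an assumption, not necessarily the repository's exact startup code):

    const dotenv = require('dotenv');
    const mongoose = require('mongoose');

    // Load config.env from the project root so process.env.DATABASE is set.
    dotenv.config({ path: './config.env' });

    // Connect to the database named in config.env before crawling starts.
    mongoose
      .connect(process.env.DATABASE)
      .then(() => console.log('MongoDB connection established'))
      .catch(err => console.error('MongoDB connection failed:', err));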
    
