Trinitui/wiki-spider

wiki-spider

A personal project to generate large datasets

Using Python web-scraping packages and inexpensive database hosting on AWS, this script crawls Fandom wikis for relevant links to other pages in the same wiki and collects them in a Python list. It then goes through that list one by one, fetching each page's HTML and extracting the relevant information. GitHub Actions runs the script.
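
The two-phase flow described above (collect links into a list, then visit each one) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses only the standard library rather than whichever scraping packages the script relies on, and the names `LinkCollector`, `collect_wiki_links`, `crawl`, `fetch`, and `extract`, as well as the `/wiki/` path filter, are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen while parsing (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_wiki_links(html, base_url):
    """Return absolute, de-duplicated links on the same host under /wiki/.

    The /wiki/ prefix is an assumed stand-in for "relevant links".
    """
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(base_url, href).split("#")[0]
        parsed = urlparse(absolute)
        same_wiki = parsed.netloc == host and parsed.path.startswith("/wiki/")
        if same_wiki and absolute not in links:
            links.append(absolute)
    return links


def crawl(start_url, fetch, extract):
    """Collect links from the start page, then visit them one by one.

    fetch(url) returns the page HTML; extract(url, html) returns one record.
    Both are supplied by the caller, so no HTTP client is assumed here.
    """
    links = collect_wiki_links(fetch(start_url), start_url)
    return [extract(url, fetch(url)) for url in links]
```

In a real run, `fetch` might wrap an HTTP client and `extract` might build the row that gets written to the database. Passing both in as arguments keeps the crawl logic testable without network access.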

About

Wiki Crawling
