wiki-spider

A personal project to generate large datasets

Using Python web scraping packages and inexpensive database hosting on AWS, this script crawls Fandom wikis for relevant links to other pages in the same wiki and collects them in a Python list. It then works through that list one page at a time, capturing each page's HTML and extracting the relevant information. GitHub Actions runs the script.
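
The two-phase flow described above (collect links first, then visit each page) could look roughly like the sketch below. It is not the code in general_spider.py: it uses only the Python standard library rather than whichever scraping packages the project depends on, and the names `LinkCollector`, `extract_wiki_links`, `crawl`, `fetch`, and `extract_info` are hypothetical. It also assumes that article pages live under `/wiki/` and that namespaced pages such as `Special:` are not relevant.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_wiki_links(html, base_url):
    """Return absolute, de-duplicated links that stay on the same wiki."""
    collector = LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment.
        absolute = urljoin(base_url, href).split("#")[0]
        parsed = urlparse(absolute)
        # Assumption: only same-host pages under /wiki/ are "relevant",
        # and namespaced pages (path contains ":") are skipped.
        if parsed.netloc != base_host:
            continue
        if not parsed.path.startswith("/wiki/") or ":" in parsed.path:
            continue
        if absolute not in links:
            links.append(absolute)
    return links


def crawl(start_url, fetch, extract_info):
    """Collect links from the start page, then visit each one in order.

    `fetch` maps a URL to its HTML and `extract_info` maps HTML to whatever
    should be stored; both are supplied by the caller (hypothetical hooks).
    """
    results = {}
    for url in extract_wiki_links(fetch(start_url), start_url):
        results[url] = extract_info(fetch(url))
    return results
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library and lets the link filtering be tried on a saved HTML string without touching the network. Writing the results to a database is left out, since the source does not say how that step is done.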

Repository contents:
.github/workflows
__pycache__
.DS_Store
README.md
general_spider.py
requirements.txt

Trinitui/wiki-spider
