This repo will contain drafts and updates toward making a modern, efficient web crawler and text processor in Python.

TextProcessorRoughDraft.py contains a class and also works as a complete program. It reads a file and stores every unique word (after stripping trailing punctuation and converting to a single case) in an object, usually a Python dictionary, that records how often each word occurs in the file. The contents of that object are then placed in a list sorted by frequency (highest first), then alphabetically.

FileComparator.py does the same with two files: it stores the words they have in common, along with their frequencies, and sorts them in the same way.
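The counting and sorting steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in TextProcessorRoughDraft.py or FileComparator.py: the function names `count_words`, `sort_by_frequency`, and `common_words` are hypothetical, words are assumed to be whitespace-separated, and the comparison step assumes that the frequency of a shared word is the sum of its counts in both files, which the description does not specify.

```python
import string
from collections import Counter


def count_words(text):
    """Count each unique word, ignoring case and trailing punctuation."""
    counts = Counter()
    for token in text.split():
        # Strip trailing punctuation only, then normalize to lowercase.
        word = token.rstrip(string.punctuation).lower()
        if word:
            counts[word] += 1
    return counts


def sort_by_frequency(counts):
    """Return (word, count) pairs: highest count first, ties alphabetical."""
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


def common_words(counts_a, counts_b):
    """Keep only words present in both inputs.

    Assumption: the stored frequency is the combined count from both files.
    """
    shared = counts_a.keys() & counts_b.keys()
    return {word: counts_a[word] + counts_b[word] for word in shared}


def count_words_in_file(path):
    """Read a text file (assumed UTF-8) and count its words."""
    with open(path, encoding="utf-8") as handle:
        return count_words(handle.read())
```

For example, `sort_by_frequency(count_words("The cat saw the dog. The dog ran!"))` yields `[("the", 3), ("dog", 2), ("cat", 1), ("ran", 1), ("saw", 1)]`. Negating the count in the sort key gives the descending frequency order while leaving the alphabetical tie-break ascending.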
joshpas64/Building-Python-WebCrawler