This repository contains two tools: a spider and a probe.
The spider takes a URL and a keyword as input and prints the URLs it finds that contain the keyword. Each URL appears only once in the output.
^ The successful spider run
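
For readers who want a feel for the idea, here is a minimal sketch of a keyword-filtering spider. It is not the code in spider.py; the names LinkCollector, matching_links and spider are hypothetical, and it assumes the spider only looks at the links on a single page and matches the keyword as a plain substring of the absolute URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def matching_links(html, base_url, keyword):
    """Return absolute links that contain keyword, keeping first occurrences only."""
    collector = LinkCollector()
    collector.feed(html)
    seen = set()
    result = []
    for href in collector.hrefs:
        url = urljoin(base_url, href)  # resolve relative links against the page URL
        if keyword in url and url not in seen:
            seen.add(url)
            result.append(url)
    return result


def spider(start_url, keyword):
    """Fetch one page and print its unique links that contain keyword."""
    with urlopen(start_url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in matching_links(html, start_url, keyword):
        print(url)
```

The set named seen is what keeps duplicates out of the output, while the list preserves the order in which links were found. Calling spider("https://example.com/", "docs") would print the matching links of that one page; following links recursively is left out of this sketch.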
The urls.txt file holds the output of spider.py, so it can be used as the input for probe.py.
To run probe.py, pipe urls.txt into it with the cat command:
cat urls.txt | python3 probe.py
All responsive URLs are stored in a file called filtered_urls.txt, and the URLs that do not respond are discarded.
^ The successful probe run
^ Output file successfully generated
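
The probing step can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of probe.py: the function names is_responsive, filter_urls and probe are hypothetical, and "responsive" is assumed here to mean that an HTTP request completes with a status below 400 within a short timeout.

```python
from http.client import HTTPException
from urllib.request import Request, urlopen


def is_responsive(url, timeout=5):
    """Return True if the URL answers with a status below 400 (assumed criterion)."""
    try:
        request = Request(url, headers={"User-Agent": "probe-sketch"})
        with urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, HTTPException):
        # Covers HTTP errors, DNS or connection failures, timeouts and malformed URLs.
        return False


def filter_urls(lines, check=is_responsive):
    """Return the unique, non-empty URLs from lines for which check(url) is true."""
    seen = set()
    kept = []
    for line in lines:
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            if check(url):
                kept.append(url)
    return kept


def probe(lines, output_path="filtered_urls.txt", check=is_responsive):
    """Write the responsive URLs to output_path, one per line, and return them."""
    kept = filter_urls(lines, check)
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.writelines(url + "\n" for url in kept)
    return kept
```

To mirror the piped command shown earlier, a script built on this sketch would call probe(sys.stdin) after importing sys. The check parameter is there so the filtering logic can be tried with a stand-in function instead of real network requests.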