
BASHkrawler

1. Description

A Bash web crawler that finds URLs by parsing the HTML source code of a given website's homepage, along with the JavaScript links found there. An optional pattern word can be passed as an argument to filter the extracted URLs.
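The URL-extraction idea the description refers to can be sketched with standard tools. This is a minimal illustration, not the script's actual code: the regex and the sample HTML snippet are assumptions, and the real tool would feed the pipeline something like `curl -s "$domain"` instead of a hard-coded page.

```shell
#!/usr/bin/env bash
# Sketch of the HTML-parsing step: pull every absolute http(s) URL
# out of a page with grep -oE. The sample snippet stands in for a
# live homepage so the sketch runs offline.

html='<a href="https://www.nasa.gov/news">News</a>
<script src="https://www.nasa.gov/assets/app.js"></script>
<a href="https://images.nasa.gov/gallery">Gallery</a>'

# Extract URLs, stopping at quotes/spaces/angle brackets, then de-duplicate.
urls=$(printf '%s\n' "$html" | grep -oE 'https?://[^"'\'' <>]+' | sort -u)
printf '%s\n' "$urls"
```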


2. Install

➜ git clone https://github.com/torsh4rk/BASHkrawler.git
➜ cd BASHkrawler/ && chmod +x bashkrawler.sh
➜ ./bashkrawler.sh

3. Example Usage

Fig.1 - Displaying the banner


3.1. HTML parsing without using a pattern word to match

Fig.2 - Choosing option 1 to find all URLs at the target domain www.nasa.gov via HTML parsing

Fig.3 - Finding all URLs at the target domain www.nasa.gov via HTML parsing


3.2. Finding all JS links at the target domain and parsing them without using a pattern word to match

Fig.4 - Choosing option 2 to find all JS links at the target domain www.nasa.gov and extract all URLs from those JS links
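The JS-link step can be sketched in two stages: collect every `.js` URL from the homepage, then parse each file for further URLs. The sample page and names below are illustrative assumptions; the real crawler fetches live pages.

```shell
#!/usr/bin/env bash
# Sketch of option 2: first collect every script URL ending in .js;
# the real crawler would then fetch and parse each one for more URLs.
# A sample page replaces a live fetch so the sketch runs offline.

page='<script src="https://cdn.nasa.gov/lib/jquery.js"></script>
<img src="https://www.nasa.gov/logo.png">
<script src="https://www.nasa.gov/assets/app.js"></script>'

js_links=$(printf '%s\n' "$page" | grep -oE 'https?://[^" ]+\.js' | sort -u)
printf '%s\n' "$js_links"

# A live crawl would continue with something like:
#   for js in $js_links; do curl -s "$js" | grep -oE 'https?://[^"'\'' <>]+'; done
```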


3.3. Full web crawling by running options 1 and 2 without using a pattern word to match

Fig.5 - Choosing option 3 to find all URLs at the target domain www.nasa.gov via options 1 and 2 without using a pattern word

Fig.6 - Finishing the full web crawl at the target domain www.nasa.gov


3.4. Web crawling using a pattern word to match

Fig.7 - Crawling a target domain to find all URLs containing the word ".nasa"

Fig.8 - Choosing option 3 to find all URLs containing the word ".nasa" at the target domain www.nasa.gov via options 1 and 2

Fig.9 - Finishing the full web crawl at the target domain www.nasa.gov using the word ".nasa"
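The pattern-word filter can be understood as a plain grep applied after extraction. The URL list and the assumption that the match is a fixed string (hence `grep -F`, so the "." in ".nasa" is literal) are illustrative, not taken from the script itself.

```shell
#!/usr/bin/env bash
# Sketch of the optional pattern-word filter: narrow the extracted
# URL list to entries containing the supplied word (".nasa", as in
# Figs. 7-9). grep -F treats the pattern as a fixed string.

urls='https://images.nasa.gov/gallery
https://cdn.example.com/lib.js
https://www.nasa.gov/news'

pattern='.nasa'
matched=$(printf '%s\n' "$urls" | grep -F "$pattern")
printf '%s\n' "$matched"
```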


4. References

https://medium.datadriveninvestor.com/what-is-a-web-crawler-and-how-does-it-work-b9e9c2e4c35d
