WayBackup Finder

This Python script fetches URLs from the Wayback Machine and filters them based on specified file extensions. It also checks if archived snapshots are available for each URL and saves the filtered URLs to files.

Read more: Medium
Watch Tool in action: Medium

Community Blogs:
WayBackupFinder Passive Recon

Features

Fetches URLs from the Wayback Machine using the CDX API.
Filters the fetched URLs by specific file extensions (e.g., .pdf, .zip).
Checks if a Wayback snapshot is available for each URL.
Saves the filtered URLs to text files.
Accepts custom file extensions to filter by, or loads the default extensions from extensions.txt.
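
To illustrate the first feature, a CDX query for a domain could be assembled roughly as shown here. This is a minimal sketch, not the script's actual code: the helper names `build_cdx_url`, `parse_cdx_rows`, and `fetch_archived_urls` are hypothetical, the chosen query parameters (`output=json`, `fl=original`, `collapse=urlkey`) are assumptions about a reasonable query, and the sketch uses the standard library's `urllib` instead of `requests` to stay dependency-free.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain):
    """Build a CDX API query that lists archived URLs under a domain."""
    params = {
        "url": f"{domain}/*",    # match every path under the domain
        "output": "json",        # return rows as a JSON array
        "fl": "original",        # only return the original URL column
        "collapse": "urlkey",    # one row per unique URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows):
    """Drop the header row that output=json puts first and return the URLs."""
    return [row[0] for row in rows[1:]]


def fetch_archived_urls(domain, timeout=30):
    """Query the CDX API over the network (hypothetical helper)."""
    with urlopen(build_cdx_url(domain), timeout=timeout) as response:
        return parse_cdx_rows(json.load(response))
```

Keeping URL construction and row parsing separate from the network call makes both easy to check without contacting the Wayback Machine.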

Use Case: Finding Archived Backups

This tool can be especially useful for finding backups of websites or files that may no longer be available on the live site. If a resource (e.g., a PDF or image) was previously available on the website but has since been removed or is temporarily unavailable, there may still be an archived snapshot of it in the Wayback Machine.

By using this script, you can:

Identify URLs that may have once been accessible but no longer are.
Check for backups on the Wayback Machine that might not be available on the current live site.
Retrieve historical versions of files or content that have been deleted or moved.

The script attempts to retrieve URLs from the Wayback Machine. For each URL found, it checks if an archived snapshot is available. If a snapshot exists, the script provides a link to the backup.
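
The snapshot check described above might look something like the following. It is a sketch under stated assumptions: it uses the Wayback Machine Availability API, which may or may not be the endpoint the script itself calls, and the function names `extract_snapshot_url` and `find_snapshot` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def extract_snapshot_url(payload):
    """Return the closest snapshot link from an availability response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(url, timeout=30):
    """Ask the Availability API whether a snapshot exists (network call)."""
    query = urlencode({"url": url})
    with urlopen(f"{AVAILABILITY_ENDPOINT}?{query}", timeout=timeout) as response:
        return extract_snapshot_url(json.load(response))
```

When no snapshot exists, the API returns an empty `archived_snapshots` object, so `extract_snapshot_url` returns `None` rather than raising.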

Requirements

Python 3.x
The following Python packages:
- requests
- colorama
- termcolor

You can install the required packages using the following command:

pip3 install requests colorama termcolor

How to Use

Clone the repository or download the script.
Ensure you have a file named extensions.txt in the same directory as the script, or specify custom file extensions.
Run the script:

python3 wayBackupFinder.py

When prompted, enter the target domain (e.g., example.com) and specify whether to use custom file extensions or load them from the extensions.txt file.
The script will:
- Fetch URLs from the Wayback Machine.
- Filter the URLs by the provided file extensions.
- Save the filtered URLs to separate files.
- Check if archived snapshots are available for each URL.
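
The filtering step in the list above can be sketched as follows. The function name `filter_by_extension` is hypothetical, and matching against the URL path only (ignoring any query string, case-insensitively) is an assumption; the script may match differently.

```python
from urllib.parse import urlparse


def filter_by_extension(urls, extensions):
    """Group URLs by the first listed extension their path ends with."""
    grouped = {ext: [] for ext in extensions}
    for url in urls:
        # Compare against the path only, so "?x=1" does not hide the extension.
        path = urlparse(url).path.lower()
        for ext in extensions:
            if path.endswith(ext.lower()):
                grouped[ext].append(url)
                break
    return grouped
```

For example, calling it with `[".pdf", ".zip"]` keeps `https://example.com/a.pdf` under `.pdf` and drops `https://example.com/index.html` entirely.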

Example

Input:

Enter the target domain (e.g., example.com): example.com
Would you like to use custom file extensions or load from extensions.txt? (custom/load): load

Output:

The script will print the progress and save the filtered URLs to files such as:

Filtered URLs for .pdf saved to: content/example.com/example.com_pdf_filtered_urls.txt
Found possible backup: https://web.archive.org/web/20200101000000/https://example.com/sample.pdf

File Extensions

You can specify custom file extensions to filter by, separated by commas, for example: .zip,.pdf,.jpg. If you choose to load extensions from extensions.txt, the script will use those.
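
A comma-separated extension list could be normalized along these lines. This is a sketch only: the helper names are hypothetical, and the layout of extensions.txt (one extension per line, or comma-separated) is an assumption, since the file's format is not described here.

```python
from pathlib import Path


def normalize_extension(raw):
    """Trim, lowercase, and ensure a leading dot; return None for empty input."""
    ext = raw.strip().lower()
    if not ext:
        return None
    return ext if ext.startswith(".") else f".{ext}"


def parse_custom_extensions(text):
    """Parse input such as '.zip,.pdf,.jpg' into a de-duplicated list."""
    extensions = []
    for part in text.split(","):
        ext = normalize_extension(part)
        if ext and ext not in extensions:
            extensions.append(ext)
    return extensions


def load_extensions(path="extensions.txt"):
    """Load extensions from a file (assumed: one per line or comma-separated)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return parse_custom_extensions(",".join(lines))
```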

File Structure

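The script creates a folder called content containing a subfolder named after the target domain. Inside that subfolder it stores one text file of filtered URLs per extension, for example content/example.com/example.com_pdf_filtered_urls.txt.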
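
The output layout can be reproduced with a short sketch like the one below. The file naming pattern is inferred from the sample output path shown in the Example section; the function names `output_path` and `save_urls` and the `root` parameter are hypothetical.

```python
from pathlib import Path


def output_path(domain, extension, root="content"):
    """Return the file path for one extension, e.g. content/<domain>/<domain>_pdf_filtered_urls.txt."""
    name = f"{domain}_{extension.lstrip('.')}_filtered_urls.txt"
    return Path(root) / domain / name


def save_urls(domain, extension, urls, root="content"):
    """Write one URL per line, creating the folders if they do not exist."""
    target = output_path(domain, extension, root)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("".join(f"{url}\n" for url in urls), encoding="utf-8")
    return target
```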

Community Resources

License

This project is licensed under the MIT License - see the LICENSE file for details.

anmolksachan/WayBackupFinder

Folders and files

Latest commit

History

Repository files navigation

WayBackup Finder

Features

Use Case: Finding Archived Backups

Requirements

How to Use

Example

Input:

Output:

File Extensions

File Structure

Community Resources

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Contributors 2

Uh oh!

Languages

Packages