Josue87/MetaFinder

MetaFinder

MetaFinder searches for documents in a domain through search engines (Google, Bing and Baidu) and extracts their metadata.


Installation:

> pip3 install metafinder

Upgrades are also available using:

> pip3 install metafinder --upgrade

Usage

MetaFinder can be used in two ways:

CLI

metafinder -d domain.com -l 20 -o folder [-t 10] -go -bi -ba

Parameters:

  • d: Specifies the target domain.
  • l: Specifies the maximum number of results to retrieve from the search engines.
  • o: Specifies the path where the report is saved.
  • t: Optional. Sets the number of threads (4 by default).
  • v: Shows the MetaFinder version.
  • Search Engines to select (Google by default):
    • go: Optional. Search in Google.
    • bi: Optional. Search in Bing.
    • ba: Optional. Search in Baidu. (Experimental)
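
When MetaFinder is driven from a script, it can help to assemble the command line from the parameters above instead of hand-writing the string. The following is a minimal sketch, not part of MetaFinder: `build_metafinder_command` and `ENGINE_FLAGS` are hypothetical names, and only the flags listed above are used.

```python
import shlex

# Hypothetical mapping from engine names to the CLI flags listed above.
ENGINE_FLAGS = {"google": "-go", "bing": "-bi", "baidu": "-ba"}


def build_metafinder_command(domain, limit, output, threads=None, engines=("google",)):
    """Return the metafinder argument list for the given options."""
    cmd = ["metafinder", "-d", domain, "-l", str(limit), "-o", output]
    if threads is not None:
        # -t is optional; when omitted, the tool uses its default (4 threads).
        cmd += ["-t", str(threads)]
    for engine in engines:
        # Raises KeyError for an engine name this sketch does not know.
        cmd.append(ENGINE_FLAGS[engine])
    return cmd


cmd = build_metafinder_command(
    "domain.com", 20, "folder", threads=10, engines=("google", "bing")
)
print(shlex.join(cmd))
# prints: metafinder -d domain.com -l 20 -o folder -t 10 -go -bi
```

With MetaFinder installed, the resulting list could be passed to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting problems in the domain or output path.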

In Code

import metafinder.extractor as metadata_extractor

documents_limit = 5  # maximum number of documents to process
domain = "target_domain"

# Search one engine for documents in the domain and extract their metadata.
result = metadata_extractor.extract_metadata_from_google_search(domain, documents_limit)
# result = metadata_extractor.extract_metadata_from_bing_search(domain, documents_limit)
# result = metadata_extractor.extract_metadata_from_baidu_search(domain, documents_limit)

# Authors and software collected from the documents that were found.
authors = result.get_authors()
software = result.get_software()

# Each entry holds the document URL and its metadata fields.
for name, info in result.get_metadata().items():
    print(f"{name}:")
    print(f"|_ URL: {info['url']}")
    for field, value in info["metadata"].items():
        print(f"|__ {field}: {value}")

# Extract metadata from a single local document.
document_name = "test.pdf"
try:
    metadata_file = metadata_extractor.extract_metadata_from_document(document_name)
    for field, value in metadata_file.items():
        print(f"{field}: {value}")
except FileNotFoundError:
    print("File not found")
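
The dictionary returned by `get_metadata()` in the snippet above can be summarised further, for example to see which values of a field repeat across documents. This is a minimal sketch that assumes the shape used above (each entry has a `url` and a `metadata` dictionary). The helper `count_field`, the sample data and the field names `Author` and `Producer` are hypothetical, and the real field names depend on the documents found.

```python
from collections import Counter


def count_field(metadata_by_document, field):
    """Count how often each value of `field` appears across documents."""
    counts = Counter()
    for info in metadata_by_document.values():
        value = info.get("metadata", {}).get(field)
        if value:  # skip documents where the field is missing or empty
            counts[str(value)] += 1
    return counts


# Hypothetical sample mimicking the structure iterated over above.
sample = {
    "report.pdf": {
        "url": "https://example.com/report.pdf",
        "metadata": {"Author": "alice", "Producer": "LibreOffice"},
    },
    "slides.pptx": {
        "url": "https://example.com/slides.pptx",
        "metadata": {"Author": "alice"},
    },
    "notes.docx": {
        "url": "https://example.com/notes.docx",
        "metadata": {"Author": "bob", "Producer": "LibreOffice"},
    },
}

print(count_field(sample, "Author").most_common())
# prints: [('alice', 2), ('bob', 1)]
```

In real use, `sample` would be replaced by `result.get_metadata()`.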

Example

image

Author

This project has been developed by:

Contributors

Disclaimer!

This software is designed to help ensure that the documents we upload to a domain leave no trace. The author is not responsible for any illegitimate use.
