A compilation of links to datajournalism & OSINT tools, guides and resources I find useful to keep at hand. PRs welcome!
🌐= online tool/service/resource 💻= software 📖= guide/tutorial 📝= list of tools/resources 🐍= Python module 💲= paid or paid-only tool/service
- Breached Data
- Data Analysis & Manipulation
- Lists of tools & resources
- Location, Maps, Satellite Imagery
- Multi-purpose tools
- Phone numbers
- Pictures, Photos, Videos
- Social Media
- Text & Documents
- Whistleblowing software
- Public APIs
📝- A categorized list of APIs.
💻- API development environment offering useful tools for crafting and debugging API requests.
📝- A good API directory.
- Internet Archive Wayback Machine
🌐- Saves pages as screenshots, useful for websites the Wayback Machine can't handle.
🌐 💲- Web capture tool designed for online investigations ($129.99/y).
- How to Archive Open Source Materials
- Firefox Screenshots
💻- Firefox can take a screenshot of a full page (i.e. 'scrolling' screenshot).
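
As a rough illustration of checking whether a page is already archived, the sketch below builds a query for the Wayback Machine's availability endpoint and reads the closest snapshot out of its JSON reply. The helper names (`availability_url`, `closest_snapshot`) are hypothetical, and the sample reply is illustrative data, not a real capture. No network call is made here.

```python
import json
from urllib.parse import urlencode

# Availability endpoint of the Internet Archive Wayback Machine.
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is optional, formatted YYYYMMDD[hhmmss]."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABILITY + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the URL of the closest available snapshot, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative reply only (hypothetical snapshot), shaped like the endpoint's JSON.
sample_reply = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20060101000000/http://example.com/",
            "timestamp": "20060101000000",
            "status": "200",
        }
    }
})

print(availability_url("example.com", "20060101"))
print(closest_snapshot(sample_reply))
```

To query the live service, the URL returned by `availability_url` could be fetched with `urllib.request.urlopen` and the response body passed to `closest_snapshot`.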
- Have I Been Pwned?
🌐- Check if an email appears in a breach, set up alerts.
🌐- Check if an email appears in a breach. Shows the first 3 characters of the password for free.
🌐 💲- Find cleartext & hashed passwords from data breaches (paid, $4/week, $11/mo).
💻- Find passwords through different breach and reconnaissance services. Can also search the BreachedCompilation torrent.
- Breach Data Search Engines Comparison
- OCCRP Data
🌐- Fantastic search tool & resources made available by OCCRP. Public records, leaks, scraped business registers, and more.
🌐- A very comprehensive companies database. Has an API.
- ICIJ's Offshore Leaks Database
🌐- Data on offshore companies, foundations and trusts from the Panama Papers, the Offshore Leaks, the Bahamas Leaks and the Paradise Papers investigations.
- List of company registers
📝(Wikipedia) - A list of company registers, by country.
- CompaniesHouse Short Guide
📖- (Bellingcat) A guide about the UK online company registry.
Data Analysis & Manipulation
💻- Clean & transform messy data.
💻- A suite of command-line tools for converting to and working with CSV files.
🐍- Powerful Python data analysis library. Best used in a Jupyter notebook.
💻- Python command-line tool that queries several search engines for email addresses from a particular domain.
- The most complete guide to finding anyone's email
Lists of tools & resources
📝- Resources, links & OSINT tools, organized by target data.
📖(Bellingcat) - OSINT & Datajournalism how-tos.
- Online Investigation Toolkit
📝- A curated list of open source intelligence tools and resources.
- OSINT framework
📝- Tree list of OSINT tools & resources.
- OSINT Collection
📝- Collection of OSINT related resources.
- I-Intelligence's Open Source Intelligence Tools and Resources Handbook 2018
📝- Very complete list of OSINT tools & resources, organized by category. No descriptions.
📖- A blog about automating OSINT techniques using Python.
📝- Custom search forms and lists of resources by theme.
Location, Maps, Satellite Imagery
Mapping services & software
- Google Earth
- DigitalGlobe Discover
🌐- Search for satellite imagery of a particular location. Ability to download images (low-resolution compared to Google Earth).
- Yandex Maps
🌐- Google Streetview alternative.
🌐- Satellite imagery, historical data from several sources, vegetation infrared & index, image exports & comparison. 2 products:
Tools & techniques
- Geographic Bounding Box Drawing Tool
🌐- Draw a rectangle over a map and get the coordinates of its points & center.
- Shadows and Angles: Measuring Object Heights from Satellite Imagery
🌐- Historical solar data (sun orientation & elevation, shadow length, etc).
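
As a minimal sketch of the idea behind measuring heights from shadows: once the sun's elevation at capture time is known (for example from historical solar data), an object's height follows from its shadow length by basic trigonometry. This assumes flat, level ground and a shadow length already converted to metres; it is not necessarily the exact method of the guide above, and the function name is hypothetical.

```python
import math


def height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Estimate object height from shadow length and sun elevation.

    Assumes level ground: height = shadow length * tan(sun elevation).
    """
    if not 0 < sun_elevation_deg < 90:
        raise ValueError("sun elevation must be between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


# A 10 m shadow with the sun 45 degrees above the horizon: about 10 m tall.
print(round(height_from_shadow(10, 45), 2))
```

Lower sun elevations stretch shadows, so small errors in the measured elevation or shadow length weigh more heavily early and late in the day.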
User generated content
- Social media (see category)
- Tourism & review websites: Foursquare, TripAdvisor, Yelp...
🌐- User-generated locations & descriptions. Has an API. Also lets you switch between satellite imagery from Google, Bing, OSM.
🌐- User-generated locations & maps. Use taginfo and/or overpass-turbo.eu to search a location by key/value tags (see OSM's Wiki).
near:<coordinates> in a search.
🌐 💲- Search and analyze social media data based on location. ($499/mo)
💻- Geolocation information gathering through social networking platforms (discontinued).
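
To illustrate the key/value tag search mentioned for OpenStreetMap, here is a small sketch that assembles an Overpass QL query for nodes carrying a given tag inside a bounding box. The function name and the example tag and coordinates are hypothetical; the sketch assumes the key and value contain no double quotes, and it only builds the query text (for instance to paste into overpass-turbo.eu).

```python
def overpass_query(key, value, south, west, north, east, timeout=25):
    """Build an Overpass QL query for nodes tagged key=value in a bounding box.

    Overpass expects the box as (south, west, north, east) in decimal degrees.
    Assumes key and value contain no double quotes.
    """
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:json][timeout:{timeout}];"
        f'node["{key}"="{value}"]({bbox});'
        "out;"
    )


# Hypothetical example: cafes within a small box.
print(overpass_query("amenity", "cafe", 48.85, 2.33, 48.86, 2.35))
```

The coordinates of the box can come from a bounding box drawing tool such as the one listed under Tools & techniques.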
- Identify Burnt Villages on Satellite Imagery
- Using Time Lapse Satellite Imagery to Detect Infrastructure Changes
- Photo Interpretation Student Handbook
📖(US Defense Mapping Agency, 1996) - Old unclassified handbook on analyzing aerial & satellite imagery. General principles & specifics for buildings, industries, transportation & communication facilities.
💻- Command-line OSINT tool with whois, subdomain enumeration, mail harvesting, and more.
- Maltego CE
💻- Interactive data mining & mapping tool.
💻- A collection of python scripts which automate open source intelligence searches about domain names, email addresses, IP addresses and usernames.
💻- A very handy VM with plenty of pre-installed & pre-configured OSINT tools.
💻- Open source intelligence automation tool. Gathers intelligence about a given target, which may be an IP address, domain name, hostname, network subnet, ASN, e-mail address or person's name.
🌐- International directory of white pages and yellow pages phone books.
💻- Information gathering & OSINT reconnaissance tool for phone numbers.
Pictures, Photos, Videos
🌐- Face-recognition matching search engine
🌐- Social Media Mapping Tool that correlates profiles via facial recognition. Supports LinkedIn, Facebook, Twitter, Instagram, VKontakte, Weibo, Douban.
- How to Conduct Comprehensive Video Collection (Bellingcat)
- YouTube Geo Search Tool
🌐- Search YT videos by location & time frame.
🌐- Various video search & reverse search tools and lists of resources.
- Exif Viewer (Firefox/Chrome)
- Jeffrey's Image Metadata Viewer
💻- Read and edit metadata. Linode Tutorial
🌐- Search the web for pictures taken with a specific camera serial number
💻- Automated image forensics tool
🌐- Various image search & reverse search tools and lists of resources.
- Google Images
- Bing Images
🌐- Can search part of the image
🌐- Google Images reverse search on YouTube thumbnails.
Verification & Analysis
- InVID Verification Plugin
💻- Verification “Swiss army knife” Firefox extension.
- Photo Verification Cheatsheet & Video Verification Cheatsheet
- How to verify photos and videos on social media networks
- Advanced Guide on Verifying Video Content
- Verification Handbook
📖- Handbook by the European Journalism Centre about verifying digital content in emergency coverage.
- Verification 101
📖- Storyful’s advice for checking out material from social media, and putting it into practice.
- How to Digitally Verify Combatant Affiliation in Middle East Conflicts
🌐- Camouflage encyclopedia. Search & compare camouflage patterns.
- ICUS Camouflage Index
- International Encyclopedia of Uniform Insignia
- List of Comparative Military Ranks
- Small Arms Survey’s Weapon ID database
🌐- Search for small arms by caliber, type, location, etc.
🌐 💲- Picture auto-tagging tool (demos + free API plan for 2,000 images/mo, or $14/mo).
🌐- A blog about weapons & their uses in Middle East conflicts.
💻- Download all pictures & videos from an Instagram profile. No API key needed.
💻- LinkedIn information gathering tool. Extracts employee data for a given company.
- Reddit Investigator
🌐- Collect info on a Reddit profile.
- Reddit Insight
🌐- Collect info on a Reddit profile, list all posts & comments.
- Tweetdeck Location Search Tutorial
- Tweets Analyzer
💻- Twitter profile analyzer: tweet activity, locations, most used hashtags, etc. Can save tweets to JSON. Requires a Twitter API key.
- TWINT (Twitter Intelligence Tool)
💻- Advanced Twitter scraping tool, no API key needed. Can export to text, CSV, JSON, SQLite, Elasticsearch. Can detect emails, phone numbers, profiles.
💻 🐍- A command line tool and Python library for archiving Twitter in JSON format.
💻- Very complete open-source tool for Twitter intelligence analysis. Needs API credentials.
Text & Documents
Indexing & searching
💻- A toolkit for data search, management and analysis in investigative reporting.
💻- Open source Solr user interface discovery platform.
- ICIJ Extract
💻- A command line tool for parallelized, distributed content-extraction.
💻- A simple out-of-the-box web interface to search through thousands of unstructured documents using Solr.
🌐- Recognizes several languages, can resize images, shortcuts to Google & Bing Translate.
💻- Open-source OCR engine.
Natural Language Processing
🐍- Python module to determine important terms within a given piece of content.
- PDF Text Extraction with PyPDF2, Tika & PDF Miner.
- Google Sheets, Google MyMaps, Google Fusion Tables, Google Earth...
- KML Interactive Sampler
🌐- Lots of KML templates
- Excel Powermap Plugin Tutorial
💻 💲- Mapping & analysis software (proprietary, paid, 21-day trial).
💻- Free & open-source alternative to ArcGIS.
Mindmaps & Network graphs
- TikTok
- Wunderground History
🌐- Weather history
- Wolfram Alpha
🌐- Weather history. ("What was the weather in New York on January 1st 2017?")
Searches, info, related entities
- Advanced Google searches
🌐/ 💻- Get registrar, owner info.
🌐- Search by URL, IP address, analytics codes. API with free plan. See this Bellingcat how-to for automation.
- NerdyData Search
🌐- Search the source code of pages.
🌐- Search the source code of pages.
- Unveiling hidden site connections with Google Analytics IDs
💻- Subdomain enumeration tool.
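
As a sketch of the Google Analytics ID technique referenced above: sites run by the same party often share an analytics account, so extracting tracking IDs from page source and grouping sites by account number can hint at a connection. The pattern below only covers the legacy `UA-<account>-<property>` format, the function names are hypothetical, and the page sources are made-up examples. A shared account is a lead to verify, not proof of common ownership.

```python
import re
from collections import defaultdict

# Legacy Universal Analytics IDs look like UA-<account>-<property>.
UA_PATTERN = re.compile(r"\bUA-(\d{4,10})-\d{1,4}\b")


def analytics_accounts(html):
    """Return the set of analytics account numbers found in a page's source."""
    return {match.group(1) for match in UA_PATTERN.finditer(html)}


def shared_accounts(pages):
    """Map each account number to the sites using it, keeping only shared ones.

    `pages` is a dict of site name -> page source (already downloaded).
    """
    sites_by_account = defaultdict(set)
    for site, html in pages.items():
        for account in analytics_accounts(html):
            sites_by_account[account].add(site)
    return {acc: sites for acc, sites in sites_by_account.items() if len(sites) > 1}


# Made-up page sources for illustration.
pages = {
    "a.example": "ga('create', 'UA-1234567-1', 'auto');",
    "b.example": "_gaq.push(['_setAccount', 'UA-1234567-2']);",
    "c.example": "ga('create', 'UA-7654321-1', 'auto');",
}
print(shared_accounts(pages))
```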
- BeautifulSoup, Selenium, Scrapy
🐍- Python scraping libraries.
💻- Crawl a website (or its archive from the Wayback Machine) and extract URLs, emails, social media accounts, files, keys, subdomains, etc.
- Scrape Interactive Geospatial Data
Dark Web & Onion services
🌐- Internet of Things search engine
🌐- Search open Amazon S3 buckets content.
📝- A list of Free Software network services and web applications which can be hosted locally
This list is licensed under the Creative Commons Attribution-NonCommercial 4.0 International Public License.