A simple AI-assisted desktop tool that scans websites for news articles and compiles them into a CSV file. Built for policy teams that need to monitor large numbers of stakeholder sites without checking each one manually.
Enter a list of URLs and the tool will scan each one for news articles, collecting the title, link, date published, date retrieved, and source URL into a single CSV file.
To use this tool you will need:
- Python 3.10 or newer
- A Google API key (see Getting an API Key below)
- A PEM certificate file (for secure connections to your organization's network; if you work for NRCan, see Contact for support)
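The requirements above can be checked before launching anything. The following is a minimal sketch, not part of the tool itself: the helper names `python_is_supported` and `looks_like_pem` are hypothetical, and the PEM check assumes the file holds at least one certificate block with the standard `-----BEGIN CERTIFICATE-----` marker.

```python
import sys
from pathlib import Path

# Minimum version taken from the requirements list above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def looks_like_pem(path):
    """Return True if the file exists and contains a PEM certificate marker.

    This is only a sanity check (hypothetical helper); it does not
    validate the certificate itself.
    """
    pem_file = Path(path)
    if not pem_file.is_file():
        return False
    text = pem_file.read_text(encoding="ascii", errors="ignore")
    return "-----BEGIN CERTIFICATE-----" in text
```

Calling `python_is_supported()` with no arguments checks the interpreter that is currently running; `looks_like_pem("path/to/cert.pem")` only confirms that the file is plausibly a certificate, not that it is the right one for your network.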
- Download the latest `EvergreenUpdates.exe` from the Releases page
- Double-click to open
- Enter your Google API key, the path to your `.pem` certificate, and your list of URLs (one per line)
- Click Run. Your CSV will be saved to the same folder as the executable
- Clone this repository

- Create and activate a virtual environment:

  Windows

      python -m venv .venv
      .venv\Scripts\Activate.ps1

  macOS / Linux

      python -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install -r requirements.txt

- Run the script:

      python src/scraper.py

This tool uses the Google API for article discovery. To get a free API key:
- Go to https://aistudio.google.com/api-keys
- Sign in with your Google account
- Click Create API key
- Copy the key and paste it into the app when prompted
The tool produces a CSV file named `news_results.csv` by default; you can change the name in the UI before running. It contains the following columns:
| Column | Description |
|---|---|
| title | Article headline |
| url | URL to the article |
| published_date | Date the article was published |
| retrieved_on | Date the tool found the article |
| source_url | The website URL you provided |
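For downstream scripts, the column layout above can be written and read with Python's standard `csv` module. This is a minimal sketch, not the tool's own code: the function names `write_results` and `read_results` are hypothetical, and the column order is assumed to match the table.

```python
import csv

# Column names from the table above; the order is an assumption.
COLUMNS = ["title", "url", "published_date", "retrieved_on", "source_url"]


def write_results(rows, path="news_results.csv"):
    """Write a list of dicts (one per article) to a CSV file with a header row."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


def read_results(path="news_results.csv"):
    """Read the CSV back into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

With these helpers, `read_results()` returns one dictionary per article, so a field such as the headline is available as `row["title"]`.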
As with all AI-generated content, please be aware that the results may be inaccurate.
- URLs should be entered one per line in the text box
- The tool works best with news and press release pages rather than homepages
- Your API key and certificate path are never saved or transmitted
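The one-URL-per-line input format can be illustrated with a short parsing sketch. It is an assumption about how such input might be cleaned, not a description of what the tool does internally: the function name `parse_url_list` is hypothetical, and skipping blank lines, non-HTTP entries, and duplicates is a design choice made here for illustration.

```python
from urllib.parse import urlparse


def parse_url_list(text):
    """Split text-box input into a list of unique http(s) URLs, in order."""
    urls = []
    seen = set()
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # ignore blank lines
        parsed = urlparse(candidate)
        # Keep only entries that have an http(s) scheme and a host name.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls
```

Pasting a block of text into `parse_url_list` returns only the lines that look like web addresses, which makes it easy to spot entries that were mistyped before a run.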
Valerie Gies @ valerie.gies@nrcan-rncan.gc.ca