Python script to scrape Pastebin with regex. This is by no means a finished project, and I plan to improve it over time. My goal is to make PastaBean as flexible as I can and simple to run, with minimal requirements for capturing data.
I created the script to learn Python and to capture data from the popular site https://Pastebin.com.
- Scrape Pastebin at up to 100 queries per 60 seconds.
- Write matches to a text file in the same directory.
- Temporarily removed: e-mail alerts. Sender and receiver credentials have to be added to the script manually (Gmail only).
- Logging - pasta.log
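The matching, output, and logging features above can be sketched roughly as follows. This is a minimal illustration, not the code in PastaBean.py: the two patterns, the `matches.txt` file name, and the `scan_paste` helper are hypothetical, and only `pasta.log` is taken from the feature list. The output path is relative to the working directory, which is the script's directory when the script is started from there.

```python
import logging
import re
from pathlib import Path

# Hypothetical example patterns; the script's real regexes may differ.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

# Log to pasta.log in the working directory.
logging.basicConfig(
    filename="pasta.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)


def scan_paste(paste_key, text, out_path=Path("matches.txt")):
    """Append every regex hit in `text` to `out_path` and return the hits."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    if hits:
        # One tab-separated line per hit: paste key, pattern name, matched text.
        with open(out_path, "a", encoding="utf-8") as fh:
            for name, match in hits:
                fh.write(f"{paste_key}\t{name}\t{match}\n")
        logging.info("paste %s: %d match(es)", paste_key, len(hits))
    return hits
```

Calling `scan_paste("abc123", paste_text)` appends any hits to `matches.txt` and records a summary line in `pasta.log`. Pastes without matches leave both files untouched.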
- A Pastebin PRO account is required to use the scraping API, and your public IP address must be whitelisted (https://pastebin.com/doc_scraping_api).
sudo apt-get install python python-pip
pip install requests
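With `requests` installed and your IP whitelisted, a polling step might look like the sketch below. The endpoint URL and the `scrape_url` and `key` field names reflect my reading of the scraping API documentation linked above, so treat them as assumptions and check them against that page. The `fetch_recent` and `fetch_bodies` helpers and the `get` and `sleep` parameters are hypothetical and exist only so the sketch can be tried without network access.

```python
import time

# Assumed endpoint; verify against https://pastebin.com/doc_scraping_api.
SCRAPE_LIST_URL = "https://scrape.pastebin.com/api_scraping.php"

# 100 queries per 60 seconds works out to 0.6 s between queries.
MIN_INTERVAL = 60 / 100


def fetch_recent(limit=100, get=None):
    """Return metadata for the most recent public pastes as a list of dicts."""
    if get is None:
        import requests  # imported lazily so a stub can be passed instead
        get = requests.get
    response = get(SCRAPE_LIST_URL, params={"limit": limit}, timeout=10)
    response.raise_for_status()
    return response.json()


def fetch_bodies(pastes, get=None, sleep=time.sleep):
    """Yield (key, text) for each paste, pausing after every query."""
    if get is None:
        import requests
        get = requests.get
    for paste in pastes:
        # "scrape_url" and "key" are assumed field names in the API response.
        response = get(paste["scrape_url"], timeout=10)
        response.raise_for_status()
        yield paste["key"], response.text
        sleep(MIN_INTERVAL)
```

A caller would loop over `fetch_bodies(fetch_recent())` and hand each paste body to the regex matching step.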
- Run on VPS
- Run script as background process:
python PastaBean.py &
- Improve the current regexes
- Add more regex matches
- Allow the script to write to a custom file path
- Reduce duplication in e-mail alerts.
Decreased status output to one line. Generate log file for each alarm to replace e-mail alerts
- Expand to other sites similar to Pastebin.
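For the custom file path item in the list above, one possible approach is a command-line option parsed with `argparse`. This is only a sketch of how it could work: the `-o`/`--output` flag, the `parse_args` helper, and the `matches.txt` default are hypothetical and not part of the current script.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse command-line options; `argv=None` means use sys.argv."""
    parser = argparse.ArgumentParser(description="Scrape Pastebin with regex")
    parser.add_argument(
        "-o", "--output",
        type=Path,
        default=Path("matches.txt"),  # hypothetical default file name
        help="file to append matches to",
    )
    return parser.parse_args(argv)
```

The script could then be started with `python PastaBean.py --output /some/dir/hits.txt &`, falling back to the default file when the option is omitted.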
Feel free to contact me for any advice, ideas or queries.
- Twitter @Tu5k4rr
- E-mail: Tu5k4rr@protonmail.com