A Python script that launches multiple browsers (number of keywords × number of days), with auto-login and auto-scroll-down features, to crawl historical tweets by keyword.
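The auto-scroll-down idea can be sketched roughly as below. This is only an illustration, not the code in tweetsearch.py: the function name `scroll_to_bottom` and the parameters `pause` and `max_rounds` are made up for this example, and `driver` is assumed to be a Selenium WebDriver (only its `execute_script` method is used).

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=50):
    """Scroll down until the page height stops growing or max_rounds is hit.

    `driver` is assumed to be a Selenium WebDriver instance.
    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the page so more results can load.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the page time to load more content
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, assume the end was reached
        last_height = new_height
    return rounds
```

A fixed pause is the simplest choice; a slow connection may need a longer `pause`, otherwise the loop can stop before new content has loaded.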
- Open tweetsearch.py and read the comments to understand how it works
- Create username.txt and password.txt to store your credentials
- Set the keywords in the code as you like; the current keywords are 'Android' and 'iOS'
- Set the start date and end date in the code
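To see why the number of browsers is keywords times days, here is a minimal sketch of how credentials could be read and how keyword/day combinations could be enumerated. The helper names (`read_credential`, `daily_windows`, `plan_jobs`) are hypothetical and not taken from tweetsearch.py, and the sketch assumes the end date is exclusive, which may differ from the script.

```python
from datetime import date, timedelta
from itertools import product
from pathlib import Path


def read_credential(path):
    """Return the contents of a credentials file without surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()


def daily_windows(start, end):
    """Yield (day, next_day) pairs from start up to, but not including, end."""
    day = start
    while day < end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)


def plan_jobs(keywords, start, end):
    """Pair every keyword with every one-day window (one browser per pair)."""
    return [
        (keyword, since, until)
        for keyword, (since, until) in product(keywords, daily_windows(start, end))
    ]


if __name__ == "__main__":
    # Example values only; the dates here are not the ones used in the script.
    jobs = plan_jobs(["Android", "iOS"], date(2017, 1, 1), date(2017, 1, 4))
    print(len(jobs), "browser sessions would be needed")
    for keyword, since, until in jobs:
        print(keyword, since.isoformat(), until.isoformat())
```

With two keywords and three days this gives six jobs, so a long date range opens many browsers at once; keeping the range short is the easy way to limit that.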
- Python3
- pip (Optional)
- Firefox
- Mozilla geckodriver
- Selenium
- Python3 Link - https://www.python.org/downloads/
- pip Link - https://pip.pypa.io/en/stable/installing/
- Firefox Link - https://www.mozilla.org/en-US/firefox/new/?f=116
- Mozilla geckodriver Link - https://github.com/mozilla/geckodriver/releases
- Selenium Link - https://pypi.python.org/pypi/selenium
Some more guidance:
- For Selenium installation, follow the Selenium Link above, or install pip and run one of the commands below
  - For Windows users: pip install -U selenium
  - For OS X users: $ pip3 install -U selenium (or use brew)
- For geckodriver, put the executable in a directory on the system PATH (e.g. under the Python3 folder)
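A quick way to check whether geckodriver is reachable from the system PATH is sketched below. It only uses the standard library; the function name `find_executable` is made up for this example and is not part of the project.

```python
import shutil


def find_executable(name="geckodriver"):
    """Return the full path of `name` if it is on the system PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("geckodriver was not found on the PATH")
    else:
        print("geckodriver found at", path)
```

If the check prints that nothing was found, moving the executable into a folder that is already on the PATH (or adding its folder to the PATH) and opening a new terminal is usually enough.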
- Command Line
  - python tweetsearch.py (Windows)
  - $ python3 tweetsearch.py (Linux or OS X)
- Sublime Text3 + Anaconda (My favorite for now)
- Go Code - Initial work - Co Code
This project is licensed under the MIT License - see the LICENSE.md file for details
Marco Randy Daw-Ran Liou Craig Trim
Please feel free to open a pull request with improvements or questions