Scripts for extracting comments and articles from the website of the newspaper Il Giornale.
- Extract_article_ilgiornale.py: script for extracting the text of an article given its URL.
- Extract_comments_ilgiornale.py: script for extracting the comments of an article given its URL.
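As a rough illustration of what extracting article text with BeautifulSoup can look like, here is a minimal sketch. It is not the code in Extract_article_ilgiornale.py: the function name `extract_article_text`, the sample markup, and the assumption that the body sits in `<p>` tags inside an `<article>` element are all hypothetical, and the real Il Giornale pages may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real Il Giornale page structure may differ.
SAMPLE_HTML = """
<html><body>
  <article>
    <h1>Sample headline</h1>
    <p>First paragraph of the article.</p>
    <p>Second paragraph of the article.</p>
  </article>
  <footer><p>Unrelated footer text.</p></footer>
</body></html>
"""


def extract_article_text(html):
    """Return the paragraph text found inside the first <article> element."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    if article is None:
        # No <article> element: return an empty string rather than guessing.
        return ""
    paragraphs = [p.get_text(strip=True) for p in article.find_all("p")]
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(extract_article_text(SAMPLE_HTML))
```

In practice the HTML would come from the page loaded at the article URL rather than from an inline string, and the selector would need to match the site's actual markup.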
A copy of all the files can be downloaded by cloning the git repository:
git clone https://github.com/ffedox/ilgiornale_scraping
- Install BeautifulSoup
pip install beautifulsoup4
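- Install Tkinter (bundled with the official python.org installers for Windows and macOS; on Debian or Ubuntu it comes from the python3-tk system package rather than from pip)
sudo apt-get install python3-tk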
- Install Selenium
pip install selenium
- Download ChromeDriver manually, or install chromedriver-autoinstaller to fetch a version matching the installed Chrome
pip install chromedriver-autoinstaller
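- Add ChromeDriver to the system PATH, or pass its path when instantiating webdriver.Chrome (recent Selenium 4 releases take the path through a Service object instead of the executable_path argument)
from selenium.webdriver.chrome.service import Service
driver = webdriver.Chrome(service=Service('C:/path/to/chromedriver.exe'))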
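
Once the steps above are done, a quick check that the packages can be found may save some debugging. The sketch below uses only the standard library; the helper name `missing_modules` and the `REQUIRED_MODULES` list are illustrative and not part of the repository. It only checks that each module can be located, not that it works (for example, it does not verify that ChromeDriver itself is installed or that Tkinter can open a window).

```python
import importlib.util

# Import names for the packages listed above. Note that the import name can
# differ from the pip name (beautifulsoup4 is imported as bs4).
# chromedriver_autoinstaller is only needed if ChromeDriver was not
# downloaded manually.
REQUIRED_MODULES = ["bs4", "tkinter", "selenium", "chromedriver_autoinstaller"]


def missing_modules(names):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```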