moose25/pythonScraper

BBC Headline Scraper and Analysis

Project Overview

This project is a Python application that scrapes news headlines from the BBC website, performs keyword and n-gram analysis, and visualizes the most frequent topics. It showcases skills in web scraping, data processing, natural language processing, and data visualization.

Features

  • Web Scraping: Extracts headlines from the BBC News website.
  • Keyword Analysis: Identifies the most common words in the headlines.
  • N-gram Analysis: Extracts common phrases (ranging from 3 to 5 words) for more context.
  • Data Visualization: Displays the results through bar charts for easy comprehension.
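The first three features can be sketched with the standard library alone. This is a minimal illustration and not the project's actual code: the project lists BeautifulSoup and NLTK, whereas this sketch swaps in `html.parser` and a tiny hand-written stop-word list so that it runs without extra packages. The `<h2>` headline tag, the sample HTML, and all function names are assumptions made for the example.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stop-word list (assumption); NLTK ships a much fuller one.
STOP_WORDS = {"a", "an", "and", "as", "at", "for", "in", "of", "on", "the", "to", "with"}


class HeadlineParser(HTMLParser):
    """Collects the text of every <h2> element (an assumed headline tag)."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_h2 = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_h2:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_h2:
            text = "".join(self._buffer).strip()
            if text:
                self.headlines.append(text)
            self._in_h2 = False


def extract_headlines(html):
    """Return the headline strings found in an HTML document."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def tokenize(headline):
    """Lowercase a headline and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", headline.lower())


def keyword_counts(headlines):
    """Count words across all headlines, skipping stop words."""
    counts = Counter()
    for headline in headlines:
        counts.update(w for w in tokenize(headline) if w not in STOP_WORDS)
    return counts


def ngram_counts(headlines, min_n=3, max_n=5):
    """Count every phrase of min_n to max_n consecutive words (stop words kept)."""
    counts = Counter()
    for headline in headlines:
        words = tokenize(headline)
        for n in range(min_n, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts


# Made-up sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<h2>Storm warning issued for coast</h2>
<h2>Storm warning issued for islands</h2>
<p>Not a headline</p>
"""

headlines = extract_headlines(SAMPLE_HTML)
print(keyword_counts(headlines).most_common(3))
print(ngram_counts(headlines).most_common(3))
```

N-grams are built within a single headline, so a phrase never spans two unrelated headlines. Stop words are filtered only for keywords, since dropping them from phrases would make the phrases harder to read.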

Technologies Used

  • Python
  • BeautifulSoup
  • NLTK
  • Matplotlib
  • Seaborn

Setup and Installation

Ensure you have Python installed on your system. Follow these steps to set up the project:

  1. Clone the Repository
    git clone [Your Repository URL]
    cd [Your Repository Name]
  2. Install Required Packages
    pip install -r requirements.txt

How to Run the Application

To run the application, execute the following command in your terminal:

```bash
python main.py
```

Output

The application will output:

  • Top 10 keywords in BBC headlines.
  • Top 10 n-grams in BBC headlines.
  • Bar charts visualizing the frequency of these keywords and n-grams.
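As a rough sketch of how a top-10 list might become a bar chart, the helper below selects the most frequent entries from a `Counter` and a second function draws them with Matplotlib. The function names, the output path, and the demo counts are hypothetical and not taken from the project; Seaborn is left out to keep the example short.

```python
from collections import Counter


def top_items(counts, n=10):
    """Return (labels, values) for the n most frequent entries, most frequent first."""
    pairs = counts.most_common(n)
    labels = [label for label, _ in pairs]
    values = [value for _, value in pairs]
    return labels, values


def plot_top(counts, title, path, n=10):
    """Save a horizontal bar chart of the top n entries to the given file path."""
    # Imported here so top_items() stays usable without Matplotlib installed.
    import matplotlib.pyplot as plt

    labels, values = top_items(counts, n)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.barh(labels, values)
    ax.invert_yaxis()  # put the most frequent entry at the top
    ax.set_xlabel("Frequency")
    ax.set_title(title)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# Made-up counts, for illustration only.
demo_counts = Counter({"election": 7, "storm": 4, "budget": 2})
print(top_items(demo_counts, n=2))
```

With Matplotlib installed, calling `plot_top(demo_counts, "Top keywords (demo data)", "keywords.png")` would write the chart to a file. A horizontal layout is used because multi-word n-gram labels are easier to read along the vertical axis.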

Author

Chris Williams
