This web scraper app analyzes and compares top NBA players, extracting data from Wikipedia to identify the top three players based on points, assists, and rebounds per team. Compare performance within teams and across the league through user-friendly plots.

ahmetugsuz/WEB-Scraper

Web-Scraper for Multi-Sport Analytics

The HTML Reader for Multi-Sport Analytics is a comprehensive data analysis program designed specifically for analyzing and comparing the performance of top NBA players. It offers a robust platform for extracting player data from Wikipedia and identifying the top three players based on crucial performance metrics such as points, assists, and rebounds during the regular NBA season. This functionality enables users to conduct in-depth comparisons of player performance both within individual teams and across the league.

Features:

  • Data extraction from Wikipedia for NBA player statistics.
  • Identification of top players based on points, assists, and rebounds.
  • User-friendly visualization of player performance through plots.
  • Integration with external sports data sources for comprehensive analysis.

Requirements

Make sure you have Python 3 installed on your machine before moving on.

Dependencies

Installation

Cloning

To install the HTML Reader for Multi-Sport Analytics, simply clone the Git repository containing the source code:

git clone https://github.com/ahmetugsuz/HTML-Reader   

Once cloned, navigate to the root directory:

cd HTML-Reader   

Installing Requirements

To run this project, you'll need to install the required Python packages listed in the requirements.txt file.

Locally

If you're not using a virtual environment, you can install the project dependencies directly using pip:

pip install -r requirements.txt

This command will install all the necessary packages globally on your system.

Using a Virtual Environment

It's recommended to use a virtual environment to manage project dependencies and isolate them from other projects. If you haven't already, you can create a virtual environment using venv:

python3 -m venv myenv   

Then activate it:

  • On macOS/Linux:
source myenv/bin/activate
  • On Windows:
myenv\Scripts\activate

Once the virtual environment is activated, you can install the project dependencies using pip:

pip install -r requirements.txt

This will install the required packages only within the virtual environment, keeping your system's Python installation clean.

Running

Statistics of the NBA players

Execute the following command to run the NBA_player_statistics program:

python3 fetch_player_statistics.py  

Images of the plots are already available in the NBA_player_statistics directory, where you can check out the

  • points
  • assists
  • rebounds

of the players from the last season. You can also rerun

python3 fetch_player_statistics.py

to fetch the latest regular-season data and find the top 3 players for each team. The top assists for the season, for example, are shown in the image below.

alt text
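The top-three selection described above can be sketched as follows. This is a minimal illustration, not the code in fetch_player_statistics.py: the function name `top_three_per_team`, the dictionary keys, and the sample players are all hypothetical.

```python
from collections import defaultdict
from heapq import nlargest


def top_three_per_team(players, metric):
    """Group players by team and keep the three highest values of `metric`."""
    by_team = defaultdict(list)
    for player in players:
        by_team[player["team"]].append(player)
    # nlargest returns the entries in descending order of the chosen metric.
    return {
        team: nlargest(3, roster, key=lambda p: p[metric])
        for team, roster in by_team.items()
    }


# Made-up sample data, used only to illustrate the expected input shape.
sample = [
    {"name": "Player A", "team": "X", "points": 25.1, "assists": 4.0, "rebounds": 6.2},
    {"name": "Player B", "team": "X", "points": 18.4, "assists": 7.5, "rebounds": 3.1},
    {"name": "Player C", "team": "X", "points": 12.0, "assists": 2.2, "rebounds": 9.8},
    {"name": "Player D", "team": "X", "points": 30.2, "assists": 5.1, "rebounds": 7.0},
    {"name": "Player E", "team": "Y", "points": 9.5, "assists": 1.0, "rebounds": 4.4},
    {"name": "Player F", "team": "Y", "points": 21.7, "assists": 6.3, "rebounds": 5.5},
]

top_points = top_three_per_team(sample, "points")
```

The same function can be called with "assists" or "rebounds" to rank by the other two metrics. A team with fewer than three players simply returns all of them.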

Ski Sports (Alpine ski World cup)

To run this program (separate from NBA_player_statistics), run the following from the root directory:

python3 time_planner.py   

The program extracts the following information:

  • Date
  • Venue
  • Type

from the calendar on Wikipedia, using the URL https://en.wikipedia.org/wiki/20{year}–{year+1}_FIS_Alpine_Ski_World_Cup.
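As a rough sketch of how the URL pattern above can be filled in, assuming `year` is the two-digit start year of the season (the helper name `season_url` is hypothetical and may not match what time_planner.py does):

```python
EN_DASH = "\u2013"  # the URL pattern uses an en dash, not a plain hyphen


def season_url(year: int) -> str:
    """Build the season URL from a two-digit start year, e.g. 22 for 2022-23."""
    # Assumption: both year parts are zero-padded to two digits.
    return (
        "https://en.wikipedia.org/wiki/"
        f"20{year:02d}{EN_DASH}{year + 1:02d}_FIS_Alpine_Ski_World_Cup"
    )


url = season_url(22)
```

Writing the en dash as an escape makes it harder to replace it with a hyphen by accident, which would point to a different page title.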

This is a straightforward example of web scraping: we extract data from a website such as Wikipedia and turn it into a table. The data can then be used to build a more refined or larger application.
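To make the idea concrete, here is a minimal sketch of pulling the Date, Venue, and Type columns out of an HTML table using only the standard library. It is not the project's implementation: the class `TableRows`, the function `extract_events`, and the sample HTML are hypothetical, and real Wikipedia tables (with merged cells and footnotes) need more careful handling.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_events(html, wanted=("Date", "Venue", "Type")):
    """Return one dict per table row, keeping only the wanted columns."""
    parser = TableRows()
    parser.feed(html)
    header, *body = parser.rows
    indexes = [header.index(name) for name in wanted]
    return [dict(zip(wanted, (row[i] for i in indexes))) for row in body]


# Made-up table, standing in for a downloaded calendar page.
SAMPLE = """
<table>
<tr><th>#</th><th>Date</th><th>Venue</th><th>Type</th></tr>
<tr><td>1</td><td>1 January</td><td>Example Resort</td><td>SL</td></tr>
<tr><td>2</td><td>2 January</td><td>Sample Valley</td><td>GS</td></tr>
</table>
"""

events = extract_events(SAMPLE)
```

Looking columns up by header name rather than by position keeps the sketch working if the source table gains or reorders columns.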

Example output in the terminal is shown below:

alpine_image

Wiki Race

Run:

python3 wiki_race_challenge.py   

to search from the given start link, "https://en.wikipedia.org/wiki/Python_(programming_language)", to the finish link, "https://en.wikipedia.org/wiki/Peace", finding the shortest path with BFS.

Please note that this process may take some time; you may want to change the URLs to suit your needs.
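The BFS search can be illustrated on a small in-memory link graph. This is a sketch under stated assumptions, not the code in wiki_race_challenge.py: `shortest_path`, the `get_links` callback, and the toy graph are hypothetical, and a real run would replace `get_links` with a function that downloads a page and returns its outgoing article links.

```python
from collections import deque


def shortest_path(start, goal, get_links):
    """Breadth-first search; returns the list of pages from start to goal, or None."""
    if start == goal:
        return [start]
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            if link in parent:
                continue
            parent[link] = page
            if link == goal:
                # Walk the parent pointers back to the start page.
                path = [link]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(link)
    return None


# Toy link graph standing in for Wikipedia pages.
GRAPH = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

path = shortest_path("A", "F", lambda page: GRAPH.get(page, []))
```

Because BFS explores pages in order of distance from the start, the first time the goal is reached is guaranteed to be along a path with the fewest links. On real pages the frontier grows quickly, which is why the search can take a while.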

Running Tests

Unit Tests

To run unit tests, navigate to the root directory and execute the following command:

python3 -m pytest    

Note: Player data may change during the season, so some tests may fail as a result.

Contributing:

Contributions to the project are welcome! Please contact me on my website: ahmettu.com

License:

This project is licensed under the MIT License. See the LICENSE file for more details.
