🏀 Finding the interconnections of NBA players from the past 40 seasons within 6 degrees. Developed with React frontend and flask backend with neo4j database.

eevanwong/6-Degrees-of-NBA

Six Degrees of NBA

Six Degrees of NBA shows that NBA players are interconnected regardless of the decade or team they played in, from Michael Jordan to Kyle Lowry to LeBron James. This project was inspired by Emily Louie's Six Degrees of Spotify and Fanatics.com's Social Network of the NBA.

All data for this site (over 16,000 relationships in the database) was scraped from basketball-reference.com with Puppeteer and is current as of March 2021. The site was created with:

  • React.js
  • Node.js
  • Express.js (the backend was later rewritten in Flask)
  • Neo4j
  • Axios

I am currently looking for a reasonable hosting platform for the Neo4j database; however, many options are either very complex or too expensive for a database this small (< 1 GB). You can check out the frontend, which is connected to the server, but for now the database is confined to my local machine.

The frontend is hosted on Netlify, but without a hosted database it cannot perform its intended functionality.

Quickstart

  • Clone repo
  • Run App
    • Run npm run start
  • Run seeding app
    • Run npm run seed (this will take a bit of time, ~2 minutes)
  • Once you're done, stop the Docker containers by pressing Ctrl+C in the terminal, then remove them with docker compose down.

Inspiration

The idea of Six Degrees of Separation was super interesting to me. How could I apply it to basketball? Basketball was a growing interest of mine, and I wondered how I could show the connections between different NBA players, even the most unlikely pairs.

While researching this topic, I came across fanatics.com's version of the idea. However, it did not go in depth on how the players were connected, that is, in which year and on which team they played together.

Design

Before starting any development, I designed some components in Figma to get a feel for what I was going to be working on. I based the design on the website of my favourite team, the Miami Heat.

image

Once a user enters two players, the site either displays the connection between them or reports that there is no connection.

image

image

Scraping Player Names and Team Info with Puppeteer

Once I had a decent idea of the project, I needed to settle the specifics. Would I include only active players? Players from the past x years? How would I determine which team they were on? How would I grab pictures?

I realized that the APIs I found were inadequate, as they only held the active players of the current season. Furthermore, they often did not hold a player's previous info (e.g., previous teams) and did not have pictures of inactive players.

Through the fanatics.com website, I found basketball-reference.com. Looking through the website, I was super excited. From headshots to previous teams, they had everything I needed. I began working on a web scraper that would grab all players on each team.

With Puppeteer, I grabbed the links for each team, then went through all past seasons of each team and recorded every player. I stored the players on each team in each season as an object like so:

  "ATL-2021": [
    "John Collins",
    "Kevin Huerter",
    "Solomon Hill",
    "Trae Young",
    "Clint Capela",
    "Danilo Gallinari",
    "Tony Snell",
    "Brandon Goodwin",
    "Cam Reddish",
    "Onyeka Okongwu",
    "Bruno Fernando",
    "Bogdan Bogdanović",
    "Skylar Mays",
    "De'Andre Hunter",
    "Nathan Knight",
    "Lou Williams",
    "Kris Dunn",
    "Rajon Rondo"
  ] ...

I counted a traded player as a member of every team they were on that season. If two players were traded for each other, they counted as having played together on both teams. For example, although Lou Williams was traded from the Clippers for Rajon Rondo, the two still count as teammates.
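As a rough illustration of that rule, here is a minimal Python sketch that derives teammate pairs from a roster object shaped like the one above. The function name `teammate_pairs` is hypothetical and this is not the project's actual code; the roster is a four-player subset of the "ATL-2021" entry.

```python
from itertools import combinations

# Subset of the scraped "ATL-2021" roster shown above; keys are "TEAM-SEASON".
rosters = {
    "ATL-2021": ["Trae Young", "Clint Capela", "Lou Williams", "Rajon Rondo"],
}

def teammate_pairs(rosters):
    """Map each pair of players to the team-seasons whose roster lists both."""
    pairs = {}
    for team_season, players in rosters.items():
        # sorted(set(...)) removes duplicate names and fixes the pair order
        for a, b in combinations(sorted(set(players)), 2):
            pairs.setdefault(frozenset((a, b)), []).append(team_season)
    return pairs

pairs = teammate_pairs(rosters)
print(pairs[frozenset(("Lou Williams", "Rajon Rondo"))])  # ['ATL-2021']
```

Because both traded players appear in the same roster list, they come out as teammates without any special handling for trades.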

Luckily, all of the pages were static .html pages with little to no JavaScript, so each page took little time to render. Nevertheless, it took about 30 minutes to scrape all of the information (~1 min per team).

Working with NERN Stack

I wanted to program with what I was comfortable with. I was familiar with React and Node/Express from previous projects, but I wasn't sure about the database. I knew graph theory was important for finding players within 6 connections, so I did some digging and found graph databases, specifically Neo4j. I read through the Neo4j documentation to figure out how best to achieve my goal.

So why not use SQL and model the relationships there? As I understood it, with this many relationships, the complexity and length of the SQL queries increase drastically. With Neo4j and its Cypher query language, by contrast, creating a relationship was much easier:

MATCH (m:Team {name: $Team})
MERGE (n:Player {name: $name})
CREATE (n)-[:PLAYED_ON]->(m)
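To show how a query like this might be driven from a Python backend, here is a hedged sketch that runs it once per player in a roster object. It assumes a session object exposing `run(query, **params)`, as sessions from the official neo4j Python driver do. `seed_rosters` and `RecordingSession` are hypothetical names, and using the team-season key as the Team name is also an assumption.

```python
# Cypher copied from the query above; $Team and $name are its parameters.
CREATE_PLAYED_ON = (
    "MATCH (m:Team {name: $Team}) "
    "MERGE (n:Player {name: $name}) "
    "CREATE (n)-[:PLAYED_ON]->(m)"
)

def seed_rosters(session, rosters):
    """Run the query once per (team-season, player) pair; return the count.

    Team nodes must already exist: if MATCH finds no team, the MERGE and
    CREATE clauses have no rows to act on and nothing is written.
    """
    count = 0
    for team_season, players in rosters.items():
        for player in players:
            session.run(CREATE_PLAYED_ON, Team=team_season, name=player)
            count += 1
    return count

class RecordingSession:
    """Hypothetical stand-in for a database session so the sketch runs offline."""
    def __init__(self):
        self.calls = []

    def run(self, query, **params):
        self.calls.append((query, params))

session = RecordingSession()
seed_rosters(session, {"ATL-2021": ["Trae Young", "Clint Capela"]})
print(session.calls[0][1])  # {'Team': 'ATL-2021', 'name': 'Trae Young'}
```

Note that CREATE adds a new PLAYED_ON relationship on every run, so re-seeding an existing database with this query would duplicate relationships.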

and querying it from the backend was as simple as:

MATCH (m:Player {name: $Player1}), (n:Player {name: $Player2}),
      p = shortestPath((m)-[*..6]-(n))
RETURN p
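For intuition about what a bounded shortest-path search does, here is a small Python sketch using breadth-first search over an in-memory player/team graph. It illustrates the general technique, not how Neo4j implements shortestPath, and the roster data is made-up placeholder data (hypothetical team and player names).

```python
from collections import deque

def build_graph(rosters):
    """Undirected adjacency sets linking each player to each team-season."""
    graph = {}
    for team_season, players in rosters.items():
        for player in players:
            graph.setdefault(player, set()).add(team_season)
            graph.setdefault(team_season, set()).add(player)
    return graph

def shortest_path(graph, start, goal, max_hops=6):
    """Breadth-first search for a shortest path of at most max_hops edges.

    Returns the list of nodes on the path, or None if no such path exists.
    """
    if start == goal:
        return [start]
    parents = {start: None}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour in parents:
                continue  # already reached by a path at least as short
            parents[neighbour] = node
            if neighbour == goal:
                path = [neighbour]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append((neighbour, hops + 1))
    return None

# Placeholder rosters, not scraped data.
rosters = {
    "TEAM1-2020": ["Player A", "Player B"],
    "TEAM2-2021": ["Player B", "Player C"],
}
graph = build_graph(rosters)
print(shortest_path(graph, "Player A", "Player C"))
# ['Player A', 'TEAM1-2020', 'Player B', 'TEAM2-2021', 'Player C']
```

In a graph that only has player-to-team relationships, each teammate link spans two relationships (player to team, team to player), so the path above uses four hops to connect players two teammate links apart.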

It was very interesting to see the nodes of all the different players and how they connect. There were connections ranging from old, legendary players to up-and-coming rookies.

image

Here's how the past 29 Atlanta Hawks teams (2021-1982) are interconnected.

To get the information, I used Axios to send POST and GET requests from the frontend to the server, which communicated with the database.

Issues along the way

I ran into many issues that I hadn't considered until I met them head-on.

The biggest issue was hosting. I have been searching for ways to host the database, but it is infuriating to be met with very complex systems (Google Cloud Platform gave me a headache) for such a relatively small database. I am still searching for alternatives.

A lot of things required fine-tuning. For example, while gathering players and teams from the past 38 years, I encountered many teams that I didn't recognize because they had since rebranded into newer teams. Their names were different, which caused slight complications in the frontend.
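One possible way to handle rebranded teams, sketched here in Python, is a lookup from legacy team codes to the current franchise code, so the frontend can fall back to assets it already has. This is only an assumption about how it could be done, not the project's actual fix. The three entries use basketball-reference.com team abbreviations, and `current_code` is a hypothetical helper.

```python
# Illustrative mapping from legacy team codes to current franchise codes.
# The mapping itself is an assumption and is deliberately incomplete.
LEGACY_TO_CURRENT = {
    "SEA": "OKC",  # Seattle SuperSonics -> Oklahoma City Thunder
    "NJN": "BRK",  # New Jersey Nets -> Brooklyn Nets
    "VAN": "MEM",  # Vancouver Grizzlies -> Memphis Grizzlies
}

def current_code(team_season):
    """Return the current franchise code for a "TEAM-SEASON" key."""
    code, _season = team_season.rsplit("-", 1)
    # Codes without an entry are assumed to be current already.
    return LEGACY_TO_CURRENT.get(code, code)

print(current_code("SEA-1996"))  # OKC
print(current_code("ATL-2021"))  # ATL
```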

While testing, a big problem was that I often had to look up names to figure out how to spell them correctly. I fixed this by using react-select and react-window to build an efficient search bar containing all of the names, which makes searching quick and easy.

Final Results

Video of usage (no sound)

Six.Degrees.of.NBA.Mozilla.Firefox.2021-04-06.19-45-00.mp4

Pictures

image

image

image

image

Future Updates

  1. Hosting so that users can try it out for themselves (Firebase?).

  2. Add legacy pictures to showcase older brands that were rebranded.

  3. Automate the web scraping to keep the database up to date (requires a hosted database, though).
