https://docs.google.com/presentation/d/1pIlKpr7nOltnwT3XbpoCl8N39OrOgfPFrHyOg4bVkvo/edit?usp=sharing
For the first group project of the UC Berkeley Data Analytics Bootcamp, we were assigned to groups of five or six students and given the freedom to select our own dataset. We then cleaned the dataset and analyzed it using questions that the group formulated while researching datasets.
- Do songs released by a single artist perform better than songs released by collaborating artists?
- What are the most popular day and month to release a track?
- Is there a correlation between how often a song is streamed and how often it is Shazamed?
- How many streams have the top-charting songs received?
- Is there a correlation between a song's danceability score and the number of streams it receives?
- Does a song with a higher acoustic score receive a higher stream count?
- More songs are released by single artists than by collaborating artists
- The most popular release dates are January 1st and May 1st
- There is no correlation between stream count and Shazam count
- There is no correlation between danceability score and stream count
- There is no correlation between acoustic score and stream count
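The correlation findings above rest on a Pearson correlation coefficient between stream counts and another song metric. A minimal sketch of that kind of check follows, assuming pandas is available. The DataFrame values are made up for illustration and are not the project's data, and the column names (`streams`, `shazam_count`, `danceability`) and the helper `pearson_with_streams` are hypothetical.

```python
import pandas as pd

# Made-up illustrative rows; NOT the project's dataset or results.
# Column names are assumptions for this sketch.
songs = pd.DataFrame(
    {
        "streams": [120, 340, 560, 780, 910],
        "shazam_count": [15, 9, 22, 11, 18],
        "danceability": [55, 80, 62, 71, 49],
    }
)


def pearson_with_streams(df: pd.DataFrame, column: str) -> float:
    """Return the Pearson correlation coefficient between `column` and streams."""
    return df["streams"].corr(df[column], method="pearson")


# Report r for each feature; values near 0 suggest little linear relationship.
for feature in ["shazam_count", "danceability"]:
    r = pearson_with_streams(songs, feature)
    print(f"{feature}: r = {r:.2f}")
```

A coefficient close to 0 points to a weak linear relationship, while values near 1 or -1 point to a strong one. The coefficient only measures linear association, so a scatter plot is a useful companion check.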

The final analysis of music release trends reveals intriguing insights into industry dynamics and audience preferences. January emerges as the favored month for new song releases, likely capitalizing on the positive energy of the new year, while May also sees a surge in releases, possibly tied to the onset of warmer weather and increased outdoor activities. Among the released songs, solo artists dominate the most-streamed Spotify songs, highlighting the success of individual careers in the music industry. Surprisingly, both Shazam lookups and Spotify chart placements show weak correlations with streaming numbers, suggesting that these metrics may serve more as badges of honor than as reliable predictors of success. Furthermore, song attributes such as danceability and acoustic scores show no significant relationship with streaming numbers, leaving artists full creative freedom.
Bootcamp challenges that we drew heavily on for inspiration include the following:
- The Scatterpy Challenge
  - Demonstrates linear regression and how to generate scatter plots
- The Family Travel Challenge
  - Demonstrates how to combine and rename data frames using several CSV files
- The Correlation Conundrum Challenge
  - Demonstrates the Pearson r score and how to calculate a correlation coefficient
- The Py Pie Challenge
  - Demonstrates how to generate a pie chart
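Two of the techniques listed above, combining and renaming data frames and fitting a linear regression, can be sketched together as follows. This is an illustrative sketch only, assuming pandas and SciPy are installed. The frames `tracks` and `scores`, their column names, and their values are hypothetical stand-ins for CSV files and do not come from the challenges or the project.

```python
import pandas as pd
from scipy.stats import linregress

# Hypothetical stand-ins for two CSV files; in practice each frame
# would typically be loaded with pd.read_csv on a file path.
tracks = pd.DataFrame(
    {"track_id": [1, 2, 3, 4], "Track Name": ["A", "B", "C", "D"]}
)
scores = pd.DataFrame(
    {
        "track_id": [1, 2, 3, 4],
        "Score X": [1.0, 2.0, 3.0, 4.0],
        "Score Y": [3.0, 5.0, 7.0, 9.0],  # constructed as y = 2x + 1
    }
)

# Combine the frames on their shared key, then rename columns
# to short, consistent names.
combined = tracks.merge(scores, on="track_id", how="inner").rename(
    columns={"Track Name": "track_name", "Score X": "x", "Score Y": "y"}
)

# Fit a least-squares line y = slope * x + intercept.
fit = linregress(combined["x"], combined["y"])
print(f"slope={fit.slope:.2f}, intercept={fit.intercept:.2f}, r={fit.rvalue:.2f}")
```

The fitted `slope` and `intercept` can then be used to draw the regression line over a scatter plot of `x` against `y`, and `rvalue` is the same Pearson r used for the correlation questions.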
For syntax that the group was not familiar with, we used resources such as the Xpert Learning Assistant provided by the UC Berkeley Bootcamp, the pandas documentation, Stack Overflow, and ChatGPT.










