# COGS 108 - Project Proposal

## Authors

This is a modified [CRediT taxonomy of contributions](https://credit.niso.org). For each group member please list how they contributed to this project using these terms:
> Analysis, Background research, Conceptualization, Data curation, Experimental investigation, Methodology, Project administration, Software, Visualization, Writing – original draft, Writing – review & editing

- Emily Vega: Project administration
- Mohan Dong: Background research, Writing
- James Bartelloni: Background research, Methodology

## Research Question

How did the popularity of music genres change from the pre-COVID period (2017–2019) to the post-COVID period (2022–2024), and are changes in genre popularity associated with measurable audio characteristics of tracks (tempo, energy, valence, danceability)?
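
As a minimal sketch of how the period comparison in this question could be set up, the snippet below labels each track as pre-COVID (2017–2019) or post-COVID (2022–2024) and computes the change in mean popularity per genre. The rows, the field order, and the helper names `label_period` and `popularity_change` are hypothetical placeholders for illustration, not our actual dataset or final method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy rows: (genre, release_year, popularity). Not real data.
tracks = [
    ("pop", 2018, 60), ("pop", 2023, 70), ("pop", 2020, 65),
    ("rock", 2017, 50), ("rock", 2022, 40), ("rock", 2024, 44),
]

def label_period(year):
    """Map a year to 'pre', 'post', or None (outside both windows)."""
    if 2017 <= year <= 2019:
        return "pre"
    if 2022 <= year <= 2024:
        return "post"
    return None  # 2020-2021 fall between the two periods and are excluded

def popularity_change(rows):
    """Return {genre: mean post-COVID popularity - mean pre-COVID popularity}."""
    scores = defaultdict(lambda: {"pre": [], "post": []})
    for genre, year, popularity in rows:
        period = label_period(year)
        if period is not None:
            scores[genre][period].append(popularity)
    # Keep only genres observed in both periods so the difference is defined
    return {
        genre: mean(by_period["post"]) - mean(by_period["pre"])
        for genre, by_period in scores.items()
        if by_period["pre"] and by_period["post"]
    }

change = popularity_change(tracks)
print(change)
```

A positive value means the genre's average popularity was higher in the post-COVID window. The same grouping would apply to audio features such as tempo or valence if those fields are available.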


## Background and Prior Work

Music has long served as a window into cultural ties and emotional states. People’s choices in music reflect not only individual taste but also broader societal conditions, such as social environment and collective experience. The outbreak of the COVID-19 pandemic in early 2020 brought unprecedented disruption to everyday life worldwide, prompting researchers to examine how such a global shock altered human behaviour, including music listening and creation, and whether these changes persisted beyond lockdowns into the post-COVID era. Research on the acute pandemic period suggests that listeners turned to music not just for entertainment but also as an emotional regulation strategy, with distinct preferences emerging for nostalgic and emotionally positive songs as a means of coping with stress and isolation.<a name="cite_ref-1"></a>[<sup>1</sup>](#cite_note-1)

Early empirical work on music consumption during the pandemic found that lockdowns significantly changed listening behaviour. For example, analysis of Last.fm users’ streaming records revealed reduced variety and novelty in music consumption during the initial shock of the pandemic, coupled with a greater preference for mainstream artists and familiar tracks, suggesting that uncertainty led listeners to gravitate toward familiar, comforting music.<a name="cite_ref-2"></a>[<sup>2</sup>](#cite_note-2) Another investigation in the UK documented a surge in listening to older ‘nostalgic’ music during lockdown, particularly positive-toned songs, indicating that both nostalgia and emotional valence influenced choices during COVID-19.[<sup>1</sup>](#cite_note-1)

Beyond behavioural changes, researchers have also analysed the structure of musical content itself over time. Some studies have explored how measurable audio features — such as tempo, energy, danceability, and valence — relate to listening contexts or activities. One such study found that Spotify’s audio features like valence and energy are associated with listeners’ motivational contexts (e.g., dance, relaxation), suggesting that higher energy and valence are linked to more active listening situations.<a name="cite_ref-3"></a>[<sup>3</sup>](#cite_note-3) Another data-driven project using Spotify’s API showed that tempo, energy, and valence are among the features most strongly correlated with track popularity across genres, forming a basis for linking audio characteristics with success metrics.<a name="cite_ref-4"></a>[<sup>4</sup>](#cite_note-4)

Recent longitudinal work has explicitly contextualized music trends across COVID and post-COVID periods. A comprehensive analysis of the top 1,000 Spotify songs from 2011 to 2023 reported significant shifts in dominant musical features over time, with pre-pandemic years showing rising trends in energy and loudness, and post-pandemic years characterized by increases in acoustic and introspective qualities.<a name="cite_ref-5"></a>[<sup>5</sup>](#cite_note-5) Though focusing broadly on feature evolution, such work underscores that major global events can coincide with measurable changes in the sonic qualities of popular music, supporting the idea that cultural and technological factors shape music trends.

Other research frames these shifts within broader emotional and societal trends. For instance, sentiment analysis of popular music lyrics across the pre-, during-, and post-COVID periods found thematic changes reflecting anxiety and introspection during the pandemic, suggesting that external stressors are mirrored not only in listening behaviour but also in the creative output of music itself.<a name="cite_ref-6"></a>[<sup>6</sup>](#cite_note-6) Collectively, this prior work suggests that music both reflects and shapes human experience during major societal shocks, and that measurable audio features can serve as quantifiable indicators of broader shifts in cultural mood and preference. Our project, which investigates how genre popularity and audio characteristics changed in the post-COVID era, builds directly on these existing streams of research, extending them with a structured comparison of popularity dynamics and audio features over clearly defined time periods.

<a name="cite_note-1"></a> [^](#cite_ref-1) Yeung, T. Y. C. (2023). Revival of positive nostalgic music during the first Covid-19 lockdown. *Humanities and Social Sciences Communications*. https://www.nature.com/articles/s41599-023-01614-0

<a name="cite_note-2"></a> [^](#cite_ref-2) Ghaffari, M. et al. (2024). The impact of COVID-19 on online music listening behaviours. *Multimedia Tools and Applications*. https://doi.org/10.1007/s11042-023-16079-1

<a name="cite_note-3"></a> [^](#cite_ref-3) Duman, D. (2022). Music we move to: Spotify audio features and reasons. *PLOS ONE*. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0275228

<a name="cite_note-4"></a> [^](#cite_ref-4) Singh, A. (2025). A Data-Driven Approach Using Spotify API. *Research-Archive.org*. https://research-archive.org/index.php/rars/preprint/view/2615/3674

<a name="cite_note-5"></a> [^](#cite_ref-5) Tan, E. E. L., & Ko, A. M. S. (2026). Dynamics of Popular Music Over a Decade. *Journal of Student Research*. https://www.jsr.org/hs/index.php/path/article/view/8744

<a name="cite_note-6"></a> [^](#cite_ref-6) Hemmati, H. & Frohock, B. (2025). Evolving Human Emotions Under a Global Crisis. *WUSS Proceedings*. https://www.wuss.org/proceedings/2025/WUSS-2025-Paper-174.pdf


## Hypothesis


In the post-COVID era, music genres with higher energy, faster tempo, and more positive emotional valence increased in popularity compared to lower-energy or more melancholic genres, as listeners gravitated toward more upbeat and emotionally uplifting music.
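
One way this hypothesis could be checked, sketched here under stated assumptions, is to correlate each genre's mean energy with its post-minus-pre change in popularity. The genre summaries below are invented numbers for illustration only, and `pearson_r` is a small hand-written helper rather than part of any dataset or library we have committed to. A rank-based or permutation test could be substituted if the data turn out to be skewed.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical genre summaries: (mean energy, post-minus-pre popularity change)
genre_summary = {
    "edm": (0.9, 6.0),
    "pop": (0.7, 4.0),
    "rock": (0.6, -1.0),
    "folk": (0.3, -3.0),
}

energy = [e for e, _ in genre_summary.values()]
change = [c for _, c in genre_summary.values()]

r = pearson_r(energy, change)
print(f"energy vs. popularity change: r = {r:.2f}")
```

A clearly positive `r` would be consistent with the hypothesis, while a value near zero or below would not. With only a handful of genres, any such correlation should be treated as descriptive rather than conclusive, and the same check can be repeated for tempo and valence.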

## Data

Instructions: REPLACE the contents of this cell with your work

1. Describe the **ideal** dataset you would want to answer this question. This should include:
   1. What variables? 
   2. How many observations are needed? 
   3. Who/what/how would these data be collected? 
   4. How would these data be stored/organized?
2. Search for potential **real** datasets that could provide you with something useful for this project. For each dataset that you find, write 3-5 sentences describing:
   1. where the data is located (e.g., URL) and anything you need to do to use it (e.g., ask for permission, fill out an application)
   2. what the important variables are in this dataset that you might use
  

## Ethics 

Instructions: Keep the contents of this cell. For each item on the checklist
-  put an X there if you've considered the item
-  IF THE ITEM IS RELEVANT place a short paragraph after the checklist item discussing the issue.
  
Items on this checklist are meant to provoke discussion among good-faith actors who take their ethical responsibilities seriously. Your teams will document these discussions and decisions for posterity using this section.  You don't have to solve these problems, you just have to acknowledge any potential harm no matter how unlikely.

Here is a [list of real world examples](https://deon.drivendata.org/examples/) for each item in the checklist that you can refer to.

[![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

### A. Data Collection
 - [X] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?

> Example of how to use the checkbox, and also of how you can put in a short paragraph that discusses the way this checklist item affects your project.  Remove this paragraph and the X in the checkbox before you fill this out for your project

 - [X] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
 - [X] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
 - [X] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?

### B. Data Storage
 - [X] **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?
 - [X] **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed?
 - [X] **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed?

### C. Analysis
 - [X] **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?
 - [X] **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?
 - [X] **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?
 - [X] **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?
 - [X] **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?

### D. Modeling
 - [X] **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory?
 - [X] **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?
 - [X] **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?
 - [X] **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed?
 - [X] **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?

### E. Deployment
 - [X] **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)?
 - [X] **E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?
 - [X] **E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary?
 - [X] **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed?


## Team Expectations 

Instructions: REPLACE the contents of this cell with your work
  
Read over the [COGS108 Team Policies](https://github.com/COGS108/Projects/blob/master/COGS108_TeamPolicies.md) individually. Then, include your group’s expectations of one another for successful completion of your COGS108 project below. Discuss and agree on what all of your expectations are. Discuss how your team will communicate throughout the quarter and consider how you will communicate respectfully should conflicts arise. By including each member’s name above and by adding their name to the submission, you are indicating that you have read the COGS108 Team Policies, accept your team’s expectations below, and have every intention to fulfill them. These expectations are for your team’s use and benefit — they won’t be graded for their details.

* *Team Expectation 1*
* *Team Expectation 2*
* *Team Expectation 3*
* ...

## Project Timeline Proposal

Instructions: REPLACE the contents of this cell with your work

Specify your team's specific project timeline. An example timeline has been provided. Change the dates, times, names, and details to fit your group's plan.

If you think you will need any special resources or training outside what we have covered in COGS 108 to solve your problem, then your proposal should state these clearly. For example, if you have selected a problem that involves implementing multiple neural networks, please state this so we can make sure you know what you’re doing and so we can point you to resources you will need to implement your project. Note that you are not required to use outside methods.



| Meeting Date  | Meeting Time| Completed Before Meeting  | Discuss at Meeting |
|---|---|---|---|
| 1/20  |  1 PM | Read & Think about COGS 108 expectations; brainstorm topics/questions  | Determine best form of communication; Discuss and decide on final project topic; discuss hypothesis; begin background research | 
| 1/26  |  10 AM |  Do background research on topic | Discuss ideal dataset(s) and ethics; draft project proposal | 
| 2/1  | 10 AM  | Edit, finalize, and submit proposal; Search for datasets  | Discuss Wrangling and possible analytical approaches; Assign group members to lead each specific part   |
| 2/14  | 6 PM  | Import & Wrangle Data (Ant Man); EDA (Hulk) | Review/Edit wrangling/EDA; Discuss Analysis Plan   |
| 2/23  | 12 PM  | Finalize wrangling/EDA; Begin Analysis (Iron Man; Thor) | Discuss/edit Analysis; Complete project check-in |
| 3/13  | 12 PM  | Complete analysis; Draft results/conclusion/discussion (Wasp)| Discuss/edit full project |
| 3/20  | Before 11:59 PM  | NA | Turn in Final Project & Group Project Surveys |