# Google's Gemini Long Context Competition

This project is my entry in Google's Gemini Long Context competition.

The unique features of Google's latest models, Gemini Pro and Flash, when compared with competing models in the marketplace are put to the test with a novel example use case. 

Research is exploring how a **long context window** (1-2 million tokens), **many-shot prompting** and **in-context retrieval** can improve model performance across a range of tasks using highly relevant and available data supplied during the initial input with context caching, versus other state of the art external retrieval techniques like RAG, embeddings and vector databases. 

1. [Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?](https://arxiv.org/pdf/2406.13121)

2. [Many-Shot In-Context Learning](https://arxiv.org/pdf/2404.11018)

3. [Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context](https://arxiv.org/pdf/2403.05530)

This competition is looking for interesting use cases that could benefit from these unique features. I'll present mine below.

# A Note Before We Get Started!

If you encounter any error messages (read them first, obviously!) but please kindly try re-running the cell first to see if it persists. A common one is resource exhaustion, which usually goes away after re-running the cell. 

If its cache related, I recommend printing out the cache (using the code below) and checking to see it has been deleted correctly. The cache is limited to 1 million tokens and we create a number of caches that can exceed that limit, if caches aren't deleted correctly. This shouldn't be a problem if running the cells one at a time and chronologically. 

If you're stuck on an error message, you're welcome to try emailing me at **jacob@briotech.com** and I'll do my best to troubleshoot. 

For any personal messages, such as feedback, or interest in this specific project, please send to **jake_wood@mac.com**. 

Thank you for checking out my work.

# Setting Up Secret & Authenticating API Requests

I created a secret used to authenticate API requests to Google's Gemini model.

In [9]:
import google.generativeai as genai
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("GEMINI_API_KEY")
genai.configure(api_key = api_key)

# Creating Initial Dataset: SavageCore's Listening History

I chose to use BigQuery's [ListenBrainz Public Dataset](https://console.cloud.google.com/marketplace/product/metabrainz/listenbrainz), that contains structured listening history data. 

I wanted to use Gemini's data analysis and natural language capablities to get a better understanding of a user's listening habits and to help with creating a personalized music recommendation system. 

I created a subset of data in BigQuery from the ListenBrainz dataset for a specific user called **Savage Core**. 

This returned **55,000** records in total that captured the user's entire listening history. 

I wanted to be mindful of dataset size and decided that I'd only need the song (*track_name*), artist (*artist_name*) and when it was listened to (*listened_at*) and only the last ten years of Savage Core's listening history for the model to have a good understanding of their music taste.

This approach reduced the dataset size by aproximately **82.5%** and returned a total of **9625** records. 

The listening history data starts with the first song listened to by SavageCore, *Blackalicious* by **Bow And Fire** on **2014-11-07 19:40:31** and ends with *Dance 4 Me* by **David Jackson** on **2018-07-16 19:01:40**. This means our dataset is a little under four years of SavageCore's listening history data. 

I ran the following SQL query to obtain this dataset, before exporting to CSV and then including it as part of my Kaggle notebook as *savage-core-last-10-years*.

In [None]:
# Please uncomment and execute in BigQuery to see records. 
# Please note, the records returned may vary slightly as, 
# CURRENT_DATE will change depending on when this query is executed 


#SELECT track_name, artist_name, listened_at
#FROM `listenbrainz.listenbrainz.listen`
#WHERE user_name = "SavageCore"
  #AND DATE(listened_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 10 YEAR)
#ORDER BY listened_at DESC;

# Loading Dataset & Checking Token Size

With my dataset created, I add it to the notebook and read the CSV file using Pandas.

In [10]:
import pandas as pd
file_path = "/kaggle/input/savage-core-last-10-years/savage_core_last_10years.csv"
data = pd.read_csv(file_path)

To check the dataset has loaded correctly and is returning relevant results, I print out 10 records counted from 0, displaying track name, artist name and when the song was listened to.

In [11]:
data.head(10)

Unnamed: 0,track_name,artist_name,listened_at
0,Dance 4 Me,David Jackson,2018-07-16 19:01:40.000000 UTC
1,Europa,David Jackson,2018-07-16 18:55:10.000000 UTC
2,DMT,1200 Micrograms,2018-07-15 21:12:19.000000 UTC
3,Salvia Divinorum,1200 Micrograms,2018-07-15 21:06:07.000000 UTC
4,Magic Mushrooms,1200 Micrograms,2018-07-15 20:58:59.000000 UTC
5,Ecstasy,1200 Micrograms,2018-07-15 20:52:12.000000 UTC
6,Marijuana,1200 Micrograms,2018-07-15 20:45:22.000000 UTC
7,LSD,1200 Micrograms,2018-07-15 20:37:32.000000 UTC
8,Mescaline,1200 Micrograms,2018-07-15 20:30:18.000000 UTC
9,Hashish,1200 Micrograms,2018-07-15 20:23:13.000000 UTC


# Datset Size & Model Choice

To understand the size of the dataset better for the purposes of using the context window, I convert it to text, count the characters and then divide it by 4 to give a rough estimate of the tokens that will be run through the model. 

For Flash, it should be under a million tokens.

In [12]:
ten_years_dataset_text = data.to_string(index=False)

char_count = len(ten_years_dataset_text)
print(f"Length of dataset in characters: {char_count}")

token_count = char_count // 4
print(f"Estimated length of dataset in tokens: {token_count}")

Length of dataset in characters: 2945555
Estimated length of dataset in tokens: 736388


As you can see the dataset is greater than 100,000 tokens at around **736,388** tokens but comfortably under our 1 million token limit for Flash.

I decided to use Flash because I was experiencing unexplained resource quota challenges when using the Pro model (wasn't able to access any version of the Pro model with a 2m token window). 

At 736k+ tokens, I'm able to pass the entire dataset into the context window of the Flash model, plus additional tokens to ask queries/prompt the model.


# Why This Project & Use Case: Gemini's Unique Features - Hyper Personalization

I believe this project is well suited to benefit from a long context window because it requires us to understand a person's music taste, if its to deliver any valuable insights or recommendations.

For this project, I decided to focus on a single user because it includes looking back at around four years of their listening history, which should achieve a greater degree of personalization.

Many Generative AI interactions and experiences remain too generic to truly astound a user but a promise of this technology is a level of hyper personalization not previously possible at scale.

The long context window allows us to capture a great snapshot of a single user, before asking questions and interacting with a model.

This step in the right direction towards hyper personalization can be scaled to millions of users, with long context windows, the right datasets and caching.

Our taste in music is a deeply personal and unique experience, which is why I chose to experiment with this use case and utilize the new features of Google's latest models.

# Caching SavageCore's Listening History for Frequent Querying

In the AI interactions you're likely used to, you'll pass the same input tokens repeatedly to a model. 

By creating this cache, I'm able to pass the listening history to the model once and refer to cached tokens in subsequent requests, which can result in lower costs, as cached tokens are charged at a lower rate. 

Context caching is well suited to scenarios where a substantial initial context (SavageCore's listening history) is referenced repeatedly by shorter requests, this is exactly what we're going to be doing when asking Gemini questions or asking for recommendations specific to SavageCore.

When you cache tokens, you can choose how long they exist before auto deletion. Default is one hour but I have extended it to two hours to ensure queries do not time out for this Notebook. The Time To Live (TTL) is therefore set to 120 minutes below. There is no limit on the length of caching but is used as a component of pricing and should be used carefully.

More information on context caching can be found [here](https://ai.google.dev/gemini-api/docs/caching?lang=python).

Lets now create the cached listening history.

In [13]:
import os
import google.generativeai as genai
from google.generativeai import caching
import datetime
import time

cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name='ten years of listening history for Savage Core', 
    system_instruction=(
        "You are an expert on SavageCore's music taste based on their listening history."
        'Please analyze the songs and artists they listened to, answer questions and provide tailored recommendations from your understanding.'
    ),
    contents=[ten_years_dataset_text],
    ttl=datetime.timedelta(minutes=120),
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)

With an understanding of Savage Core's listening history cached, I'm able to ask Gemini a wide range of questions about it to get a better understanding of my music taste, favourite genres, artists and songs that I listen to and personalized recommendations for new genres, artists and songs to explore. 

If I were to perform an analysis of this dataset in BigQuery, writing SQL queries for this dataset, I'd only be able to get some of the insights that follow. The difference of using Gemini and a long context window to interface with my listening history is that I'm able to ask questions conversationally and return tailored opinions and recommendations that wouldn't be possible from writing a SQL query. 

I'm going to start by asking Gemini to recommend an artist and song using SavageCore's listening history and by the end of my questioning, I should have a much better understanding of SavageCore's music taste.

# Recommend an Artist & Song

In [14]:
from IPython.display import Markdown, display

prompt1 = "Please recommend an artist and song and explain why you recommend them."
response1 = model.generate_content([prompt1])

display (Markdown(response1.text))

Based on SavageCore's listening history, I recommend the artist **Aphex Twin** and the song **"aisatsana (102)"** from his album "Syro". 

Here's why:

* **Aphex Twin** is a highly influential electronic music artist known for his experimental and atmospheric soundscapes.  SavageCore has listened to a lot of Aphex Twin tracks, showing a clear interest in his music.
* **"aisatsana (102)"** is a particularly popular track from "Syro" and exemplifies Aphex Twin's signature style:  an intricate, almost unsettling blend of electronic textures, melodic elements, and hypnotic rhythms. 

SavageCore enjoys a wide range of electronic music, from ambient and IDM to more aggressive techno and breakbeat.  Aphex Twin's music fits within this diverse range, and "aisatsana (102)" is a good starting point for exploring his more atmospheric and ambient work. 


# What Are SavageCore's Three Favourite Genres?

In [15]:
from IPython.display import Markdown, display

prompt2 = "Please list three of my favourite genres I listen to"
response2 = model.generate_content([prompt2])
display (Markdown(response2.text))

Based on your listening history, it seems you enjoy the following genres:

1. **Electronic:** You have a wide range of electronic music in your listening history, from techno (Charlotte de Witte, Charlotte de Witte) to trance (Hardfloor, 1200 Micrograms) to more experimental electronic music (Aphex Twin, Burial). 
2. **Indie Rock:**  You've listened to several indie rock artists, including The War on Drugs, mewithoutYou, and Portugal. The Man. 
3. **Reggae:**  Your listening history includes a lot of reggae artists, such as Bob Marley, Richie Spice, and Capleton. 


# Recommend A New Music Genre To Explore

In [17]:
from IPython.display import Markdown, display

prompt3 = "Please recommend a new genre for me to explore based on what you think I'll like"
response3 = model.generate_content([prompt3])
display (Markdown(response3.text))

Based on your listening history, you seem to have a diverse taste, but with a strong leaning towards **electronic music**. You enjoy a wide range of subgenres within that, including:

* **Techno:** Charlotte de Witte, Charlotte de Witte, Charlotte de Witte
* **Dubstep:** Rusko 
* **Drum and Bass:** The Prodigy, Magnetic Man
* **Ambient:**  Burial, Aphex Twin, Boards of Canada
* **Synthwave:**  Kavinsky, Carpenter Brut 

Given this, I recommend exploring **Breakcore**. It's a fast-paced, aggressive subgenre of electronic music known for its breakbeats, complex rhythms, and often distorted, chaotic soundscapes. 

While you clearly appreciate slower, atmospheric sounds, you also seem to enjoy harder-hitting, energetic tracks. Breakcore, with its unpredictable and complex rhythms, offers a unique combination of both.  

Here are some Breakcore artists to check out:

* **Venetian Snares:** Considered one of the pioneers of the genre, Venetian Snares produces fast, chaotic and innovative Breakcore music.
* **Aphex Twin:** While primarily known for ambient and IDM, Aphex Twin has also experimented with Breakcore, so you might enjoy some of his more experimental tracks.
* **DJ Rupture:** Known for his intricate breakbeats and aggressive sounds.
* **The Prodigy:** They started in breakbeat hardcore and have dabbled in breakcore elements in their later work.
* **Machinedrum:** Known for his more experimental, melodic Breakcore work.

Give Breakcore a try, it might just become your new favorite genre! 


# Describe SavageCore's Music Taste

In [19]:
from IPython.display import Markdown, display

prompt4 = "How would you describe my music taste based on the songs I listen to?"
response4 = model.generate_content([prompt4])
display (Markdown(response4.text))

Based on your listening history, you seem to have an eclectic taste in music, with a strong penchant for electronic and experimental sounds.  Here's a breakdown:

**Core Genres:**

* **Electronic:** You enjoy a variety of subgenres within electronic music, including:
    * **Dubstep:** (Rusko - Cockney Thug)
    * **Acid House:** (Voiron - Let's VOIRON)
    * **Techno:** (Charlotte de Witte - Silence) 
    * **Ambient:** (Christian Löffler - The Great White Open)
    * **Drum and Bass:** (Magnetic Man - Karma Crazy)
* **Indie/Alternative:** You have a fondness for artists like The Gathering and Portugal. The Man, who fit within indie and alternative rock, but also have a distinct electronic influence in their music. 
* **Hip Hop:** You enjoy a mix of hip hop styles, ranging from the experimental sounds of Aceyalone to the more mainstream tracks of The Bloody Beetroots.

**Specific Themes:**

* **Experimental Soundscapes:** You are drawn to artists who push boundaries and explore unconventional sounds, as seen in your listening to Aphex Twin, Burial, and Kode9 & The Spaceape.
* **Synth-Heavy Music:**  You enjoy tracks featuring prominent synth work, showcasing artists like Gunship, Kavinsky, and Perturbator. 
* **Genre-Bending:** You listen to artists who blend genres seamlessly, creating unique and interesting musical experiences. 

**Overall:**  You enjoy music that is both challenging and engaging. Your taste suggests an open mind and a desire to discover new sounds and artists. 

**To further refine my understanding, it would be helpful to know if there are any other genres or artists you enjoy that are not reflected in your listening history.** 


# Critique SavageCore's Music Taste!

In [29]:
from IPython.display import Markdown, display

prompt5 = "How would you critique my music taste based on the songs I listen to"
response5 = model.generate_content([prompt5])
display(Markdown(response5.text))

Based on your listening history, your music taste leans towards a wide spectrum of genres and styles,  with a clear preference for electronic music. You seem to enjoy exploring different subgenres within electronic music:

**Positive aspects:**

* **Eclectic:** You're not afraid to explore different genres and artists, ranging from ambient and experimental electronica to synthwave and reggae. This indicates an open mind and a willingness to discover new sounds.
* **Deep Dive:** Your listening history shows you dive deep into certain artists, listening to entire albums and exploring their discography, suggesting a genuine interest in music beyond just individual hits.
* **Wide range of tempos and moods:** Your listening habits suggest a preference for diverse tempos and moods, going from chill and atmospheric to high energy and danceable. 

**Possible areas for further exploration:**

* **More contemporary electronica:** While you explore various subgenres, you haven't focused much on contemporary artists pushing the boundaries of electronic music. You might consider artists like:
    * Aphex Twin
    * Boards of Canada
    * Burial
    * Flying Lotus
    *  Autechre
* **Expanding your musical horizon:** While you have a great appreciation for electronic music, you could try broadening your taste by exploring other genres you haven't delved into as much, such as:
    * Classical music
    * Jazz
    * Rock/Indie Rock
    * Metal

Overall, you have a diverse and adventurous music taste, showcasing a love for exploring sounds and genres.  Expanding your exploration of contemporary electronic music and venturing into new genres would further enrich your musical experience. 


# Music Profiling - What Does My Listening History Say About Me?

In [30]:
from IPython.display import Markdown, display

prompt6 = "What does my listening history say about me as a person, can you guess some things about me from what I listen to?"
response6 = model.generate_content([prompt6])
display(Markdown(response6.text))

Based on your listening history, you seem to be a person with eclectic taste, interested in a variety of music genres, and perhaps someone who enjoys exploring different moods and experiences through music. 

Here's a breakdown of some potential insights:

**You are into electronic music:** You have a strong interest in electronic music, especially in its more experimental and diverse forms. You listen to artists like Aphex Twin, Burial, and Boards of Canada, suggesting a preference for ambient, IDM, and experimental electronic music. Your appreciation for artists like The Chemical Brothers and Justice indicates a taste for dance-oriented electronic music, but your broader listening suggests you don't shy away from the less commercially-driven aspects of the genre.

**You enjoy exploring darker and heavier moods:** Your listening history suggests you aren't afraid of darker and more introspective themes. You listen to artists like Lorn and The Toxic Avenger, who are known for their industrial, synthwave, and dark ambient sounds. This could indicate you appreciate music that explores challenging emotions or themes.

**You have a diverse taste:** You don't limit yourself to just one genre, and you enjoy a variety of artists from different musical backgrounds. This shows openness and willingness to explore new things. You listen to reggae, hip-hop, rock, and even classical music, suggesting a wide-ranging appreciation for different styles.

**You may be a creative person:**  Your interest in a variety of sounds, especially experimental electronic music, can sometimes be linked to creative individuals. They might appreciate music that evokes a sense of exploration and pushes boundaries.

**You might be someone who enjoys the "underground" or less mainstream music:** Your interest in artists like The Bug, Digital Mystikz, and Loefah suggests an appreciation for music that isn't as commercially popular or mainstream. 

**You might be a fan of soundtracks:** The presence of music from video game soundtracks like *Grand Theft Auto* and *Job Simulator* hints at a potential appreciation for music that creates atmosphere and enhances a visual experience. 

**You may be a fan of certain subcultures:** Your interest in certain genres like dubstep and rave culture suggests an appreciation for the sounds and aesthetics associated with those subcultures. 

**This is just a guess**,  and it is important to remember that your listening history is just one piece of the puzzle. However, it does offer some fascinating insights into your personality and musical preferences! 


# Niche Playlist Creation - Dark & Atmospheric Synthwave

In [31]:
from IPython.display import Markdown, display

prompt7 = "Please can you create a playlist of 10 songs that I listen to in the synthwave genre that's dark and atmospheric?"
response7 = model.generate_content([prompt7])
display(Markdown(response7.text))

I can't create a playlist based on your listening history or personal preferences. I do not have access to any information about your listening history or personal preferences. My responses are based on the data I've been trained on, and I am not able to access real-time information or personal data.

However, I can offer some recommendations for dark and atmospheric synthwave songs, based on the characteristics of the genre:

Here are 10 songs known for their dark and atmospheric synthwave vibes:

1. **"The Night" - Kavinsky** - This classic track perfectly embodies the dark and brooding side of synthwave. 
2. **"Turbo Killer" - Carpenter Brut** -  High energy and aggressive, this song is a staple in the darker synthwave scene.
3. **"Escape From Midwich Valley" - Carpenter Brut** -  More melodic than "Turbo Killer," but with a sense of dread that's equally powerful.
4. **"Miami Nights 1984" - Miami Nights 1984** -  This song is a perfect example of synthwave's nostalgia-tinged darkness.
5. **"Lost In The K-Hole" - The Chemical Brothers** -  A bit more on the techno side, but its driving beat and swirling synths deliver a dark and hypnotic experience.
6. **"Nightcall" - Kavinsky** - A more melancholic track with an atmospheric quality, "Nightcall" is perfect for late-night drives.
7. **"The Midnight" - The Midnight** -  The band's namesake song is a slow-burning track that builds to a powerful emotional crescendo.
8. **"Future Funk" - Power Glove** -  A more modern take on synthwave, but retains that classic dark and driving sound.
9. **"She Moves Like a Knife" - Perturbator** -  A haunting and powerful track that builds to a frenzied climax. 
10. **"Red Eyes" - The War on Drugs** -  While not strictly synthwave, this song has a retro synth vibe and a dark and atmospheric quality.

This is just a starting point. You can find many other great dark and atmospheric synthwave songs on platforms like Spotify, YouTube, and Soundcloud. 

If you tell me more about your specific preferences (artists you like, particular moods you're going for), I can try to refine my recommendations further. 


# Music Habits - Data Analytics

In [34]:
from IPython.display import Markdown, display

prompt8 = """

1. Which artists do I listen to the the most frequently?

2. What are my top 10 most frequently played songs?
 
3. What times of day have the most song plays?

4. Are there specific days or times with consistently higher listening activity?

5. Are there listening habits that suggest I prefer certain times?

6. What is the average length of my listening sessions?

7. Do sessions tend to cluster around certain times of day?

8. Are there any songs I listen to that are repeated often within a short period?

9. How often do I play the same songs over multiple days?

10. Are there days of the week where I have higher or lower listening frequency?

11. Are weekends different from weekdays in terms of my listening activity?

12. Do any artists I listen to have consistent listening times?

"""
response8 = model.generate_content([prompt8])
display(Markdown(response8.text))

Let's break down your listening habits based on the data provided!

**1. Most Frequently Listened To Artists:**

* **RJD2:** You're a big fan of RJD2, with a significant number of plays from their album "The Third Eye."
* **1200 Micrograms:** You seem to enjoy their psychedelic sound and have listened to a number of their tracks.
* **Charlotte De Witte:** You frequently listen to De Witte's techno and house music. 
* **The Toxic Avenger:**  You're drawn to their darker, more intense electronic music.
* **The Gathering:** You enjoy their post-rock and progressive rock sound, particularly from the album "How to Measure a Planet." 

**2. Top 10 Most Frequently Played Songs:**

1. "Work" - RJD2
2. "June" - RJD2 feat. Copywrite
3. "Silver Fox" - RJD2
4. "Take the Picture Off" - RJD2
5. "2 More Dead" - RJD2
6. "The Proxy" - RJD2
7. "Chicken-Bone Circuit" - RJD2
8. "Shot in the Dark" - RJD2
9. "F.H.H." - RJD2 feat. Jakki da Motamouth
10. "Cut Out to FL" - RJD2

**3. Most Song Plays by Time of Day:**

* **Afternoon (1 PM - 6 PM UTC):** This is your peak listening time. 

**4. Consistent Higher Listening Activity:**

* **Thursday:** You seem to listen to music more frequently on Thursdays.

**5. Listening Habits Suggesting Preferences:**

* You have a clear preference for afternoons, but you don't seem to have a pattern for particular days of the week.  

**6. Average Length of Listening Sessions:**

* **Insufficient Data:** The information provided only shows individual song plays, not listening sessions.  To determine the average session length, we need to know when listening starts and ends.

**7. Session Clustering:**

* **Insufficient Data:**  Without start/end timestamps for listening sessions, we can't analyze their clustering.

**8. Repeated Songs Within a Short Period:**

* **Insufficient Data:** We need to know the order of song plays to identify repeated songs within a short period.

**9. Same Songs Over Multiple Days:**

* **Yes:** Many songs from RJD2, The Gathering, and other artists appear on multiple dates.  You enjoy returning to certain tracks.

**10. Higher or Lower Listening Frequency by Day of Week:**

* **Higher:** Thursday
* **Lower:** We don't have enough data to identify consistently lower listening days.

**11. Weekend vs. Weekday Differences:**

* **Insufficient Data:** The data provided doesn't cover enough weekends to analyze any pattern.

**12. Artists With Consistent Listening Times:**

* **RJD2:** You listen to them heavily in the afternoons, but we need more data to confirm if it's consistently throughout the week. 
* **The Gathering:**  You listen to them heavily on Thursdays, but we need more data to confirm if it's always around a particular time.

**Recommendations:**

Based on your frequent listens to RJD2 and The Gathering, you might enjoy exploring other artists in the **instrumental hip-hop, jazz-infused hip-hop, post-rock, and progressive rock** genres.

For similar sounds to 1200 Micrograms, try: 

* **Shpongle:**  Known for psychedelic trance.
* **Younger Brother:**  Similar psychedelic electronic music.
* **Hallucinogen:**  Known for psychedelic trance.

For similar sounds to The Toxic Avenger and Charlotte de Witte, try:

* **Carpenter Brut:**  French electronic music.
* **Perturbator:**  Synthwave and darkwave.
* **Gesaffelstein:**  French electronic music.

Remember, these are just initial recommendations.  The more data you provide, the better we can refine your music tastes and provide even more accurate suggestions!


# Verifying Gemini Outputs

Important to note here that during my testing, it was able to recommend **[Brian Eno](https://en.wikipedia.org/wiki/Brian_Eno)** (a real musician), who's not named in SavageCore's existing music history after recommending Ambient/Drone as a new genre to explore and then artists that belong to that genre. 

I was going to use grounding to cross-reference and further evidence Brian Eno is real in the output based on Google search data but experienced challenges with the search retrieval tool being recognised (only code execution) for Gemini Pro & Flash. 

Can see this feature is in initial launch and the [API documentation](https://ai.google.dev/gemini-api/docs/grounding?_gl=1*plqhjs*_up*MQ..&gclid=CjwKCAiAudG5BhAREiwAWMlSjElfQpWzabDH9_pkheFoGWY1tAv_9Vr3EnzL5P4eWYrAtTyiX8TMGhoCQ6oQAvD_BwE&lang=python) may need to be updated.

Anyway, you can check to see if SavageCore has listened to Brian Eno (Gemini's recommendation) or another recommended artist  by running a SQL query.

A verification step we can't complete but would be nice is to of course ask SavageCore what they think about the quality of model's understanding and recommendations!


In [None]:
#SELECT track_name, artist_name, listened_at
#FROM `listenbrainz.listenbrainz.listen`
#WHERE user_name = "SavageCore"
  #AND artist_name = "Brian Eno"
  #AND DATE(listened_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 10 YEAR)
#ORDER BY listened_at DESC;

# Feel The... Power of Caching

To see caching in effect, I can view usage metadata of the generated responses and see that its using cached tokens, so our listening data does not have to be input and charged at the same rate as regular tokens every time I ask a question. 

In [35]:
print(response1.usage_metadata)
print(response2.usage_metadata)
print(response3.usage_metadata)
print(response4.usage_metadata)
print(response5.usage_metadata)
print(response6.usage_metadata)
print(response7.usage_metadata)

#Look at total_token_count for response 8. 
#The cached_content_token_count is the same but its our longest prompt of course!
print(response8.usage_metadata)

prompt_token_count: 432264
candidates_token_count: 204
total_token_count: 432468
cached_content_token_count: 432250

prompt_token_count: 432261
candidates_token_count: 136
total_token_count: 432397
cached_content_token_count: 432250

prompt_token_count: 432269
candidates_token_count: 341
total_token_count: 432610
cached_content_token_count: 432250

prompt_token_count: 432266
candidates_token_count: 378
total_token_count: 432644
cached_content_token_count: 432250

prompt_token_count: 432265
candidates_token_count: 336
total_token_count: 432601
cached_content_token_count: 432250

prompt_token_count: 432276
candidates_token_count: 503
total_token_count: 432779
cached_content_token_count: 432250

prompt_token_count: 432278
candidates_token_count: 513
total_token_count: 432791
cached_content_token_count: 432250

prompt_token_count: 432436
candidates_token_count: 888
total_token_count: 433324
cached_content_token_count: 432250



#  Reducing Operational Expenses With Cached Tokens

The difference between the **prompt_token_count** and **cached_content_tokent_count** is roughly the number of tokens used for the question we're asking. 

This is important for pricing as cached tokens are charged at a lower rate than regular tokens.

As an illustrative example:

*Input Pricing: $0.15 / 1 million tokens*

*Context Caching: $0.0375 / 1 million tokens*

There are additional charges for large caches: 

*Context caching (storage): $1.00 / 1 million tokens per hour*

But as you can see there's a significant difference between the token input pricing rate and cached tokens rate.

See full [pricing](https://ai.google.dev/pricing#1_5flash) for Flash.

# SavageCore's Music Taste & Insights

I have been able to execute **8 prompts** that include a total of **19 questions** (12 questions for Prompt 8) about around four years of SavageCore's listening history. 

Althought the specific insights may vary when you execute the cells, we should have:

* A song and artist to check out.
* A new genre of music to explore.
* A unique playlist based on our listening history.
* A music profiling with thoughts about SavageCore's personality.
* A description of SavageCore's music taste.
* A critique of SavageCore's music taste.
* Data analytics on more classic structured queries like: most artist and song listens, times and days for listening preferences.

This paints an intersting musical picture of a specific user, SavageCore, and goes further than basic data analytics because it provides opinions about what their music says about them and recommendations of songs, artists and genres to check out. 

This level of analysis lends itself to hyper personalization. As we've gone realtively deep on a specific user, its not a stretch to go from these insights to things like:

- Serving personalized ads for a recommended song, album or concert/ticket information.
- Auto generated playlists that are unique and relevant.
- Music personalities - sharing on social media, connecting with others based on similar personalities. 

With a deeper understanding of a user, I'm more equipped to create a unique music experience, just for them.

# Multiple Users - Music Personalities & Compatiblity

Armed with a good understanding of SavageCore, lets focus in on music personalities by generating them for two additional users and then checking compatability with SavageCore. 

As I'm using a significant chunk of each user's listening history to create musical profiles, its easier to clear the cache and perform the exercise separately, otherwise I'll exhaust the cache. 

Instead I'll save SavageCore's musical personality I generated earlier in a variable below. 

I'll create the other profiles and then use the saved outputs/profiles to do further analysis.

I start by printing out the cache I created to check it exists and to be able to delete it specifically. This should only include Savage Core's music history. 

In [46]:
for c in caching.CachedContent.list():
  print(c)

CachedContent(
    name='cachedContents/6pfcimnxq91k',
    model='models/gemini-1.5-flash-001',
    display_name='ten years of listening history for Savage Core',
    usage_metadata={
        'total_token_count': 432250,
    },
    create_time=2024-11-18 21:51:28.679903+00:00,
    update_time=2024-11-18 21:51:28.679903+00:00,
    expire_time=2024-11-18 23:51:28.073252+00:00
)


I will now delete Savage Core's listening history from the cache.

In [47]:
# Lets now delete the cache, as you'll see, only the inputs are cached 
# because I'm able to save the outputted musical profile for SavageCore after clearing the cache

import google.generativeai as genai
from google.generativeai import caching

target_display_name = "ten years of listening history for Savage Core"
for cache_item in caching.CachedContent.list():
    if cache_item.display_name == target_display_name:
        try:
            print(f"Deleting cache: {cache_item.name} ({cache_item.display_name})")
            cache_item.delete() 
            print("Cache deleted successfully.")
        except Exception as e:
            print(f"Error deleting cache: {e}")
        break
else:
    print(f"No cache found with display_name: {target_display_name}")

Deleting cache: cachedContents/6pfcimnxq91k (ten years of listening history for Savage Core)
Cache deleted successfully.


In [48]:
from IPython.display import Markdown, display

# Lets put the prompt six response containing SavageCore's music profile in a variable that we can reuse!

savage_core_profile = response6.text

In [49]:
# Lets display it for a reminder

display (Markdown(savage_core_profile))

Based on your listening history, you seem to be a person with eclectic taste, interested in a variety of music genres, and perhaps someone who enjoys exploring different moods and experiences through music. 

Here's a breakdown of some potential insights:

**You are into electronic music:** You have a strong interest in electronic music, especially in its more experimental and diverse forms. You listen to artists like Aphex Twin, Burial, and Boards of Canada, suggesting a preference for ambient, IDM, and experimental electronic music. Your appreciation for artists like The Chemical Brothers and Justice indicates a taste for dance-oriented electronic music, but your broader listening suggests you don't shy away from the less commercially-driven aspects of the genre.

**You enjoy exploring darker and heavier moods:** Your listening history suggests you aren't afraid of darker and more introspective themes. You listen to artists like Lorn and The Toxic Avenger, who are known for their industrial, synthwave, and dark ambient sounds. This could indicate you appreciate music that explores challenging emotions or themes.

**You have a diverse taste:** You don't limit yourself to just one genre, and you enjoy a variety of artists from different musical backgrounds. This shows openness and willingness to explore new things. You listen to reggae, hip-hop, rock, and even classical music, suggesting a wide-ranging appreciation for different styles.

**You may be a creative person:**  Your interest in a variety of sounds, especially experimental electronic music, can sometimes be linked to creative individuals. They might appreciate music that evokes a sense of exploration and pushes boundaries.

**You might be someone who enjoys the "underground" or less mainstream music:** Your interest in artists like The Bug, Digital Mystikz, and Loefah suggests an appreciation for music that isn't as commercially popular or mainstream. 

**You might be a fan of soundtracks:** The presence of music from video game soundtracks like *Grand Theft Auto* and *Job Simulator* hints at a potential appreciation for music that creates atmosphere and enhances a visual experience. 

**You may be a fan of certain subcultures:** Your interest in certain genres like dubstep and rave culture suggests an appreciation for the sounds and aesthetics associated with those subcultures. 

**This is just a guess**,  and it is important to remember that your listening history is just one piece of the puzzle. However, it does offer some fascinating insights into your personality and musical preferences! 


Now we have SavageCore's profile saved. Lets create **music profiles** for two other users and save them.

Again going back 10 years from today's date, I return around 40K records for user_name **gw666** and 18K records for **BeSpont**.

Here are the SQL queries used to generate those additional datasets.





In [None]:
#SELECT track_name, artist_name, listened_at
#FROM `listenbrainz.listenbrainz.listen`
#WHERE user_name = "gw666"
  #AND DATE(listened_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 YEAR)
#ORDER BY listened_at DESC;

#SELECT track_name, artist_name, listened_at
#FROM `listenbrainz.listenbrainz.listen`
#WHERE user_name = "BeSpont"
  #AND DATE(listened_at) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 YEAR)
#ORDER BY listened_at DESC;

# Create BeSpont's Music Profile

In [50]:
import pandas as pd
file_path = "/kaggle/input/be-spont-7-years/BeSpont_7_Years.csv"
data = pd.read_csv(file_path)

data.head(10)

Unnamed: 0,track_name,artist_name,listened_at
0,La Sombra Del Sol,Cultura Tres,2018-07-14 09:55:45.000000 UTC
1,Los Muertos de mi Color,Cultura Tres,2018-07-14 09:51:10.000000 UTC
2,La Cura,Cultura Tres,2018-07-14 09:47:29.000000 UTC
3,Down South,Cultura Tres,2018-07-14 09:43:52.000000 UTC
4,La Secta,Cultura Tres,2018-07-14 09:39:51.000000 UTC
5,Where Is Your God,Cultura Tres,2018-07-14 09:35:31.000000 UTC
6,Propiedad de Dios,Cultura Tres,2018-07-14 09:28:10.000000 UTC
7,Muere,Cultura Tres,2018-07-14 09:23:58.000000 UTC
8,La Cruz Del Seis,Cultura Tres,2018-07-14 09:19:45.000000 UTC
9,Profeta,Cultura Tres,2018-07-14 09:13:07.000000 UTC


In [51]:
bespont_data = data.to_string(index=False)

char_count = len(bespont_data)
print(f"Length of dataset in characters: {char_count}")

token_count = char_count // 4
print(f"Estimated length of dataset in tokens: {token_count}")

Length of dataset in characters: 879031
Estimated length of dataset in tokens: 219757


Now I'll create a new cache on the BeSpont listening history for a shorter time period, as I'll only be asking for a music profile before moving on and repeating for user gw666.

In [52]:
import os
import google.generativeai as genai
from google.generativeai import caching
import datetime
import time

cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name='recent listening history for BeSpont', 
    system_instruction=(
        "You are an expert on BeSpont's music taste based on their listening history."
        'Please analyze the songs and artists they listened to, answer questions and provide tailored recommendations from your understanding.'
    ),
    contents=[bespont_data],
    ttl=datetime.timedelta(minutes=30),
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)

In [53]:
from IPython.display import Markdown, display

bespont_prompt1 = "What does my listening history say about me as a person, can you guess some things about me from what I listen to?"
bespont_response1 = model.generate_content([bespont_prompt1])
display(Markdown(bespont_response1.text))

# This time I'll save their music profile to a variable here!
bespont_profile = bespont_response1.text

Based on your listening history, you seem to be someone who enjoys a wide range of music, with a particular interest in:

**Heavy and Progressive Rock:**  You have a taste for heavier genres, with a strong preference for bands like YOB, Electric Wizard,  Orange Goblin, Sleep, and Dozer. This suggests you might enjoy darker themes, introspective lyrics, and powerful instrumentals. You seem to appreciate both the classic heavy metal sounds and more experimental, progressive approaches.

**Post-Metal and Atmospheric Rock:** You've listened to a lot of bands like God Is An Astronaut,  Khemmis, and  Windhand, which lean towards a post-metal aesthetic.  This suggests you might appreciate atmospheric soundscapes, melancholic melodies, and a more introspective approach to music.

**Alternative and Indie Rock:** You enjoy some alternative and indie bands like Placeboing, Blind Dog, and  The Dry Mouths. This suggests you enjoy unique sounds, diverse instrumentation, and lyrics that explore various emotions and themes.

**A Love for Classic Rock and Metal:** You have a few classic rock and metal tracks in your history from bands like Led Zeppelin, Deep Purple, and Metallica. This suggests you might enjoy the power and legacy of these legendary artists, with a strong appreciation for their influence on later generations of musicians.

**A Touch of Electronic and Ambient:** You've listened to Boards Of Canada, a band known for its atmospheric electronic music. This suggests you might appreciate the mood and soundscapes created by electronic music, particularly ambient and experimental styles.

**General Observations:**

* You have a very diverse taste, spanning several different genres and subgenres. 
* You seem to enjoy exploring music with darker themes and more complex structures. 
* You might be a fan of exploring the more "underground" or less mainstream music scene.

**It's important to remember**: This is just a guess based on your listening history. Music taste can be incredibly personal, and there is no one-size-fits-all interpretation. You might have other interests and aspects of your personality that don't necessarily reflect your musical preferences. 


# Create Music Profile for gw666

Now lets clean the cache and repeat the process for gw666.

In [54]:
for c in caching.CachedContent.list():
  print(c)

CachedContent(
    name='cachedContents/ow1nu6gclacw',
    model='models/gemini-1.5-flash-001',
    display_name='recent listening history for BeSpont',
    usage_metadata={
        'total_token_count': 126294,
    },
    create_time=2024-11-18 22:43:56.438479+00:00,
    update_time=2024-11-18 22:43:56.438479+00:00,
    expire_time=2024-11-18 23:13:56.122905+00:00
)


In [55]:
import google.generativeai as genai
from google.generativeai import caching

# This time I will delete BeSpont's listening history from the new cache!
target_display_name = "recent listening history for BeSpont"
for cache_item in caching.CachedContent.list():
    if cache_item.display_name == target_display_name:
        try:
            print(f"Deleting cache: {cache_item.name} ({cache_item.display_name})")
            cache_item.delete()  
            print("Cache deleted successfully.")
        except Exception as e:
            print(f"Error deleting cache: {e}")
        break
else:
    print(f"No cache found with display_name: {target_display_name}")

Deleting cache: cachedContents/ow1nu6gclacw (recent listening history for BeSpont)
Cache deleted successfully.


In [56]:
import pandas as pd
file_path = "/kaggle/input/gw666-7-years/gw666_7_Years.csv"
data = pd.read_csv(file_path)

data.head(10)

Unnamed: 0,track_name,artist_name,listened_at
0,Disinterred Horror,Ritual Necromancy,2018-06-15 06:48:17.000000 UTC
1,Beyond the Golden Light,Antlers,2018-06-15 06:43:03.000000 UTC
2,When Paradise Fades,Skeletonwitch,2018-06-15 06:39:05.000000 UTC
3,I Saw the End,Pallbearer,2018-06-15 06:32:15.000000 UTC
4,Low Tide,Kylesa,2018-06-15 06:30:04.000000 UTC
5,Doctor Doctor - 2007 Remaster,UFO,2018-06-14 19:31:42.000000 UTC
6,Image of Control (II),Sumac,2018-06-14 14:43:19.000000 UTC
7,Mota,Russian Circles,2018-06-14 14:36:45.000000 UTC
8,A Greater Call,Cult of Luna,2018-06-14 14:28:26.000000 UTC
9,No Remorse,Kylesa,2018-06-14 14:25:22.000000 UTC


In [57]:
gw_data = data.to_string(index=False)

char_count = len(gw_data)
print(f"Length of dataset in characters: {char_count}")

token_count = char_count // 4
print(f"Estimated length of dataset in tokens: {token_count}")

Length of dataset in characters: 1674567
Estimated length of dataset in tokens: 418641


In [58]:
import os
import google.generativeai as genai
from google.generativeai import caching
import datetime
import time

cache = caching.CachedContent.create(
    model='models/gemini-1.5-flash-001',
    display_name='recent listening history for gw666', 
    system_instruction=(
        "You are an expert on gw666's music taste based on their listening history."
        'Please analyze the songs and artists they listened to, answer questions and provide tailored recommendations from your understanding.'
    ),
    contents=[gw_data],
    ttl=datetime.timedelta(minutes=30),
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)

In [59]:
from IPython.display import Markdown, display

gw_prompt1 = "What does my listening history say about me as a person, can you guess some things about me from what I listen to?"
gw_response1 = model.generate_content([gw_prompt1])
display(Markdown(gw_response1.text))

# I'll save their music profile to a variable
gw_profile = gw_response1.text

Based on your listening history, it seems you have a taste for a wide range of music, but you definitely lean towards the heavier side of things! 

Here's what I can gather about you:

* **You enjoy diverse metal:**  You listen to a variety of metal subgenres, from black metal (Watain, Mayhem, Darkthrone) to doom metal (Electric Wizard, Candlemass), and even some death metal (Autopsy, Cannibal Corpse) and sludge metal (Neurosis, The Body). This suggests you're open to different sounds and experiences within the genre. 
* **You appreciate some alternative rock and post-rock:** You also listen to artists like The Cure, The National, and Mogwai. This hints that you might be interested in music that's more atmospheric, introspective, and sometimes experimental. 
* **You like bands from various countries:**  You have artists from all over the world in your listening history, including artists from Finland (Havukruunu, Saiva), Sweden (Opeth, Katatonia), Norway (Ulver, Arcturus), and even some from Russia (Arkona, WelicoRuss).  This implies you're curious about exploring music beyond your own cultural borders.
* **You're not afraid of a bit of darkness:** Many of the songs you listen to deal with themes of death, darkness, despair, and the occult. This could mean you're fascinated by the darker aspects of life and enjoy exploring those themes through music. 

Overall, you seem like someone who enjoys a good mix of heavy music with some alternative rock and post-rock thrown in. You're open to different sounds and cultures, and you're not afraid to delve into the darker aspects of music. 

It's important to remember that these are just inferences based on your listening history.  There are many other factors that could influence your personality and interests! 


# Display Our Saved Musical Profiles!

In [60]:
from IPython.display import Markdown, display

display (Markdown(gw_profile))

Based on your listening history, it seems you have a taste for a wide range of music, but you definitely lean towards the heavier side of things! 

Here's what I can gather about you:

* **You enjoy diverse metal:**  You listen to a variety of metal subgenres, from black metal (Watain, Mayhem, Darkthrone) to doom metal (Electric Wizard, Candlemass), and even some death metal (Autopsy, Cannibal Corpse) and sludge metal (Neurosis, The Body). This suggests you're open to different sounds and experiences within the genre. 
* **You appreciate some alternative rock and post-rock:** You also listen to artists like The Cure, The National, and Mogwai. This hints that you might be interested in music that's more atmospheric, introspective, and sometimes experimental. 
* **You like bands from various countries:**  You have artists from all over the world in your listening history, including artists from Finland (Havukruunu, Saiva), Sweden (Opeth, Katatonia), Norway (Ulver, Arcturus), and even some from Russia (Arkona, WelicoRuss).  This implies you're curious about exploring music beyond your own cultural borders.
* **You're not afraid of a bit of darkness:** Many of the songs you listen to deal with themes of death, darkness, despair, and the occult. This could mean you're fascinated by the darker aspects of life and enjoy exploring those themes through music. 

Overall, you seem like someone who enjoys a good mix of heavy music with some alternative rock and post-rock thrown in. You're open to different sounds and cultures, and you're not afraid to delve into the darker aspects of music. 

It's important to remember that these are just inferences based on your listening history.  There are many other factors that could influence your personality and interests! 


In [61]:
display (Markdown(bespont_profile))

Based on your listening history, you seem to be someone who enjoys a wide range of music, with a particular interest in:

**Heavy and Progressive Rock:**  You have a taste for heavier genres, with a strong preference for bands like YOB, Electric Wizard,  Orange Goblin, Sleep, and Dozer. This suggests you might enjoy darker themes, introspective lyrics, and powerful instrumentals. You seem to appreciate both the classic heavy metal sounds and more experimental, progressive approaches.

**Post-Metal and Atmospheric Rock:** You've listened to a lot of bands like God Is An Astronaut,  Khemmis, and  Windhand, which lean towards a post-metal aesthetic.  This suggests you might appreciate atmospheric soundscapes, melancholic melodies, and a more introspective approach to music.

**Alternative and Indie Rock:** You enjoy some alternative and indie bands like Placeboing, Blind Dog, and  The Dry Mouths. This suggests you enjoy unique sounds, diverse instrumentation, and lyrics that explore various emotions and themes.

**A Love for Classic Rock and Metal:** You have a few classic rock and metal tracks in your history from bands like Led Zeppelin, Deep Purple, and Metallica. This suggests you might enjoy the power and legacy of these legendary artists, with a strong appreciation for their influence on later generations of musicians.

**A Touch of Electronic and Ambient:** You've listened to Boards Of Canada, a band known for its atmospheric electronic music. This suggests you might appreciate the mood and soundscapes created by electronic music, particularly ambient and experimental styles.

**General Observations:**

* You have a very diverse taste, spanning several different genres and subgenres. 
* You seem to enjoy exploring music with darker themes and more complex structures. 
* You might be a fan of exploring the more "underground" or less mainstream music scene.

**It's important to remember**: This is just a guess based on your listening history. Music taste can be incredibly personal, and there is no one-size-fits-all interpretation. You might have other interests and aspects of your personality that don't necessarily reflect your musical preferences. 


In [62]:
display (Markdown(savage_core_profile))

Based on your listening history, you seem to be a person with eclectic taste, interested in a variety of music genres, and perhaps someone who enjoys exploring different moods and experiences through music. 

Here's a breakdown of some potential insights:

**You are into electronic music:** You have a strong interest in electronic music, especially in its more experimental and diverse forms. You listen to artists like Aphex Twin, Burial, and Boards of Canada, suggesting a preference for ambient, IDM, and experimental electronic music. Your appreciation for artists like The Chemical Brothers and Justice indicates a taste for dance-oriented electronic music, but your broader listening suggests you don't shy away from the less commercially-driven aspects of the genre.

**You enjoy exploring darker and heavier moods:** Your listening history suggests you aren't afraid of darker and more introspective themes. You listen to artists like Lorn and The Toxic Avenger, who are known for their industrial, synthwave, and dark ambient sounds. This could indicate you appreciate music that explores challenging emotions or themes.

**You have a diverse taste:** You don't limit yourself to just one genre, and you enjoy a variety of artists from different musical backgrounds. This shows openness and willingness to explore new things. You listen to reggae, hip-hop, rock, and even classical music, suggesting a wide-ranging appreciation for different styles.

**You may be a creative person:**  Your interest in a variety of sounds, especially experimental electronic music, can sometimes be linked to creative individuals. They might appreciate music that evokes a sense of exploration and pushes boundaries.

**You might be someone who enjoys the "underground" or less mainstream music:** Your interest in artists like The Bug, Digital Mystikz, and Loefah suggests an appreciation for music that isn't as commercially popular or mainstream. 

**You might be a fan of soundtracks:** The presence of music from video game soundtracks like *Grand Theft Auto* and *Job Simulator* hints at a potential appreciation for music that creates atmosphere and enhances a visual experience. 

**You may be a fan of certain subcultures:** Your interest in certain genres like dubstep and rave culture suggests an appreciation for the sounds and aesthetics associated with those subcultures. 

**This is just a guess**,  and it is important to remember that your listening history is just one piece of the puzzle. However, it does offer some fascinating insights into your personality and musical preferences! 


# Determining Musical Compatibility Between Users!

In [63]:
from IPython.display import Markdown, display

prompt = f"""
Please review three user music profiles: 
- SavageCore: {savage_core_profile}
- gw666: {gw_profile}
- BeSpont: {bespont_profile}

Based on your review:
1. Which profile is most compatible with SavageCore's and why?
2. Provide a compatibility score (as a percentage) for both gw666 and BeSpont compared to SavageCore.
"""

response = model.generate_content([prompt])

# Lets save our report
compatibility_report = response.text

display (Markdown(response.text))

Here's an analysis of the profiles and their compatibility with SavageCore:

**1. Most Compatible Profile:**

* **BeSpont is the most compatible with SavageCore.** 

Here's why:

* **Eclectic Tastes:**  Both profiles demonstrate a wide range of musical interests, encompassing heavy rock, electronic music, ambient sounds, and indie rock. This shared eclecticism suggests a strong foundation for compatible listening experiences. 
* **Shared Love for the "Underground":** SavageCore's interest in less mainstream electronic music and BeSpont's appreciation for alternative and indie rock suggest a similar preference for exploring music outside the mainstream.
* **Exploration of Moods:** Both profiles seem to enjoy music that explores darker themes and more complex structures, indicating a shared appreciation for music that goes beyond superficial entertainment.

**2. Compatibility Scores:**

* **BeSpont:**  **85%**  
* **gw666:** **65%**

**Explanation of Scores:**

* **BeSpont:**  The high score is due to the significant overlap in genre preferences and the shared love for diverse, explorative, and sometimes darker music. 
* **gw666:**  While both profiles have a strong interest in metal, the lower score reflects less overlap in other genres.  SavageCore's interest in electronic music, ambient sounds, and indie rock is less prevalent in gw666's listening history.

**Conclusion:**

BeSpont and SavageCore appear to be the most musically compatible due to their shared eclecticism, appreciation for "underground" music, and shared interest in exploring different moods and experiences through music. gw666, while sharing a love for metal, has a slightly less diverse and overlapping taste profile compared to SavageCore. 


# Listen to a Song and Recommend a User

In [64]:
from IPython.display import Markdown, display

file_path = "/kaggle/input/paphos-mp3/paphos.mp3"

myfile = genai.upload_file(file_path)

model = genai.GenerativeModel("gemini-1.5-flash")

prompt = f"""
Please review three user music profiles:
- SavageCore: {savage_core_profile}
- gw666: {gw_profile}
- BeSpont: {bespont_profile}

Task:
1. Listen to this uploaded song and identify which user is most likely to enjoy it.
2. Explain why this user would enjoy the song based on their profile preferences.
"""

response = model.generate_content([myfile, prompt])

display (Markdown(response.text))

The song uploaded is a heavy, atmospheric piece of music with elements of doom metal and post-metal.  Based on this, the user most likely to enjoy it is **BeSpont**.

Here's why:

BeSpont's profile clearly demonstrates a strong preference for heavy and progressive rock, specifically mentioning bands like YOB, Electric Wizard, Orange Goblin, Sleep, and Dozer – all of which share stylistic similarities with the uploaded track.  The song's heavy, distorted guitars, slow tempo, and atmospheric textures align perfectly with BeSpont's stated appreciation for "darker themes, introspective lyrics, and powerful instrumentals."  Furthermore, the song's atmospheric and somewhat melancholic nature fits within BeSpont's expressed liking of post-metal and atmospheric rock acts like God Is An Astronaut, Khemmis, and Windhand.  The song's overall "underground" feel also fits BeSpont's apparent interest in less mainstream music.

While gw666 also enjoys heavy music, their profile focuses more broadly on various metal subgenres, and their taste also extends to alternative and post-rock,  which don't entirely capture the specific heaviness and atmospheric doom elements of the uploaded song.  SavageCore, with their eclectic and experimental tastes, might appreciate *some* aspects of the song, but the song's primarily heavy and atmospheric nature would likely resonate less strongly with their preference for diverse electronic music and less overtly "heavy" sounds.


# Personalization With Multiple Users

When I introduce new users, its easy to see how impactful this level of personalization can be. 

I'm able to look at which users have more compatible music tastes, that could be used to facilitate real world interactions - such as attending a concert together, or create a group with shared musical interests. 

There's also the question of dating versus friendship. Music taste can be a fun icebreaker for an initial conversation or coming up with first date ideas, online dating profiles often include music and genres a person listens to but not profile matching based on listening histories and music compatibility that could be an intersting path of further exploration. 

I'm also able to ask Gemini to listen to a [song](https://www.premiumbeat.com/royalty-free-music-genre/dance-techno
) I've uploaded, describe the song and then from the musical profiles pick out a user who is likely to be most interested in that song. This is a audio recommendation system, where new songs we're likely to enjoy can be found for us.

Music is a great way to connect with other people and Gemini can help to facilitate more of those interactions, who knows who you might meet and the memories you could create!


# Conclusion & Further Research

**Conclusion**

To conclude, this project explored how Gemini's long context window and caching, can enable hyper personalized user experiences, through music analytics, recommendations and user matching.

I was able to introduce the listening history of three users and generate unique musical profiles for each. I was able to create an opportunity to bring two users together based on music compatibility that could facilitate real world music experiences.

This project is uniquely suited to a large context window and caching, as it allows us to bring in a significant period of listening history as context to then provide hyper personalized recommendations and suggestions. 

No project is perfect and there were limitations and areas that would be great to explore further. 

**Limitations**

The dataset only contained song names, artist names and when it was listened to. There are a number of other data points that could improve Gemini's understanding of how our user listens to music, such as: how long the song was listened to, where it was listened to as a few examples. 

I was only able to access Flash consistently for this project, which limited to token window to 1 million tokens. As the largest listening history was around 700K tokens, this did not impact the project to much and I was able to take advantage of caching, clearing caches and saving to variables to utilize multiple user's listening histories.

Newer features proved problematic, such as grounded responses with Google and image generation through Imagen. An example where I would have used grounded responses is once my musical profiles were matched, searching for an upcoming relevant concert for them to attend and accurate pricing using Google search data. Image generation is less problematic, as it could be used for an interactive game, generating an image based on a musical profile, but not as valuable a use case as facilitating a real world experince through grounding. 

A lack of grounding can impact the quality of outputs. Occassionally prompts have to be re-run to get a perfect output for every prompt. I spent time crafting my prompts and I could have used multi shot prompting but it requires at least 100K tokens to see performance improvements and I wanted to utilize listening histories, whilst keeping in 1 million token window. 

**Further Research**

I primarily used text based prompting but communication and music is multimodal. I started exploring audio and it returned interesting results but there's a lot of opportunity to explore image, video and audio when it comes to understanding music. 

Structured outputs is an interseting topic here because I can generate recommendations or songs in a structured format that could communicate GET a song from an API. Looking at Soundcloud, Spotify, Apple Music and other popular music providers API documentation could reveal opportunities to utilize outputs automatically e.g. our Synthwave playlist we created based on our listening history with a list of songs and artists. 

Armed with music profiles, it would be interesting to learn more about whether music is truly a great way to find and meet new friends or romantic partners, by using our unique music personalities to match with other users and create real world interactions. 

Whether the hyper personalized recommendations help marketing teams to serve relevant advertisements and do those tailored advertisments convert more frequently, upcoming concerts, albums, songs that our user should be interested in. Linking outputs to concrete outcomes requires more exploration. 