### __Utilizing Chartmetric API & Foundational Data (Max Vo):__ https://api.chartmetric.com/apidoc/

Alternatives:

https://www.viberate.com/  
https://app.soundcharts.com/  
https://spotipy.readthedocs.io/en/2.24.0/  
https://docs.songstats.com/  

#### Utilize a Cache

In [2]:
import requests
import requests_cache
from sys import exit

import lxml.html as lx
import re

__https://api.chartmetric.com/apidoc/#api-Authorization-GetAccessToken__

#### Example Code for Token Generation in Private Script

In [None]:
HOST = 'https://api.chartmetric.com'
REFRESH_TOKEN = '...'  # replace with the refresh token

def get_access_token():
    '''
    Fetch an access token using the refresh token.
    '''
    res = requests.post(f'{HOST}/api/token', json={"refreshtoken": REFRESH_TOKEN})
    if res.status_code == 200:
        return res.json().get('token')
    else:
        print(f"ERROR: Failed to fetch access token. Status code: {res.status_code}")
        print("Response:", res.text)  # raw body; an error response may not be valid JSON
        exit(1)
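
A long-running script does not need a new token for every request. Below is a minimal sketch of reusing one token until shortly before it expires. The 3600-second lifetime is an assumption (check the `expires_in` field of the token response), and `TokenCache`, `lifetime`, `margin`, and `clock` are hypothetical names; `fetch` stands in for `get_access_token`.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse an access token until shortly before its assumed expiry."""

    def __init__(self, fetch: Callable[[], str], lifetime: float = 3600.0,
                 margin: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch        # e.g. get_access_token
        self._lifetime = lifetime  # ASSUMED token lifetime in seconds
        self._margin = margin      # refresh this many seconds early
        self._clock = clock        # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime - self._margin
        return self._token
```

Usage would look like `tokens = TokenCache(get_access_token)` followed by `tokens.get()` wherever a token is needed.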

#### Accessing Hidden Keys + Pipeline

In [3]:
with open('config.txt', 'r') as f:
    access_token = f.read().strip()  # drop a trailing newline so the Authorization header stays valid
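
A possible variant, sketched under assumptions: read the secret from an environment variable first and fall back to the file. `load_secret` and the variable name `CHARTMETRIC_TOKEN` are hypothetical and not part of the Chartmetric API.

```python
import os
from pathlib import Path


def load_secret(path: str = "config.txt", env_var: str = "CHARTMETRIC_TOKEN") -> str:
    """Return the secret from env_var if it is set, otherwise from the file at path."""
    value = os.environ.get(env_var)
    if not value:
        value = Path(path).read_text(encoding="utf-8")
    value = value.strip()  # remove surrounding whitespace such as a trailing newline
    if not value:
        raise ValueError(f"No secret found in ${env_var} or {path}")
    return value
```

This keeps the key out of the notebook itself; `config.txt` should still be excluded from version control.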

In [4]:
HOST = 'https://api.chartmetric.com'
session = requests_cache.CachedSession('MUSIC_TRENDS_VISUALIZATION') # session object

# Function to make authorized GET requests (adapted from the API documentation)
def Get(uri, access_token):
    res = session.get(f'{HOST}{uri}', headers={'Authorization': f'Bearer {access_token}'})
    if res.status_code != 200:
        print(f"ERROR: Failed GET request. Status code: {res.status_code}")
        print("Response:", res.text)  # raw body; an error response may not be valid JSON
        exit(1)
    return res.json()

#### ENDPOINTS: https://api.chartmetric.com/apidoc/

#### Possibilities:  
__Get the list of cities for a particular country, for use in city parameters.__  
https://api.chartmetric.com/api/charts/shazam/:country_code/cities

__Charts - Spotify (Artists)__  
Get data insights for artists on Spotify Charts, including historical chart-position trends and listener counts.
Given the appropriate filters, get data specific to an interval and chart type (e.g. Monthly Popularity).
This endpoint is useful for identifying trending artists on various Spotify Charts.
This data is cached daily.

https://api.chartmetric.com/api/charts/spotify/artists


__Get top 100 artists with maximum count for source in the city__  
https://api.chartmetric.com/api/city/:id/:source/top-artists

__Get top 100 tracks with maximum count for source in the city__  
https://api.chartmetric.com/api/city/:id/:source/top-tracks

__Spotify data for top 50 cities available prior to Aug 12, 2024__  
https://api.chartmetric.com/api/artist/:id/where-people-listen

__Search Functionalities__  
https://api.chartmetric.com/apidoc/#api-Search-SearchCity
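
The endpoint paths above contain `:name` placeholders (`:id`, `:source`, `:country_code`). A minimal sketch of filling them in before passing the path to `Get`; `build_uri` is a hypothetical helper, and it expects the path only, without the host.

```python
import re
from urllib.parse import quote, urlencode


def build_uri(template, query=None, **path_params):
    """Replace ':name' placeholders in template and append an optional query string."""
    def substitute(match):
        name = match.group(1)
        if name not in path_params:
            raise KeyError(f"Missing value for path parameter ':{name}'")
        # Percent-encode the value so it is safe inside a URL path segment
        return quote(str(path_params[name]), safe="")

    uri = re.sub(r":([A-Za-z_]+)", substitute, template)
    if query:
        uri += "?" + urlencode(query)
    return uri
```

For example, `build_uri('/api/city/:id/:source/top-artists', id=7060, source='spotify')` yields the path used for New York later in this notebook.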

#### Using https://api.chartmetric.com/api/cities Endpoint

In [13]:
cityIDs = Get('/api/cities?country_code=US', access_token)

In [14]:
cityIDs['obj'][1]

{'city_id': 6351,
 'city_name': 'Grand Island',
 'latitude': 40.9214,
 'longitude': -98.3584,
 'population': 53424,
 'province': 'Nebraska',
 'locality': 'Grand Island',
 'country': 'United States of America',
 'image_url': None,
 'iso3': 'USA'}

In [15]:
# Sort by population, descending (used below to extract city IDs)
sorted_cities = sorted(cityIDs['obj'], key=lambda x: x['population'], reverse=True)
# Keep the 50 most populous cities
top_50_cities = sorted_cities[:50]

# Create a list of dicts containing only city_name, city_id, and population
top_50_cities_filtered = [
    {'city_name': city['city_name'], 'city_id': city['city_id'], 'population': city['population']}
    for city in top_50_cities
]
top_50_cities_filtered

[{'city_name': 'New York', 'city_id': 7060, 'population': 19354922},
 {'city_name': 'Los Angeles', 'city_id': 7058, 'population': 12815475},
 {'city_name': 'Chicago', 'city_id': 7057, 'population': 8675982},
 {'city_name': 'Miami', 'city_id': 7055, 'population': 6381966},
 {'city_name': 'Dallas', 'city_id': 7046, 'population': 5733259},
 {'city_name': 'Philadelphia', 'city_id': 7049, 'population': 5637884},
 {'city_name': 'Houston', 'city_id': 7054, 'population': 5446468},
 {'city_name': 'Washington D.C.', 'city_id': 16028, 'population': 5289420},
 {'city_name': 'Atlanta', 'city_id': 7056, 'population': 5228750},
 {'city_name': 'Boston', 'city_id': 7047, 'population': 4637537},
 {'city_name': 'Phoenix', 'city_id': 7042, 'population': 4081849},
 {'city_name': 'Seattle', 'city_id': 7041, 'population': 3643765},
 {'city_name': 'San Francisco', 'city_id': 7052, 'population': 3603761},
 {'city_name': 'Detroit', 'city_id': 7050, 'population': 3522206},
 {'city_name': 'San Diego', 'city_id': 
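
The sort above raises a `TypeError` if any record has `population` set to `None`. Whether the API ever returns such records is an assumption; the sketch below simply treats a missing population as 0. `top_n_by_population` is a hypothetical helper name.

```python
import heapq


def top_n_by_population(cities, n=50):
    """Return the n most populous city records, treating a missing population as 0."""
    return heapq.nlargest(n, cities, key=lambda c: c.get("population") or 0)
```

`heapq.nlargest` returns the records in descending order, so the result can replace `top_50_cities` directly.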

In [None]:
# For use in CityDemographics
import pickle

top_50_cities_provinces = [
    {'city_name': city['city_name'], 'state': city['province'], 'population': city['population']}
    for city in top_50_cities
]

with open('top_50_cities_provinces.pkl', 'wb') as file: # moved to Data Folder
    pickle.dump(top_50_cities_provinces, file)

top_50_cities_provinces[1]  # Example info for Los Angeles (last line, so the cell displays it)

In [None]:
# Also want latitude and longitude pairs for visualization purposes
import pickle

top_50_cities_location = [
    {'city_name': city['city_name'], 'state': city['province'], 'latitude': city['latitude'], 'longitude': city['longitude']}
    for city in top_50_cities
]
with open('top_50_cities_location.pkl', 'wb') as file: # moved to Data Folder
    pickle.dump(top_50_cities_location, file)
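
A sketch of how another notebook might read these files back. Because the records are plain dicts of strings and numbers, JSON works as well as pickle. `save_records` and `load_records` are hypothetical helpers that pick the format from the file suffix.

```python
import json
import pickle
from pathlib import Path


def save_records(records, path):
    """Write a list of plain dicts as pickle (.pkl suffix) or JSON (any other suffix)."""
    path = Path(path)
    if path.suffix == ".pkl":
        with path.open("wb") as fh:
            pickle.dump(records, fh)
    else:
        with path.open("w", encoding="utf-8") as fh:
            json.dump(records, fh, indent=2)


def load_records(path):
    """Read back a file written by save_records."""
    path = Path(path)
    if path.suffix == ".pkl":
        with path.open("rb") as fh:
            return pickle.load(fh)  # only unpickle files you created yourself
    with path.open("r", encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `load_records('top_50_cities_location.pkl')` returns the list written above, provided the path points to wherever the file was moved.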

#### Using https://api.chartmetric.com/api/city/:id/:source/top-artists Endpoint

In [14]:
# Foundation: Get(f'/api/city/{city_id}/spotify/top-artists', access_token)
import time

def fetch_top_artists(top_50_cities_filtered, access_token):
    """Fetches the top artists for each city and retains the city information."""
    city_artist_data = []

    for city in top_50_cities_filtered:
        city_id = city['city_id']
        city_name = city['city_name']
        
        # Call the API for the current city's top Spotify artists
        uri = f'/api/city/{city_id}/spotify/top-artists'
        try:
            response_data = Get(uri, access_token)
            city_artist_data.append({
                'city_name': city_name,
                'city_id': city_id,
                'population': city['population'],
                'top_artists': response_data  # Include the API's response for top artists
            })
        except (Exception, SystemExit) as e:  # Get() calls exit(1) on failure, which raises SystemExit
            print(f"Failed to fetch data for {city_name} (ID: {city_id}): {e}")
            
        time.sleep(2) # Sleep for 2 seconds to avoid rate limiting
    
    return city_artist_data
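
A variant of the loop above, sketched with the request function passed in as a parameter so it can be tested without network access. It assumes `fetch` raises an ordinary exception on failure; a function that calls `sys.exit` raises `SystemExit`, which `except Exception` does not catch. `fetch_per_city`, `fetch`, `delay`, and `sleep` are hypothetical names.

```python
import time


def fetch_per_city(cities, fetch, delay=2.0, sleep=time.sleep):
    """Call fetch(city) for every city and return (results, failures)."""
    results, failures = [], []
    for i, city in enumerate(cities):
        if i:  # pause between requests, but not before the first one
            sleep(delay)
        try:
            payload = fetch(city)
        except Exception as exc:
            # Record the failure and move on to the next city
            failures.append((city["city_name"], str(exc)))
            continue
        results.append({**city, "top_artists": payload})
    return results, failures
```

Returning the failures separately makes it easy to retry only the cities that did not come back.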

In [15]:
fetch_city_artist_data = fetch_top_artists(top_50_cities_filtered, access_token)

In [65]:
# Example Path for the Data of the top Artist of NY
fetch_city_artist_data[0]['top_artists']['obj'][0]

{'count': 993992,
 'id': 3380,
 'name': 'Drake',
 'image_url': 'https://i.scdn.co/image/ab676161000051744293385d324db8558179afd9',
 'isni': '000000012032246X',
 'code2': 'CA',
 'hometown_city': 'Toronto',
 'verified': True,
 'current_city': None,
 'sp_followers': 93985366,
 'sp_popularity': 96,
 'sp_monthly_listeners': 73378027,
 'deezer_fans': 23285258,
 'cm_artist_rank': 10,
 'cm_artist_score': 474567,
 'spotify_artist_ids': ['0bxzG5POkPAzYmEhoNfgtO',
  '2Iqn8dbh4BvojpUyzWYnHg',
  '3TVXtAsR1Inumwj472S9r4',
  '7yy0l7fZhBHr7wyfd01gRl',
  '65Hl58L46lxt5QBI4AGsGW',
  '2pTaxwkFpeMa0hF114k5pa',
  '7GstH8EF1DS4SjGtNyUANW'],
 'itunes_artist_ids': [1396256855, 1227259884, 1603482457, 332601109, 271256],
 'deezer_artist_ids': ['5215664', '246791', '5113766'],
 'amazon_artist_ids': ['B000QJRIHS',
  'B001RDCRPA',
  'B008LALN98',
  'B0025NKBDQ',
  'B07MXLSWXN',
  'B008L6SPES',
  'B001E72M5Y',
  'B003FMPP32',
  'B07MXLC7Q3',
  'B07MXL39XS'],
 'tags': ['hip-hop/rap', 'pop', 'r&b/soul', 'pop rap'],


In [16]:
def transform_city_artist_data(city_artist_data):
    """Return copies of the city records with top_artists reduced to name and Spotify followers."""
    # Build new dicts instead of mutating the input, so this cell can be re-run safely
    return [
        {
            **city,
            'top_artists': [
                {'name': artist['name'], 'sp_followers': artist['sp_followers']}
                for artist in city['top_artists']['obj']
            ],
        }
        for city in city_artist_data
    ]

# fetch_city_artist_data comes from the fetch_top_artists call above
transformed_city_artist_data = transform_city_artist_data(fetch_city_artist_data)

In [17]:
def keep_top_10_artists(city_artist_data):
    """Return copies of the city records with only the first 10 artists kept."""
    return [{**city, 'top_artists': city['top_artists'][:10]} for city in city_artist_data]

transformed_city_top10artist_data = keep_top_10_artists(transformed_city_artist_data)

In [18]:
# ex data for NY
transformed_city_top10artist_data[0]

{'city_name': 'New York',
 'city_id': 7060,
 'population': 19354922,
 'top_artists': [{'name': 'Drake', 'sp_followers': 93985366},
  {'name': 'Kendrick Lamar', 'sp_followers': 34970428},
  {'name': 'Kanye West', 'sp_followers': 28273751},
  {'name': 'Future', 'sp_followers': 19503054},
  {'name': 'Tyler, The Creator', 'sp_followers': 18178276},
  {'name': 'Lil Wayne', 'sp_followers': 15755661},
  {'name': 'Metro Boomin', 'sp_followers': 9819503},
  {'name': '21 Savage', 'sp_followers': 20875641},
  {'name': 'Playboi Carti', 'sp_followers': 12452046},
  {'name': 'Chappell Roan', 'sp_followers': 4157919}]}

In [70]:
import json

# Output file name
text_file = 'transformed_city_top10artist_data.txt'

# Write in JSON format
with open(text_file, 'w', encoding='utf-8') as file:
    json.dump(transformed_city_top10artist_data, file, indent=4)

In [19]:
from collections import defaultdict  # dict that creates missing keys with a default value

# Maps each artist's name to the number of cities in whose top 10 the artist appears
artist_count = defaultdict(int)

# Iterate through transformed_city_top10artist_data to count the number of cities in which each artist appears
for city in transformed_city_top10artist_data:
    for artist in city['top_artists']:
        artist_count[artist['name']] += 1

# Sort in Descending Order
sorted_artists = sorted(artist_count.items(), key=lambda x: x[1], reverse=True)
# Get the top 10 artists
top_10_artists = [artist[0] for artist in sorted_artists[:10]]
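
The same count can be sketched with `collections.Counter`, whose `most_common` does the sorting and slicing in one call. `most_widespread_artists` is a hypothetical helper name. Unlike the loop above, it counts an artist at most once per city even if the name is repeated in that city's list.

```python
from collections import Counter


def most_widespread_artists(city_data, n=10):
    """Return the n artist names that appear in the most cities' top lists."""
    counts = Counter(
        name
        for city in city_data
        # dict.fromkeys removes duplicate names within a city while keeping order
        for name in dict.fromkeys(artist["name"] for artist in city["top_artists"])
    )
    return [name for name, _ in counts.most_common(n)]
```

Artists with equal counts are returned in the order they were first encountered.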

In [None]:
# https://docs.python.org/3/library/pickle.html
# accessed in Genre_By_State&Selenium.ipynb

import pickle

with open('top_10_artists.pkl', 'wb') as f: # moved to Data Folder
    pickle.dump(top_10_artists, f)

#### DEFUNCT CODE

In [None]:
# access list from 50cities.ipynb

import nbformat
from IPython import get_ipython

# Load the notebook
with open('50cities.ipynb') as f:
    nb = nbformat.read(f, as_version=4)

# Execute the notebook
ip = get_ipython()
for cell in nb.cells:
    if cell.cell_type == 'code':
        ip.run_cell(cell.source)

top50 = citylistfinal # list of top 50 cities by population according to Wikipedia

In [None]:
city_matches = []

# Normalize city names to lowercase for case-insensitive matching
for city_name in top50:
    match = next(
        (city for city in cityIDs['obj'] if city['city_name'].lower() == city_name.lower()), 
        None
    )
    if match:
        city_matches.append({'city_name': match['city_name'], 'city_id': match['city_id']})
        
city_matches