# <center>📦🧹🔍ETL 🔍🧹📦</center>

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#What-does-ETL-stand-for?" data-toc-modified-id="What-does-ETL-stand-for?-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>What does ETL stand for?</a></span></li><li><span><a href="#What-are-we-going-to-do?" data-toc-modified-id="What-are-we-going-to-do?-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>What are we going to do?</a></span></li><li><span><a href="#We-import-libraries" data-toc-modified-id="We-import-libraries-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>We import libraries</a></span></li><li><span><a href="#Mission-1:-Obtain-the-Spotify-token-to-use-its-API-🗝" data-toc-modified-id="Mission-1:-Obtain-the-Spotify-token-to-use-its-API-🗝-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Mission 1: Obtain the Spotify token to use its API 🗝</a></span></li><li><span><a href="#Spotify-token" data-toc-modified-id="Spotify-token-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Spotify token</a></span></li><li><span><a href="#We-save-the-token-in-our-.env" data-toc-modified-id="We-save-the-token-in-our-.env-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>We save the token in our .env</a></span></li><li><span><a href="#📚-Recap-so-far:" data-toc-modified-id="📚-Recap-so-far:-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>📚 Recap so far:</a></span></li><li><span><a href="#Mission-2:-Get-the-list-of-songs-and-artists-from-the-Spotify-API-🔥" data-toc-modified-id="Mission-2:-Get-the-list-of-songs-and-artists-from-the-Spotify-API-🔥-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Mission 2: Get the list of songs and artists from the Spotify API 🔥</a></span><ul class="toc-item"><li><span><a href="#Spotify-API-call" data-toc-modified-id="Spotify-API-call-8.1"><span class="toc-item-num">8.1&nbsp;&nbsp;</span>Spotify API call</a></span><ul class="toc-item"><li><span><a href="#We-need-token-and-headers" data-toc-modified-id="We-need-token-and-headers-8.1.1"><span 
class="toc-item-num">8.1.1&nbsp;&nbsp;</span>We need token and headers</a></span></li><li><span><a href="#Url-+-endpoint" data-toc-modified-id="Url-+-endpoint-8.1.2"><span class="toc-item-num">8.1.2&nbsp;&nbsp;</span>Url + endpoint</a></span></li><li><span><a href="#We-make-the-request" data-toc-modified-id="We-make-the-request-8.1.3"><span class="toc-item-num">8.1.3&nbsp;&nbsp;</span>We make the request</a></span></li><li><span><a href="#Function-to-extract-info-from-dictionaries" data-toc-modified-id="Function-to-extract-info-from-dictionaries-8.1.4"><span class="toc-item-num">8.1.4&nbsp;&nbsp;</span>Function to extract info from dictionaries</a></span></li></ul></li><li><span><a href="#We're-taking-it-to-Pandas!-🐼" data-toc-modified-id="We're-taking-it-to-Pandas!-🐼-8.2"><span class="toc-item-num">8.2&nbsp;&nbsp;</span>We're taking it to Pandas! 🐼</a></span></li></ul></li><li><span><a href="#Mission-3:-Get-song-lyrics" data-toc-modified-id="Mission-3:-Get-song-lyrics-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Mission 3: Get song lyrics</a></span><ul class="toc-item"><li><span><a href="#store-it-in-a-variable" data-toc-modified-id="store-it-in-a-variable-9.1"><span class="toc-item-num">9.1&nbsp;&nbsp;</span>store it in a variable</a></span></li><li><span><a href="#We-are-looking-for-some-lyrics" data-toc-modified-id="We-are-looking-for-some-lyrics-9.2"><span class="toc-item-num">9.2&nbsp;&nbsp;</span>We are looking for some lyrics</a></span></li><li><span><a href="#Many-lyrics!" data-toc-modified-id="Many-lyrics!-9.3"><span class="toc-item-num">9.3&nbsp;&nbsp;</span>Many lyrics!</a></span></li></ul></li></ul></div>

![data](https://media.giphy.com/media/xT9C25UNTwfZuk85WP/giphy.gif)

# Process to follow across the 3 Jupyter notebooks
**Jupyter ETL I - Extraction and transformation**

* I get the Spotify token
* I make requests to the API:
    - Be careful with playlists that have more than 100 songs
    - Functions that automate the process
* I get the lyrics of the songs (enrichment) with another library

**Jupyter ETL II - Loading**

* I connect with the databases
* I create my schema in MySQL
* I clean the data a bit
* Check functions before putting my data in the database
* I load my data to MySQL
* I export json to load it in MongoDB

**Jupyter NLP**
* I get the data I want from the databases
* Tokenize and remove stop words
* I do Sentiment Analysis with the libraries
* I draw conclusions

## What does ETL stand for?
**Extract**, **Transform** and **Load** is the process that enables organizations to move data from multiple sources, reformat and cleanse it, and load it into another database, data mart, or data warehouse to analyze, or in another operational system to support a business process.
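
As a rough illustration of the three stages, here is a minimal sketch that uses only the standard library. The sample rows, the `songs` table and the in-memory SQLite database are assumptions made for this example; the notebooks themselves load into MySQL and MongoDB.

```python
import sqlite3

def extract():
    # Stand-in for an API call: raw records with messy values (sample data)
    return [
        {"name": "  As It Was ", "artist": "Harry Styles", "popularity": "88"},
        {"name": "Cerro Avila", "artist": "Ilan Chester", "popularity": "0"},
    ]

def transform(rows):
    # Trim whitespace and cast popularity to an integer
    return [
        (row["name"].strip(), row["artist"].strip(), int(row["popularity"]))
        for row in rows
    ]

def load(rows, connection):
    # Create the target table if needed and insert the cleaned rows
    connection.execute(
        "CREATE TABLE IF NOT EXISTS songs (name TEXT, artist TEXT, popularity INTEGER)"
    )
    connection.executemany("INSERT INTO songs VALUES (?, ?, ?)", rows)
    connection.commit()

connection = sqlite3.connect(":memory:")
load(transform(extract()), connection)
loaded = connection.execute(
    "SELECT name, popularity FROM songs ORDER BY popularity DESC"
).fetchall()
print(loaded)
```

Keeping each stage in its own function makes it easy to swap the source or the destination later without touching the rest.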

## What are we going to do?
- Learn to use the Spotify API with OAuth
- Extract data from that API
- Clean and transform the data
- Enrich it
- Save it in different databases
- Make queries to do sentiment analysis with NLP libraries

## We import libraries

In [1]:
import os

import json
import requests
import pyjsonviewer

from functools import reduce
import operator

import pandas as pd
import numpy as np

import lyricsgenius
from lyrics_extractor import SongLyrics

from dotenv import load_dotenv
load_dotenv()

True

https://open.spotify.com/playlist/6DaKU06d03UDpVjPZhF19X?si=4865e6ee629d488b&pt=0ae09e3cb50938ee8f2fc768c208c1c2

## Mission 1: Obtain the Spotify token to use its API 🗝

## Spotify token
https://developer.spotify.com/dashboard/login
We need the CLIENT_ID and CLIENT_SECRET credentials, and we store them in the `.env`.


1. Sign up as a developer
2. Create an app/project on Spotify
3. Get client_id & client_secret

To call the Spotify API and get the token that will allow us to make requests, we need to save a few variables.
Read the [Documentation](https://developer.spotify.com/documentation/general/guides/authorization-guide/)

In [None]:
# 1. Auth for app
    # CLIENT_ID
    # CLIENT_SECRET

# 2. Auth for queries    
    # token

In [13]:
# demo-scraping: send

# result: what I get

In [14]:
body_params = {"grant_type":"client_credentials"}

url = "https://accounts.spotify.com/api/token"

In [15]:
CLIENT_ID, CLIENT_SECRET = os.getenv("CLIENT_ID"), os.getenv("CLIENT_SECRET")  # read from the .env
response = requests.post(url, data=body_params, auth=(CLIENT_ID,CLIENT_SECRET))

In [16]:
token = response.json()["access_token"]

In [None]:
#access_token
#authorization: bearer
#expiration:3600

We are going to wrap all this in a function: as we have seen, the token expires, so we want to be able to request a new one whenever we need it. This is what functions are for!
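
One way to act on the expiry is to cache the token together with its expiration time and only request a new one when it is about to lapse. The sketch below assumes a hypothetical `fetch` callable that returns a dict shaped like the token response (`access_token`, `expires_in`); the `TokenCache` class and the 60-second safety margin are illustrative choices, not part of the Spotify API.

```python
import time

class TokenCache:
    """Caches a token and refreshes it shortly before it expires."""

    def __init__(self, fetch, margin=60, clock=time.monotonic):
        self._fetch = fetch      # callable returning {"access_token": ..., "expires_in": ...}
        self._margin = margin    # seconds of safety margin before expiry
        self._clock = clock      # injectable clock, handy for testing
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch()
            self._token = payload["access_token"]
            self._expires_at = now + payload["expires_in"] - self._margin
        return self._token


# Usage with a fake fetcher and a fake clock (no network involved)
calls = []

def fake_fetch():
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}

fake_now = [0.0]
cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()      # first call: fetches a token
second = cache.get()     # still valid: reuses the cached token
fake_now[0] = 3600.0     # jump past the expiry
third = cache.get()      # expired: fetches again
```

In the notebook, `fetch` could be a small function that posts to the token URL and returns `response.json()`.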

In [None]:
# I need to hide my credentials: .env

In [34]:
def spotifyToken ():
    
    """Requests a fresh token for a given app on Spotify
    returns: token as a string, or None if the request fails
    """

    # 1. Credentials for the app, read from the .env
    
    CLIENT_ID = os.getenv("CLIENT_ID")
    CLIENT_SECRET = os.getenv("CLIENT_SECRET")
    
    # 2. Request
    
    body_params = {"grant_type":"client_credentials"}
    url = "https://accounts.spotify.com/api/token"
    response = requests.post(url, data=body_params, auth=(CLIENT_ID,CLIENT_SECRET))
    
    try:
        token = response.json()["access_token"]
        return token

    except (KeyError, ValueError):
        # No "access_token" in the answer, or the body was not valid JSON
        print("The request did not go through: wrong credentials?")

In [37]:
token = spotifyToken ()

## We save the token in our .env

In [None]:
# 1. Create .env
# 2. Input the variables

In [None]:
import os
os.system(variable)  # `variable` is a string holding the shell command to run

In [27]:
def save_token (CLIENT_ID, CLIENT_SECRET):
    command = f"""echo "CLIENT_ID='{CLIENT_ID}'\nCLIENT_SECRET='{CLIENT_SECRET}'" > .env"""
    os.system(command)

We put this in a function so that we can reuse it.
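
The same file can be written without going through the shell, which avoids quoting problems if a value contains characters such as `"` or `$`. This is a sketch under that assumption; `write_env` is a hypothetical helper name, and it overwrites the target file just like the `>` redirection does.

```python
from pathlib import Path
import tempfile

def write_env(path, **variables):
    """Writes KEY='value' lines to the given file, replacing its contents."""
    lines = [f"{key}='{value}'" for key, value in variables.items()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")

# Usage with placeholder values in a temporary directory
with tempfile.TemporaryDirectory() as folder:
    env_path = Path(folder) / ".env"
    write_env(env_path, CLIENT_ID="my-id", CLIENT_SECRET="my-secret")
    content = env_path.read_text(encoding="utf-8")

print(content)
```

A value that itself contains a single quote would still need escaping, so this is not a general-purpose `.env` writer.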

## 📚 Recap so far:

**What we have done**
- 1. We get our Spotify credentials
- 2. With them, we get our Spotify `token` by issuing a `requests.post`
- 3. We save the credentials in the `.env` <br>
<br>

**To remember**

- **CLIENT_ID** & **CLIENT_SECRET** are just the means to get the `token`; the `token` is what authorizes each request.
- The `token` expires

## Mission 2: Get the list of songs and artists from the Spotify API 🔥
But let's not forget about the [documentation](https://developer.spotify.com/console/).
The first thing we are going to do is make an API call.
Let's remember the syntax:

`requests.get(url, headers=headers)`

### Spotify API call

#### We need token and headers

In [42]:
#token
headers = {"Authorization":f"Bearer {token}"}

We take the playlist ID from the share link that Spotify gives us (the link to the playlist)
https://open.spotify.com/playlist/6DaKU06d03UDpVjPZhF19X?si=838e81628a2e4a23

In [50]:
url_base = "https://api.spotify.com/v1/playlists/"

The ID of the list from which we want to make the request

In [46]:
playlist_link = "https://open.spotify.com/playlist/37i9dQZF1DWTf6wqAsYxqz?si=df69ba9e488249b5"

In [49]:
playlist_id = playlist_link.split("/")[-1].split("?")[0]
playlist_id

'37i9dQZF1DWTf6wqAsYxqz'
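
Splitting on `/` and `?` works for share links like the one above. A slightly more defensive variant, sketched here with the standard library, parses the URL and checks that the path really points to a playlist. The helper name `playlist_id_from_link` is made up for this example.

```python
from urllib.parse import urlparse

def playlist_id_from_link(link):
    """Returns the playlist ID from an open.spotify.com playlist link."""
    parts = [part for part in urlparse(link).path.split("/") if part]
    # The ID is the path segment right after "playlist"
    if "playlist" not in parts or parts.index("playlist") + 1 >= len(parts):
        raise ValueError(f"Not a playlist link: {link}")
    return parts[parts.index("playlist") + 1]

link = "https://open.spotify.com/playlist/37i9dQZF1DWTf6wqAsYxqz?si=df69ba9e488249b5"
playlist_id = playlist_id_from_link(link)
print(playlist_id)
```

Because the query string is parsed separately from the path, a trailing slash or extra parameters do not affect the result.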

In [102]:
playlist_link_2 = "https://open.spotify.com/playlist/50v0ITLYuxui3QRwd0CNTW?si=3176623842434386"
playlist_id_2 = "50v0ITLYuxui3QRwd0CNTW"
query_2 = url_base + playlist_id_2
response_2 = requests.get(query_2, headers=headers).json()
response_2

{'collaborative': True,
 'description': '',
 'external_urls': {'spotify': 'https://open.spotify.com/playlist/50v0ITLYuxui3QRwd0CNTW'},
 'followers': {'href': None, 'total': 5},
 'href': 'https://api.spotify.com/v1/playlists/50v0ITLYuxui3QRwd0CNTW',
 'id': '50v0ITLYuxui3QRwd0CNTW',
 'images': [{'height': 640,
   'url': 'https://mosaic.scdn.co/640/ab67616d0000b273379540c86420ea590936e25eab67616d0000b2737d57ac4e84d2c826094f9890ab67616d0000b273a00ebf57f89f4202d299976bab67616d0000b273cc4fe4fda4dd75b123f64b20',
   'width': 640},
  {'height': 300,
   'url': 'https://mosaic.scdn.co/300/ab67616d0000b273379540c86420ea590936e25eab67616d0000b2737d57ac4e84d2c826094f9890ab67616d0000b273a00ebf57f89f4202d299976bab67616d0000b273cc4fe4fda4dd75b123f64b20',
   'width': 300},
  {'height': 60,
   'url': 'https://mosaic.scdn.co/60/ab67616d0000b273379540c86420ea590936e25eab67616d0000b2737d57ac4e84d2c826094f9890ab67616d0000b273a00ebf57f89f4202d299976bab67616d0000b273cc4fe4fda4dd75b123f64b20',
   'width': 60}],

In [116]:
response_2["tracks"]["items"][0]["added_by"]["id"]

'soyarepita'

#### Url + endpoint

In [51]:
query = url_base + playlist_id

In [52]:
query

'https://api.spotify.com/v1/playlists/37i9dQZF1DWTf6wqAsYxqz'

#### We make the request

In [53]:
response = requests.get(query, headers=headers).json()

In [None]:
response

In [96]:
response.keys()

dict_keys(['collaborative', 'description', 'external_urls', 'followers', 'href', 'id', 'images', 'name', 'owner', 'primary_color', 'public', 'snapshot_id', 'tracks', 'type', 'uri'])

In [218]:
def request_playlist (link):
    token = spotifyToken ()
    headers = {"Authorization":f"Bearer {token}"}
    
    url_base = "https://api.spotify.com/v1/playlists/"
    playlist_id = link.split("/")[-1].split("?")[0]
    
    query = url_base + playlist_id
        
    try:    
        return requests.get(query, headers=headers).json()
    except (requests.exceptions.RequestException, ValueError):
        # Retry once with a fresh token; the headers must be rebuilt to use it
        token = spotifyToken ()
        headers = {"Authorization":f"Bearer {token}"}
        return requests.get(query, headers=headers).json()
    


In [161]:
response.keys()

dict_keys(['collaborative', 'description', 'external_urls', 'followers', 'href', 'id', 'images', 'name', 'owner', 'primary_color', 'public', 'snapshot_id', 'tracks', 'type', 'uri'])
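
The `tracks` key holds one page of items rather than the whole playlist, which is why playlists with more than 100 songs need extra care. The sketch below assumes the page is a dict with an `items` list and a `next` URL that is `None` on the last page, which is how I understand Spotify's paging objects; `get_json` is a hypothetical callable standing in for `requests.get(url, headers=headers).json()`.

```python
def collect_all_items(first_page, get_json):
    """Follows `next` links until every item of a paged response is collected."""
    items = list(first_page["items"])
    next_url = first_page.get("next")
    while next_url:
        page = get_json(next_url)
        items.extend(page["items"])
        next_url = page.get("next")
    return items

# Usage with fake pages instead of real API calls
fake_pages = {
    "page-2": {"items": ["song 3", "song 4"], "next": "page-3"},
    "page-3": {"items": ["song 5"], "next": None},
}
first_page = {"items": ["song 1", "song 2"], "next": "page-2"}
all_items = collect_all_items(first_page, fake_pages.__getitem__)
print(all_items)
```

With the real response, the first page would be `response["tracks"]` and `get_json` a small function that performs the authenticated request.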

In [162]:
name = response["tracks"]["items"][0]["track"]["name"]
name

'Quevedo: Bzrp Music Sessions, Vol. 52'

In [163]:
artist = response["tracks"]["items"][0]["track"]["artists"][0]["name"]
artist

'Bizarrap'

In [164]:
album_name = response["tracks"]["items"][0]["track"]["album"]["name"]
album_name

'Quevedo: Bzrp Music Sessions, Vol. 52'

In [74]:
popularity = response["tracks"]["items"][0]["track"]["popularity"]
popularity

53

In [76]:
dict_ = {
    "name":name,
    "artist": artist,
    "album_name": album_name,
    "popularity": popularity
}

In [185]:
def info_for_one_song (one_song):
    
    name = one_song["track"]["name"]
    artist = one_song["track"]["artists"][0]["name"]
    album_name = one_song["track"]["album"]["name"]
    popularity = one_song["track"]["popularity"]
    user = one_song["added_by"]["id"]
    
    dict_ = {
        "name": name,
        "artist": artist,
        "album_name": album_name,
        "popularity": popularity,
        "user": user
    }
    
    return dict_
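
Real playlists are not always this tidy: as far as I know, an item can come back with `track` set to `None` (for example when a song is no longer available), and then the function above raises a `TypeError`. A defensive variant, sketched under that assumption, could skip such items. The name `safe_info_for_one_song` is hypothetical.

```python
def safe_info_for_one_song(one_song):
    """Like info_for_one_song, but returns None when the item has no usable track."""
    track = one_song.get("track")
    if not track:
        return None
    artists = track.get("artists") or [{}]
    return {
        "name": track.get("name"),
        "artist": artists[0].get("name"),
        "album_name": (track.get("album") or {}).get("name"),
        "popularity": track.get("popularity"),
        "user": (one_song.get("added_by") or {}).get("id"),
    }

# Usage with two hand-made items: one complete, one without a track
items = [
    {
        "track": {
            "name": "As It Was",
            "artists": [{"name": "Harry Styles"}],
            "album": {"name": "Harry's House"},
            "popularity": 88,
        },
        "added_by": {"id": "ferqwertyuiop"},
    },
    {"track": None, "added_by": {"id": "soyarepita"}},
]
songs = [info for info in map(safe_info_for_one_song, items) if info is not None]
print(songs)
```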

In [129]:
# Tom: we could pick any song at random

import random
maximum = len(response["tracks"]["items"])
value = random.randrange(maximum)  # randint(0, maximum) could return an out-of-range index
info_for_one_song (response["tracks"]["items"][value])

{'name': 'Take Your Time (feat. Damien Jurado & Suli Breaks)',
 'artist': 'Faithless',
 'album_name': 'All Blessed',
 'popularity': 39,
 'user': 'lewtrah'}

In [130]:
info_for_one_song (response["tracks"]["items"][3])

{'name': 'As It Was',
 'artist': 'Harry Styles',
 'album_name': "Harry's House",
 'popularity': 88,
 'user': 'ferqwertyuiop'}

In [131]:
new_list = []

for i in response_2["tracks"]["items"]:
    new_list.append(info_for_one_song(i))

new_list

[{'name': 'Canto a Caracas',
  'artist': "Billo's",
  'album_name': 'Canto a Caracas',
  'popularity': 0,
  'user': 'soyarepita'},
 {'name': 'Peces del Guaire',
  'artist': 'Desorden Público',
  'album_name': 'Descomposición',
  'popularity': 3,
  'user': 'soyarepita'},
 {'name': 'Cerro Avila',
  'artist': 'Ilan Chester',
  'album_name': 'El Comienzo en un Sotano de la Florida',
  'popularity': 0,
  'user': 'soyarepita'},
 {'name': 'Valle De Balas',
  'artist': 'Desorden Público',
  'album_name': 'Plomo Revienta',
  'popularity': 22,
  'user': 'marianamartinh'},
 {'name': 'Las Caraqueñas',
  'artist': 'Guaco',
  'album_name': 'Guaco Es Guaco',
  'popularity': 29,
  'user': 'soyarepita'},
 {'name': 'Epa Isidoro',
  'artist': "Billo's",
  'album_name': 'Canto a Caracas',
  'popularity': 0,
  'user': 'soyarepita'},
 {'name': 'Caracas Tiene Su Guaguanco',
  'artist': 'Justo Betancourt',
  'album_name': 'Pa Bravo Yo',
  'popularity': 0,
  'user': '1253681944'},
 {'name': 'Luna Caraqueña',


Popularity, according to a one-second Google search:
    
    Spotify's Popularity index is a 0-100 scale rating of how popular you are compared to every other artist on the platform. Some music marketing chatter suggests a PI of 20+ in the first few weeks will get you onto Release Radar and a PI of 30+ will get you onto Discover Weekly.

In [132]:
df = pd.DataFrame(new_list)
df

Unnamed: 0,name,artist,album_name,popularity,user
0,Canto a Caracas,Billo's,Canto a Caracas,0,soyarepita
1,Peces del Guaire,Desorden Público,Descomposición,3,soyarepita
2,Cerro Avila,Ilan Chester,El Comienzo en un Sotano de la Florida,0,soyarepita
3,Valle De Balas,Desorden Público,Plomo Revienta,22,marianamartinh
4,Las Caraqueñas,Guaco,Guaco Es Guaco,29,soyarepita
5,Epa Isidoro,Billo's,Canto a Caracas,0,soyarepita
6,Caracas Tiene Su Guaguanco,Justo Betancourt,Pa Bravo Yo,0,1253681944
7,Luna Caraqueña,Billo's,Canto a Caracas,0,soyarepita
8,Caminando por Caracas,Piero,Piero,0,soyarepita
9,Caracas de Noche - Original Mix,Javith,Caracas de Noche (Remixes),26,1253681944


In [177]:
# most mainstream?
df

Unnamed: 0,name,artist,album_name,popularity,user
0,Canto a Caracas,Billo's,Canto a Caracas,0,soyarepita
1,Peces del Guaire,Desorden Público,Descomposición,3,soyarepita
2,Cerro Avila,Ilan Chester,El Comienzo en un Sotano de la Florida,0,soyarepita
3,Valle De Balas,Desorden Público,Plomo Revienta,22,marianamartinh
4,Las Caraqueñas,Guaco,Guaco Es Guaco,29,soyarepita
5,Epa Isidoro,Billo's,Canto a Caracas,0,soyarepita
6,Caracas Tiene Su Guaguanco,Justo Betancourt,Pa Bravo Yo,0,1253681944
7,Luna Caraqueña,Billo's,Canto a Caracas,0,soyarepita
8,Caminando por Caracas,Piero,Piero,0,soyarepita
9,Caracas de Noche - Original Mix,Javith,Caracas de Noche (Remixes),26,1253681944


In [178]:
def group_by_popularity (df):
    # After sorting, take the first row by position (.iloc), not the row labelled 0
    return df.groupby("user").agg({"popularity": "mean"}).reset_index().sort_values(by="popularity", ascending=False)["user"].iloc[0]

In [175]:
df.groupby("user").agg({"popularity": "mean"}).reset_index().sort_values(by="popularity", ascending=False)["user"].iloc[0]

'1245001868'
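
To make the grouping easier to follow, here is the same idea written out with plain Python: average the popularity per user, then pick the user with the highest average. The rows are a small subset copied from the table above, so the winner printed here holds only for this sample; `most_mainstream_user` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

def most_mainstream_user(rows):
    """Returns the user whose songs have the highest mean popularity."""
    by_user = defaultdict(list)
    for user, popularity in rows:
        by_user[user].append(popularity)
    averages = {user: mean(values) for user, values in by_user.items()}
    return max(averages, key=averages.get)

# (user, popularity) pairs taken from a few rows of the dataframe
rows = [
    ("soyarepita", 0),
    ("soyarepita", 3),
    ("marianamartinh", 22),
    ("soyarepita", 29),
    ("1253681944", 0),
    ("1253681944", 26),
]
winner = most_mainstream_user(rows)
print(winner)
```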

In [143]:
# 1. Function that refreshes the .env
# 2. Does the request for the query: link
# 3. Processes the response into a dataframe
# 4. Groupby: name of the user by popularity

In [None]:
# input = link
# output = name 

In [194]:
def all_together (link):
    response = request_playlist (link)
    list_ = [info_for_one_song(i) for i in response["tracks"]["items"]]
    df = pd.DataFrame (list_)
    return f"The most mainstream user in this playlist is: {group_by_popularity (df)}"

In [221]:
andres = "https://open.spotify.com/playlist/50v0ITLYuxui3QRwd0CNTW?si=3176623842434386"

In [222]:
all_together(andres)

'The most mainstream user in this playlist is: 1245001868'

#### Function to extract info from dictionaries

### We're taking it to Pandas! 🐼
How? Easy....

In [225]:
df.to_csv("songs-andres.csv", index=False)

In [228]:
df = pd.read_csv("songs-andres.csv")
df

Unnamed: 0,name,artist,album_name,popularity,user
0,Canto a Caracas,Billo's,Canto a Caracas,0,soyarepita
1,Peces del Guaire,Desorden Público,Descomposición,3,soyarepita
2,Cerro Avila,Ilan Chester,El Comienzo en un Sotano de la Florida,0,soyarepita
3,Valle De Balas,Desorden Público,Plomo Revienta,22,marianamartinh
4,Las Caraqueñas,Guaco,Guaco Es Guaco,29,soyarepita
5,Epa Isidoro,Billo's,Canto a Caracas,0,soyarepita
6,Caracas Tiene Su Guaguanco,Justo Betancourt,Pa Bravo Yo,0,1253681944
7,Luna Caraqueña,Billo's,Canto a Caracas,0,soyarepita
8,Caminando por Caracas,Piero,Piero,0,soyarepita
9,Caracas de Noche - Original Mix,Javith,Caracas de Noche (Remixes),26,1253681944


# 📚 Recap so far:

**What we did**
- 0. We call the Spotify API with a POST request so that it gives us the token
- 1. We make a call to the Spotify API to get the playlist information
- 2. We keep only the songs and artists from the JSON
- 3. We write functions that automate processes and reuse code
- 4. We have a dataframe
<br>

**To remember**

- Add the `.env` to the `.gitignore`.

In [229]:
df.sample()

Unnamed: 0,name,artist,album_name,popularity,user
11,La Vecina,Los Amigos Invisibles,Arepa 3000,42,1245001868


## Mission 3: Get song lyrics

`Approach 1: SongLyrics library`

Lyrics extractor is a library that scrapes some song lyrics pages for us.
ALWAYS take a look at the [documentation](https://pypi.org/project/lyrics-extractor/), especially the **Requirements** part, which explains what we need for it to work.

We need to create the KEYs for this library
1.1.1. **[First](https://cse.google.com/cse/create/new)**, create a custom search engine: select a lyrics website to scrape, for example www.genius.com <br>
1.1.2. **Second**, `Edit your search engine > select your website > Copy your search engine ID`.
You now have your **GCS_ENGINE_ID**<br>
1.1.3. **[Then](https://developers.google.com/custom-search/v1/overview)**, `Get a key > Select project > copy search engine API key`. You now have your **GCS_API_KEY**.

```python
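import os
from lyrics_extractor import SongLyrics

# Assumes both keys were saved in the .env under these names
extract_lyrics = SongLyrics(os.getenv("GCS_API_KEY"), os.getenv("GCS_ENGINE_ID"))

# get_lyrics is a method of the SongLyrics object and takes a single search string
sheeran = extract_lyrics.get_lyrics("Ed Sheeran shape of you")
sheeran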
```

`Approach 2: lyricsgenius`

1. Install lyricsgenius
2. Sign up
3. Create an [API client](https://genius.com/api-clients)
   - Use "https://whatever.com" as your website

In [268]:
#!pip install lyricsgenius
import lyricsgenius

### store it in a variable
We read the Genius API token with `getpass`, so it is never displayed in the notebook, and keep it in a variable

In [231]:
from getpass import getpass
gen = getpass()

········


### We are looking for some lyrics

In [235]:
genius = lyricsgenius.Genius(gen)

In [240]:
example = genius.search_artist("ROSALÍA", max_songs=1)
example

Searching for songs by ROSALÍA...

Song 1: "MALAMENTE (Cap.1: Augurio)"

Reached user-specified song limit (1).
Done. Found 1 songs.


Artist(id, songs, ...)

In [245]:
example

Artist(id, songs, ...)

In [246]:
artist = genius.search_artist("ROSALÍA", max_songs=1)
cancion = "con altura"
song = artist.song(cancion)
song.lyrics

Searching for songs by ROSALÍA...

Song 1: "MALAMENTE (Cap.1: Augurio)"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "con altura" by ROSALÍA...
Done.


'TranslationsEnglishCon Altura Lyrics[Letra de "Con Altura" ft. El Guincho]\n\n[Intro: ROSALÍA & Mariachi Budda]\nEsto vamo\' a arrancarlo con altura\nEl dembow lo canto con hondura\nDicen: "Una estrella, una figura"\nDe Héctor aprendí la sabrosura\nNunca he visto una joya tan pura\n\n[Coro: El Guincho & ROSALÍA]\nEsto es pa\' que quede, lo que yo hago dura (Con altura)\nDemasiá\' noches de travesura (Con altura)\nVivo rápido y no tengo cura (Con altura)\nIré joven pa\' la sepultura (Con altura)\nEsto es pa\' que quede, lo que yo hago dura (Con altura)\nDemasiá\' noches de travesura (Con altura)\nVivo rápido y no tengo cura (Con altura)\nIré joven pa\' la sepultura (Con altura)\n[Verso 1: ROSALÍA]\nPongo rosas sobre el Panamera (Brrm)\nPongo palmas sobre la guantanamera\nLlevo a Camarón en la guantera (De la Isla)\nLo hago pa\' mi gente y lo hago a mi manera\n\n[Pre-Coro: ROSALÍA]\nFlores azules y quilates\nY si es mentira, que me maten\nFlores azules y quilates\nY si es mentira, que m

### Many lyrics!

In [247]:
print(song.lyrics)

TranslationsEnglishCon Altura Lyrics[Letra de "Con Altura" ft. El Guincho]

[Intro: ROSALÍA & Mariachi Budda]
Esto vamo' a arrancarlo con altura
El dembow lo canto con hondura
Dicen: "Una estrella, una figura"
De Héctor aprendí la sabrosura
Nunca he visto una joya tan pura

[Coro: El Guincho & ROSALÍA]
Esto es pa' que quede, lo que yo hago dura (Con altura)
Demasiá' noches de travesura (Con altura)
Vivo rápido y no tengo cura (Con altura)
Iré joven pa' la sepultura (Con altura)
Esto es pa' que quede, lo que yo hago dura (Con altura)
Demasiá' noches de travesura (Con altura)
Vivo rápido y no tengo cura (Con altura)
Iré joven pa' la sepultura (Con altura)
[Verso 1: ROSALÍA]
Pongo rosas sobre el Panamera (Brrm)
Pongo palmas sobre la guantanamera
Llevo a Camarón en la guantera (De la Isla)
Lo hago pa' mi gente y lo hago a mi manera

[Pre-Coro: ROSALÍA]
Flores azules y quilates
Y si es mentira, que me maten
Flores azules y quilates
Y si es mentira, que me maten
(Con altura; con altura)

[Co
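
The raw text carries some page residue: a header that ends in `Lyrics` glued to the first line, and section tags such as `[Coro: ...]`. A possible cleanup, sketched with regular expressions, is shown below. The patterns are assumptions based only on the output above and may need adjusting for other songs; `clean_lyrics` is a hypothetical helper.

```python
import re

def clean_lyrics(raw):
    """Removes the leading '... Lyrics' header and bracketed section tags."""
    # Drop everything up to and including the first 'Lyrics' on the first line
    text = re.sub(r"\A[^\n]*?Lyrics", "", raw, count=1)
    # Remove tags like [Intro: ...] or [Verso 1: ...]
    text = re.sub(r"\[[^\]\n]*\]", "", text)
    # Strip each line and keep only the non-empty ones
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)

raw = (
    'TranslationsEnglishCon Altura Lyrics[Letra de "Con Altura" ft. El Guincho]\n\n'
    "[Intro: ROSALÍA & Mariachi Budda]\n"
    "Esto vamo' a arrancarlo con altura\n"
    "El dembow lo canto con hondura\n"
)
cleaned = clean_lyrics(raw)
print(cleaned)
```

A song whose title contains the word "Lyrics" would lose part of its first line, so the header pattern is worth checking against a few real examples first.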

In [250]:
def get_lyrics (artist, song):
    try:
        artist_ = genius.search_artist(artist, max_songs=1)
        song_ = artist_.song(song)
        return song_.lyrics
    
    except Exception:
        # No artist or song found (or the request failed): mark the lyrics as missing
        return np.nan

In [251]:
get_lyrics ("Oasis", "wonderwall")

Searching for songs by Oasis...

Song 1: "Wonderwall"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "wonderwall" by Oasis...
Done.


"TranslationsEnglishWonderwall Lyrics[Verse 1]\nToday is gonna be the day that they're gonna throw it back to you\nAnd by now, you shoulda somehow realised what you gotta do\nI don't believe that anybody feels the way I do about you now\n\n[Verse 2]\nAnd backbeat, the word is on the street that the fire in your heart is out\nI'm sure you've heard it all before, but you never really had a doubt\nI don't believe that anybody feels the way I do about you now\n\n[Pre-Chorus]\nAnd all the roads we have to walk are winding\nAnd all the lights that lead us there are blinding\nThere are many things that I would like to say to you, but I don't know how\n[Chorus]\nBecause maybe\nYou're gonna be the one that saves me\nAnd after all\nYou're my wonderwall\n\n[Verse 3]\nToday was gonna be the day, but they'll never throw it back to you\nAnd by now, you shoulda somehow realised what you're not to do\nI don't believe that anybody feels the way I do about you now\n\n[Pre-Chorus]\nAnd all the roads that

In [252]:
df.sample()

Unnamed: 0,name,artist,album_name,popularity,user
9,Caracas de Noche - Original Mix,Javith,Caracas de Noche (Remixes),26,1253681944


In [254]:
def silly_function (x, y):
    return x + y

In [258]:
#df["lyrics_test"] = df.apply(silly_function (df["artist"], df["name"]), axis=1)

In [None]:
#df["lyrics"] = df.apply(get_lyrics (df["artist"], df["name"]), axis=1)

In [259]:
df["lyrics"] = df.apply(lambda row: get_lyrics (row["artist"], row["name"]), axis=1)

Searching for songs by Billo's...

Changing artist name to 'Billo’s Caracas Boys'
Song 1: "Navidad Que Vuelve"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Canto a Caracas" by Billo’s Caracas Boys...
Done.
Searching for songs by Desorden Público...

Song 1: "Allá Cayó"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Peces del Guaire" by Desorden Público...
No results found for: 'Peces del Guaire Desorden Público'
Searching for songs by Ilan Chester...

Song 1: "Palabras del alma"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Cerro Avila" by Ilan Chester...
No results found for: 'Cerro Avila Ilan Chester'
Searching for songs by Desorden Público...

Song 1: "Allá Cayó"

Reached user-specified song limit (1).
Done. Found 1 songs.
Searching for "Valle De Balas" by Desorden Público...
Done.
Searching for songs by Guaco...

Song 1: "Si Usted La Viera"

Reached user-specified song limit (1).
Done. Found 1
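
The log shows the same artist being searched more than once (Desorden Público appears twice), and each search is a network call. One way to avoid the repeated work, sketched here, is to memoize the artist lookup with `functools.lru_cache`. `search_artist` below is a stand-in for the real call, not part of any library.

```python
from functools import lru_cache

searches = []  # records every lookup that actually reaches the "network"

def search_artist(name):
    # Stand-in for a slow call such as genius.search_artist(name, max_songs=1)
    searches.append(name)
    return {"artist": name}

@lru_cache(maxsize=None)
def cached_search_artist(name):
    """Returns the cached result when the same artist was already looked up."""
    return search_artist(name)

artists = ["Billo's", "Desorden Público", "Ilan Chester", "Desorden Público"]
found = [cached_search_artist(name) for name in artists]
print(len(found), len(searches))
```

The cache lives only for the current session, which is usually enough when filling one dataframe column.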

# Export: csv

In [263]:
df

Unnamed: 0,name,artist,album_name,popularity,user,lyrics
0,Canto a Caracas,Billo's,Canto a Caracas,0,soyarepita,Scarface Script LyricsScarface\n\nBy: Oliver S...
1,Peces del Guaire,Desorden Público,Descomposición,3,soyarepita,
2,Cerro Avila,Ilan Chester,El Comienzo en un Sotano de la Florida,0,soyarepita,
3,Valle De Balas,Desorden Público,Plomo Revienta,22,marianamartinh,Valle De Balas LyricsLa ciudad se encierra a v...
4,Las Caraqueñas,Guaco,Guaco Es Guaco,29,soyarepita,Las Caraqueñas LyricsNo sé que tienen las chic...
5,Epa Isidoro,Billo's,Canto a Caracas,0,soyarepita,
6,Caracas Tiene Su Guaguanco,Justo Betancourt,Pa Bravo Yo,0,1253681944,
7,Luna Caraqueña,Billo's,Canto a Caracas,0,soyarepita,
8,Caminando por Caracas,Piero,Piero,0,soyarepita,CONGRESOS E INTELECTUALES EN LOS INICIOS DE UN...
9,Caracas de Noche - Original Mix,Javith,Caracas de Noche (Remixes),26,1253681944,


In [265]:
df.to_csv("lyrics-andres.csv", index=False)

# RECAP

- Get lyrics from a given playlist
- Signed up for the Spotify API: client_id & client_secret
- With that: do a POST request to send those credentials and get a token
- With the token: we're able to get all the songs from a playlist
- We use the genius library for Python to use its API
- We build a dataframe with all the lyrics from a given Spotify playlist

- We saved the code into functions
- We also created one to save things into .env

- Reminder: .env & .gitignore

- By putting all the functions together, we can build a script that takes ONE link (provided you have Spotify credentials) and returns a pandas dataframe with songs & lyrics

Input: one link
Output: a whole dataframe with info from two sources