In the first notebook of this project, [notebook_01_webscraping_Evanescence_Within_Temptation.ipynb](http://localhost:8888/notebooks/Project_Evanescence_Within_Temptation/notebook_01_webscraping_Evanescence_Within_Temptation.ipynb), we used web scraping to obtain the lyrics of `Evanescence` and `Within Temptation`. After that, in [notebook_02_retrieve_Spotify_data-Evanescence_Within_Temptation.ipynb](http://localhost:8888/notebooks/Project_Evanescence_Within_Temptation/notebook_02_retrieve_Spotify_data-Evanescence_Within_Temptation.ipynb), we used the Spotify API to retrieve more details about both bands, including information about the bands themselves, their albums, and their tracks: not only metadata but also audio features (e.g. valence, energy, tempo, liveness).

Now it is time to use the retrieved data and to explore and visualize as much of it as we can. Our goal is to explore not only the text data but also the numeric and categorical features.

For the NLP (Natural Language Processing) part, I want to do the following:

1. Text analysis: analyze both bands and compare them through their lyrics, using some metrics and word clouds.
2. Sentiment analysis: explore the sentiment (polarity and subjectivity) of the lyrics, as provided by [TextBlob](https://textblob.readthedocs.io/en/dev/index.html), and compare both bands through visualizations.
3. Connect the track metadata with the sentiment of the lyrics to draw conclusions.
4. Analyze some of the audio features, especially the ones that have been described as mood features (valence and energy), and check whether they are related to the sentiment of a track's lyrics.

Let’s get started!

# Loading all data

## Lyrics

In [1]:
import pandas as pd


In [8]:
df_lyrics_evanescence = pd.read_csv("./data/lyrics_evanescence_2020-02-16.csv")
df_lyrics_evanescence.sort_values(by='song_title').head(20)

Unnamed: 0,song_title,lyrics
3,4th of july,Shower in the dark day. Clean sparks driving d...
69,all that im living for,All that I'm living for. All that I'm dying fo...
50,angel of mine,You are everything I need to see. Smile and su...
41,anything for you,I'd give anything to give me to you. Can you f...
61,anywhere,"Dear my love, haven't you wanted to be with me..."
39,away from me,I hold my breath. as this life starts to take ...
78,before the dawn,Meet me after dark again. and I'll hold you. I...
59,bleed,How can I pretend that I don't see. What you h...
67,breathe no more,I've been looking in the mirror for so long.. ...
6,bring me to life,how can you see into my eyes. like open doors....


In [9]:
df_lyrics_within_temptation = pd.read_csv("./data/lyrics_within_temptation_2020-02-16.csv")
df_lyrics_within_temptation.sort_values(by='song_title').head(15)

Unnamed: 0,song_title,lyrics
17,a dangerous mind,Cause something is not right. I follow the sig...
64,a demons fate,"Ooh, ooh, ooh, ooh, ooh. Ooh, ooh, ooh, ooh, o..."
21,all i need,I'm dying to catch my breath. Oh why don't I e...
60,angels,Sparkling angel I believed. You were my saviou...
40,another day,I know you are going away. I take my love into...
51,aquarius,I hear your whispers. Break the silence and it...
29,bittersweet,If I tell you. Will you listen?. Will you stay...
67,blue eyes,Blue eyes wide to the world. Full of dreams an...
56,caged,These are the darkest clouds. They have surrou...
31,candles,Take away. These hands of darkness. Reaching f...


## Spotify's data

From all the data retrieved, I'll concentrate on the track information. I'll use the .csv file we saved after removing duplicates, which should have eliminated at least some of the duplicate tracks.

In [5]:
df_tracks_evanescence = pd.read_csv("./data/info_tracks_evanescence_without_duplicates_2020-02-16.csv")
df_tracks_evanescence.sort_values(by='track_name').head(10)

Unnamed: 0,album_name,track_id,track_name,track_duration,track_popularity,track_preview,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
51,Evanescence (Deluxe Version),3UkDyGtriDY7NzOJbF0rIH,a new way to bleed,226400,44,,0.378,0.895,1,-4.347,1,0.0531,5e-05,0.0252,0.15,0.258,155.946
37,The Open Door,4iDQezFTnOwgnrPYiqQ6TP,all that i am living for,228706,48,,0.514,0.809,3,-4.396,0,0.0617,0.0121,0.0,0.0763,0.385,136.881
58,Lost Whispers,2lH8hMXxuIcjpbIok9KbUj,breathe no more b side version,228809,49,,0.62,0.186,11,-8.527,0,0.0284,0.971,1e-06,0.117,0.219,96.992
19,Anywhere But Home (Live),2zn4moJkEmIVfV83iye9t5,"breathe no more live from le zénith,france/2004",213853,33,,0.562,0.431,11,-10.67,0,0.0307,0.323,0.0185,0.955,0.167,108.012
1,Fallen,0COqiPhxzoWICwFCS4eZcp,bring me to life,235893,77,,0.331,0.943,4,-3.188,0,0.0698,0.00721,2e-06,0.242,0.296,94.612
73,Synthesis Live,1rvxZ0qg96Nkr3PLhHTbCA,bring me to life live,264026,29,https://p.scdn.co/mp3-preview/87cbd661e1853b8f...,0.149,0.813,4,-5.26,0,0.056,0.346,2.1e-05,0.914,0.242,90.642
21,Anywhere But Home (Live),1AjCrY9w0edn2jAGEAkzJ7,"bring me to life live from le zénith,france/...",283760,40,,0.341,0.825,4,-7.22,0,0.0622,0.0221,0.0306,0.522,0.0398,94.992
64,Synthesis,4vHFFk4Vm9NWhGq2FAsTlj,bring me to life synthesis,257320,6,,0.362,0.785,4,-3.876,0,0.0567,0.61,1e-06,0.0722,0.16,90.904
27,The Open Door,663Karu2rvKLdnY0eo1n3M,call me when you're sober,214706,64,,0.45,0.883,7,-4.094,1,0.0524,0.00193,0.0,0.293,0.328,93.41
30,The Open Door,6Sh05fnlrLbMfSuI8Qur6a,cloud nine,262173,44,,0.125,0.893,3,-4.217,0,0.21,0.0432,8.5e-05,0.151,0.19,194.55


In [11]:
df_tracks_within_temptation = pd.read_csv("./data/info_tracks_within_temptation_without_duplicates_2020-02-16.csv")
df_tracks_within_temptation.sort_values(by='track_name').head(10)

Unnamed: 0,album_name,track_id,track_name,track_duration,track_popularity,track_preview,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
32,The Silent Force,6D5ih8y9mKmCSkuZO2Up2Q,a dangerous mind,256533,34,https://p.scdn.co/mp3-preview/c2c47b037fe1394c...,0.365,0.894,6,-5.491,0,0.0727,0.0711,0.000378,0.135,0.476,180.2
75,The Unforgiving,6ivwIJGFnzTRPG2dHvKA07,a demon's fate,329537,40,https://p.scdn.co/mp3-preview/cd99a3cda30d3714...,0.46,0.912,5,-3.444,0,0.0596,0.000579,0.000217,0.104,0.311,134.074
42,The Heart Of Everything,0lW4J9tzxpODQ8IExSumDW,all i need,290946,24,https://p.scdn.co/mp3-preview/15ddd25586b4f62e...,0.233,0.73,10,-4.855,1,0.0449,0.201,4e-06,0.13,0.123,152.972
55,An Acoustic Night At The Theatre,1tbSP6d2KwBB2DZUJLalRZ,all i need live,320946,21,https://p.scdn.co/mp3-preview/eae98f3734badfd0...,0.368,0.674,7,-5.859,0,0.0328,0.424,0.0,0.951,0.124,149.204
91,Hydra (Special Edition),6MubsJeQrVa0k7lJSxcdaM,and we run,230067,7,https://p.scdn.co/mp3-preview/af524142f40dcacf...,0.544,0.837,6,-4.618,0,0.0465,0.0596,0.0,0.0698,0.159,128.98
99,Hydra (Special Edition),13cZ2hORsadxvc2KLUBZoA,and we run evolution track,341497,5,https://p.scdn.co/mp3-preview/fc476336bac928d0...,0.578,0.723,6,-7.949,0,0.0507,0.136,2e-06,0.196,0.165,129.054
131,Let Us Burn - Elements & Hydra Live In Concert,301osYEEEVs4EQNXZXStCi,and we run live 2014,236746,0,https://p.scdn.co/mp3-preview/3789af5d464b633f...,0.51,0.865,9,-4.793,1,0.0431,0.19,3.1e-05,0.679,0.451,129.013
27,The Silent Force,3TEwbiC0GhIRStn3Eabtu7,angels,240440,55,https://p.scdn.co/mp3-preview/1dbf69a32db3b4d2...,0.341,0.867,7,-4.727,0,0.0492,0.293,0.0,0.257,0.2,182.023
114,Let Us Burn - Elements & Hydra Live In Concert,6oQdvGElasxvHYutiewDSc,angels live 2012,252226,0,https://p.scdn.co/mp3-preview/9f9cc354c35bf303...,0.438,0.852,7,-5.567,0,0.0387,0.147,0.0,0.976,0.246,91.061
105,Enter + The Dance,4nroowkyOM1HB9BOwUVV3M,another day,348453,16,https://p.scdn.co/mp3-preview/f76426030a7cfb44...,0.15,0.637,10,-6.177,1,0.0344,0.000843,0.00302,0.357,0.174,150.038
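Since the saved file is only expected to have removed some of the duplicates, a quick check like the one below can show which titles still appear more than once. This is a sketch on a toy table with hypothetical rows; the column names mirror the Spotify data above, and keeping the most popular version is just one possible policy, not necessarily the one used in this project.

```python
import pandas as pd

# Toy stand-in for the tracks table; all rows are hypothetical.
toy_tracks = pd.DataFrame(
    {
        "album_name": ["Album A", "Album A (Deluxe)", "Album B", "Album B"],
        "track_name": ["song one", "song one", "song two", "song three"],
        "track_popularity": [50, 42, 30, 12],
    }
)

# Flag every row whose track_name occurs more than once.
repeated = toy_tracks[toy_tracks.duplicated(subset="track_name", keep=False)]
print(repeated)

# One possible policy: keep the most popular version of each title.
deduplicated = (
    toy_tracks.sort_values("track_popularity", ascending=False)
    .drop_duplicates(subset="track_name", keep="first")
    .sort_values("track_name")
    .reset_index(drop=True)
)
print(deduplicated)
```

Note that variants such as live or instrumental versions have different names, so a check on `track_name` alone will not flag them.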


One thing to notice is that the song titles in the lyrics data contain no apostrophes, while the track names from Spotify do. E.g.: song_title: call me when youre sober vs. track_name: call me when you're sober.

So I'll remove the apostrophes from every track_name.

In [13]:
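# Series.replace only matches whole cell values; .str.replace removes the apostrophe inside each name.
df_tracks_evanescence["track_name"] = df_tracks_evanescence["track_name"].str.replace("'", "", regex=False)
df_tracks_within_temptation["track_name"] = df_tracks_within_temptation["track_name"].str.replace("'", "", regex=False)
df_tracks_within_temptation["track_name"]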

0                          restless
1                             enter
2                   pearls of light
3                       deep within
4                        gatekeeper
                   ...             
160          in vain   instrumental
161        firelight   instrumental
162        mad world   instrumental
163     mercy mirror   instrumental
164    trophy hunter   instrumental
Name: track_name, Length: 165, dtype: object
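When stripping apostrophes in pandas, it helps to keep in mind that `Series.replace` compares whole cell values, while `Series.str.replace` works on substrings. The sketch below shows the difference on a small hypothetical series; two of the names come from the tables above, and the lone `"'"` entry is added only to show what `Series.replace` does match.

```python
import pandas as pd

names = pd.Series(["call me when you're sober", "a demon's fate", "'"])

# Series.replace compares whole cell values, so only the cell
# that is exactly "'" is changed.
whole_value = names.replace("'", "")

# Series.str.replace works on substrings inside each string.
substring = names.str.replace("'", "", regex=False)

print(whole_value.tolist())
print(substring.tolist())
```

Only the second form produces names such as `call me when youre sober`, which is what is needed to match the `song_title` values in the lyrics data.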

# Text Analysis

## Word clouds

# Sentiment Analysis