![](https://upload.wikimedia.org/wikipedia/commons/thumb/3/33/Parental_Advisory_label.svg/500px-Parental_Advisory_label.svg.png)

This is going to be a long lab. You will probably start now and finish at home. The **highly suggested** browser (or, at least, the one that I'll be using) is [Firefox](https://www.mozilla.org/en-US/firefox/developer/), the developer edition (for the console mode).

The goal is to build a small collection of songs by our favourite artist. Let's say it's _Beastwars_ (they are great!).  
First we will work together through a full example, then I will ask you to do something more!  

The lab is organised like this:

1.  The Scraping
2.  The APIs

Part 1 treats the web as an unwilling source of data (the information is out there, but it is not explicitly designed for you to harvest);   
Part 2 mostly treats the web as a willing source of data.

Part 1 is more guided; in Part 2 you'll need to navigate autonomously.

#### The lab is long, so you may want to alternate between the Julia and the R part a little bit in order to get started on both during the lab time.

# Scraping

 A little kicker for the day:

In [54]:
IRdisplay::display_html('<iframe width="560" height="315" src="https://www.youtube.com/embed/M-twENDrwkk" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>')

We are going to work in the usual tidyverse framework.

In [55]:
library(tidyverse)

We also introduce a bunch of additional packages that we will start to use more.

In [56]:
#install.packages("magrittr")
#install.packages("purrr")
#install.packages("glue")
#install.packages("stringr")
library(magrittr) # better handling of pipes
library(purrr) # to work with lists and map functions
library(glue) # to paste strings
library(stringr) # to handle strings

We also need some libraries specific to web scraping: [rvest](https://github.com/hadley/rvest#rvest) and [polite](https://github.com/dmi3kno/polite#polite-). Spend a few minutes reading through the README files on their websites.

In [57]:
#install.packages("rvest")
#remotes::install_github("dmi3kno/polite")
library(rvest) # rvest makes scraping easier
library(polite) # polite is the "polite" version of rvest

### The lyrics

We are going to extract the lyrics from here: https://www.musixmatch.com/ . I chose it because it's rather consistent, and it's from Bologna, Italy (yeah!).

The website offers the first 15 lyrics up front. That will do for the moment (and fixing that is not that easy). Let's take a look [here](https://www.musixmatch.com/artist/Beastwars#).

The webpage you see is **not** the webpage the browser reads: it is the result of "parsing" the original page. If you want to see the original page you can (in Firefox) right click and select "View Page Source". It's a mess. It's a lot of mess. But if you scroll far enough you may find that there is information about "track_id", "track_name", ... It is stored under the label "attributes".

### Titles

First things first: we would like to get a list of those track titles, so that we can then go and scrape each of them. Let's see how.

In [58]:
url_titles <- "https://www.musixmatch.com/artist/Beastwars#" # this is the base url from where the scraping starts

page_title <- read_html(url_titles)
page_title

{html_document}
<html xmlns:og="http://ogp.me/ns#" class="artist-page-page">
[1] <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# musixmatch:  ...
[2] <body spellcheck="false" class="">\n    <div id="fb-root"></div>\n    <di ...

`read_html()` reads and parses the webpage into R. What kind of object do we have there?

In [59]:
page_title %>% typeof()
page_title %>% glimpse()

List of 2
 $ node:<externalptr> 
 $ doc :<externalptr> 
 - attr(*, "class")= chr [1:2] "xml_document" "xml_node"


Mmm, a list: this is how R sees the information in the page. Now is a good moment to read through this quick introduction to R lists and vectors written by Jenny Bryan: https://jennybc.github.io/purrr-tutorial/bk00_vectors-and-lists.html

In [60]:
page_title

{html_document}
<html xmlns:og="http://ogp.me/ns#" class="artist-page-page">
[1] <head prefix="og: http://ogp.me/ns# fb: http://ogp.me/ns/fb# musixmatch:  ...
[2] <body spellcheck="false" class="">\n    <div id="fb-root"></div>\n    <di ...

OK. It's a document, and in particular an XML document (XML is a close relative of HTML).  
We need to parse the XML in a better way than just diving into the text.  
The library `xml2` (which `rvest` builds on) is there exactly for parsing XML into something more humane.  
Let's see a bit more of that page: careful, long output coming!

In [61]:
page_title %>% html_structure()

<html.artist-page-page [xmlns:og]>
  <head [prefix]>
    <script>
      {cdata}
    <script>
      {cdata}
    <script>
      {cdata}
    <script [async, src]>
    <script [type]>
      {cdata}
    <link [rel, type, href]>
    <link [rel, type, href]>
    <script>
      {cdata}
    <meta [name, content]>
    <meta [name, content]>
    <title>
      {text}
    <meta [charset]>
    <meta [http-equiv, content]>
    <meta [http-equiv, content]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [name, content]>
    <meta [name, content]>
    <link [href, rel]>
    <link [href, rel]>
    <link [href, rel]>
    <link [href, rel]>
    <link [rel, type, href, title]>
    <link [rel, href]>
    <link [rel, href]>
    <link [rel, href]>
    <meta [name, content]>
    <link [rel, href, hreflang]>
    <link [rel, href, hreflang]>
    <link [rel, href, hreflang]>
    <link [rel, href, hreflang]>
    <link 

What is that? How will we ever find what we need inside there?  
![](https://media.giphy.com/media/ZkEXisGbMawMg/giphy.gif)

To the browser!
Press control+shift+i to fire up the "inspector mode" (if you are in Firefox).  
Now press the button on the left of "Inspector" (it's this button ![this button](https://i.imgur.com/g00sN2e.png)).
Then move your mouse over the title of a song. As you move it around you'll see that different parts of the page get highlighted, and different parts of the source code too! The highlighted source code of the web page is the code responsible for the object you're pointing at.

Let's look more carefully at the source text. In particular, the titles are inside a
```html
<span>Devils of Last Night</span>
```
block. And that block is inside a larger block:
```html
<a class="title" href="/lyrics/Beastwars/Devils-of-Last-Night">
<span>Devils of Last Night</span>
</a>
```
and, sure enough, even that block is within a larger block and so on until the largest block of all: the full page.

To get to the title we can use that `class` attribute. Handles like this are called _css selectors_, and we will use them to navigate the extremely complex list that we get from a web page.
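To get a feeling for how selectors behave before pointing them at a real page, here is a minimal sketch on a made-up HTML fragment. The markup, the band name and the `artist` class are invented for illustration and are not the actual Musixmatch page; `read_html()` also accepts a literal HTML string, so no network is needed.

```r
library(rvest)

# A made-up page fragment (NOT the real Musixmatch markup)
toy_page <- read_html('
  <ul>
    <li><a class="title" href="/lyrics/Some-Band/First-Song"><span>First Song</span></a></li>
    <li><a class="title" href="/lyrics/Some-Band/Second-Song"><span>Second Song</span></a></li>
    <li><a class="artist" href="/artist/Some-Band"><span>Some Band</span></a></li>
  </ul>
')

# ".title" (with the dot) selects every element whose class is "title"
by_class <- toy_page %>% html_nodes(".title")

# "title" (no dot) selects <title> elements; this fragment has none
by_tag <- toy_page %>% html_nodes("title")

# "a.title" selects only <a> elements that also have class "title"
by_tag_and_class <- toy_page %>% html_nodes("a.title")

length(by_class)
length(by_tag)
html_text(by_tag_and_class)
```

The dot is what distinguishes a class from a tag name, which is why dropping it changes the result.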

Sometimes, we can be lucky. For example, all the titles have the class "title", so the css selector for them is simply `.title`. Let's see.

In [62]:
page_title %>%
  html_nodes(".title") # write .title, with the dot, because we want all results with that class. Try removing it and see what happens.

{xml_nodeset (15)}
 [1] <a class="title" href="/lyrics/Beastwars/Damn-the-Sky"><span>Damn the Sk ...
 [2] <a class="title" href="/lyrics/Beastwars/Omens"><span>Omens</span></a>
 [3] <a class="title" href="/lyrics/Beastwars/Imperium"><span>Imperium</span> ...
 [4] <a class="title" href="/lyrics/Beastwars/Call-Out-the-Dead"><span>Call O ...
 [5] <a class="title" href="/lyrics/Beastwars/Ruins"><span>Ruins</span></a>
 [6] <a class="title" href="/lyrics/Beastwars/Red-God"><span>Red God</span></a>
 [7] <a class="title" href="/lyrics/Beastwars/Rivermen"><span>Rivermen</span> ...
 [8] <a class="title" href="/lyrics/Beastwars/Cthulhu"><span>Cthulhu</span></a>
 [9] <a class="title" href="/lyrics/Beastwars/Mihi"><span>Mihi</span></a>
[10] <a class="title" href="/lyrics/Beastwars/Iron-Wolf"><span>Iron Wolf</spa ...
[11] <a class="title" href="/lyrics/Beastwars/Dune"><span>Dune</span></a>
[12] <a class="title" href="/lyrics/Beastwars/Raise-The-Sword"><span>Raise th ...
[13] <a class="title" href="/

That's still quite a mess: we have too much stuff, such as some links (called "href") and more text than we need. Let's clean it up with `html_text()`

In [63]:
page_title %>%
  html_nodes(".title") %>%
  html_text()


It looks much better! Now we have 15 song titles. Soon we will have the lyrics!

In [64]:
SLS_df <- tibble(Band = "Beastwars",
                 # build the Title variable using the code we used above
                 Title = page_title %>%
                         html_nodes(".title") %>%
                         html_text())
SLS_df

Band,Title
<chr>,<chr>
Beastwars,Damn the Sky
Beastwars,Omens
Beastwars,Imperium
Beastwars,Call Out the Dead
Beastwars,Ruins
Beastwars,Red God
Beastwars,Rivermen
Beastwars,Cthulhu
Beastwars,Mihi
Beastwars,Iron Wolf


We want to recover the links. Look again at what we get when we select `.title`: you may see that the _actual_ link is there, coded as `href`. Can we extract that? Yes we can!

In [65]:
page_title %>%
  html_nodes(".title") %>%
  html_attrs()


In particular, we want the element called `href`. Hey, we can get that with (one of) the `map()` functions from `purrr`!
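As a minimal sketch of that idea, assuming a made-up two-link fragment instead of the real page: `html_attrs()` gives back a list with one named character vector per node, and `map_chr()` can pull the `href` element out of each of them.

```r
library(rvest)
library(purrr)

# A made-up fragment (NOT the real Musixmatch markup)
toy_page <- read_html('
  <a class="title" href="/lyrics/Some-Band/First-Song">First Song</a>
  <a class="title" href="/lyrics/Some-Band/Second-Song">Second Song</a>
')

# A list with one named character vector (class, href) per node
attrs <- toy_page %>%
  html_nodes(".title") %>%
  html_attrs()

# Apply the same extraction to every element of the list,
# returning a character vector
links <- map_chr(attrs, function(x) x[["href"]])
links
```

`map_chr()` insists that every result is a single string, so it fails loudly if a node has no `href`, which is a handy check while exploring.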

`purrr` is an extremely powerful library, and we will read about it and use it more in the next lectures. If in your R experience you have used `apply()` or similar functions (any of `sapply()`, `lapply()`, `vapply()`, `mapply()`, `tapply()`, or the `XYply()` functions from `plyr`) read [this](https://jennybc.github.io/purrr-tutorial/bk01_base-functions.html#why_not_base).  

If you have no idea what I just talked about, don't worry.

The go-to resource to learn about `purrr` and maps is Hadley's [iteration](http://r4ds.had.co.nz/iteration.html) chapter in his R for Data Science book. Read especially sections 21.5 to 21.8. Another great resource is this tutorial by Rebecca Barter: http://www.rebeccabarter.com/blog/2019-08-19_purrr/

We are going to talk more about `purrr` so it's a good investment to read more.

Yet, we could have let `rvest` do the job for us:

In [66]:
SLS_df %<>% # the %<>% is from magrittr, it corresponds to SLS_df <- SLS_df %>% ...
  mutate(Link = page_title %>%
  html_nodes(".title") %>%
  html_attr("href"))
SLS_df

Band,Title,Link
<chr>,<chr>,<chr>
Beastwars,Damn the Sky,/lyrics/Beastwars/Damn-the-Sky
Beastwars,Omens,/lyrics/Beastwars/Omens
Beastwars,Imperium,/lyrics/Beastwars/Imperium
Beastwars,Call Out the Dead,/lyrics/Beastwars/Call-Out-the-Dead
Beastwars,Ruins,/lyrics/Beastwars/Ruins
Beastwars,Red God,/lyrics/Beastwars/Red-God
Beastwars,Rivermen,/lyrics/Beastwars/Rivermen
Beastwars,Cthulhu,/lyrics/Beastwars/Cthulhu
Beastwars,Mihi,/lyrics/Beastwars/Mihi
Beastwars,Iron Wolf,/lyrics/Beastwars/Iron-Wolf



Cool, we don't gain much in terms of lines of code, but it will be useful later!

### And `purrr`!

Cool, now we want to grab all the lyrics. Let's start with one at a time. What is the url we want?

In [67]:
url_song <- glue("https://www.musixmatch.com{SLS_df$Link[1]}")

url_song

Hey, ps, `glue()` is basically a much better version of `paste()`. Take a look [here](https://glue.tidyverse.org) if you are curious about it.
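A quick side-by-side, as a sketch: the `link` value below is a made-up example path, not one taken from the scraped table. `glue()` fills in whatever R expression sits inside the curly braces, while `paste0()` needs the pieces passed one by one.

```r
library(glue)

# A hypothetical relative link, shaped like the ones in the Link column
link <- "/lyrics/Some-Band/First-Song"

# glue() interpolates the expression inside {}
with_glue <- glue("https://www.musixmatch.com{link}")

# the base R equivalent
with_paste <- paste0("https://www.musixmatch.com", link)

with_glue
identical(as.character(with_glue), with_paste)
```

The result of `glue()` is a character vector with an extra `glue` class, so it can be passed to functions that expect a plain string, such as `read_html()`.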

And let's grab the lyrics for that song.  
Open that page in your browser, press control+shift+i and point to the lyrics.  
The content is marked by a css selector called "p.mxm-lyrics__content": "p" is the tag of a paragraph element, and "mxm-lyrics__content" is the specific class for the lyrics.

In [68]:
url_song %>%
  read_html() %>%
  html_nodes(".mxm-lyrics__content") %>%
  html_text()

Ach, notice that it comes in different blocks: one for each section of text (if you look at the original page in your browser you'll see the lyrics are split by the advertisement). Well, we can just `glue_collapse()` them together with `glue`. As we are doing this, let's turn that flow into a function:

In [69]:
get_lyrics <- function(link){
  
  lyrics_chunks <- glue("https://www.musixmatch.com{link}#") %>%
   read_html() %>%
   html_nodes(".mxm-lyrics__content")
  
  # we do a sanity check to see that there's something inside the lyrics!
  stopifnot(length(lyrics_chunks) > 0)
  
  lyrics <- lyrics_chunks %>%
   html_text() %>%
   glue_collapse(sep =  "\n")
  
  return(lyrics)
}
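The collapsing step inside `get_lyrics()` can be tried in isolation. This is a small sketch with placeholder strings standing in for the text chunks that `html_text()` would return:

```r
library(glue)

# Placeholder chunks, standing in for the blocks of text scraped from a page
chunks <- c("first block of text", "second block of text", "third block of text")

# Join the chunks into one single string, separated by newlines
lyrics <- glue_collapse(chunks, sep = "\n")

length(chunks)
length(lyrics)
lyrics
```

Ending up with exactly one string per song is what later lets us store the lyrics in a single cell of the dataframe.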

Let's test it!


In [70]:
SLS_df$Link[1] %>%
  get_lyrics() %>%
  print()

Take me to the top of the hill
Where the birds refuse to fly
Raise your hands your hand to the damned sky

Is it the magic in you
In the land, this land of the dead
Raise my hands to the damned sky
Watch those twin moons collide

Take me to the top of the hill
Where the birds refuse to fly
Raise your hands your hand to the damned sky
Watch those twin moons collide

Is it the magic in you
In the land, this land of the dead
Raise my arms to the damned sky
Watch those twin moons collide

Give it all man
Give it all
Give it time man
To learn how to die
Give it all man
Give it time
To learn to fly
Give it all man
Give it time
To learn how to die


Now we can use purrr to map that function over our dataframe! Let's do that only for the top 4 tracks in our list.

In [71]:
SLS_df_top <- SLS_df %>%
  slice(1:4) %>%
  mutate(Lyrics = map_chr(Link, get_lyrics))
SLS_df_top

Band,Title,Link,Lyrics
<chr>,<chr>,<chr>,<chr>
Beastwars,Damn the Sky,/lyrics/Beastwars/Damn-the-Sky,"Take me to the top of the hill Where the birds refuse to fly Raise your hands your hand to the damned sky Is it the magic in you In the land, this land of the dead Raise my hands to the damned sky Watch those twin moons collide Take me to the top of the hill Where the birds refuse to fly Raise your hands your hand to the damned sky Watch those twin moons collide Is it the magic in you In the land, this land of the dead Raise my arms to the damned sky Watch those twin moons collide Give it all man Give it all Give it time man To learn how to die Give it all man Give it time To learn to fly Give it all man Give it time To learn how to die"
Beastwars,Omens,/lyrics/Beastwars/Omens,Empty House At the foot of the hill Gravel road To a Holy Mountain Pyramid in the hills Red sky See her omens come alive See it enter your mind Empty House At the foot of the hill Gravel road To a Holy Mountain Omens are red In the middle of your mind Roads of mist Women of time With hands pointed right Straight to the sky Omens are red In the middle of your mind Empty house At the foot of the hill Old gravel road To the Holy Mountain All omens are red In the middle of your mind Alive in the forest Alive in the sky Empty house At the foot of the hill Gravel road To a Holy Mountain
Beastwars,Imperium,/lyrics/Beastwars/Imperium,North by north by sea Set the black ship free Raise the golden sword To Jerusalem Born with a blind eye See these tombs of sand Raise my sword of light To Jerusalem Raise the burning branch Let the blood run free Take our golden light To Jerusalem This Imperium Raise the sword of life and death Raise the king by the skull To Jerusalem
Beastwars,Call Out the Dead,/lyrics/Beastwars/Call-Out-the-Dead,"Follow the ship, the blackest ocean Follow the time, the time of the dead Follow the time, the time of the battle Follow your heart, the eternal end When they call out the days When they call out your name Yes it is true I drank with religion Feasted on the bread too Yes it is true, I drank with the devils Worshipped the sky and screamed at the moon When they call out the name When they call out the dead Yes it is true I drank with religion Feasted on the bread too Yes it is true, I drank with the devils Worshipped the sky and howled at the moon When they call out the name When they call out the dead Remember only rock and roll Remember only rock and roll Remember only rock and roll"


Ok, here we were quite lucky, as all the links were right and all the first 4 songs had some lyrics. But that's not always the case. We can see that if we try to run the previous code on the full dataframe.

In [72]:
#SLS_df %>%
#  mutate(Lyrics = map_chr(Link, get_lyrics))

In general we may want to play it safe. To do so, we can use the `possibly()` wrapper (from `purrr`), so that we don't have to stop everything in case something bad happens.

In [73]:
get_lyrics_safe <- purrr::possibly(.f = get_lyrics, # the function that we want to make safer
                                   otherwise = NA_character_) # the value we get back if .f fails
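To see what the wrapper does without touching the web, here is a toy sketch. `fragile_upper()` is a hypothetical stand-in for a scraper that fails on some inputs; it is not part of the lab code.

```r
library(purrr)

# Hypothetical stand-in for a function that errors on "bad" inputs
fragile_upper <- function(x) {
  stopifnot(nchar(x) > 0) # fails on an empty string
  toupper(x)
}

# Same function, but returning NA instead of raising an error
safe_upper <- possibly(fragile_upper, otherwise = NA_character_)

# The empty string would stop map_chr(..., fragile_upper) with an error;
# with the safe version we get an NA in that position and carry on
results <- map_chr(c("omens", "", "ruins"), safe_upper)
results
```

The `NA` entries then tell us exactly which inputs need a closer look.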

Now we can try again, and this time we should handle the issues better.

In [74]:
SLS_df %<>%
  mutate(Lyrics = map_chr(Link, get_lyrics_safe))

SLS_df

Band,Title,Link,Lyrics
<chr>,<chr>,<chr>,<chr>
Beastwars,Damn the Sky,/lyrics/Beastwars/Damn-the-Sky,"Take me to the top of the hill Where the birds refuse to fly Raise your hands your hand to the damned sky Is it the magic in you In the land, this land of the dead Raise my hands to the damned sky Watch those twin moons collide Take me to the top of the hill Where the birds refuse to fly Raise your hands your hand to the damned sky Watch those twin moons collide Is it the magic in you In the land, this land of the dead Raise my arms to the damned sky Watch those twin moons collide Give it all man Give it all Give it time man To learn how to die Give it all man Give it time To learn to fly Give it all man Give it time To learn how to die"
Beastwars,Omens,/lyrics/Beastwars/Omens,Empty House At the foot of the hill Gravel road To a Holy Mountain Pyramid in the hills Red sky See her omens come alive See it enter your mind Empty House At the foot of the hill Gravel road To a Holy Mountain Omens are red In the middle of your mind Roads of mist Women of time With hands pointed right Straight to the sky Omens are red In the middle of your mind Empty house At the foot of the hill Old gravel road To the Holy Mountain All omens are red In the middle of your mind Alive in the forest Alive in the sky Empty house At the foot of the hill Gravel road To a Holy Mountain
Beastwars,Imperium,/lyrics/Beastwars/Imperium,North by north by sea Set the black ship free Raise the golden sword To Jerusalem Born with a blind eye See these tombs of sand Raise my sword of light To Jerusalem Raise the burning branch Let the blood run free Take our golden light To Jerusalem This Imperium Raise the sword of life and death Raise the king by the skull To Jerusalem
Beastwars,Call Out the Dead,/lyrics/Beastwars/Call-Out-the-Dead,"Follow the ship, the blackest ocean Follow the time, the time of the dead Follow the time, the time of the battle Follow your heart, the eternal end When they call out the days When they call out your name Yes it is true I drank with religion Feasted on the bread too Yes it is true, I drank with the devils Worshipped the sky and screamed at the moon When they call out the name When they call out the dead Yes it is true I drank with religion Feasted on the bread too Yes it is true, I drank with the devils Worshipped the sky and howled at the moon When they call out the name When they call out the dead Remember only rock and roll Remember only rock and roll Remember only rock and roll"
Beastwars,Ruins,/lyrics/Beastwars/Ruins,"The fallen stare upon the crown Whispering words I do not know Wings of change in the air See this fire, this fire below Let the black wind howl Let them walk upon the sea Let their holy moons explode let their troubled world be told To dream of space Between night and day Let the noise return Let the gods return to the holy hill Let the back winds howl Let them walk upon seas Let their moons explode Let their blood become FIRE"
Beastwars,Red God,/lyrics/Beastwars/Red-God,White white light It came to be Give it all Everything you have Take your time Yeah you gonna go mad Yeah the red red god Yeah the white white light Yeah the crystal scene Give it all Everything you have Take your time Yeah you gonna go mad Yeah the red red god Yeah the white white light Yeah the crystal scene Yeah the red red god Yeah the white white light Yeah the crystal scene
Beastwars,Rivermen,/lyrics/Beastwars/Rivermen,"I hear river Calling out names Loud and clear Some kind of strange Journey between Life and death In the end We return to the water Dirty old river It thrashes, it turns I hear it speak I know it's real This hill Born and raised Dirt you hold Tonight you never Felt so old But tonight you must return Across cold water Yeah dirty old river It thrashes, it turns I hear it speak I know it's real Yeah dirty old river It thrashes, it turns I hear it speak I know the words are real Why Call me now Why Call me back to the water"
Beastwars,Cthulhu,/lyrics/Beastwars/Cthulhu,"Every story has an ending The engine is broken Someone call the captain Seemed to have lost another soul The ship, the ship is sinking Into the cold water souls This is our last time All the air gone cold Convoy moving on Left us here to die alone Long way from land No ones ever going home Bottom of the sea Did I think I would die here? In the middle of the sea"
Beastwars,Mihi,/lyrics/Beastwars/Mihi,"I wait, I wait some more for you to take me back I pray, I pray so much I kneel at the holy mind I see the color, color red before my holy eyes I play with the fire in the hand of the time of the crazy one Burn the land, burn the ground, burn the house Burn the memories that we all had Burn the time, burn the place, burn the soul Of the crazy one, the crazy one Find the color, color red in the time of the righteous one Let me scream like the Lightning God Let me burn the land to the sea Burn the lady Burn the land"
Beastwars,Iron Wolf,/lyrics/Beastwars/Iron-Wolf,Iron wolf Silver snake Hear the howl Feel it's pain Iron wolf Silver Chain Feel the howl Feel it's pain With love With soul Feel it now In the howl Iron wolf Sliver snake Hear the howl Feel it's pain With love With soul Hear the howl Hear its soul Aw yeah Alright yeah Aw yeah Alright yeah come on Aw yeah Allright Oh yeah Iron wolf Sliver snake Hear the howl Feel the pain


### The flow

**Explore, try, test, automatize, test.**

Scraping data from the web will require a lot of trial and error. In general, I like this flow: I explore the pages that I want to scrape, trying to identify patterns that I can exploit. Then I try, on a smaller subset, and I test if it worked. Then I automatize it, using `purrr` or something similar. And finally some more testing.

#### Exercise: Another Artist

Let's do this for some other artist, like "Angel Haze". Notice that in this case we **must** use the attributes ("href") from the web page: the artist name that appears in the lyrics links is not always the same, so building the links ourselves with `glue` would fail.


### Challenge: pack it in a function

We have been scraping with the same flow for at least two artists. Our motto is that if we do something twice, we turn it into a function. So... let's turn the flow into a function. You have all the bits already there in the previous cells, and here I give you a boilerplate that you will have to fill in. When you see `----` in the code, that means it's up to you to do the job!

In [75]:
get_words <- function(band_name){

  # replace the white spaces in the band name with dashes (str_replace_all() is from stringr)
  collapsed_name <- str_replace_all(band_name, " ", "-")
  # define the url of the artist page, where we find titles and links
  url <- glue("https://www.musixmatch.com/artist/{collapsed_name}")

  # read the artist page and extract the title chunks
  list_of_titles <- url %>%
    read_html() %>%
    html_nodes(".title")

  # and we build the dataframe
  lyrics <- tibble(Band = band_name,
                   # we want only the text of the title, not all the html overhead
                   Titles = list_of_titles %>% html_text(),
                   # and we need the link to the lyrics page
                   Link = list_of_titles %>% html_attr("href"),
                   # here is where we do the main job, using get_lyrics_safe()
                   Lyrics = map_chr(Link, get_lyrics_safe))

  return(lyrics)
}

Once you have a function that you think is working, test it on a couple of artists and check whether it behaves properly.

In [76]:
ATCR_words <- "Breaking Benjamin" %>% get_words()
ATCR_words

Band,Titles,Link,Lyrics
<chr>,<chr>,<chr>,<chr>
Breaking Benjamin,I Will Not Bow,/lyrics/Breaking-Benjamin/I-Will-Not-Bow,"Fall! Now the dark begins to rise Save your breath, it's far from over Leave the lost and dead behind Now's your chance to run for cover I don't want to change the world I just want to leave it colder Light the fuse and burn it up Take the path that leads to nowhere All is lost again But I'm not giving in I will not bow! I will not break! I will shut the world away I will not fall! I will not fade! I will take your breath away... Fall! Watch the end through dying eyes Now the dark is taking over Show me where forever dies Take the fall and run to Heaven All is lost again But I'm not giving in... I will not bow! I will not break! I will shut the world away I will not fall! I will not fade! I will take your breath away... And I'll survive Paranoid I have lost the will to change And I am not proud Cold-blooded fake I will shut the world away... Open your eyes! I will not bow! I will not break! I will shut the world away I will not fall! I will not fade! I will take your breath away... And I'll survive Paranoid I have lost the will to change And I am not proud Cold-blooded fake I will shut the world away... Fall!"
Breaking Benjamin,Breath,/lyrics/Breaking-Benjamin/Breath,"I see nothing in your eyes And the more I see the less I like Is it over yet? In my head I know nothing of your kind And I won't reveal your evil mind Is it over yet? I can't win So sacrifice yourself And let me have what's left I know that I can find the fire in your eyes I'm going all the way Get away, please You take the breath right out of me You left a hole where my heart should be You got to fight just to make it through 'Cause I will be the death of you This will be all over soon Pour the salt into the open wound Is it over yet? Let me in So sacrifice yourself And let me have what's left I know that I can find the fire in your eyes I'm going all the way Get away, please You take the breath right out of me You left a hole where my heart should be You got to fight just to make it through 'Cause I will be the death of you I'm waiting I'm praying Realize Start hating You take the breath right out of me You left a hole where my heart should be You got to fight just to make it through 'Cause I will be the death of you"
Breaking Benjamin,The Diary of Jane,/lyrics/Breaking-Benjamin/The-Diary-of-Jane,"If I had to I would put myself right beside you So let me ask Would you like that? Would you like that? And I don't mind If you say this love is the last time So now I'll ask Do you like that? Do you like that? No! Something's getting in the way Something's just about to break I will try to find my place In the diary of Jane So tell me how it should be Try to find out what makes you tick As I lie down sore and sick Do you like that? Do you like that? There's a fine line between love and hate And I don't mind Just let me say that I like that I like that Something's getting in the way Something's just about to break I will try to find my place In the diary of Jane As I burn another page As I look the other way I still try to find my place In the diary of Jane So tell me how it should be! Desperate, I will crawl Waiting for so long No love, there is no love Die for anyone What have I become? Something's getting in the way Something's just about to break I will try to find my place In the diary of Jane As I burn another page As I look the other way I still try to find my place In the diary of Jane"
Breaking Benjamin,Dance With the Devil,/lyrics/Breaking-Benjamin/Dance-With-the-Devil,Here I stand Helpless and left for dead Close your eyes So many days gone by Easy to find what's wrong Harder to find what's right I believe in you I can show you that I can see right through All your empty lies I won't stay long In this world so wrong Say goodbye As we dance with the devil tonight Don't you dare look at him in the eye As we dance with the devil tonight Trembling Crawling across my skin Feeling your cold dead eyes Stealing the life of mine I believe in you I can show you that I can see right through All your empty lies I won't last long In this world so wrong Say goodbye As we dance with the devil tonight Don't you dare look at him in the eye As we dance with the devil tonight Hold on Hold on Say goodbye As we dance with the devil tonight Don't you dare look at him in the eye As we dance with the devil tonight Hold on Hold on (Goodbye...)
Breaking Benjamin,Angels Fall,/lyrics/Breaking-Benjamin/Angels-Fall,"I try to face the fight within But it's over I'm ready for the riot to begin And surrender I walked the path that led me to the end Remember I'm caught beneath with nothing left to give Forever When angels fall with broken wings I can't give up, I can't give in When all is lost and daylight ends I'll carry you and we will live forever Forever... Grey skies will chase the light away No longer I fought the fight now only dark remains Forever Divided I will stand And I will let this end When angels fall with broken wings I can't give up, I can't give in When all is lost and daylight ends I'll carry you and we will live forever Forever... The sun begins to rise And wash away the sky The turning of the tide Don't leave it all behind I will never say goodbye When angels fall with broken wings I can't give up, I can't give in When all is lost and daylight ends I'll carry you and we will live forever Forever Forever Forever..."
Breaking Benjamin,So Cold,/lyrics/Breaking-Benjamin/So-Cold,"Crowded streets are cleared away one by one Hollow heroes seperate as they run You're so cold keep your hand in mine Wise men wonder while strong men die Show me how it ends, it's alright Show me how defenseless you really are Satisfied and empty inside Well that's alright Let's give this another try If you find your family don't you cry In this land of make believe dead and dry You're so cold but you feel alive Lay your head on me one last time Show me how it ends it's alright Show me how defenseless you really are Satisfied and empty inside Well that's alright Let's give this another try Show me how it ends it's alright Show me how defenseless you really are Satisfied and empty inside Well that's alright Let's give this another try It's alright It's alright It's alright It's alright It's alright It's alright It's alright It's alright It's alright"
Breaking Benjamin,Ashes of Eden,/lyrics/Breaking-Benjamin/Ashes-of-Eden,Will the faithful be rewarded When we come to the end Will I miss the final warning From the lie that I have lived Is there anybody calling I can see the soul within And I am not worthy I am not worthy of this Are you with me after all Why can't I hear you Are you with me through it all Then why can't I feel you Stay with me don't let me go Because there's nothing left at all Stay with me don't let me go Until the ashes of Eden fall Will the darkness fall upon me When the air is growing thin Will the light begin to pull me To its everlasting will I can hear the voices haunting There is nothing left to fear And I am still calling I am still calling to you Are you with me after all Why can't I hear you Are you with me through it all Then Why can't I feel you Stay with me don't let me go Because there's nothing left at all Stay with me don't let me go Until the ashes of Eden fall Don't let go Don't let go Don't let go Don't let go Don't let go Don't let go Why can't I hear you Stay with me don't let me go Because there's nothing left at all Stay with me don't let me go Until the ashes of Eden fall Heaven above me Take my hand Shine until there's nothing left but you Heaven above me Take my hand Shine until there's nothing left but you
Breaking Benjamin,Red Cold River,/lyrics/Breaking-Benjamin/Red-Cold-River,"It's reborn Fight with folded hands Pain left below The lifeless live again Roar, roar, roar Red cold river Roar, roar, roar Red cold river I can't feel anything at all This life has left me cold and down I can't feel anything at all This love has led me to the end Stay reformed Erase this perfect world Hate left below The dark stream war Roar, roar, roar Red cold river I can't feel anything at all This life has left me cold and down I can't feel anything at all This love has led me to the end Roar, roar, roar Red cold river Try to find a reason to live Try to find a reason to live Try to find a reason to live I can't feel anything at all (Try to find a reason to live) This life has left me cold and down (Try to find a reason to live) I can't feel anything at all This love has let me to the end Roar, roar, roar Red cold river"
Breaking Benjamin,Give Me a Sign,/lyrics/Breaking-Benjamin/Give-Me-a-Sign,Dead star shine Light up the sky I'm all out of breath My walls are closing in Days go by Give me a sign Come back to the end The shepherd of the damned I can feel you falling away No longer the lost No longer the same And I can see you starting to break I'll keep you alive If you show me the way Forever and ever The scars will remain I'm falling apart Leave me here forever in the dark Daylight dies Blackout the sky Does anyone care? Is anybody there? Take this life Empty inside I'm already dead I'll rise to fall again I can feel you falling away No longer the lost No longer the same And I can see you starting to break I'll keep you alive If you show me the way Forever and ever The scars will remain I'm falling apart Leave me here forever in the dark God help me I've come undone Out of the light of the sun God help me I've come undone Out of the light of the sun I can feel you falling away No longer the lost No longer the same And I can see you starting to break I'll keep you alive If you show me the way Forever and ever The scars will remain Give me a sign There's something buried in the words Give me a sign Your tears are adding to the flood Just give me a sign There's something buried in the words Give me a sign Your tears are adding to the flood Just give me a sign There's something buried in the words Give me a sign Your tears are adding to the flood Forever and ever The scars will remain
Breaking Benjamin,Blood,/lyrics/Breaking-Benjamin/Blood,Every endless word I have nothing here Sick of all that was Tired of all that is Every hated love I am broken in Sick of fucking up Tired of falling in And all that I regret I have before and will again It's over now (Are you running away?) I come apart (As I lay awake!) It's in my blood (Let the sky fall down!) I won't let go (My oblivion!) Counting every breath I am my own fear Nothing ever was Nothing ever is Every halo lost I am worn within Nothing left to harm Nothing left to live And all that I regret I have before and will again It's over now (Are you running away?) I come apart (As I lay awake!) It's in my blood (Let the sky fall down!) I won't let go (My oblivion!) Face the monster I've become (And fight! I will not become!) In the ground we rise to burn (Maybe your life will let me love!) Forgive my heart It's over now I come apart It's in my blood (Let the sky fall down!) I won't let go (My oblivion!)


In [77]:
AW_words <- "Alien Weaponry" %>% get_words()
AW_words

Band,Titles,Link,Lyrics
<chr>,<chr>,<chr>,<chr>
Alien Weaponry,Raupatu,/lyrics/Alien-Weaponry/Raupatu,"Nā te Tiriti Te tino, tino rangatiratanga O o ratou whenua Tino, tino rangatiratanga O ratou kainga Tino rangatiratanga Me o ratou taonga katoa Waikato Awa He piko, he taniwha Kingi Tawhiao Me Wiremu Tamihana Ki Rangiriri e tū ana Ko Te Whiti o Rongomai Ki Parihaka e noho ana Raupatu! Nā te Tiriti Te tino, tino rangatiratanga O o ratou whenua Tino, tino rangatiratanga O ratou kainga Tino rangatiratanga Me o ratou taonga katoa Raupatu! Rangiriri Raupatu! Pukehinahina Raupatu! Taurangaika Raupatu! Parihaka You take and take But you cannot take from who we are You cannot take our mana You cannot take our māoritanga You cannot take our people You cannot take our whakapapa You cannot take, you cannot take Raupatu!"
Alien Weaponry,Kai Tangata,/lyrics/Alien-Weaponry/Kai-Tangata,"He taua, He taua! Waewae tapu takahi te ara taua Ka hopungia e maha nga upoko Ka hopungia e maha taurekareka E mahi nga mahi a Tūmatauenga Anei rā Te uhi o Mataora Pai tuarā Te kokongapere Nga rape Te kitemaimairu Tatua taua Nga tā moko puhoro Anei nga tohu a Tūmatauenga He pakanga nunui mo te whakautu Tae mai nga tūpuna mo te whakaāwhina Kia mau nga Tohunga mo te whakakarakia E mahi nga mahi a Tūmatauenga Anei rā Te uhi o Mataora Pai tuarā Te kokongapere Nga rape Te kitemaimairu Tatua taua Nga tā moko puhoro Anei nga tohu a Tūmatauenga A Tūmatauenga A Tūmatauenga A Tūmatauenga A Tūmatauenga Mahi nga mahi a Tūmatauenga Mahi nga mahi a Tūmatauenga Mahi nga mahi a Tūmatauenga Mahi nga mahi a Tūmatauenga Whakatangi o nga pū, whakapatu nga taiaha Te kikokiko rekareka ō aku hoariri Nga umu whakakīa tātau kōpū ki te utu E mahi nga mahi a Tūmatauenga Anei rā Te uhi o Mataora Pai tuarā Te kokongapere Nga rape Te kitemaimairu Tatua taua Nga tā moko puhoro Anei nga tohu a Tūmatauenga Waewae tapu takahi te ara taua Waewae tapu takahi te ara taua Waewae tapu takahi te ara taua Waewae tapu takahi te ara taua."
Alien Weaponry,Whispers,/lyrics/Alien-Weaponry/Whispers,"The government's words Are like whispers in our ears Telling us lies To hide away our fears Hikoi taku tai moana te take ō te wā Tāhae whenua anō te kāwanatanga Te pūkana ō tariana, te pū ō tame iti He tāonga mō nga iwi māori nā te kupu ō te tiriti The government's words Are like whispers in our ears Telling us lies To hide away our fears He tiriti nā amerika, john key te waha mōkai Ehara I te koha, tppa he tūtae He hui toropuku, he kōrero huna Nga iwi māori awere, te tiriti takahia The government's words Are like whispers in our ears Telling us lies To hide away our fears Hide away the truth, playing on our fears The government's words are like whispers in our ears The people will find out the truth about our nation A greedy system that shuts down our voice with legislation"
Alien Weaponry,Holding My Breath,/lyrics/Alien-Weaponry/Holding-My-Breath,Before you judge me take a good hard look at yourself You don't know me but you're draining me of mental health A lie based on popular opinion I want to die 'cos' I can't be forgiven Locked in a room Void of humanity I'm in a black hole Suffering endlessly Opening my eyes is worse than death That's why I keep on holding my breath The world is caving in all around me I see myself as a vulgar monstrosity My mind collapsed into a technical mess I can't deal with the guilt that I have to ingest Locked in a room Void of humanity I'm in a black hole Suffering endlessly Opening my eyes is worse than death That's why I keep on holding my breath Opening my eyes worse than death ... Why I keep on holding my breath Worse than death ... Holding my breath Opening my eyes worse than death ... Why I keep on holding my breath
Alien Weaponry,Ahi Kā,/lyrics/Alien-Weaponry/Ahi-K%C4%81,Whakatau Pōtiki Māhuhu-ki-te-rangi Nga uri o Te Kawau Rongomai Te Ariki Tū ana mātou Ki runga Takaparawhā Tahu toku kāinga Tahu toku kāinga Toku ahi kā Tāmaki manuhiri Te kuini o Engarani Whakamā kāwana Tāhae Whenua Ka tangi nga Kuia Ka noho Kaumātua Tahu toku kāinga Tahu toku kāinga Toku ahi kā E maha nga pirihimana E whakawhiti ana ki Takaparawha Tū kaha Tū kaha Pūkana i mua o te kanohi o te kāwanatanga Ahi kā Ahi kā Ahi kā Ahi kā Toku ahi kā Toku ahi kā Toku ahi kā Tahu toku kāinga Toku ahi kā
Alien Weaponry,Blinded,/lyrics/Alien-Weaponry/Blinded,"A never-ending cycle, locked in a fantasy Blinded by your frame of mind and insecurity Acting like there's nothing wrong, when everyone can see Trying to control, when you've got no authority over me Don't take me for granted Don't take me for a fool You don't know me in the slightest And I don't think that I know you There's something on your mind, just fucking tell me Can't look you in your eyes and read your mind, you see Sorry that you feel this way, but honestly You ain't right for someone with my tendencies Set you free! Set you free! Don't take me for granted Don't take me for a fool You don't know me in the slightest And I don't think that I know you Don't take me for granted Don't take me for a fool You don't know me in the slightest And I don't think that I know you Set you free!"
Alien Weaponry,Hypocrite,/lyrics/Alien-Weaponry/Hypocrite,"Hypocrite Hypocrite Hy-po-crite You're just a god-damn hypocrite Decaying your own rules As you continue to disobey yourself Deceiver ... Dissembler ... Pretender ... You god-damn hypocrite Abusing your authority A power you were not given The rule doesn't apply any more You started the offence against the rule that was your own Deceiver ... Dissembler ... Pretender ... You god-damn hypocrite Deceiver ... Dissembler ... Pretender ... You god-damn hypocrite Why listen to the importance of obeying the rule Your actions make lies of your words Yeah, 'cause what we see is the rule maker Being ... being a rule breaker Hypocrite!"
Alien Weaponry,Whaikōrero,/lyrics/Alien-Weaponry/Whaik%C5%8Drero,
Alien Weaponry,The Things That You Know,/lyrics/Alien-Weaponry/The-Things-That-You-Know,Start the game all over again Redo it and regain myself Perfection is not without consequence Put Your Mediocrity on the shelf Normal things are safe and easy Get a life and go Do what you're comfortable with Do the things that you know The things that you know The things that you know The things that you know It's hard to let go of the things that you know Life must be boring be good and get your pay All people who succeed must listen to what The government has to say We must go cleanse ourselves Of happiness It's the government To whom we pray The things that you know (the things that you know) The things that you know (the things that you know) The things that you know (the things that you know) The things that you know (It's hard to let go of the things that you know) It's hard to let go of the things that you know
Alien Weaponry,Nobody Here,/lyrics/Alien-Weaponry/Nobody-Here,Everyone watching but there's nobody here Everyone watching but there's nobody here Everyone watching but there's nobody here Everyone watching but there's nobody here Nau mai ki te rua ipurangi au Ehara i te wāhi whakaruruhau He karu pōtete Ka whakahīhī tonu koe Te ao mārama ngaro O rite ki te pō Te rorohiko The computer He tāku kāinga pono Sometimes I wonder if the world is real ... real ...real Everyone watching but there's nobody here ... nobody here We all just close our eyes and look inside ... reality sinks away ... reality sinks away Everyone watching but there's nobody here Everyone watching but there's nobody here Everyone watching but there's nobody here Everyone watching but there's nobody here Nobody here Nobody here Nobody here


In [78]:
# your tests here.

## The words, the soul

Now that we have a collection of lyrics, it would be a pity not to do anything with them ;-)

So, we will do some quick and dirty sentiment analysis. The idea is to give each word a score expressing whether it is more negative or more positive, and then to sum up all the word values in a song: the result gives us a first approximation of the song's general mood. To do this, we are going to use Julia Silge's and David Robinson's great [_Tidytext_](https://github.com/juliasilge/tidytext) library and a _vocabulary_ of words for which we have the scores (there are different options, we are using "afinn").
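To make the scoring idea concrete before bringing in the real tools, here is a minimal base R sketch. The lexicon `toy_lexicon` and the helper `score_text()` are made up for illustration (the scores are not the real AFINN values), and the tokenisation is deliberately crude.

```r
# Toy lexicon: hypothetical scores, NOT the real AFINN values
toy_lexicon <- c(love = 3, great = 3, cold = -1, dead = -3, hate = -3)

# Hypothetical helper: score one piece of text against a named score vector
score_text <- function(text, lexicon) {
  # lower-case, then split on anything that is not a letter or an apostrophe
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  words <- words[words != ""]
  # look every word up by name; words missing from the lexicon give NA
  scored <- lexicon[words]
  scored <- scored[!is.na(scored)]
  c(n_scored = length(scored), total = sum(scored))
}

res <- score_text("You're so cold, but I love this great song", toy_lexicon)
res
```

Only the words found in the lexicon contribute to the total; everything else is silently ignored, which is one reason why the result is only a first approximation.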

In [79]:
#install.packages("tidytext")
#install.packages("textdata") # needed by get_sentiments("afinn") to download the lexicon
library(tidytext)

The positivity/negativity values for the words are in a dictionary called "afinn".

In [80]:
afinn <- get_sentiments("afinn")
afinn %>% sample_n(5)

ERROR: Error: The textdata package is required to download the AFINN lexicon. 
Install the textdata package to access this dataset.


Now, a bit of data wrangling: we break the lyrics into words, remove the words that are considered not interesting (they are called "stop words"), stitch the dataframe to the scores from afinn, and do the math for each song.

This walkthrough is loosely inspired by Max Humber's [post](https://www.r-bloggers.com/fantasy-hockey-with-rvest-and-purrr/) and my former student David Laing's post [here](https://laingdk.github.io/kendrick-lamar-data-science/). Great things are from them, errors are mine. Read those posts, there is a lot to learn! If you finish early, try to reproduce David's sentiment analysis on the song lyrics you collected.

The stop words are words we consider not important for the analysis because they are too common. They are in a dictionary (a dataframe indeed) called `stop_words`. Whether or not to remove them when doing any analysis is up to the researcher's best judgement.

In [83]:
stop_words %>% sample_n(10)


word,lexicon
<chr>,<chr>
quite,SMART
like,onix
willing,SMART
there,snowball
sides,onix
cant,SMART
tried,SMART
thus,SMART
while,onix
because,snowball


### Challenge, fill in the code below

In [84]:
SLS_df %>%
  unnest_tokens(word, Lyrics) %>% #split words
  strsplit(word)
# we use a _join function to remove dull words. Which function?
  join(stop_words, by = word) %>% # which column will we use to do the join?
# we use another _join function to stitch the scores from afinn to words in our lyrics.
# Which function?
  join(afinn, by = lexicon) %>% # which column will we use to do the join?
#and for each song we do the math
  group_by(word) %>% # we group by... by what?
  paste(Length = n(), # which function allows you to reduce all the rows in a dataframe into a single one?
    Score = sum(score)/Length) %>%
  arrange(-Score)

ERROR: Error in as.character(split): cannot coerce type 'closure' to vector of type 'character'


In [87]:
SLS_df %>%
  unnest_tokens(word, Lyrics) %>%
  strsplit(Lyrics)

ERROR: Error in strsplit(., Lyrics): object 'Lyrics' not found


Once you have done it, try it on a couple of lyrics collections. Then, turn the flow into a reusable function. Once again, I give you the boilerplate:

In [None]:
get_soul <- . %>%
  unnest_tokens(word, Lyrics) %>% 
  
  ----(stop_words, by = ---) %>% 
  ----(afinn, by = ---) %>% 
  group_by(---) %>% 
  ----(Length = n(), 
    Score = sum(score)/Length) %>%
  arrange(-Score)

And try that on some artists:

In [None]:
"Billie Holiday" %>% get_words() %>% get_soul()
"Nina Simone" %>% get_words() %>% get_soul()

Try to assess your results: do they seem to make sense?

# APIs

In the previous lab we saw how you can use the WWW as a source of data, even when the information presented is not there for the purpose of data analysis (i.e., it is there for you as a browser to read it).

Sometimes, though, you are lucky and the data source you're after will have an **API** set up so that you don't need to scrape it. An **API** is a detailed and rigorous set of rules that you must use in order to get something from a server.

In particular, many web APIs give you back the data you are after if you write the _right_ url.

Consider searching "something" on http://www.duckduckgo.com. If you take a look at the result page, you will see that the url is something like:  
https://duckduckgo.com/?q=something&t=h_&ia=about

The part before the `?` is the _base url_, the human readable name of the _server_ that is giving you the data. What comes after the `?` defines the various arguments of the duckduckgo API. `q` probably stands for "query", and after the `=` is where you put the terms you are looking for; `&` separates various arguments; `t` and `ia` define other aspects of the behaviour of the search engine.

You can pass more arguments, and change the behaviour of the server. For example, in duckduckgo we can append a `format=xml` to get the data in an xml format (instead of the html format with all the fancy visualization stuff). We may do it because the xml format **is** intended for programmatic data extraction and we are trying to get that data. Try to browse to:

https://duckduckgo.com/?q=something&t=h_&ia=about&format=xml
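The anatomy of such a url can be sketched in a few lines of base R. This is only an illustration of the `?name=value&name=value` structure described above: it uses the duckduckgo url from the text, does no url decoding or escaping, and all variable names are made up.

```r
url <- "https://duckduckgo.com/?q=something&t=h_&ia=about"

# split the url into base url and query string at the "?"
parts <- strsplit(url, "?", fixed = TRUE)[[1]]
base_url <- parts[1]
query_string <- parts[2]

# split the query string into "name=value" pairs, then into names and values
pairs <- strsplit(strsplit(query_string, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
args <- setNames(
  vapply(pairs, function(p) p[2], character(1)),
  vapply(pairs, function(p) p[1], character(1))
)
args

# add one more argument and glue everything back together
args["format"] <- "xml"
new_url <- paste0(base_url, "?", paste0(names(args), "=", args, collapse = "&"))
new_url
```

In practice you would rather let a library build and escape the query for you, but taking a url apart by hand once helps to see that there is no magic in it.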

Try a couple of other websites: you will notice that the `?argument=values` format is very common. Websites offering an API for accessing their data often have a lot of information about how to do it.

For example, duckduckgo's API is explained here: https://duckduckgo.com/api

A great resource to learn what APIs are is https://zapier.com/learn/apis/.

### httr

The tool of the trade for API interaction in R is the library `httr`. Get familiar with it by reading the [introduction](http://httr.r-lib.org/articles/quickstart.html). If you are unfamiliar with how HTTP works (the common underlying network protocol that rules the web), also read the two resources suggested at the start of the introduction. Or maybe ask your peer to explain it to you a bit.

### API in the zoo

The website [numbersapi](http://www.numbersapi.com) offers a funny example to try and use a RESTful API. (You can use *different* APIs, take a look at [programmableweb](http://programmableweb.com) for a selection of available ones, and I actually encourage you to choose a different example if you are already familiar with the concept.)

Using some tool to deal with strings (I like `glue`, but you can do this stuff with the base `paste` if you are more comfy) write some examples of interaction with numbersapi (tl;dr, write some query string and feed it to `httr`'s `GET`: e.g., `GET("http://www.colourlovers.com/api/color/6B4106?format=xml")`).

_kudos if you use an API that allows you to POST, PUT, DELETE, ... instead of just GETting._

### API in the zoo but tamed

Dealing with strings is not ideal: you need to tweak them every time you want to perform a different query, and that opens the door to errors. Also, our credo is that every time we have to repeat some task, it's better to write a function for it.

Thus, write wrapper functions to perform the queries you wrote in the previous exercise. Try and write both very specific, _atomic_, functions that do just one very specific thing, and some more general function that can do more than one thing combining the more atomic functions together.

For example, if you are using the numbersapi, write both something like `get_integer_math()` that only allows you to query `[integer]/math` (e.g., Ramanujan's taxi plate [1729](http://numbersapi.com/1729/math)); and something like `get_number_type()` which allows for different types specifications (trivia, math, date, or year).

_kudos if these functions check the inputs (e.g., for `get_integer_math`, the function checks whether the input is indeed an integer and returns an error otherwise) and handle any errors raised by the APIs._
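As a starting point for the input checking, here is a minimal sketch of a helper that only builds the url and leaves the actual request to you. The name `build_numbers_url()` is hypothetical, and treating all four types with the same `[number]/[type]` pattern is an assumption based on the `1729/math` example above; check the numbersapi documentation before relying on it.

```r
# Hypothetical helper: validates the inputs and returns the query url as a string
build_numbers_url <- function(number, type = c("trivia", "math", "date", "year")) {
  # match.arg() throws an error for anything outside the four listed types
  type <- match.arg(type)
  is_whole <- is.numeric(number) && length(number) == 1 &&
    is.finite(number) && number == round(number)
  if (!is_whole) {
    stop("`number` must be a single whole number, e.g. 1729", call. = FALSE)
  }
  # format() with scientific = FALSE avoids urls like ".../1e+06/math"
  paste0("http://numbersapi.com/", format(number, scientific = FALSE), "/", type)
}

build_numbers_url(1729, "math")
```

The string it returns can then be fed to `httr`'s `GET()`, and a more general function can be built on top of it by adding the handling of the response.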


### API authorized

Not all APIs are free and completely open to everybody. Some of them require an authentication/authorization step: the server that is responding to your query wants to know who you are, because certain services are available only to some users.

A common method of authentication is called `OAuth` or `OAuth2.0` (if you are very curious, see [here](https://en.wikipedia.org/wiki/OAuth)). `httr` has functions to create and handle OAuth tokens. See this [paragraph](http://httr.r-lib.org/articles/api-packages.html#authentication) or read more [here](https://support.rstudio.com/hc/en-us/articles/217952868-Generating-OAuth-tokens-for-a-server-using-httr).

Register on https://developer.twitter.com/, obtain the _secrets_, register an app (it does not matter which website you provide) and write the code to connect to it (see [here](https://developer.twitter.com/) for more details).

### API in the wild

Sometimes you are fortunate enough that the information you are looking for is provided by a website through an API. This is a somewhat open-ended exercise, the first one you encounter in these labs, but now you are grown up. We ask you to _pull_ data from an API of your choice, do some simple wrangling (and some super simple analysis if you really want) and visualize your result.

The focus of the exercise is on the interaction with the API you chose, not so much the visualization. We would like to see that you did some complex query in a programmatic fashion (that is, not by hand-writing the query but using a function to do that for you).

Some possible APIs you can use are (in order of what-Giulio-likes):

* digitalnz [API](https://digitalnz.org/developers): get cultural, education and gov data from NZ. It contains many different APIs. Some of them require authentication and/or subscription (a good one is the DOC [campsites and huts](https://api.doc.govt.nz) api).
* [Geonet](https://api.geonet.org.nz/) for geological hazard information
* [Trademe](https://developer.trademe.co.nz/) very nice one, but not super easy to get the authentication done (these [queries](https://developer.trademe.co.nz/api-reference/catalogue-methods/) do not require auth).
* all of these [references](https://www.programmableweb.com/category/reference/api) APIs are (probably, I did not check ALL of them) good examples.
* [GDELT](https://blog.gdeltproject.org/gdelt-geo-2-0-api-debuts/) This is a rewarding, yet **tough**, API. Take a look at [leaflet](https://rstudio.github.io/leaflet/json.html) for an idea on how to use what you get back from GDELT. Also, Alex Bresler has started working on an [R wrapper](https://github.com/abresler/gdeltr2) that may get you inspired.
* [car2go](https://github.com/car2go/openAPI) (requires [registration](https://github.com/car2go/openAPI/wiki/Access-protected-Functions-via-OAuth-1.0#registration-as-consumer))
* [Quandl](https://www.quandl.com/docs/api)
* [Lufthansa](https://developer.lufthansa.com/docs) most methods are open, so good one if you don't want to deal with authentication.

##### Let's do the JSON dance

Many websites return JSON files when you query them. JSON is similar to XML (of which HTML is an example) in that it is a **tree** (not tabular nor relational) data format. Roughly speaking, it is a list of lists of lists of ... `purrr` is very handy when you want to extract information from a JSON file. But do see also the `jsonlite` package, [intro here](https://cran.r-project.org/web/packages/jsonlite/vignettes/json-aaquickstart.html). If you are lost trying to wrangle the results you get from these APIs, consider working through Jenny Bryan's [tutorial on purrr](https://jennybc.github.io/purrr-tutorial/ex26_ny-food-market-json.html). Additional material here: https://www.zevross.com/blog/2019/06/11/the-power-of-three-purrr-poseful-iteration-in-r-with-map-pmap-and-imap/
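To see the "list of lists" structure in action, here is a small sketch that parses a JSON string and pulls fields out of it with `purrr`. The payload is entirely made up (it does not come from any real API); only `fromJSON()`, `map_chr()`, `map_dbl()` and `map_int()` are real functions.

```r
library(jsonlite)
library(purrr)

# made-up JSON payload, shaped like a typical API response
json_txt <- '{
  "artist": "Some Band",
  "albums": [
    {"title": "Album A", "year": 2011, "tracks": ["one", "two"]},
    {"title": "Album B", "year": 2013, "tracks": ["three"]}
  ]
}'

# simplifyVector = FALSE keeps the tree as plain nested lists
parsed <- fromJSON(json_txt, simplifyVector = FALSE)

# pull one field out of every album by name
album_titles <- map_chr(parsed$albums, "title")
album_years <- map_dbl(parsed$albums, "year")

# or apply a small function to every album
n_tracks <- map_int(parsed$albums, ~ length(.x$tracks))

data.frame(title = album_titles, year = album_years, n_tracks = n_tracks)
```

With the default `simplifyVector = TRUE`, `jsonlite` would try to turn `albums` into a data frame by itself; keeping the raw nested lists is useful when the tree is irregular and you want to decide yourself what to extract.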