# Discussion 5

## Notes

In [1]:
print len("abcdef") # len() is a function

print "abcdef".startswith("ab") # .startswith() is a method

6
True


Why can some functions be called as `foo()` directly, but others can only be called as `.foo()` on a value?

The `def` keyword makes an ordinary function:

In [1]:
def add(x, y):
    return x + y

add(3, 6)

9

In programming jargon, a _class_ is a formal definition of a data structure. Classes have _fields_ to store data and _methods_ to perform tasks; it makes sense to think of these as variables and functions bundled up inside the data structure.

Python allows you to define your own data structures with the `class` keyword:

In [31]:
class Point:
    # The __init__ method is a special method that gets called when an object is created.
    def __init__(self):
        # Create fields x and y, initialized to 0.
        self.x = 0
        self.y = 0
    
    # Another method.
    def print_xy(self):
        print "({}, {})".format(self.x, self.y)
        
    def move(self, xdist, ydist):
        self.x += xdist
        self.y += ydist

A `def` inside of a class creates a method instead of a function.

An _object_ is a specific instance of a class. You can create a new object by calling the class as if it were a function:

In [32]:
# Make a new Point object.
pt = Point()
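Each call to the class builds a separate object with its own copy of the fields. A minimal sketch, using a made-up `Counter` class that is not part of these notes (both the class name and the `bump` method are invented for illustration):

```python
class Counter(object):
    def __init__(self):
        # Every new object starts with its own field set to 0.
        self.count = 0

    def bump(self):
        self.count += 1

a = Counter()
b = Counter()
a.bump()
a.bump()

# Changing a does not affect b, because each object stores its own fields.
print(a.count)
print(b.count)
```

Here `a.count` ends up as 2 while `b.count` is still 0.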

You can use `.` to access fields and methods within an object:

In [33]:
pt.print_xy()
pt.move(10, -1)
pt.print_xy()

(0, 0)
(10, -1)
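The same `.` syntax reads and writes fields directly. It also hints at an answer to the opening question: a method is a function stored on the class, and `pt.move(10, -1)` is shorthand for calling that function with `pt` passed as `self`. A small sketch, repeating a trimmed copy of `Point` so that it runs on its own:

```python
class Point(object):
    def __init__(self):
        self.x = 0
        self.y = 0

    def move(self, xdist, ydist):
        self.x += xdist
        self.y += ydist

pt = Point()
pt.move(10, -1)

# Fields can be read and assigned with the dot syntax.
print(pt.x)
pt.y = 5

# Calling the method through the class, with the object passed as self,
# does the same thing as pt.move(1, 1).
Point.move(pt, 1, 1)
print((pt.x, pt.y))

# Built-in methods work the same way: this matches "abcdef".startswith("ab").
print(str.startswith("abcdef", "ab"))
```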


You probably won't need to write your own classes this quarter. Writing classes tends to be more useful for developing applications than for data analysis.

### Web APIs

An _application programming interface_ (API) is a set of functions and data structures for communicating with other software. For instance, whenever you use a Python package, you're using the API created by the package's developers.

Web sites sometimes provide an API so that programmers can access content without web scraping. A _representational state transfer_ (REST) API is a web API that uses different URLs for different functions. There are other kinds of web APIs, but REST APIs are popular and the easiest kind to use.

Spotify provides a REST API for getting information about artists and albums, with documentation [here](https://developer.spotify.com/web-api/). All of the functions start with the base URL:

```
https://api.spotify.com
```

The function name and arguments are appended to the end of the base URL. The function to search for an artist is at the URL:

```
https://api.spotify.com/v1/search
```

A _hypertext transfer protocol_ (HTTP) request to this URL searches for an artist. HTTP is also what your web browser uses to download web pages. [Several different kinds of HTTP requests are possible](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods). Spotify says that the artist search function should be used with a GET request.
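For a GET request, the arguments travel in the URL itself, as a query string after a `?`. The sketch below builds such a URL by hand with the standard library, only to show what gets sent. The search term `Lady Gaga` is an arbitrary example, and the `q` and `type` parameter names are the ones used by the wrapper function later in these notes.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

base_url = "https://api.spotify.com"
function_path = "/v1/search"

# A list of pairs keeps the parameters in a fixed order.
# urlencode also escapes characters such as spaces.
query = urlencode([("q", "Lady Gaga"), ("type", "artist")])

full_url = base_url + function_path + "?" + query
print(full_url)
```

In practice there is no need to assemble the string yourself, since `requests.get` accepts a `params` argument and does the encoding.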

The `requests` package has functions for making HTTP requests and is [well-documented](http://docs.python-requests.org/en/master/). Even [the Python documentation for urllib](https://docs.python.org/2/library/urllib.html) recommends using `requests` (rather than `urllib`) for most tasks.

In [2]:
import requests
# In terminal: conda install requests
import requests_cache
# In terminal: pip install requests_cache

# Set up a cache for requests.
requests_cache.install_cache("cache")

response = requests.get("http://jsharpna.github.io/")
# Throw an error if status isn't okay.
response.raise_for_status()

response.text # get text
response.content # get raw bytes

'<!DOCTYPE html>\n<!-- This page was generated by GitHub Pages using the Cayman theme by Jason Long. -->\n<html lang="en-us">\n  <head>\n    <meta charset="UTF-8">\n    <title>Sharpnack Lab</title>\n    <meta name="viewport" content="width=device-width, initial-scale=1">\n    <link rel="stylesheet" type="text/css" href="stylesheets/normalize.css" media="screen">\n    <link href=\'https://fonts.googleapis.com/css?family=Open+Sans:400,700\' rel=\'stylesheet\' type=\'text/css\'>\n    <link rel="stylesheet" type="text/css" href="stylesheets/stylesheet.css" media="screen">\n    <link rel="stylesheet" type="text/css" href="stylesheets/github-light.css" media="screen">\n        <style type="text/css">\n      .page-header {\n      background-image:url("stylesheets/waterwise.jpg");\n      color: #fff;\n      text-shadow: 0px 0px 4px #ccccff;\n      }\n    </style>\n  </head>\n  <body>\n    <section class="page-header">\n      <h1 class="project-name">UC Davis Sharpnack Lab</h1>\n      <h2 class
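The two attributes hold the same body in different forms: `content` is the bytes as they arrived, and `text` is those bytes decoded into characters. A minimal illustration without any network access, using a made-up byte string and assuming UTF-8 (the real encoding depends on the response):

```python
# Stand-in for response.content: raw bytes, here the UTF-8 encoding
# of the word "cafe" with an accented final e.
raw = b"caf\xc3\xa9"

# Stand-in for response.text: the same data decoded into characters.
text = raw.decode("utf-8")

# The accented letter takes two bytes but is a single character.
print(len(raw))
print(len(text))
```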

Let's write a Python function that acts as a wrapper for Spotify's search function.

In [42]:
def spotify_search(term, search_type = "artist", verbose = False):
    url = "https://api.spotify.com/v1/search"
    response = requests.get(url, params = {
        "q": term,
        "type": search_type
    })
    response.raise_for_status() # check for errors
    if verbose:
        print response.url

    return response.json() # parse JSON

Most of the time, it's a good strategy to "wrap" each web API function with a Python function. This lets you use the web API as if it's just another Python module.
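What the wrapper hands back is ordinary Python data: `response.json()` parses the JSON body into nested dicts and lists. The sketch below does the same parsing with the standard `json` module on a hand-written string. The payload is a heavily shortened, made-up imitation of a search result (the real response has many more fields), keeping the `artists` and `items` keys that the next cell relies on.

```python
import json

# Hypothetical, shortened stand-in for the body of a search response.
body = '{"artists": {"items": [{"name": "Queen", "popularity": 83}]}}'

result = json.loads(body)

# JSON objects become dicts and JSON arrays become lists.
items = result["artists"]["items"]
first = items[0]
print(first["name"])
print(first["popularity"])
```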

If you're using the web API to get data, the next step is to convert the data to a Pandas data frame:

In [45]:
import pandas as pd
artists = ("sgagegehffb", "Lady Gaga", "Migos", "Taylor Swift", "Slayer", "Radiohead", "Mr. Bungle", "Tame Impala", "Queen")

#results = [spotify_search(x)["artists"]["items"] for x in artists]
def search_artist(artist):
    items = spotify_search(artist, verbose = False)["artists"]["items"]
    # Check whether there were any search results.
    if len(items) == 0:
        # No results, so return the only data we have.
        return {"name": artist}
    
    return items[0]
  
# Build a data frame from a list of dicts.
pd.DataFrame([search_artist(x) for x in artists])

| | external_urls | followers | genres | href | id | images | name | popularity | type | uri |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | | | | | sgagegehffb | | | |
| 1 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 2418347, u'href': None} | [dance pop, pop, pop christmas, post-teen pop] | https://api.spotify.com/v1/artists/1HY2Jd0NmPu... | 1HY2Jd0NmPuamShAr6KMms | [{u'url': u'https://i.scdn.co/image/c2e26db97f... | Lady Gaga | 86.0 | artist | spotify:artist:1HY2Jd0NmPuamShAr6KMms |
| 2 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 735403, u'href': None} | [dwn trap, pop rap, rap, trap music] | https://api.spotify.com/v1/artists/6oMuImdp5Zc... | 6oMuImdp5ZcFhWP0ESe6mG | [{u'url': u'https://i.scdn.co/image/6f77fdcfcb... | Migos | 92.0 | artist | spotify:artist:6oMuImdp5ZcFhWP0ESe6mG |
| 3 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 5019778, u'href': None} | [dance pop, pop, pop christmas, post-teen pop] | https://api.spotify.com/v1/artists/06HL4z0CvFA... | 06HL4z0CvFAxyc27GXpf02 | [{u'url': u'https://i.scdn.co/image/8e985a4c3a... | Taylor Swift | 87.0 | artist | spotify:artist:06HL4z0CvFAxyc27GXpf02 |
| 4 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 505119, u'href': None} | [alternative metal, death metal, groove metal,... | https://api.spotify.com/v1/artists/1IQ2e1buppa... | 1IQ2e1buppatiN1bxUVkrk | [{u'url': u'https://i.scdn.co/image/8c81130db7... | Slayer | 67.0 | artist | spotify:artist:1IQ2e1buppatiN1bxUVkrk |
| 5 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 1954780, u'href': None} | [alternative rock, indie rock, melancholia, pe... | https://api.spotify.com/v1/artists/4Z8W4fKeB5Y... | 4Z8W4fKeB5YxbusRsdQVPb | [{u'url': u'https://i.scdn.co/image/afcd616e1e... | Radiohead | 80.0 | artist | spotify:artist:4Z8W4fKeB5YxbusRsdQVPb |
| 6 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 46726, u'href': None} | [alternative metal, experimental rock, funk me... | https://api.spotify.com/v1/artists/2zq0uqN9Wq1... | 2zq0uqN9Wq12tqrQQt1ozw | [{u'url': u'https://i.scdn.co/image/f108bec8a5... | Mr. Bungle | 44.0 | artist | spotify:artist:2zq0uqN9Wq12tqrQQt1ozw |
| 7 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 1064763, u'href': None} | [australian alternative rock, indie pop, indie... | https://api.spotify.com/v1/artists/5INjqkS1o8h... | 5INjqkS1o8h1imAzPqGZBb | [{u'url': u'https://i.scdn.co/image/11ea8d2291... | Tame Impala | 78.0 | artist | spotify:artist:5INjqkS1o8h1imAzPqGZBb |
| 8 | {u'spotify': u'https://open.spotify.com/artist... | {u'total': 3181223, u'href': None} | [album rock, classic rock, glam rock, hard roc... | https://api.spotify.com/v1/artists/1dfeR4HaWDb... | 1dfeR4HaWDbWqFHLkxsg1d | [{u'url': u'https://i.scdn.co/image/b040846ceb... | Queen | 83.0 | artist | spotify:artist:1dfeR4HaWDbWqFHLkxsg1d |
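Row 0 is almost empty because `search_artist` returned a dict with only a `name` key for the nonsense search term, and Pandas fills in a missing value for every column that a row's dict lacks. The plain-Python sketch below imitates that alignment step to make the behavior concrete. It is an illustration of the idea rather than how Pandas is implemented, and the two records are shortened stand-ins for the real results.

```python
records = [
    {"name": "sgagegehffb"},
    {"name": "Queen", "popularity": 83, "type": "artist"},
]

# Columns are the union of the keys found in any record.
columns = sorted(set(key for record in records for key in record))

# Each row takes the record's value where present and None where absent
# (Pandas would use NaN instead of None).
rows = [[record.get(column) for column in columns] for record in records]

print(columns)
for row in rows:
    print(row)
```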


Now let's write a function that wraps Spotify's album list function.

In [48]:
def spotify_albums(artist_id):
    url = "https://api.spotify.com/v1/artists/{id}/albums".format(id = artist_id)
    response = requests.get(url, params = {
        "album_type": "album",
        "market": "US"
    })
    response.raise_for_status()

    return response.json()

In [50]:
spotify_albums("1dfeR4HaWDbWqFHLkxsg1d") # Queen

{u'href': u'https://api.spotify.com/v1/artists/1dfeR4HaWDbWqFHLkxsg1d/albums?offset=0&limit=20&album_type=album&market=US',
 u'items': [{u'album_type': u'album',
   u'artists': [{u'external_urls': {u'spotify': u'https://open.spotify.com/artist/1dfeR4HaWDbWqFHLkxsg1d'},
     u'href': u'https://api.spotify.com/v1/artists/1dfeR4HaWDbWqFHLkxsg1d',
     u'id': u'1dfeR4HaWDbWqFHLkxsg1d',
     u'name': u'Queen',
     u'type': u'artist',
     u'uri': u'spotify:artist:1dfeR4HaWDbWqFHLkxsg1d'}],
   u'available_markets': [u'CA', u'US'],
   u'external_urls': {u'spotify': u'https://open.spotify.com/album/60TXSuzXQoEy3p5cQEkLu7'},
   u'href': u'https://api.spotify.com/v1/albums/60TXSuzXQoEy3p5cQEkLu7',
   u'id': u'60TXSuzXQoEy3p5cQEkLu7',
   u'images': [{u'height': 640,
     u'url': u'https://i.scdn.co/image/83bc8d97de6e16e901409ea9a9d18982bac472e7',
     u'width': 640},
    {u'height': 300,
     u'url': u'https://i.scdn.co/image/f6ee3edced72f92be8c947cd0d149b5a89d15d21',
     u'width': 300},
    {u
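The album result has the same nested shape, so pulling out a column's worth of values is a matter of walking the `items` list. A short sketch on a hand-trimmed copy of the output above, keeping only the `id` and `images` fields of the first album (the helper name `largest_image_url` is made up for this example):

```python
# Trimmed copy of the first album from the output above.
result = {
    "items": [
        {
            "id": "60TXSuzXQoEy3p5cQEkLu7",
            "images": [
                {"height": 640,
                 "url": "https://i.scdn.co/image/83bc8d97de6e16e901409ea9a9d18982bac472e7",
                 "width": 640},
                {"height": 300,
                 "url": "https://i.scdn.co/image/f6ee3edced72f92be8c947cd0d149b5a89d15d21",
                 "width": 300},
            ],
        }
    ]
}

def largest_image_url(album):
    # Pick the image with the greatest height; None if the album has no images.
    images = album["images"]
    if len(images) == 0:
        return None
    return max(images, key=lambda image: image["height"])["url"]

album_ids = [album["id"] for album in result["items"]]
print(album_ids)
print(largest_image_url(result["items"][0]))
```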

Next week we'll spend the first 10 minutes on how to merge the album data into the artists data frame, and then I'll take questions.