# Spotify Workflow Lab

### Introduction

In this lesson, we'll practice setting up a Prefect workflow by making use of the Spotify API.  Let's get started.

### Signing up for Spotify

Before we get going, we'll need to get a `client_id` and `client_secret` from Spotify.  The first step is to go to the [Spotify developer dashboard](https://developer.spotify.com/dashboard) and create a new app.

When you create the new app, you'll be asked to fill in a name and a callback URL.  We won't be using the callback URL, so you can just provide `jigsawlabs.io`, or any other website you prefer.

> **What's it for?** If an end user were logging in to Spotify through our web page, Spotify would redirect them to the callback URL after they logged in.

When the new app is created, you can click on that app, and then click on Settings in the top right.

<img src="./app-settings.png">

From there, we'll see the `client_id`, and can expand the `client_secret`.

<img src="./client-id.png">

> You'll need to click on `View client secret` to view the secret.

### Introducing the Spotipy Library

Ok, so to connect to the Spotify API, we'll be using the `Spotipy` library.  You can see that we specified that library in the `requirements.txt` file. 

So let's first create our Python environment with the following. 

* `python -m venv ./venv`
* `source ./venv/bin/activate`

And then run the following to install the required dependencies.

`pip install -r requirements.txt`

And we make use of the library with something like the following.

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
credentials_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
client = spotipy.Spotify(client_credentials_manager=credentials_manager)
```

So you can see that the `client_id` and `client_secret` are passed into the `SpotifyClientCredentials`.  

And from there, we can retrieve items like so.

```python
client.playlist_items(playlist_id)
```

### Looking at our codebase

So now it's time for you to connect your `client_id` and `client_secret` to the existing codebase.

If you look at the codebase, inside the `spotify_extractor` folder, you'll see that there is already a `.env` file, and a `settings.py` file that imports from the `.env` file.

Then we import the variables from the `settings.py` file into the `spotify_client` file.
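
To make the `.env` to `settings.py` handoff concrete, here is a minimal sketch that uses only the standard library.  The helper names `load_env_file` and `get_setting`, and the variable names `CLIENT_ID` and `CLIENT_SECRET`, are assumptions for illustration.  The `settings.py` in the codebase may well do this differently (for example with a dotenv library), so treat this as a picture of the idea rather than the actual file.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, env_values):
    """Prefer a real environment variable, then fall back to the .env value."""
    return os.environ.get(name, env_values.get(name))


# Assumed variable names; use whatever names your own .env file defines.
env_values = load_env_file(".env")
client_id = get_setting("CLIENT_ID", env_values)
client_secret = get_setting("CLIENT_SECRET", env_values)
```

Keeping the credentials in `.env` means they stay out of the Python source, and every other file only has to import from `settings.py`.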

If you look at the `console.py` file, you can get a sense of how the pipeline will work.


```python
# console.py

playlist_id = "37i9dQZEVXbLRQDuF5jeBp"
playlist_tracks = get_playlist_tracks(playlist_id)
# selected_tracks = extract_tracks_info(playlist_tracks, playlist_id)
# write_to_csv(selected_tracks)
```

So we pull data from the playlist, then extract the relevant information, and then write the tracks to csv.
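
For the last step, writing a list of dictionaries to csv can be sketched with the standard library's `csv.DictWriter`.  The function name `write_rows_to_csv` and its `path` argument are hypothetical, chosen so this is not mistaken for the `write_to_csv` function in the codebase, which may have a different signature.

```python
import csv
from pathlib import Path


def write_rows_to_csv(rows, path):
    """Write a list of dictionaries to a CSV file, one row per dictionary."""
    if not rows:
        return None  # nothing to write
    path = Path(path)
    # Create the target folder (for example ./data) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        # The keys of the first dictionary become the header row.
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path
```

This sketch assumes every dictionary has the same keys; `DictWriter` raises a `ValueError` if a later row contains a key that is not in the header.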

### Pulling our data

Ok, so now run the `console.py` file, and look at the first element in `playlist_tracks`.

It may look like a lot, but remember that we can get a sense of the dictionary by using the `.keys()` method.
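
As a quick illustration of that technique, here is a trimmed, made-up record shaped roughly like one playlist element.  The values are invented and most fields are left out, so the keys you see in your own output will be a longer list.

```python
# A trimmed, made-up record; the real one has many more fields.
sample_item = {
    "added_at": "2024-01-01T00:00:00Z",
    "track": {
        "id": "track-123",
        "name": "Example Song",
        "album": {"name": "Example Album", "total_tracks": 10},
        "artists": [{"name": "Example Artist"}],
    },
}

# Walk down one level at a time instead of reading the whole record at once.
print(list(sample_item.keys()))                    # ['added_at', 'track']
print(list(sample_item["track"].keys()))           # ['id', 'name', 'album', 'artists']
print(list(sample_item["track"]["album"].keys()))  # ['name', 'total_tracks']
```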

If we look closely, we'll see that each playlist record has information about the track itself (track_id, name, etc), the track's album (album name, total tracks), and the artist (artist name).

Instead of pulling down all of this information, let's just create a list of dictionaries that will have:

* `playlist_id`
* `current_date`
* `track_id`
* `ranking`

> **Why just those attributes?** The idea is to just pull the information related to a `top_chart_listing`.  And then, we can separately retrieve additional information related to the track by reading through all of the `track_ids` and going to the tracks endpoint to pull down the information, and create a csv of tracks.  And we can do the same for the artist and album.

> This keeps us from duplicating information, such as storing the track name again every time the track appears on the chart.
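
To see why this helps, here is a small sketch with made-up listings.  The ids and dates are invented; the point is only that a track that charts on several days shows up in several listings, but only needs to be looked up once.

```python
# Made-up chart listings: the same track can appear on more than one date.
listings = [
    {"playlist_id": "p1", "current_date": "2024-01-01", "track_id": "t1", "ranking": 1},
    {"playlist_id": "p1", "current_date": "2024-01-01", "track_id": "t2", "ranking": 2},
    {"playlist_id": "p1", "current_date": "2024-01-02", "track_id": "t1", "ranking": 1},
]

# A set drops the repeats, so each track would be fetched only once.
unique_track_ids = sorted({listing["track_id"] for listing in listings})
print(unique_track_ids)  # ['t1', 't2']
```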

So let's get going.  We wrote a test in the `tests` folder.  All you need to do is get the test for this one function to pass, and then we'll move on to Prefect.

We'll even help you out a little bit.

* `playlist_id`: "37i9dQZEVXbLRQDuF5jeBp"
* `current_date`: Generate this from Python
* `track_id`: 
* `ranking`:  This is generated from the `index` in the list of tracks
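
Two of these hints can be sketched on a toy list without giving away the exercise.  The list of names below is made up, and whether the ranking should start at 0 or 1 is an assumption here, so check what the test in the `tests` folder expects.

```python
from datetime import date

# Today's date as an ISO string, in the form YYYY-MM-DD.
current_date = date.today().isoformat()

names = ["first", "second", "third"]
# enumerate yields (index, element) pairs; start=1 makes the first position 1.
ranked = [
    {"name": name, "ranking": index}
    for index, name in enumerate(names, start=1)
]
print(ranked[0])  # {'name': 'first', 'ranking': 1}
```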

### Moving to Prefect

Ok, so now that we have this code working without Prefect, all we need to do is use Prefect to trigger these functions.

Add the Prefect code to the `spotify_extractor` file.  The file has placeholders for three tasks, and one flow.  We even filled in the first task for you.

> We call `adapter.get_playlist_tracks` to make clear that this function is defined in the adapter file, while still letting the Prefect task have the same name.

Once the code is filled in, make sure there is nothing in the `./data` folder and then test out the code.  Remember that we can run our Prefect flow simply by running the Python script it's defined in.

You can confirm that the code worked by seeing that a new file is created in the `./data` folder.
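
If you'd rather check the folder from Python than from a file browser, a small helper like the one below works.  The name `list_data_files` is hypothetical and not part of the codebase.

```python
from pathlib import Path


def list_data_files(data_dir="./data"):
    """Return the names of the files currently in the data folder, sorted."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []  # no folder yet means no output files
    return sorted(p.name for p in folder.iterdir() if p.is_file())


# Expect an empty list before the flow runs, and the new file afterwards.
print(list_data_files())
```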

### Viewing the server

Ok, so now let's see our flow run history.  You can view your flow runs by starting a local Prefect server with `prefect server start` and opening the dashboard URL it prints in your browser.

Or if you have already connected your computer to Prefect Cloud, you can just log in to prefect.io.

> If you have not logged in, you can do so now by typing `prefect cloud login` into your terminal.

Ok, once you log in, you should see the `extract-and-write` flow in your dashboard.

<img src="./flows-dashboard.png">

And if you click on the flow, you can see the various flow runs.  And if you click on the flow run, you can see the individual task runs.

### Summary

In this lesson we developed a workflow in Prefect.  Notice that our Prefect code was relatively thin: each Prefect task just calls a single function in our codebase.

In the discussion that follows we'll talk about why we did it this way.  But you should come up with your own reasons.  Try to answer the following questions.

1. What are the benefits of isolating our Prefect code from the rest of our codebase?

2. What are some things to consider when deciding how many tasks to break up our flow into?  Why not just have one giant task?