## Remodel the data to search albums by genre
We would like another view of the data that we can query by genre. With Spark, it is easy to move data from one table to another.

### First create a table called albums_by_genre


In [140]:
%%Cql create table if not exists music.albums_by_genre(album_genre text,
album_title text, performer text, album_year int, primary key(album_genre, album_title, album_year))

### Create an RDD of `Album` objects based on the existing table

In [141]:
case class Album (album_title: String,
                   album_genre: Option[String],
                   performer: String,
                   album_year: Int)

In [142]:
// Read only the album-level columns and map each row onto the Album case class
val albums = sc.cassandraTable("music","tracks_by_album").select("album_title","album_genre","performer","album_year").as(Album)

In [143]:
albums.first

### Some genres are null, so let's map null to the string `<null>`
First, let's confirm that some genres really are null:


In [144]:
albums.filter(a => a.album_genre == None).count

Now use pattern matching to convert `None` to `Some("<null>")`.
Every case class has a `copy` method: we pass named arguments for only the fields that change, and all other fields are carried over unchanged.

In [145]:
val nonull_albums = albums.map( a => a.copy( album_genre = a.album_genre match {
   case None => Some("<null>")
   case _ => a.album_genre }
   ))
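
The same `copy` idea can be tried without Spark or Cassandra. The following is a minimal sketch on a plain Scala `List`; the second sample row is made up for illustration, and `orElse` is used here as a shorter alternative to the pattern match, not as the notebook's own method.

```scala
// Stand-in for the notebook's Album case class (same fields).
case class Album(album_title: String,
                 album_genre: Option[String],
                 performer: String,
                 album_year: Int)

// Sample rows for illustration only; the second one is hypothetical.
val sample = List(
  Album("Arrival", Some("Disco"), "ABBA", 1976),
  Album("Untitled Demo", None, "Unknown Artist", 2000)
)

// copy takes named arguments for the fields to replace; all other fields are kept.
// orElse leaves Some(genre) untouched and substitutes Some("<null>") for None.
val cleaned = sample.map(a => a.copy(album_genre = a.album_genre.orElse(Some("<null>"))))

cleaned.foreach(println)
```

Because `copy` returns a new instance, the original `sample` list is left untouched, which is the same property that makes the `map` over the RDD safe.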

### Save it to the new table
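- We are relying on Cassandra's upsert behavior to de-duplicate the data: rows written with the same primary key (`album_genre`, `album_title`, `album_year`) overwrite one another, so each album ends up stored once.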

In [147]:
nonull_albums.saveToCassandra("music","albums_by_genre",SomeColumns("album_genre","album_title","performer", "album_year"))
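
To see why writing repeated rows de-duplicates them, here is a small in-memory analogy in plain Scala. It is only a sketch: a `Map` keyed by the primary key columns stands in for the Cassandra table, and `AlbumRow`, `trackLevelRows`, and `table` are hypothetical names with illustrative rows.

```scala
// One row per track, as in a track-level table: the album columns repeat.
case class AlbumRow(album_genre: String, album_title: String, performer: String, album_year: Int)

// Illustrative rows; "Arrival" appears twice, as it would for two tracks.
val trackLevelRows = List(
  AlbumRow("Disco", "Arrival", "ABBA", 1976),
  AlbumRow("Disco", "Arrival", "ABBA", 1976),
  AlbumRow("Disco", "ABBA", "ABBA", 1975)
)

// Key each row by (album_genre, album_title, album_year), the table's primary key.
// updated overwrites any existing entry for the same key, much like an upsert.
val table = trackLevelRows.foldLeft(Map.empty[(String, String, Int), AlbumRow]) { (acc, row) =>
  acc.updated((row.album_genre, row.album_title, row.album_year), row)
}

println(table.size)
```

The last write for a given key wins, so if two rows shared a key but differed in `performer`, only one of them would survive.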

## View the results

In [149]:
%%Cql select * from music.albums_by_genre limit 5

| album_genre | album_title | album_year | performer |
|---|---|---|---|
| Disco | ABBA | 1975.0 | ABBA |
| Disco | ABBA - Live | 1986.0 | ABBA |
| Disco | ABBA Gold - Greatest Hits | 1992.0 | ABBA |
| Disco | Arrival | 1976.0 | ABBA |
| Disco | Disco Fox Party Classics | 2005.0 | C.C. Catch |
