# Association Rules from Nine Inch Nails Setlists
We extract association rules for the songs that Nine Inch Nails played in the same setlist.

First, we load the library for association rule mining.

See also, on converting a data frame to transactions for arules: https://stackoverflow.com/questions/17313450/how-to-convert-data-frame-to-transactions-for-arules

In [43]:
library(arules)

# Beware: this turns off warnings (do this only once you have checked everything).
# options(warn=0) will turn them on again.
options(warn=-1)

Next, we load the setlists we downloaded from setlist.fm.

In [44]:
trans = read.transactions("NINSetListBasket.txt", format="basket", sep=";")
inspect(head(trans))

    items                        
[1] {1,000,000,                  
     Came Back Haunted,          
     Closer,                     
     Copy of A,                  
     Disappointed,               
     Eraser,                     
     Find My Way,                
     Gave Up,                    
     Head Like a Hole,           
     March of the Pigs,          
     Piggy,                      
     Sanctified,                 
     Terrible Lie,               
     The Great Destroyer,        
     The Hand That Feeds,        
     Wish}                       
[2] {1,000,000,                  
     Came Back Haunted,          
     Closer,                     
     Copy of A,                  
     Disappointed,               
     Eraser,                     
     Find My Way,                
     Gave Up,                    
     Head Like a Hole,           
     March of the Pigs,          
     Reptile,                    
     Sanctified,                 
     Terrible 
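
For reference, the `basket` format read above stores one transaction per line, with items separated by the chosen delimiter (here `;`, which is convenient because a title such as `1,000,000` contains commas). The following is a minimal base-R sketch of that layout, assuming a made-up two-line file rather than the real `NINSetListBasket.txt`, whose raw contents are not shown here. It only illustrates the file shape and does not reproduce what `read.transactions` does internally.

```r
# Hypothetical two-setlist file in "basket" layout: one setlist per line,
# songs separated by ";". The song titles appear in the output above, but
# these two lines are invented for illustration.
path <- tempfile(fileext = ".txt")
writeLines(c("Wish;Closer;Hurt",
             "Closer;Head Like a Hole"), path)

# Split each line on ";" to get one character vector of songs per setlist.
baskets <- strsplit(readLines(path), ";", fixed = TRUE)

length(baskets)    # number of setlists (transactions)
lengths(baskets)   # number of songs in each setlist
```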

Let's derive some frequent itemsets, with a minimum support of 0.5 and at least two items per itemset.

In [45]:
fi <- apriori(trans, parameter = list(support=0.5, minlen=2, target="frequent"))

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen
         NA    0.1    1 none FALSE            TRUE       5     0.5      2
 maxlen            target   ext
     10 frequent itemsets FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 471 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[311 item(s), 943 transaction(s)] done [0.00s].
sorting and recoding items ... [7 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [23 set(s)] done [0.00s].
creating S4 object  ... done [0.00s].


How many frequent itemsets did we find?

In [46]:
length(fi)

Let's check the top 10 by support.

In [47]:
inspect(head(sort(fi,by = "support"),10))

     items                                        support   count
[1]  {Head Like a Hole,March of the Pigs}         0.7126193 672  
[2]  {Gave Up,Head Like a Hole}                   0.6415695 605  
[3]  {Gave Up,March of the Pigs}                  0.6373277 601  
[4]  {Gave Up,Head Like a Hole,March of the Pigs} 0.6044539 570  
[5]  {Closer,March of the Pigs}                   0.6033934 569  
[6]  {Closer,Head Like a Hole}                    0.6033934 569  
[7]  {Head Like a Hole,Terrible Lie}              0.6023330 568  
[8]  {Closer,Head Like a Hole,March of the Pigs}  0.5938494 560  
[9]  {Head Like a Hole,Hurt}                      0.5885472 555  
[10] {Head Like a Hole,Wish}                      0.5715801 539  
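
The `support` column is the `count` divided by the number of transactions (943, according to the `apriori` log above), so the figures can be checked by hand. A small sketch using values copied from the first two rows of the table; the variable names are only illustrative:

```r
n_transactions <- 943   # "943 transaction(s)" in the apriori log

# Absolute counts copied from the first two rows of the table
counts <- c("Head Like a Hole + March of the Pigs" = 672,
            "Gave Up + Head Like a Hole"           = 605)

# Relative support = absolute count / number of transactions
support <- counts / n_transactions
round(support, 7)
```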


Now let's go for some rules and check how many we find.

In [48]:
ar <- apriori(trans, parameter = list(support=0.5, confidence=0.7, target="rules"))
length(ar)

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen
        0.7    0.1    1 none FALSE            TRUE       5     0.5      1
 maxlen target   ext
     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 471 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[311 item(s), 943 transaction(s)] done [0.00s].
sorting and recoding items ... [7 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 done [0.00s].
writing ... [55 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
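
The `support` and `confidence` thresholds passed to `apriori` can be illustrated on a toy example. The sketch below computes both measures by hand in base R on four invented setlists. The helper names `support` and `confidence` are hypothetical and not part of `arules`, and this is not how `apriori` searches for rules, only what the two numbers mean.

```r
# Four invented setlists (not real data), each a character vector of songs.
setlists <- list(
  c("Closer", "Hurt", "Wish"),
  c("Closer", "Hurt"),
  c("Closer", "Wish"),
  c("Hurt")
)

# Fraction of setlists that contain every song in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Confidence of lhs => rhs: support of lhs and rhs together,
# divided by the support of lhs alone.
confidence <- function(lhs, rhs, baskets) {
  support(c(lhs, rhs), baskets) / support(lhs, baskets)
}

support(c("Closer", "Hurt"), setlists)    # 2 of the 4 setlists
confidence("Closer", "Hurt", setlists)    # 2 of the 3 setlists with "Closer"
```

A rule is kept only when both values reach the thresholds given in `parameter`.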


What are the top rules by confidence?

In [49]:
ar_by_confidence <- sort(ar, by = "confidence")
inspect(head(ar_by_confidence))

    lhs                    rhs                   support confidence     lift count
[1] {Gave Up,                                                                     
     Hurt,                                                                        
     March of the Pigs} => {Head Like a Hole}  0.5037116  0.9937238 1.058849   475
[2] {Hurt,                                                                        
     March of the Pigs} => {Head Like a Hole}  0.5524920  0.9923810 1.057418   521
[3] {Gave Up,                                                                     
     Hurt}              => {Head Like a Hole}  0.5302227  0.9920635 1.057080   500
[4] {Closer,                                                                      
     Gave Up,                                                                     
     Head Like a Hole}  => {March of the Pigs} 0.5291622  0.9861660 1.315353   499
[5] {Closer,                                                                      
    

Let's relax the constraints with a lower support and confidence, and rank the rules by lift to look for the "surprise" factor.
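
Lift is a rule's confidence divided by the support of its right-hand side, so a value near 1 means the left-hand side tells us little beyond how often the right-hand song is played anyway. As a rough check, the figures reported earlier for `{Hurt, March of the Pigs} => {Head Like a Hole}` can be plugged into that definition to recover the implied support of the right-hand side:

```r
# Values copied from the confidence-sorted output above
conf <- 0.9923810
lift <- 1.057418
n_transactions <- 943   # from the apriori log

# lift = confidence / support(rhs), hence:
rhs_support <- conf / lift

rhs_support                    # share of setlists containing "Head Like a Hole"
rhs_support * n_transactions   # implied number of such setlists
```

Because "Head Like a Hole" appears in the large majority of setlists, even rules with very high confidence for it have a lift only slightly above 1.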

In [50]:
ar <- apriori(trans, parameter = list(support=0.3, confidence=0.5, target="rules"))
ar_by_lift <- sort(ar, by = "lift")

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen
        0.5    0.1    1 none FALSE            TRUE       5     0.3      1
 maxlen target   ext
     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 282 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[311 item(s), 943 transaction(s)] done [0.00s].
sorting and recoding items ... [15 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 4 5 6 done [0.00s].
writing ... [478 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].


In [51]:
inspect(head(ar_by_lift,10))

     lhs                    rhs                     support confidence     lift count
[1]  {Gave Up,                                                                       
      Head Like a Hole,                                                              
      Wish}              => {The Hand That Feeds} 0.3170732  0.6704036 1.832436   299
[2]  {Gave Up,                                                                       
      Head Like a Hole,                                                              
      March of the Pigs,                                                             
      Wish}              => {The Hand That Feeds} 0.3096501  0.6651481 1.818071   292
[3]  {Gave Up,                                                                       
      Wish}              => {The Hand That Feeds} 0.3191941  0.6323529 1.728431   301
[4]  {Gave Up,                                                                       
      March of the Pigs,                              