## Spotify Query

In this notebook we run some queries on our Spotify property graph. The queries are divided into three parts:
1. ***Example queries***: two example queries that show how to use the newly added information, such as *record label* and *instruments*.
1. ***Italian tracks and Italian artists from 2017 to 2020:*** queries about the Italian tracks and artists present in the Top 100 Italy.
1. ***Italian tracks abroad:*** we want to discover whether Italian tracks are also listened to outside Italy.

In [None]:
# required libraries
import pandas as pd
import os
from pathlib import Path
import datetime

### Connection to Neo4j

In [None]:
# Neo4J params class
class Neo4jParams:
  def __init__(self, user, psw, dbname, dbpsw, uri):
    self.user = user
    self.psw = psw
    self.dbname = dbname
    self.dbpsw = dbpsw
    self.uri = uri

In [None]:
#DB parameters
user="neo4j"
psw="neo4j"
dbname="SpotifyDB"
dbpsw="SpotifyDB"
uri = "bolt://localhost:7687"

params = Neo4jParams(user,psw,dbname,dbpsw,uri)

In [None]:
from neo4j import GraphDatabase

# test class

class Driver:

    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self.driver.close()

    def print_greeting(self, message):
        with self.driver.session() as session:
            greeting = session.write_transaction(self._create_and_return_greeting, message)
            print(greeting)

    @staticmethod
    def _create_and_return_greeting(tx, message):
        result = tx.run("CREATE (a:Greeting) "
                        "SET a.message = $message "
                        "RETURN a.message + ', from node ' + id(a)", message=message)
        return result.single()[0]


if __name__ == "__main__":
    greeter = Driver(params.uri, params.user, params.dbpsw)
    greeter.print_greeting("hello, world")
    greeter.close()

## Queries

#### Example query
1. Show artists signed to the same record label

2. Show the most commonly played instrument in rock groups

#### Italian tracks and Italian artists from 2017 to 2020
1. On average, how many tracks by Italian artists are present in the Top 100 for each year? (bar chart)

1. Who is the artist with the highest number of tracks in the Top 100 Italy for each year? (artist names)

1. How many different Italian artists enter the Top 100 Italy at least once in each year? (bar chart)

1. On average, how many tracks by Italian artists are present in the Top 100 in each month? (line chart)

1. Show the top 3 artists with the most tracks in the Top 100 at the same time (artist names)

#### Italian tracks abroad
1. How many tracks by Italian artists are present in the Top 100 of a different country? (bar chart)

1. Show the top 3 countries that listen the most to Italian tracks (show names and numbers)

1. Who is the artist with the most tracks present in the Top 100 of a different country? (show name)

### Example Queries

#### Query 1 

#### Query 2

### Italian Tracks and Italian Artists

#### On average, how many tracks by Italian artists are present in the Top 100 for each year? (bar chart)

In [None]:
# On average, how many tracks by Italian artists are present in the Top 100 for each year? (bar chart)

# connect to the DB
driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# create a session
session = driver.session()

result = session.run("""
    MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(ch:Chart)-[:isReferredTo]->(c2:Country{id:"IT"})
    WITH ch,ch.date.year AS year, COUNT(DISTINCT t) as numTracks
    RETURN year,avg(numTracks)
    ORDER BY year
""")

for r in result:
    returnedData = r.values()
    print("Year: {}".format(returnedData[0]))
    print("avgNumItalianTracks: {}".format(returnedData[1]))
    print("")

session.close()
driver.close()
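
The connect, run, print, close sequence in the cell above recurs in every query cell of this notebook. A minimal sketch of how it could be factored out is shown here. `run_query` is a hypothetical helper, not part of the notebook or of the Neo4j driver; it only assumes that the object passed in behaves like the `driver` created above (a `session()` method whose result works as a context manager and has `run`).

```python
def run_query(driver, query, **parameters):
    """Run a Cypher query and return every row as a plain Python list.

    Hypothetical helper: `driver` is expected to behave like the object
    returned by GraphDatabase.driver(...) in the cells above.
    """
    # The with-block closes the session even if the query raises.
    with driver.session() as session:
        result = session.run(query, **parameters)
        # Materialise the rows before the session is closed.
        return [record.values() for record in result]


# Usage sketch, assuming `GraphDatabase` and `params` from the cells above:
# driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# rows = run_query(driver, "MATCH (c:Country {id: $cid}) RETURN c.id", cid="IT")
# driver.close()
```

Returning plain lists keeps the printing loops of the later cells unchanged, since each row can still be indexed like `returnedData`.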

#### How many tracks by Italian artists were released from 2017 to 2020

In [None]:
# How many tracks by Italian artists were released from 2017 to 2020

# connect to the DB
driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# create a session
session = driver.session()

result = session.run("""
    MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPartOf]->(alb:Album)
    WHERE alb.releaseDate.year >= 2017 AND alb.releaseDate.year <= 2020
    WITH alb.releaseDate.year AS year, COUNT(DISTINCT t) as numTracks
    RETURN year,numTracks
    ORDER BY year
""")

for r in result:
    returnedData = r.values()
    print("Year: {}".format(returnedData[0]))
    print("numItalianTracks: {}".format(returnedData[1]))
    print("")

session.close()
driver.close()


#### How many different Italian artists enter the Top 100 Italy at least once in each year? (bar chart)

In [None]:
# How many different Italian artists enter the Top 100 Italy at least once in each year?

# connect to the DB
driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# create a session
session = driver.session()

result = session.run("""
    MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(ch:Chart)-[:isReferredTo]->(c2:Country{id:"IT"})
    WITH ch.date.year AS year, COUNT(DISTINCT a) as numArtists
    RETURN year,numArtists
    ORDER BY year
""")

for r in result:
    returnedData = r.values()
    print("Year: {}".format(returnedData[0]))
    print("numItalianArtist: {}".format(returnedData[1]))
    print("")

session.close()
driver.close()


#### Ratio of numItalianTracks to numItalianArtists

In [None]:
years = [2017,2018,2019,2020]
numItalianTracks = [171,324,330,310]
numItalianArtists = [75,99,112,125]

for i in range(0,len(years)):
    print("Year: {}".format(years[i]))
    print("Ratio: {:.2f}".format(numItalianTracks[i]/numItalianArtists[i]))
    print("")
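
The cell above copies the yearly counts by hand into parallel lists. As a rough sketch, the same ratio could be computed directly from rows shaped like the `[year, value]` lists the query cells print. `rows_to_dict` and `tracks_per_artist` are hypothetical helper names, and the rows below simply restate the numbers from the cell above.

```python
def rows_to_dict(rows):
    """Turn [[year, value], ...] rows into a {year: value} dict."""
    return {year: value for year, value in rows}


def tracks_per_artist(track_rows, artist_rows):
    """Return {year: tracks / artists} for the years present in both inputs."""
    tracks = rows_to_dict(track_rows)
    artists = rows_to_dict(artist_rows)
    return {
        year: tracks[year] / artists[year]
        for year in sorted(tracks.keys() & artists.keys())
        if artists[year]  # skip a year with zero artists to avoid dividing by zero
    }


# Values copied from the cell above.
track_rows = [[2017, 171], [2018, 324], [2019, 330], [2020, 310]]
artist_rows = [[2017, 75], [2018, 99], [2019, 112], [2020, 125]]

ratios = tracks_per_artist(track_rows, artist_rows)
for year, ratio in ratios.items():
    print("Year: {}".format(year))
    print("Ratio: {:.2f}".format(ratio))
    print("")
```

Keying both results by year avoids a silent mismatch if one query returns a year that the other does not.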

#### On average, how many tracks by Italian artists are present in the Top 100 in each month? (line chart)

In [None]:
# On average, how many tracks by Italian artists are present in the Top 100 in each month?

# connect to the DB
driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# create a session
session = driver.session()

result = session.run("""
    MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(ch:Chart)-[:isReferredTo]->(c2:Country{id:"IT"})
    WITH ch,ch.date.month AS month, COUNT(DISTINCT t) as numTracks
    RETURN month,avg(numTracks)
    ORDER BY month
""")

for r in result:
    returnedData = r.values()
    print("Month: {}".format(returnedData[0]))
    print("avgNumItalianTracks: {}".format(returnedData[1]))
    print("")

session.close()
driver.close()


#### Who is the artist with the highest number of tracks in the Top 100 Italy for each year? (artist names)


In [None]:
# Who is the artist with the highest number of tracks in the Top 100 Italy for each year? (artist names)
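
The cell for this question is empty in the notebook. One possible approach, offered only as a sketch, reuses the MATCH pattern of the other cells to count distinct tracks per artist and year, then keeps the best artist of each year in Python. `TOP_ARTIST_QUERY` and `top_artist_per_year` are hypothetical names, and the sample rows are made-up placeholders rather than results from the database.

```python
# Same pattern as the other cells, grouped by year and artist name.
TOP_ARTIST_QUERY = """
    MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(ch:Chart)-[:isReferredTo]->(c2:Country{id:"IT"})
    WITH ch.date.year AS year, a.name AS artist, COUNT(DISTINCT t) AS numTracks
    RETURN year, artist, numTracks
    ORDER BY year, numTracks DESC
"""


def top_artist_per_year(rows):
    """Keep, for each year, the (artist, numTracks) pair with the most tracks.

    `rows` is a list of [year, artist, numTracks] lists. On a tie the
    artist seen first is kept.
    """
    best = {}
    for year, artist, num_tracks in rows:
        if year not in best or num_tracks > best[year][1]:
            best[year] = (artist, num_tracks)
    return dict(sorted(best.items()))


# Placeholder rows for illustration only (not real query output).
sample_rows = [
    [2017, "Artist A", 5],
    [2017, "Artist B", 9],
    [2018, "Artist A", 4],
]
for year, (artist, num_tracks) in top_artist_per_year(sample_rows).items():
    print("Year: {}".format(year))
    print("Artist: {}".format(artist))
    print("numTracks: {}".format(num_tracks))
    print("")
```

With a live connection, the rows would come from running `TOP_ARTIST_QUERY` through `session.run` as in the other cells and collecting `r.values()` for each record.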


#### Show the top 3 artists with the most tracks in the Top 100 at the same time (artist names)

In [None]:
# Show the top 3 artists with the most tracks in the Top 100 at the same time (artist names)

# connect to the DB
driver = GraphDatabase.driver(params.uri, auth=(params.user, params.dbpsw))
# create a session
session = driver.session()

result = session.run("""
MATCH (c1:Country{id:"IT"})<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(ch:Chart)-[:isReferredTo]->(c2:Country{id:"IT"})
WITH a,ch, COUNT(DISTINCT t) as numTracks
ORDER BY numTracks DESC
LIMIT 3
RETURN a,ch,numTracks
""")

for r in result:
    returnedData = r.values()
    print("Artist: {}".format(returnedData[0]["name"]))
    print("Chart: {}".format(returnedData[1]["id"]))
    print("numTracks: {}".format(returnedData[2]))
    print("")
session.close()
driver.close()



### Italian tracks abroad
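
The questions in this section all come down to counting distinct tracks by Italian artists per foreign country. A tentative sketch of that aggregation follows. It assumes the same node labels, relationship names and `id` property used in the other queries of this notebook; `ABROAD_QUERY` and `top_countries` are hypothetical names, and the sample rows are placeholders, not real results.

```python
# One row per foreign country with the number of distinct Italian tracks charted there.
ABROAD_QUERY = """
MATCH (c1:Country)<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(c:Chart)-[:isReferredTo]->(c2:Country)
WHERE c1.id = "IT" AND c2.id <> "IT"
RETURN c2.id AS country, COUNT(DISTINCT t) AS numTracks
"""


def top_countries(rows, n=3):
    """Return the n [country, numTracks] rows with the highest track count."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return ranked[:n]


# Placeholder rows for illustration only (country codes are invented).
sample_rows = [["XX", 4], ["YY", 9], ["ZZ", 1], ["WW", 6]]
for country, num_tracks in top_countries(sample_rows):
    print("Country: {}".format(country))
    print("numTracks: {}".format(num_tracks))
    print("")
```

The full list of rows would feed the bar chart, while `top_countries` answers the top 3 question from the same result.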

#### How many tracks by Italian artists are present in the Top 100 of a different country? (bar chart)

In [None]:
# How many tracks by Italian artists are present in the Top 100 of a different country? (bar chart)
"""
MATCH (c1:Country)<-[:hasNationality]-(p:Person)-[:isMemberOf]->(a:Artist)-[:partecipateIn]->(t:Track)-[:isPositionedIn]->(c:Chart)-[:isReferredTo]->(c2:Country)
WHERE c1.id="IT" AND c2.id<>"IT"
RETURN a,t,c,c2
"""