# Musicians - Easy

## Tables used in the musicians database

band(**band_no**, band_name, band_home, band_type, b_date, band_contact)

composer(**comp_no**, comp_is, comp_type)

composition(**c_no**, comp_date, c_title, c_in)

concert(**concert_no**, concert_venue, concert_in, con_date, concert_orgniser)

has_composed(**_cmpr_no_**, **_cmpn_no_**)

musician(**m_no**, m_name, born, died, born_in, living_in)

performance(**_pfrmnc_no_**, gave, performed, conducted_by, performed_in)

performer(**perf_no**, perf_is, instrument, perf_type)

place(**place_no**, place_town, place_country)

plays_in(**_player_**, **band_id**)

- **musician**

m_no | m_name | born | died | born_in | living_in
-:|-----------|------|------|--------:|-----:
1 | Fred Bloggs | 02/01/48 |  | 1 | 2
2 | John Smith | 03/03/50 |  | 3 | 4
3 | Helen Smyth | 08/08/48 |  | 4 | 5
4 | Harriet Smithson | 09/05/1909 | 20/09/80 | 5 | 6
5 | James First | 10/06/65 |  | 7 | 7
... | | | | | 

- **place**

place_no | place_town | place_country
--------:|------------|---------
1 | Manchester | England
2 | Edinburgh | Scotland
3 | Salzburg | Austria
4 | New York | USA
5 | Birmingham | England
... | | 

- **performer**

perf_no | perf_is | instrument | perf_type
-:|--:|--------|---------------------
1 | 2 | violin | classical
2 | 4 | viola | classical
3 | 6 | banjo | jazz
4 | 8 | violin | classical
5 | 12 | guitar | jazz
...| | |

- **composer**

comp_no | comp_is | comp_type
-:|--:|----------------
1 | 1 | jazz
2 | 3 | classical
3 | 5 | jazz
4 | 7 | classical
5 | 9 | jazz
... | |

- **band**

band_no | band_name | band_home | band_type | b_date | band_contact
-:|-----|--:|-----------|----------|----:
1 | ROP | 5 | classical | 30/01/01 | 11
2 | AASO | 6 | classical |  | 10
3 | The J Bs | 8 | jazz |  | 12
4 | BBSO | 9 | classical |  | 21
5 | The left Overs | 2 | jazz |  | 8
... | | | | |

- **plays_in**

player | band_id
-:|---:
1 | 1
1 | 7
3 | 1
4 | 1
4 | 7
... |

- **composition**

c_no | comp_date | c_title | c_in
-:|----------|--------|-------:
1 | 17/06/75 | Opus 1 | 1
2 | 21/07/76 | Here Goes | 2
3 | 14/12/81 | Valiant Knight | 3
4 | 12/01/82 | Little Piece | 4
5 | 13/03/85 | Simple Song | 5
... | | |

- **has_composed**

cmpr_no | cmpn_no
-:|---:
1 | 1
1 | 8
2 | 11
3 | 2
3 | 13
... |

- **concert**

concert_no | concert_venue | concert_in | con_date | concert_orgniser
-:|------------------|--:|----------|----:
1 | Bridgewater Hall | 1 | 06/01/95 | 21
2 | Bridgewater Hall | 1 | 08/05/96 | 3
3 | Usher Hall | 2 | 03/06/95 | 3
4 | Assembly Rooms | 2 | 20/09/97 | 21
5 | Festspiel Haus | 3 | 21/02/95 | 8
... | | | |

- **performance**

pfrmnc_no | gave | performed | conducted_by | performed_in
-:|--:|--:|---:|-----------------------:
1 | 1 | 1 | 21 | 1
2 | 1 | 3 | 21 | 1
3 | 1 | 5 | 21 | 1
4 | 1 | 2 | 1 | 2
5 | 2 | 4 | 21 | 2
... | | | |

The data model of the musicians database is shown below:

![Data model of the musicians database](../src/img/musician_str.svg)

In [1]:
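import findspark
import pandas as pd  # pandas must be installed for DataFrame.toPandas()
findspark.init()  # make the local Spark installation importable

from pyspark.sql import SparkSession
# Explicit imports avoid shadowing Python builtins, unlike `import *`
from pyspark.sql.functions import count, lit
from pyspark.sql import Window

SVR = '192.168.31.31'  # host used for both the Spark master and HDFS

# Note: despite the name `sc`, this object is a SparkSession
sc = (SparkSession.builder.appName('app16-1')
      .master(f'spark://{SVR}:7077')
      .config('spark.sql.warehouse.dir', f'hdfs://{SVR}:9000/user/hive/warehouse')
      .config('spark.cores.max', '4')
      .config('spark.executor.instances', '1')
      .config('spark.executor.cores', '2')
      .config('spark.executor.memory', '10g')
      .enableHiveSupport().getOrCreate())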



In [2]:
composer = sc.read.table('sqlzoo.composer')
composition = sc.read.table('sqlzoo.composition')
concert = sc.read.table('sqlzoo.concert')
has_composed = sc.read.table('sqlzoo.has_composed')
musician = sc.read.table('sqlzoo.musician')
performance = sc.read.table('sqlzoo.performance')
performer = sc.read.table('sqlzoo.performer')
place = sc.read.table('sqlzoo.place')
plays_in = sc.read.table('sqlzoo.plays_in')

## 1.
**Give the organiser's name of the concert in the Assembly Rooms after the first of Feb, 1997.**

In [3]:
(concert.filter((concert['concert_venue']=='Assembly Rooms') & 
                (concert['con_date']>'1997-02-01'))
 .join(musician, on=(concert['concert_orgniser']==musician['m_no']))
 .select('m_name')
 .toPandas())

                                                                                

| m_name |
|--------|
| James Steeple |
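
As a cross-check, the same question can be written in plain SQL. The sketch below runs it against an in-memory SQLite database that holds only the five concert rows from the sample table above, with dates rewritten as ISO strings so that text comparison matches date order. The mapping of musician 21 to James Steeple is an assumption inferred from the result above, not a row shown in the sample data.

```python
import sqlite3

# In-memory database holding only the rows needed for the illustration.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE musician (m_no INTEGER PRIMARY KEY, m_name TEXT);
CREATE TABLE concert (
    concert_no INTEGER PRIMARY KEY,
    concert_venue TEXT,
    concert_in INTEGER,
    con_date TEXT,            -- ISO 8601 text: string order equals date order
    concert_orgniser INTEGER  -- spelling as in the source schema
);
-- Concert rows copied from the sample table, dates rewritten as ISO strings.
INSERT INTO concert VALUES
    (1, 'Bridgewater Hall', 1, '1995-01-06', 21),
    (2, 'Bridgewater Hall', 1, '1996-05-08', 3),
    (3, 'Usher Hall',       2, '1995-06-03', 3),
    (4, 'Assembly Rooms',   2, '1997-09-20', 21),
    (5, 'Festspiel Haus',   3, '1995-02-21', 8);
-- Assumption: musician 21 is James Steeple (inferred from the result above).
INSERT INTO musician VALUES (3, 'Helen Smyth'), (21, 'James Steeple');
""")

rows = con.execute("""
    SELECT m_name
    FROM concert
    JOIN musician ON concert_orgniser = m_no
    WHERE concert_venue = 'Assembly Rooms'
      AND con_date > '1997-02-01'
""").fetchall()
print(rows)
```

Storing the dates as `YYYY-MM-DD` text is a choice made for this sketch; the `dd/mm/yy` strings in the sample tables would not sort correctly as plain text.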


## 2.
**Find all the performers who played guitar or violin and were born in England.**

In [4]:
(performer.filter(performer['instrument'].isin(['violin', 'guitar']))
 .join(musician, on=(performer['perf_is']==musician['m_no']))
 .join(place.filter(place['place_country']=='England'),
       on=(musician['born_in']==place['place_no']))
 .select('m_name')
 .orderBy('m_name')
 .toPandas())

| m_name |
|--------|
| Alan Fluff |
| Davis Heavan |
| Harriet Smithson |
| Harry Forte |
| James First |
| Theo Mengel |


## 3.
**List the names of musicians who have conducted concerts in USA together with the towns and dates of these concerts.**

In [5]:
(concert.join(place.filter(place['place_country']=='USA'),
              on=(concert['concert_in']==place['place_no']))
 .join(performance, on=(concert['concert_no']==performance['performed_in']))
 .join(musician, on=(performance['conducted_by']==musician['m_no']))
 .select('m_name', 'place_town', 'con_date')
 .distinct()
 .orderBy('m_name')
 .toPandas())

m_name | place_town | con_date
-------|------------|---------
James Steeple | New York | 1997-06-15
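
The `distinct()` call matters because a conductor appears once per performance, not once per concert. The sketch below shows the effect in plain SQL on an in-memory SQLite database built from the five sample performance rows and the first two sample concerts. Since none of the sample concerts is in the USA, the filter uses England instead; that substitution, the ISO date strings, and the mapping of musician 21 to James Steeple are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE place (place_no INTEGER PRIMARY KEY, place_town TEXT, place_country TEXT);
CREATE TABLE musician (m_no INTEGER PRIMARY KEY, m_name TEXT);
CREATE TABLE concert (concert_no INTEGER PRIMARY KEY, concert_in INTEGER, con_date TEXT);
CREATE TABLE performance (
    pfrmnc_no INTEGER PRIMARY KEY, conducted_by INTEGER, performed_in INTEGER
);
INSERT INTO place VALUES (1, 'Manchester', 'England'), (2, 'Edinburgh', 'Scotland');
-- Assumption: musician 21 is James Steeple.
INSERT INTO musician VALUES (1, 'Fred Bloggs'), (21, 'James Steeple');
INSERT INTO concert VALUES (1, 1, '1995-01-06'), (2, 1, '1996-05-08');
-- conducted_by / performed_in values taken from the sample performance rows.
INSERT INTO performance VALUES (1, 21, 1), (2, 21, 1), (3, 21, 1), (4, 1, 2), (5, 21, 2);
""")

QUERY = """
    SELECT {distinct} m_name, place_town, con_date
    FROM concert
    JOIN place ON concert_in = place_no
    JOIN performance ON performed_in = concert_no
    JOIN musician ON conducted_by = m_no
    WHERE place_country = 'England'
    ORDER BY m_name, con_date
"""

# Without DISTINCT there is one row per performance.
with_duplicates = con.execute(QUERY.format(distinct="")).fetchall()
# With DISTINCT there is one row per (conductor, town, date).
deduplicated = con.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(len(with_duplicates), len(deduplicated))
```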


## 4.
**How many concerts have featured at least one composition by Andy Jones? List concert date, venue and the composition's title.**

In [6]:
(concert.join(composition, on=(concert['concert_no']==composition['c_in']))
 .join(has_composed, on=(composition['c_no']==has_composed['cmpn_no']))
 .join(composer, on=(has_composed['cmpr_no']==composer['comp_no']))
 .join(musician.filter(musician['m_name']=='Andy Jones'), 
       on=(composer['comp_is']==musician['m_no']))
 .select('con_date', 'concert_venue', 'c_title')
 .toPandas())

con_date | concert_venue | c_title
---------|---------------|--------
1996-05-08 | Bridgewater Hall | A Simple Piece
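
The query above links `concert` to `composition` through `c_in`. Judging by the other `*_in` columns (`born_in`, `living_in`, `concert_in`), `c_in` may instead identify the place where a piece was composed; this reading is an assumption, as is the idea that `performance.performed` refers to a composition. Under those assumptions, the link from a composition to a concert would run through `performance`. The sketch below compares both join paths in plain SQL on an in-memory SQLite database whose rows are entirely invented, so it only shows that the two paths can disagree, not what the real data returns.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE musician (m_no INTEGER PRIMARY KEY, m_name TEXT);
CREATE TABLE composer (comp_no INTEGER PRIMARY KEY, comp_is INTEGER);
CREATE TABLE composition (c_no INTEGER PRIMARY KEY, c_title TEXT, c_in INTEGER);
CREATE TABLE has_composed (cmpr_no INTEGER, cmpn_no INTEGER);
CREATE TABLE concert (concert_no INTEGER PRIMARY KEY, concert_venue TEXT, con_date TEXT);
CREATE TABLE performance (
    pfrmnc_no INTEGER PRIMARY KEY, performed INTEGER, performed_in INTEGER
);
-- All rows below are invented for illustration only.
INSERT INTO musician VALUES (1, 'Andy Jones');
INSERT INTO composer VALUES (1, 1);
INSERT INTO composition VALUES (10, 'Example Piece', 1);  -- composed in place 1
INSERT INTO has_composed VALUES (1, 10);
INSERT INTO concert VALUES (1, 'Hall A', '1995-01-01'), (2, 'Hall B', '1996-01-01');
INSERT INTO performance VALUES (1, 10, 2);                 -- performed at concert 2
""")

# Path 1 (assumed): composition -> performance -> concert.
via_performance = con.execute("""
    SELECT con_date, concert_venue, c_title
    FROM musician
    JOIN composer     ON comp_is = m_no
    JOIN has_composed ON cmpr_no = comp_no
    JOIN composition  ON c_no = cmpn_no
    JOIN performance  ON performed = c_no
    JOIN concert      ON concert_no = performed_in
    WHERE m_name = 'Andy Jones'
""").fetchall()

# Path 2 (as in the cell above): composition.c_in matched against concert_no.
via_c_in = con.execute("""
    SELECT con_date, concert_venue, c_title
    FROM musician
    JOIN composer     ON comp_is = m_no
    JOIN has_composed ON cmpr_no = comp_no
    JOIN composition  ON c_no = cmpn_no
    JOIN concert      ON concert_no = c_in
    WHERE m_name = 'Andy Jones'
""").fetchall()

print(via_performance)
print(via_c_in)
```

If the data model diagram confirms that `c_in` points at `place`, the first path is the one to port back to PySpark.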


## 5.
**List the different instruments played by the musicians and the average number of musicians who play each instrument.**

In [7]:
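# One row per performer entry, with the musician's details attached
t = (performer.join(musician, on=(performer['perf_is']==musician['m_no'])))
# Crosstab: one row per instrument, one column per musician, counts as cells
(t.groupBy('instrument')
 .pivot('m_name')
 .agg(count('m_no'))
 # Append a total row labelled 'ALL' (entries per musician across instruments)
 .union(t.withColumn('instrument', lit('ALL'))
        .groupBy('instrument')
        .pivot('m_name')
        .agg(count('m_no')))
 # Append a total column 'ALL' (entries per instrument, plus the grand total)
 .join(t.groupBy('instrument')
       .agg(count('m_no').alias('ALL'))
       .union(t.withColumn('instrument', lit('ALL'))
              .groupBy('instrument')
              .agg(count('m_no').alias('ALL'))),
       on='instrument')
 .orderBy('instrument')
 # Musician/instrument pairs with no entry come out of the pivot as null
 .fillna(0)
 .toPandas())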

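instrument | Alan Fluff | Davis Heavan | Elsie James | Harriet Smithson | Harry Forte | Helen Smyth | James First | James Quick | Jeff Dawn | John Smith | Louise Simpson | Sue Little | Theo Mengel | ALL
-----------|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:|--:
ALL | 2 | 3 | 3 | 2 | 3 | 1 | 1 | 2 | 2 | 3 | 3 | 1 | 3 | 29
banjo | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1
bass | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 3
cello | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 3
clarinet | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
cornet | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
drums | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2
flute | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 2
guitar | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2
horn | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1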

In [8]:
sc.stop()