The following query matches record labels with recording places by their shared area IDs. A left join is used so that every label appears in the results, whether or not there are any places in its area; places whose area has no label are not included. Each label is paired with every place in its area, so a label appears once per matching place. Only labels with a non-null comment are included.

In [2]:
%%bigquery
select p.string_field_5 as p_area, p.string_field_2 as place_name,
        l.string_field_11 as l_area, l.string_field_2 as label_name, l.string_field_12 as l_comment
from musicbrainz_staging.Label as l 
left join musicbrainz_staging.Place as p
on l.string_field_11 = p.string_field_5
where l.string_field_12 is not null
limit 12

Unnamed: 0,p_area,place_name,l_area,label_name,l_comment
0,10,El Quincho,10,Revista Caras,Celebrities Magazine
1,10,Digisound Mastering,10,Revista Caras,Celebrities Magazine
2,10,Pichón Digital,10,Revista Caras,Celebrities Magazine
3,10,The Garden Mastering,10,Revista Caras,Celebrities Magazine
4,10,Colombia Records,10,Revista Caras,Celebrities Magazine
5,10,Conservatorio Nacional de Música,10,Revista Caras,Celebrities Magazine
6,10,El Quincho,10,Le Musique,Le Musique S.R.L.
7,10,Digisound Mastering,10,Le Musique,Le Musique S.R.L.
8,10,Pichón Digital,10,Le Musique,Le Musique S.R.L.
9,10,The Garden Mastering,10,Le Musique,Le Musique S.R.L.
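
The left-join behavior described above can be reproduced on a small scale. The sketch below is only an illustration: it uses an in-memory SQLite database instead of BigQuery, and the `label` and `place` tables, their columns, and their rows are made up for the example.

```python
import sqlite3

# Toy stand-ins for the Label and Place tables; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table label (name text, area text);
    create table place (name text, area text);
    insert into label values ('Label A', '10'), ('Label B', '10'), ('Label C', '99');
    insert into place values ('Studio X', '10'), ('Studio Y', '10'), ('Studio Z', '42');
""")

# Same shape as the notebook query: labels on the left, places on the right.
rows = conn.execute("""
    select l.name, p.name
    from label as l
    left join place as p on l.area = p.area
    order by l.name, p.name
""").fetchall()

for row in rows:
    print(row)
```

Labels A and B each pair with both studios in area 10, Label C is kept with an empty place, and Studio Z never appears because no label shares its area.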


The query below identifies the releases in the Release table that do not have a corresponding release group in the Release_Group table. Originally, rel_gr_id was supposed to be a foreign key that referred back to the release group ID primary key. However, during data ingestion the rel_gr_ids shown in these results were omitted from the Release_Group primary key, thus breaking the PK-FK relationship.

In [5]:
%%bigquery
select rel.int64_field_4 as rel_gr_id, rel.string_field_2 as rel_name,
        rg.int64_field_0 as gr_id, rg.string_field_2 as rel_group_name
from musicbrainz_staging.Release as rel
left join musicbrainz_staging.Release_Group as rg
on rel.int64_field_4 = rg.int64_field_0
where rg.int64_field_0 is null
limit 12

Unnamed: 0,rel_gr_id,rel_name,gr_id,rel_group_name
0,2290703,和自己對話,,
1,28818,Starci a klarinety,,
2,1099186,"The Royal Edition, Volume 15: ""On the Town"" Da...",,
3,1270242,3 Originals,,
4,1270337,2 Great Pop Classics,,
5,891881,Winter Songs,,
6,1186279,Saint-Saëns: Great Orchestral Showpieces,,
7,1105774,"CBS Great Performances, Volume 21: ""Eroica"" Sy...",,
8,315634,"Live at the Star-Club, Hamburg",,
9,668970,Coffee Cantata / Peasant Cantata,,
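
The pattern used here (a left join followed by an `is null` filter on the right-hand key) is a common way to find orphaned foreign keys. A minimal sketch on toy data, assuming an in-memory SQLite database and hypothetical table and column names rather than the staging schema:

```python
import sqlite3

# Hypothetical miniature of Release and Release_Group; no FK constraint is declared,
# which mirrors the broken PK-FK relationship described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table release_group (id integer primary key, name text);
    create table release (id integer primary key, name text, release_group_id integer);
    insert into release_group values (1, 'Group One');
    insert into release values (100, 'Release with group', 1),
                               (101, 'Orphaned release', 2);
""")

# Keep only the releases whose release_group_id found no match on the right side.
orphans = conn.execute("""
    select rel.id, rel.name, rel.release_group_id
    from release as rel
    left join release_group as rg on rel.release_group_id = rg.id
    where rg.id is null
""").fetchall()

print(orphans)
```

Only the release that points at the missing group ID 2 is returned; the release whose group exists is filtered out.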


The following query attempts to mitigate the problem exposed by the prior query. It reuses the same left join and null filter to find releases that do not have a corresponding release group, then tries to pair each such release with the artist that made it, and joins that result with the release groups in which the artist has participated (rg_2). The aim is a collection of release groups that may be loosely associated with each release that is missing its group. This result can then be used in later analysis to see if the release originated in a particular geographic area or recording studio. Note that, as written, the query joins Artist on the same column it uses for the release group ID (int64_field_4), and it selects the release group columns from rg rather than rg_2, so those columns are empty in the output below.

In [9]:
%%bigquery
select rel.int64_field_4 as rel_gr_id, rel.string_field_2 as rel_name, rel.int64_field_4 as rel_artist_id,
        rg.int64_field_0 as gr_id, rg.string_field_2 as rel_group_name, rg.int64_field_3 as rg_artist_id,
        a.int64_field_0 as artist_id, a.string_field_2 as artist_name
from musicbrainz_staging.Release as rel
    left join musicbrainz_staging.Release_Group as rg
        on rel.int64_field_4 = rg.int64_field_0
    join musicbrainz_staging.Artist as a
        on rel.int64_field_4 = a.int64_field_0
    join musicbrainz_staging.Release_Group as rg_2
        on rg_2.int64_field_3 = a.int64_field_0
where rg.int64_field_0 is null
limit 12

Unnamed: 0,rel_gr_id,rel_name,rel_artist_id,gr_id,rel_group_name,rg_artist_id,artist_id,artist_name
0,1270242,3 Originals,1270242,,,,1270242,Martin van den Berg
1,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
2,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
3,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
4,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
5,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
6,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
7,1186279,Saint-Saëns: Great Orchestral Showpieces,1186279,,,,1186279,KYO〜YA
8,1105774,"CBS Great Performances, Volume 21: ""Eroica"" Sy...",1105774,,,,1105774,Georg Obermayer
9,315634,"Live at the Star-Club, Hamburg",315634,,,,315634,Lawineboys
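
The intended chain of joins (orphaned release, then its artist, then that artist's other release groups) can be sketched on toy data. This is an illustration under assumptions: it runs on in-memory SQLite, and the schema, including a separate `artist_id` column on the release table, is hypothetical and not taken from the staging tables.

```python
import sqlite3

# Hypothetical schema: release has its own artist_id, distinct from release_group_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table artist (id integer primary key, name text);
    create table release_group (id integer primary key, name text, artist_id integer);
    create table release (id integer primary key, name text,
                          artist_id integer, release_group_id integer);
    insert into artist values (7, 'Artist Seven'), (8, 'Artist Eight');
    insert into release_group values (1, 'Known Group A', 7),
                                     (2, 'Known Group B', 7),
                                     (3, 'Other Group', 8);
    insert into release values (100, 'Orphaned release', 7, 999),
                               (101, 'Fine release', 8, 3);
""")

# rg detects the missing group; rg_2 supplies the artist's other groups,
# so the selected group columns come from rg_2.
candidates = conn.execute("""
    select rel.name, a.name, rg_2.name
    from release as rel
    left join release_group as rg on rel.release_group_id = rg.id
    join artist as a on rel.artist_id = a.id
    join release_group as rg_2 on rg_2.artist_id = a.id
    where rg.id is null
    order by rg_2.id
""").fetchall()

for row in candidates:
    print(row)
```

The orphaned release is listed once per release group credited to its artist, which gives the candidate groups the description above is aiming for.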


This query finds Australian artists by joining the Area ID to the area foreign key in the Artist table (area ID 13 is Australia). The results are ordered by the span between each artist's begin and end years, longest first.

In [11]:
%%bigquery
select art.int64_field_0 as id, art.string_field_2 as name, art.string_field_4 as begin_date, art.string_field_7 as end_date,
area.string_field_2 as place
from musicbrainz_modeled.Artist as art
join musicbrainz_modeled.Area as area on area.int64_field_0 = safe_cast(art.string_field_11 as int64)
where area.int64_field_0 = 13
order by (safe_cast(art.string_field_7 as int64) - safe_cast(art.string_field_4 as int64)) DESC
limit 12

Unnamed: 0,id,name,begin_date,end_date,place
0,1216097,Linda Phillips,1899,2002,Australia
1,1552359,Howard Leyton-Brown,1918,2017,Australia
2,1484148,Kurt Jensen,1913,2011,Australia
3,815522,Graeme Bell,1914,2012,Australia
4,1239546,Norman Erskine,1913,2010,Australia
5,478915,Merv Lilley,1919,2016,Australia
6,1455187,Mirrie Hill,1889,1986,Australia
7,1211056,Mary Gilmore,1865,1962,Australia
8,500803,P.L. Travers,1899,1996,Australia
9,1419254,Esther Rofe,1904,2000,Australia
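
The ordering depends on casting text dates to integers and tolerating values that do not cast. The sketch below mimics that logic in plain Python: `safe_int` is a hypothetical helper standing in for `safe_cast`, three rows are copied from the output above, and the last row is invented to show a missing end date.

```python
def safe_int(text):
    """Return int(text), or None when the value is missing or not a valid integer."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


artists = [
    ("Mary Gilmore", "1865", "1962"),
    ("Esther Rofe", "1904", "2000"),
    ("Linda Phillips", "1899", "2002"),
    ("Hypothetical Artist", "1950", None),  # invented row with no end date
]


def lifespan(artist):
    """End year minus begin year, or None if either value fails to cast."""
    begin, end = safe_int(artist[1]), safe_int(artist[2])
    return None if begin is None or end is None else end - begin


# Longest span first; rows with an unknown span are placed last.
ranked = sorted(artists, key=lambda a: (lifespan(a) is None, -(lifespan(a) or 0)))

for artist in ranked:
    print(artist[0], lifespan(artist))
```

Rows whose dates cannot be cast produce no span at all rather than an error, which is why they need an explicit position in the sort.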


This query finds recordings by artists whose begin date is after the year 2000, ordered by recording ID.

In [19]:
%%bigquery
select art.int64_field_0 as art_id, art.string_field_2 as name, art.string_field_4 as begin_date, r.string_field_2 as recording, r.int64_field_0 as id
from musicbrainz_modeled.Artist as art
join musicbrainz_modeled.Recording as r
on art.int64_field_0 = r.int64_field_3
where safe_cast(art.string_field_4 as int64) > 2000
order by r.int64_field_0
limit 12

Unnamed: 0,art_id,name,begin_date,recording,id
0,1060498,Fly Golden Eagle,2007,Deserted Soldier,6818
1,1060499,Polarbeers,2007,Ye Rambling Boys of Pleasure,6819
2,1301452,Harsh,2011,Raida,23957
3,1568366,Archætype,2017,The Magick Bird of Chomo-Lung-Ma,24047
4,1698379,Beachcomber,2017,Mosquito Machine,25308
5,102556,Slumber,2002,Calling All Cars,29576
6,844005,Fixion,2001,Storm (On & On),33276
7,844005,Fixion,2001,Missing You,33278
8,98844,The Wombats,2003,Utter Frustration,40851
9,997289,Absvrdist,2011,Kill City,41232


This query finds places in the United States (area ID 222), ordered by address in descending order. It may be useful to exclude addresses that are not specific enough, for example those that do not contain numbers. Although the query uses a right join, the filter on the area ID drops places that have no matching area, so those are excluded.

In [49]:
%%bigquery
select a.int64_field_0 as a_id, a.string_field_2 as area,
p.int64_field_0 as p_id, p.string_field_2 as place, p.string_field_4 as address
from musicbrainz_modeled.Area as a
right join musicbrainz_modeled.Place as p on safe_cast(p.string_field_5 as int64) = a.int64_field_0
where a.int64_field_0 = 222
order by p.string_field_4 DESC
limit 12

Unnamed: 0,a_id,area,p_id,place,address
0,222,United States,32890,REDCAT Center,"Walt Disney Concert Hall, 631 W 2nd St, Los An..."
1,222,United States,14429,Thrill Hill Recording,USA
2,222,United States,41612,Bristol Sessions Recordings,"State Street, Bristol, Tennessee"
3,222,United States,35394,Mantra Studios,"San Mateo, CA"
4,222,United States,41346,Oregon State Penitentiary,"Salem, Oregon"
5,222,United States,33461,RB Productions,"Pacific Grove, CA"
6,222,United States,33785,The Barn,"North Salem, NY"
7,222,United States,24093,Annandale Recording,New Jersey
8,222,United States,30849,New Jersey City University,"Jersey City, New Jersey"
9,222,United States,31649,Asylomar Productions,"Huntington Beach, CA"
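
The idea of dropping addresses that contain no numbers can be sketched as a simple digit check. This is one possible filter, not part of the notebook: the helper name `contains_digit` is hypothetical, and the rows are copied from the output above (the truncated address is kept exactly as displayed).

```python
import re

# (place, address) pairs taken from the result table above.
places = [
    ("Thrill Hill Recording", "USA"),
    ("Bristol Sessions Recordings", "State Street, Bristol, Tennessee"),
    ("Mantra Studios", "San Mateo, CA"),
    ("REDCAT Center", "Walt Disney Concert Hall, 631 W 2nd St, Los An..."),
]


def contains_digit(address):
    """True when the address has at least one digit; missing addresses count as False."""
    return bool(re.search(r"\d", address or ""))


specific = [name for name, address in places if contains_digit(address)]
print(specific)
```

A digit check is a rough proxy for specificity. In BigQuery a similar test could probably be expressed in the `where` clause with `regexp_contains`, though that variant has not been run against these tables.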
