# Self join

## Edinburgh Buses
[Details of the database](https://sqlzoo.net/wiki/Edinburgh_Buses.). The database has two tables:

```
stops(id, name)
route(num, company, pos, stop)
```

In [1]:
library(tidyverse)
library(DBI)
library(getPass)
drv <- switch(Sys.info()['sysname'],
             Windows="PostgreSQL Unicode(x64)",
             Darwin="/usr/local/lib/psqlodbcw.so",
             Linux="PostgreSQL")
con <- dbConnect(
  odbc::odbc(),
  driver = drv,
  Server = "localhost",
  Database = "sqlzoo",
  UID = "postgres",
  PWD = getPass("Password?"),
  Port = 5432
)
options(repr.matrix.max.rows=20)

-- Attaching packages --------------------------------------- tidyverse 1.3.0 --

v ggplot2 3.3.0     v purrr   0.3.4
v tibble  3.0.1     v dplyr   0.8.5
v tidyr   1.0.2     v stringr 1.4.0
v readr   1.3.1     v forcats 0.5.0

-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()





## 1.
How many **stops** are in the database?

In [2]:
stops <- dbReadTable(con, 'stops')
route <- dbReadTable(con, 'route')

In [3]:
stops %>% tally

n
<int>
246


## 2.
Find the **id** value for the stop 'Craiglockhart'.

In [4]:
stops %>% 
    filter(name=='Craiglockhart') %>% 
    select(id)

id
<int>
53


## 3.
Give the **id** and the **name** for the **stops** on the '4' 'LRT' service.

In [5]:
stops %>% 
    inner_join(route, by=c(id="stop")) %>%
    filter(num=='4' & company=='LRT') %>%
    select(id, name)

id,name
<int>,<chr>
19,Bingham
53,Craiglockhart
85,Fairmilehead
115,Haymarket
117,Hillend
149,London Road
177,Northfield
179,Oxgangs
194,Princes Street


## 4. Routes and stops

The query shown gives the number of routes that visit either London Road (149) or Craiglockhart (53). Run the query and notice the two services that link these stops have a count of 2. Add a HAVING clause to restrict the output to these two routes.

In [6]:
route %>% 
    filter(stop==149 | stop==53) %>%
    group_by(company, num) %>% 
    summarise(n_route=n()) %>%
    filter(n_route==2)

company,num,n_route
<chr>,<chr>,<int>
LRT,4,2
LRT,45,2
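
The `group_by()` / `summarise()` / `filter()` chain above plays the role of `GROUP BY ... HAVING` in the SQL version of the exercise. A minimal sketch of that correspondence follows; `toy_route` is made-up data that only mimics the shape of the `route` table, not the real sqlzoo rows.

```r
library(dplyr)

# Toy route table (made-up rows, NOT the real sqlzoo data).
toy_route <- data.frame(
  num     = c("4", "4", "45", "45", "10", "27"),
  company = "LRT",
  stop    = c(53L, 149L, 53L, 149L, 53L, 149L),
  stringsAsFactors = FALSE
)

both_stops <- toy_route %>%
  filter(stop %in% c(53L, 149L)) %>%   # WHERE stop IN (53, 149)
  group_by(company, num) %>%           # GROUP BY company, num
  summarise(n_route = n()) %>%         # COUNT(*)
  ungroup() %>%
  filter(n_route == 2)                 # HAVING COUNT(*) = 2

print(both_stops)
```

In this toy data only services 4 and 45 visit both stops, so they are the only rows that survive the final `filter()`. A `filter()` placed before `summarise()` acts on rows (like `WHERE`); placed after, it acts on groups (like `HAVING`).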


## 5.
Execute the self join shown and observe that b.stop gives all the places you can get to from Craiglockhart, without changing routes. Change the query so that it shows the services from Craiglockhart to London Road.

In [7]:
route %>% 
    inner_join(route, by=c(company="company", num="num")) %>%
    filter(stop.x==53 & stop.y==149) %>%
    select(company, num, stop.x, stop.y)

company,num,stop.x,stop.y
<chr>,<chr>,<int>,<int>
LRT,4,53,149
LRT,45,53,149
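
To see why the self join produces `stop.x` and `stop.y`, here is a small sketch on made-up data. `toy_route` and the `.from` / `.to` suffixes are illustrative assumptions; the notebook itself relies on dplyr's default `.x` / `.y` suffixes.

```r
library(dplyr)

# Toy route table (made-up rows): service 4 visits stops 53, 115 and 149;
# service 10 visits stops 53 and 230.
toy_route <- data.frame(
  num     = c("4", "4", "4", "10", "10"),
  company = "LRT",
  stop    = c(53L, 115L, 149L, 53L, 230L),
  stringsAsFactors = FALSE
)

# Joining the table to itself on (company, num) pairs every stop on a service
# with every stop on the same service. The non-key column `stop` appears
# twice, so it is disambiguated with suffixes.
pairs <- toy_route %>%
  inner_join(toy_route, by = c("company", "num"), suffix = c(".from", ".to"))

# Everything reachable from stop 53 without changing service.
reachable <- pairs %>%
  filter(stop.from == 53L)

# Services that go from stop 53 to stop 149.
direct <- reachable %>%
  filter(stop.to == 149L) %>%
  select(company, num, stop.from, stop.to)

print(direct)
```

Recent dplyr versions may warn about a many-to-many relationship here; that is expected for a self join of this kind, since each service matches several of its own rows.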


## 6.
The query shown is similar to the previous one; however, by joining two copies of the **stops** table we can refer to **stops** by **name** rather than by number. Change the query so that the services between 'Craiglockhart' and 'London Road' are shown. If you are tired of these places, try 'Fairmilehead' against 'Tollcross'.

In [8]:
route %>% 
    inner_join(stops, by=c(stop="id")) %>% 
    inner_join(route %>%
               inner_join(stops, by=c(stop="id")),
               by=c(company="company", num="num")
              ) %>%
    filter(name.x=='Craiglockhart' &
           name.y=='London Road') %>%
    select(company, num, name.x, name.y)

company,num,name.x,name.y
<chr>,<chr>,<chr>,<chr>
LRT,4,Craiglockhart,London Road
LRT,45,Craiglockhart,London Road
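
The same nested join of `route` and `stops` is needed whenever stops are referred to by name. One possible way to avoid retyping it is to attach the names once and wrap the self join in a function. The sketch below uses made-up toy tables, and `services_between()` is a hypothetical helper, not part of the original notebook.

```r
library(dplyr)

# Toy tables (made-up rows that only mirror the stops/route schema).
toy_stops <- data.frame(
  id   = c(53L, 149L, 230L),
  name = c("Craiglockhart", "London Road", "Tollcross"),
  stringsAsFactors = FALSE
)
toy_route <- data.frame(
  num     = c("4", "4", "10", "10"),
  company = "LRT",
  pos     = c(1L, 2L, 1L, 2L),
  stop    = c(53L, 149L, 53L, 230L),
  stringsAsFactors = FALSE
)

# Attach stop names once.
route_named <- toy_route %>%
  inner_join(toy_stops, by = c(stop = "id"))

# Hypothetical helper: services that visit both named stops.
services_between <- function(named, from, to) {
  named %>%
    inner_join(named, by = c("company", "num")) %>%  # self join per service
    filter(name.x == from, name.y == to) %>%
    distinct(company, num)
}

res <- services_between(route_named, "Craiglockhart", "London Road")
print(res)
```

With the real tables, `route_named` would be built from `route` and `stops` in the same way, and the helper could then be called with any pair of stop names.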


## 7. [Using a self join](https://sqlzoo.net/wiki/Using_a_self_join)

Give a list of all the services which connect stops 115 and 137 ('Haymarket' and 'Leith').

In [9]:
route %>% 
    inner_join(route, by=c(company="company", num="num")) %>%
    filter(stop.x==115 & stop.y==137) %>%
    distinct(company, num)

company,num
<chr>,<chr>
LRT,12
LRT,2
LRT,22
LRT,25
LRT,2A
SMT,C5


## 8.
Give a list of the services which connect the stops 'Craiglockhart' and 'Tollcross'.

In [10]:
route %>% 
    inner_join(stops, by=c(stop="id")) %>% 
    inner_join(route %>%
               inner_join(stops, by=c(stop="id")),
               by=c(company="company", num="num")
              ) %>%
    filter(name.x=='Craiglockhart' & 
           name.y=='Tollcross') %>%
    distinct(company, num)

company,num
<chr>,<chr>
LRT,10
LRT,27
LRT,45
LRT,47


## 9.
Give a distinct list of the **stops** which may be reached from 'Craiglockhart' by taking one bus, including 'Craiglockhart' itself, offered by the LRT company. Include the company and bus no. of the relevant services.

In [11]:
route %>% 
    inner_join(stops, by=c(stop="id")) %>% 
    inner_join(route %>%
               inner_join(stops, by=c(stop="id")),
               by=c(company="company", num="num")
              ) %>%
    filter(name.x=='Craiglockhart' &
           company=='LRT') %>%
    distinct(name.y, company, num)

name.y,company,num
<chr>,<chr>,<chr>
Colinton,LRT,10
Craiglockhart,LRT,10
Leith,LRT,10
Leith Walk,LRT,10
Muirhouse,LRT,10
Newhaven,LRT,10
Princes Street,LRT,10
Silverknowes,LRT,10
Tollcross,LRT,10
Torphin,LRT,10


## 10.
Find the routes involving two buses that can go from **Craiglockhart** to **Lochend**.
Show the bus no. and company for the first bus, the name of the stop for the transfer,
and the bus no. and company for the second bus.

> _Hint_    
> Self-join twice to find buses that visit Craiglockhart and Lochend, then join those on matching stops.

In [12]:
bus1 <- route %>%
    inner_join(stops, by=c(stop="id")) %>% 
    inner_join(route %>%
               inner_join(stops, by=c(stop="id")),
               by=c(company="company", num="num")
              ) %>%
    filter(name.x=='Craiglockhart')
bus2 <- route %>%
    inner_join(stops, by=c(stop="id")) %>% 
    inner_join(route %>%
               inner_join(stops, by=c(stop="id")),
               by=c(company="company", num="num")
              ) %>%
    filter(name.y=='Lochend')
bus1 %>% 
    inner_join(bus2, by=c(stop.y="stop.x")) %>%
    # Rename while selecting instead of overwriting names() afterwards
    select(num1=num.x, company1=company.x, transfer=name.y.x,
           num2=num.y, company2=company.y)

num1,company1,transfer,num2,company2
<chr>,<chr>,<chr>,<chr>,<chr>
10,LRT,Leith,34,LRT
10,LRT,Leith,35,LRT
10,LRT,Leith,87,LRT
10,LRT,Leith,C5,SMT
10,LRT,Princes Street,65,LRT
10,LRT,Princes Street,C5,SMT
27,LRT,Canonmills,34,LRT
27,LRT,Canonmills,35,LRT
27,LRT,Crewe Toll,20,LRT
4,LRT,Haymarket,65,LRT
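
The hint (self join twice, then join on the matching stop) can be traced on a tiny made-up network. Everything below is toy data and the column names are illustrative assumptions: service 4 runs Craiglockhart to Haymarket, and service 65 runs Haymarket to Lochend.

```r
library(dplyr)

# Toy tables (made-up ids and rows, NOT the real sqlzoo data).
toy_stops <- data.frame(
  id   = c(1L, 2L, 3L),
  name = c("Craiglockhart", "Haymarket", "Lochend"),
  stringsAsFactors = FALSE
)
toy_route <- data.frame(
  num     = c("4", "4", "65", "65"),
  company = "LRT",
  stop    = c(1L, 2L, 2L, 3L),
  stringsAsFactors = FALSE
)

# All (from, to) stop pairs served by a single bus.
named <- toy_route %>%
  inner_join(toy_stops, by = c(stop = "id"))
legs <- named %>%
  inner_join(named, by = c("company", "num"), suffix = c("_from", "_to"))

# First bus: leaves Craiglockhart; keep where it can drop you off.
first_leg <- legs %>%
  filter(name_from == "Craiglockhart") %>%
  select(num1 = num, company1 = company,
         transfer_id = stop_to, transfer = name_to)

# Second bus: arrives at Lochend; keep where it can pick you up.
second_leg <- legs %>%
  filter(name_to == "Lochend") %>%
  select(transfer_id = stop_from, num2 = num, company2 = company)

# The transfer stop is where the two legs meet.
transfers <- first_leg %>%
  inner_join(second_leg, by = "transfer_id") %>%
  select(num1, company1, transfer, num2, company2)

print(transfers)
```

Renaming the columns before the final join keeps the result free of stacked suffixes such as `name.y.x`, at the cost of two extra `select()` calls.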


In [13]:
dbDisconnect(con)