# Information Systems for Engineers Fall 2022 - Cheat Sheet

During the exam, you will be required to write SQL queries using a Jupyter notebook.

This notebook helps you get started: it provides an environment with the datasets already loaded and a few sample queries you can use to recap SQL syntax.

Feel free to extend this notebook and use it to prepare your exam answers. Note that the content of this notebook will not be graded.

## SQL

There is a local PostgreSQL 14.5 installation with a dataset loaded into a database. Run the next cell to connect to it.

In [1]:
%load_ext sql
%sql  postgresql://postgres:example@db 

To list the tables currently loaded in the database, run:

In [2]:
%%sql

SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE' AND TABLE_CATALOG = 'postgres' AND TABLE_SCHEMA = 'public';

 * postgresql://postgres:***@db
39 rows affected.


table_catalog,table_schema,table_name,table_type,self_referencing_column_name,reference_generation,user_defined_type_catalog,user_defined_type_schema,user_defined_type_name,is_insertable_into,is_typed,commit_action
postgres,public,categories,BASE TABLE,,,,,,YES,NO,
postgres,public,customers,BASE TABLE,,,,,,YES,NO,
postgres,public,nwemployees,BASE TABLE,,,,,,YES,NO,
postgres,public,employeeterritories,BASE TABLE,,,,,,YES,NO,
postgres,public,order_details,BASE TABLE,,,,,,YES,NO,
postgres,public,orders,BASE TABLE,,,,,,YES,NO,
postgres,public,products,BASE TABLE,,,,,,YES,NO,
postgres,public,region,BASE TABLE,,,,,,YES,NO,
postgres,public,shippers,BASE TABLE,,,,,,YES,NO,
postgres,public,suppliers,BASE TABLE,,,,,,YES,NO,


To list the attributes of a particular table (`players`, for example), run:

In [3]:
%%sql

SELECT column_name, data_type, character_maximum_length
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE table_name = 'players';

 * postgresql://postgres:***@db
10 rows affected.


column_name,data_type,character_maximum_length
player_id,integer,
dob,date,
player_id,smallint,
dob,date,
country_id,character,3.0
country_id,character,3.0
first_name,character varying,16.0
first_name,character varying,64.0
last_name,character varying,64.0
last_name,character varying,32.0


## Complex query example

A more complex PostgreSQL query, one that joins two tables, filters rows, and then groups and orders the result, looks like this:

In [4]:
%%sql
SELECT players.country_id, COUNT(players.player_id)
FROM players INNER JOIN ranking ON players.player_id = ranking.player_id
WHERE ranking.rank < 10
GROUP BY players.country_id
ORDER BY players.country_id;

 * postgresql://postgres:***@db
20 rows affected.


country_id,count
ARG,541
AUS,265
AUT,76
BEL,11
BUL,29
CAN,120
CHI,62
CRO,119
CYP,9
CZE,296


## Exam database - data from the rankings of the Association of Tennis Professionals (ATP)

The dataset consists of relations containing real-world information such as players, their statistics, rankings, and tournaments. A shortened version of the dataset was used in the quiz throughout the course.

Here is some basic information on the database tables.

### 1) `players` table - shows the players born after 1980 who have ever reached the top 500 in the rankings

| attribute name | description| 
|:---|:---|
|   `player_id`|   uniquely identifies a player|
|   `first_name`|   player's first name|
|   `last_name`|   player's last name|
|   `dob`|   player's date of birth|
|   `country_id`|   country code of player's nationality|

In [5]:
%%sql
SELECT * FROM players LIMIT 4;

 * postgresql://postgres:***@db
4 rows affected.


player_id,first_name,last_name,dob,country_id
4369,Yu,Wang,1984-05-19,CHN
32860,Omar,El Gazzar,1992-08-13,EGY
27789,Vincent,Schutte,1993-11-10,NED
36490,Leroy,Miller,1993-08-17,AUS


### 2) `ranking` table - shows the top 10 rankings over time

| attribute name | description| 
|:---|:---|
|   `player_id`|   uniquely identifies a player|
|   `rank`|   player's ranking|
|   `rank_date`|   date of the reported ranking|
|   `rank_points`|   player's ranking points|

In [6]:
%%sql
SELECT * FROM ranking LIMIT 4;

 * postgresql://postgres:***@db
4 rows affected.


rank_date,player_id,rank,rank_points
1998-06-08,3498,68,661
1998-06-15,3498,68,661
1998-06-22,3498,72,661
1998-07-06,3498,73,661


### 3) `tournaments` table - shows the tournaments

| attribute name | description| 
|:---|:---|
|   `tournament_id`|   uniquely identifies a tournament|
|   `name`|   name of the tournament|
|   `surface`|   type of surface at a tournament|

In [7]:
%%sql
SELECT * FROM tournaments LIMIT 4;

 * postgresql://postgres:***@db
4 rows affected.


tournament_id,name,surface
366,Leicester,G
285,New Haven,H
5,Bournemouth,C
59,Fort Worth,H


### 4) `matches` table - shows matches from 2015 onward for the players who have ever appeared in the top 10 list

| attribute name | description| 
|:---|:---|
|   `match_id`|   uniquely identifies a record of a match|
|   `player_id`|   uniquely identifies a player|
|   `tournament_id`|   uniquely identifies a tournament|
|   `season`|   year the match was played|
|   `outcome`|   outcome of the match: *1* if player won, *0* if player lost|
|   `p_bp_fc`|   number of break points the player faced|
|   `p_bp_sv`|   number of break points the player saved|
|   `round`|   round of the match during a tournament|
|   `opponent_rank`|   rank of opponent|
|   `minutes`|   duration of the match in minutes|

In [8]:
%%sql
SELECT * FROM matches LIMIT 4;

 * postgresql://postgres:***@db
4 rows affected.


match_id,player_id,tournament_id,round,season,opponent_rank,p_matches,p_bp_sv,p_bp_fc,minutes
108213,3598,291,R64,1999,47,1,4,6,86
108210,3720,291,R64,1999,50,1,8,13,109
108329,3498,287,R32,1999,11,1,2,2,84
108938,3720,290,SF,1999,66,1,2,3,69
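As a sketch of how per-player aggregates over `matches` could be written, the example below groups by `player_id` and uses `HAVING` to keep only players with at least two matches. It assumes an in-memory SQLite database (Python's standard `sqlite3` module) as a stand-in for the PostgreSQL exam database, loaded with only the `match_id`, `player_id`, and `minutes` values of the four sample rows shown above.

```python
import sqlite3

# Stand-in for the exam database: SQLite in memory, holding a subset of the
# columns of the four sample rows printed above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matches (match_id INTEGER, player_id INTEGER, minutes INTEGER);
    INSERT INTO matches VALUES
        (108213, 3598, 86),
        (108210, 3720, 109),
        (108329, 3498, 84),
        (108938, 3720, 69);
""")

query = """
    SELECT player_id,
           COUNT(*)     AS matches_played,
           AVG(minutes) AS avg_minutes,
           MAX(minutes) AS longest_match
    FROM matches
    GROUP BY player_id
    HAVING COUNT(*) >= 2   -- filter on the aggregate, after grouping
    ORDER BY player_id
    LIMIT 10;
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(3720, 2, 89.0, 109)]
```

`WHERE` filters individual rows before grouping, while `HAVING` filters whole groups afterwards, which is why the condition on `COUNT(*)` has to go in `HAVING`.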


### 5) `stats` table - shows statistics for the players in the top players table

| attribute name | description| 
|:---|:---|
|   `player_id`|   uniquely identifies a player|
|   `p_matches`|   number of matches the player won (e.g., 3)|
|   `o_matches`|   number of matches the opponent won (e.g., 5)|
|   `p_sets`|   number of sets the player won (e.g., 2)|
|   `o_sets`|   number of sets the opponent won (e.g., 1)|

In [9]:
%%sql
SELECT * FROM stats LIMIT 4;

 * postgresql://postgres:***@db
4 rows affected.


player_id,p_matches,o_matches,p_sets,o_sets
3565,37,72,107,179
44778,0,1,0,3
3762,0,1,1,2
4000,2,9,12,26
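Ratios computed from these integer counts need some care: dividing one integer column by another truncates the result, and a zero denominator raises an error in PostgreSQL. The sketch below shows one common way around both, multiplying by `1.0` and wrapping the denominator in `NULLIF`. It assumes an in-memory SQLite database (Python's standard `sqlite3` module) as a stand-in for the exam database, loaded with the four sample rows shown above; the column aliases are illustrative.

```python
import sqlite3

# Stand-in for the exam database: SQLite in memory, holding the four sample
# rows of the stats table printed above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stats (player_id INTEGER, p_matches INTEGER, o_matches INTEGER,
                        p_sets INTEGER, o_sets INTEGER);
    INSERT INTO stats VALUES
        (3565, 37, 72, 107, 179),
        (44778, 0, 1, 0, 3),
        (3762, 0, 1, 1, 2),
        (4000, 2, 9, 12, 26);
""")

query = """
    SELECT player_id,
           -- integer / integer truncates, so this is 0 for every row here
           p_matches / NULLIF(p_matches + o_matches, 0) AS truncated_share,
           -- multiplying by 1.0 forces non-integer division;
           -- NULLIF turns a zero denominator into NULL instead of an error
           ROUND(1.0 * p_matches / NULLIF(p_matches + o_matches, 0), 3) AS win_share
    FROM stats
    ORDER BY win_share DESC, player_id
    LIMIT 10;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample rows, player 3565 comes first with a win share of roughly 0.339 (37 wins out of 109 matches), while `truncated_share` is 0 in every row.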


**PLEASE LIMIT YOUR OUTPUT. SOME OF THE TABLES ARE LARGE, AND PRINTING THEM IN FULL MIGHT SLOW DOWN YOUR WORK ENVIRONMENT.**

##### Note: the examples provided above do not contain all the query operations you might need during the exam.
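Two operations not shown in the examples above are subqueries and outer joins. The sketch below illustrates both on invented toy rows; the names `Ana`, `Ben`, and `Cy` are made up, and an in-memory SQLite database (Python's standard `sqlite3` module) stands in for the PostgreSQL exam database. It is not a list of what the exam requires.

```python
import sqlite3

# Toy stand-in for the exam database: invented rows, SQLite instead of PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (player_id INTEGER, first_name TEXT);
    CREATE TABLE ranking (player_id INTEGER, rank INTEGER);
    INSERT INTO players VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO ranking VALUES (1, 5), (1, 3), (2, 40);
""")

# Subquery with IN: players who have at least one ranking entry below rank 10.
below_10 = conn.execute("""
    SELECT first_name
    FROM players
    WHERE player_id IN (SELECT player_id FROM ranking WHERE rank < 10)
    ORDER BY first_name;
""").fetchall()

# LEFT JOIN keeps players without a match in ranking; their ranking columns
# are NULL, so IS NULL selects exactly the players with no ranking rows.
never_ranked = conn.execute("""
    SELECT p.first_name
    FROM players p
    LEFT JOIN ranking r ON p.player_id = r.player_id
    WHERE r.player_id IS NULL
    ORDER BY p.first_name;
""").fetchall()

print(below_10)      # [('Ana',)]
print(never_ranked)  # [('Cy',)]
```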

Now it's your turn. You can write your queries in new cells below; feel free to add as many cells as needed.

In [10]:
%%sql
SELECT COUNT(player_id)
FROM players p
LIMIT 10;

 * postgresql://postgres:***@db
1 rows affected.


count
28686

