# Databases

## Rank Scores

Write a SQL query that, given an arbitrary range of scores, assigns a unique rank to each score in a column
named Score.
```
+----+-------+
| Id | Score |
+----+-------+
| 1  | 3.50  |
| 2  | 3.65  |
| 3  | 4.00  |
| 4  | 3.85  |
| 5  | 4.00  |
| 6  | 3.65  |
+----+-------+
```

Given the above table, your query should generate the following report:
```
+-------+------+
| Score | Rank |
+-------+------+
| 4.00  | 1    |
| 4.00  | 2    |
| 3.85  | 3    |
| 3.65  | 4    |
| 3.65  | 5    |
| 3.50  | 6    |
+-------+------+
```
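
As a sanity check on the expected report, the ranking can be sketched in plain Python before writing any SQL. This is only an illustration and not part of the original notebook; `rank_scores` is a hypothetical helper name, and ties are simply numbered in the order the sort leaves them.

```python
# Scores from the example table, in Id order.
scores = [3.50, 3.65, 4.00, 3.85, 4.00, 3.65]


def rank_scores(values):
    """Return (score, rank) pairs: highest score first, ranks 1..n, no repeats."""
    ordered = sorted(values, reverse=True)
    return [(score, position) for position, score in enumerate(ordered, start=1)]


report = rank_scores(scores)
for score, position in report:
    print("%.2f  %d" % (score, position))
```

Equal scores still receive different ranks here, which is the behaviour the report asks for.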

#### Environment
* Developed with Python 3.5.5, installed via Anaconda

### Thought process
* Re-read the problem, and I can now see that the "arbitrary range" is an arbitrary range of **existing** values! Now it makes sense. I had been thinking of writing a function to generate random data.
    * After solving the problem using MySQL 5.6 on SQLFiddle, I am hesitant to put more time into this challenge, especially as SQLite3 has limited support in some areas.
* Assuming for this exercise that the score type is DOUBLE with 4 digits, 2 after the point (00.00), imagining a percentage range.
* Found this example, https://stackoverflow.com/a/7747342. ROW_NUMBER is probably what I want for this, but that seems so wrong. Maybe RANK/DENSE_RANK take parameters.
* Found another comparison of rank/dense_rank/row_number: https://blog.jooq.org/2013/10/09/sql-trick-row_number-is-to-select-what-dense_rank-is-to-select-distinct/
* I'll go ahead with row_number for now anyway just to have a solution
* Hmmm. The SQLite in this environment doesn't support RANK or ROW_NUMBER (window functions only arrived in SQLite 3.25.0).
* Tried several other bizarre queries to get it working, but found no solution. In the end, I used SQLFiddle to get a solution which works with MySQL 5.6.
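
Whether `ROW_NUMBER()` is available depends on the SQLite library that Python was built against, not on Python itself. A minimal sketch of a check, assuming the 3.25.0 cut-off for window functions; `supports_window_functions` is a hypothetical helper and not part of the original notebook:

```python
import sqlite3

# Window functions (ROW_NUMBER, RANK, DENSE_RANK) were added in SQLite 3.25.0.
WINDOW_FUNCTIONS_SINCE = (3, 25, 0)


def supports_window_functions(version_info=sqlite3.sqlite_version_info):
    """Return True if the given SQLite library version has window functions."""
    return tuple(version_info) >= WINDOW_FUNCTIONS_SINCE


print("SQLite library version:", sqlite3.sqlite_version)
print("Window functions available:", supports_window_functions())
```

On an older bundled SQLite this prints `False`, which would explain the syntax errors from the `ROW_NUMBER()` attempt.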


#### A summary of my search history:
1. `sql query range of numbers` - find out how to generate a range through a query. Found https://stackoverflow.com/a/33146869, which pointed to **SQLFiddle** (handy)
2. `sql rank unique` - found dense_rank mentioned, seems required.
3. Found this example, https://stackoverflow.com/a/7747342. ROW_NUMBER is probably what I want for this, but that seems so wrong.
4. Several frantic repeated searches for SQLite3 ROW_NUMBER workarounds. Gave up eventually and ran MySQL 5.6 online.

### Solution
* **Solution cannot be provided in the notebook**
* http://sqlfiddle.com/#!9/e6180c/3/0
* Had to do it online - cells below show some intermediate attempts and a possible untested solution

#### Prerequisite imports and declarations

In [None]:
try:
    import sqlite3
    from tabulate import tabulate
except ImportError:
    print("Pre-requisites are missing, you can probably install them in a cell by running: '!pip install <name>'")
    raise

conn = sqlite3.connect('ex5_ranking.db')
table_name = "scores"
initial_scores = [3.5, 3.65, 4., 3.85, 4., 3.65]

#### Setup

In [None]:
sql_comm_droptable = """
DROP TABLE IF EXISTS {table}
"""

sql_comm_createtable = """
CREATE TABLE IF NOT EXISTS {table} (
 id integer PRIMARY KEY,
 score DOUBLE(4,2) NOT NULL
);
"""

sql_comm_insertvals = """
INSERT INTO {table}
(score)
VALUES {values};
"""

try:
    c = conn.cursor()
    c.execute(sql_comm_droptable.format(table=table_name))
    c.execute(sql_comm_createtable.format(table=table_name))
    c.execute(sql_comm_insertvals.format(
        table=table_name,
        values=",".join("(%s)" % s for s in initial_scores)
    ))
except Exception as e:
    print(e)    

#### Render the table

In [None]:
c = conn.cursor()
rows = c.execute("SELECT * from %s;" % table_name)
print(tabulate(rows, ['id', 'score'], tablefmt='psql'))

In [None]:
#####################
# ATTEMPT 1
# Theoretical
#####################
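# I think that this will work, but it doesn't work in the SQLite3 used here
# (might need to move or change the DESC part)

theoretical_solution="""
SELECT
  score,
  ROW_NUMBER() OVER (ORDER BY score DESC) AS rank
FROM {table};
"""

c = conn.cursor()
try:
    rows = c.execute(theoretical_solution.format(table=table_name))
    # Header order matches the SELECT list: score first, then rank
    print(tabulate(rows, ['score', 'rank'], tablefmt='psql'))
except sqlite3.OperationalError as e:
    # SQLite builds without window functions reject the OVER clause
    print(e)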

In [None]:
#####################
# ATTEMPT 2
# RANK() emulated
#####################
# At least this works, but it's RANK() output, not ROW_NUMBER()

sql_comm_rank="""
SELECT id, score,
  (SELECT COUNT(*)+1 FROM {table} AS r
      WHERE r.score > {table}.score) AS rank
  FROM {table} ORDER BY rank;
"""

c = conn.cursor()
rows = c.execute(sql_comm_rank.format(table=table_name))
# Three columns are selected, so three headers are needed
print(tabulate(rows, ['id', 'score', 'rank'], tablefmt='psql'))
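
The emulated RANK() gives tied scores the same number. One possible way to get ROW_NUMBER()-style unique ranks without window functions is to break ties on `id` inside the correlated subquery. The sketch below is untested against the original `ex5_ranking.db`; it uses a throwaway in-memory database, and the names `conn_demo`, `row_number_emulated` and `row_num` are illustrative assumptions.

```python
import sqlite3

# Throwaway in-memory database with the same example data.
conn_demo = sqlite3.connect(":memory:")
conn_demo.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, score REAL NOT NULL)"
)
conn_demo.executemany(
    "INSERT INTO scores (score) VALUES (?)",
    [(s,) for s in [3.5, 3.65, 4.0, 3.85, 4.0, 3.65]],
)

# For each row, count the rows that sort at or before it: strictly higher
# scores, plus equal scores with a lower or equal id (the tie-breaker).
row_number_emulated = """
SELECT s.score,
  (SELECT COUNT(*) FROM scores AS r
     WHERE r.score > s.score
        OR (r.score = s.score AND r.id <= s.id)) AS row_num
FROM scores AS s
ORDER BY row_num;
"""

rows_demo = conn_demo.execute(row_number_emulated).fetchall()
conn_demo.close()

for score, row_num in rows_demo:
    print("%.2f  %d" % (score, row_num))
```

Breaking ties on `id` is an arbitrary choice; any unique column would do, and the correlated subquery makes this quadratic in the number of rows, so it only suits small tables.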

In [None]:
#####################
# SOLUTION (MySQL 5.6)
# row_number using MySQL
#####################
# Working but only available on sqlfiddle

# --------------------------------
# http://sqlfiddle.com/#!9/e6180c/3/0
# --------------------------------

sql_comm_rank="""
    SET @row_number = 0;
    SELECT 
        score, (@row_number:=@row_number + 1) AS rank
    FROM scores ORDER by score DESC;
"""

In [None]:
conn.close()