# Weather Observation Station 20
https://www.hackerrank.com/challenges/weather-observation-station-20/problem

A median is defined as a number separating the higher half of a data set from the lower half. Query the median of the Northern Latitudes (LAT_N) from STATION and round your answer to 4 decimal places.

The STATION table is described as follows:

| Field | Type |
|----|-----|
| ID  | NUMBER   |
| CITY  | VARCHAR2(21)   |
| STATE  | VARCHAR2(2)   |
| LAT_N  | NUMBER   |
| LONG_W  | NUMBER   |



In [1]:
import re
import pandas as pd
from sqlalchemy import create_engine

In [2]:
path_to_file = "../../../data/sql_data/median_problem_input.txt"

In [3]:
def read_in_txt(filepath):
    pattern = "(?<=\\d)*(?=[A-Z]{2})"
    columns = ["id", "city", "state", "lat_n", "long_w"]
    records = []
    with open(filepath, "r") as f:
        for line in f:
            if line == "\n":
                pass
            else:
                split_str = re.split(pattern, line)
                first_part = split_str[0].strip().split(" ", 1)
                record = {k: v.strip() for k, v in zip(columns, first_part)}
                record["id"]= int(record["id"])
                record_2 = {k: v for k, v in zip(columns[2:], split_str[1].split())}
                record_2["long_w"] = float(record_2["long_w"])
                record_2["lat_n"] = float(record_2["lat_n"])
                record.update(record_2)
                records.append(record)
    return records
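
The regular expression above appears to rely on the state code being the only place in a line where two uppercase letters occur in a row. As a hedged alternative, assuming each non-blank line has the form `id city state lat_n long_w` separated by whitespace (this is what the parser implies, not something confirmed against the input file), the fields can be peeled off from both ends with `split` and `rsplit`. The helper name `parse_station_line` is hypothetical, and the sample line is assembled from a row shown by `df.head()`.

```python
def parse_station_line(line):
    """Parse one 'id city state lat_n long_w' line into a dict (assumed layout)."""
    # Peel the ID off the left; the remainder holds the city and the three tail fields.
    id_part, rest = line.strip().split(" ", 1)
    # Peel state, latitude, and longitude off the right, so the city may contain spaces.
    city, state, lat_n, long_w = rest.rsplit(None, 3)
    return {
        "id": int(id_part),
        "city": city.strip(),
        "state": state,
        "lat_n": float(lat_n),
        "long_w": float(long_w),
    }


# Sample line assembled from a row of the df.head() output (the raw file layout is assumed).
print(parse_station_line("4 South Britain CT 65.584219 33.605044"))
```

Splitting from the right avoids depending on letter case, at the cost of assuming that the last three fields never contain spaces.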

In [4]:
records = read_in_txt(filepath=path_to_file)

In [5]:
df = pd.DataFrame(records)

In [6]:
df.head()

|   | id | city | state | lat_n | long_w |
|----|----|----|----|----|----|
| 0 | 1 | Pfeifer | KS | 37.44478 | 65.684913 |
| 1 | 3 | Hesperia | CA | 106.056929 | 71.118767 |
| 2 | 4 | South Britain | CT | 65.584219 | 33.605044 |
| 3 | 11 | Crescent City | FL | 58.039642 | 117.904074 |
| 4 | 14 | Forest | MS | 120.283076 | 50.228834 |


In [7]:
df.shape

(499, 5)

In [8]:
engine = create_engine('sqlite:///sql_problems.db')
# conn = engine.connect()

In [10]:
df.to_sql(name="station", con=engine, if_exists="replace")

In [11]:
# Find the median when the number of records is odd
statement = \
"""
SELECT CAST(ROUND(ordered.LAT_N, 4) AS DECIMAL(8, 4))
FROM 
    (SELECT LAT_N, ROW_NUMBER() OVER (ORDER BY LAT_N ASC) AS Row
     FROM STATION) AS ordered
WHERE ordered.Row =  (SELECT COUNT(*)/2 + 1 FROM STATION)
"""

In [12]:
res_df = pd.read_sql(statement, con=engine)

In [13]:
res_df

|   | CAST(ROUND(ordered.LAT_N, 4) AS DECIMAL(8, 4)) |
|----|----|
| 0 | 83.8913 |


In [14]:
df.lat_n.median().round(4)

83.8913
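
The query above picks row `COUNT(*)/2 + 1`, which is the middle row only when the row count is odd (499 rows here, so row 250). Below is a minimal sketch of the same rank-based indexing in plain Python, extended to the even case by averaging the two middle values. The function name `median_by_rank` and the sample values are illustrative only and are not taken from the STATION data.

```python
import statistics


def median_by_rank(values):
    """Median via sorted position, mirroring the ROW_NUMBER() approach."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # The 1-based rank n // 2 + 1 corresponds to the 0-based index n // 2.
        return ordered[n // 2]
    # Even count: average the two middle values (ranks n / 2 and n / 2 + 1).
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_sample = [37.4, 106.1, 65.6, 58.0, 120.3]
even_sample = [37.4, 106.1, 65.6, 58.0]

# Each line prints the rank-based result next to the standard-library result.
print(median_by_rank(odd_sample), statistics.median(odd_sample))
print(median_by_rank(even_sample), statistics.median(even_sample))
```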

In [18]:
statement = \
"""
declare @n_rows int
set @n_rows = (select count(*) from STATION)
if @n_rows % 2 = 0
    select cast(ROUND(AVG(LAT_N), 4) as decimal(18,4))
    from (select LAT_N, ROW_NUMBER() over (order by LAT_N) as RN from STATION) as t2
    where t2.RN = @n_rows/2 or t2.RN = @n_rows/2 + 1 --even
else --odd
    select cast(ROUND(LAT_N, 4) as decimal(18,4))
    from (select LAT_N, ROW_NUMBER() over (order by LAT_N) as RN from STATION) as t2
    where t2.RN = ROUND(@n_rows/2 + 1, 1)
"""  

In [19]:
res_df = pd.read_sql(statement, con=engine)

OperationalError: (sqlite3.OperationalError) near "declare": syntax error
[SQL: 
declare @n_rows int
set @n_rows = (select count(*) from STATION)
if @n_rows % 2 = 0
    select cast(ROUND(AVG(LAT_N), 4) as decimal(18,4))
    from (select LAT_N, ROW_NUMBER() over (order by LAT_N) as RN from STATION) as t2
    where t2.RN = @n_rows/2 or t2.RN = @n_rows/2 + 1 --even
else --odd
    select cast(ROUND(LAT_N, 4) as decimal(18,4))
    from (select LAT_N, ROW_NUMBER() over (order by LAT_N) as RN from STATION) as t2
    where t2.RN = ROUND(@n_rows/2 + 1, 1)
]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
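
The `declare` / `set` / `if` block is T-SQL (SQL Server) syntax, which is why SQLite rejects it. One possible SQLite-compatible rewrite, sketched here against a small in-memory table rather than the real `sql_problems.db`, averages the one or two middle rows in a single `SELECT`. It assumes a SQLite build with window-function support (3.25 or later); the helper `sqlite_median`, the constant `MEDIAN_SQL`, and the sample values are made up for illustration.

```python
import sqlite3

# For n rows, integer division makes (n + 1) / 2 and (n + 2) / 2 the same rank
# when n is odd and the two middle ranks when n is even.
MEDIAN_SQL = """
SELECT ROUND(AVG(LAT_N), 4) AS median_lat_n
FROM (
    SELECT LAT_N,
           ROW_NUMBER() OVER (ORDER BY LAT_N) AS rn,
           COUNT(*) OVER () AS n
    FROM STATION
) AS ranked
WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
"""


def sqlite_median(lat_values):
    """Load values into a throwaway STATION table and return their median."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE STATION (LAT_N REAL)")
    conn.executemany(
        "INSERT INTO STATION (LAT_N) VALUES (?)", [(v,) for v in lat_values]
    )
    (median,) = conn.execute(MEDIAN_SQL).fetchone()
    conn.close()
    return median


print(sqlite_median([1.0, 2.0, 3.0, 4.0, 5.0]))  # odd count: the middle row
print(sqlite_median([1.0, 2.0, 3.0, 4.0]))       # even count: mean of the two middle rows
```

Since the earlier `ROW_NUMBER()` query ran against this engine, the same query text should in principle also work through `pd.read_sql(MEDIAN_SQL, con=engine)`, though that has not been verified here.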

In [24]:
res_df

|   | LAT_N | Row |
|----|----|----|
| 0 | 85.761347 | 125 |


# Consecutive Numbers

https://leetcode.com/problems/consecutive-numbers/

Table: Logs

| Column Name | Type    |
|-------------|---------|
| id          | int     |
| num         | varchar |


id is the primary key for this table.
 

Write an SQL query to find all numbers that appear at least three times consecutively.

Return the result table in any order.

The query result format is in the following example:

 

Logs table:

| Id | Num |
|----|-----|
| 1  | 1   |
| 2  | 1   |
| 3  | 1   |
| 4  | 2   |
| 5  | 1   |
| 6  | 2   |
| 7  | 2   |


Result table:

| ConsecutiveNums |
|-----------------|
| 1               |

1 is the only number that appears at least three times consecutively.

In [10]:
create_table_statement = \
"""
Create table If Not Exists Logs (Id int, Num int)
Truncate table Logs
insert into Logs (Id, Num) values ('1', '1')
insert into Logs (Id, Num) values ('2', '1')
insert into Logs (Id, Num) values ('3', '1')
insert into Logs (Id, Num) values ('4', '2')
insert into Logs (Id, Num) values ('5', '1')
insert into Logs (Id, Num) values ('6', '2')
insert into Logs (Id, Num) values ('7', '2')
"""

In [12]:
statement = \
"""
select distinct a.num as ConsecutiveNums 
from logs a 
inner join logs b on a.id = b.id-1
inner join logs c on b.id = c.id-1
where a.num = b.num and b.num = c.num
"""