## HackerRank SQL Problems
Solved in both SQL (MySQL) and pandas.

## 1. Revising the Select Query I

Query all columns for all American cities in **CITY** with populations larger than 100000. The CountryCode for America is USA.

**Input Format**

The **CITY** table is described as follows: ![Problem1Table](https://s3.amazonaws.com/hr-challenge-images/8137/1449729804-f21d187d0f-CITY.jpg)

mysql solution
```mysql
select * 
from CITY
where countrycode='USA' and population>100000
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv')
# American cities with a population above 100000
df.loc[(df['population'] > 100000) & (df['countrycode'] == 'USA')]
```

## 2. Revising the Select Query II

Query the names of all American cities in **CITY** with populations larger than 120000. The CountryCode for America is USA.

**Input Format**

The **CITY** table is described as follows: ![Problem1Table](https://s3.amazonaws.com/hr-challenge-images/8137/1449729804-f21d187d0f-CITY.jpg)

mysql solution
```mysql
select distinct name
from CITY
where countrycode='USA' and population>120000
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv')
# Distinct names of American cities with a population above 120000
df.loc[(df['population'] > 120000) & (df['countrycode'] == 'USA'), 'name'].unique()
```

## 3. Select By ID

Query all columns for a city in CITY with the ID 1661.

**Input Format**

The **CITY** table is described as follows: ![Problem1Table](https://s3.amazonaws.com/hr-challenge-images/8137/1449729804-f21d187d0f-CITY.jpg)

mysql solution
```mysql
select *
from CITY
where id=1661
```

pandas solution
```python
import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Look up the row whose id index label is 1661
df.loc[1661]
```

## 4. Japanese Cities' Attributes

Query all attributes of every Japanese city in the CITY table. The COUNTRYCODE for Japan is JPN.

**Input Format**

The **CITY** table is described as follows: ![Problem1Table](https://s3.amazonaws.com/hr-challenge-images/8137/1449729804-f21d187d0f-CITY.jpg)

mysql solution
```mysql
select *
from CITY
where countrycode='JPN'
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv')
# All columns for Japanese cities
df.loc[df['countrycode'] == 'JPN']
```

## 5. Japanese Cities' Names

Query the names of all the Japanese cities in the CITY table. The COUNTRYCODE for Japan is JPN.

**Input Format**

The **CITY** table is described as follows: ![Problem1Table](https://s3.amazonaws.com/hr-challenge-images/8137/1449729804-f21d187d0f-CITY.jpg)

mysql solution
```mysql
select distinct name
from CITY
where countrycode='JPN'
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv')
# Distinct names of Japanese cities
df.loc[df['countrycode'] == 'JPN', 'name'].unique()
```

## 6. Weather Observation Station 1

Query a list of CITY and STATE from the **STATION** table.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select city,state
from station
```
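
No pandas version is given for this problem. A minimal sketch could look like the following; the inline rows are made up for illustration and stand in for the STATION table, and the lowercase column names are an assumption.

```python
import pandas as pd

# Hypothetical sample rows standing in for the STATION table
station = pd.DataFrame({
    'id': [1, 2, 3],
    'city': ['Kissee Mills', 'Loma Mar', 'Sandy Hook'],
    'state': ['MO', 'CA', 'CT'],
    'lat_n': [39.0, 37.3, 41.4],
    'long_w': [107.1, 122.3, 73.2],
})

# Equivalent of: select city, state from station
result = station[['city', 'state']]
print(result)
```

Selecting with a list of column names returns a DataFrame that keeps the requested column order.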

## 7. Weather Observation Station 3

Query a list of CITY names from STATION with even ID numbers only. You may print the results in any order, but must exclude duplicates from your answer.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where id%2=0
```
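
No pandas version is given for this problem either. As a sketch under the same assumptions (made-up rows, lowercase column names), the even-ID filter and the duplicate removal might be written as:

```python
import pandas as pd

# Hypothetical sample rows standing in for the STATION table
station = pd.DataFrame({
    'id': [1, 2, 3, 4, 6],
    'city': ['Amo', 'Bend', 'Cary', 'Bend', 'Dale'],
})

# Equivalent of: select distinct city from station where id % 2 = 0
even_cities = station.loc[station['id'] % 2 == 0, 'city'].unique()
print(even_cities)
```

Here `id` is kept as an ordinary column rather than the index, so the modulo test can be applied to it directly.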

## 7. Weather Observation Station 4

Let N be the number of CITY entries in STATION, and let N' be the number of distinct CITY names in STATION; query the value of N-N' from STATION. In other words, find the difference between the total number of CITY entries in the table and the number of distinct CITY entries in the table.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select count(city)-count(distinct city)
from station
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# count() skips missing values, matching SQL's count(city)
df['city'].count() - df['city'].nunique()
```

## 8. Weather Observation Station 5

Query the two cities in STATION with the shortest and longest CITY names, as well as their respective lengths (i.e.: number of characters in the name). If there is more than one smallest or largest city, choose the one that comes first when ordered alphabetically.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
(select city, char_length(city) as length from station order by char_length(city),city asc limit 1)
union all
(select city, char_length(city) as length from station order by char_length(city) desc,city asc limit 1)
```

pandas
```python
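import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
df['namelength'] = df['city'].str.len()
# Shortest name first; ties broken alphabetically
shortest = df.sort_values(['namelength', 'city']).head(1)
# Longest name first; ties broken alphabetically
longest = df.sort_values(['namelength', 'city'], ascending=[False, True]).head(1)
pd.concat([shortest, longest])[['city', 'namelength']]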
```

## 9. Weather Observation Station 6

Query the list of CITY names starting with vowels (i.e., a, e, i, o, or u) from STATION. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where left(city,1) in ('a','e','i','o','u')
```

pandas
```python
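import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Lowercase the first letter so capitalized names such as 'Albany' match
df.loc[df['city'].str[0].str.lower().isin(['a','e','i','o','u']), 'city'].unique()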
```

## 10. Weather Observation Station 7

Query the list of CITY names ending with vowels (a, e, i, o, u) from STATION. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where right(city,1) in ('a','e','i','o','u')
```

pandas
```python
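import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Lowercase the last letter so the comparison is case-insensitive, as in MySQL
df.loc[df['city'].str[-1].str.lower().isin(['a','e','i','o','u']), 'city'].unique()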
```

## 11. Weather Observation Station 8

Query the list of CITY names from STATION which have vowels (i.e., a, e, i, o, and u) as both their first and last characters. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where right(city,1) in ('a','e','i','o','u') and left(city,1) in ('a','e','i','o','u')
```

pandas
```python
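import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
vowels = ['a', 'e', 'i', 'o', 'u']
# Lowercase before comparing so capitalized names match
startswithvowel = df['city'].str[0].str.lower().isin(vowels)
endswithvowel = df['city'].str[-1].str.lower().isin(vowels)
df.loc[startswithvowel & endswithvowel, 'city'].unique()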
```

## 12. Weather Observation Station 9

Query the list of CITY names from STATION that do not start with vowels. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where left(city,1) not in ('a','e','i','o','u')
```

pandas
```python
import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Lowercase before comparing so capitalized names match
startswithvowel = df['city'].str[0].str.lower().isin(['a','e','i','o','u'])
df.loc[~startswithvowel, 'city'].unique()
```

## 13. Weather Observation Station 10

Query the list of CITY names from STATION that do not end with vowels. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where right(city,1) not in ('a','e','i','o','u')
```

pandas
```python
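import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Lowercase before comparing so the check is case-insensitive
endswithvowel = df['city'].str[-1].str.lower().isin(['a','e','i','o','u'])
df.loc[~endswithvowel, 'city'].unique()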
```

## 14. Weather Observation Station 11

Query the list of CITY names from STATION that either do not start with vowels or do not end with vowels. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where right(city,1) not in ('a','e','i','o','u') or left(city,1) not in ('a','e','i','o','u')
```

pandas
```python
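import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
vowels = ['a', 'e', 'i', 'o', 'u']
# Lowercase before comparing so capitalized names match
startswithvowel = df['city'].str[0].str.lower().isin(vowels)
endswithvowel = df['city'].str[-1].str.lower().isin(vowels)
df.loc[(~startswithvowel) | (~endswithvowel), 'city'].unique()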
```

## 15. Weather Observation Station 12

Query the list of CITY names from STATION that do not start with vowels and do not end with vowels. Your result cannot contain duplicates.

**Input Format**

The **STATION** table is described as follows: ![StationTable](https://s3.amazonaws.com/hr-challenge-images/9336/1449345840-5f0a551030-Station.jpg)
where LAT_N is the northern latitude and LONG_W is the western longitude.

mysql
```mysql
select distinct city
from station
where right(city,1) not in ('a','e','i','o','u') and left(city,1) not in ('a','e','i','o','u')
```

pandas
```python
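import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
vowels = ['a', 'e', 'i', 'o', 'u']
# Lowercase before comparing so capitalized names match
startswithvowel = df['city'].str[0].str.lower().isin(vowels)
endswithvowel = df['city'].str[-1].str.lower().isin(vowels)
df.loc[(~startswithvowel) & (~endswithvowel), 'city'].unique()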
```

## 16. Higher Than 75 Marks

Query the Name of any student in STUDENTS who scored higher than 75 Marks. Order your output by the last three characters of each name. If two or more students both have names ending in the same last three characters (i.e.: Bobby, Robby, etc.), secondary sort them by ascending ID.

**Input Format**

The STUDENTS table is described as follows: ![StudentsTable](https://s3.amazonaws.com/hr-challenge-images/12896/1443815243-94b941f556-1.png)

The Name column only contains uppercase (A-Z) and lowercase (a-z) letters.

**Sample Input**
![SampleInput](https://s3.amazonaws.com/hr-challenge-images/12896/1443815209-cf4b260993-2.png)

mysql
```mysql
select Name
from students
where Marks>75
order by right(Name,3),id
```

pandas
```python
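import pandas as pd
df = pd.read_csv('table.csv', index_col='id')
# Last three characters of each name are the primary sort key
df['substring'] = df['Name'].str[-3:]
# Name the index so sort_values can use it as the secondary key
df.index.rename('indexname', inplace=True)
df = df.sort_values(['substring', 'indexname'])
df.loc[df['Marks'] > 75, 'Name']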
```

## 17. Employee Names

Write a query that prints a list of employee names (i.e.: the name attribute) from the Employee table in alphabetical order.

**Input Format**

The Employee table containing employee data for a company is described as follows: ![EmployeeTable](https://s3.amazonaws.com/hr-challenge-images/19629/1458557872-4396838885-ScreenShot2016-03-21at4.27.13PM.png)

where employee_id is an employee's ID number, name is their name, months is the total number of months they've been working for the company, and salary is their monthly salary.

**Sample Input**
![SampleInput](https://s3.amazonaws.com/hr-challenge-images/19629/1458558202-9a8721e44b-ScreenShot2016-03-21at4.32.59PM.png)

**Sample Output**
```
Angela
Bonnie
Frank
Joe
Kimberly
Lisa
Michael
Patrick
Rose
Todd
```

mysql
```mysql
select name
from employee order by name asc
```
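
No pandas version is given for this problem. A minimal sketch of the alphabetical ordering, using a few made-up rows in place of the Employee table (the numeric values are illustrative only), might be:

```python
import pandas as pd

# Hypothetical rows standing in for the Employee table
employee = pd.DataFrame({
    'employee_id': [101, 202, 303],
    'name': ['Rose', 'Angela', 'Frank'],
    'months': [15, 1, 17],
    'salary': [1900, 3400, 1600],
})

# Equivalent of: select name from employee order by name asc
names = employee['name'].sort_values().tolist()
print(names)
```

`sort_values()` on a Series sorts in ascending order by default, which matches `order by name asc`.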

## 18. Employee Salaries

Write a query that prints a list of employee names (i.e.: the name attribute) for employees in Employee having a salary greater than $20000 per month who have been employees for less than 10 months. Sort your result by ascending employee_id.

**Input Format**

The Employee table containing employee data for a company is described as follows: ![EmployeeTable](https://s3.amazonaws.com/hr-challenge-images/19629/1458557872-4396838885-ScreenShot2016-03-21at4.27.13PM.png)

where employee_id is an employee's ID number, name is their name, months is the total number of months they've been working for the company, and salary is their monthly salary.

**Sample Input**
![SampleInput](https://s3.amazonaws.com/hr-challenge-images/19629/1458558202-9a8721e44b-ScreenShot2016-03-21at4.32.59PM.png)

**Sample Output**
```
Angela
Bonnie
Frank
Joe
Kimberly
Lisa
Michael
Patrick
Rose
Todd
```
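
No solution is recorded for this problem. In MySQL, the statement translates to a `where salary>20000 and months<10` filter followed by `order by employee_id`. As a hedged pandas sketch, assuming made-up rows in place of the Employee table and the thresholds as stated above:

```python
import pandas as pd

# Hypothetical rows standing in for the Employee table
employee = pd.DataFrame({
    'employee_id': [303, 101, 202, 404],
    'name': ['Todd', 'Angela', 'Rose', 'Joe'],
    'months': [5, 1, 15, 9],
    'salary': [25000, 30000, 40000, 12000],
})

# Salary above 20000 and fewer than 10 months of tenure
mask = (employee['salary'] > 20000) & (employee['months'] < 10)
# Sort by employee_id before selecting the name column
names = employee.loc[mask].sort_values('employee_id')['name'].tolist()
print(names)
```

Sorting happens before the `name` column is selected, because `employee_id` is no longer available once only the names remain.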
