# Setting up the Weather database

We first need to download the database, and also upgrade the version of the SQL database (SQLite), by running the following *code block*.
Run the code by pressing the "▶" play button. This needs to be done once at the beginning of the session.




In [None]:
!wget https://essexuniversity.box.com/shared/static/c3vee0c2iclzc9wouhblr9jp5v7lix0o.db -O weather.db &> /dev/null

In order to issue SQL commands we will use the SQLite capabilities of Google Colab by loading the SQL extension with the statement `%load_ext sql`:







In [None]:
# Load the SQL magic extension so that we can connect to our SQLite DB
%load_ext sql
# Connect to the downloaded database (the weather data in this case)
%sql sqlite:///weather.db
# Show the SQLite version; %sql runs an inline (single-line) SQL command
%sql SELECT sqlite_version();

 * sqlite:///weather.db
Done.


sqlite_version()
3.37.2


Then we can run SQL queries by using:
- the `%sql` expression for a single line query
- the `%%sql` expression for a multiple line query

## Display all tables
As an example, we can display all the tables loaded in the database by using the
following statement:

In [None]:

%%sql
-- Display all the table names,
-- omitting the ones starting with "sqlite_" (internal, not to be used directly)
SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%';

 * sqlite:///weather.db
Done.


name
cat_locations
cat_postcode_latlong
cat_regions
country
metoffice_dailyweatherdata
metoffice_forecast_text
postcodelatlng
tempW
timezone
weatherType
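
Outside a notebook, the same listing can be produced with Python's built-in `sqlite3` module. The following is a minimal sketch, assuming an in-memory database as a stand-in for `weather.db`; the two table names are borrowed from the output above, while their columns are hypothetical.

```python
import sqlite3

# In-memory stand-in for weather.db (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cat_regions (regionID INTEGER, region TEXT)")  # hypothetical columns
conn.execute("CREATE TABLE timezone (zone_id INTEGER, abbreviation TEXT)")  # hypothetical columns

# sqlite_master is SQLite's catalogue of schema objects.
rows = conn.execute(
    "SELECT name FROM sqlite_master "
    "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
    "ORDER BY name"
).fetchall()
table_names = [name for (name,) in rows]
print(table_names)  # ['cat_regions', 'timezone']
conn.close()
```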


# Information

## Arithmetic operators
- Add: +
- Subtract/minus: -
- Multiply: *
- Divide: /
- Modulo: %
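
A quick way to try these operators is to evaluate constant expressions. The sketch below uses Python's built-in `sqlite3` module with an in-memory database (an assumption for illustration, not part of the notebook setup). One detail worth noticing: in SQLite, dividing one integer by another gives an integer result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row containing the result of each arithmetic operator.
row = conn.execute(
    "SELECT 7 + 2, 7 - 2, 7 * 2, 7 / 2, 7 / 2.0, 7 % 2"
).fetchone()
print(row)  # (9, 5, 14, 3, 3.5, 1)
conn.close()
```

Writing one operand as a real number (`2.0`) is enough to get a fractional result from `/`.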


## Example 1
Using the table `metoffice_dailyweatherdata`, convert the
temperature from Celsius to Fahrenheit. The output should contain one column named `Fahrenheit_temperature`.
### Hint
- Tables:
  + metoffice_dailyweatherdata --- temperature
- Information:
  + New column "Fahrenheit_temperature"
- Conditions (filters) in WHERE:
  + None
- Any groups?
  + None

In [None]:
%%sql
SELECT temperature * 1.8 + 32 AS Fahrenheit_temperature
FROM metoffice_dailyweatherdata LIMIT 10;

 * sqlite:///weather.db
Done.


Fahrenheit_temperature
45.5
45.5
46.22
45.5
46.4
46.94
44.42
44.42
44.42
45.32


## Logical operators
- Logical AND: `AND`
- Negates value: `NOT`
- Logical OR: `OR`
- Logical XOR: `XOR` (available in MySQL; SQLite does not provide an `XOR` operator)

We will use the `OR` operator to filter the data. Syntax:
```sql
SELECT Field1, Field2
FROM Table_name
WHERE Condition1 OR Condition2
```
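
The following sketch compares `OR`, `AND` and `NOT` on a tiny in-memory table, using Python's built-in `sqlite3` module. The table `obs`, its rows and the helper `ids` are hypothetical and only mimic the `visibility` and `windspeed` columns. Since SQLite has no `XOR` operator, the last query emulates it by comparing the two conditions with `<>`, which works here because SQLite evaluates a comparison to 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of the weather table.
conn.execute("CREATE TABLE obs (id INTEGER, visibility INTEGER, windspeed INTEGER)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?, ?)",
    [(1, 1500, 25), (2, 1500, 5), (3, 13000, 25), (4, 13000, 5)],
)

def ids(where_clause):
    """Return the ids of the rows matching the given WHERE clause."""
    sql = f"SELECT id FROM obs WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

either = ids("visibility < 2000 OR windspeed > 20")
both = ids("visibility < 2000 AND windspeed > 20")
neither = ids("NOT (visibility < 2000 OR windspeed > 20)")
exactly_one = ids("(visibility < 2000) <> (windspeed > 20)")  # XOR emulation

print(either, both, neither, exactly_one)  # [1, 2, 3] [1] [4] [2, 3]
conn.close()
```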

## Example 2
Using the table `metoffice_dailyweatherdata`, pick out those
records with either visibility smaller than 2000 or windspeed larger than 20.

In [None]:
%%sql
SELECT *
FROM metoffice_dailyweatherdata
WHERE visibility < 2000 OR windspeed > 20 LIMIT 5;

 * sqlite:///weather.db
Done.


LocationId,obs_dateTime,obs_date,obs_time,temperature,windspeed,humidity,dewpoint,pressure,windgust,visibility,winddirection,pressuretendency,timestamp,rainy,windy,snow,weatherType
3002,2020-01-01 00:00:00,2020-01-01,00:00:00,7.5,21,84.0,5.0,1018,32,13000,,F,2020-01-01 06:00:03,0,0,0,8
3002,2020-01-01 01:00:00,2020-01-01,01:00:00,7.5,22,81.7,4.6,1018,34,12000,,F,2020-01-01 06:00:03,0,0,0,8
3002,2020-01-01 02:00:00,2020-01-01,02:00:00,7.9,24,79.9,4.7,1017,36,11000,,F,2020-01-01 06:00:03,0,0,0,8
3002,2020-01-01 03:00:00,2020-01-01,03:00:00,7.5,23,82.3,4.7,1016,40,13000,,F,2020-01-01 06:00:03,0,0,0,8
3002,2020-01-01 05:00:00,2020-01-01,05:00:00,8.3,24,85.3,6.0,1015,33,11000,,F,2020-01-01 06:00:03,0,0,0,8


## Comparison Operators
- Equal to: `=`
- Greater than: `>`
- Less than: `<`
- Greater than or equal to: `>=`
- Less than or equal to: `<=`
- Not equal to: `<>` or `!=`
- Whether a value is within a range of values: `BETWEEN <value 1> AND <value 2>`
- Test a value: `IS` and `IS NOT`, which work like `=` and `!=` except with NULL operands: if both operands are NULL, `IS` returns 1 and `IS NOT` returns 0.
- NULL/Not NULL value test: `IS NULL` and `IS NOT NULL`
- Simple pattern matching: `LIKE`
- Whether a value is not within a range of values: `NOT BETWEEN <value 1> AND <value 2>`
- Negation of simple pattern matching: `NOT LIKE`

We will use the NOT BETWEEN… AND … operator to filter the data outside an interval:
```sql
SELECT Field1, Field2
FROM Table_name
WHERE
Condition NOT BETWEEN value1 AND value2
```
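
Two details of these operators are easy to miss: `BETWEEN` includes both end points, and comparisons with NULL do not behave like ordinary comparisons. The sketch below illustrates both with Python's built-in `sqlite3` module; the table `obs` and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (temperature REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO obs VALUES (?)",
    [(-1.3,), (-1.0,), (5.0,), (11.0,), (11.4,), (None,)],
)

# The end points -1 and 11 count as inside the range, so they are not returned.
outside = [r[0] for r in conn.execute(
    "SELECT temperature FROM obs "
    "WHERE temperature NOT BETWEEN -1 AND 11 ORDER BY temperature"
)]
# Equivalent condition written with plain comparisons.
explicit = [r[0] for r in conn.execute(
    "SELECT temperature FROM obs "
    "WHERE temperature < -1 OR temperature > 11 ORDER BY temperature"
)]
# `=` with NULL yields NULL (None in Python); IS / IS NOT yield 1 or 0.
null_checks = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL, NULL IS NOT NULL, 1 IS NULL"
).fetchone()

print(outside)      # [-1.3, 11.4]
print(null_checks)  # (None, 1, 0, 0)
conn.close()
```

The row with a NULL temperature appears in neither result, because a comparison involving NULL is never true.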

## Example 3
Using the table `metoffice_dailyweatherdata`, pick out those
records whose temperature lies outside the range `[-1, 11]`.
### Hint
- Tables: metoffice_dailyweatherdata -- temperature
- Information:
  + All columns in `metoffice_dailyweatherdata`
- Conditions(filters) in WHERE:
  + Temperature is not between -1 and 11.
- Any groups?
  + No

In [None]:
%%sql
SELECT *
FROM metoffice_dailyweatherdata
WHERE temperature NOT BETWEEN -1 AND 11
LIMIT 5;

 * sqlite:///weather.db
Done.


LocationId,obs_dateTime,obs_date,obs_time,temperature,windspeed,humidity,dewpoint,pressure,windgust,visibility,winddirection,pressuretendency,timestamp,rainy,windy,snow,weatherType
3068,2020-01-01 01:00:00,2020-01-01,01:00:00,-1.3,7,80.4,-3.7,1024,,50000,,F,2020-01-01 06:00:04,0,0,0,2
3091,2020-01-01 01:00:00,2020-01-01,01:00:00,-1.1,7,82.1,-3.3,1025,,45000,,F,2020-01-01 06:00:04,0,0,0,0
3091,2020-01-01 02:00:00,2020-01-01,02:00:00,-1.9,2,80.8,-4.2,1024,,40000,,F,2020-01-01 06:00:04,0,0,0,0
3091,2020-01-01 03:00:00,2020-01-01,03:00:00,-1.1,3,82.6,-3.2,1023,,50000,,F,2020-01-01 06:00:04,0,0,0,0
3162,2020-01-01 00:00:00,2020-01-01,00:00:00,-2.5,1,96.1,-2.7,1031,,7000,,F,2020-01-01 06:00:04,0,0,0,5


## Regular Expressions
Regular expressions follow a special syntax (separate from SQL) to describe a search pattern in **text**. This allows for more complicated patterns than the `LIKE` operator. Syntax:
```
SELECT <column> REGEXP '<pattern>' FROM <table>
```
Some examples of common REGEXP patterns:
- "." Matches any single character (letter, digit or symbol)
- "*" Matches zero or more repetitions of the previous character, e.g. ".*" matches any sequence of characters.
- "+" Same as "*" but matches one or more repetitions, e.g. "1+" matches 1, 11, 111, etc.
- "[123]" Matches any one of the characters in the brackets
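
SQLite defines the `REGEXP` operator syntax but relies on a function named `regexp` being available to implement it. Whether your environment already provides one is an assumption you may need to check. The sketch below registers one from Python using the built-in `sqlite3` and `re` modules; the table `forecast` and its rows are hypothetical.

```python
import re
import sqlite3

def regexp(pattern, text):
    """Return 1 if `pattern` is found anywhere in `text`, else 0."""
    if text is None:
        return None
    return 1 if re.search(pattern, text) else 0

conn = sqlite3.connect(":memory:")
# SQLite evaluates `X REGEXP Y` as regexp(Y, X): the pattern comes first.
conn.create_function("REGEXP", 2, regexp)

conn.execute("CREATE TABLE forecast (forecastText TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO forecast VALUES (?)",
    [("Maximum Temperature 8C.",), ("Dry with sunny spells.",), ("1111",)],
)
flags = conn.execute(
    "SELECT forecastText, "
    "forecastText REGEXP '[0-9]' AS has_digit, "
    "forecastText REGEXP '^1+$' AS only_ones "
    "FROM forecast ORDER BY rowid"
).fetchall()
for row in flags:
    print(row)
conn.close()
```

Here `[0-9]` flags any text containing a digit, while `^1+$` only flags text made up entirely of the character 1.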


## Example 4
Using the table `metoffice_forecast_text`, return all the records with an
added column `check_for_digits`. The new column `check_for_digits` is a 0/1 flag indicating whether the text in `forecastText` contains one or more digits.
### Hint
- Tables: metoffice_forecast_text -- forecastText
- Information:
  + All columns
  + `Check_for_digits`
- Conditions in WHERE:
  + Contain a digit, (at least one number 0,1,...,9)
- Any Groups?
  + no

In [None]:
%%sql
SELECT *, forecastText REGEXP '[0-9]' AS check_for_digits
FROM metoffice_forecast_text
WHERE check_for_digits = 1 LIMIT 5;

 * sqlite:///weather.db
Done.


regionID,region,issuedAt,periodID,period,title,forecastText,createdOn,check_for_digits
500,os,2020-01-01 04:00:00,1,day1to2,Headline:Today:Tonight:Thursday:,"Very windy but mainly dry.A cloudy day. Perhaps a little light rain this morning over Shetland otherwise dry. A few brighter interludes. Quite mild, but fresh to strong southwesterly winds, perhaps touching gale at times over Shetland. Maximum Temperature 8C.Staying dry. Windy at times especially across Shetland where gales will become widespread later in the night. A mild night. Minimum Temperature 6C.Rather cloudy with a litlte rain at times, this mainly light but with some heavier outbreaks for a few hours later in the day. Southwesterly gales. Colder, showery in evening. Maximum Temperature 9C.",2020-01-01 09:00:03,1
501,he,2020-01-01 04:00:00,1,day1to2,Headline:Today:Tonight:Thursday:,"Windy and mild. Patchy rain west, otherwise dry.Mainly cloudy with patchy rain in the west but a few bright spells in the east. Strong south to southwestely winds, with gales developing over the Hebrides, perhaps severe later. Maximum Temperature 10C.Generally dry, a few clear spells in east, cloudier in west. Some rain reaching the Hebrides and Northwest Highlands towards morning. Mild, but windy, severe gales in far northwest. Minimum Temperature 7C.Rain, heavy in west, spreading southeast in morning then brightening up for a while. Another band of rain will follow in the afternoon, then clearer and colder in evening, Gales. Maximum Temperature 12C.",2020-01-01 09:00:03,1
502,gr,2020-01-01 04:00:00,1,day1to2,Headline:Today:Tonight:Thursday:,"Dry with sunny spells.A dry start to New Year with some sunny spells. A breezy day along the east coast and over the hills but lighter winds in shelter with a few colder spots inland. Maximum Temperature 7C.Staying dry with some clear spells though southern Aberdeenshire may turn cloudier later. A windy night especially along east coast. Lighter winds at first may allow inland frost. Minimum Temperature 2C.Early brightness in north fading as rain crosses from the west, the rain mainly affecting southern and western areas. Brightening up again later but some evening rain. Windy, mild. Maximum Temperature 11C.",2020-01-01 09:00:03,1
503,st,2020-01-01 04:00:00,1,day1to2,Headline:Today:Tonight:Thursday:,"Mainly dry, breezy.A mostly dry start to New Year with bright or sunny intervals but also cloudy at times. A little light rain across northwest Argyll. Moderate to fresh southwesterly winds. Maximum Temperature 8C.Staying mainly dry, though cloud may thicken enough for some spots of rain towards Argyll. A fairly mild, with strengthening southwesterly winds, with coastal gales. Minimum Temperature 5C.A cloudy, mainly dry start, then rain will cross east during the morning. A short brighter spell before another rainband crosses in the evening. Mild, windy. Maximum Temperature 11C.",2020-01-01 09:00:04,1
504,ta,2020-01-01 04:00:00,1,day1to2,Headline:Today:Tonight:Thursday:,"Rather cloudy but dry.A dry day. A lot of cloud for most but a few bright or sunny intervals will break through in places, most likely along the east coast in the afternoon. Quite breezy along the east coast. Maximum Temperature 8C.Staying mainly dry, though cloud may thicken enough for some spots of rain towards Killin and Aberfoyle. A fairly mild, breezy night. Minimum Temperature 6C.A cloudy, mainly dry start, then rain will cross east around the middle of the day. A short brighter spell before another rainband crosses in the evening. Mild, windy. Maximum Temperature 11C.",2020-01-01 09:00:04,1


## Subqueries in SELECT statement
We can have nested SELECT statements.
```sql
SELECT Field1,(SELECT Field2
  FROM Table_name2
  )
FROM Table_name1
```
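
A subquery placed in the SELECT list is evaluated as a single value and repeated on every output row. The following minimal sketch shows this with Python's built-in `sqlite3` module; the table `obs` and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (windspeed INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO obs VALUES (?)", [(21,), (33,), (72,)])

# The inner SELECT yields one value (the maximum), attached to every row.
rows = conn.execute(
    "SELECT windspeed, (SELECT MAX(windspeed) FROM obs) AS max_wind "
    "FROM obs ORDER BY windspeed"
).fetchall()
print(rows)  # [(21, 72), (33, 72), (72, 72)]
conn.close()
```

Using an aggregate such as `MAX` in the inner query is one way to make sure it returns exactly one value.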

## Example 5
Using the table `metoffice_dailyweatherdata`, return a table with
two columns: one column contains all the windspeed values, and another, named `max_wind`, contains the maximum windspeed computed from this table.
### Hint
- Tables:
  + metoffice_dailyweatherdata--- windspeed
- Information:
  + windspeed
  + Max_wind
- Conditions(filters) in WHERE:
  + no
- Any groups?
  + No

In [None]:
%%sql
SELECT windspeed,(SELECT MAX(windspeed)
  FROM  metoffice_dailyweatherdata
  ) AS max_wind
FROM metoffice_dailyweatherdata LIMIT 10;

 * sqlite:///weather.db
Done.


windspeed,max_wind
21,72
22,72
24,72
23,72
18,72
24,72
33,72
36,72
32,72
34,72


## Subqueries in FROM statement
```sql
SELECT Field1
FROM (SELECT Field1 ,Field2
      FROM Table_name )
New_table_name
```
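
A subquery in FROM behaves like a temporary table, so its computed columns can be filtered with an ordinary WHERE. The sketch below, using Python's built-in `sqlite3` module, groups a hypothetical table `obs` by location and then filters on the aggregated column. As a cross-check it runs the same filter with `HAVING`, an alternative that this section does not otherwise use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (LocationID INTEGER, temperature REAL)")  # hypothetical
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [(3002, 7.5), (3002, 8.3), (3031, 11.4), (3031, 9.0), (3044, 12.7)],
)

# Inner query: one row per location. Outer query: filter on the new column.
via_subquery = conn.execute(
    "SELECT LocationID, max_temperature FROM "
    "  (SELECT LocationID, MAX(temperature) AS max_temperature "
    "   FROM obs GROUP BY LocationID) AS per_location "
    "WHERE max_temperature > 10 ORDER BY LocationID"
).fetchall()

# Same result without a subquery, filtering the groups directly.
via_having = conn.execute(
    "SELECT LocationID, MAX(temperature) AS max_temperature "
    "FROM obs GROUP BY LocationID "
    "HAVING MAX(temperature) > 10 ORDER BY LocationID"
).fetchall()

print(via_subquery)  # [(3031, 11.4), (3044, 12.7)]
conn.close()
```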

## Example 6
Using the table `metoffice_dailyweatherdata`, compute the maximum temperature (`max_temperature`) for each `locationID`.
Based on this result, keep only those locationIDs and maximum temperatures where `max_temperature` is larger than 10.
### Hint
- Tables:
  + metoffice_dailyweatherdata--- temperature,
  + locationID
- Information:
  + locationID
  + Max_temperature
- Conditions(filters) in WHERE:
  + Max_temperature >10
- Any groups?
  + `locationID` (for each locationID compute its corresponding maximum temperature)

In [None]:
%%sql
SELECT * FROM
    (SELECT LocationID,
      MAX(temperature) AS max_temperature
      FROM metoffice_dailyweatherdata
      GROUP BY LocationID) new_table
  WHERE max_temperature > 10
  LIMIT 10;

 * sqlite:///weather.db
Done.


LocationID,max_temperature
3023,10.2
3031,11.4
3034,10.2
3044,12.7
3066,10.7
3080,10.7
3091,10.6
3100,10.1
3105,10.7
3111,11.3


## Subqueries in WHERE statement
```sql
SELECT Field1
  FROM Table_name1
  WHERE Field1 IN (SELECT Field2
  FROM Table_name2 )
```
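
With `IN`, the subquery supplies the set of accepted values for the outer WHERE. The sketch below uses Python's built-in `sqlite3` module on two tiny tables. The table names match the ones used in this section, but the columns of `timezone` and all the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zones (zone_id INTEGER, zone_name TEXT)")
conn.execute("CREATE TABLE timezone (zone_id INTEGER, abbreviation TEXT)")  # hypothetical columns
conn.executemany(
    "INSERT INTO zones VALUES (?, ?)",
    [(1, "Europe/Andorra"), (2, "Asia/Dubai"), (3, "Asia/Kabul")],
)
conn.executemany(
    "INSERT INTO timezone VALUES (?, ?)",
    [(1, "CET"), (1, "CEST"), (3, "+0430")],
)

# Keep only the zones whose id appears in the timezone table.
matched = conn.execute(
    "SELECT zone_id, zone_name FROM zones "
    "WHERE zone_id IN (SELECT zone_id FROM timezone) ORDER BY zone_id"
).fetchall()
# Adding DISTINCT to the subquery does not change which rows match.
matched_distinct = conn.execute(
    "SELECT zone_id, zone_name FROM zones "
    "WHERE zone_id IN (SELECT DISTINCT zone_id FROM timezone) ORDER BY zone_id"
).fetchall()

print(matched)  # [(1, 'Europe/Andorra'), (3, 'Asia/Kabul')]
conn.close()
```

Zone 1 appears twice in `timezone` but only once in the result, because `IN` tests membership and does not join the tables.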

## Example 7
Using the tables `zones` and `timezone`, pick out the records from `zones`
whose `zone_id` appears among the unique `zone_id` values in the table `timezone`.
### Hint
- Tables:
  + zones
  + timezone
- Information:
  + All columns from `zones`
- Conditions(filters) in WHERE:
  + `zone_id` IN the set of all the unique `zone_id` values picked out from `timezone`
- Any groups?
  + no

In [None]:
%%sql
SELECT * FROM zones
WHERE zone_id IN (SELECT DISTINCT
  zone_id
  FROM timezone) LIMIT 10;

 * sqlite:///weather.db
Done.


zone_id,country_code,zone_name
1,AD,Europe/Andorra
2,AE,Asia/Dubai
3,AF,Asia/Kabul
4,AG,America/Antigua
5,AI,America/Anguilla
6,AL,Europe/Tirane
7,AM,Asia/Yerevan
8,AO,Africa/Luanda
9,AQ,Antarctica/McMurdo
10,AQ,Antarctica/Rothera


# Exercises
MySQL can open multiple databases, so table names may need to be qualified with the name of the database (e.g. `DATABASE_NAME.TABLE_NAME`).
With `SQLite` (the SQL database that we use in this notebook) we connect to a single database file, so we refer to tables without the name of the database. As an example, the table `cat_regions` is:
- **MySQL**: `weather_db.cat_regions`
- **SQLite**: `cat_regions`

## Exercise 1
Using the table `cat_locations`, pick out all the records whose `Latitude` is smaller than the latitude of the record whose location is 'London'.

### Hint
- Step 1: pick out the latitude value of the record with location 'London'. (This query is nested in the WHERE statement.)
- Step 2: from `cat_locations` (FROM...), pick out all the records (SELECT) satisfying the condition `Latitude` < London's latitude (WHERE ... <the nested SQL command from step 1>)

In [None]:
%%sql
SELECT * FROM cat_locations
WHERE Latitude < (SELECT Latitude FROM cat_locations WHERE Location = 'London');

 * sqlite:///weather.db
Done.


LocationID,Location,PostCode,Country,Region,region_description,Latitude,Longitude,Continent,Elevation,nationalPark,unitaryAuthArea
3,Southampton Airport,,,se,South East,50.9503,-1.3567,Europe,11.0,,Hampshire
5,London City Airport,,,se,South East,51.5048,0.058,Europe,5.0,,Greater London
6,Lydd,,,se,South East,50.9561,0.9392,Europe,4.0,,Kent
349,Dungeness B,,,se,South East,50.9134,0.9597,Europe,5.0,,Kent
3707,Chivenor,,,sw,South West,51.089,-4.149,Europe,6.0,,Devon
3710,Liscombe,,,sw,South West,51.087,-3.608,Europe,348.0,Exmoor National Park,Somerset
3716,St-Athan,,,wl,Wales,51.405,-3.44,Europe,49.0,,Vale of Glamorgan
3740,Lyneham,,,sw,South West,51.5031,-1.9924,Europe,145.0,,Wiltshire
3743,Larkhill,,,sw,South West,51.201,-1.805,Europe,132.0,,Wiltshire
3746,Boscombe Down,,,sw,South West,51.161,-1.754,Europe,126.0,,Wiltshire


## Exercise 2
Using the table `metoffice_dailyweatherdata`, return a table that contains five columns: `LocationID`, `obs_datetime`, `temperature`, `windspeed`, and `windchill`.

The first four columns can be taken from `metoffice_dailyweatherdata` directly. The `windchill` column should be computed using the formula below:

$Windchill = 13.12 + 0.6215 \times Temperature - 11.37 \times Windspeed^{0.16} + 0.3965 \times Temperature \times Windspeed^{0.16}$

Note: The function for $x$ raised to the power of $y$, $x^y$, is `POW(x,y)`.
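
To sanity-check the formula before writing the SQL, it can be evaluated in plain Python. This is a minimal sketch: the function name `wind_chill` is hypothetical, Python's `**` operator stands in for `POW`, and the sample inputs are the temperature 7.5 and windspeed 21 from the first row of the data shown earlier.

```python
def wind_chill(temperature, wind_speed):
    """Evaluate the windchill formula from this exercise."""
    v = wind_speed ** 0.16  # same role as POW(windspeed, 0.16) in SQL
    return 13.12 + 0.6215 * temperature - 11.37 * v + 0.3965 * temperature * v

sample = wind_chill(7.5, 21)
print(round(sample, 4))  # 4.1153
```

Comparing a value like this with the corresponding row of the SQL output is a simple way to catch a misplaced bracket or sign.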

### Hint
- Tables: “metoffice_dailyweatherdata”
- Information:
  + LocationID, obs_datetime, temperature, windspeed, and windchill (SELECT ...)
- Conditions: no


In [None]:
%%sql
SELECT LocationID,
obs_datetime,
temperature,
windspeed,
(13.12 + 0.6215 * temperature - 11.37 * POW(windspeed,0.16) + 0.3965 * temperature * POW(windspeed,0.16)) AS windchill
FROM metoffice_dailyweatherdata;

 * sqlite:///weather.db
Done.


LocationId,obs_dateTime,temperature,windspeed,windchill
3002,2020-01-01 00:00:00,7.5,21.0,4.115280457706902
3002,2020-01-01 01:00:00,7.5,22.0,4.013182382060215
3002,2020-01-01 02:00:00,7.9,24.0,4.332481666637042
3002,2020-01-01 03:00:00,7.5,23.0,3.914910949590676
3002,2020-01-01 04:00:00,8.0,18.0,5.073782751260369
3002,2020-01-01 05:00:00,8.3,24.0,4.844797971368533
3005,2020-01-01 00:00:00,6.9,33.0,2.30122510339998
3005,2020-01-01 01:00:00,6.9,36.0,2.0894355980342016
3005,2020-01-01 02:00:00,6.9,32.0,2.3754217078335405
3005,2020-01-01 03:00:00,7.4,34.0,2.8881814551016296


## Exercise 3
Using the table `zones`, find those zone names (`zone_name`) which contain more than two details (delimited by /).
Example record: `America/Indiana/Marengo`

### Hint
- Step 1: find those records whose `zone_name` contains more than two details and tag them with a 0/1 flag (`checkup`). The resulting table contains `zone_name` and `checkup`. (You can use REGEXP '^[_a-zA-Z]+/[_a-zA-Z]+/' to match the whole alphabet and the underscore character, e.g. North_Dakota.)
- Step 2: work on the table obtained from step 1 (put the SQL command of step 1 in the FROM statement), keeping those `zone_name` values with `checkup` equal to 1.

In [None]:
%%sql
SELECT zone_name FROM
    (SELECT *, zone_name REGEXP '^[_a-zA-Z]+/[_a-zA-Z]+/' AS checkup
      FROM zones) tagged_zones
WHERE checkup = 1;

 * sqlite:///weather.db
Done.


zone_name
America/Argentina/Buenos_Aires
America/Argentina/Catamarca
America/Argentina/Cordoba
America/Argentina/Jujuy
America/Argentina/La_Rioja
America/Argentina/Mendoza
America/Argentina/Rio_Gallegos
America/Argentina/Salta
America/Argentina/San_Juan
America/Argentina/San_Luis


## Exercise 4
Using the table `metoffice_dailyweatherdata`, return one column containing the difference between each temperature and the average temperature (temperature minus the average temperature).

### Hint
- Step 1: compute the average temperature based on the table metoffice_dailyweatherdata. (nest the sql command in the SELECT statement)
- Step 2: compute the temperature minus the average temperature (obtained from step 1), and return the final values.


In [None]:
%%sql
SELECT temperature - (SELECT AVG(temperature) FROM metoffice_dailyweatherdata) AS Temp_Diff
FROM metoffice_dailyweatherdata;

 * sqlite:///weather.db
Done.


Temp_Diff
0.7983399999999747
0.7983399999999747
1.198339999999975
0.7983399999999747
1.2983399999999747
1.5983399999999754
0.1983399999999751
0.1983399999999751
0.1983399999999751
0.6983399999999751
